Medical statistics is one of the main areas of statistical activity. It has had a considerable influence on clinical medicine and public health through its applications to, for example, the design and analysis of clinical trials and the epidemiology of chronic and infectious diseases. A large number of statistical tools are now used for medical decision making in the main clinical activities of diagnosis, treatment and prognosis. These tools provide considerable help in improving medical outcomes.

These include the measurement of uncertainty by probability, medical indicators and indices, reference ranges and scoring systems. In addition, there are tools such as the odds ratio (OR), sensitivity, specificity and predictive values, the area under the receiver operating characteristic (ROC) curve, likelihood ratios and cost-benefit analysis, which are usually applied in medical research but have implications for everyday clinical activities. These tools have become so thoroughly integrated into medical practice that medical statistics can be considered a specialty in its own right.

## Examples of Research Using Medical Statistics

Some examples of research topics using medical statistics are:

Analysis of clinical trial data

Survival time modeling

Meta-analysis methods

Systematic reviews

Detection of factors associated with hypertension

Predictors of smoking cessation

Identification of danger signs among newborns and mothers

Modeling of pain sensitivity in healthy individuals

Exploration of the genetic bases and the biological pathway of pain perception

Enabling learning when evaluating new health technologies

Consideration of recall bias in the analysis of case-control studies

Case-Only Study Designs in Epidemiology

Estimation of the reproduction number and other epidemiological parameters of infectious diseases

Evaluation and expansion of the self-controlled case series method

Inference of infectious diseases from multivariate serological survey data

Methodological development of syndromic and laboratory statistical surveillance systems

Multivariate meta-analysis

Statistical methods to assess the safety of vaccines and other drugs

Statistical methods for frailty models in survival analysis

Evaluation of quality of life measures

Diagnostic test statistics

Surgical site infection analysis

Statistical methods for single case studies in neuropsychology

Statistical methods for surveillance of mass vaccination programs

## Hypothesis formulation

Whether in clinical practice or in a clinical research laboratory, clinicians often make observations that lead to questions about a particular exposure and a specific disease. For example, it can be observed in clinical practice that several patients taking a certain antihypertensive treatment develop pulmonary symptoms within two weeks of taking the drug.

These types of questions can lead to formal hypotheses that can then be tested with appropriate research study designs and analytical methods. Identifying the study question or hypothesis is a critical first step in planning a study or reviewing the medical literature.

## Identification of Study Objectives

It is important to understand in advance what the objectives of the study are. Here are some questions that can facilitate the process of identifying the objectives of the study:

Is the goal to determine the efficacy of a drug or device under ideal conditions?

How well does a drug or device work in a free-living population (i.e., effectiveness)?

What are the causes or risk factors of a disease?

What is the burden of disease in the community?

Is the objective of the study to provide information for a quality management activity?

Will the study explore the cost-effectiveness of a certain treatment or diagnostic tool?

The hypotheses and objectives of a study are the keys to determining the study design and the most appropriate statistical tests.

## Study Design

Once the study question(s) and objectives have been identified, it is important to select the appropriate study design.

### Descriptive Studies

The main classification scheme for epidemiological studies distinguishes between descriptive and analytical studies. Descriptive epidemiology focuses on the distribution of diseases by populations, by geographic locations, and by frequency over time. Analytical epidemiology deals with the determinants, or etiology, of disease and tests hypotheses generated by descriptive studies.

#### Correlational Studies

Correlational studies, also called ecological studies, use measures that represent the characteristics of entire populations to describe a given disease in relation to some variable of interest (e.g., medication use, age, health care utilization). A correlation coefficient quantifies the degree of association between the exposure of interest, or “predictor,” and the disease, or “outcome,” studied. Pearson’s r measures the strength of a linear relationship, while Spearman’s ρ and Kendall’s τ are rank-based and measure monotonic association.
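As a minimal sketch of how two of these coefficients are computed, the following uses only the Python standard library. The function names and the regional figures are hypothetical, chosen for illustration only. Spearman's coefficient is obtained here by applying Pearson's formula to the ranks of the data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical ecological data for six regions (illustrative only).
medication_use = [10, 20, 30, 40, 50, 60]   # users per 1,000 people
disease_rate = [12, 15, 21, 30, 44, 70]     # cases per 100,000 people

r = pearson_r(medication_use, disease_rate)
rho = spearman_rho(medication_use, disease_rate)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

In this made-up data the disease rate always rises with medication use but not in a straight line, so the rank-based coefficient is higher than Pearson's.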

#### Case Reports

Case reports and case series are commonly published and describe the experience of a single patient or a series of patients with similar diagnoses. A key design limitation of case report and case series studies is the lack of a comparison group. However, these study designs are often useful for the recognition of new diseases and the formulation of hypotheses about possible risk factors.

#### Cross-Sectional Studies

Cross-sectional studies are also known as prevalence studies. In this type of study, exposure and disease status among people in a well-defined population are evaluated at the same time. Cross-sectional studies are especially useful for estimating the population burden of the disease.

### Analytical Studies

#### Observational Studies

In observational studies, researchers record participants’ exposures (for example, smoking, cholesterol level) and outcomes (for example, having a heart attack). In contrast, an experimental study consists of assigning one group of patients to one treatment and another group of patients to a different treatment or to no treatment at all. There are two main types of observational studies: case-control and cohort. A case-control study is one in which participants are chosen based on whether they have (cases) or do not have (controls) the disease of interest.

Ideally, cases should be representative of all people who develop the disease and controls should be representative of all people who do not. Cases and controls are then compared as to whether or not they have the exposure of interest, so that the difference in the prevalence of exposure between the groups with and without disease can be assessed. In this type of study, the odds ratio is the appropriate statistical measure of the difference in exposure between groups.

#### Prospective Studies

In prospective studies, the disease or outcome has not yet occurred when the study begins. The investigator follows the participants forward in time to assess any differences in the incidence of the disease or outcome between exposure groups. Incidence is compared between exposed and unexposed groups by calculating the relative risk (RR).

#### Experimental Studies

Experimental or interventional studies are often called clinical trials. In these studies, participants are randomly assigned to an exposure (such as a drug, device, or procedure). “The main advantage of this feature (i.e., randomized controlled trials) is that if treatments are randomized in a sufficiently large sample size, intervention studies have the potential to provide a degree of assurance about the validity of a result which is simply not possible with any observational design option.” Experimental studies are often considered therapeutic or preventive.

#### Outcome Studies

Outcome studies assess the actual effect on the patient (e.g., morbidity, mortality, functional ability, satisfaction, return to work or school) over time as a result of their encounter(s) with health care processes and systems. An example of this type of study would be one that evaluated the percentage of patients with a myocardial infarction (MI) who were given a beta-blocker medication and subsequently had another MI.

For some diseases, there may be a significant time lag between the process event and the outcome of interest. This often results in some patients being lost to follow-up, which can lead to erroneous conclusions unless methods are used that “censor” or otherwise adjust for missing, time-dependent covariates.
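One widely used method that accounts for censored follow-up is the Kaplan-Meier estimator of survival. The text does not name a specific method, so the sketch below is offered only as an illustration, with hypothetical follow-up data and an illustrative function name. A patient lost to follow-up still counts as at risk up to the time of loss, but is never counted as having had the event.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the outcome was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ended at t leaves the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; 0 marks a patient lost to follow-up.
follow_up = [2, 3, 3, 5, 8, 8, 12]
outcome = [1, 1, 0, 1, 0, 1, 0]

for month, probability in kaplan_meier(follow_up, outcome):
    print(f"month {month}: estimated survival {probability:.3f}")
```

Simply discarding the censored patients, or treating them as event-free for the whole study, would bias the estimate, which is the kind of erroneous conclusion the paragraph above warns about.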

## What are the appropriate statistical tests?

Once the appropriate design has been determined for a particular study question, it is important to consider the appropriate statistical tests that should be (or have been) performed on the data collected. This is relevant whether you are reviewing a scientific article or planning a clinical study. To begin, we will examine the terms and calculations that are primarily used to describe measures of central tendency and dispersion. These measurements are important for understanding key aspects of any data set.

### Measures of central tendency

There are three commonly known measures of central tendency: the mean, the median, and the mode. The arithmetic mean or average is calculated by adding the values of the observations in the sample and dividing the sum by the number of observations in the sample. This measure is frequently reported for continuous variables: age, blood pressure, pulse, body mass index (BMI), to name a few. The median is the value of the middle observation after ordering all the observations from smallest to largest. It is very useful for ordinal or non-normally distributed data.

The mode is the most frequent value among all the observations in the data set. There may be more than one mode. The mode is most useful for nominal or categorical data. Typically no more than two modes (a bimodal distribution) are described for a given data set.
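The three measures can be computed with Python's standard `statistics` module. The blood pressure readings below are hypothetical and include one extreme value to show how the mean and the median can differ.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), illustrative only.
systolic_bp = [118, 120, 120, 124, 130, 135, 135, 142, 190]

mean_bp = statistics.mean(systolic_bp)        # sum of values / number of values
median_bp = statistics.median(systolic_bp)    # middle value after sorting
modes_bp = statistics.multimode(systolic_bp)  # all most-frequent values

print(f"mean = {mean_bp:.1f}, median = {median_bp}, modes = {modes_bp}")
```

The single reading of 190 pulls the mean above the median, which is why the median is often preferred for skewed data. The data set is bimodal, so `multimode` returns two values.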

### Measures of dispersion

Measures of dispersion or variability describe how widely the data points in the sample are spread. These measures include the following: range, interquartile range, standard deviation, standard error of the mean (SEM), and the coefficient of variation.
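A minimal sketch of these five measures, using the Python standard library, is shown below. The function name and the data are hypothetical. Quartiles can be defined in several slightly different ways; this sketch relies on the default method of `statistics.quantiles`, so other software may report a slightly different interquartile range.

```python
import math
import statistics


def dispersion_summary(values):
    """Return common measures of dispersion for a sample."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,              # interquartile range
        "sd": sd,
        "sem": sd / math.sqrt(n),    # standard error of the mean
        "cv": sd / mean,             # coefficient of variation
    }


# Hypothetical measurements, illustrative only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = dispersion_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```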

### Evaluation of diagnostic tests

In order to understand the aetiology of disease and to provide appropriate and effective health care to people suffering from a given disease, it is essential to distinguish between people in the population who do and do not have the disease of interest. We typically rely on screening and diagnostic tests available at medical centers to obtain information about the disease status of our patients.

However, it is important to assess the quality of these tests in order to make reasonable decisions about their interpretation and use in clinical decision making. When evaluating the quality of diagnostic and screening tests, it is important to consider the validity (i.e., sensitivity and specificity) as well as the predictive value (i.e., positive and negative predictive values) of the test.
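These four measures all come from the same two-by-two table that compares the test result with the true disease status. The sketch below uses hypothetical counts and an illustrative function name.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute validity and predictive values from a 2x2 table.

    tp -- diseased, test positive     fp -- not diseased, test positive
    fn -- diseased, test negative     tn -- not diseased, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased people correctly detected
        "specificity": tn / (tn + fp),  # healthy people correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical screening of 1,000 people, 100 of whom have the disease.
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the population tested.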

### Odds Ratio

The odds ratio (OR) is a measure of effect size and is commonly used to compare the results of clinical trials. It is the odds that an event will occur in one group divided by the odds that it will occur in another group, where the odds of an event are the probability that it occurs divided by the probability that it does not.

For example, one research study compared two groups of women who developed diabetes during their pregnancies. One group was treated with metformin and the other with insulin. The researchers recorded how many of the mothers gave birth earlier than expected (before 37 weeks of pregnancy). When they calculated the odds of preterm birth, the odds ratio (OR) for metformin was 1.06. This means that the odds of a preterm birth were slightly higher (1.06 times) for women taking metformin than for women taking insulin.
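To make the calculation concrete, the sketch below computes an odds ratio from a two-by-two table, together with an approximate 95% confidence interval based on the standard error of the log odds ratio (often called Woolf's method). The counts are hypothetical and are not taken from the metformin study described above; the function names are illustrative.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a -- group 1, event        b -- group 1, no event
    c -- group 2, event        d -- group 2, no event
    """
    return (a / b) / (c / d)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate confidence interval on the log scale (z=1.96 for 95%)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical counts: 20 of 100 events in group 1, 10 of 100 in group 2.
estimate = odds_ratio(20, 80, 10, 90)
lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio of 1 means the odds are the same in both groups, so a confidence interval that includes 1 indicates that the data are compatible with no difference.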

### ROC Curves

Receiver operating characteristic (ROC) curves were originally developed to assess the performance of radar signal detection. In medicine, ROC curves are a way to analyze the accuracy of diagnostic tests and to determine the best threshold or “cut-off” value to distinguish between positive and negative test results.

Diagnostic tests are almost always a compromise between sensitivity and specificity. ROC curves provide a graphical representation of this compromise. Setting the cut-off value too low may result in very high sensitivity (i.e., very few cases of disease would be missed) but at the cost of specificity (i.e., many false-positive results). Setting the cut-off value too high will result in high specificity at the expense of sensitivity.

Consider a study that measures the accuracy of B-type natriuretic peptide (BNP) as a marker of impaired left ventricular function. The investigators measured BNP levels in 155 elderly patients, who also underwent echocardiography (the gold standard for diagnosis). A ROC curve was created by plotting sensitivity against 1 − specificity for different cut-off values of BNP. The sensitivity and specificity of the BNP test were calculated and plotted, assuming a level of 19.8 pmol/L as the cut-off point for a positive test.

The best cut-off point has the highest sensitivity and the lowest value of 1 − specificity (that is, the highest specificity), so it is positioned as high as possible on the vertical axis and as far to the left as possible on the horizontal axis. The area under a ROC curve is a measure of the usefulness or “discrimination” value of a test in general. The larger the area, the more useful the test. The maximum possible area under the curve is that of the full unit square, 1.0. The curve in this example has an area of 0.85. The 45° diagonal line, with an area of 0.5, represents a test that has no discriminative value, that is, it is completely useless.
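The construction of a ROC curve can be sketched in a few lines of Python. The marker values and disease labels below are hypothetical and are not the BNP data from the study above; the function names are illustrative. Each distinct marker value is tried as a cut-off, and the area is then computed with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    A test counts as positive when the score is at or above the cut-off.
    labels -- 1 for disease present, 0 for disease absent
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical marker levels and true disease status (illustrative only).
marker = [5, 12, 18, 22, 25, 31, 40, 55]
disease = [0, 0, 0, 1, 0, 1, 1, 1]

curve = roc_points(marker, disease)
auc = area_under_curve(curve)
print(f"AUC = {auc:.4f}")
```

Lowering the cut-off moves along the curve from the bottom left corner (nothing called positive) to the top right corner (everything called positive).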

### Relative Risk (RR)

Relative risk is the ratio of the risk of an event in the exposed group to the risk in the unexposed group. Therefore, the relative risk expresses the increase or decrease in the probability of an event associated with a given exposure. Because it is a ratio of risks, it can be compared across populations with different disease prevalence. Relative risk does not, however, specify the absolute risk of the event occurring.
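The difference between relative and absolute risk can be illustrated with a short sketch. The two cohorts below are hypothetical, and the function names are illustrative.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed


def risk_difference(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Absolute difference in risk between exposed and unexposed groups."""
    return exposed_events / exposed_total - unexposed_events / unexposed_total


# Hypothetical cohort A: a common outcome (20% vs 10%).
rr_common = relative_risk(20, 100, 10, 100)
rd_common = risk_difference(20, 100, 10, 100)

# Hypothetical cohort B: a rare outcome (0.2% vs 0.1%).
rr_rare = relative_risk(2, 1000, 1, 1000)
rd_rare = risk_difference(2, 1000, 1, 1000)

print(f"common outcome: RR = {rr_common:.1f}, risk difference = {rd_common:.3f}")
print(f"rare outcome:   RR = {rr_rare:.1f}, risk difference = {rd_rare:.3f}")
```

Both cohorts have the same relative risk, yet the absolute increase in risk differs by a factor of 100, which is why a relative risk is best reported alongside the underlying absolute risks.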