Statistical analysis means investigating trends, patterns, and relationships using quantitative data. It is an important research tool used by scientists, governments, companies, and other organizations. This article walks through the main steps of a statistical analysis.
To draw valid conclusions, statistical analysis requires careful planning from the beginning of the research process. Hypotheses need to be specified and decisions made about the research design, sample size, and sampling procedure.
Once the sample data is collected, you can organize and summarize the data using descriptive statistics. You can then use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your conclusions.
Steps to carry out Statistical Analysis
Below are five steps for carrying out a statistical analysis.
Step 1: Write the hypotheses and plan the research design
To collect valid data for statistical analysis, one must first specify the hypotheses and plan the research design.
Write the statistical hypotheses
While the null hypothesis always predicts that there is no effect or relationship between the variables, the alternative hypothesis states your investigation’s prediction of an effect or relationship.
Example: Statistical hypothesis to test an effect
Null hypothesis: A 5-minute meditation exercise will have no effect on teenagers' math test scores.
Alternative hypothesis: A 5-minute meditation exercise will improve teenagers' math test scores.
Example: Statistical hypothesis to test a correlation
Null hypothesis: Parental income and grade point average (GPA) are not related in college students.
Alternative hypothesis: Parental income and GPA are positively correlated in college students.
Plan your research design
First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence the variables, while descriptive and correlational studies only measure the variables.
Experimental: You can test a cause-and-effect relationship (for example, the effect of meditation on test scores) using statistical comparison or regression tests.
Correlational: You can explore relationships between variables (for example, parental income and grade point average) without any assumption of causality using correlation coefficients and significance tests.
Descriptive: You can study the characteristics of a population or phenomenon (for example, the prevalence of anxiety in US college students) using statistical tests to draw conclusions from sample data.
Research design also refers to whether you are going to compare participants at the group level, the individual level, or both.
Between-subjects: Group-level results are compared for participants who have been exposed to different treatments (for example, those who performed a meditation exercise versus those who did not).
Within-subjects: Repeated measures are compared for participants who have taken part in all treatments in a study (for example, scores before and after performing a meditation exercise).
Mixed (factorial): One variable is altered between subjects and another is altered within subjects (for example, the before and after scores of participants who did or did not do a meditation exercise).
Measurement of variables
When planning a research design, you must operationalize your variables and decide exactly how you are going to measure them.
For statistical analysis, it is important to consider the measurement level of your variables, which tells you what kind of data they contain:
Categorical data represents groupings. They can be nominal (e.g., gender) or ordinal (e.g., level of language skills).
Quantitative data represents quantities. They can be on an interval scale (for example, test score) or on a ratio scale (for example, age).
Many variables can be measured with different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (for example, level of agreement from 1 to 5), it does not automatically mean that it is quantitative rather than categorical.
Identifying the level of measurement is important for choosing the appropriate statistics and hypothesis tests. For example, a mean score can be calculated with quantitative data, but not with categorical data.
In a research study, in addition to measures of the variables of interest, data on relevant characteristics of the participants are often collected.
Step 2: Collect data from a sample
Population vs. sample
Sampling for statistical analysis
There are two main approaches to selecting a sample.
Probability Sampling: Each member of the population has a chance of being selected for the study by random selection.
Non-probability sampling: Some members of the population are more likely than others to be selected for the study due to criteria such as convenience or voluntary self-selection.
In theory, to obtain highly generalizable results, a probability sampling method should be used. Random selection reduces sampling bias and makes it more likely that the sample is representative of the population. Parametric tests can be used to make robust statistical inferences when data is collected using probability sampling.
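Random selection can be sketched in a few lines. The example below is only an illustration: it assumes you already have a complete list of the population (a sampling frame), and the ID numbers, the sample size of 50, and the seed are all made up.

```python
import random

# Hypothetical sampling frame: ID numbers for 1,000 population members.
population_ids = list(range(1, 1001))

# A seeded generator makes the draw reproducible; omit the seed in real use.
rng = random.Random(42)

# Simple random sample without replacement:
# every member has the same chance of being selected.
sample_ids = rng.sample(population_ids, k=50)

print(sorted(sample_ids)[:5])  # first few selected IDs
```

In practice the hard part is usually obtaining the sampling frame, not drawing from it.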
In practice, however, it is rarely possible to collect the ideal sample. Although non-probability samples are more likely to be biased, they are much easier to recruit and collect data from. Nonparametric tests are more appropriate for non-probability samples, but they lead to weaker inferences about the population.
If you want to use parametric tests for non-probability samples, you have to show that:
Your sample is representative of the population to which you are generalizing your results.
The sample lacks systematic bias.
Keep external validity in mind: you can only generalize your conclusions to other people who share the characteristics of your sample. For example, results from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) samples (e.g., US college students) are not automatically applicable to non-WEIRD populations.
If you apply parametric tests to non-probability sample data, be sure to explain the limitations of the generalizability of your results in the discussion section.
Create a proper sampling procedure
Based on the resources available for your research, decide how you will recruit participants.
Will you have the resources to publicize your study widely, even outside of your university setting?
Will you have the means to recruit a diverse sample that represents a broad population?
Do you have time to contact and follow up with members of hard-to-reach groups?
Example: Sampling (correlational study)
Your primary population of interest is male college students in the United States. Using social media advertising, you recruit male college seniors from a smaller subpopulation: seven Boston-area colleges.
The participants volunteer for the survey, so this is a non-probability sample.
Calculate sufficient sample size
Before recruiting participants, decide on your sample size by consulting other studies in your field or by using statistics. A sample that is too small may be unrepresentative of the population, while a sample that is too large will be more expensive than necessary.
There are many sample size calculators on the Internet. Different formulas are used depending on whether there are subgroups and on how rigorous the study needs to be (for example, in clinical research). As a general rule, at least 30 units per subgroup are necessary.
To use these calculators, you need to understand and enter these key components:
Significance level (alpha): The risk of rejecting a true null hypothesis that you are willing to assume, typically set at 5%.
Statistical power: the probability that your study will detect an effect of a certain size if there is one, usually 80% or more.
Expected Effect Size: A standardized indication of the magnitude of the expected result of your study, usually based on other similar studies.
Population Standard Deviation: An estimate of the population parameter based on a previous study or your own pilot study.
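To see how these components fit together, here is a rough sketch of one common calculation. It is not the formula behind any particular online calculator. It assumes a two-group comparison of means with a two-sided test, uses a normal approximation, and takes a standardized effect size (the expected difference divided by the population standard deviation). The function name and default values are illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size: expected difference in means divided by the population
                 standard deviation (a standardized effect size).
    """
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)    # critical value, two-sided test
    z_power = std_normal.inv_cdf(power)            # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n_medium = sample_size_per_group(0.5)  # medium expected effect
n_small = sample_size_per_group(0.2)   # small expected effect
print(n_medium, n_small)
```

The smaller the expected effect, the larger the sample needed to detect it. Calculators that use the t distribution instead of the normal approximation return slightly larger numbers.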
Step 3: Summarize the data with descriptive statistics
Once you have collected all the data, you can inspect it and calculate descriptive statistics that summarize it.
Inspect the data
There are several ways to inspect the data, including the following:
Organizing the data for each variable in frequency distribution tables.
Displaying data for a key variable in a bar chart to see the distribution of responses.
Visualizing the relationship between two variables using a scatter plot.
By visualizing your data in tables and graphs, you can assess whether your data follows a skewed or normal distribution and whether there are outliers or missing data.
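As a small illustration, a frequency distribution table and a count of missing values can be built with Python's standard library. The responses below are invented, and using None to mark a missing answer is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical responses to a 1-5 agreement item; None marks a missing answer.
responses = [4, 5, 3, 4, None, 2, 4, 5, 1, 4, 3, None]

valid = [r for r in responses if r is not None]
n_missing = len(responses) - len(valid)

# Frequency distribution table: value -> count, in ascending order of value.
freq_table = dict(sorted(Counter(valid).items()))

# A text "bar chart" gives a quick view of the shape of the distribution.
for value, count in freq_table.items():
    print(f"{value}: {'#' * count} ({count})")
print(f"missing: {n_missing}")
```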
A normal distribution means that the data is distributed symmetrically around a center where most of the values lie, with values becoming less frequent toward the extremes.
In contrast, a skewed distribution is asymmetric and has more values at one end than the other. It is important to consider the shape of the distribution, since only some descriptive statistics should be used with skewed distributions.
Extreme outliers can also produce misleading statistics, so a systematic approach to dealing with these values may be necessary.
Calculate measures of central tendency
Measures of central tendency describe where the majority of values in a data set are located. Three main measures of central tendency are often presented:
Mode: The most frequent response or value in the data set.
Median: The value that is exactly in the middle of the data set when ordered from smallest to largest.
Mean: The sum of all values divided by the number of values.
However, depending on the shape of the distribution and the level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using mode or proportions, while a variable such as reaction time may not have a mode.
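The three measures can be computed with Python's statistics module. The scores below are hypothetical and include one unusually high value to show how an outlier affects the mean.

```python
from statistics import mean, median, multimode

scores = [55, 60, 60, 65, 70, 75, 98]  # hypothetical test scores

modes = multimode(scores)   # most frequent value(s); a list, because ties are possible
middle = median(scores)     # middle value once the data is sorted
average = mean(scores)      # sum of all values divided by the number of values

print(modes, middle, average)
```

Here the mean (69) is higher than the median (65) because the single score of 98 pulls it upward. This is one reason the median is often preferred for skewed data.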
Calculate measures of variability
Variability measures indicate the spread of values in a data set. Four main measures of variability are often presented:
Range: The highest value minus the lowest value in the data set.
Interquartile range: The range of the middle half of the data set (the third quartile minus the first quartile).
Standard deviation: Roughly, the typical distance between each value in the data set and the mean.
Variance: the square of the standard deviation.
Once again, the shape of the distribution and the level of measurement should guide the choice of variability statistics. The interquartile range is the best measure for skewed distributions, while the standard deviation and variance provide the best information for normal distributions.
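The four measures can be sketched with the standard library as follows. The scores are hypothetical. Note that statistics.quantiles supports more than one method for locating quartiles, so other software may report a slightly different interquartile range for the same data.

```python
from statistics import quantiles, stdev, variance

scores = [55, 60, 60, 65, 70, 75, 98]  # hypothetical test scores

value_range = max(scores) - min(scores)   # highest value minus lowest value

q1, _, q3 = quantiles(scores, n=4)        # quartile cut points (default method)
iqr = q3 - q1                             # spread of the middle half of the data

sd = stdev(scores)      # sample standard deviation (n - 1 in the denominator)
var = variance(scores)  # sample variance, the square of the standard deviation

print(value_range, iqr, round(sd, 2), round(var, 2))
```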
Step 4: Test hypotheses or make estimates with inferential statistics
A number that describes a sample is called a statistic, while a number that describes a population is called a parameter. Using inferential statistics, you can draw conclusions about population parameters based on sample statistics.
Researchers often use two main methods (simultaneously) to make inferences in statistics.
Estimation: calculation of the population parameters from the sample statistics.
Hypothesis tests: a formal process for testing the predictions of research on the population using samples.
Two types of estimates of population parameters can be made from sample statistics:
A point estimate: a value that represents your best estimate of the exact parameter.
An interval estimate: A range of values that represents your best estimate of where the parameter lies.
If your goal is to infer and report population characteristics from sample data, it is best to use both point and interval estimates in your work.
A sample statistic can be considered to be a point estimate of the population parameter when a representative sample is available (for example, in a large public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).
There is always an error in the estimate, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.
A confidence interval uses the standard error and z-score of the standard normal distribution to indicate where the population parameter is expected to be found most of the time.
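A minimal sketch of this calculation for a sample mean is shown below. It assumes the sample is large enough for the z-based approach to be reasonable. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_confidence_interval(values, confidence=0.95):
    """Return (point estimate, (lower bound, upper bound)) for the population mean."""
    n = len(values)
    point_estimate = mean(values)
    standard_error = stdev(values) / sqrt(n)
    # z-score that leaves (1 - confidence) / 2 in each tail, about 1.96 for 95%.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error
    return point_estimate, (point_estimate - margin, point_estimate + margin)

# Hypothetical sample of 49 measurements, ranging from 65 to 71.
sample = [68 + (i % 7) - 3 for i in range(49)]

estimate, (lower, upper) = z_confidence_interval(sample)
print(estimate, round(lower, 2), round(upper, 2))
```

For small samples, a critical value from the t distribution is normally used in place of the z-score, which gives a wider interval.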
Using data from a sample, hypotheses about relationships between population variables can be tested. Hypothesis tests start with the assumption that the null hypothesis is true in the population, and statistical tests are used to assess whether or not the null hypothesis can be rejected.
Statistical tests determine where your sample data would lie in an expected distribution of sample data if the null hypothesis were true. These tests give two main results:
A test statistic that tells you how much your data differs from the null hypothesis of the test.
A p-value that tells you the probability of getting results at least as extreme as yours if the null hypothesis is actually true in the population.
There are three types of statistical tests:
Comparison: they evaluate the group differences in the results.
Regression: they evaluate the cause and effect relationships between the variables.
Correlation: they evaluate the relationships between the variables without assuming causality.
The choice of statistical test depends on the research questions, the research design, the sampling method, and the characteristics of the data.
Parametric tests make powerful inferences about the population from the sample data. But to use them, some assumptions must be met and only certain types of variables can be used. If your data violates these assumptions, you can perform the appropriate data transformations or use alternative nonparametric tests instead.
A regression models the extent to which changes in a predictor variable produce changes in the outcome variable(s).
Simple linear regression includes a predictor variable and an outcome variable.
Multiple linear regression includes two or more predictor variables and one outcome variable.
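For the simple case, the regression line can be fitted with ordinary least squares. The sketch below is a bare-bones illustration with invented data (hours of study as the predictor, test score as the outcome). It returns only the slope and intercept, not the significance tests a statistics package would also report.

```python
def fit_simple_linear_regression(x, y):
    """Least-squares slope and intercept for one predictor and one outcome."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                      # change in outcome per unit of predictor
    intercept = mean_y - slope * mean_x    # predicted outcome when predictor is 0
    return slope, intercept

hours = [1, 2, 3, 4, 5]          # hypothetical predictor values
scores = [52, 55, 61, 64, 68]    # hypothetical outcome values

slope, intercept = fit_simple_linear_regression(hours, scores)
print(round(slope, 2), round(intercept, 2))
```

In this made-up data, each additional hour of study is associated with about 4.1 more points on the test.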
Comparison tests typically compare group means. They can be the means of different groups within a sample (for example, a treatment group and a control group), the means of a sample group taken at different times (for example, pre-test and post-test scores), or a sample mean and a population mean.
A t-test is for exactly 1 or 2 groups when the sample is small (30 or less).
A z test is for exactly 1 or 2 groups when the sample is large.
An ANOVA is for 3 or more groups.
The z and t tests have subtypes based on the number and types of samples and on the hypotheses:
If you only have one sample that you want to compare to the population mean, use a one-sample test.
If you have paired measurements (within-subjects design), use a dependent (paired) samples test.
If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test.
If you expect a difference between the groups in a specific direction, use a one-tailed test.
If you don’t have any expectations about the direction of the difference between the groups, use a two-tailed test.
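The difference between one-tailed and two-tailed tests can be illustrated with a one-sample z test. This sketch assumes a large sample and a known population standard deviation. All of the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, one-tailed p, two-tailed p) for a one-sample z test."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    std_normal = NormalDist()
    # One-tailed: the alternative predicts a mean HIGHER than pop_mean.
    p_one_tailed = 1 - std_normal.cdf(z)
    # Two-tailed: the alternative predicts a difference in either direction.
    p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_one_tailed, p_two_tailed

# Hypothetical data: 100 students average 103 on a test whose population
# mean is 100 and whose population standard deviation is 15.
z, p_one, p_two = one_sample_z_test(103, 100, 15, 100)
print(round(z, 2), round(p_one, 4), round(p_two, 4))
```

When the observed difference is in the predicted direction, the two-tailed p-value is twice the one-tailed p-value. The choice between the two should therefore be made before looking at the data.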
The only parametric correlation test is Pearson’s r. The correlation coefficient (r) indicates the strength of a linear relationship between two quantitative variables.
However, to check whether the correlation in the sample is strong enough to be significant in the population, one must also perform a significance test on the correlation coefficient, usually a t-test, to obtain a p-value. This test uses the sample size to calculate how much the correlation coefficient differs from zero in the population.
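The sketch below computes Pearson's r and the t statistic for testing whether it differs from zero. The data (parental income in thousands of dollars and GPA) is invented. The p-value itself is not computed: the t statistic would be compared against a t distribution with n - 2 degrees of freedom, which the Python standard library does not provide.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two quantitative variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def correlation_t_statistic(r, n):
    """t statistic for H0: the population correlation is zero (df = n - 2)."""
    return r * sqrt((n - 2) / (1 - r ** 2))

income = [30, 45, 60, 75, 90, 105]     # hypothetical, in thousands of dollars
gpa = [2.8, 3.1, 3.0, 3.4, 3.3, 3.7]   # hypothetical grade point averages

r = pearson_r(income, gpa)
t = correlation_t_statistic(r, len(income))
print(round(r, 3), round(t, 2))
```

Because the sample size enters the formula, the same r produces a larger t statistic, and so stronger evidence against the null hypothesis, in a larger sample.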
Step 5: Interpret the results
The last step of the statistical analysis is the interpretation of the results.
In hypothesis testing, statistical significance is the main criterion for drawing conclusions. The p-value is compared to a set level of significance (usually 0.05) to decide if the results are statistically significant or not.
Statistically significant results are considered unlikely to have arisen by chance alone. There is only a very low probability of such an outcome occurring if the null hypothesis is true in the population.
A statistically significant result does not necessarily mean that there are important real-life applications or clinical outcomes for a finding.
Instead, the effect size indicates the practical importance of the results. It is important to report the effect size along with inferential statistics to get a full picture of the results. You should also report interval estimates of effect sizes if you are writing an APA-style paper.
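There are many effect size measures. As one illustration, the sketch below computes Cohen's d, a widely used standardized difference between two group means. Choosing Cohen's d is an assumption of this example, and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

meditation = [78, 82, 85, 88, 92]  # hypothetical scores, treatment group
control = [75, 79, 82, 85, 89]     # hypothetical scores, control group

d = cohens_d(meditation, control)
print(round(d, 2))
```

Unlike a p-value, this number does not shrink or grow simply because the sample gets larger, which is why it is reported alongside the significance test.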
Type I and Type II errors are errors made in the research conclusions. A type I error means rejecting the null hypothesis when it is actually true, while a type II error means not rejecting the null hypothesis when it is false.
You can try to minimize the risk of these errors by selecting an optimal significance level and ensuring high power. However, there is a trade-off between the two errors: lowering the risk of one tends to raise the risk of the other, so a careful balance is necessary.
Frequentist Statistics vs. Bayesian Statistics
Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.
However, Bayesian statistics has gained popularity as an alternative approach in recent decades. In this approach, previous research is used to continually update hypotheses based on expectations and observations.
The Bayes factor compares the relative strength of the evidence for the null hypothesis versus the alternative rather than reaching a conclusion about rejecting the null hypothesis or not.
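In the simplest case, where each hypothesis specifies a single value for a parameter, the Bayes factor is just the ratio of how likely the observed data is under each hypothesis. The sketch below uses that simplified setup with invented numbers. Real analyses usually place a prior distribution on the alternative instead of a single value.

```python
from math import comb

def binomial_likelihood(k, n, p):
    """Probability of k successes in n trials when the success rate is p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def bayes_factor_10(k, n, p_null, p_alt):
    """Evidence for the alternative relative to the null (point hypotheses)."""
    return binomial_likelihood(k, n, p_alt) / binomial_likelihood(k, n, p_null)

# Hypothetical data: 14 of 20 participants improved.
# Null: the improvement rate is 0.5. Alternative: the improvement rate is 0.7.
bf10 = bayes_factor_10(k=14, n=20, p_null=0.5, p_alt=0.7)
print(round(bf10, 2))
```

A value above 1 means the data is more likely under the alternative, and a value below 1 means it is more likely under the null. The result is a graded measure of evidence, not a reject-or-retain decision.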
We hope this article has given you a clear understanding of statistical analysis.