Hypothesis testing is a fundamental part of statistical analysis. It is used to draw conclusions about a population from sample data, or to compare groups or individuals. It is a critical tool in many fields, including science, medicine, and the social sciences, because it lets researchers test theories and assumptions rigorously. This blog covers the concept of hypothesis testing, its key components, the steps involved, common types of tests, and real-world applications.
What is hypothesis testing?
- In Layman's Terms
- An assumption or statement that can be tested as true or false.
- In Technical Terms
- A statistical test used to determine whether we can draw a conclusion about a population from a sample.
- Although hypothesis testing is usually discussed in terms of a population and a sample, it extends beyond that.
- For example, it can be used to compare the characteristics of two or more groups, or of two or more individuals.
- A statistical test used to determine whether two or more groups have similar parameters or characteristics. In short, it checks whether the groups are similar or whether there is a significant difference between them.
Key Components of Hypothesis Testing
- Null Hypothesis (H0):
- In most cases this hypothesis states similarity or equality.
- The similarity can be between a population and a sample.
- Two or more groups might have the same characteristics.
- The similarity can be between two individuals.
- Average score in Mathematics = average score in English.
- Sample mean = population mean.
- Sample mean = 5.
- Alternative Hypothesis (H1 or Ha):
- In most cases this hypothesis states dissimilarity or inequality. It is either one-sided or two-sided.
- One-sided examples:
- Sample mean is greater than population mean.
- Sample mean is less than population mean.
- Sample mean < 5.
- Two-sided examples:
- Sample mean is either greater than or less than population mean.
- Sample mean is not equal to population mean.
- Sample mean ≠ 5.
- Sample mean > 5 or sample mean < 5.
- Significance Level (α):
- This is the threshold for rejecting the null hypothesis. It is typically set at 0.05 and is used together with the p-value.
- Test Statistic:
- This is a standardized value calculated from sample data, used to determine whether to reject the null hypothesis. Examples include t-scores, z-scores, and chi-square statistics.
- P-Value:
- This is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. A low p-value (≤ α) indicates strong evidence against the null hypothesis. It is used together with the significance level (α).
- Decision:
- Based on the p-value and the significance level, you decide whether to reject or fail to reject the null hypothesis.
- If p-value ≤ significance level (α), reject the null hypothesis.
- If p-value > significance level (α), fail to reject the null hypothesis. This is not the same as accepting it: the data simply do not provide enough evidence against it.
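To see how these components fit together, here is a minimal sketch of a one-sample z-test using only the Python standard library. The helper name `one_sample_z_test`, its `alternative` argument, and all the numbers are assumptions made for illustration, and the sketch assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for H0: population mean == mu0, with known sigma."""
    # Test statistic: distance of the sample mean from mu0 in standard errors
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "two-sided":    # H1: mean != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "greater":    # H1: mean > mu0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # H1: mean < mu0
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p_value


# Hypothetical numbers: H0 says the mean is 5, the observed sample mean is 5.4
alpha = 0.05
z, p_value = one_sample_z_test(sample_mean=5.4, mu0=5.0, sigma=1.2, n=36)
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f}, decision: {decision}")
```

Switching `alternative` to `"greater"` or `"less"` turns the same calculation into a one-sided test, which only counts evidence in one direction.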
Why do we need hypothesis testing?
- Hypothesis testing is a fundamental tool in statistics that helps us make informed decisions based on limited data.
- It lets us evaluate how plausible a hypothesis is by analyzing the sample data and assessing whether the observed pattern could be due to random chance.
- It helps us judge whether the observations are more consistent with chance or with a true underlying effect.
- We rarely have data for the entire population. Hypothesis testing helps us draw conclusions about the population by analyzing a smaller sample.
- It helps reduce the influence of personal bias and subjectivity.
- It helps us make decisions based on evidence rather than intuition or guesswork.
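One way to make the idea of "could this be due to random chance?" concrete is a permutation test. The sketch below is only an illustration: the function name `permutation_p_value` and the two groups of scores are made up, and it uses the standard library alone. It shuffles the group labels many times and counts how often chance alone produces a difference in means at least as large as the one observed.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the labels are interchangeable, so reassign them at random
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical test scores for two groups of students
group_a = [82, 85, 88, 90, 91, 93, 95, 96]
group_b = [70, 72, 75, 78, 80, 81, 84, 86]
p_value = permutation_p_value(group_a, group_b)
print(f"Estimated p-value: {p_value:.4f}")
```

A small estimated p-value means that random relabeling rarely reproduces a gap this large, so chance alone is an unlikely explanation.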
How to formulate hypotheses?
- Null Hypothesis (H₀): A statement of no effect, no difference, or similarity.
- Alternative Hypothesis (H₁): A statement indicating the presence of an effect, a difference, or dissimilarity.
Steps in Hypothesis Testing:
- Step 1: State the Hypotheses: Formulate the null and alternative hypotheses.
- For example, if you are testing whether a new medicine is more effective than the current one, your hypotheses might be:
- H0: The new medicine is not more effective than the current medicine.
- H1: The new medicine is more effective than the current medicine.
- Step 2: Choose the Significance Level: Select your α level, commonly set at 0.05. This represents a 5% risk of rejecting the null hypothesis when it is actually true.
- Step 3: Collect Data and Calculate the Test Statistic: Gather your sample data and calculate the appropriate test statistic. For example, if you are comparing means with a small sample (a common rule of thumb is fewer than 30 observations) and the population standard deviation is unknown, you might use a t-test.
- Step 4: Determine the P-Value: Using statistical software or tables, find the p-value associated with your test statistic. Standard tables are available for the t-test and the z-test.
- Step 5: Make a Decision: If the p-value ≤ α, reject the null hypothesis (suggesting the alternative hypothesis may be true). If the p-value > α, fail to reject the null hypothesis (not enough evidence to support the alternative hypothesis).
- Step 6: Interpret the Results: Draw conclusions based on your decision. If you rejected the null hypothesis, you might conclude that the new medicine is more effective. If you failed to reject it, you might conclude that there is insufficient evidence to support the claim that the new medicine is more effective.
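The six steps can be sketched in code for the medicine example. This is a rough illustration, not a prescribed method: the recovery counts are invented, and a one-sided two-proportion z-test is assumed as the test, since it suits "recovered or not" outcomes with reasonably large groups.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: state the hypotheses (one-sided)
#   H0: the new medicine's recovery rate is not higher than the current one's
#   H1: the new medicine's recovery rate is higher

# Step 2: choose the significance level
alpha = 0.05

# Step 3: collect data (hypothetical counts) and compute the test statistic
new_recovered, new_total = 130, 200
cur_recovered, cur_total = 110, 200
p_new = new_recovered / new_total
p_cur = cur_recovered / cur_total
# Pooled recovery rate, used because H0 treats both groups as having one rate
p_pooled = (new_recovered + cur_recovered) / (new_total + cur_total)
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / new_total + 1 / cur_total))
z = (p_new - p_cur) / std_error

# Step 4: determine the p-value (upper tail, because H1 is "greater than")
p_value = 1 - NormalDist().cdf(z)

# Step 5: make a decision
reject_h0 = p_value <= alpha

# Step 6: interpret the result
if reject_h0:
    conclusion = "evidence that the new medicine is more effective"
else:
    conclusion = "insufficient evidence that the new medicine is more effective"
print(f"z = {z:.3f}, p-value = {p_value:.4f}: {conclusion}")
```

With these made-up counts the p-value falls below 0.05, so the null hypothesis is rejected. With smaller groups or a smaller gap in recovery rates, the same code would report insufficient evidence instead.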
Where can hypothesis testing be used?
- Medical Research
- Testing the effectiveness of a new medicine. Researchers might test whether the new medicine leads to a statistically significant improvement in patient outcomes compared to a placebo.
- Psychology
- Studying the impact of a cognitive-behavioral therapy (CBT) program on reducing symptoms of depression. Hypothesis testing can determine if there is a significant difference in depression scores before and after the therapy.
- Education
- Evaluating a new teaching method. Researchers might test whether students taught using a new interactive method perform better on standardized tests compared to those taught using traditional methods.
- Business and Marketing
- Analyzing the effect of a new marketing campaign on sales. A company can test if there is a significant increase in sales after the implementation of the campaign compared to the previous period.
- Economics
- Assessing the impact of a policy change on unemployment rates. Economists might test whether a new economic policy has significantly reduced unemployment compared to the period before the policy was implemented.
- Environmental Science
- Evaluating the effect of pollution control measures. Researchers might test whether levels of a specific pollutant in a river have significantly decreased after the implementation of new regulations.
- Engineering
- Quality control in manufacturing. Engineers can test if the dimensions of parts produced by a new machine differ significantly from the specified tolerances.
- Sociology
- Studying the effect of social interventions. Sociologists might test whether a community program aimed at reducing crime has led to a significant decrease in crime rates.
- Agriculture
- Testing the effectiveness of a new fertilizer. Agronomists might test whether crops treated with the new fertilizer yield significantly higher outputs than those treated with the standard fertilizer.
- Finance
- Assessing investment strategies. Financial analysts can test whether a new trading algorithm significantly outperforms the market average over a certain period.
- Public Health
- Evaluating the impact of a vaccination campaign. Public health officials might test if there is a significant reduction in the incidence of a disease following a mass vaccination campaign.
- Sports Science
- Testing training programs. Sports scientists might test whether athletes using a new training regimen show significant improvements in performance metrics compared to those using a conventional regimen.
- Pharmaceuticals
- Medicine development and clinical trials. Hypothesis testing is used extensively to determine if new medications are more effective than existing treatments or placebos.
- Anthropology
- Examining cultural practices. Anthropologists might test hypotheses related to the impact of specific cultural practices on social behavior within a community.
Common Types of Hypothesis Tests
- One-sample t-test: Tests if the mean of a single sample is equal to a known value.
- Two-sample t-test: Tests if the means of two independent samples are equal.
- Paired t-test: Tests if the means of two related samples are equal.
- One-sample z-test: Compares the mean of a single sample to a hypothesized value, typically when the population standard deviation is known or the sample is large. Use it to check whether your sample mean is significantly different from a specific value you have in mind.
- Two-sample z-test: This category deals with comparing the means of two groups. Here, we have two subcategories:
- Paired z-test (dependent samples): This is used when you have measurements from the same subjects or units at two different times or under two different conditions. For instance, you might test the weight of individuals before and after a diet program.
- Independent z-test (unrelated samples): This is used when you have two independent groups and want to compare their means. There’s no connection between the samples, unlike paired data. An example could be comparing the average height of men and women.
- Chi-square test: Tests the independence of categorical variables.
- ANOVA: Tests if there are differences between the means of three or more groups.
- Mann-Whitney U Test (Wilcoxon Rank-Sum Test): Used to compare differences between two independent samples when the data does not necessarily follow a normal distribution.
- Wilcoxon Signed-Rank Test: Used to compare two related samples when the data does not necessarily follow a normal distribution.
- Kruskal-Wallis H Test: A non-parametric version of ANOVA used when comparing more than two samples that do not follow a normal distribution.
- Friedman Test: A non-parametric version of repeated measures ANOVA used for comparing more than two related samples.
- Shapiro-Wilk Test: Tests the null hypothesis that the data was drawn from a normal distribution.
- Kolmogorov-Smirnov Test: Tests if a sample comes from a particular distribution.
- Anderson-Darling Test: Another test for checking if a sample comes from a specific distribution, commonly used for normality testing.
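As one concrete instance from the list above, here is a minimal sketch of a chi-square test of independence for a 2x2 table, written with the standard library only. The function name `chi_square_2x2` and the counts are hypothetical. The sketch applies no continuity correction and assumes every expected count is greater than zero. The p-value formula relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so it applies to 2x2 tables only.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Return (chi2, p_value) for a 2x2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = teaching method, columns = passed / failed
table = [[30, 20],
         [15, 35]]
chi2, p_value = chi_square_2x2(table)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```

For larger tables or small expected counts, a statistics library routine is a better choice than this hand-rolled version.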