
12.2 Multiple comparisons and post-hoc tests

3 min read • August 7, 2024

When conducting multiple statistical tests, the risk of false positives increases. Multiple comparison corrections help control this risk by adjusting significance levels. These methods keep overall error rates within acceptable limits, maintaining the integrity of research findings.

Post-hoc tests come into play after finding significant effects in analyses like ANOVA. They allow for pairwise comparisons between group means, helping researchers pinpoint specific differences. Various post-hoc tests exist, each with unique strengths for different research scenarios.

Multiple Comparison Corrections

Controlling False Positives

  • Family-wise error rate (FWER) represents the probability of making at least one Type I error (false positive) among all hypotheses tested
  • FWER increases as the number of hypotheses tested increases, leading to a higher chance of obtaining false positives
  • Multiple comparison corrections aim to control the FWER by adjusting the significance level (α) for each individual hypothesis test
  • Controlling FWER ensures that the overall Type I error rate is maintained at the desired level (usually 0.05) across all comparisons
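
For $m$ independent tests each run at level $\alpha$, the family-wise error rate is $1 - (1 - \alpha)^m$, which grows quickly with $m$. A minimal Python sketch (illustrative numbers only):

```python
# Family-wise error rate for m independent tests, each at level alpha:
# FWER = 1 - (1 - alpha)^m
alpha = 0.05

for m in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m
    print(f"m = {m:2d} tests -> FWER = {fwer:.3f}")
```

At $\alpha = 0.05$, just 10 uncorrected tests push the chance of at least one false positive to roughly 40%, which is why corrections are needed.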

Bonferroni and Holm-Bonferroni Corrections

  • The Bonferroni correction is a simple and conservative method for controlling FWER
    • Divides the desired overall significance level (α) by the number of hypotheses tested (m) to obtain the adjusted significance level for each individual test: $\alpha_{adjusted} = \frac{\alpha}{m}$
    • Ensures that the FWER is controlled at the desired level, but may be overly conservative, leading to reduced statistical power (increased Type II error rate)
  • The Holm-Bonferroni correction is a step-down procedure that improves upon the Bonferroni correction
    • Orders the p-values from smallest to largest and compares each p-value to a sequentially adjusted significance level: $\alpha_{adjusted, i} = \frac{\alpha}{m - i + 1}$, where $i$ is the rank of the p-value
    • Offers more power than the Bonferroni correction while still controlling FWER
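
The two corrections can be sketched side by side. The p-values below are made up for illustration; the 0.011 entry is chosen to show Holm rejecting a hypothesis that Bonferroni misses:

```python
# Bonferroni vs Holm-Bonferroni on hypothetical p-values (alpha = 0.05).
alpha = 0.05
p_values = [0.001, 0.011, 0.020, 0.030, 0.250]
m = len(p_values)

# Bonferroni: compare every p-value to the single threshold alpha / m.
bonferroni = [p <= alpha / m for p in p_values]

# Holm: sort ascending; compare the i-th smallest (i = 1..m) to
# alpha / (m - i + 1); stop rejecting at the first failure.
holm = [False] * m
order = sorted(range(m), key=lambda k: p_values[k])
for i, k in enumerate(order, start=1):
    if p_values[k] <= alpha / (m - i + 1):
        holm[k] = True
    else:
        break

print("Bonferroni rejects:", bonferroni)  # only p = 0.001 (threshold 0.01)
print("Holm rejects:      ", holm)        # also p = 0.011 (threshold 0.0125)
```

Here Bonferroni compares 0.011 against 0.05/5 = 0.01 and fails to reject, while Holm compares it against 0.05/4 = 0.0125 and rejects, illustrating Holm's extra power.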

False Discovery Rate

  • False discovery rate (FDR) is an alternative approach to multiple comparison corrections that controls the expected proportion of false positives among all significant results
  • FDR is less conservative than FWER control methods and provides a better balance between Type I and Type II errors
  • The Benjamini-Hochberg procedure is a popular method for controlling FDR
    • Orders the p-values from smallest to largest and compares each p-value to a sequentially adjusted threshold: $\frac{i}{m} \times \alpha$, where $i$ is the rank of the p-value and $m$ is the total number of hypotheses tested
    • Identifies the largest p-value that satisfies the condition and declares all hypotheses with smaller or equal p-values as significant
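
A sketch of the step-up procedure on hypothetical p-values. Note the third-smallest p-value fails its own threshold yet is still rejected, because a larger-ranked p-value satisfies the condition:

```python
# Benjamini-Hochberg FDR procedure on hypothetical p-values (alpha = 0.05).
alpha = 0.05
p_values = [0.001, 0.008, 0.019, 0.021, 0.028, 0.045, 0.074, 0.205]
m = len(p_values)

# Sort ascending; find the LARGEST rank i with p_(i) <= (i/m) * alpha.
order = sorted(range(m), key=lambda k: p_values[k])
cutoff_rank = 0
for i, k in enumerate(order, start=1):
    if p_values[k] <= (i / m) * alpha:
        cutoff_rank = i

# Reject every hypothesis whose rank is at or below the cutoff.
rejected = [False] * m
for i, k in enumerate(order, start=1):
    if i <= cutoff_rank:
        rejected[k] = True

print("BH rejects:", rejected)
```

Rank 3 (p = 0.019) exceeds its threshold (3/8 × 0.05 = 0.01875), but rank 5 (p = 0.028) clears 5/8 × 0.05 = 0.03125, so the five smallest p-values are all declared significant; a Bonferroni threshold of 0.05/8 = 0.00625 would have rejected only one.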

Post-hoc Tests

Pairwise Comparisons

  • Post-hoc tests are used to make pairwise comparisons between group means after a significant overall effect has been found in an ANOVA
  • Pairwise comparisons involve testing the differences between all possible pairs of group means
  • Multiple comparison corrections are often applied to control the FWER or FDR when conducting pairwise comparisons
  • Common post-hoc tests for pairwise comparisons include Tukey's HSD test, Scheffe's test, and Dunnett's test
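
The number of pairwise comparisons grows quadratically: $k$ groups yield $\frac{k(k-1)}{2}$ pairs, which is why corrections matter even for moderately sized designs. A quick sketch:

```python
from itertools import combinations

# k groups yield k*(k-1)/2 pairwise comparisons.
for k in (3, 4, 6, 10):
    groups = [f"group{j}" for j in range(1, k + 1)]
    pairs = list(combinations(groups, 2))
    print(f"{k} groups -> {len(pairs)} pairwise comparisons")
```

Ten groups already require 45 comparisons, so an uncorrected α of 0.05 would make several false positives likely.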

Tukey's HSD and Scheffe's Tests

  • Tukey's Honest Significant Difference (HSD) test is a widely used post-hoc test for pairwise comparisons
    • Computes a critical value based on the studentized range distribution, which depends on the number of groups and the degrees of freedom for the error term
    • Controls the FWER for all pairwise comparisons and is more powerful than the Bonferroni correction when the number of groups is large
  • Scheffe's test is another post-hoc test that can be used for pairwise comparisons and complex contrasts
    • Uses the F-distribution to compute a critical value and is more conservative than Tukey's HSD test
    • Offers simultaneous confidence intervals for all possible contrasts, making it flexible for testing any linear combination of means
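
As one concrete route, SciPy (1.11 or later) ships a `tukey_hsd` function. The sketch below assumes that version is available and uses simulated group data, so the numbers are illustrative only:

```python
import numpy as np
from scipy.stats import tukey_hsd  # requires scipy >= 1.11 (assumed available)

# Simulated measurements for three groups; the third group's mean is shifted.
rng = np.random.default_rng(0)
a = rng.normal(10.0, 1.0, size=30)
b = rng.normal(10.2, 1.0, size=30)
c = rng.normal(12.0, 1.0, size=30)

# tukey_hsd runs all pairwise comparisons while controlling the FWER.
result = tukey_hsd(a, b, c)
print(result)  # table of pairwise mean differences and adjusted p-values
```

With a two-standard-deviation shift between the first and third groups, that comparison should come out significant while the near-identical first two groups typically do not.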

Dunnett's Test

  • Dunnett's test is a specialized post-hoc test used when comparing several treatment groups to a single control group
  • Computes a critical value based on Dunnett's distribution, which accounts for the correlation between the comparisons to the control group
  • Controls the FWER for the comparisons between each treatment group and the control group
  • Useful in experiments where the main interest lies in comparing treatments to a control (e.g., drug trials comparing different doses to a placebo)
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

