Epidemiology Unit 5 – Causation and Causal Inference

Causation and causal inference are crucial concepts in epidemiology, helping researchers understand how exposures affect health outcomes. These tools allow scientists to determine if relationships between variables are truly causal or merely associative, guiding public health interventions and policy decisions. From key concepts like counterfactual outcomes to study designs like randomized controlled trials, this unit covers the methods used to establish causality. It explores confounding, bias, and statistical techniques that help researchers draw accurate conclusions about cause-and-effect relationships in complex health scenarios.

Key Concepts and Definitions

  • Causation is a relationship between an exposure and an outcome in which the exposure is responsible for producing the outcome, so that changing the exposure would change the outcome
  • Causal inference involves using data and analytical methods to determine the causal effect of an exposure on an outcome
  • Counterfactual outcomes consider what would have happened to an exposed group if they had not been exposed and compare that to the actual observed outcome
  • Causal criteria (Hill's criteria) provide guidelines for evaluating evidence to determine if an observed association is causal
    • Includes strength, consistency, specificity, temporality, biological gradient, plausibility, coherence, experiment, and analogy
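  • Causal effect is the difference (or ratio) in outcomes that would be seen if the same population were exposed versus unexposed; in practice it is estimated by comparing exposed and unexposed groups while accounting for potential confounders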
  • Directed acyclic graphs (DAGs) visually represent the causal relationships between variables and help identify potential confounders and biases
  • Potential outcomes framework compares the outcomes of the same individual under different exposure conditions, only one of which can actually be observed for any given individual
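
The gap between a causal effect and an observed association can be made concrete with a toy table of potential outcomes. The sketch below uses six invented individuals; the field names `y1`, `y0`, and `a` are illustrative assumptions, and in real data only one of the two potential outcomes is ever observed for each person.

```python
# Hypothetical potential outcomes for six individuals (invented data).
# y1: outcome if exposed, y0: outcome if unexposed, a: exposure actually received.
people = [
    {"y1": 1, "y0": 0, "a": 1},
    {"y1": 1, "y0": 1, "a": 1},
    {"y1": 0, "y0": 0, "a": 0},
    {"y1": 1, "y0": 0, "a": 0},
    {"y1": 1, "y0": 1, "a": 1},
    {"y1": 0, "y0": 0, "a": 0},
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def average_causal_effect(rows):
    """Risk if everyone were exposed minus risk if no one were exposed.

    Uses both potential outcomes, so it cannot be computed from real data.
    """
    return mean(r["y1"] for r in rows) - mean(r["y0"] for r in rows)


def observed_risk_difference(rows):
    """Risk among the exposed minus risk among the unexposed.

    Uses only the outcome each person actually experienced.
    """
    exposed = [r["y1"] for r in rows if r["a"] == 1]
    unexposed = [r["y0"] for r in rows if r["a"] == 0]
    return mean(exposed) - mean(unexposed)


print(round(average_causal_effect(people), 3))     # 0.333
print(round(observed_risk_difference(people), 3))  # 1.0
```

In this invented table the observed difference (1.0) overstates the average causal effect (about 0.33) because the people who happened to be exposed were already at higher risk, which is exactly the situation that the methods in the rest of this unit try to handle.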

Types of Causal Relationships

  • Direct causation occurs when an exposure affects an outcome without any intermediary variables in the causal model (smoking causing lung cancer); whether a cause counts as direct depends on how finely the mechanism is modeled
  • Indirect causation involves the exposure affecting the outcome through one or more intermediary variables (obesity leading to diabetes which increases the risk of heart disease)
  • Bidirectional causation happens when two variables cause each other (poverty and poor health outcomes)
  • Reverse causation occurs when the outcome precedes and causes the apparent exposure (undiagnosed cancer causing weight loss, rather than weight loss causing cancer)
  • Spurious association appears as a causal relationship between two variables but is actually due to a third variable causing both (ice cream sales and drowning rates both increase in summer due to hot weather)
  • Effect modification happens when the effect of an exposure on an outcome varies depending on the level of a third variable (age modifying the effect of smoking on lung cancer risk)
  • Sufficient cause is a cause that inevitably produces the outcome, while a necessary cause is a cause that must be present for the outcome to occur
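
Effect modification is usually detected by computing the effect measure separately within each level of the third variable. The sketch below does this for the risk ratio using invented cohort counts; the age groups and numbers are assumptions chosen only to make the pattern visible.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort counts by age group (not real data):
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "under 50": (20, 1000, 10, 1000),
    "50 and over": (120, 1000, 20, 1000),
}

stratum_rr = {age: risk_ratio(*counts) for age, counts in strata.items()}

for age, rr in stratum_rr.items():
    print(f"{age}: RR = {rr:.1f}")
# under 50: RR = 2.0
# 50 and over: RR = 6.0
```

Because the risk ratio is 2.0 in one age group and 6.0 in the other, age modifies the effect of the exposure on the risk-ratio scale in this toy example. Reporting a single pooled estimate would hide that difference.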

Causal Models and Diagrams

  • Structural causal models (SCMs) use mathematical equations to represent the causal relationships between variables
    • SCMs allow for the estimation of causal effects and the testing of causal assumptions
  • Directed acyclic graphs (DAGs) are visual representations of the causal relationships between variables
    • DAGs consist of nodes representing variables and directed edges representing causal relationships
    • DAGs help identify potential confounders, mediators, and colliders
  • Causal diagrams can be used to determine which variables need to be controlled for in order to estimate the causal effect of an exposure on an outcome
  • Confounding can be represented in a DAG as a common cause of both the exposure and the outcome
  • Mediation occurs when the effect of an exposure on an outcome is partially or fully explained by an intermediate variable
  • Collider bias can occur when conditioning on a common effect of both the exposure and the outcome, creating a spurious association
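
Collider bias can be shown with a very small structural causal model. In the sketch below, the exposure A and the outcome Y are independent by construction, and C is their common effect (A → C ← Y). The specific structural equations and probabilities are assumptions made for illustration.

```python
from itertools import product

# Assumed structural model:
#   A ~ Bernoulli(0.5) and Y ~ Bernoulli(0.5), independent of each other
#   C = 1 if A == 1 or Y == 1   (C is a collider on the path A -> C <- Y)


def joint_distribution():
    """Return {(a, y, c): probability} implied by the structural model."""
    dist = {}
    for a, y in product([0, 1], repeat=2):
        c = 1 if (a == 1 or y == 1) else 0
        dist[(a, y, c)] = 0.25  # each (a, y) pair has probability 0.5 * 0.5
    return dist


def prob_y_given(dist, a, c=None):
    """P(Y = 1 | A = a), optionally also conditioning on C = c."""
    keep = {k: p for k, p in dist.items()
            if k[0] == a and (c is None or k[2] == c)}
    total = sum(keep.values())
    return sum(p for (_, y, _), p in keep.items() if y == 1) / total


dist = joint_distribution()

# Without conditioning on C, exposure tells us nothing about the outcome.
print(prob_y_given(dist, a=1), prob_y_given(dist, a=0))            # 0.5 0.5

# Restricting to C = 1 creates an association that is not causal.
print(prob_y_given(dist, a=1, c=1), prob_y_given(dist, a=0, c=1))  # 0.5 1.0
```

Among records with C = 1, every unexposed person must have the outcome (otherwise C would be 0), so the exposure looks protective even though it has no effect on Y in this model.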

Confounding and Bias

  • Confounding occurs when a third variable is associated with both the exposure and the outcome, and is not on the causal pathway between them, distorting the estimate of the true causal relationship
    • Confounders can create spurious associations or mask true causal effects
  • Selection bias happens when the way participants are selected into, or retained in, the study is related to both the exposure and the outcome, so that the association in the study differs from that in the target population
  • Information bias arises when the exposure or outcome data are inaccurately measured or classified, so that participants are misclassified and the effect estimate is distorted
  • Recall bias occurs when participants' ability to accurately report past exposures or outcomes is influenced by their current health status or other factors
  • Immortal time bias can occur in cohort studies when a period of follow-up time during which the outcome cannot occur is incorrectly attributed to the exposed group
  • Confounding by indication happens when the indication for receiving the exposure is also associated with the outcome, creating a spurious association
  • Strategies to address confounding include randomization, restriction, matching, stratification, and adjustment in statistical analyses
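
Stratification, one of the strategies listed above, can be sketched with invented counts in which a binary confounder is more common among the exposed and also raises the risk of the outcome. The Mantel-Haenszel pooled risk ratio used here is one common way to combine strata; it is not named in this unit and is shown only as an example of adjustment.

```python
# Hypothetical counts per confounder stratum (not real data):
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "confounder present": (160, 800, 40, 200),
    "confounder absent": (10, 200, 40, 800),
}


def risk_ratio(a, n1, c, n0):
    """Risk in the exposed (a / n1) divided by risk in the unexposed (c / n0)."""
    return (a / n1) / (c / n0)


def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return risk_ratio(a, n1, c, n0)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return num / den


print(f"crude RR: {crude_risk_ratio(strata):.3f}")  # crude RR: 2.125
for name, counts in strata.items():
    print(f"{name}: RR = {risk_ratio(*counts):.1f}")  # 1.0 in both strata
print(f"adjusted (M-H) RR: {mantel_haenszel_rr(strata):.1f}")  # 1.0
```

In this toy example the crude risk ratio of about 2.1 is entirely due to confounding: within each stratum the exposure makes no difference, and the pooled estimate returns to 1.0.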

Study Designs for Causal Inference

  • Randomized controlled trials (RCTs) randomly assign participants to exposure groups, so that potential confounders, both measured and unmeasured, are balanced between groups on average
    • RCTs are considered the gold standard for causal inference but may not always be feasible or ethical
  • Cohort studies follow exposed and unexposed groups over time to compare the incidence of the outcome
    • Prospective cohort studies enroll participants and measure exposure before the outcome occurs, while retrospective cohort studies reconstruct the cohort from existing records after the outcomes have already occurred
  • Case-control studies compare the exposure history of cases (those with the outcome) to controls (those without the outcome)
    • Case-control studies are useful for rare outcomes but are prone to selection and recall bias
  • Cross-sectional studies measure the exposure and outcome at a single point in time, making it difficult to establish temporal relationships
  • Natural experiments take advantage of naturally occurring variations in exposure to estimate causal effects (comparing outcomes before and after a policy change)
  • Quasi-experimental designs, such as difference-in-differences and regression discontinuity, use non-random assignment to exposure groups to estimate causal effects
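
The study design determines which effect measure can be calculated. A cohort study yields risks and therefore a risk ratio. A case-control study fixes the numbers of cases and controls, so risks cannot be computed directly and the odds ratio is typically reported instead. The sketch below computes both from a 2x2 table of invented counts.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a cohort 2x2 table.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d)/(b*c), the usual measure in case-control studies."""
    return (a * d) / (b * c)


# Hypothetical cohort with a rare outcome (not real data).
a, b, c, d = 30, 970, 10, 990

print(f"risk ratio: {risk_ratio(a, b, c, d):.2f}")  # risk ratio: 3.00
print(f"odds ratio: {odds_ratio(a, b, c, d):.2f}")  # odds ratio: 3.06
```

When the outcome is rare, as in this example, the odds ratio is close to the risk ratio, which is one reason case-control studies are well suited to rare outcomes.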

Statistical Methods for Causal Analysis

  • Propensity score methods estimate the probability of receiving the exposure based on observed covariates and use this score to balance the exposure groups
    • Propensity score matching pairs exposed and unexposed individuals with similar propensity scores
    • Propensity score stratification groups individuals into strata based on their propensity scores and estimates the causal effect within each stratum
  • Instrumental variable analysis estimates the causal effect using a variable (the instrument) that is associated with the exposure, affects the outcome only through the exposure, and shares no unmeasured causes with the outcome
    • Instrumental variables help address unmeasured confounding but require strong assumptions
  • Difference-in-differences estimators compare the change in outcomes over time between exposed and unexposed groups, assuming that the two groups would have followed parallel trends in the absence of the exposure (so confounders that are constant over time cancel out)
  • Regression discontinuity designs estimate the causal effect by comparing outcomes just above and below a threshold for exposure assignment
  • Mediation analysis decomposes the total effect of an exposure on an outcome into direct and indirect effects through a mediator variable
  • Sensitivity analyses assess the robustness of causal effect estimates to potential unmeasured confounding or violations of assumptions
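
Of the methods above, difference-in-differences is the simplest to write down as arithmetic. The sketch below uses invented group means; the function and argument names are illustrative, and the result is only a causal estimate if the comparison group's trend is a fair stand-in for what the exposed group would have experienced without the exposure.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """(Change in the exposed group) minus (change in the comparison group)."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean outcomes before and after a policy change
# (for example, sugary drinks bought per person per week).
effect = difference_in_differences(
    treated_pre=10.0, treated_post=7.0,   # region with the policy
    control_pre=9.0, control_post=8.5,    # region without the policy
)

print(effect)  # -2.5
```

The exposed region fell by 3.0 and the comparison region by 0.5, so 2.5 of the decline is attributed to the policy in this toy example. The background change of 0.5 is treated as what would have happened anyway.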

Challenges and Limitations

  • Unmeasured confounding can bias causal effect estimates if important confounders are not accounted for in the analysis
  • Measurement error in the exposure or outcome can lead to biased estimates of the causal effect
  • Violations of the positivity assumption occur when, for some levels of the confounders, there are no exposed individuals or no unexposed individuals, making it difficult to estimate the causal effect in those strata
  • Generalizability of causal effect estimates may be limited if the study population is not representative of the target population
  • Causal inference methods rely on strong assumptions, such as exchangeability, consistency, and positivity, which may not always hold in practice
  • Ethical considerations may limit the use of certain study designs or the ability to randomize exposure
  • Complexity of real-world causal relationships, such as time-varying exposures and outcomes, can make causal inference challenging
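
Of the assumptions listed above, positivity is the one that can be partly checked in the data. The sketch below flags confounder strata that contain only exposed or only unexposed individuals. The function name and the record layout are assumptions, and the check only detects empty cells in the sample at hand, not structural reasons why a group could never be exposed.

```python
from collections import defaultdict


def positivity_violations(records):
    """Return confounder strata in which everyone is exposed or everyone is unexposed.

    `records` is an iterable of (stratum, exposed) pairs, where `exposed` is 0 or 1.
    """
    seen = defaultdict(set)
    for stratum, exposed in records:
        seen[stratum].add(exposed)
    # A stratum supports a comparison only if both exposure levels appear in it.
    return sorted(s for s, levels in seen.items() if levels != {0, 1})


# Hypothetical records (not real data).
records = [
    ("age under 40", 0), ("age under 40", 1),
    ("age 40-64", 1), ("age 40-64", 0),
    ("age 65+", 1), ("age 65+", 1),
]

print(positivity_violations(records))  # ['age 65+']
```

Here no one aged 65 or over is unexposed, so the data contain no information about what the unexposed outcome would be in that age group, and any estimate for that stratum would rest on extrapolation.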

Real-World Applications

  • Estimating the causal effect of a new drug on patient outcomes in clinical trials
  • Evaluating the impact of public health interventions, such as smoking cessation programs, on population health
  • Assessing the causal relationship between environmental exposures and health outcomes in environmental epidemiology
  • Investigating the causal effects of social determinants of health, such as education and income, on health disparities
  • Analyzing the effectiveness of policy interventions, such as taxes on sugar-sweetened beverages, on health outcomes
  • Studying the causal impact of lifestyle factors, such as diet and physical activity, on chronic disease risk
  • Evaluating the causal effects of healthcare delivery systems and quality improvement initiatives on patient outcomes
  • Examining the causal relationships between occupational exposures and workers' health in occupational epidemiology


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.