Common Probability Distributions to Know for Biostatistics

Understanding common probability distributions is key in biostatistics and probabilistic methods. These distributions model real-world phenomena, from measurements that cluster around a mean to counts of rare events, and they underpin the tests and estimates used for analysis and decision-making in health research and other fields.

  1. Normal (Gaussian) Distribution

    • Symmetrical, bell-shaped curve characterized by its mean (µ) and standard deviation (σ).
    • Approximately 68% of data falls within one standard deviation of the mean, 95% within two, and 99.7% within three (Empirical Rule).
    • Central to many statistical methods due to the Central Limit Theorem, which states that the standardized sum (or mean) of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed.
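
The Empirical Rule percentages above can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `prob_within` is an illustrative choice, not part of any library.

```python
from statistics import NormalDist


def prob_within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for X ~ Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Probability mass within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} sd: {prob_within(k):.4f}")
```

Because the rule is stated in units of σ, the result does not depend on the particular µ and σ passed in.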
  2. Binomial Distribution

    • Models the number of successes in a fixed number of independent Bernoulli trials (e.g., coin flips).
    • Defined by two parameters: the number of trials (n) and the probability of success (p).
    • Useful for calculating probabilities of discrete outcomes, such as the likelihood of getting a certain number of heads in a series of coin tosses.
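
As a sketch of the coin-toss calculation mentioned above, the binomial probability mass function P(X = k) = C(n, k) p^k (1 - p)^(n - k) can be written directly with `math.comb`. The function name `binomial_pmf` and the choice of 10 tosses are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 5 heads in 10 fair coin tosses.
p_five_heads = binomial_pmf(5, 10, 0.5)

# Probability of at most 2 heads: sum the pmf over k = 0, 1, 2.
p_at_most_two = sum(binomial_pmf(k, 10, 0.5) for k in range(3))

print(p_five_heads, p_at_most_two)
```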
  3. Poisson Distribution

    • Describes the number of events occurring in a fixed interval of time or space, given a known average rate (λ) and independence of events.
    • Particularly useful for modeling rare events, such as the number of phone calls received at a call center in an hour.
    • The mean and variance of a Poisson distribution are both equal to λ.
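
The equality of mean and variance can be verified numerically from the probability mass function P(X = k) = e^(-λ) λ^k / k!. This is a small sketch; the rate of 3 events per interval and the truncation of the infinite sum at k = 99 are assumptions chosen for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam = 3.0         # assumed average rate per interval
ks = range(100)   # truncate the infinite sum; the tail beyond 99 is negligible here

mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(mean, variance)  # both should be very close to lam
```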
  4. Exponential Distribution

    • Models the time between events in a Poisson process, characterized by the rate parameter (λ).
    • Memoryless property: the probability of an event occurring in the next time interval is independent of how much time has already elapsed.
    • Commonly used in survival analysis and reliability engineering to model lifetimes of objects or time until an event occurs.
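
The memoryless property can be illustrated with the survival function S(t) = P(T > t) = e^(-λt): the conditional probability P(T > s + t | T > s) equals S(s + t) / S(s), which simplifies to S(t). A minimal sketch, with illustrative function names and an assumed rate of 0.5 events per unit time:

```python
from math import exp


def survival(t: float, rate: float) -> float:
    """S(t) = P(T > t) for T ~ Exponential(rate)."""
    return exp(-rate * t)


def conditional_survival(s: float, t: float, rate: float) -> float:
    """P(T > s + t | T > s): chance of lasting t more units after surviving s."""
    return survival(s + t, rate) / survival(s, rate)


rate = 0.5  # assumed rate
# Having already waited 5 units does not change the chance of waiting 2 more.
print(conditional_survival(5.0, 2.0, rate), survival(2.0, rate))
```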
  5. Chi-Square Distribution

    • The distribution of the sum of the squares of k independent standard normal random variables, where k is the number of degrees of freedom; used primarily in hypothesis testing and confidence interval estimation.
    • Commonly applied in tests of independence and goodness-of-fit tests in categorical data analysis.
    • The shape of the distribution depends on the degrees of freedom (df), with more degrees of freedom resulting in a distribution that approaches normality.
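
The sum-of-squares definition lends itself to a quick simulation. The sketch below draws chi-square values by squaring and summing k standard normal draws, then compares the sample mean and variance with the theoretical values k and 2k. The seed, k = 3, and the number of draws are arbitrary choices for illustration.

```python
import random
from statistics import fmean, pvariance


def chi_square_draw(k: int, rng: random.Random) -> float:
    """One chi-square(k) draw: sum of squares of k standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed so the run is repeatable
k = 3                   # degrees of freedom (assumed for the example)
draws = [chi_square_draw(k, rng) for _ in range(20_000)]

sample_mean = fmean(draws)    # theory: k
sample_var = pvariance(draws)  # theory: 2 * k
print(sample_mean, sample_var)
```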
  6. Student's t-Distribution

    • Similar to the normal distribution but with heavier tails, making it more suitable for small sample sizes.
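    • Defined by its degrees of freedom, which determine the shape; as the degrees of freedom increase (for a one-sample mean, df = n - 1, so this happens as the sample size grows), it approaches the standard normal distribution.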
    • Used primarily in hypothesis testing and constructing confidence intervals for means when the population standard deviation is unknown.
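
To see the heavier tails and the convergence to the normal distribution, the t density can be evaluated directly from its closed form and compared with the standard normal density at a point in the tail. This is a sketch using only the standard library; `t_pdf` is an illustrative helper, and `lgamma` is used instead of `gamma` to avoid overflow at large degrees of freedom.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with df degrees of freedom, computed in log space."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


z = NormalDist()  # standard normal for comparison
for df in (2, 5, 30, 1000):
    # Density three units from the center: larger for small df, near normal for large df.
    print(df, t_pdf(3.0, df), z.pdf(3.0))
```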
  7. Uniform Distribution

    • All outcomes are equally likely within a specified range, characterized by minimum (a) and maximum (b) values.
    • Can be discrete (e.g., rolling a fair die) or continuous (e.g., selecting a random number between 0 and 1).
    • Useful in simulations and scenarios where each outcome has the same probability of occurring.
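
Both cases mentioned above can be worked through in a few lines. The sketch below computes the mean and variance of a fair die exactly with `fractions.Fraction`, and gives the standard formulas for a continuous uniform on [a, b]: mean (a + b) / 2, variance (b - a)^2 / 12, and CDF (x - a) / (b - a). The function names are illustrative.

```python
from fractions import Fraction

# Discrete uniform: a fair six-sided die, each face with probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)
die_mean = sum(p * x for x in faces)
die_var = sum(p * (x - die_mean) ** 2 for x in faces)


def continuous_uniform_moments(a: float, b: float) -> tuple[float, float]:
    """Mean and variance of a continuous Uniform(a, b)."""
    return (a + b) / 2, (b - a) ** 2 / 12


def continuous_uniform_cdf(x: float, a: float, b: float) -> float:
    """P(X <= x) for X ~ Uniform(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


print(die_mean, die_var, continuous_uniform_moments(0.0, 1.0))
```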
  8. Bernoulli Distribution

    • A special case of the binomial distribution with a single trial, representing two possible outcomes: success (1) or failure (0).
    • Defined by a single parameter, the probability of success (p).
    • Fundamental in probability theory and serves as the building block for more complex distributions.
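
One way to see the "building block" role is to enumerate every sequence of n independent Bernoulli trials and add up the probability of each success count; the totals match the binomial formula. A minimal sketch, with n = 4 and p = 0.3 assumed for illustration and function names chosen for this example:

```python
from itertools import product
from math import comb, isclose


def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for a single trial: p for success (1), 1 - p for failure (0)."""
    return p if x == 1 else 1 - p


def successes_distribution(n: int, p: float) -> list[float]:
    """Enumerate all 2**n outcome sequences and total the probability per success count."""
    dist = [0.0] * (n + 1)
    for outcome in product((0, 1), repeat=n):
        prob = 1.0
        for x in outcome:
            prob *= bernoulli_pmf(x, p)  # independence: probabilities multiply
        dist[sum(outcome)] += prob
    return dist


n, p = 4, 0.3
dist = successes_distribution(n, p)
binomial = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
print(dist)
print(binomial)
```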
  9. Beta Distribution

    • A continuous distribution defined on the interval [0, 1], characterized by two shape parameters (α and β).
    • Flexible in modeling random variables that represent proportions or probabilities.
    • Commonly used in Bayesian statistics and for modeling random variables that are constrained to a finite range.
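
A typical Bayesian use is as a prior for a binomial proportion: with a Beta(α, β) prior and s successes and f failures observed, the posterior is Beta(α + s, β + f). The sketch below applies that update; the uniform Beta(1, 1) prior and the counts (7 responders out of 20 patients) are hypothetical numbers chosen only for illustration.

```python
def beta_mean(alpha: float, beta: float) -> float:
    """Mean of Beta(alpha, beta)."""
    return alpha / (alpha + beta)


def beta_variance(alpha: float, beta: float) -> float:
    """Variance of Beta(alpha, beta)."""
    total = alpha + beta
    return alpha * beta / (total**2 * (total + 1))


def update(alpha: float, beta: float, successes: int, failures: int) -> tuple[float, float]:
    """Posterior Beta parameters after observing binomial data (conjugate update)."""
    return alpha + successes, beta + failures


prior = (1.0, 1.0)                 # uniform prior on the response proportion
posterior = update(*prior, 7, 13)  # hypothetical data: 7 responders, 13 non-responders
print(posterior, beta_mean(*posterior), beta_variance(*posterior))
```

The posterior mean sits between the prior mean (0.5) and the observed proportion (0.35), and the posterior variance is smaller than the prior variance.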
  10. Gamma Distribution

    • A continuous distribution defined by a shape parameter (k) and a scale parameter (θ), often used to model waiting times.
    • Generalizes the exponential distribution; when k is an integer, it is the distribution of the sum of k independent exponential random variables, each with mean θ (rate 1/θ).
    • Useful in various fields, including queuing models and reliability analysis.
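
The link to the exponential distribution can be checked by simulation: summing k independent exponential waiting times with a common mean gives draws whose mean and variance should be close to the gamma values (shape times scale, and shape times scale squared). In this sketch the seed, k = 3, and a scale of 2.0 are assumptions chosen for illustration; note that `random.expovariate` takes the rate, which is 1 divided by the scale.

```python
import random
from statistics import fmean, pvariance


def gamma_draw_from_exponentials(k: int, scale: float, rng: random.Random) -> float:
    """Sum of k independent exponential waiting times, each with mean `scale`."""
    return sum(rng.expovariate(1.0 / scale) for _ in range(k))


rng = random.Random(0)  # fixed seed so the run is repeatable
k, scale = 3, 2.0       # shape and scale (assumed for the example)
draws = [gamma_draw_from_exponentials(k, scale, rng) for _ in range(20_000)]

sample_mean = fmean(draws)     # theory: k * scale
sample_var = pvariance(draws)  # theory: k * scale**2
print(sample_mean, sample_var)
```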


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
