
6.3 Marginal and conditional distributions

3 min read • July 19, 2024

Joint, marginal, and conditional distributions are key concepts in probability theory. They help us understand how random variables relate to each other and how to calculate probabilities in complex scenarios.

These distributions are essential for analyzing real-world data and making predictions. By mastering them, you'll be able to tackle problems in fields like data science, machine learning, and statistical inference with confidence.

Joint, Marginal, and Conditional Distributions

Derivation of marginal probabilities

  • The joint probability $P(X=x, Y=y)$ describes the probability of two random variables $X$ and $Y$ simultaneously taking on specific values $x$ and $y$
  • The marginal probability $P(X=x)$ or $P(Y=y)$ is obtained by summing the joint probability distribution over all values of the other variable
    • For discrete random variables, calculate $P(X=x)$ by summing $P(X=x, Y=y)$ over all values of $y$: $P(X=x) = \sum_y P(X=x, Y=y)$
    • Similarly, calculate $P(Y=y)$ by summing $P(X=x, Y=y)$ over all values of $x$: $P(Y=y) = \sum_x P(X=x, Y=y)$
    • For continuous random variables, calculate the marginal density $f_X(x)$ by integrating the joint probability density function $f(x,y)$ over all values of $y$: $f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy$
    • Similarly, calculate $f_Y(y)$ by integrating $f(x,y)$ over all values of $x$: $f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx$
  • Examples:
    • Rolling two dice (discrete): $P(\text{sum} = 7) = \sum_{i=1}^{6} P(\text{die 1} = i,\ \text{die 2} = 7-i)$
    • Bivariate normal distribution (continuous): $f_X(x) = \int_{-\infty}^{\infty} \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2}\right]\right) dy$
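The discrete summation above can be sketched in Python with NumPy. The two-dice joint table and the variable names here are illustrative, not from any particular library:

```python
import numpy as np

# Joint distribution of two fair dice: P(die1 = i, die2 = j) = 1/36 for all cells.
joint = np.full((6, 6), 1 / 36)

# Marginal of die 1: sum the joint table over all values of die 2 (axis 1).
# Each entry comes out to 6/36 = 1/6, as expected for a fair die.
marginal_die1 = joint.sum(axis=1)

# P(sum = 7): add the joint probabilities where die1 + die2 = 7.
# With 0-based indices, die1 index i pairs with die2 index 5 - i.
p_sum_7 = sum(joint[i, 5 - i] for i in range(6))
print(round(p_sum_7, 4))  # 0.1667, i.e. 1/6
```

Summing over an axis of the joint table is exactly the discrete marginalization formula $P(X=x) = \sum_y P(X=x, Y=y)$.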

Calculation of conditional probabilities

  • The conditional probability $P(X=x \mid Y=y)$ or $P(Y=y \mid X=x)$ represents the probability of one random variable taking on a specific value given the value of the other random variable
  • For discrete random variables, calculate $P(X=x \mid Y=y)$ by dividing the joint probability $P(X=x, Y=y)$ by the marginal probability $P(Y=y)$: $P(X=x \mid Y=y) = \frac{P(X=x, Y=y)}{P(Y=y)}$
    • Similarly, calculate $P(Y=y \mid X=x)$ by dividing $P(X=x, Y=y)$ by $P(X=x)$: $P(Y=y \mid X=x) = \frac{P(X=x, Y=y)}{P(X=x)}$
  • For continuous random variables, calculate the conditional density $f_{X|Y}(x|y)$ by dividing the joint probability density function $f(x,y)$ by the marginal probability density function $f_Y(y)$: $f_{X|Y}(x|y) = \frac{f(x,y)}{f_Y(y)}$
    • Similarly, calculate $f_{Y|X}(y|x)$ by dividing $f(x,y)$ by $f_X(x)$: $f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)}$
  • Examples:
    • Drawing cards (discrete): $P(\text{suit} = \text{hearts} \mid \text{rank} = \text{king}) = \frac{P(\text{suit} = \text{hearts},\ \text{rank} = \text{king})}{P(\text{rank} = \text{king})} = \frac{1/52}{4/52} = \frac{1}{4}$
    • Gaussian mixture model (continuous): $f_{X|Y}(x|y) = \frac{\sum_{i=1}^k \pi_i \, \mathcal{N}(x, y \mid \mu_i, \Sigma_i)}{\sum_{i=1}^k \pi_i \, \mathcal{N}(y \mid \mu_{i,y}, \Sigma_{i,yy})}$
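The card example maps directly onto the discrete conditional-probability formula; a minimal sketch using exact fractions (the deck numbers are from the example above):

```python
from fractions import Fraction

# Standard 52-card deck.
p_joint = Fraction(1, 52)       # P(suit = hearts, rank = king): one king of hearts
p_rank_king = Fraction(4, 52)   # marginal P(rank = king): four kings in the deck

# Conditional probability: P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y)
p_hearts_given_king = p_joint / p_rank_king
print(p_hearts_given_king)  # 1/4
```

Using `Fraction` keeps the arithmetic exact, so the result is the literal ratio $\frac{1/52}{4/52} = \frac{1}{4}$ rather than a rounded float.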

Relationships among probability distributions

  • The joint distribution contains all information about the relationship between two random variables
  • Marginal distributions are derived from the joint distribution by summing or integrating over the other variable
  • Conditional distributions are calculated from the joint distribution by dividing the joint probability by the marginal probability of the given variable
  • Relationship between joint, marginal, and conditional distributions for discrete random variables:
    • $P(X=x, Y=y) = P(X=x \mid Y=y) \cdot P(Y=y)$
    • $P(X=x, Y=y) = P(Y=y \mid X=x) \cdot P(X=x)$
  • Relationship between joint, marginal, and conditional distributions for continuous random variables:
    • $f(x,y) = f_{X|Y}(x|y) \cdot f_Y(y)$
    • $f(x,y) = f_{Y|X}(y|x) \cdot f_X(x)$
  • Examples:
    • Coin flips (discrete): $P(\text{heads on 1st flip},\ \text{tails on 2nd flip}) = P(\text{heads on 1st flip}) \cdot P(\text{tails on 2nd flip} \mid \text{heads on 1st flip})$
    • Bivariate normal distribution (continuous): $f(x,y) = f_{X|Y}(x|y) \cdot f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_{x|y}} \exp\left(-\frac{(x-\mu_{x|y})^2}{2\sigma_{x|y}^2}\right) \cdot \frac{1}{\sqrt{2\pi}\,\sigma_y} \exp\left(-\frac{(y-\mu_y)^2}{2\sigma_y^2}\right)$, where $\mu_{x|y}$ and $\sigma_{x|y}^2$ are the conditional mean and variance of $X$ given $Y=y$
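The factorization $P(X=x, Y=y) = P(X=x \mid Y=y) \cdot P(Y=y)$ can be checked numerically on any discrete joint table. The 2×3 table below is made up for illustration:

```python
import numpy as np

# Hypothetical joint distribution: X has 2 values (rows), Y has 3 values (columns).
joint = np.array([[0.10, 0.20, 0.10],
                  [0.15, 0.25, 0.20]])
assert np.isclose(joint.sum(), 1.0)  # probabilities sum to 1

p_x = joint.sum(axis=1)  # marginal P(X=x): sum over y
p_y = joint.sum(axis=0)  # marginal P(Y=y): sum over x

# Conditional P(X=x | Y=y): divide each column of the joint by its marginal P(Y=y).
p_x_given_y = joint / p_y

# Each column of the conditional table is itself a distribution (sums to 1),
# and multiplying back by P(Y=y) recovers the joint exactly.
print(np.allclose(p_x_given_y * p_y, joint))  # True
```

The same two lines of NumPy (`sum` over an axis, elementwise division) implement marginalization and conditioning for any discrete joint table, whatever its shape.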

Applications of marginal and conditional distributions

  • Use marginal distributions to:
    1. Calculate probabilities of events involving a single random variable
    2. Determine the expected value and variance of a single random variable
  • Use conditional distributions to:
    1. Calculate probabilities of events involving one random variable given the value of another
    2. Determine the expected value and variance of a random variable given the value of another
  • Examples of applications:
    • Bayesian inference: update beliefs about a hypothesis (posterior probability) based on observed data (likelihood) and prior beliefs (prior probability)
    • Decision-making under uncertainty: choose actions that maximize expected utility, where utility depends on the probability of different outcomes
    • Machine learning: model the relationship between input features and output variables, such as in naive Bayes classifiers or conditional random fields
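As a concrete sketch of the second use of conditional distributions (conditional mean and variance), here is a worked computation on a made-up joint table:

```python
import numpy as np

# Hypothetical joint table: X takes values 0, 1, 2 (rows); Y takes 0, 1 (columns).
joint = np.array([[0.10, 0.15],
                  [0.20, 0.25],
                  [0.10, 0.20]])
x_vals = np.array([0, 1, 2])

# Condition on Y = 1 (second column).
p_y1 = joint[:, 1].sum()           # marginal P(Y = 1)
p_x_given_y1 = joint[:, 1] / p_y1  # conditional distribution P(X = x | Y = 1)

# Conditional expectation and variance of X given Y = 1.
e_x_given_y1 = (x_vals * p_x_given_y1).sum()
var_x_given_y1 = ((x_vals - e_x_given_y1) ** 2 * p_x_given_y1).sum()
print(round(e_x_given_y1, 4))  # 1.0833, i.e. E[X | Y=1] = 13/12
```

Observing $Y = 1$ renormalizes one column of the joint table into a probability distribution over $X$, and the conditional mean and variance are then ordinary expectations under that distribution.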
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

