
8.2 Covariance and correlation

3 min read · July 19, 2024

Covariance and correlation measure how two variables change together. Covariance shows whether they move in the same or opposite directions, while correlation also tells us how strong that linear relationship is, on a scale from -1 to 1.

These concepts help us understand connections between things like height and weight or income and education. They're useful in finance, science, and social research to spot patterns and make predictions about related variables.

Covariance and Correlation

Definition of covariance and correlation

  • Covariance quantifies how two random variables vary together around their individual means
    • Positive covariance indicates the variables tend to move in the same direction relative to their means (height and weight)
    • Negative covariance indicates the variables tend to move in opposite directions relative to their means (price and demand)
  • Covariance formula: $\operatorname{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)]$
    • $\mu_X$ and $\mu_Y$ represent the means of random variables $X$ and $Y$
  • Correlation measures the strength and direction of the linear relationship between two random variables
    • Ranges from -1 (perfect negative linear relationship) to 1 (perfect positive linear relationship)
    • Correlation of 0 implies no linear relationship (income and favorite color)
  • Correlation formula: $\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}$
    • $\sigma_X$ and $\sigma_Y$ represent the standard deviations of random variables $X$ and $Y$
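
The two formulas above can be sketched directly in code. This is a minimal illustration, not a library implementation: the height and weight values are made-up data, the function names are hypothetical, and each data point is treated as equally likely, so the expectation $E[\cdot]$ becomes a plain average (the population form, dividing by $n$).

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Average of (x - mean_x) * (y - mean_y), i.e. E[(X - mu_X)(Y - mu_Y)]."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    sigma_x = math.sqrt(covariance(xs, xs))  # Cov(X, X) is Var(X)
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Made-up illustrative data (assumption, not from any real dataset)
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 84]

cov_hw = covariance(heights_cm, weights_kg)
rho_hw = correlation(heights_cm, weights_kg)
print(cov_hw)  # 166.0
print(rho_hw)  # close to 1: a strong positive linear relationship
```

The covariance is positive because taller entries in this toy data are also heavier; the correlation rescales that number to the -1 to 1 range.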

Calculation of joint distributions

  • Covariance calculation: $\operatorname{Cov}(X,Y) = E[XY] - E[X]E[Y]$
    • $E[XY]$ represents the expected value of the product of $X$ and $Y$
    • $E[X]$ and $E[Y]$ represent the individual expected values (means) of $X$ and $Y$
  • Correlation calculation: $\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}$
    • $\operatorname{Var}(X)$ and $\operatorname{Var}(Y)$ represent the variances of random variables $X$ and $Y$
  • For discrete random variables, calculate expected values using the joint probability mass function (PMF)
    • Example: Roll two fair dice, let $X$ be the sum and $Y$ be the product of the numbers rolled
  • For continuous random variables, calculate expected values using the joint probability density function (PDF)
    • Example: $X$ and $Y$ represent the heights of a randomly selected male and female student
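
The dice example can be worked through by enumerating the joint PMF. The sketch below assumes the two dice are independent and fair, so each of the 36 ordered outcomes has probability 1/36; the helper name `expect` is hypothetical. Exact fractions are used so the covariance comes out without rounding.

```python
from fractions import Fraction
from itertools import product
import math

# Joint PMF: each ordered outcome (a, b) of two fair dice has probability 1/36.
p = Fraction(1, 36)
outcomes = list(product(range(1, 7), repeat=2))

def expect(g):
    """E[g(a, b)] as a PMF-weighted sum over all 36 outcomes."""
    return sum(p * g(a, b) for a, b in outcomes)

# X = sum of the dice, Y = product of the dice
e_x = expect(lambda a, b: a + b)
e_y = expect(lambda a, b: a * b)
e_xy = expect(lambda a, b: (a + b) * (a * b))

cov_xy = e_xy - e_x * e_y                        # Cov(X,Y) = E[XY] - E[X]E[Y]
var_x = expect(lambda a, b: (a + b) ** 2) - e_x ** 2
var_y = expect(lambda a, b: (a * b) ** 2) - e_y ** 2
rho_xy = float(cov_xy) / math.sqrt(float(var_x * var_y))

print(e_x, e_y, e_xy, cov_xy)  # 7 49/4 637/6 245/12
print(rho_xy)                  # strongly positive, but below 1
```

The sum and the product rise together, so the covariance is positive. The correlation stays below 1 because the product is not a linear function of the sum.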

Properties of statistical relationships

  • Covariance properties
    • $\operatorname{Cov}(X,X) = \operatorname{Var}(X)$, covariance of a variable with itself equals its variance
    • $\operatorname{Cov}(X,Y) = \operatorname{Cov}(Y,X)$, covariance is symmetric
    • $\operatorname{Cov}(aX + b, cY + d) = ac \cdot \operatorname{Cov}(X,Y)$ for constants $a$, $b$, $c$, and $d$
  • Correlation properties
    • $\rho_{XX} = 1$, a variable is perfectly correlated with itself
    • $\rho_{XY} = \rho_{YX}$, correlation is symmetric
    • $|\rho_{XY}| \leq 1$, correlation is bounded between -1 and 1
  • Relationship between independence and covariance/correlation
    • If $X$ and $Y$ are independent, then $\operatorname{Cov}(X,Y) = 0$ and $\rho_{XY} = 0$
    • However, $\operatorname{Cov}(X,Y) = 0$ or $\rho_{XY} = 0$ does not imply independence, because the variables may still be related in a non-linear way
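
Two of these properties are easy to check on a small example. The sketch below uses an assumed toy distribution, $X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$, to show zero covariance despite full dependence, and then checks the scaling property with arbitrarily chosen constants. The helper names are hypothetical.

```python
from fractions import Fraction

# Toy distribution (assumption): X is uniform on {-1, 0, 1}.
support = [-1, 0, 1]
p = Fraction(1, 3)

def expect(g):
    """E[g(X)] as a PMF-weighted sum."""
    return sum(p * g(x) for x in support)

def cov(f, g):
    """Cov(f(X), g(X)) = E[f(X) g(X)] - E[f(X)] E[g(X)]."""
    return expect(lambda x: f(x) * g(x)) - expect(f) * expect(g)

# Y = X**2 is completely determined by X, yet the covariance is zero.
cov_xy = cov(lambda x: x, lambda x: x ** 2)

# Dependence check at the point (X = 0, Y = 0):
p_joint = p        # X = 0 forces Y = 0, so P(X=0, Y=0) = 1/3
p_product = p * p  # P(X=0) * P(Y=0) = 1/3 * 1/3 = 1/9

# Scaling property, using the second variable equal to X itself,
# with arbitrary illustrative constants a, b, c, d.
a, b, c, d = 2, 5, -3, 1
lhs = cov(lambda x: a * x + b, lambda x: c * x + d)  # Cov(aX + b, cX + d)
rhs = a * c * cov(lambda x: x, lambda x: x)          # ac * Cov(X, X)

print(cov_xy)              # 0
print(p_joint, p_product)  # 1/3 1/9
print(lhs, rhs)            # -4 -4
```

Because the joint probability differs from the product of the marginals, $X$ and $Y$ are dependent even though their covariance is 0. The shifts $b$ and $d$ drop out of the result and only the product $ac$ rescales the covariance.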

Applications in linear analysis

  • Interpret covariance and correlation values
    1. Determine the direction of the linear relationship (positive or negative)
    2. Assess the strength of the linear relationship (magnitude of correlation)
  • Covariance interpretation is scale-dependent and difficult to compare across different variable pairs
  • Correlation provides a standardized measure of linear relationship strength for easier comparison
  • Applications across various fields
    • Finance: Portfolio risk analysis and diversification (stocks and bonds)
    • Signal processing: Assessing similarity between signals (audio and video)
    • Machine learning: Feature selection and dimensionality reduction (customer preferences)
    • Social sciences: Studying relationships between variables (education and income)
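
The point about scale-dependence can be illustrated with a change of units. In this sketch the data are made up and the function names are hypothetical; heights are expressed first in centimeters and then in meters, and the same population-style averages are used for both.

```python
import math

def cov(xs, ys):
    """Population covariance: average product of deviations from the means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def corr(xs, ys):
    """Covariance standardized by the two standard deviations."""
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

# Made-up illustrative data (assumption)
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 84]
heights_m = [h / 100 for h in heights_cm]  # same heights, different unit

cov_cm = cov(heights_cm, weights_kg)
cov_m = cov(heights_m, weights_kg)
corr_cm = corr(heights_cm, weights_kg)
corr_m = corr(heights_m, weights_kg)

print(cov_cm, cov_m)    # covariance shrinks by a factor of 100
print(corr_cm, corr_m)  # correlation is unchanged, up to rounding
```

Switching units changes the covariance by the conversion factor, which is why raw covariances are hard to compare across variable pairs, while the correlation stays the same.
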
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

