
Limit theorems are crucial in understanding how random variables behave as sample sizes grow. They explain convergence in probability, almost sure convergence, and convergence in distribution. These concepts help us grasp the long-term behavior of random processes.

Key theorems like the law of large numbers and the central limit theorem form the backbone of statistical inference. They allow us to make predictions and draw conclusions from data, bridging the gap between theoretical probability and real-world applications in various fields.

Convergence in probability

  • Fundamental concept in probability theory that describes how a sequence of random variables converges to a certain value as the sample size increases
  • Convergence in probability is weaker than almost sure convergence but stronger than convergence in distribution, and it is an important tool for studying the asymptotic behavior of random variables

Weak law of large numbers

  • States that the sample mean of a sequence of independent and identically distributed (i.i.d.) random variables converges in probability to the population mean as the sample size increases
  • Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with finite mean $\mu$, then for any $\varepsilon > 0$, $\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0$, where $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$ is the sample mean
  • Provides a theoretical justification for the use of sample means as estimators of population means (sample average of dice rolls, average height of a population)
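
As a quick illustration of the dice-roll example, here is a minimal numpy sketch (the sample sizes and seed are arbitrary choices): the sample mean of fair die rolls should drift toward the population mean 3.5 as $n$ grows.

```python
import numpy as np

# Weak law of large numbers: sample means of fair die rolls approach 3.5
rng = np.random.default_rng(0)
for n in [10, 100, 10_000, 1_000_000]:
    rolls = rng.integers(1, 7, size=n)        # i.i.d. uniform on {1, ..., 6}
    print(f"n = {n:>9}: sample mean = {rolls.mean():.4f}")
# The deviations |sample mean - 3.5| shrink (in probability) as n grows.
```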

Convergence of random variables

  • A sequence of random variables $\{X_n\}$ converges in probability to a random variable $X$ if, for any $\varepsilon > 0$, $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$
  • Denoted as $X_n \xrightarrow{p} X$
  • Convergence in probability implies that the probability of the difference between $X_n$ and $X$ being larger than any fixed value approaches zero as $n$ increases (proportion of heads in coin flips converging to 0.5, sample variance converging to the population variance)

Continuous mapping theorem

  • States that if a sequence of random variables converges in probability to a limit and a function is continuous at that limit, then the sequence of the function applied to the random variables converges in probability to the function of the limit
  • Formally, if $X_n \xrightarrow{p} X$ and $g$ is a continuous function at $X$, then $g(X_n) \xrightarrow{p} g(X)$
  • Allows for the preservation of convergence in probability under continuous transformations (convergence of sample variance implies convergence of sample standard deviation)
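
A minimal simulation sketch of the sample-variance example, assuming a normal population with $\sigma = 2$ chosen purely for illustration: because the sample variance converges in probability to $\sigma^2$ and $g(x) = \sqrt{x}$ is continuous there, the sample standard deviation converges to $\sigma$.

```python
import numpy as np

# Continuous mapping: sample variance -> sigma^2 implies sample std -> sigma
rng = np.random.default_rng(1)
sigma = 2.0                                    # true standard deviation of the population
for n in [100, 10_000, 1_000_000]:
    x = rng.normal(0.0, sigma, size=n)
    var_hat = x.var(ddof=1)                    # converges in probability to sigma^2 = 4
    print(f"n = {n:>8}: var = {var_hat:.4f}, std = {np.sqrt(var_hat):.4f}")
# g(x) = sqrt(x) is continuous at 4, so sqrt(var_hat) converges in probability to 2.
```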

Almost sure convergence

  • Stronger notion of convergence compared to convergence in probability, where a sequence of random variables converges to a certain value with probability one
  • Almost sure convergence implies convergence in probability, but the converse is not always true

Strong law of large numbers

  • States that the sample mean of a sequence of i.i.d. random variables converges almost surely to the population mean as the sample size increases
  • Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with finite mean $\mu$, then $P(\lim_{n \to \infty} \bar{X}_n = \mu) = 1$, where $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$ is the sample mean
  • Provides a stronger result compared to the weak law of large numbers, ensuring convergence with probability one (long-term frequency of heads in coin flips, average of a large number of dice rolls)
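
Almost sure convergence is a statement about individual sample paths, so a sketch can track the running mean of a single long sequence of fair coin flips (path length and seed are arbitrary); along almost every such path the running mean settles at 0.5.

```python
import numpy as np

# Strong law of large numbers: the running mean of ONE trajectory of fair coin flips
rng = np.random.default_rng(2)
flips = rng.integers(0, 2, size=1_000_000)     # a single path of 0/1 flips
running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)
for n in [100, 10_000, 1_000_000]:
    print(f"after {n:>9} flips: running mean = {running_mean[n - 1]:.4f}")
# Along almost every trajectory, the running mean converges to 0.5.
```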

Kolmogorov's three-series theorem

  • Provides necessary and sufficient conditions for the almost sure convergence of a series of independent random variables
  • The three conditions, for a fixed truncation level $A > 0$, are:
    1. $\sum_{n=1}^\infty P(|X_n| > A) < \infty$
    2. $\sum_{n=1}^\infty E[X_n \mathbf{1}_{\{|X_n| \leq A\}}]$ converges
    3. $\sum_{n=1}^\infty \text{Var}(X_n \mathbf{1}_{\{|X_n| \leq A\}}) < \infty$
  • If all three conditions are satisfied for some $A > 0$, then the series $\sum_{n=1}^\infty X_n$ converges almost surely (convergence of the random harmonic series, convergence of random geometric series)

Borel-Cantelli lemmas

  • Two lemmas that provide conditions for the almost sure convergence of events
  • First Borel-Cantelli lemma: If $\sum_{n=1}^\infty P(A_n) < \infty$, then $P(\limsup_{n \to \infty} A_n) = 0$, meaning that the events $A_n$ occur only finitely many times with probability one
  • Second Borel-Cantelli lemma: If the events $A_n$ are independent and $\sum_{n=1}^\infty P(A_n) = \infty$, then $P(\limsup_{n \to \infty} A_n) = 1$, meaning that the events $A_n$ occur infinitely often with probability one (almost sure divergence of harmonic series, almost sure recurrence of random walks)
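
A sketch of the first lemma with the hypothetical events $A_n = \{U_n < 1/n^2\}$ for independent uniform $U_n$: the probabilities $1/n^2$ are summable, so with probability one only finitely many $A_n$ occur, and a simulation typically sees none beyond the first few indices.

```python
import numpy as np

# First Borel-Cantelli lemma: A_n = {U_n < 1/n^2}; sum of P(A_n) = pi^2/6 < infinity,
# so with probability one only finitely many A_n occur.
rng = np.random.default_rng(4)
n_max = 1_000_000
u = rng.random(n_max)
n = np.arange(1, n_max + 1)
occurred = np.flatnonzero(u < 1.0 / n**2) + 1   # indices n at which A_n occurs
print("A_n occurred at n =", occurred.tolist())
# Typically only a handful of small indices appear, and none for large n.
```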

Convergence in distribution

  • Describes how the distribution of a sequence of random variables converges to a limiting distribution
  • Convergence in distribution does not imply convergence in probability or almost sure convergence; the reverse implications always hold, and convergence in distribution to a constant does imply convergence in probability

Central limit theorem

  • States that the standardized sum of a sequence of i.i.d. random variables with finite mean and variance converges in distribution to a standard normal distribution as the sample size increases
  • Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with mean $\mu$ and variance $\sigma^2$, then $\frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \xrightarrow{d} N(0, 1)$ as $n \to \infty$
  • Provides a fundamental result for statistical inference, allowing for the approximation of the distribution of sample means and sums (distribution of average heights, sum of dice rolls)
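
A numpy sketch of the dice example (with $n = 30$ and the number of replications chosen arbitrarily): standardized sums of die rolls should fall in $(-1.96, 1.96)$ about 95% of the time.

```python
import numpy as np

# Central limit theorem: standardized sums of die rolls are approximately N(0, 1)
rng = np.random.default_rng(5)
n, reps = 30, 100_000
mu, sigma = 3.5, np.sqrt(35 / 12)              # mean and std of one fair die roll
rolls = rng.integers(1, 7, size=(reps, n))
z = (rolls.sum(axis=1) - n * mu) / (sigma * np.sqrt(n))
print("mean of Z:", round(z.mean(), 3), " std of Z:", round(z.std(), 3))
print("P(|Z| < 1.96) ~", round(np.mean(np.abs(z) < 1.96), 3))   # close to 0.95
```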

Characteristic functions

  • A tool for studying the convergence in distribution of random variables
  • The characteristic function of a random variable $X$ is defined as $\varphi_X(t) = E[e^{itX}]$, where $i$ is the imaginary unit
  • Convergence of characteristic functions implies convergence in distribution: if $\varphi_{X_n}(t) \to \varphi_X(t)$ for all $t$, then $X_n \xrightarrow{d} X$ (convergence of the binomial to the Poisson distribution, convergence of the scaled sample variance to the chi-squared distribution)
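
A small numerical sketch of the binomial-to-Poisson example: the characteristic function of $\text{Binomial}(n, \lambda/n)$, which is $(1 - p + p e^{it})^n$, approaches the Poisson characteristic function $e^{\lambda(e^{it} - 1)}$ pointwise in $t$ as $n$ grows; $\lambda = 3$ and the grid of $t$ values are arbitrary choices.

```python
import numpy as np

# Characteristic functions of Binomial(n, lam/n) converge to that of Poisson(lam)
lam = 3.0
t = np.array([0.5, 1.0, 2.0])
phi_poisson = np.exp(lam * (np.exp(1j * t) - 1))
for n in [10, 100, 10_000]:
    p = lam / n
    phi_binom = (1 - p + p * np.exp(1j * t)) ** n
    print(f"n = {n:>6}: max |phi_binom - phi_poisson| = "
          f"{np.max(np.abs(phi_binom - phi_poisson)):.5f}")
# Pointwise convergence of characteristic functions gives Binomial -> Poisson in distribution.
```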

Lindeberg-Feller theorem

  • Generalizes the central limit theorem to sequences of independent, but not necessarily identically distributed, random variables
  • Provides conditions under which the standardized sum of a sequence of random variables converges in distribution to a standard normal distribution
  • The Lindeberg condition: $\frac{1}{s_n^2} \sum_{i=1}^n E[(X_i - \mu_i)^2 \mathbf{1}_{\{|X_i - \mu_i| > \varepsilon s_n\}}] \to 0$ for all $\varepsilon > 0$, where $s_n^2 = \sum_{i=1}^n \text{Var}(X_i)$
  • If the Lindeberg condition holds and $\frac{\max_{1 \leq i \leq n} \text{Var}(X_i)}{s_n^2} \to 0$, then $\frac{\sum_{i=1}^n (X_i - \mu_i)}{s_n} \xrightarrow{d} N(0, 1)$ (convergence of non-identically distributed random variables, convergence of triangular arrays)
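
A simulation sketch of the non-identically-distributed case, using the hypothetical choice $X_i \sim \text{Uniform}(-i, i)$: the variances $i^2/3$ grow, but the Lindeberg condition holds (each $|X_i|$ is small relative to $s_n$ for large $n$), so the standardized sum still looks standard normal.

```python
import numpy as np

# Lindeberg-Feller: independent but not identically distributed X_i ~ Uniform(-i, i)
rng = np.random.default_rng(6)
n, reps = 200, 50_000
i = np.arange(1, n + 1)
s_n = np.sqrt(np.sum(i**2 / 3.0))              # s_n^2 = sum of Var(X_i) = sum of i^2 / 3
x = rng.uniform(-i, i, size=(reps, n))         # each row is one realization of X_1, ..., X_n
z = x.sum(axis=1) / s_n                        # all means are 0, so no centering is needed
print("mean of Z:", round(z.mean(), 3), " std of Z:", round(z.std(), 3))
print("P(|Z| < 1.96) ~", round(np.mean(np.abs(z) < 1.96), 3))   # close to 0.95
```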

Delta method

  • A technique for deriving the asymptotic distribution of a function of an asymptotically normal estimator
  • If $\sqrt{n}(X_n - \theta) \xrightarrow{d} N(0, \sigma^2)$ and $g$ is a differentiable function with $g'(\theta) \neq 0$, then $\sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{d} N(0, \sigma^2 [g'(\theta)]^2)$
  • Allows for the construction of confidence intervals and hypothesis tests for functions of estimators (asymptotic distribution of the sample variance, asymptotic distribution of maximum likelihood estimators)
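
A sketch of the delta method with a hypothetical exponential example: $\bar{X}_n$ estimates $\theta = 1/\lambda$, and $g(x) = 1/x$ turns it into an estimator of the rate $\lambda$; the empirical standard deviation of $\sqrt{n}(g(\bar{X}_n) - \lambda)$ should match $\sigma\,|g'(\theta)| = \lambda$.

```python
import numpy as np

# Delta method: X_i ~ Exponential(rate = 2), theta = E[X] = 0.5, g(x) = 1/x estimates the rate
rng = np.random.default_rng(7)
lam, n, reps = 2.0, 500, 10_000
x = rng.exponential(scale=1 / lam, size=(reps, n))
xbar = x.mean(axis=1)
stat = np.sqrt(n) * (1 / xbar - lam)           # sqrt(n) * (g(xbar) - g(theta))
# Predicted limit: N(0, sigma^2 * g'(theta)^2) with sigma = 1/lam and g'(theta) = -lam^2
predicted_sd = (1 / lam) * lam**2              # equals lam = 2
print("empirical sd:", round(stat.std(), 3), " delta-method sd:", predicted_sd)
```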

Functional limit theorems

  • Extend the concept of convergence in distribution to function spaces, describing the convergence of stochastic processes to limiting processes
  • Functional limit theorems are essential for studying the asymptotic behavior of empirical processes and other stochastic processes arising in statistics and probability

Donsker's theorem

  • States that the empirical process, which is the scaled difference between the empirical distribution function and the true distribution function, converges in distribution to a Brownian bridge process
  • Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with distribution function $F$, then $\sqrt{n}(F_n - F) \xrightarrow{d} B \circ F$ in the Skorokhod space $D[0, 1]$ (the space of right-continuous functions with left limits on the unit interval), where $F_n$ is the empirical distribution function and $B$ is a standard Brownian bridge
  • Provides a foundation for the study of empirical processes and their applications in statistics (goodness-of-fit tests, confidence bands for distribution functions)

Empirical process theory

  • Studies the asymptotic behavior of empirical processes, which are stochastic processes based on random samples
  • Empirical processes include the empirical distribution function, empirical characteristic function, and empirical moment functions
  • Empirical process theory develops tools for deriving the limiting distributions of functionals of empirical processes, such as suprema and integrals (Kolmogorov-Smirnov test, Cramér-von Mises test)

Brownian motion approximation

  • Approximates the behavior of certain stochastic processes by Brownian motion or related processes, such as Brownian bridge or fractional Brownian motion
  • Brownian motion approximation is particularly useful for studying the asymptotic properties of partial sum processes, empirical processes, and other stochastic processes with dependent increments
  • Examples include the invariance principle for partial sums of weakly dependent random variables, the functional central limit theorem for martingales, and the approximation of queueing processes by reflected Brownian motion (approximation of random walks, approximation of cumulative sums)

Rates of convergence

  • Quantify the speed at which a sequence of random variables or a stochastic process converges to its limiting distribution or process
  • Rates of convergence are important for assessing the accuracy of approximations, constructing confidence intervals, and deriving higher-order asymptotic expansions

Berry-Esseen theorem

  • Provides a bound on the rate of convergence in the central limit theorem for i.i.d. random variables with finite third moments
  • If $X_1, X_2, \ldots$ are i.i.d. random variables with mean $\mu$, variance $\sigma^2$, and finite third absolute moment $\rho = E[|X_1 - \mu|^3]$, then $\sup_x |P(\frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \leq x) - \Phi(x)| \leq \frac{C \rho}{\sigma^3 \sqrt{n}}$, where $\Phi$ is the standard normal distribution function and $C$ is a universal constant
  • Provides a quantitative bound on the error of the normal approximation for standardized sums (error in approximating binomial distribution by normal, error in approximating Poisson distribution by normal)
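
A sketch checking the bound for Bernoulli(0.3) sums (a hypothetical choice): the exact binomial CDF is compared with $\Phi$ around every lattice point, and the worst error is set against the bound using the illustrative constant $C = 0.5$ (known admissible constants are somewhat smaller).

```python
import math

# Berry-Esseen check for Bernoulli(p) sums: exact error of the normal approximation
# versus the bound C * rho / (sigma^3 * sqrt(n)).
p = 0.3
mu, sigma = p, math.sqrt(p * (1 - p))
rho = p * (1 - p) * ((1 - p) ** 2 + p ** 2)     # E|X - mu|^3 for Bernoulli(p)
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

for n in [20, 100, 500]:
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    err, cdf = 0.0, 0.0
    for k in range(n + 1):
        z = (k - n * mu) / (sigma * math.sqrt(n))
        err = max(err, abs(cdf - Phi(z)))        # just below the jump at z
        cdf += pmf[k]
        err = max(err, abs(cdf - Phi(z)))        # at the jump point
    bound = 0.5 * rho / (sigma**3 * math.sqrt(n))  # illustrative constant C = 0.5
    print(f"n = {n:>4}: exact sup error = {err:.4f}, Berry-Esseen bound = {bound:.4f}")
```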

Edgeworth expansions

  • Provide higher-order asymptotic expansions for the distribution of a standardized sum of i.i.d. random variables
  • Edgeworth expansions are based on the cumulants of the random variables and involve Hermite polynomials
  • The first-order Edgeworth expansion is $P(\frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \leq x) = \Phi(x) + \frac{\kappa_3}{6\sigma^3\sqrt{n}} (1 - x^2) \phi(x) + o(\frac{1}{\sqrt{n}})$, where $\kappa_3$ is the third cumulant and $\phi$ is the standard normal density function
  • Edgeworth expansions provide more accurate approximations than the central limit theorem alone and can be used for constructing asymptotic confidence intervals and improving the coverage probability of confidence intervals (improved approximation of the binomial distribution, improved approximation of the chi-squared distribution)
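
A sketch comparing the normal and first-order Edgeworth approximations for a sum of $n = 10$ Exp(1) variables (so $\mu = \sigma = 1$ and $\kappa_3 = 2$, a hypothetical choice); the exact probability is available because the sum has a Gamma($n$, 1) distribution, whose CDF can be evaluated through the Poisson CDF.

```python
import math

# Edgeworth vs CLT for sums of Exp(1): mu = sigma = 1, third cumulant kappa_3 = 2.
# S_n ~ Gamma(n, 1), so P(S_n <= t) = 1 - P(Poisson(t) <= n - 1) is exactly computable.
def gamma_cdf(n, t):
    poisson_cdf, log_fact = 0.0, 0.0
    for k in range(n):
        if k > 0:
            log_fact += math.log(k)
        poisson_cdf += math.exp(-t + k * math.log(t) - log_fact)
    return 1.0 - poisson_cdf

phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

n, kappa3, sigma = 10, 2.0, 1.0
for x in [-1.5, 0.0, 1.5]:
    exact = gamma_cdf(n, n + x * sigma * math.sqrt(n))
    clt = Phi(x)
    edgeworth = Phi(x) + kappa3 / (6 * sigma**3 * math.sqrt(n)) * (1 - x * x) * phi(x)
    print(f"x = {x:+.1f}: exact = {exact:.4f}, CLT = {clt:.4f}, Edgeworth = {edgeworth:.4f}")
# The Edgeworth values track the exact probabilities more closely than Phi(x) does.
```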

Sanov's theorem

  • Describes the rate of convergence of the empirical distribution of a sequence of i.i.d. random variables to the true distribution in terms of the Kullback-Leibler divergence
  • If $X_1, X_2, \ldots$ are i.i.d. random variables with distribution $P$ and $Q$ is another distribution, then $P(\frac{1}{n} \sum_{i=1}^n \delta_{X_i} \in B(Q, \varepsilon)) \approx e^{-n \inf_{Q' \in B(Q, \varepsilon)} D(Q' \| P)}$, where $B(Q, \varepsilon)$ is the ball of radius $\varepsilon$ around $Q$ in the space of probability measures and $D(Q' \| P)$ is the Kullback-Leibler divergence between $Q'$ and $P$
  • Sanov's theorem has applications in large deviations theory, hypothesis testing, and information theory (rate of convergence of empirical distributions, error exponents in hypothesis testing)
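
A sketch of the Bernoulli special case (restricting Sanov's theorem to empirical means gives a Chernoff/Cramér-type bound): for fair coin flips, $P(\bar{X}_n \geq 0.7)$ decays roughly like $e^{-nD}$ with $D = D(\text{Bernoulli}(0.7) \,\|\, \text{Bernoulli}(0.5))$, so $-(1/n)\log$ of the exact binomial tail approaches $D$.

```python
import math

# Large deviation rate for fair coin flips: P(mean >= 0.7) decays like exp(-n * D),
# where D is the KL divergence D(Bernoulli(0.7) || Bernoulli(0.5)).
a, p = 0.7, 0.5
D = a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

for n in [100, 400, 1000]:
    k_min = math.ceil(a * n)
    tail = sum(math.comb(n, k) for k in range(k_min, n + 1)) * 0.5**n
    print(f"n = {n:>5}: -(1/n) log P(mean >= 0.7) = {-math.log(tail) / n:.4f}")
print("Kullback-Leibler rate D =", round(D, 4))
# The empirical exponents decrease toward D as n grows.
```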

Applications of limit theorems

  • Limit theorems have numerous applications in various areas of statistics, including inference, hypothesis testing, and confidence interval construction
  • The asymptotic properties derived from limit theorems provide the foundation for many statistical procedures and allow for the development of efficient and robust methods

Statistical inference

  • Limit theorems are essential for deriving the asymptotic distributions of estimators, such as maximum likelihood estimators, method of moments estimators, and least squares estimators
  • The asymptotic normality of estimators, established through the central limit theorem or its extensions, allows for the construction of confidence intervals and hypothesis tests (asymptotic distribution of the sample mean, asymptotic distribution of regression coefficients)

Hypothesis testing

  • Limit theorems provide the basis for various hypothesis testing procedures, such as the likelihood ratio test, Wald test, and score test
  • The asymptotic distributions of test statistics under the null and alternative hypotheses can be derived using the central limit theorem and other limit theorems
  • These asymptotic distributions are used to determine critical values and p-values for the tests (t-test for comparing means, chi-squared test for independence)

Confidence intervals

  • Limit theorems enable the construction of asymptotic confidence intervals for parameters of interest
  • The central limit theorem and its extensions provide the asymptotic normality of estimators, which can be used to derive the confidence intervals
  • The delta method allows for the construction of confidence intervals for functions of parameters by linearizing the function and applying the central limit theorem (confidence interval for population mean, confidence interval for variance)
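
A simulation sketch of the asymptotic 95% interval $\bar{X}_n \pm 1.96\, s/\sqrt{n}$ for a population mean, using an Exponential(1) population as a hypothetical skewed example: the empirical coverage should approach 0.95 as $n$ grows.

```python
import numpy as np

# Coverage of the asymptotic 95% CI  xbar +/- 1.96 * s / sqrt(n)  for an Exponential(1) mean
rng = np.random.default_rng(8)
true_mean, reps = 1.0, 20_000
for n in [10, 50, 500]:
    x = rng.exponential(scale=true_mean, size=(reps, n))
    xbar, s = x.mean(axis=1), x.std(axis=1, ddof=1)
    half_width = 1.96 * s / np.sqrt(n)
    covered = np.abs(xbar - true_mean) <= half_width
    print(f"n = {n:>4}: empirical coverage = {covered.mean():.3f}")
# Coverage falls short of 0.95 for small n (skewed population) and improves as n grows.
```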

Asymptotic normality

  • Many statistical procedures rely on the asymptotic normality of estimators or test statistics, which is established through limit theorems
  • The central limit theorem and the delta method are commonly used to derive the asymptotic normality of various quantities in statistics
  • Asymptotic normality allows for the use of standard normal critical values and p-values, simplifying the implementation and interpretation of statistical methods (asymptotic normality of maximum likelihood estimators, asymptotic normality of likelihood ratio test statistics)