Limit theorems are crucial in understanding how random variables behave as sample sizes grow. They explain convergence in probability, almost sure convergence, and convergence in distribution. These concepts help us grasp the long-term behavior of random processes.
Key theorems like the law of large numbers and the central limit theorem form the backbone of statistical inference. They allow us to make predictions and draw conclusions from data, bridging the gap between theoretical probability and real-world applications in various fields.
Convergence in probability
Fundamental concept in probability theory that describes how a sequence of random variables converges to a certain value as the sample size increases
Convergence in probability is a weaker notion than almost sure convergence but stronger than convergence in distribution, and it is an important tool for studying the asymptotic behavior of random variables
Weak law of large numbers
States that the sample mean of a sequence of independent and identically distributed (i.i.d.) random variables converges in probability to the population mean as the sample size increases
Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with finite mean $\mu$, then for any $\varepsilon > 0$, $\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0$, where $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ is the sample mean
Provides a theoretical justification for the use of sample means as estimators of population means (sample average of dice rolls, average height of a population)
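The weak law can be watched in action with a short simulation. This is a minimal sketch (the helper `prob_far_from_mean`, the seed, the tolerance $\varepsilon = 0.25$, and the trial counts are illustrative choices, not part of the theorem): for fair-die rolls with $\mu = 3.5$, the probability that the sample mean lands more than $\varepsilon$ from $\mu$ shrinks as $n$ grows.

```python
import random
import statistics

random.seed(0)

def prob_far_from_mean(n, eps=0.25, trials=1000):
    """Estimate P(|sample mean of n die rolls - 3.5| > eps) by simulation."""
    count = 0
    for _ in range(trials):
        xbar = statistics.fmean(random.randint(1, 6) for _ in range(n))
        if abs(xbar - 3.5) > eps:
            count += 1
    return count / trials

p_small = prob_far_from_mean(10)     # small sample: deviations are common
p_large = prob_far_from_mean(1000)   # large sample: deviations become rare
print(p_small, p_large)
```

The contrast between the two estimates is exactly the statement $\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0$ read off at two values of $n$.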
Convergence of random variables
A sequence of random variables $\{X_n\}$ converges in probability to a random variable $X$ if, for any $\varepsilon > 0$, $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$
Denoted as $X_n \xrightarrow{p} X$
Convergence in probability implies that the probability of the difference between $X_n$ and $X$ exceeding any fixed value approaches zero as $n$ increases (proportion of heads in coin flips converging to 0.5, sample variance converging to population variance)
Continuous mapping theorem
States that if a sequence of random variables converges in probability to a limit and a function is continuous at that limit, then the sequence of the function applied to the random variables converges in probability to the function of the limit
Formally, if $X_n \xrightarrow{p} X$ and $g$ is a continuous function at $X$, then $g(X_n) \xrightarrow{p} g(X)$
Allows for the preservation of convergence in probability under continuous transformations (convergence of sample variance implies convergence of sample standard deviation)
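The sample-standard-deviation example above can be sketched numerically (the distribution, sample size, and seed below are arbitrary illustrative choices): the sample variance of Uniform(0,1) draws converges in probability to $\sigma^2 = 1/12$, and since $g(x) = \sqrt{x}$ is continuous at $1/12$, the continuous mapping theorem gives convergence of the sample standard deviation to $\sigma = \sqrt{1/12}$.

```python
import math
import random
import statistics

random.seed(1)

true_var = 1 / 12                       # variance of Uniform(0,1)
n = 200_000
data = [random.random() for _ in range(n)]

s2 = statistics.variance(data)          # converges in probability to 1/12
s = math.sqrt(s2)                       # g(x) = sqrt(x), continuous at 1/12
print(abs(s2 - true_var), abs(s - math.sqrt(true_var)))  # both gaps are small
```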
Almost sure convergence
Stronger notion of convergence compared to convergence in probability, where a sequence of random variables converges to a certain value with probability one
Almost sure convergence implies convergence in probability, but the converse is not always true
Strong law of large numbers
States that the sample mean of a sequence of i.i.d. random variables converges almost surely to the population mean as the sample size increases
Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with finite mean $\mu$, then $P(\lim_{n \to \infty} \bar{X}_n = \mu) = 1$, where $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ is the sample mean
Provides a stronger result compared to the weak law of large numbers, ensuring convergence with probability one (long-term frequency of heads in coin flips, average of a large number of dice rolls)
Kolmogorov's three-series theorem
Provides necessary and sufficient conditions for the almost sure convergence of a series of independent random variables
The three conditions are:
For some fixed $A > 0$, $\sum_{n=1}^{\infty} P(|X_n| > A) < \infty$
$\sum_{n=1}^{\infty} E[X_n \mathbf{1}\{|X_n| \le A\}]$ converges
$\sum_{n=1}^{\infty} \mathrm{Var}(X_n \mathbf{1}\{|X_n| \le A\}) < \infty$
If all three conditions are satisfied, then the series $\sum_{n=1}^{\infty} X_n$ converges almost surely (convergence of random harmonic series, convergence of random geometric series)
Borel-Cantelli lemmas
Two lemmas that give conditions under which a sequence of events occurs only finitely often, or infinitely often, with probability one
First Borel-Cantelli lemma: If $\sum_{n=1}^{\infty} P(A_n) < \infty$, then $P(\limsup_{n \to \infty} A_n) = 0$, meaning that the events $A_n$ occur only finitely many times with probability one
Second Borel-Cantelli lemma: If the events $A_n$ are independent and $\sum_{n=1}^{\infty} P(A_n) = \infty$, then $P(\limsup_{n \to \infty} A_n) = 1$, meaning that the events $A_n$ occur infinitely often with probability one (almost sure divergence of harmonic series, almost sure recurrence of random walks)
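Both lemmas can be illustrated with one simulation of independent events (the probabilities $1/n^2$ and $1/n$, the horizon $N$, and the seed are illustrative choices): with summable probabilities $P(A_n) = 1/n^2$ only a handful of events ever occur, while with divergent probabilities $P(A_n) = 1/n$ the occurrence count keeps growing roughly like $\ln N$.

```python
import random

random.seed(2)
N = 100_000

# First lemma: sum of 1/n^2 converges, so only finitely many A_n occur a.s.
hits_summable = sum(random.random() < 1 / n**2 for n in range(1, N + 1))

# Second lemma: sum of 1/n diverges and the events are independent,
# so the A_n occur infinitely often a.s. (count grows like ln N).
hits_divergent = sum(random.random() < 1 / n for n in range(1, N + 1))

print(hits_summable, hits_divergent)
```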
Convergence in distribution
Describes how the distribution of a sequence of random variables converges to a limiting distribution
Convergence in distribution does not imply convergence in probability or almost sure convergence, but the converse implications always hold: almost sure convergence implies convergence in probability, which in turn implies convergence in distribution
Central limit theorem
States that the standardized sum of a sequence of i.i.d. random variables with finite mean and variance converges in distribution to a standard normal distribution as the sample size increases
Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with mean $\mu$ and variance $\sigma^2$, then $\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1)$ as $n \to \infty$
Provides a fundamental result for statistical inference, allowing for the approximation of the distribution of sample means and sums (distribution of average heights, sum of dice rolls)
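The dice-roll example can be sketched directly (sample size, trial count, and seed are arbitrary illustrative choices): standardized sums of fair-die rolls, with $\mu = 3.5$ and $\sigma^2 = 35/12$, have an empirical CDF close to $\Phi$.

```python
import math
import random

random.seed(3)
mu, sigma = 3.5, math.sqrt(35 / 12)   # mean and sd of a fair die
n, trials = 200, 4000

def standardized_sum():
    """One draw of (S_n - n*mu) / (sigma * sqrt(n)) for n die rolls."""
    s = sum(random.randint(1, 6) for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

zs = [standardized_sum() for _ in range(trials)]

# Empirical CDF of the standardized sums vs Phi at a few points
for x in (-1.0, 0.0, 1.0):
    frac = sum(z <= x for z in zs) / trials
    print(x, round(frac, 3), round(phi(x), 3))
```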
Characteristic functions
A tool for studying the convergence in distribution of random variables
The characteristic function of a random variable $X$ is defined as $\varphi_X(t) = E[e^{itX}]$, where $i$ is the imaginary unit
Convergence of characteristic functions implies convergence in distribution (Lévy's continuity theorem): if $\varphi_{X_n}(t) \to \varphi_X(t)$ for all $t$, then $X_n \xrightarrow{d} X$ (convergence of binomial to Poisson distribution, convergence of sample variance to chi-squared distribution)
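The binomial-to-Poisson example can be checked numerically through characteristic functions (the rate $\lambda = 2$ and the grid of $t$ values are arbitrary illustrative choices): $\varphi$ for Binomial$(n, \lambda/n)$ is $(1 - p + p e^{it})^n$, $\varphi$ for Poisson$(\lambda)$ is $e^{\lambda(e^{it}-1)}$, and the pointwise gap shrinks as $n$ grows.

```python
import cmath

lam = 2.0  # Poisson rate; Binomial(n, lam/n) should converge to Poisson(lam)

def phi_binomial(t, n):
    """Characteristic function of Binomial(n, lam/n)."""
    p = lam / n
    return (1 - p + p * cmath.exp(1j * t)) ** n

def phi_poisson(t):
    """Characteristic function of Poisson(lam)."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

for t in (0.5, 1.0, 2.0):
    gap_small = abs(phi_binomial(t, 10) - phi_poisson(t))
    gap_large = abs(phi_binomial(t, 10_000) - phi_poisson(t))
    print(t, gap_small, gap_large)  # the gap shrinks as n grows
```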
Lindeberg-Feller theorem
Generalizes the central limit theorem to sequences of independent, but not necessarily identically distributed, random variables
Provides conditions under which the standardized sum of a sequence of random variables converges in distribution to a standard normal distribution
The Lindeberg condition: $\frac{1}{s_n^2} \sum_{i=1}^{n} E[(X_i - \mu_i)^2 \mathbf{1}\{|X_i - \mu_i| > \varepsilon s_n\}] \to 0$ for all $\varepsilon > 0$, where $s_n^2 = \sum_{i=1}^{n} \mathrm{Var}(X_i)$
If the Lindeberg condition holds and $\max_{1 \le i \le n} \mathrm{Var}(X_i) / s_n^2 \to 0$, then $\frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i) \xrightarrow{d} N(0,1)$ (convergence of non-identically distributed random variables, convergence of triangular arrays)
Delta method
A technique for deriving the asymptotic distribution of a function of an asymptotically normal estimator
If $\sqrt{n}(X_n - \theta) \xrightarrow{d} N(0, \sigma^2)$ and $g$ is a differentiable function with $g'(\theta) \ne 0$, then $\sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{d} N(0, \sigma^2 [g'(\theta)]^2)$
Allows for the construction of confidence intervals and hypothesis tests for functions of estimators (asymptotic distribution of sample variance, asymptotic distribution of maximum likelihood estimators)
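A minimal numerical check of the delta method (the choice of Uniform(0,1) data, $g(x) = x^2$, and all sample sizes and seeds below are illustrative): with $X_i \sim \mathrm{Uniform}(0,1)$, $\sqrt{n}(\bar{X}_n - 1/2) \xrightarrow{d} N(0, 1/12)$, and since $g'(x) = 2x$, the delta method predicts that $\sqrt{n}(\bar{X}_n^2 - 1/4)$ has asymptotic variance $\frac{1}{12}[g'(1/2)]^2 = \frac{1}{12}$.

```python
import math
import random
import statistics

random.seed(4)
n, trials = 1000, 3000
theta, var_x = 0.5, 1 / 12            # mean and variance of Uniform(0,1)

samples = []
for _ in range(trials):
    xbar = statistics.fmean(random.random() for _ in range(n))
    samples.append(math.sqrt(n) * (xbar**2 - theta**2))  # g(x) = x^2

empirical_var = statistics.variance(samples)
predicted_var = var_x * (2 * theta) ** 2   # sigma^2 [g'(theta)]^2, g'(x) = 2x
print(empirical_var, predicted_var)        # the two should nearly agree
```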
Functional limit theorems
Extend the concept of convergence in distribution to function spaces, describing the convergence of stochastic processes to limiting processes
Functional limit theorems are essential for studying the asymptotic behavior of empirical processes and other stochastic processes arising in statistics and probability
Donsker's theorem
States that the empirical process, the scaled difference between the empirical distribution function and the true distribution function, converges in distribution to a Brownian bridge process
Formally, if $X_1, X_2, \ldots$ are i.i.d. random variables with distribution function $F$, then $\sqrt{n}(F_n - F) \xrightarrow{d} B \circ F$ in the Skorokhod space $D[0,1]$, where $F_n$ is the empirical distribution function and $B$ is a standard Brownian bridge
Provides a foundation for the study of empirical processes and their applications in statistics (goodness-of-fit tests, confidence bands for distribution functions)
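The goodness-of-fit application can be sketched through the Kolmogorov-Smirnov statistic (sample size, trial count, seed, and the reference value $1.36$ below are illustrative): for uniform data, $\sqrt{n}\,\sup_x |F_n(x) - x|$ converges in distribution to the supremum of a Brownian bridge, whose CDF at $1.36$ is approximately $0.95$.

```python
import math
import random

random.seed(5)
n, trials = 1000, 2000

def ks_statistic():
    """sqrt(n) * sup|F_n - F| for a sample of n Uniform(0,1) draws."""
    u = sorted(random.random() for _ in range(n))
    # For a step empirical CDF, the sup is attained just before or at a jump
    d = max(max((i + 1) / n - ui, ui - i / n) for i, ui in enumerate(u))
    return math.sqrt(n) * d

stats = [ks_statistic() for _ in range(trials)]
frac = sum(s <= 1.36 for s in stats) / trials
print(frac)  # should be close to the limiting value of roughly 0.95
```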
Empirical process theory
Studies the asymptotic behavior of empirical processes, which are stochastic processes based on random samples
Empirical processes include the empirical distribution function, empirical characteristic function, and empirical moment functions
Empirical process theory develops tools for deriving the limiting distributions of functionals of empirical processes, such as suprema and integrals (Kolmogorov-Smirnov test, Cramér-von Mises test)
Brownian motion approximation
Approximates the behavior of certain stochastic processes by Brownian motion or related processes, such as Brownian bridge or fractional Brownian motion
Brownian motion approximation is particularly useful for studying the asymptotic properties of partial sum processes, empirical processes, and other stochastic processes with dependent increments
Examples include the invariance principle for partial sums of weakly dependent random variables, the functional central limit theorem for martingales, and the approximation of queueing processes by reflected Brownian motion (approximation of random walks, approximation of cumulative sums)
Rates of convergence
Quantify the speed at which a sequence of random variables or a stochastic process converges to its limiting distribution or process
Rates of convergence are important for assessing the accuracy of approximations, constructing confidence intervals, and deriving higher-order asymptotic expansions
Berry-Esseen theorem
Provides a bound on the rate of convergence in the central limit theorem for i.i.d. random variables with finite third moments
If $X_1, X_2, \ldots$ are i.i.d. random variables with mean $\mu$, variance $\sigma^2$, and finite third absolute moment $\rho$, then $\sup_x \left| P\!\left( \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \le x \right) - \Phi(x) \right| \le \frac{C\rho}{\sigma^3 \sqrt{n}}$, where $\Phi$ is the standard normal distribution function and $C$ is a universal constant
Provides a quantitative bound on the error of the normal approximation for standardized sums (error in approximating binomial distribution by normal, error in approximating Poisson distribution by normal)
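The binomial example admits a fully deterministic check (the choice $n = 100$, $p = 1/2$ is illustrative; $C = 0.4748$ is Shevtsova's admissible value of the universal constant): compute the exact sup-distance between the standardized Binomial CDF and $\Phi$ at every jump point and compare it with the Berry-Esseen bound.

```python
import math

n, p = 100, 0.5
mu, sigma = p, math.sqrt(p * (1 - p))
rho = p * (1 - p) ** 3 + (1 - p) * p ** 3       # E|X - mu|^3 for Bernoulli(p)

def phi_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# sup_x |P(standardized S_n <= x) - Phi(x)|, checked on both sides of each jump
cdf = 0.0
sup_err = 0.0
for k in range(n + 1):
    x = (k - n * mu) / (sigma * math.sqrt(n))
    sup_err = max(sup_err, abs(cdf - phi_cdf(x)))   # just below the jump at x
    cdf += math.comb(n, k) * p**k * (1 - p) ** (n - k)
    sup_err = max(sup_err, abs(cdf - phi_cdf(x)))   # at the jump

bound = 0.4748 * rho / (sigma**3 * math.sqrt(n))
print(sup_err, bound)  # the actual error stays below the bound
```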
Edgeworth expansions
Provide higher-order asymptotic expansions for the distribution of a standardized sum of i.i.d. random variables
Edgeworth expansions are based on the cumulants of the random variables and involve Hermite polynomials
The first-order Edgeworth expansion is given by $P\!\left( \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \le x \right) = \Phi(x) - \frac{\kappa_3}{6\sigma^3\sqrt{n}} (x^2 - 1)\phi(x) + o\!\left( \frac{1}{\sqrt{n}} \right)$, where $\kappa_3$ is the third cumulant and $\phi$ is the standard normal density function
Edgeworth expansions provide more accurate approximations compared to the central limit theorem and can be used for constructing asymptotic confidence intervals and improving the coverage probability of confidence intervals (improved approximation of binomial distribution, improved approximation of chi-squared distribution)
Sanov's theorem
Describes the rate of convergence of the empirical distribution of a sequence of i.i.d. random variables to the true distribution in terms of the Kullback-Leibler divergence
If $X_1, X_2, \ldots$ are i.i.d. random variables with distribution $P$ and $Q$ is another distribution, then $P\!\left( \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i} \in B(Q, \varepsilon) \right) \approx e^{-n \inf_{Q' \in B(Q,\varepsilon)} D(Q' \,\|\, P)}$, where $B(Q, \varepsilon)$ is the ball of radius $\varepsilon$ around $Q$ in the space of probability measures and $D(Q' \,\|\, P)$ is the Kullback-Leibler divergence between $Q'$ and $P$
Sanov's theorem has applications in large deviations theory, hypothesis testing, and information theory (rate of convergence of empirical distributions, error exponents in hypothesis testing)
Applications of limit theorems
Limit theorems have numerous applications in various areas of statistics, including inference, hypothesis testing, and confidence interval construction
The asymptotic properties derived from limit theorems provide the foundation for many statistical procedures and allow for the development of efficient and robust methods
Statistical inference
Limit theorems are essential for deriving the asymptotic distributions of estimators, such as maximum likelihood estimators, method of moments estimators, and least squares estimators
The asymptotic normality of estimators, established through the central limit theorem or its extensions, allows for the construction of confidence intervals and hypothesis tests (asymptotic distribution of sample mean, asymptotic distribution of regression coefficients)
Hypothesis testing
Limit theorems provide the basis for various hypothesis testing procedures, such as the likelihood ratio test, Wald test, and score test
The asymptotic distributions of test statistics under the null and alternative hypotheses can be derived using the central limit theorem, the delta method, or other limit theorems
These asymptotic distributions are used to determine critical values and p-values for the tests (t-test for comparing means, chi-squared test for independence)
Confidence intervals
Limit theorems enable the construction of asymptotic confidence intervals for parameters of interest
The central limit theorem and its extensions provide the asymptotic normality of estimators, which can be used to derive the confidence intervals
The delta method allows for the construction of confidence intervals for functions of parameters by linearizing the function and applying the central limit theorem (confidence interval for population mean, confidence interval for variance)
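A sketch of the standard asymptotic interval $\bar{X}_n \pm 1.96\, s/\sqrt{n}$ for a population mean, with its coverage checked by simulation (the Exponential(1) data model, sample size, trial count, and seed are illustrative choices):

```python
import math
import random
import statistics

random.seed(6)
n, trials, z = 200, 2000, 1.96       # z is the 97.5% standard normal quantile
true_mean = 1.0                      # mean of Exponential(1)

covered = 0
for _ in range(trials):
    data = [random.expovariate(1.0) for _ in range(n)]
    xbar = statistics.fmean(data)
    half = z * statistics.stdev(data) / math.sqrt(n)   # CLT-based half-width
    covered += (xbar - half <= true_mean <= xbar + half)

coverage = covered / trials
print(coverage)  # empirical coverage should be close to the nominal 0.95
```

The interval is only asymptotically exact, so for skewed data like the exponential its finite-sample coverage sits slightly below the nominal level.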
Asymptotic normality
Many statistical procedures rely on the asymptotic normality of estimators or test statistics, which is established through limit theorems
The central limit theorem, its extensions such as the Lindeberg-Feller theorem, and the delta method are commonly used to derive the asymptotic normality of various quantities in statistics
Asymptotic normality allows for the use of standard normal critical values and p-values, simplifying the implementation and interpretation of statistical methods (asymptotic normality of maximum likelihood estimators, asymptotic normality of likelihood ratio test statistics)