5.2 Matrix Formulation of Simple Linear Regression
4 min read • July 30, 2024
Matrix formulation simplifies simple linear regression, making it easier to handle large datasets and complex models. It allows for efficient computation of parameter estimates and provides a compact representation of the regression problem.
This approach extends naturally to multiple regression and other linear models. Understanding matrix notation is crucial for advanced statistical techniques and computational methods in data analysis and machine learning.
Linear Regression in Matrix Form
Matrix Notation
The simple linear regression model with n observations can be expressed as $y = X\beta + \varepsilon$, where:
y is an n×1 vector of response values
X is an n×2 design matrix
β is a 2×1 vector of parameters ($\beta_0$ and $\beta_1$)
ε is an n×1 vector of errors
The first column of the design matrix X consists of a vector of ones, representing the intercept term, while the second column contains the predictor variable values
The error vector ε is assumed to have a multivariate normal distribution with mean zero and variance-covariance matrix $\sigma^2 I$, where I is the n×n identity matrix (a code sketch of this setup follows below)
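To make the notation concrete, here is a minimal NumPy sketch that builds each piece of $y = X\beta + \varepsilon$. The sample size, true parameter values, and noise level are illustrative assumptions, not part of the model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 50                                # number of observations (illustrative)
x = rng.uniform(0, 10, size=n)        # fixed predictor values
beta = np.array([1.0, 2.0])           # assumed true (beta_0, beta_1)
sigma = 0.5                           # assumed error standard deviation

# Design matrix X: a column of ones (intercept) next to the predictor column
X = np.column_stack((np.ones(n), x))  # shape (n, 2)

# Errors: i.i.d. normal, mean 0, variance sigma^2 (covariance sigma^2 * I)
eps = rng.normal(0.0, sigma, size=n)  # shape (n,)

# The model in matrix form: y = X beta + eps
y = X @ beta + eps                    # shape (n,)
```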
Model Assumptions
The relationship between the response variable and the predictor variable is linear
The errors are independently and identically distributed (i.i.d.) with a normal distribution
The errors have a mean of zero and a constant variance $\sigma^2$
The predictor variable is measured without error and is fixed (non-random)
Design Matrix and Parameter Vector
Design Matrix Structure
The design matrix X for simple linear regression with n observations and one predictor variable is an n×2 matrix
The first column is a vector of ones (1,1,...,1), representing the intercept term
The second column contains the values of the predictor variable (x1,x2,...,xn)
Example: For a simple linear regression with 5 observations and predictor values (2, 4, 6, 8, 10), the design matrix X would be:

$$X = \begin{bmatrix}
1 & 2 \\
1 & 4 \\
1 & 6 \\
1 & 8 \\
1 & 10
\end{bmatrix}$$
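Building this design matrix in code is a one-liner with NumPy; a quick sketch using the example's predictor values:

```python
import numpy as np

x = np.array([2, 4, 6, 8, 10], dtype=float)   # predictor values from the example
X = np.column_stack((np.ones_like(x), x))     # prepend the intercept column

print(X)
# [[ 1.  2.]
#  [ 1.  4.]
#  [ 1.  6.]
#  [ 1.  8.]
#  [ 1. 10.]]
```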
Parameter Vector
The parameter vector β is a 2×1 vector containing the intercept ($\beta_0$) and the slope ($\beta_1$) of the simple linear regression model
$$\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}$$
The structure of the design matrix X and parameter vector β allows for a concise representation of the simple linear regression model in matrix form
Least Squares Estimation with Matrices
Objective Function
The estimation problem in matrix form aims to minimize the sum of squared residuals, which can be expressed as $(y - X\beta)^T(y - X\beta)$, where $(y - X\beta)$ represents the vector of residuals
To find the least squares estimates of the parameters, we differentiate the sum of squared residuals with respect to β and set the resulting expression equal to zero
The resulting equation, known as the normal equation, is $X^T(y - X\beta) = 0$, where $X^T$ represents the transpose of the design matrix X
Solving the Normal Equations
The normal equations can be solved for the least squares estimates of the parameters β by premultiplying both sides by $(X^TX)^{-1}$, resulting in:
$\hat{\beta} = (X^TX)^{-1}X^Ty$, where $\hat{\beta}$ represents the least squares estimates of the parameters
The matrix $\sigma^2(X^TX)^{-1}$ is the variance-covariance matrix of the parameter estimates, and its diagonal elements provide the variances of the intercept and slope estimates
Example: Using the design matrix X from the previous example and a response vector $y = (3, 5, 7, 9, 11)^T$, the least squares estimates can be calculated as:

$$\hat{\beta} = (X^TX)^{-1}X^Ty = \begin{bmatrix} 5 & 30 \\ 30 & 220 \end{bmatrix}^{-1} \begin{bmatrix} 35 \\ 250 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
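The worked example is easy to verify numerically. This sketch forms $X^TX$ and $X^Ty$ and solves the normal equations; `np.linalg.solve` is used instead of an explicit inverse, which is the numerically preferred route:

```python
import numpy as np

x = np.array([2, 4, 6, 8, 10], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)
X = np.column_stack((np.ones_like(x), x))

XtX = X.T @ X              # [[  5.  30.]
                           #  [ 30. 220.]]
Xty = X.T @ y              # [ 35. 250.]

# Solve (X^T X) beta = X^T y rather than inverting X^T X explicitly
beta_hat = np.linalg.solve(XtX, Xty)
print(beta_hat)            # [1. 1.]  -> intercept 1, slope 1
```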
Normal Equations Derivation
Expanding the Least Squares Problem
The normal equations for simple linear regression can be derived by expanding the matrix equation $X^T(y - X\beta) = 0$
Multiplying out the parentheses yields $X^TX\beta = X^Ty$, where $X^TX$ is a 2×2 matrix and $X^Ty$ is a 2×1 vector
The expanded form of the normal equations is:
$$\begin{bmatrix}
\sum_{i=1}^n 1 & \sum_{i=1}^n x_i \\
\sum_{i=1}^n x_i & \sum_{i=1}^n x_i^2
\end{bmatrix} \begin{bmatrix}
\beta_0 \\
\beta_1
\end{bmatrix} = \begin{bmatrix}
\sum_{i=1}^n y_i \\
\sum_{i=1}^n x_iy_i
\end{bmatrix}$$
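Because $\sum 1 = n$, the summation form and the matrix-product form assemble exactly the same system. A quick sketch checking this on the running example, building $X^TX$ and $X^Ty$ directly from the raw sums:

```python
import numpy as np

x = np.array([2, 4, 6, 8, 10], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)
n = len(x)

# X^T X and X^T y written directly from the sums in the expanded form
XtX = np.array([[n,       x.sum()],
                [x.sum(), (x**2).sum()]])
Xty = np.array([y.sum(), (x * y).sum()])

beta_hat = np.linalg.solve(XtX, Xty)
print(beta_hat)   # [1. 1.] -- identical to the matrix-product computation
```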
Solving for Parameter Estimates
The normal equations in matrix form can be solved for the least squares estimates of the parameters β by premultiplying both sides by $(X^TX)^{-1}$
The resulting expression for the least squares estimates is:
$$\hat{\beta} = \begin{bmatrix}
\sum_{i=1}^n 1 & \sum_{i=1}^n x_i \\
\sum_{i=1}^n x_i & \sum_{i=1}^n x_i^2
\end{bmatrix}^{-1} \begin{bmatrix}
\sum_{i=1}^n y_i \\
\sum_{i=1}^n x_iy_i
\end{bmatrix}$$
The matrix $\sigma^2(X^TX)^{-1}$ is the variance-covariance matrix of the parameter estimates: its diagonal elements provide the variances of the intercept and slope estimates
The off-diagonal elements of $\sigma^2(X^TX)^{-1}$ represent the covariances between the intercept and slope estimates (a sketch of estimating this matrix from data follows below)
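In practice $\sigma^2$ is unknown and must be estimated from the residuals. A minimal sketch, using slightly perturbed hypothetical responses (so the residuals are nonzero) and the usual unbiased estimator $\hat{\sigma}^2 = \text{RSS}/(n - 2)$:

```python
import numpy as np

x = np.array([2, 4, 6, 8, 10], dtype=float)
y = np.array([3.1, 4.8, 7.2, 8.9, 11.0])   # hypothetical noisy responses
X = np.column_stack((np.ones_like(x), x))

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat

# Unbiased estimate of sigma^2: RSS / (n - p), with p = 2 parameters here
n, p = X.shape
sigma2_hat = residuals @ residuals / (n - p)

# Estimated variance-covariance matrix of beta_hat: sigma^2 (X^T X)^{-1}
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)

print(np.diag(cov_beta))           # variances of intercept and slope
print(np.sqrt(np.diag(cov_beta)))  # their standard errors
print(cov_beta[0, 1])              # covariance between intercept and slope
```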