
Maximum likelihood estimation (MLE) is a crucial technique for fitting generalized linear models (GLMs). It finds the parameter values that make the observed data most likely, given the chosen probability distribution and link function.

MLE for GLMs involves maximizing the log-likelihood function, which depends on the specific exponential family distribution. The process typically requires iterative numerical methods, yielding estimates for regression coefficients and dispersion parameters.

Likelihood Function for GLMs

Formulation and General Form

  • The likelihood function for a GLM is the product of the probability density or mass functions for each observation, assuming the observations are independent
  • The specific form of the likelihood function depends on the chosen exponential family distribution for the response variable (Bernoulli, Poisson, Gaussian)
  • The likelihood function for a GLM with n observations takes the general form (a Poisson sketch follows this list):
    • L(β; y) = ∏ᵢ₌₁ⁿ f(yᵢ; θᵢ, ϕ)
    • f(yᵢ; θᵢ, ϕ) is the probability density or mass function for the ith observation
    • θᵢ is the natural parameter
    • ϕ is the dispersion parameter
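To make the general form concrete, here is a minimal Python sketch (assuming NumPy and SciPy, a Poisson response, and a log link) that evaluates the log of this product for a given coefficient vector; the function name and arguments are illustrative, not part of any particular library.

```python
import numpy as np
from scipy.special import gammaln

def poisson_log_likelihood(beta, X, y):
    """Log of L(beta; y) = prod_i f(y_i; theta_i, phi) for a Poisson GLM
    with log link (phi is fixed at 1 for the Poisson family)."""
    eta = X @ beta                      # linear predictor eta_i = x_i^T beta
    mu = np.exp(eta)                    # mean mu_i via the inverse log link
    # log f(y_i; mu_i) = y_i*log(mu_i) - mu_i - log(y_i!)
    return np.sum(y * eta - mu - gammaln(y + 1))
```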

Relationship between Parameters and Predictors

  • The natural parameter θᵢ is related to the linear predictor ηᵢ through the link function:
    • g(μᵢ) = ηᵢ = xᵢᵀβ
    • μᵢ is the mean of the response variable for the ith observation
    • xᵢ is the vector of predictor variables
    • β is the vector of regression coefficients
  • The dispersion parameter ϕ is a measure of the variability in the response variable and is assumed to be constant across observations in a GLM
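As a small illustration of this relationship, the sketch below (made-up numbers, assuming NumPy) computes the linear predictor and then recovers the mean under two common link functions.

```python
import numpy as np

# g(mu) = eta = X @ beta, shown for the log and logit links.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5]])   # column of ones = intercept
beta = np.array([0.2, 0.8])                           # illustrative coefficients

eta = X @ beta                       # linear predictor eta_i = x_i^T beta

mu_log = np.exp(eta)                 # log link:   g(mu) = log(mu)        -> mu = exp(eta)
mu_logit = 1 / (1 + np.exp(-eta))    # logit link: g(mu) = log(mu/(1-mu)) -> mu = 1/(1+exp(-eta))
```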

Log-Likelihood Function for GLMs

Derivation and Decomposition

  • The log-likelihood function is obtained by taking the natural logarithm of the likelihood function:
    • ℓ(β; y) = log(L(β; y)) = ∑ᵢ₌₁ⁿ log(f(yᵢ; θᵢ, ϕ))
  • The log-likelihood function for a GLM can be decomposed into three components (a worked Poisson example follows this list):
    • ℓ(β; y) = ∑ᵢ₌₁ⁿ {[yᵢθᵢ - b(θᵢ)] / a(ϕ) + c(yᵢ, ϕ)}
    • b(θᵢ) is the cumulant function
    • a(ϕ) is a function of the dispersion parameter
    • c(yᵢ, ϕ) is a function of the response variable and the dispersion parameter
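For the Poisson distribution, the decomposition uses θ = log(μ), b(θ) = exp(θ), a(ϕ) = 1, and c(y, ϕ) = -log(y!). The sketch below (arbitrary values, assuming NumPy and SciPy) checks that the decomposed form matches the direct Poisson log-probability.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import poisson

# Arbitrary observation and mean, chosen only for illustration.
y, mu = 3.0, 2.5
theta = np.log(mu)                                   # natural parameter

# [y*theta - b(theta)] / a(phi) + c(y, phi) with b(theta)=exp(theta), a(phi)=1
decomposed = (y * theta - np.exp(theta)) / 1.0 - gammaln(y + 1)
direct = poisson.logpmf(3, mu)                       # direct Poisson log-pmf

print(np.isclose(decomposed, direct))                # True
```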

Exponential Family Distribution-Specific Functions

  • The cumulant function b(θᵢ) is specific to the chosen exponential family distribution and determines the relationship between the natural parameter θᵢ and the mean μᵢ of the response variable
  • The functions a(ϕ) and c(yᵢ, ϕ) are also specific to the chosen exponential family distribution and are related to the dispersion parameter ϕ and the response variable yᵢ
  • The score function, defined as the gradient of the log-likelihood function with respect to the regression coefficients β, is used to find the maximum likelihood estimates of the parameters (a sketch follows this list)
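For the Poisson model with its canonical log link, the score simplifies to Xᵀ(y − μ); the sketch below (assuming NumPy, illustrative function name) implements it.

```python
import numpy as np

def poisson_score(beta, X, y):
    """Score function U(beta) for a Poisson GLM with log link.

    With the canonical link, the gradient of the log-likelihood with
    respect to beta simplifies to X^T (y - mu)."""
    mu = np.exp(X @ beta)        # fitted means at the current beta
    return X.T @ (y - mu)        # gradient of the log-likelihood
```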

Maximum Likelihood Estimation for GLMs

Estimation Process

  • Maximum likelihood estimation (MLE) is a method for estimating the parameters of a GLM by finding the values of the parameters that maximize the log-likelihood function
  • The MLE of the regression coefficients β is obtained by setting the score function equal to zero and solving the resulting system of equations:
    • ∂ℓ(β; y) / ∂β = 0
  • In most cases, the MLE of β cannot be obtained analytically and requires iterative numerical optimization methods (Newton-Raphson algorithm, Fisher scoring algorithm; a sketch follows this list)
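Below is a minimal Fisher scoring sketch for the Poisson/log-link case (assuming NumPy; the function name and defaults are illustrative). Each iteration solves the expected-information system and stops once the parameter update is small.

```python
import numpy as np

def fit_poisson_glm(X, y, tol=1e-8, max_iter=50):
    """Fisher scoring (equivalent to IRLS here) for a Poisson GLM with log link."""
    p = X.shape[1]
    beta = np.zeros(p)                        # starting values
    for _ in range(max_iter):
        mu = np.exp(X @ beta)                 # current fitted means
        score = X.T @ (y - mu)                # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])        # expected information X^T W X, W = diag(mu)
        delta = np.linalg.solve(info, score)  # Fisher scoring update
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:       # convergence: change below tolerance
            break
    return beta
```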

Iterative Optimization and Convergence

  • The iterative process starts with initial values for the parameters and updates them in each iteration until convergence is achieved
  • Convergence is determined when the change in the parameter estimates or the log-likelihood function falls below a specified tolerance level
  • The MLE of the dispersion parameter ϕ, if not known, can be obtained by maximizing the profile likelihood function, which is the log-likelihood function evaluated at the MLE of β
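Continuing the Poisson example, the usage below simulates data with made-up coefficients and fits it with the `fit_poisson_glm` sketch defined earlier; because the Poisson family fixes ϕ = 1, no dispersion parameter needs to be estimated here.

```python
import numpy as np

# Simulated usage of the fit_poisson_glm sketch above (coefficients are made up).
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one predictor
true_beta = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ true_beta))                   # Poisson counts

beta_hat = fit_poisson_glm(X, y)   # converges in a few iterations
print(beta_hat)                    # estimates close to [0.5, 0.3]
```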

Standard Errors and Information Matrix

  • The standard errors of the estimated parameters can be obtained from the inverse of the observed information matrix
  • The observed information matrix is the negative Hessian matrix of the log-likelihood function evaluated at the MLE
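In the Poisson/log-link sketch, the observed and expected information matrices coincide (canonical link), so the standard errors can be read off the inverse of XᵀWX evaluated at the MLE; the helper below is an illustrative continuation of that sketch, not a library function.

```python
import numpy as np

def poisson_standard_errors(beta_hat, X):
    """Standard errors for the Poisson/log-link sketch above.

    With the canonical link, I(beta_hat) = X^T W X with W = diag(mu_hat)."""
    mu_hat = np.exp(X @ beta_hat)
    cov = np.linalg.inv(X.T @ (X * mu_hat[:, None]))   # inverse information matrix
    return np.sqrt(np.diag(cov))                       # standard errors of beta_hat
```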

Interpreting GLM Coefficients

  • The estimated regression coefficients β represent the change in the linear predictor ηᵢ for a unit change in the corresponding predictor variable, holding other predictors constant
  • The interpretation of the coefficients depends on the link function used in the GLM:
    • Log link: coefficients represent the change in the log of the mean response for a unit change in the predictor
    • Logit link: coefficients represent the change in the log odds of the response for a unit change in the predictor
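As a small hedged illustration (made-up coefficient value), exponentiating a coefficient converts the additive change on the linear-predictor scale into a multiplicative change on the mean or odds scale.

```python
import numpy as np

# Hypothetical fitted coefficient for a single predictor (illustration only).
beta_j = 0.35

# Log link: exp(beta_j) is the multiplicative change in the mean response
# (a rate ratio for count models) per one-unit increase in the predictor.
rate_ratio = np.exp(beta_j)     # about 1.42, i.e. a ~42% higher mean

# Logit link: the same exponentiation gives the odds ratio per unit increase.
odds_ratio = np.exp(beta_j)
```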

Hypothesis Tests and Significance

  • The significance of the estimated coefficients can be assessed using hypothesis tests (Wald test, likelihood ratio test)
  • The Wald test statistic is the ratio of the estimated coefficient to its standard error and follows a standard normal distribution under the null hypothesis that the coefficient is zero
  • The likelihood ratio test compares the log-likelihood of the fitted model to that of a reduced model without the predictor of interest and follows a chi-square distribution with degrees of freedom equal to the difference in the number of parameters between the two models
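The sketch below (placeholder numbers, assuming SciPy) shows how the two tests are typically computed: a Wald z-statistic compared to the standard normal, and a likelihood ratio statistic compared to a chi-square distribution.

```python
from scipy.stats import norm, chi2

# Wald test of H0: beta_j = 0 (estimate and standard error are placeholders).
beta_j, se_j = 0.35, 0.12
z = beta_j / se_j                          # Wald statistic
p_wald = 2 * norm.sf(abs(z))               # two-sided p-value

# Likelihood ratio test: full model vs. reduced model without the predictor.
# The log-likelihood values are placeholders, not computed from real data.
llf_full, llf_reduced, df_diff = -101.3, -104.9, 1
lr_stat = 2 * (llf_full - llf_reduced)     # LR statistic
p_lrt = chi2.sf(lr_stat, df_diff)          # chi-square p-value
```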

Confidence Intervals and Exponentiated Coefficients

  • Confidence intervals for the estimated coefficients can be constructed using the standard errors and the appropriate critical values from the standard normal or t-distribution, depending on the sample size and the distributional assumptions
  • The exponentiated coefficients, known as odds ratios or risk ratios, provide a more interpretable measure of the association between the predictors and the response variable, particularly for binary or count responses
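A short sketch (placeholder estimate and standard error, assuming NumPy and SciPy) of a 95% Wald interval and its exponentiated version on the ratio scale:

```python
import numpy as np
from scipy.stats import norm

# Placeholder coefficient estimate and standard error for illustration.
beta_j, se_j = 0.35, 0.12
z_crit = norm.ppf(0.975)                          # about 1.96 for a 95% interval

ci_lower = beta_j - z_crit * se_j
ci_upper = beta_j + z_crit * se_j

# Exponentiate the endpoints to express the interval as an odds or rate ratio.
ratio_ci = (np.exp(ci_lower), np.exp(ci_upper))
```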