
Constrained optimization is a powerful technique for finding optimal solutions while satisfying specific restrictions. It's crucial in data science and statistics, where we often need to balance multiple objectives and limitations.

In this topic, we explore various types of constraints, methods for solving constrained problems, and important concepts like the KKT conditions. We'll also dive into practical applications, including portfolio optimization and support vector machines.

Constrained optimization overview

  • Constrained optimization deals with finding the optimal solution to a problem while satisfying certain constraints or restrictions
  • Involves optimizing an objective function subject to equality and/or inequality constraints
  • Plays a crucial role in various fields, including data science, statistics, machine learning, and operations research

Equality constraints

Linear equality constraints

  • Constraints expressed as linear equations, such as $Ax = b$
  • Restrict the feasible region to a hyperplane or a line
  • Can be handled using techniques like substitution or elimination, as sketched below
  • Example: $x_1 + 2x_2 = 5$
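A minimal sketch of the substitution technique, assuming a hypothetical objective $f(x_1, x_2) = x_1^2 + x_2^2$ paired with the example constraint:

```python
# Minimal sketch: minimize the hypothetical objective f(x1, x2) = x1^2 + x2^2
# subject to the example constraint x1 + 2*x2 = 5, by substitution.

def f(x1, x2):
    return x1**2 + x2**2

# Solve the constraint for x1 = 5 - 2*x2 and substitute into f:
# g(x2) = (5 - 2*x2)**2 + x2**2 = 5*x2**2 - 20*x2 + 25, an unconstrained 1-D quadratic.
# Setting g'(x2) = 10*x2 - 20 = 0 gives the minimizer below.
x2 = 2.0
x1 = 5 - 2 * x2                # back-substitute into the constraint
print(x1, x2, f(x1, x2))       # (1.0, 2.0) with objective value 5.0
```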

Nonlinear equality constraints

  • Constraints involving nonlinear functions of the decision variables
  • Introduce curvature to the feasible region
  • Require specialized techniques, such as Lagrange multipliers or Newton's method (see the sketch below)
  • Example: $x_1^2 + x_2^2 = 1$
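Here is a minimal sketch of Newton's method applied to the first-order Lagrange conditions, assuming a hypothetical objective $f(x, y) = x + y$ together with the example constraint:

```python
import numpy as np

# Minimal sketch: minimize f(x, y) = x + y subject to x^2 + y^2 = 1
# (a hypothetical objective paired with the example constraint).
# Newton's method on the first-order conditions of L = f + lam*(x^2 + y^2 - 1):
#   dL/dx = 1 + 2*lam*x = 0,  dL/dy = 1 + 2*lam*y = 0,  x^2 + y^2 - 1 = 0

def F(v):
    x, y, lam = v
    return np.array([1 + 2*lam*x, 1 + 2*lam*y, x**2 + y**2 - 1])

def J(v):
    x, y, lam = v
    return np.array([[2*lam, 0.0, 2*x],
                     [0.0, 2*lam, 2*y],
                     [2*x, 2*y, 0.0]])

v = np.array([-0.8, -0.8, 0.8])          # starting guess near the minimum
for _ in range(20):
    v = v - np.linalg.solve(J(v), F(v))  # Newton step on the stationarity system
print(v)  # approx (-0.7071, -0.7071, 0.7071); other starts may reach other stationary points
```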

Inequality constraints

Linear inequality constraints

  • Constraints expressed as linear inequalities, such as $Ax \leq b$
  • Define a half-space or a convex polytope as the feasible region
  • Can be handled using methods like the simplex method or interior-point methods
  • Example: $2x_1 - 3x_2 \leq 6$ (see the sketch below)
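A minimal sketch using scipy.optimize.linprog, with a hypothetical objective and a second inequality added so the LP has a finite optimum:

```python
from scipy.optimize import linprog

# Minimal sketch: a hypothetical LP that includes the example constraint 2*x1 - 3*x2 <= 6.
# linprog minimizes c @ x subject to A_ub @ x <= b_ub (default bounds give x >= 0).
c = [-1, -2]                    # maximize x1 + 2*x2 by minimizing its negative
A_ub = [[2, -3], [1, 1]]        # 2*x1 - 3*x2 <= 6 and x1 + x2 <= 4 (added so the LP is bounded)
b_ub = [6, 4]

res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x, res.fun)           # optimal vertex (0, 4) of the feasible polytope, objective -8
```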

Nonlinear inequality constraints

  • Constraints involving nonlinear functions of the decision variables
  • Create nonlinear boundaries for the feasible region
  • Require specialized techniques, such as sequential quadratic programming or penalty methods
  • Example: $x_1^2 + x_2^2 \leq 4$ (see the sketch below)
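A minimal sketch using scipy.optimize.minimize with a NonlinearConstraint, assuming a hypothetical objective $(x_1 - 2)^2 + (x_2 - 2)^2$ and the example disk constraint:

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

# Minimal sketch: minimize the hypothetical objective (x1 - 2)^2 + (x2 - 2)^2
# subject to the example constraint x1^2 + x2^2 <= 4.
objective = lambda x: (x[0] - 2)**2 + (x[1] - 2)**2
disk = NonlinearConstraint(lambda x: x[0]**2 + x[1]**2, -np.inf, 4)

res = minimize(objective, x0=[0.0, 0.0], method='trust-constr', constraints=[disk])
print(res.x)  # approx (sqrt(2), sqrt(2)): the unconstrained optimum (2, 2) pulled onto the circle
```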

Karush-Kuhn-Tucker (KKT) conditions

  • Necessary conditions for a solution to be optimal in a constrained optimization problem
  • Generalize the concept of Lagrange multipliers to handle inequality constraints
  • Consist of four conditions: stationarity, primal feasibility, dual feasibility, and complementary slackness

Stationarity condition

  • Requires the gradient of the Lagrangian function to be zero at the optimal point
  • Ensures that the solution is a stationary point of the Lagrangian function
  • Mathematically expressed as $\nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla g_i(x^*) + \sum_{j=1}^p \mu_j^* \nabla h_j(x^*) = 0$

Primal feasibility condition

  • Ensures that the optimal solution satisfies all the equality and inequality constraints
  • For equality constraints: $g_i(x^*) = 0, \quad i = 1, \ldots, m$
  • For inequality constraints: $h_j(x^*) \leq 0, \quad j = 1, \ldots, p$

Dual feasibility condition

  • Requires the Lagrange multipliers associated with inequality constraints to be non-negative
  • Mathematically expressed as $\mu_j^* \geq 0, \quad j = 1, \ldots, p$
  • Ensures that the dual problem is feasible

Complementary slackness condition

  • States that the product of each inequality constraint function and its corresponding Lagrange multiplier must be zero at the optimal point
  • Mathematically expressed as $\mu_j^* h_j(x^*) = 0, \quad j = 1, \ldots, p$
  • Implies that either the constraint is active ($h_j(x^*) = 0$) or the Lagrange multiplier is zero ($\mu_j^* = 0$); a numeric check of all four conditions is sketched below
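To make the four conditions concrete, here is a minimal numeric check at the optimum of a hypothetical problem: minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2 \geq 1$, written as $h(x) = 1 - x_1 - x_2 \leq 0$, whose optimum is $x^* = (0.5, 0.5)$ with multiplier $\mu^* = 1$:

```python
import numpy as np

# Minimal sketch: verify the KKT conditions at the candidate solution of
#   minimize f(x) = x1^2 + x2^2   subject to   h(x) = 1 - x1 - x2 <= 0.

x = np.array([0.5, 0.5])
mu = 1.0

grad_f = 2 * x                      # gradient of the objective
grad_h = np.array([-1.0, -1.0])     # gradient of h(x) = 1 - x1 - x2
h = 1 - x.sum()

print("stationarity:", np.allclose(grad_f + mu * grad_h, 0))  # grad f + mu * grad h = 0
print("primal feasibility:", h <= 0)                          # h(x*) <= 0
print("dual feasibility:", mu >= 0)                           # mu* >= 0
print("complementary slackness:", np.isclose(mu * h, 0))      # mu* h(x*) = 0
```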

Lagrange multipliers method

Lagrangian function formulation

  • Combines the objective function and the constraints into a single function called the Lagrangian
  • Introduces Lagrange multipliers ($\lambda_i$) for each equality constraint
  • Lagrangian function: $L(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i g_i(x)$
  • Allows solving the constrained optimization problem as an unconstrained problem

Solving Lagrange multiplier equations

  • Set the gradient of the Lagrangian function with respect to both the decision variables and Lagrange multipliers to zero
  • Solve the resulting system of equations to find the optimal solution and the values of Lagrange multipliers
  • Example: For $f(x, y) = x^2 + y^2$ subject to $x + y = 1$, solve $\nabla L(x, y, \lambda) = 0$ (worked symbolically below)
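A minimal sketch of this example worked symbolically with sympy; the solver returns the stationary point and its multiplier:

```python
import sympy as sp

# Minimal sketch: f(x, y) = x^2 + y^2 subject to g(x, y) = x + y - 1 = 0.
x, y, lam = sp.symbols('x y lam')
L = x**2 + y**2 + lam * (x + y - 1)   # Lagrangian L(x, y, lam)

# Stationarity in all variables: dL/dx = dL/dy = dL/dlam = 0
eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, (x, y, lam)))     # {x: 1/2, y: 1/2, lam: -1}
```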

Penalty methods for constraints

Quadratic penalty functions

  • Convert a constrained optimization problem into an unconstrained problem by adding a quadratic penalty term to the objective function
  • Quadratic penalty term: $\frac{1}{2\mu} \sum_{i=1}^m (g_i(x))^2 + \frac{1}{2\mu} \sum_{j=1}^p (\max(0, h_j(x)))^2$
  • The penalty parameter $\mu$ controls the severity of the penalty for violating constraints
  • As $\mu \rightarrow 0$, the solution of the penalized problem approaches the solution of the original constrained problem (see the sketch below)
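A minimal sketch of the quadratic penalty method on a hypothetical equality-constrained problem; each outer iteration shrinks $\mu$ and warm-starts from the previous solution:

```python
import numpy as np
from scipy.optimize import minimize

# Minimal sketch: minimize x1^2 + x2^2 subject to g(x) = x1 + x2 - 1 = 0
# via the penalized objective f(x) + (1/(2*mu)) * g(x)^2 for decreasing mu.

f = lambda x: x[0]**2 + x[1]**2
g = lambda x: x[0] + x[1] - 1

x = np.zeros(2)
for mu in [1.0, 0.1, 0.01, 0.001]:
    penalized = lambda x, mu=mu: f(x) + (1 / (2 * mu)) * g(x)**2
    x = minimize(penalized, x, method='BFGS').x   # warm-start from the previous solve
    print(mu, x, g(x))  # iterates approach (0.5, 0.5); the violation g(x) shrinks with mu
```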

Augmented Lagrangian methods

  • Combine the Lagrangian function with a quadratic penalty term
  • Augmented Lagrangian function: $L_A(x, \lambda, \mu) = f(x) + \sum_{i=1}^m \lambda_i g_i(x) + \frac{1}{2\mu} \sum_{i=1}^m (g_i(x))^2$
  • Iteratively update the Lagrange multipliers and the penalty parameter to converge to the optimal solution
  • Offer better convergence properties compared to the quadratic penalty method (see the sketch below)
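A minimal sketch of the method of multipliers for the same hypothetical problem; note the multiplier update $\lambda \leftarrow \lambda + g(x)/\mu$, which matches the $\frac{1}{2\mu}$ penalty scaling used above:

```python
import numpy as np
from scipy.optimize import minimize

# Minimal sketch: minimize x1^2 + x2^2 subject to g(x) = x1 + x2 - 1 = 0.
# Each outer step minimizes L_A(x, lam, mu) = f(x) + lam*g(x) + (1/(2*mu))*g(x)^2,
# then updates the multiplier; mu can stay fixed, unlike the pure penalty method.

f = lambda x: x[0]**2 + x[1]**2
g = lambda x: x[0] + x[1] - 1

x, lam, mu = np.zeros(2), 0.0, 1.0
for _ in range(10):
    L_A = lambda x, lam=lam, mu=mu: f(x) + lam * g(x) + (1 / (2 * mu)) * g(x)**2
    x = minimize(L_A, x, method='BFGS').x
    lam = lam + g(x) / mu          # multiplier update drives g(x) -> 0
print(x, lam)  # approx (0.5, 0.5) and lam approx -1, matching the Lagrange solution
```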

Barrier and interior-point methods

Logarithmic barrier functions

  • Transform inequality constraints into a barrier term added to the objective function
  • Logarithmic barrier term: $-\frac{1}{t} \sum_{j=1}^p \log(-h_j(x))$
  • The barrier parameter $t$ controls the strength of the barrier
  • As $t \rightarrow \infty$, the solution of the barrier problem approaches the solution of the original constrained problem
  • Maintains strict feasibility of the iterates with respect to the inequality constraints

Central path following

  • Follows the central path, a smooth curve parameterized by the barrier parameter $t$
  • Solves a sequence of barrier problems for increasing values of $t$
  • Uses Newton's method or other techniques to solve each barrier problem
  • Converges to the optimal solution as $t \rightarrow \infty$ (see the sketch below)
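A minimal sketch of path following on a hypothetical one-dimensional problem, minimize $(x - 2)^2$ subject to $x \leq 1$; each barrier subproblem stays strictly feasible, and the iterates approach the constrained optimum $x^* = 1$ as $t$ grows:

```python
import math
from scipy.optimize import minimize_scalar

# Minimal sketch: minimize (x - 2)^2 subject to h(x) = x - 1 <= 0.
# Each barrier subproblem minimizes (x - 2)^2 - (1/t)*log(1 - x), which is
# finite only on the strictly feasible set x < 1; t increases along the path.

for t in [1, 10, 100, 1000]:
    barrier = lambda x, t=t: (x - 2)**2 - (1 / t) * math.log(1 - x)
    res = minimize_scalar(barrier, bounds=(-5, 1 - 1e-9), method='bounded')
    print(t, res.x)  # iterates march along the central path toward x* = 1
```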

Active set methods

Working set selection

  • Identifies the set of constraints that are active (satisfied as equalities) at the current iterate
  • Partitions the constraints into active and inactive sets
  • Updates the working set based on the Lagrange multipliers and the constraint violations
  • Aims to predict the set of active constraints at the optimal solution

Solving equality constrained subproblems

  • Solves a sequence of equality constrained subproblems, considering only the active constraints
  • Uses techniques like null space methods or range space methods to handle the equality constraints
  • Performs a line search or trust region step to ensure progress towards the optimal solution
  • Updates the working set and repeats the process until convergence (a single equality-constrained subproblem solve is sketched below)
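As a building block, here is a minimal sketch that solves one equality-constrained QP subproblem by assembling its KKT linear system directly (a simple full-space alternative to the null-space and range-space approaches mentioned above); all data are hypothetical:

```python
import numpy as np

# Minimal sketch: solve   minimize 0.5 * d^T H d + c^T d   subject to   A d = b
# by solving the KKT linear system  [[H, A^T], [A, 0]] @ [d, lam] = [-c, b].

H = np.array([[2.0, 0.0], [0.0, 2.0]])   # subproblem Hessian (hypothetical)
c = np.array([-2.0, -5.0])
A = np.array([[1.0, 1.0]])                # single active constraint: d1 + d2 = 1
b = np.array([1.0])

n, m = H.shape[0], A.shape[0]
K = np.block([[H, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-c, b])
sol = np.linalg.solve(K, rhs)
d, lam = sol[:n], sol[n:]
print(d, lam)  # step d = (-0.25, 1.25) and multiplier lam = 2.5 of the active constraint
```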

Sequential quadratic programming (SQP)

Quadratic programming subproblems in SQP

  • Approximates the original problem by a sequence of quadratic programming (QP) subproblems
  • Each QP subproblem minimizes a quadratic approximation of the Lagrangian function subject to linearized constraints
  • The QP subproblem at iteration $k$: minimize $\frac{1}{2} d^T H_k d + \nabla f(x_k)^T d$ subject to $\nabla g_i(x_k)^T d + g_i(x_k) = 0, \ i = 1, \ldots, m$ and $\nabla h_j(x_k)^T d + h_j(x_k) \leq 0, \ j = 1, \ldots, p$
  • Solves each QP subproblem to obtain a search direction $d_k$

Quasi-Newton approximations in SQP

  • Uses quasi-Newton methods, such as BFGS or L-BFGS, to approximate the Hessian matrix $H_k$ of the Lagrangian function
  • Avoids the expensive computation of the exact Hessian matrix
  • Updates the Hessian approximation using the change in the gradient of the Lagrangian function and the step taken
  • Ensures superlinear convergence under certain conditions (an SQP-style solve is sketched below)
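scipy's SLSQP method (sequential least squares programming) is an SQP-type algorithm, so it can stand in for the scheme above; here is a minimal sketch on a hypothetical problem with one equality and one inequality constraint:

```python
from scipy.optimize import minimize

# Minimal sketch: minimize x1^2 + x2^2 subject to x1 + x2 = 1 and x1 - x2 >= 0
# (scipy's 'ineq' convention means fun(x) >= 0).
res = minimize(
    lambda x: x[0]**2 + x[1]**2,
    x0=[1.0, 0.0],
    method='SLSQP',
    constraints=[
        {'type': 'eq', 'fun': lambda x: x[0] + x[1] - 1},
        {'type': 'ineq', 'fun': lambda x: x[0] - x[1]},
    ],
)
print(res.x)  # approx (0.5, 0.5); the inequality holds with equality but exerts no force
```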

Applications of constrained optimization

Portfolio optimization with constraints

  • Optimizes the allocation of assets in a portfolio subject to constraints
  • Constraints may include budget constraints, risk limits, or diversification requirements
  • Example: Maximize expected return subject to a maximum volatility constraint (see the sketch below)
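A minimal sketch with made-up return and covariance data, maximizing expected return subject to a volatility cap, a full-investment budget, and no short selling:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint, NonlinearConstraint

# Minimal sketch: maximize mu @ w subject to w^T Sigma w <= sigma_max^2,
# sum(w) = 1, and 0 <= w <= 1. All numbers below are hypothetical.
mu = np.array([0.08, 0.12, 0.10])                 # expected asset returns
Sigma = np.array([[0.04, 0.01, 0.00],             # return covariance matrix
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.06]])
sigma_max = 0.20                                  # cap on portfolio std deviation

risk = NonlinearConstraint(lambda w: w @ Sigma @ w, 0, sigma_max**2)
budget = LinearConstraint([[1, 1, 1]], 1, 1)      # weights sum to one

res = minimize(lambda w: -(mu @ w), x0=np.ones(3) / 3, method='trust-constr',
               constraints=[risk, budget], bounds=[(0, 1)] * 3)
print(res.x, mu @ res.x)                          # optimal weights and expected return
```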

Constrained least squares problems

  • Finds the best fit to data points subject to constraints on the parameters
  • Constraints may arise from prior knowledge or physical limitations
  • Example: Fitting a curve to data points with non-negativity constraints on the coefficients (see the sketch below)
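A minimal sketch using scipy.optimize.nnls, which solves exactly the non-negative least squares problem from the example, on made-up data:

```python
import numpy as np
from scipy.optimize import nnls

# Minimal sketch: nnls solves  minimize ||A @ x - b||_2  subject to  x >= 0.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))                   # design matrix (hypothetical data)
b = A @ np.array([1.0, 0.0, 2.0]) + 0.1 * rng.standard_normal(20)

coef, residual_norm = nnls(A, b)
print(coef)  # all fitted coefficients are >= 0 by construction
```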

Support vector machines (SVM)

  • Constructs a hyperplane that separates two classes of data points with the maximum margin
  • Formulated as a constrained optimization problem
  • Constraints ensure that the hyperplane correctly classifies the training data
  • Example: Finding the optimal hyperplane in a linearly separable binary classification problem (see the sketch below)
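A minimal sketch using scikit-learn's SVC with a linear kernel on four hand-made, linearly separable points; a large C approximates the hard-margin formulation described above:

```python
from sklearn.svm import SVC

# Minimal sketch: fit a linear SVM and inspect the maximum-margin hyperplane.
X = [[0, 0], [1, 0], [3, 3], [4, 3]]
y = [0, 0, 1, 1]

clf = SVC(kernel='linear', C=1e6).fit(X, y)
print(clf.coef_, clf.intercept_)        # the separating hyperplane w @ x + b = 0
print(clf.support_vectors_)             # the points that determine the margin
```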