Mathematical Methods for Optimization Unit 16 – Semidefinite Programming
Semidefinite Programming (SDP) is a powerful optimization technique that extends linear programming to handle matrix inequalities. It deals with optimizing linear objectives subject to constraints on positive semidefinite matrices, offering a versatile framework for solving complex problems in various fields.
SDP has applications in control theory, machine learning, and combinatorial optimization. It provides efficient methods for approximating hard problems and obtaining bounds on optimal values. With its ability to handle nonlinear constraints through clever reformulations, SDP has become an essential tool in modern optimization theory and practice.
Semidefinite Programming (SDP) is a subfield of convex optimization that generalizes linear programming (LP) and second-order cone programming (SOCP)
SDP deals with optimizing a linear objective function subject to linear matrix inequality (LMI) constraints
LMI constraints require a matrix to be positive semidefinite (all of its eigenvalues are non-negative); a quick numerical check appears after this list
SDP problems can be solved efficiently using interior-point methods or other specialized algorithms
Many problems in engineering, control theory, and combinatorial optimization can be formulated as SDP problems
SDP provides a powerful framework for approximating hard optimization problems and obtaining bounds on their optimal values
SDP has applications in areas such as sensor network localization, graph theory, and polynomial optimization
SDP is a relatively recent field, with polynomial-time interior-point algorithms for SDP developed in the 1990s (the Nesterov-Nemirovski framework)
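Since the eigenvalue test above is the standard numerical check for positive semidefiniteness, here is a minimal sketch in Python with NumPy (the choice of language is ours; the notes themselves are tool-agnostic):

```python
import numpy as np

# A symmetric matrix is positive semidefinite exactly when its smallest
# eigenvalue is non-negative; eigvalsh is NumPy's symmetric eigensolver.
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
print(np.linalg.eigvalsh(A).min() >= -1e-12)  # True: eigenvalues are 1 and 3
```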
Key Concepts and Definitions
Positive semidefinite matrix: A symmetric matrix A is positive semidefinite if x^T A x ≥ 0 for all vectors x
Equivalently, all eigenvalues of A are non-negative
Linear matrix inequality (LMI): An inequality of the form A(x) ⪰ 0, where A(x) is a symmetric matrix that depends affinely on the decision variables x
Primal SDP problem: Minimize c^T x subject to A(x) ⪰ 0, where c is a given vector and A(x) = A_0 + ∑_{i=1}^n x_i A_i for given symmetric matrices A_0, A_1, …, A_n (a minimal solver sketch follows this list)
Dual SDP problem: Maximize −tr(A_0 Y) subject to tr(A_i Y) = c_i for i = 1, …, n and Y ⪰ 0, where Y is a symmetric matrix variable
Schur complement: A useful tool for converting nonlinear matrix inequalities into LMIs
For a symmetric block matrix [A, B; B^T, C] with A ≻ 0, the full matrix is positive semidefinite if and only if the Schur complement satisfies C − B^T A^{-1} B ⪰ 0
S-procedure: A technique for converting quadratic constraints into LMIs using additional scalar variables
Sum-of-squares (SOS) decomposition: Expressing a polynomial as a sum of squared polynomials, which can be formulated as an SDP problem
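To make the primal form concrete, here is a minimal sketch using cvxpy, a Python modeling layer that plays the role of the CVX/YALMIP tools mentioned later in these notes (the instance below is our own illustration). Maximizing t subject to C − tI ⪰ 0 is the primal LMI form with A_0 = C, A_1 = −I, and objective −t, and its optimal value is the smallest eigenvalue of C, which gives a built-in correctness check:

```python
import numpy as np
import cvxpy as cp

# Tiny SDP in primal LMI form: max t  s.t.  C - t*I is positive semidefinite.
# Equivalently min c^T x with x = t, c = -1, A0 = C, A1 = -I.
np.random.seed(0)
m = 4
M = np.random.randn(m, m)
C = (M + M.T) / 2                        # random symmetric data matrix

t = cp.Variable()
prob = cp.Problem(cp.Maximize(t), [C - t * np.eye(m) >> 0])
prob.solve()                             # any installed SDP solver, e.g. SCS

print("SDP optimal value:", prob.value)
print("lambda_min(C):    ", np.linalg.eigvalsh(C).min())  # should agree
```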
The Math Behind SDP
SDP problems can be written in the standard primal form: Minimize c^T x subject to A(x) ⪰ 0, where A(x) = A_0 + ∑_{i=1}^n x_i A_i
The matrices A_0, A_1, …, A_n are given symmetric matrices, and x is the vector of decision variables
The dual SDP problem is: Maximize −tr(A_0 Y) subject to tr(A_i Y) = c_i for i = 1, …, n and Y ⪰ 0
Strong duality holds for SDP under mild conditions (Slater's condition), meaning the optimal values of the primal and dual problems are equal
SDP problems can be solved using interior-point methods, which follow a central path to the optimal solution
The central path is defined by the solutions to the perturbed KKT conditions: A(x) ≻ 0, Y ≻ 0, A(x)Y = μI, and tr(A_i Y) = c_i for i = 1, …, n, where μ > 0 is a barrier parameter
The complexity of interior-point methods for SDP is polynomial in the problem size (number of variables and constraints)
SDP can be used to obtain tight bounds for hard optimization problems by considering a relaxation of the original problem
For example, the max-cut problem in graph theory can be approximated using an SDP relaxation (the Goemans-Williamson algorithm; a sketch of the relaxation follows this list)
Many nonlinear optimization problems can be converted into SDP form using techniques like the Schur complement and S-procedure
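Here is a sketch of that max-cut relaxation in cvxpy, on an illustrative 5-cycle (the graph, seed, and rounding loop are our own example): the relaxation maximizes (1/4)·∑_{ij} w_ij (1 − X_ij) over X ⪰ 0 with unit diagonal, and a random hyperplane through a factorization X = V^T V rounds the solution to a cut:

```python
import numpy as np
import cvxpy as cp

n = 5
W = np.zeros((n, n))
for i in range(n):                       # unit edge weights on a 5-cycle
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0

# SDP relaxation: max (1/4) sum_ij W_ij (1 - X_ij), X PSD, diag(X) = 1
X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(0.25 * cp.sum(cp.multiply(W, 1 - X))),
                  [X >> 0, cp.diag(X) == 1])
prob.solve()
print("SDP upper bound on max-cut:", prob.value)

# Goemans-Williamson rounding: X = V^T V, cut by a random hyperplane's signs.
w, U = np.linalg.eigh(X.value)
V = np.diag(np.sqrt(np.clip(w, 0, None))) @ U.T  # columns v_i, <v_i,v_j>=X_ij
signs = np.sign(V.T @ np.random.randn(n))
print("cut value from one rounding:",
      0.25 * np.sum(W * (1 - np.outer(signs, signs))))
```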
Solving SDP Problems
Interior-point methods are the most common approach for solving SDP problems
These methods follow a central path to the optimal solution by solving a sequence of perturbed KKT conditions
The basic steps of an interior-point method for SDP are:
1. Choose an initial point (x_0, Y_0, S_0) that satisfies the perturbed KKT conditions for a large value of μ
2. Compute the Newton direction (Δx, ΔY, ΔS) by solving a system of linear equations based on the perturbed KKT conditions
3. Update the current point using a step size α chosen to maintain positive semidefiniteness: (x_{k+1}, Y_{k+1}, S_{k+1}) = (x_k, Y_k, S_k) + α(Δx, ΔY, ΔS)
4. Decrease the barrier parameter μ and repeat steps 2-4 until convergence
The choice of initial point and step size can significantly impact the convergence speed of interior-point methods (a toy barrier-method sketch appears at the end of this section)
Other methods for solving SDP problems include:
Cutting-plane methods: Generate linear inequalities that cut off non-optimal solutions
Augmented Lagrangian methods: Solve a sequence of unconstrained problems with a penalty term for constraint violations
First-order methods: Use gradient information to update the solution iteratively (e.g., mirror descent, conditional gradient)
Exploiting sparsity and structure in the problem data can lead to more efficient algorithms for large-scale SDP problems
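As a toy illustration of the barrier idea (not the Newton-based primal-dual scheme outlined above), the sketch below minimizes c^T x − μ log det A(x) by gradient descent with backtracking and shrinks μ between rounds; the instance and all parameter values are illustrative assumptions:

```python
import numpy as np

def barrier_sdp(c, A0, As, x0, mu=1.0, mu_factor=0.2, outer=8, inner=100):
    """Toy barrier method for min c^T x s.t. A(x) = A0 + sum_i x_i A_i PSD."""
    A = lambda x: A0 + sum(xi * Ai for xi, Ai in zip(x, As))
    f = lambda x, mu: c @ x - mu * np.linalg.slogdet(A(x))[1]
    x = x0.copy()
    for _ in range(outer):
        for _ in range(inner):
            Ainv = np.linalg.inv(A(x))
            # gradient of the barrier objective: g_i = c_i - mu * tr(Ainv A_i)
            g = np.array([ci - mu * np.trace(Ainv @ Ai)
                          for ci, Ai in zip(c, As)])
            if np.linalg.norm(g) < 1e-9:
                break
            step, x_new = 1.0, x - g
            # backtrack until the step stays positive definite and decreases f
            while (np.linalg.eigvalsh(A(x_new)).min() <= 0
                   or f(x_new, mu) > f(x, mu)):
                step *= 0.5
                x_new = x - step * g
            x = x_new
        mu *= mu_factor          # shrink the barrier parameter and re-center
    return x

# Example: max t s.t. C - t*I PSD, i.e. c = [-1], A0 = C, A1 = -I.
C = np.array([[2.0, 1.0], [1.0, 3.0]])
x = barrier_sdp(np.array([-1.0]), C, [-np.eye(2)], x0=np.array([0.0]))
print("barrier estimate:", x[0], "  lambda_min:", np.linalg.eigvalsh(C).min())
```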
Applications in Real Life
SDP has numerous applications across various fields, including:
Control theory: Designing optimal controllers, analyzing stability of dynamical systems
Signal processing: Sensor network localization, beamforming, robust estimation
Combinatorial optimization: Max-cut problem, graph coloring, stable set problem
Polynomial optimization: Finding global bounds for nonconvex problems, sum-of-squares programming
In sensor network localization, SDP can be used to estimate the positions of sensors based on noisy distance measurements between pairs of sensors
The problem is formulated as an SDP by relaxing the rank constraint on the Gram matrix of sensor positions
In portfolio optimization, SDP can be used to find the optimal allocation of assets that minimizes risk while achieving a target return
The risk is measured by the variance of the portfolio, which can be expressed as a quadratic form in the asset weights (see the sketch at the end of this section)
In power systems, SDP is used for optimal power flow (OPF) problems, which aim to minimize the cost of generating and distributing electricity subject to physical and operational constraints
SDP relaxations of the OPF problem provide lower bounds on the optimal cost and can help identify the global optimum
In quantum information theory, SDP is used to characterize the set of entanglement witnesses and to find the optimal measurements for distinguishing quantum states
SDP has also been applied to problems in computer vision, statistics, and computational biology, among other areas
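A minimal sketch of the portfolio problem described above, in cvxpy with made-up data (this instance is strictly a convex quadratic program, a special case of what SDP solvers handle; the semidefinite machinery becomes essential when, for example, the covariance matrix is itself uncertain):

```python
import numpy as np
import cvxpy as cp

mu = np.array([0.08, 0.12, 0.10])        # expected returns (illustrative)
Sigma = np.array([[0.10, 0.02, 0.04],    # return covariance (illustrative)
                  [0.02, 0.20, 0.06],
                  [0.04, 0.06, 0.15]])
target = 0.10                            # required expected return

w = cp.Variable(3)                       # portfolio weights
prob = cp.Problem(cp.Minimize(cp.quad_form(w, Sigma)),  # variance w^T Sigma w
                  [cp.sum(w) == 1, mu @ w >= target, w >= 0])
prob.solve()
print("weights:", np.round(w.value, 3), " variance:", prob.value)
```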
Comparison with Other Optimization Methods
SDP is a generalization of linear programming (LP) and second-order cone programming (SOCP)
LP deals with optimizing a linear objective subject to linear equality and inequality constraints
SOCP includes constraints of the form ‖Ax + b‖_2 ≤ c^T x + d, which define a second-order cone
SDP is more expressive than LP and SOCP, as it can handle constraints involving positive semidefinite matrices (the Schur-complement embedding of an SOCP constraint is shown after this list)
However, SDP problems are generally more computationally expensive to solve than LP or SOCP problems of similar size
SDP is a special case of cone programming, which deals with optimizing a linear objective subject to constraints involving a convex cone
Other examples of cone programming include exponential cone programming and power cone programming
SDP is related to nonlinear programming (NLP), which deals with optimizing a nonlinear objective subject to nonlinear constraints
SDP can be used to obtain convex relaxations of nonconvex NLP problems, providing lower bounds on the optimal value
Compared to other global optimization techniques like branch-and-bound or interval analysis, SDP provides a more systematic way to obtain tight bounds for nonconvex problems
However, the size of the SDP relaxation grows quickly with the problem dimension, limiting its applicability to large-scale problems
In some cases, SDP can be combined with other optimization techniques (e.g., interior-point methods for NLP) to develop more efficient algorithms for specific problem classes
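The claim that SDP subsumes SOCP can be made explicit with the Schur complement introduced earlier: the second-order cone constraint is equivalent to an LMI,

```latex
\|Ax + b\|_2 \le c^T x + d
\quad\Longleftrightarrow\quad
\begin{pmatrix} (c^T x + d)\, I & Ax + b \\ (Ax + b)^T & c^T x + d \end{pmatrix} \succeq 0,
```

which follows by taking the Schur complement of the (1,1) block when c^T x + d > 0 (the boundary case c^T x + d = 0 forces Ax + b = 0 on both sides).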
Advanced Topics and Extensions
Exploiting sparsity in SDP: Many large-scale SDP problems have sparse data matrices, which can be exploited to develop more efficient algorithms
Chordal sparsity and clique decomposition techniques can be used to break down the SDP into smaller subproblems
Sparse matrix factorization methods (e.g., Cholesky factorization) can be used to solve the Newton system in interior-point methods more efficiently
Low-rank SDP: In some applications, the optimal solution to the SDP has a low rank, which can be exploited to reduce the computational complexity
Burer-Monteiro factorization: Replace the positive semidefinite matrix variable X with a low-rank factorization X = RR^T, where R has a fixed number of columns
Riemannian optimization techniques can be used to optimize directly over the set of low-rank matrices (a Burer-Monteiro sketch for max-cut appears at the end of this section)
Robust SDP: Dealing with uncertainty in the problem data by considering worst-case scenarios
Ellipsoidal uncertainty: The uncertain parameters are assumed to lie within an ellipsoid, leading to an SDP with additional LMI constraints
Chance constraints: The constraints are required to hold with a certain probability, which can be approximated using SDP
Noncommutative SDP: An extension of SDP where the variables are matrices and the order of matrix multiplication matters
Useful for problems in quantum information theory and operator theory
Requires specialized algorithms and software tools (e.g., NCSOStools, NCPOPSolver)
SDP hierarchies: A sequence of increasingly tight SDP relaxations for nonconvex problems
Lasserre hierarchy: Based on sum-of-squares representations of nonnegative polynomials
Lovász-Schrijver hierarchy: Based on lifting the problem to a higher-dimensional space and projecting back to the original space
SDP and algebraic geometry: Connections between SDP and polynomial optimization, real algebraic geometry, and moment problems
SDP can be used to test for the existence of sum-of-squares decompositions and to compute moments of probability measures
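A minimal Burer-Monteiro sketch for the max-cut SDP from earlier: the rows of R are kept unit-norm so that diag(RR^T) = 1, and plain projected gradient ascent stands in for the Riemannian methods mentioned above (the rank, step size, and iteration counts are illustrative):

```python
import numpy as np

def burer_monteiro_maxcut(W, k=3, iters=1000, lr=0.1, seed=0):
    """Maximize (1/4) sum_ij W_ij (1 - <r_i, r_j>) over unit-norm rows r_i."""
    n = W.shape[0]
    R = np.random.default_rng(seed).standard_normal((n, k))
    R /= np.linalg.norm(R, axis=1, keepdims=True)
    for _ in range(iters):
        R -= lr * 0.5 * (W @ R)   # ascent step on -(1/4) tr(R^T W R)
        R /= np.linalg.norm(R, axis=1, keepdims=True)  # project rows to sphere
    return R

n = 5
W = np.zeros((n, n))
for i in range(n):                # same illustrative 5-cycle as before
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
R = burer_monteiro_maxcut(W)
X = R @ R.T                       # feasible low-rank X with diag(X) = 1
print("relaxation value at X:", 0.25 * np.sum(W * (1 - X)))
```

Plain gradient ascent only guarantees a critical point in general, although for large enough rank k the factorized problem is known to have no spurious local optima for generic data.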
Tricks and Tips for SDP Success
Formulate the problem carefully: The way you formulate the SDP can greatly impact its size and computational complexity
Try to minimize the number of variables and constraints by exploiting problem structure and symmetry
Use the Schur complement and S-procedure to convert nonlinear constraints into LMIs whenever possible
Scale the problem: Numerical issues can arise when the problem data has vastly different scales
Normalize the data matrices and rescale the variables to have similar magnitudes
Use a well-conditioned basis for the null space of the equality constraints
Choose a good initial point: The initial point for interior-point methods should be chosen carefully to ensure fast convergence
Use a heuristic like the analytic center of the feasible set or the solution to a simplified problem (e.g., with a subset of constraints)
Infeasible start techniques can be used to find an initial point that satisfies the LMI constraints but not necessarily the equality constraints
Exploit sparsity and structure: Many SDP problems have sparse data matrices or special structure (e.g., block-diagonal, low-rank) that can be exploited
Use sparse matrix representations and factorization methods to reduce memory usage and computational cost
Identify and exploit any symmetry or invariance properties of the problem
Use a high-quality SDP solver: The choice of SDP solver can significantly impact the efficiency and reliability of the solution process
Some popular SDP solvers include SDPA, MOSEK, SeDuMi, and SDPT3
Consider using a modeling language like YALMIP or CVX to formulate the problem in a high-level way and interface with multiple solvers
Verify the solution: It's important to check the quality of the computed solution, especially for ill-conditioned or degenerate problems
Check the primal and dual residuals to ensure they are within acceptable tolerances (a short verification snippet follows this list)
Compute the eigenvalues of the primal and dual solution matrices to check for positive semidefiniteness
Use a post-processing step to round or refine the solution if necessary (e.g., for combinatorial problems)
Experiment with different formulations and relaxations: There may be multiple ways to formulate the same problem as an SDP, leading to relaxations of varying tightness
Try different formulations and compare their strengths and weaknesses
Use a hierarchy of relaxations (e.g., Lasserre hierarchy) to obtain increasingly tight bounds for nonconvex problems
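As a concrete instance of these checks, here is a post-solve verification of the small λ_min SDP sketched earlier, using cvxpy's dual_value on the LMI constraint (for this particular problem, dual feasibility reduces to tr(Y) = 1 and complementary slackness to tr(CY) = t*; the exact dual sign conventions are cvxpy's, so the quantities are printed as checks rather than asserted):

```python
import numpy as np
import cvxpy as cp

np.random.seed(0)
m = 4
M = np.random.randn(m, m)
C = (M + M.T) / 2

t = cp.Variable()
lmi = C - t * np.eye(m)
prob = cp.Problem(cp.Maximize(t), [lmi >> 0])
prob.solve()

Y = prob.constraints[0].dual_value               # dual matrix variable
print("min eig of A(t*):", np.linalg.eigvalsh(lmi.value).min())  # ~0, PSD
print("min eig of Y:    ", np.linalg.eigvalsh(Y).min())          # >= 0, PSD
print("tr(Y) - 1:       ", np.trace(Y) - 1)                      # ~0
print("duality gap:     ", np.trace(C @ Y) - prob.value)         # ~0
```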