
Dynamic programming is a powerful problem-solving technique that breaks complex problems into simpler subproblems. It's all about finding optimal solutions by leveraging the principle of optimality and using Bellman's equation to solve problems recursively.

In this section, we'll explore the key concepts of dynamic programming, including variables, decision variables, and problem formulation. We'll also dive into forward and backward approaches, and examine algorithms and implementation considerations for solving DP problems efficiently.

Dynamic Programming Principles

Fundamental Concepts

  • Dynamic programming breaks down complex problems into simpler subproblems and uses their solutions to construct the solution of the original problem
  • Principle of optimality states that an optimal solution to a problem contains optimal solutions to all its subproblems
  • Bellman's equation provides a recursive formulation for solving optimization problems by relating the value of a decision problem at one point in time to the value at a later time
  • State variables represent the current condition of the system and contain all the information necessary to make optimal decisions going forward
  • Decision variables represent the choices available at each stage of the problem-solving process

Problem Formulation

  • Objective function quantifies the goal of the optimization problem, typically expressed as a function to be maximized or minimized
  • Constraints define the limitations or requirements that must be satisfied by the solution, including both stage-wise and overall constraints
  • Formulate the problem by identifying state variables, decision variables, objective function, and constraints
  • Break down the problem into stages, with each stage representing a point where decisions are made
  • Define the relationship between stages using the Bellman equation (a minimal code sketch follows below): $V_t(x_t) = \max_{u_t} \{r_t(x_t, u_t) + V_{t+1}(f_t(x_t, u_t))\}$
    • $V_t(x_t)$ value function at stage t
    • $x_t$ state at stage t
    • $u_t$ decision at stage t
    • $r_t(x_t, u_t)$ immediate reward
    • $f_t(x_t, u_t)$ state transition function
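  • Example (backward recursion on the Bellman equation): a minimal sketch for a toy finite-horizon problem; the stage count, state and decision sets, and the reward and transition functions are all illustrative assumptions, not part of any specific model:
    # Minimal sketch of Bellman's equation on a toy finite-horizon problem.
    # T, states, decisions, reward, and transition are assumed for illustration.
    T = 3
    states = range(5)
    decisions = range(3)

    def reward(t, x, u):                  # immediate reward r_t(x_t, u_t) (assumed)
        return x * u - u ** 2

    def transition(t, x, u):              # state transition f_t(x_t, u_t) (assumed)
        return min(x + u, 4)

    V = {(T, x): 0 for x in states}       # terminal values V_T(x) = 0
    policy = {}
    for t in reversed(range(T)):          # work backwards through the stages
        for x in states:
            best_u, best_val = None, float('-inf')
            for u in decisions:
                v = reward(t, x, u) + V[(t + 1, transition(t, x, u))]
                if v > best_val:
                    best_u, best_val = u, v
            V[(t, x)] = best_val          # V_t(x_t) = max_u {r_t + V_{t+1}(f_t)}
            policy[(t, x)] = best_u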

Forward vs Backward Recursion

Forward Recursion (Bottom-Up Approach)

  • Starts from the base case and builds up the solution by solving subproblems in a forward manner until the final solution is reached
  • Typically uses iteration to solve subproblems in a specific order
  • Often more efficient in practice, since it avoids the function-call overhead and recursion-depth limits of a top-down implementation
  • Example (Fibonacci sequence):
    def fibonacci(n):
        if n < 2:                         # base cases
            return n
        fib = [0, 1]
        for i in range(2, n + 1):         # build each value from smaller subproblems
            fib.append(fib[i-1] + fib[i-2])
        return fib[n]
    

Backward Recursion (Top-Down Approach)

  • Begins with the final stage and works backwards, solving subproblems and building the optimal solution from the end to the beginning
  • Often implemented using recursion with memoization to avoid redundant calculations
  • Can be more intuitive for certain problems and allows for solving only necessary subproblems
  • Example (0/1 knapsack with memoization):
    def knapsack(W, wt, val, n, memo=None):
        if memo is None:                  # fresh cache per top-level call (avoids the mutable-default pitfall)
            memo = {}
        if n == 0 or W == 0:
            return 0
        if (W, n) in memo:
            return memo[(W, n)]
        if wt[n-1] > W:                   # item n-1 doesn't fit: must skip it
            result = knapsack(W, wt, val, n-1, memo)
        else:                             # best of taking vs. skipping item n-1
            result = max(val[n-1] + knapsack(W-wt[n-1], wt, val, n-1, memo),
                         knapsack(W, wt, val, n-1, memo))
        memo[(W, n)] = result
        return result
    

Optimal Value Function and Policy

  • Optimal value function represents the best possible value of the objective function that can be achieved from a given state at a particular stage
  • Optimal policy specifies the best decision for each state, and following it leads to the optimal value function at each stage and state of the problem (see the sketch after this list)
  • Memoization stores the results of expensive function calls and returns the cached result when the same inputs occur again, improving computational efficiency
  • Principle of subproblem independence ensures that the solution to a subproblem depends only on the current state and future decisions, not on how the current state was reached
  • Curse of dimensionality refers to the exponential increase in computational cost as the number of state variables increases, often requiring approximation methods for high-dimensional problems (linear programming relaxation)
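  • Example (recovering the optimal policy from a memoized value function): a minimal sketch; the weights, values, and capacity below are chosen purely for illustration:
    from functools import lru_cache

    wt, val, W = [2, 3, 4], [3, 4, 5], 6      # assumed problem data

    @lru_cache(maxsize=None)                  # memoize the value function
    def V(w, n):                              # best value with capacity w, items 1..n
        if n == 0 or w == 0:
            return 0
        skip = V(w, n - 1)
        if wt[n - 1] > w:
            return skip
        return max(skip, val[n - 1] + V(w - wt[n - 1], n - 1))

    def optimal_items(w, n):                  # retrace the argmax decision at each state
        chosen = []
        while n > 0:
            if wt[n - 1] <= w and V(w, n) == val[n - 1] + V(w - wt[n - 1], n - 1):
                chosen.append(n - 1)          # taking item n-1 was optimal here
                w -= wt[n - 1]
            n -= 1
        return chosen[::-1]

    print(V(W, 3), optimal_items(W, 3))       # 8 [0, 2]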

Dynamic Programming Algorithms

Tabulation and Iteration Methods

  • Tabulation involves filling a table with solutions to subproblems, starting from the smallest and working towards the largest
  • Value iteration computes the optimal value function by iteratively updating the value estimates for each state until convergence is achieved (a short sketch follows the example below)
  • Policy iteration alternates between policy evaluation and policy improvement steps to find the optimal policy
  • Example (Longest Common Subsequence):
    def lcs(X, Y):
        m, n = len(X), len(Y)
        # L[i][j] = length of the LCS of prefixes X[:i] and Y[:j]
        L = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if X[i-1] == Y[j-1]:      # matching characters extend the LCS
                    L[i][j] = L[i-1][j-1] + 1
                else:                     # otherwise inherit the better subproblem
                    L[i][j] = max(L[i-1][j], L[i][j-1])
        return L[m][n]
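
  • Example (value iteration): a minimal sketch on a tiny Markov decision process; the states, actions, discount factor, and the deterministic transition and reward below are illustrative assumptions:
    gamma = 0.9                               # discount factor (assumed)
    states = [0, 1, 2]
    actions = [0, 1]

    def step(s, a):                           # deterministic transition and reward (assumed)
        s_next = min(s + a, 2)
        return s_next, (1.0 if s_next == 2 else 0.0)

    V = {s: 0.0 for s in states}
    for _ in range(1000):                     # sweep until the value estimates stop changing
        V_new = {s: max(r + gamma * V[s2]
                        for s2, r in (step(s, a) for a in actions))
                 for s in states}
        converged = max(abs(V_new[s] - V[s]) for s in states) < 1e-9
        V = V_new
        if converged:
            break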
    

Implementation Considerations

  • Efficient data structures (arrays, hash tables) are crucial for storing and accessing intermediate results
  • Dynamic programming often involves trade-offs between time complexity and space complexity (see the rolling-array sketch after the example below)
  • Parallel processing techniques can be applied to certain problems to improve computational efficiency by solving independent subproblems simultaneously
  • Example (Matrix Chain Multiplication):
    def matrix_chain_order(p):
        n = len(p) - 1                    # number of matrices; matrix i has shape p[i] x p[i+1]
        # m[i][j] = minimum scalar multiplications to compute the chain A_i..A_j
        m = [[0] * n for _ in range(n)]
        for L in range(2, n + 1):         # L = length of the subchain
            for i in range(n - L + 1):
                j = i + L - 1
                m[i][j] = float('inf')
                for k in range(i, j):     # try every split point k
                    q = m[i][k] + m[k+1][j] + p[i]*p[k+1]*p[j+1]
                    if q < m[i][j]:
                        m[i][j] = q
        return m[0][n-1]
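
  • Example (rolling array): a sketch of the time/space trade-off noted above; the 0/1 knapsack table can be collapsed from O(nW) to O(W) space with a one-dimensional rolling array, at the cost of losing the information needed to reconstruct the chosen items (the sample data is assumed):
    def knapsack_1d(W, wt, val):
        dp = [0] * (W + 1)                    # dp[w] = best value with capacity w
        for weight, value in zip(wt, val):
            for w in range(W, weight - 1, -1):    # reverse scan so each item is used once
                dp[w] = max(dp[w], value + dp[w - weight])
        return dp[W]

    print(knapsack_1d(6, [2, 3, 4], [3, 4, 5]))   # 8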
    

Computational Complexity of Dynamic Programming

Time and Space Complexity Analysis

  • Time complexity typically expressed in terms of the number of subproblems and the time required to solve each subproblem
  • Space complexity refers to the amount of memory required to store the solutions to subproblems and any additional data structures used in the algorithm
  • Overlapping subproblems property allows for significant reduction in time complexity compared to naive recursive approaches (illustrated in the sketch after this list)
  • Optimal substructure property ensures that the overall optimal solution can be constructed from optimal solutions of subproblems
  • Asymptotic analysis techniques (Big O notation) describe the upper bound of the growth rate of dynamic programming algorithms as input size increases
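  • Example (why overlapping subproblems matter): the naive recursion below repeats work exponentially often, while the memoized version solves each of the n distinct subproblems exactly once:
    from functools import lru_cache

    def fib_naive(n):                         # roughly O(2^n) calls: subproblems recomputed
        return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

    @lru_cache(maxsize=None)
    def fib_memo(n):                          # O(n) calls: each subproblem solved once
        return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)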

Complexity Reduction Techniques

  • State aggregation combines similar states to reduce the state space and computational complexity
  • Approximate dynamic programming uses function approximation techniques to estimate the value function in high-dimensional problems (a fitted value iteration sketch follows the example below)
  • Heuristic approaches trade off optimality for efficiency by using problem-specific knowledge to guide the search for solutions
  • Example (0/1 Knapsack with branch and bound):
    def knapsack_branch_and_bound(W, wt, val, n):
        class Item:
            def __init__(self, weight, value):
                self.weight = weight
                self.value = value
                self.ratio = value / weight
    
        items = [Item(wt[i], val[i]) for i in range(n)]
        items.sort(key=lambda x: x.ratio, reverse=True)
    
        def bound(weight, value, index):  # fractional-relaxation upper bound on achievable value
            if weight > W:
                return 0
            bound_value = value
            j = index
            while j < n and weight + items[j].weight <= W:
                weight += items[j].weight
                bound_value += items[j].value
                j += 1
            if j < n:
                bound_value += (W - weight) * items[j].ratio
            return bound_value
    
        def knapsack_recursive(weight, value, index):  # depth-first search, pruned by the bound
            nonlocal max_value
            if index == n:
                max_value = max(max_value, value)
                return
            if weight + items[index].weight <= W:
                if bound(weight + items[index].weight, value + items[index].value, index + 1) > max_value:
                    knapsack_recursive(weight + items[index].weight, value + items[index].value, index + 1)
            if bound(weight, value, index + 1) > max_value:
                knapsack_recursive(weight, value, index + 1)
    
        max_value = 0
        knapsack_recursive(0, 0, 0)
        return max_value
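
  • Example (fitted value iteration): a minimal sketch of approximate dynamic programming, where the value function is approximated by a linear model over features and refit after each Bellman backup; the dynamics, reward, features, discount factor, and sample grid are all illustrative assumptions:
    import numpy as np

    gamma = 0.9                               # discount factor (assumed)
    S = np.linspace(0.0, 1.0, 11)             # sampled states (assumed)
    acts = [-0.1, 0.1]                        # actions (assumed)

    def phi(s):                               # feature map (assumed)
        return np.array([1.0, s, s * s])

    def step(s, a):                           # toy dynamics and reward (assumed)
        s2 = min(max(s + a, 0.0), 1.0)
        return s2, -abs(s2 - 0.5)

    theta = np.zeros(3)                       # V(s) is approximated by phi(s) @ theta
    for _ in range(50):
        targets = [max(r + gamma * phi(s2) @ theta
                       for s2, r in (step(s, a) for a in acts))
                   for s in S]
        X = np.array([phi(s) for s in S])     # refit theta by least squares
        theta, *_ = np.linalg.lstsq(X, np.array(targets), rcond=None)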
    
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.