Intro to Algorithms

🧩Intro to Algorithms Unit 12 – Greedy Algorithms: Scheduling & Coding

Greedy algorithms are optimization techniques that make locally optimal choices at each step, aiming for a global optimum. They're used in scheduling problems to allocate resources efficiently over time, minimizing completion time or meeting deadlines. These algorithms are simple to implement but don't always yield optimal solutions. Scheduling involves assigning tasks to resources, considering factors like processing times, due dates, and machine availability. Common problems include single machine, parallel machine, and flow shop scheduling. Greedy approaches often use priority rules to rank tasks, making them fast but not always optimal for complex scenarios.

What Are Greedy Algorithms?

  • Optimization algorithms make the locally optimal choice at each stage with the hope of finding a global optimum
  • Build up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit
  • Make a locally optimal choice in the hope that this choice will lead to a globally optimal solution
  • Do not always yield optimal solutions, but may approximate the optimal solution in a reasonable time
  • Produce good solutions on some mathematical problems, but not on others
    • Work well for problems that have "optimal substructure"
    • If a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, it is said to have optimal substructure
  • Commonly used for optimization problems (finding the best solution from all feasible solutions)
  • Generally easy to implement and understand compared to other optimization algorithms like dynamic programming

Key Concepts in Scheduling

  • Scheduling involves allocating resources over time to perform a collection of tasks
  • Objective is typically to minimize the total time to complete all tasks (makespan) or meet certain deadlines
  • Tasks may have dependencies (precedence constraints) where some tasks can only start after others have completed
  • Resources can include machines, workers, or other limited assets needed to complete tasks
  • Preemption allows tasks to be paused and resumed later, while non-preemptive scheduling requires each task to run to completion once started
  • Release time is the earliest a task can begin, while the deadline is the latest a task can complete without penalty
  • Lateness is the amount of time a task finishes after its deadline, while tardiness only counts positive lateness values
  • Idle time is any period where a resource is not being used but is available

Common Scheduling Problems

  • Single machine scheduling minimizes total completion time with tasks that have different processing times
    • Shortest processing time (SPT) first is optimal
  • Parallel machine scheduling assigns tasks to multiple identical machines to minimize makespan
    • Longest processing time (LPT) first is a good heuristic
  • Flow shop scheduling has multiple stages each with one machine, and all tasks follow the same route
    • No passing is allowed between machines
  • Job shop scheduling also has multiple stages but tasks can have different routes
    • Passing is allowed, making it more complex than flow shop
  • Open shop scheduling allows tasks to be processed in any order on the machines
    • Combines both job shop and parallel machine features
  • Resource constrained project scheduling (RCPSP) schedules project activities subject to precedence and resource capacity constraints
    • Minimizes total project duration
  • Workforce scheduling assigns workers to shifts or tasks over a time period
    • Considers worker availability, skills, preferences and labor regulations

Greedy Approach to Scheduling

  • Greedy algorithms build a schedule incrementally by making the locally optimal choice at each decision point
  • Often use a priority rule or dispatch policy to rank tasks by importance or urgency
    • Shortest processing time first (SPT) minimizes total completion time
    • Earliest due date first (EDD) minimizes maximum lateness
    • Minimum slack first (MS) minimizes maximum tardiness
    • Longest processing time first (LPT) minimizes makespan on parallel machines
  • Greedy is not always optimal but is usually fast and simple to implement
  • Can be used as a heuristic to seed other optimization methods like local search
  • Examples of greedy scheduling algorithms:
    • List scheduling for parallel machines and precedence constraints
    • Moore's algorithm for single machine scheduling with due dates
    • Coffman-Graham algorithm for multiprocessor task scheduling with precedence constraints
    • Hu's algorithm for precedence constrained scheduling on two machines

Coding Greedy Algorithms

  • Greedy algorithms can be implemented using simple loops and conditional statements
  • Often use a sorting step to rank tasks by the greedy selection criteria
  • Break ties arbitrarily when two choices have equal value
  • Data structures like priority queues can make selection of the best choice efficient
  • Pseudocode for a generic greedy algorithm:
    function Greedy(a, n) 
        for i from 1 to n
            x = Select(a)
            if Feasible(x) then
                Solution = Solution + x
        return Solution
    
  • Python has built-in functions and libraries useful for implementing greedy algorithms
    • sorted()
      and
      sort()
      to order data
    • heapq
      for efficient priority queues
    • queue.PriorityQueue
      from the standard library
  • May need to define a custom comparison function for sorting tasks by the greedy criteria
  • Important to test code on a range of problem instances to check for correctness and efficiency

Pros and Cons of Greedy Methods

Advantages:

  • Simple to understand and implement
  • Very efficient, usually run in O(n log n) time or better
  • Can be used on a wide variety of problems
  • Often give solutions that are close to optimal
  • Useful as heuristics or approximation algorithms when optimal methods are too slow Disadvantages:
  • Not always optimal, may get stuck in a local optimum
  • Difficult to prove that a greedy algorithm always gives the best solution
  • May require sorting which adds to the complexity
  • Greedy choices can depend on the order of the input
  • Less powerful than more complex methods like dynamic programming
  • Cannot be applied to problems without optimal substructure

Real-World Applications

  • Scheduling tasks or jobs on machines or processors
    • Manufacturing and production scheduling
    • Multiprocessor and parallel machine scheduling
  • Project management and resource allocation
    • Scheduling construction tasks based on worker and equipment availability
  • Workforce scheduling and planning
    • Assigning nurses to shifts based on skills and preferences
  • Supply chain and logistics optimization
    • Scheduling shipments and deliveries by truck or plane
  • Interval scheduling and bandwidth allocation
    • Scheduling ads in radio/tv broadcasting
    • Assigning bandwidth to maximize number of users
  • Bioinformatics and DNA sequencing
    • Fragment assembly in shotgun sequencing
  • Huffman coding and data compression
    • Constructing optimal prefix-free codes

Practice Problems and Solutions

  1. Single machine scheduling with due dates and minimizing maximum lateness
    • Input: processing times pip_i and due dates did_i for nn tasks
    • Output: a schedule that minimizes Lmax=maxi(Cidi)L_max = max_i (C_i - d_i) where CiC_i is the completion time of task ii
    • Greedy algorithm: Earliest Due Date (EDD) first
      • Sort tasks in non-decreasing order of due dates did_i
      • Schedule tasks in this order
    • Optimality: EDD is optimal for this problem
    • Time complexity: O(n log n) for sorting
  2. Parallel machine scheduling with the objective of minimizing makespan
    • Input: processing times pip_i for nn tasks and mm identical parallel machines
    • Output: an assignment of tasks to machines that minimizes the makespan Cmax=maxjC(Mj)C_max = max_j C(M_j)
    • Greedy algorithm: Longest Processing Time (LPT) first
      • Sort tasks in non-increasing order of processing times pip_i
      • Assign each task to the machine that will be idle first
    • Performance: LPT has a worst-case ratio of 4/3 - 1/(3m) compared to optimal makespan
    • Time complexity: O(n log n) for sorting, O(n log m) for assignment with a priority queue
  3. Interval scheduling to maximize number of compatible tasks scheduled
    • Input: a set of nn tasks where each task has a start time sis_i and finish time fif_i
    • Output: a maximum cardinality subset of mutually compatible tasks, i.e. no overlap in time
    • Greedy algorithm: Earliest finish time first
      • Sort tasks by non-decreasing order of finish times fif_i
      • Go through sorted list, selecting a task if compatible with previously selected tasks
    • Optimality: Earliest finish time is optimal for this problem
    • Time complexity: O(n log n) for sorting


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.