Algorithm Analysis -- Week 9
Introduction
This week we will continue to look at the greedy method of solving problems.
Note that the links from this page are to handouts that will be distributed the night of class. Only print out those handouts if you could not attend class.
Main topics this week:
Greedy Method vs Dynamic Programming
Knapsack Problem
0-1 Knapsack Problem
Fractional Knapsack Problem
0-1 Knapsack Problem, Again
Knapsack Problem Exercise
Review for Midterm 2
Next Week
Greedy Method vs Dynamic Programming
The same types of problems (optimization problems) can often be solved using both the greedy method and dynamic programming. Often, the greedy method produces more efficient solutions (but not always).
It is, however, often very difficult to determine if the greedy method really does produce optimal solutions. Recall the example of the greedy algorithm for making change. It seemed to work okay for American coins, but would not work for any arbitrary set of coins. For dynamic programming, it is usually much easier to know that the solution is optimal--if the principle of optimality holds and we find a recursive solution, the entire solution is optimal.
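The change-making example can be made concrete with a short sketch. The coin set {1, 3, 4} used below is an illustrative assumption (a classic counterexample), not a set from your book:

```python
def greedy_change(coins, amount):
    """Greedy change-making: repeatedly take the largest coin that fits.

    Illustrative sketch only; returns the list of coins chosen.
    """
    used = []
    for c in sorted(coins, reverse=True):   # try biggest coins first
        while amount >= c:
            amount -= c
            used.append(c)
    return used

# With US-style denominations, greedy happens to be optimal:
#   greedy_change([25, 10, 5, 1], 30) -> [25, 5]  (2 coins)
# With the arbitrary set {1, 3, 4} and amount 6, greedy takes
# 4 + 1 + 1 (3 coins), while the optimum is 3 + 3 (2 coins).
```

This is exactly the situation described above: the greedy choice is locally attractive but can lock you out of the globally optimal combination.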
Knapsack Problem
The knapsack problem is a famous optimization problem. It is concerned with figuring out how much you can fit into a knapsack (or backpack) to maximize the profit of what is inside the knapsack. Imagine that you are a salesperson, and you can only sell what is inside your knapsack. You would not want to take the cheapest items; you would want to take the items that made you the most profit.
There are two variations on the knapsack problem. In the 0-1 Knapsack Problem, you must either take an entire item or not. You cannot take half of an item. In the Fractional Knapsack Problem, you can take only part of an item.
0-1 Knapsack Problem
In the 0-1 Knapsack Problem, you must either take an item or not take an item; you may not take part of an item. Since the goal is to maximize the profit of the items in the knapsack, we might consider using the greedy method to solve the problem. Using the greedy method, we need a selection procedure: a way of picking which item will contribute most to the profit.
Let's consider the following items:
Item # | Profit | Weight |
1 | $50 | 5 |
2 | $60 | 10 |
3 | $70 | 20 |
Which one of these is the best choice? We could pick according to the maximum profit. Let's say that our knapsack can hold 20 units of weight. If we pick according to the maximum profit, we pick item #3, for a total profit of $70. However, a better solution is to pick items 1 and 2 for a total profit of $110. So picking according to maximum profit is not a good solution.
We might pick according to minimum weight, but consider the following items:
Item # | Profit | Weight |
1 | $5 | 5 |
2 | $3 | 10 |
3 | $70 | 20 |
In this case, picking according to minimum weight for a knapsack that can hold 20 weight units would result in items 1 and 2 for a total profit of $8, instead of the better solution of item 3 with a total profit of $70.
Another option is to pick according to the profit per unit of weight ratio. Consider these items:
Item # | Profit | Weight | Profit/Weight |
1 | $50 | 5 | $10 |
2 | $60 | 10 | $6 |
3 | $140 | 20 | $7 |
If we pick according to the best profit/weight ratio, we'll end up with items 1 and 2 for a total profit of $110. The better solution is item 3 for a total profit of $140. This assumes a knapsack capacity of 20 weight units.
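The ratio-based selection can be sketched in a few lines of Python (a rough illustration, not the book's algorithm; items are (profit, weight) tuples from the table above):

```python
def greedy_01_by_ratio(items, capacity):
    """Greedy 0-1 selection: consider items in order of profit/weight
    ratio, taking each whole item if it still fits.  A sketch only --
    as the example shows, this does NOT always find the optimum."""
    order = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    profit = 0
    for p, w in order:
        if w <= capacity:       # take the whole item or skip it
            capacity -= w
            profit += p
    return profit

items = [(50, 5), (60, 10), (140, 20)]   # (profit, weight)
# greedy_01_by_ratio(items, 20) -> 110, but item 3 alone gives 140
```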
Clearly, the greedy method does not always produce an optimal solution for the 0-1 knapsack problem. What about the fractional knapsack problem?
Fractional Knapsack Problem
Let's use the last selection procedure we looked at: choosing items with the highest profit per unit of weight. Now, though, we are allowed to take part of an item. Consider the following items:
Item # | Profit | Weight | Profit/Weight |
1 | $50 | 5 | $10 |
2 | $60 | 10 | $6 |
3 | $140 | 20 | $7 |
The highest profit/weight ratio is item 1, so we pick it. Its weight is 5, and our knapsack can hold 30, so we take the entire item. The next highest profit/weight ratio is item 3. Its weight is 20, and we have 25 space left in the knapsack, so we take the entire item. The next highest profit/weight ratio is item 2. We have 5 space left in our knapsack, and the item weighs 10. So we can take 50% of the item. 50% of the item would give us a profit of $30 for that portion of the item. The total profit is then $50 + $140 + $30 = $220. That is the optimal solution.
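The steps just described can be sketched in Python (a minimal illustration, assuming the same (profit, weight) tuples and a capacity of 30):

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack: take items in order of profit/weight
    ratio, splitting the last item if it does not fully fit."""
    order = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    profit = 0.0
    for p, w in order:
        if capacity == 0:
            break
        take = min(w, capacity)        # whole item, or whatever fits
        profit += p * take / w         # pro-rated profit for a fraction
        capacity -= take
    return profit

items = [(50, 5), (60, 10), (140, 20)]   # (profit, weight)
# fractional_knapsack(items, 30) -> 220.0
# (all of items 1 and 3, plus half of item 2)
```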
The greedy method does indeed give an optimal solution as long as you can take part of an item. If the items to be taken are things like televisions, you obviously cannot take part of an item. But if the items are bags of coins or gold dust, you can take just part of them.
0-1 Knapsack Problem, Again
The greedy method did not work for the 0-1 knapsack problem. Since this is an optimization problem, we might expect that dynamic programming would work. In fact, dynamic programming can be used to solve this version of the knapsack problem, where the greedy method failed.
Like most dynamic programming algorithms, this one uses a two-dimensional array. If we name the array P, then P[i][w] is the maximum profit to be gained from choosing among the first i items with a weight limit of w. We fill in the contents of this array, starting with low values of i and w, until we have calculated the final solution at i = n and w = W (capital W is the weight limit of the knapsack).
The basic question that we're asking for every element in the array is which of two choices gives the maximum profit:
1. Leave item i out, keeping the best profit so far: P[i-1][w].
2. Take item i, plus the best we can do with the remaining capacity: profit[i] + P[i-1][w - weight[i]] (possible only when weight[i] <= w).
We simply calculate both of those values and pick the maximum.
You can find the dynamic programming algorithm here. What would the big Oh of this algorithm be?
Let's try this algorithm with some sample items. Here is one set of items we tried with the greedy approach.
Item # | Profit | Weight |
1 | $50 | 5 |
2 | $60 | 10 |
3 | $70 | 20 |
You can see that our basic operation is to take the maximum of the profit we had figured for a given w before considering a new item, or the profit considering the new item and whatever we could fit into w besides that new item. Look back at the P array you came up with until this makes sense.
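The table-filling can be sketched in Python as a straightforward reading of the max-of-two-choices rule (the handout's pseudocode is the authoritative version; this is just an illustration using the items from the table above):

```python
def knapsack_01(items, W):
    """Bottom-up 0-1 knapsack.  P[i][w] is the best profit using the
    first i items with weight limit w; runs in O(nW) time and space."""
    n = len(items)
    P = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        p, wt = items[i - 1]
        for w in range(W + 1):
            if wt <= w:
                # max of: leave item i out, or take it plus the best
                # we can do with the remaining capacity
                P[i][w] = max(P[i - 1][w], p + P[i - 1][w - wt])
            else:
                P[i][w] = P[i - 1][w]   # item i cannot fit at all
    return P[n][W]

items = [(50, 5), (60, 10), (70, 20)]    # (profit, weight) from the table
# knapsack_01(items, 20) -> 110 (items 1 and 2),
# the optimum that greedy-by-profit missed
```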
If W is very high, then this algorithm is not very efficient. As you noticed while running through this algorithm, many of the entries in P are the same. Page 169 in your book describes a modification to the algorithm where you only calculate the entries in P that are important. This makes the algorithm more efficient.
The basic approach is to start at P[n][W]. You know that you must calculate that element, because it is the answer to the problem. From there, you know that you must calculate both P[n-1][W] and P[n-1][W - weight[n]]. For those two elements, you can figure out which other four elements must be calculated; for those four, which eight must be calculated, and so on. You should notice a 2^n pattern evolving. That gives an O(2^n) worst case for this algorithm.
Let's see how this works on the following data:
Item # | Profit | Weight |
1 | $50 | 10 |
2 | $30 | 5 |
3 | $70 | 15 |
Assume a knapsack capacity of 15.
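This refinement can be sketched top-down with memoization (an illustration of the idea of computing only the needed entries, not the book's exact algorithm):

```python
def knapsack_memo(items, W):
    """Top-down 0-1 knapsack that computes only those P[i][w] entries
    actually reachable from P[n][W], caching each one in a dict."""
    memo = {}

    def P(i, w):
        if i == 0:                      # no items left: profit 0
            return 0
        if (i, w) not in memo:
            p, wt = items[i - 1]
            best = P(i - 1, w)          # choice 1: leave item i out
            if wt <= w:                 # choice 2: take item i
                best = max(best, p + P(i - 1, w - wt))
            memo[(i, w)] = best
        return memo[(i, w)]

    return P(len(items), W)

items = [(50, 10), (30, 5), (70, 15)]    # (profit, weight) from the table
# knapsack_memo(items, 15) -> 80 (items 1 and 2; item 3 alone gives only 70)
```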
Knapsack Problem Exercise
I will write the profits and weights for some items on the board, and give you a capacity for the knapsack. You will run the algorithms for both the fractional knapsack problem and the 0-1 knapsack problem and tell me what the optimal items are for both cases. This exercise is worth 10 points.
For the 0-1 knapsack problem, you may either calculate the entire array, or only those elements that are important.
We've seen how dynamic programming can help with the two-machine flow shop scheduling problem, but there are other kinds of scheduling problems. The greedy method can be used to solve some of them.
Scheduling with Deadlines
In this scheduling problem, each job to be run has a profit and a deadline. If the job is run by the deadline, then we get the profit for that job. If the job is run after the deadline, we get no profit from the job. Each job takes exactly one time unit to run. For example:
Job | Deadline | Profit |
1 | 2 | 30 |
2 | 1 | 35 |
3 | 2 | 25 |
4 | 1 | 40 |
The goal is to figure out which jobs we can do to get the maximum profit. Since some of the jobs have the same deadline, we probably won't be able to do all the jobs. We could figure out all possible combinations of jobs. That brute force approach would be O(n!), which is very inefficient.
We can make a more efficient algorithm by noticing a couple of things. First, not all combinations of jobs are possible. For example, since jobs 2 and 4 both have a deadline of 1, we cannot have a combination where we run job 4 and then job 2 (because after running job 4, the time has advanced to 2, so the deadline for job 2 has passed). Second, given a choice between jobs, we want to choose the one that gives us the maximum profit.
The first step in the algorithm is to sort the jobs in order from highest profit to lowest profit.
Job | Deadline | Profit |
4 | 1 | 40 |
2 | 1 | 35 |
1 | 2 | 30 |
3 | 2 | 25 |
Our algorithm is:
while (the problem is not solved)
{
    select next job
    if (S would have valid combinations with that job added)
        add job to S
    if (there are no more jobs)
        problem is solved
}
S is a set of jobs. Let's run this on the data up above.
It looks like we're processing each job once, which would give us a big Oh of O(n). But the question of whether S would have valid combinations means that we need some sort of loop to check the combination. In the worst case, that check is another loop based on n, giving us O(n^2).
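The whole procedure can be sketched in Python (an illustration, not the book's exact algorithm). The feasibility check used here, that the k-th job in deadline order must have a deadline of at least k, is one standard way to test whether a set of unit-time jobs can all meet their deadlines:

```python
def schedule_with_deadlines(jobs):
    """Greedy job scheduling with deadlines: consider jobs in decreasing
    profit order, keeping each one only if the accepted set can still
    meet every deadline.  jobs is a list of (deadline, profit) pairs;
    each job takes exactly one time unit."""
    S = []
    for job in sorted(jobs, key=lambda j: j[1], reverse=True):
        trial = sorted(S + [job])        # order candidates by deadline
        # feasible iff the k-th job (1-based) has deadline >= k
        if all(d >= k for k, (d, p) in enumerate(trial, start=1)):
            S = trial
    return sum(p for d, p in S)

jobs = [(2, 30), (1, 35), (2, 25), (1, 40)]   # (deadline, profit) from above
# schedule_with_deadlines(jobs) -> 70 (jobs 4 and 1)
```

Note that `sorted(S + [job])` inside the loop is the hidden second loop discussed above: checking validity costs up to n work per job, for O(n^2) overall.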
Try it yourself on this data (it has already been sorted):
Job | Deadline | Profit |
1 | 3 | 50 |
2 | 1 | 45 |
3 | 1 | 40 |
4 | 2 | 35 |
5 | 3 | 30 |
6 | 2 | 25 |
7 | 1 | 20 |
Review for Midterm 2
We will review topics for the second midterm.
Next Week
Next week we will have the second midterm.