Problem Statement:
We are given an array of n items, each with a weight ($\boldsymbol{W_i}$) and a profit ($\boldsymbol{P_i}$), and a bag (aka knapsack) which has a limited capacity of $\boldsymbol{C}$ (where $\boldsymbol{C}$ is the maximum amount of weight that the bag can hold). The goal is to find a subset of the items such that their total weight is <= C and the sum of their profits is maximum.
Given:
- $C$
- $W : [W_0, W_1, … W_{n-1}] \ (Size = n)$
- $P : [P_0, P_1, … P_{n-1}] \ (Size = n)$
Note: Each item can only be used once
Brute-Force/Recursive Approach
For each item, we can make 2 choices: to include it or to exclude it. Therefore, the number of combinations choices we have is $\boldsymbol{2^n}$. We can recursively choose or not choose an item, based on the space available in the bag, and then keep track of the maximum profit attained throught all the subsets/combinations we go through.
The recursive function knapsack will take the weight array (W), profit array (P), current space left in the bag (Space) and the current index of the item (i).
For each item, it will first check to see if the weight of the item W[i] doesn’t excede the current capacity (Space) of the bag. If it does, the item will be excluded (since there is no other choice). Otherwise, we will find the maximum profit between including and excluding the item (because it is possible that the item is in the subset of items that result in the maximum profit and it is also possible that it is not in that subset - we need to consider both possibilites).
0/1 Knapsack Recursive Decision Tree (Source)
Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include <bits/stdc++.h>
using namespace std;
/**
* @param W array of item weights
* @param P array of item profits
* @param Space current capacity of the bag (before choosing item i)
* @param i current item index (start from n-1)
*/
int knapsack(const vector<int> &W, const vector<int> &P, int Space, int i)
{
if (i == -1 || Space == 0)
return 0;
if (W[i] > Space)
return knapsack(W, P, Space, i - 1); //Exclude the item
else
return max(knapsack(W, P, Space, i - 1), //Exclude the item
P[i] + knapsack(W, P, Space - W[i], i - 1)); //Include the item
}
int main()
{
vector<int> W = {7, 2, 4}, P = {10, 5, 6};
int C = 7;
cout << knapsack(W, P, C, W.size() - 1);
return 0;
}
Complexity Analysis
- Time complexity: $\boldsymbol{O(2^n)}$ - Size of recursion tree will be $\boldsymbol{2^n}$
- Space complexity: $\boldsymbol{O(n)}$ - The depth of the recursion tree can go up to $\boldsymbol{n}$.
Recursive (Top-down) + Memoization
If we notice, out recursive function only depends on two things: the items we choose (i) and the current capacity of the bag (Space). The weights and profit arrays remain constant in each call (we could choose to make them global as well). Therefore, if we think of the recursive call as knapsack(Space, i) then we can see that out recursive function re-calculates a call with the same parameters multiple times. Example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
In the following recursion tree, K(a, b) refers
to knapSack(Space, i).
The recursion tree is for following sample inputs.
W[] = {1, 1, 1}, P[] = {10, 20, 30}, C = 2
K(C, n-1)
K(2, 3)
/ \
/ \
K(2, 2) K(1, 2)
/ \ / \
/ \ / \
K(2, 1) K(1, 1) K(1, 1) K(0, 1)
/ \ / \ /
/ \ / \ /
K(2, 0) K(1, 0) K(1, 0) K(0, 0) K(1, 0)
We can see that K(1, 1) is calculated multiple times. Eventhough it may seem like a small calculation from the example, it is actually very expensive. Imagine if the tree was very deep and we can to calculate the same sub-problems (aka sub-calls) multiple times, it would make the time complexity exponential (which is seen in the brute-force approach). Caching the result saves this time and only explores new sub-problems/sub-calls that haven’t been cached.
Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#include <bits/stdc++.h>
using namespace std;
// A hash function used to hash a pair of any kind
struct hash_pair {
template <class T1, class T2>
size_t operator()(const pair<T1, T2>& p) const
{
auto hash1 = hash<T1>{}(p.first);
auto hash2 = hash<T2>{}(p.second);
return hash1 ^ hash2;
}
};
unordered_map<pair<int, int>, int, hash_pair> dp;
/**
* @param W array of item weights
* @param P array of item profits
* @param Space current capacity of the bag (before choosing item i)
* @param i current item index (start from n-1)
*/
int knapsack(const vector<int> &W, const vector<int> &P, int Space, int i)
{
if(dp.find({i, Space}) != dp.end())
return dp[{i, Space}];
if (i == -1 || Space == 0)
return dp[{i, Space}] = 0;
if (W[i] > Space)
return dp[{i, Space}] = knapsack(W, P, Space, i - 1); //Exclude the item
else
return dp[{i, Space}] = max(knapsack(W, P, Space, i - 1), //Exclude the item
P[i] + knapsack(W, P, Space - W[i], i - 1)); //Include the item
}
int main()
{
vector<int> W = {7, 2, 4}, P = {10, 5, 6};
int C = 7;
cout << knapsack(W, P, C, W.size() - 1);
return 0;
}
Complexity Analysis
- Time complexity: $\boldsymbol{O(n*C)}$ - This is because the recursive function will only branch or call other recursive functions if it hasn’t cached the query yet. The number of possible queries is $\boldsymbol{n*C}$ (For each item, we have to explore for all
capacities <= C) - Space complexity: $\boldsymbol{O(n + \text{size of recursive call stack})}$ - We are caching $\boldsymbol {n*C}$ results in our Hash Map.
Bottom-Up (Tabular)
In order to understand this approach, we need to refer back to the recurrence relation (how and when we call the recursive calls) we used in the Top-down method.
1
2
3
4
5
6
7
8
9
//Base case:
dp(i, 0) = 0
dp(-1, Space) = 0
//Recurrence Relation
if (W[i] > Space)
dp(i, Space) = dp(i-1, Space)
else
dp(i, Space) = max(dp(i-1, Space), P[i] + dp(i-1, Space - W[i]))
We know that i will be in the range of [0, n-1] and Space will be in the range of [1, C]. We can use this to change dp into a 2D matrix/array for caching the results. In addition, we will be generating the answer from the bottom up (i.e. calculating and storing the results of smaller sub problems then using those to calculate and store the larger ones). In top down, we start from the top (the answer we need, and recursively call the small sub problems), but here, we start from the bottom.
One more point to note is that, our base case checks to see if i is -1 which we can’t use if we have an array (since they only go to index 0). Therefore, we will shift the values of i by +1 (make it 1 indexed) and reserve 0 for the -1 case (basically, in the code, dp[0][Space] = 0 since 0 works like -1)
Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#include <bits/stdc++.h>
using namespace std;
/**
* @param W array of item weights
* @param P array of item profits
* @param C capacity of the bag
*/
int knapsack(const vector<int> &W, const vector<int> &P, int C)
{
int n = W.size();
vector<vector<int>> dp(n + 1, vector<int>(C + 1));
//dp[0][Space] = 0 (by default in vector)
//dp[i][0] = 0 (by default in vector)
for (int i = 1; i <= n; ++i)
{
for (int Space = 1; Space <= C; ++Space)
{
if (W[i - 1] > Space)
dp[i][Space] = dp[i - 1][Space];
else
dp[i][Space] = max(dp[i - 1][Space], P[i - 1] + dp[i - 1][Space - W[i - 1]]);
}
}
return dp[n][C];
}
int main()
{
vector<int> W = {7, 2, 4}, P = {10, 5, 6};
int C = 7;
cout << knapsack(W, P, C, W.size() - 1);
return 0;
}
Complexity Analysis
- Time complexity: $\boldsymbol{O(n*C)}$ - We have 2 nested for loops that run in total of $\boldsymbol{n*C}$ times.
- Space complexity: $\boldsymbol{O(n*C)}$ - We are caching $\boldsymbol{n*C}$ results in our 2D matrix/array.