Dynamic Programming – 6 : Optimum sum of tree nodes

Optimum sum of tree nodes

We are given a rooted tree T with root r. Each node x has a real value v(x) associated with it. We are to find a subset of nodes of the tree, such that the sum of the values of the selected nodes is maximized, subject to the condition that if we choose a node x, we cannot choose any of its children or its parent.

Solution.

Notation. Let OPT(x) denote the optimum value on the sub-tree rooted at node x. Let OPT(1,x) denote the optimum value on the sub-tree rooted at node x subject to the condition that x is selected as part of the optimal subset. Let OPT(0,x) denote the optimum value on the sub-tree rooted at x subject to the condition that x is not selected as part of the optimal subset.

Then, we have that:

OPT(x) = max \{OPT(1,x), OPT(0,x)\}.

OPT(1,x) = v(x) + \sum_y OPT(0,y) (where the summation is over all children y of x.)

OPT(0,x) = \sum_y OPT(y) (where the summation is over all children y of x.)

Also, if x is a leaf, then OPT(1,x) = v(x), and OPT(0,x) = 0.

Pseudocode for computing OPT(root)

We can compute OPT(root) in a bottom up fashion starting from the leaves. The recurrence equations for a node x involve the OPT values for its children. Thus, before computing the OPT value for node x, we compute the OPT values for its children.

Let the nodes of the tree be arranged in levels from Level 0 till Level L such that the root is at Level 0.

For i going from L downto 0:

For all nodes x in Level i:

Compute OPT[1,x], OPT[0,x] and OPT[x] using the above recurrence.

Set Soln[x] = 1 if OPT[1,x] >= OPT[0,x]. Else, set Soln[x] = 0.

End.

—-

The above code has a running time of O(n), where n represents the number of nodes in the tree. [ We compute OPT values for n nodes. For each node, we compute the OPT values by summing up the OPT values over all of its children. Aggregating over all nodes, and observing that each node can have at most one parent, we find that the summing-up operations cost O(n) in total. Therefore, the total cost is O(n+n), which is O(n). Another way to look at it is that the aggregate cost of the summing-up operations over all nodes is O(m) [where m represents the number of edges in the tree], which for a tree is O(n). ]

—-

Computing the optimal subset:

We do so using the Soln[x] values.

Function: Print_Solution (x)

If (Soln[x] == 1)

Print x.

For all grand-children y of x:

Print_Solution (y);

Else

For all children y of x:

Print_Solution (y);

End.

—-

Print_Solution(root) has a running time of O(n).

—-

Dynamic Programming – 5 : Matrix chain multiplication

Matrix chain multiplication

We are given a sequence of n matrices <M_1, \ldots, M_n >, such that matrix M_i has dimensions a_i \times a_{i+1}. Our task is to compute an optimal parenthesization of this sequence of matrices such that the sequence is fully parenthesized.

Terminology: A sequence of matrices is fully parenthesized if the sequence is either a single matrix or a product of 2 fully parenthesized sequences.

Optimality here refers to minimization of the number of scalar multiplications involved in computing the product of the matrix sequence.

Note 1:

// C code for multiplying A (m x n) with B (n x p). Product is C (m x p).

for (i = 0; i < m; i++)
    for (j = 0; j < p; j++)
    {
        C[i][j] = 0;
        for (k = 0; k < n; k++)
            C[i][j] += A[i][k] * B[k][j];
    }

// End of code.

Note 2: From the above, we can see that multiplying an (m x n) matrix with an (n x p) matrix requires m x n x p scalar multiplications.

Note 3: Matrix multiplication is associative. So, we can parenthesize the given sequence of n matrices whichever way we want as long as we do not disturb the sequence itself.

—-

Solution.

If n > 1, we know that the required parenthesization divides the n matrix sequence into 2 smaller fully parenthesized sequences.

Let this division occur at matrix M_k, i.e. the number of scalar multiplications in <M_1, \ldots, M_n> equals the sum of the number of scalar multiplications in <M_1, \ldots, M_k> and <M_{k+1}, \ldots, M_n> plus the value a_1 \times a_{k+1} \times a_{n+1}.

Notation: Let OPT(i,j) denote the number of scalar multiplications in the optimal parenthesization of the sequence <M_i, \ldots, M_j>.

From the above, we have that:

OPT(1,n) = \min_{1 \leq k \leq n-1}\{ OPT (1, k) + OPT(k+1, n) + a_1 \times a_{k+1} \times a_{n+1} \}.

Generalizing this, we can say that, for 1 \leq i < j \leq n:

OPT(i, j) = \min_{i \leq k \leq j-1}\{ OPT (i, k) + OPT(k+1, j) + a_i \times a_{k+1} \times a_{j+1} \}

Further, OPT(i,i) = 0 for all possible i.

—-

Pseudocode for computing optimum number of scalar multiplications:

For len from 1 to (n-1):

For i from 1 to (n - len):

Set j = i + len. Compute OPT[i,j] according to the above recurrence.

Set Soln[i,j] = k; // k is the index that optimizes OPT[i,j] in the recurrence.

End of algorithm.

The above code takes O(n^3) time to find OPT[1,n] (computing, in the process, the OPT[i,j] and Soln[i,j] table values).

Computing the optimal solution i.e. parenthesization

We can do so using the Soln [i,j] table.

Function: Print_Solution (i, j)

If ( i == j)

Print M_i

Else

Print (

Print_Solution (i, Soln[i,j] );

Print_Solution ( Soln[i,j] + 1, j);

Print )

End.

—-

This function takes O(n) time for the call Print_Solution (1, n).

Dynamic Programming – 4 : Sequence Alignment

Sequence Alignment

We are given 2 sequences X = <x_1, \ldots, x_m> and Y = < y_1, \ldots, y_n>. Our task is to produce a valid matching of the elements of X and Y such that the cost of the matching is minimized.

Terminology:

A valid matching denotes a one-to-one correspondence between (some subset of) the elements of X and Y, with the added condition that there should be no crossing in the matching, i.e., if x_i is matched to y_j, and x_p is matched to y_q, then i < p implies j < q.

e.g.

X = < e l e p h a n t >

Y = < m o n k e y >

The matching M = \{ (1,2), (3,3), (6, 5) \} refers to the valid matching where we match x_1, y_2 and x_3, y_3 and x_6, y_5. Note that there are no crossings in M.

Cost of a matching M is the sum of:

(1) matching cost m_{x_i,y_j} for each (i,j) \in M.

(2) gap cost d for each element of X or Y which is not part of M.

Solution.

Given X and Y, we have 2 choices: either include (m,n) in the optimum matching, or exclude it from the optimum matching.

Case 1:

If (m,n) is part of the optimum matching, then the optimum cost would be m_{x_m, y_n} plus the optimum cost of matching the sequences <x_1, \ldots, x_{m-1}> and <y_1, \ldots, y_{n-1}>.

Notation: Let OPT(i,j) denote the optimum cost of matching <x_1, \ldots, x_i> and <y_1, \ldots, y_j>.

Therefore, we have:

OPT (m,n) = m_{x_m, y_n} + OPT(m-1, n-1)

Case 2:

If (m,n) is not part of the optimum matching, then one of the following should hold true (because of the no crossing condition):

(1) x_m is not part of the optimum matching.

or, (2) y_n is not part of the optimum matching.

Therefore, we have:

OPT(m,n) = min \{ OPT(m-1, n) + d, OPT (m,n-1) + d\}

—-

Generalizing the above recurrence, we have, for 1 \leq i \leq m and 1 \leq j \leq n:

OPT (i, j) = min \{ m_{x_i, y_j} + OPT (i-1, j-1), OPT (i-1, j) + d, OPT (i, j-1) + d\}

Also, we have the base conditions that:

OPT (i, 0) = i \times d for all 0 \leq i \leq m; and analogously, OPT (0, j) = j \times d for all 0 \leq j \leq n.

—-

Pseudocode for computing optimum value:

Initialize OPT [i, 0] and OPT [0,j] as given above.

For i from 1 to m:

For j from 1 to n:

Compute OPT [i,j] using the given recurrence.

End of algorithm.

—-

The above algorithm computes the optimum value OPT (m,n) in O(mn) time.

—-

Computing the optimum solution using OPT [i,j] values:

Let us denote :

(1) m_{x_i, y_j} + OPT (i-1, j-1) by Expr1.

(2) d + OPT (i, j-1) by Expr2.

(3) d + OPT (i-1, j) by Expr3.

Function: OPT_Solution (i,j)

If (i == 0) OR (j == 0)

Return;

If Expr1 <= both Expr2 and Expr3

OPT_Solution (i-1, j-1);

Print: Index i of X is matched to Index j of Y.

Else if: Expr2 <= both Expr1 and Expr3

OPT_Solution (i, j-1);

Else:

OPT_Solution (i-1, j);

End of function.

—-

What we want to compute is OPT_Solution (m, n). The above function computes that in time O(m+n).

—-

Dynamic Programming – 3 : Subset Sum

Subset Sum

We are given n items \{I_1, \ldots, I_n \}, each having non-negative integral weight w_i. We are also given a non-negative integral weight bound W. Our task is to find a valid subset of items which maximizes the total weight of the items in the subset.

Terminology: A valid subset T is one such that the total weight of items in T is at most W.

Solution.

For item I_n, we have 2 choices: either to include it in the optimum solution, or not to include it. (Note that if w_n > W, then we have just one choice: exclude I_n from the optimum solution.)

Case 1:

If I_n is included in the optimum solution, the optimum solution would be w_n plus the optimum solution obtained from the set of items \{I_1, \ldots, I_{n-1}\} with the weight bound as W-w_n.

Notation: Let OPT (i, w) denote the value of the optimum solution obtained on the set \{I_1, \ldots, I_i\} with weight bound w. The original task, then, is to compute OPT(n, W).

We have that, if I_n is included in the optimum solution,

OPT (n, W) = w_n + OPT (n-1, W-w_n).

Case 2:

If I_n is not included in the optimum solution, then we have that:

OPT (n, W) = OPT (n-1, W).

—-

Generalizing the above, we have that, for 1 \leq i \leq n and 0 \leq w \leq W:

OPT ( i, w) = OPT (i-1, w) [if w_i > w]

Else,

OPT (i, w) = max \{ w_i + OPT (i-1, w- w_i), OPT (i-1, w) \}

—-

Also, OPT (0, w) = 0 for all w.

—-

Pseudocode for computing the optimum value:

Based on the above recurrence, we can construct an algorithm.

For i from 1 to n:

For w from 0 to W:

Compute OPT[i,w] using the above recurrence.

End.

—-

The above code runs in O(nW) time [since O(1) time is spent for computing each OPT[i,w] value].

—-

Computing the Optimum solution:

We can compute the optimum solution recursively given the table of OPT[i,w] values.

Function: Print_OPT (k, w)

If k == 0

Return;  // Breaking the recursion. We can also have the condition as k == 0 OR w == 0

If w_k > w

Print_OPT (k-1, w);

Else if w_k + OPT (k-1, w-w_k) > OPT (k-1, w)

Print_OPT (k-1, w- w_k);

Print I_k

Else

Print_OPT (k-1, w);

End of function.

The optimum solution is printed on calling Print_OPT (n, W). The time taken is O(n).

—-


Extended Longest Path problem

These are slides from a talk on the Longest Path Problem at the Penn State Theory seminar.

Talk_longest path

In the Longest Path Problem, we are given an undirected graph, G =(V,E), and integer k \geq 0 , and are asked to find out whether or not there exists a simple path of length k in G.

The Extended Longest Path Problem is a variant of the above. Here, we are asked to find out all pairs of vertices that have a simple path of length k between them.


Longest Path: Method of Random Orientations

In this post, we shall consider an NP-complete problem, namely that of determining whether a simple path of length k exists in a given graph, G = (V,E).

Formally,

LONGEST PATH PROBLEM

Input: undirected graph, G = (V,E), integer k \geq 0 .

Parameter: k

To find: Does G contain a simple path of length k?


Vertex Cover and Set Cover – 4

Question: We are given an instance of the Vertex Cover problem. We are also given an algorithm for the Set Cover problem. Using this algorithm as a black box, solve the given instance of the Vertex Cover problem.

Solution:

Given instance of Vertex Cover problem: undirected graph, G = (V, E); positive integer k.
Question: Does G have a vertex cover of size at most k?

