Dynamic Programming – 2 : Longest Common Subsequence

Longest Common Subsequence (LCS)

We are given 2 sequences X = < x_1, \ldots, x_m > and Y = <y_1, \ldots, y_n >. Our task is to find the LCS of X and Y.

Terminology: A subsequence of X is a set of elements that can be obtained from the elements of X such that no 2 two elements of the subsequence are out of order wrt the ordering defined by X. e.g. if X = < E L E P H A N T >, then < E P H T>, < E E A > etc are subsequences of X.

A common subsequence of X and Y is a sequence Z that is a subsequence of both X and Y. The longest such Z is an LCS.

Solution.

Let OPT (m, n) denote the length of an LCS of X and Y. (Note that there may be more than one LCS’s, each of them having the same optimum length.)

Either of these 2 cases is true : either x_m = y_n or x_m \neq y_n.

Case 1:

If x_m = y_n then we can certainly include that element in the LCS and proceed to finding the LCS of <x_1, \ldots, x_{m-1} and <y_1, \ldots, y_{n-1}.

Notation: Let X_k denote <x_1, \ldots, x_k. Y_k is defined analogously.

Notation: Let OPT (i, j) denote the length of the LCS of X_i and Y_j.

Therefore, we have that:

OPT (m,n) = 1 + OPT (m-1, n-1) (if x_m = y_n)

Case 2:

If x_m \neq y_n, then we have that:

OPT (m, n) = max \{ OPT (m-1, n), OPT (m, n-1) \}.

—–

Generalizing the above, we have that, for i > 0 and j> 0:

OPT (i, j) = 1 + OPT (i-1, j-1) (if x_i = y_j)

and

OPT (i, j) = max \{ OPT (i, j-1), OPT (i-1, j) \} (if x_i \neq y_j)

—–

We can use the above recurrence to compute the optimum length in a bottom up fashion.

Pseudocode for computing the optimum length:

Initialize OPT (0, i) and OPT (j, 0) to be 0 for all 1 <= i <= n, and 1 <= j <= m.

For i from 1 to m:

For j from 1 to n:

Compute OPT (i,j) and OPT (j,i) using the above recurrence.

End of algorithm.

—-

What we want is the length of OPT (m,n). The above algorithm computes OPT(m,n) in O(mn) time.

—-

Using the m \times n table of OPT (i,j) values, we can compute the LCS recursively.

Function: OPT_Solution (i,j) :=

If (i == 0) OR ( j == o)

// Exit; Break recursion.

Return;

If x_i == y_j

OPT_Solution (i-1, j-1);

Print x_i

Else if OPT (i-1, j) \geq OPT (i, j-1)

OPT_Solution (i-1, j);

Else

OPT_Solution (i, j-1);

End of algorithm.
—-
The above code for printing the LCS runs in O(m+n) time.

—–


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: