## Dynamic Programming – 2 : Longest Common Subsequence

February 3, 2011 Leave a comment

**Longest Common Subsequence (LCS)
**

We are given 2 sequences and . Our task is to find the LCS of X and Y.

Terminology: A subsequence of X is a set of elements that can be obtained from the elements of X such that no 2 two elements of the subsequence are out of order wrt the ordering defined by X. e.g. if X = < E L E P H A N T >, then < E P H T>, < E E A > etc are subsequences of X.

A common subsequence of X and Y is a sequence Z that is a subsequence of both X and Y. The longest such Z is an LCS.

—

**Solution. **

Let OPT (m, n) denote the length of an LCS of X and Y. (Note that there may be more than one LCS’s, each of them having the same optimum length.)

Either of these 2 cases is true : either or .

**Case 1: **

If then we can certainly include that element in the LCS and proceed to finding the LCS of and .

**Notation: **Let denote . is defined analogously.

**Notation: **Let denote the length of the LCS of and .

Therefore, we have that:

(if )

**Case 2: **

If , then we have that:

.

—–

Generalizing the above, we have that, for and :

(if

and

(if )

—–

We can use the above recurrence to compute the optimum length in a bottom up fashion.

**Pseudocode for computing the optimum length: **

**Initialize OPT (0, i) and OPT (j, 0) to be 0 for all 1 <= i <= n, and 1 <= j <= m. **

**For i from 1 to m: **

**For j from 1 to n: **

**Compute OPT (i,j) and OPT (j,i) using the above recurrence. **

**End of algorithm. **

**—-**

What we want is the length of OPT (m,n). The above algorithm computes OPT(m,n) in O(mn) time.

—-

Using the table of OPT (i,j) values, we can compute the LCS recursively.

**Function: OPT_Solution (i,j) := **

**If (i == 0) OR ( j == o) **

**// Exit; Break recursion. **

**Return; **

**If **

**OPT_Solution (i-1, j-1); **

**Print **

**Else if **

**OPT_Solution (i-1, j); **

**Else **

**OPT_Solution (i, j-1); **

**End of algorithm. **

**—-**

The above code for printing the LCS runs in O(m+n) time.

—–