Dynamic Programming – 2 : Longest Common Subsequence

Longest Common Subsequence (LCS)

We are given 2 sequences X = < x_1, \ldots, x_m > and Y = <y_1, \ldots, y_n >. Our task is to find the LCS of X and Y.

Terminology: A subsequence of X is a set of elements that can be obtained from the elements of X such that no 2 two elements of the subsequence are out of order wrt the ordering defined by X. e.g. if X = < E L E P H A N T >, then < E P H T>, < E E A > etc are subsequences of X.

A common subsequence of X and Y is a sequence Z that is a subsequence of both X and Y. The longest such Z is an LCS.


Let OPT (m, n) denote the length of an LCS of X and Y. (Note that there may be more than one LCS’s, each of them having the same optimum length.)

Either of these 2 cases is true : either x_m = y_n or x_m \neq y_n.

Case 1:

If x_m = y_n then we can certainly include that element in the LCS and proceed to finding the LCS of <x_1, \ldots, x_{m-1} and <y_1, \ldots, y_{n-1}.

Notation: Let X_k denote <x_1, \ldots, x_k. Y_k is defined analogously.

Notation: Let OPT (i, j) denote the length of the LCS of X_i and Y_j.

Therefore, we have that:

OPT (m,n) = 1 + OPT (m-1, n-1) (if x_m = y_n)

Case 2:

If x_m \neq y_n, then we have that:

OPT (m, n) = max \{ OPT (m-1, n), OPT (m, n-1) \}.


Generalizing the above, we have that, for i > 0 and j> 0:

OPT (i, j) = 1 + OPT (i-1, j-1) (if x_i = y_j)


OPT (i, j) = max \{ OPT (i, j-1), OPT (i-1, j) \} (if x_i \neq y_j)


We can use the above recurrence to compute the optimum length in a bottom up fashion.

Pseudocode for computing the optimum length:

Initialize OPT (0, i) and OPT (j, 0) to be 0 for all 1 <= i <= n, and 1 <= j <= m.

For i from 1 to m:

For j from 1 to n:

Compute OPT (i,j) and OPT (j,i) using the above recurrence.

End of algorithm.


What we want is the length of OPT (m,n). The above algorithm computes OPT(m,n) in O(mn) time.


Using the m \times n table of OPT (i,j) values, we can compute the LCS recursively.

Function: OPT_Solution (i,j) :=

If (i == 0) OR ( j == o)

// Exit; Break recursion.


If x_i == y_j

OPT_Solution (i-1, j-1);

Print x_i

Else if OPT (i-1, j) \geq OPT (i, j-1)

OPT_Solution (i-1, j);


OPT_Solution (i, j-1);

End of algorithm.
The above code for printing the LCS runs in O(m+n) time.


%d bloggers like this: