# Matching parentheses

January 5, 2011 Leave a comment

We will look at an interesting problem of evaluating whether a sequence of left and right parentheses is *balanced. *

Firstly, we need to know what is meant by the sequence being balanced. What we basically mean is that for every left parenthesis, there exists a corresponding right parenthesis, and also that, for every right parenthesis there exists a matching left parenthesis.

Let’s consider a few examples:

( ) : This simple sequence is balanced. i.e. There is only one left parenthesis in this sequence, and it has a matching right parenthesis. Also, there is only one right parenthesis in this sequence, and that has a matching left parenthesis.

( ( ) ) : This sequence is also balanced.

For clarity, let us index the elements of this sequence using the indices 1, 2, 3, 4 as we would do in the case of an array. (i.e. **“(” **( ( ) — the highlighted “(” is Seq [1]. Similarly, in ( ( **“)” **), the highlighted “)” is Seq [3].)

Now, in the sequence ( ( ) ), we need to observe 2 things in order to ascertain whether or not it is balanced:

1. Every left parenthesis has a matching right parenthesis.

This can be seen to be true. Seq [1] has Seq [4] as its match, and Seq [2] has Seq [3] as its match.

2. Every right parenthesis has a matching left parenthesis.

Again, this is easily seen to be true. Seq [3] has Seq [2] as its match, and Seq [4] jas Seq [1] as its match.

Thus, ( ( ) ) is balanced.

Now consider this example:

( ( ( ) ) ( ) )

This sequence is also balanced because:

1. Every left parenthesis has a matching right parenthesis.

Seq [1] has Seq [8]; Seq [2] has Seq [5]; Seq [3] has Seq [4]; Seq [6] has Seq [7].

2. Every right parenthesis has a corresponding left one. This can be easily verified.

—

To see why both of the above requirements need to be satisfied in order for a sequence to be balanced, consider the following two examples:

( ) )

Here,

(1) Every left parenthesis has a matching right parenthesis. (Seq [1], the only left parenthesis in our sequence, can be matched to Seq [2].)

(2) Every right parenthesis does not have a matching left parenthesis. Seq [2] has Seq [1] as its match, but Seq [3] does not have any match.

Therefore, ( ) ) is not balanced, which also seems correct from what one would intuitively understand from the word “balanced”.

Similarly, it can be easily verified that ( ( ) is not balanced.

—

To develop a method for determining whether or not a sequence is balanced, let’s first look a bit more at what is meant a *matching parenthesis. *

Consider :

( ( ) ( ( ) ) )

[1] [2] [3] [4] [5] [6] [7] [8]

As can be seen, the matching left parenthesis for :

Seq [3] is Seq [2],

Seq [6] is Seq [5],

Seq [7] is Seq [4],

Seq [8] is Seq [1].

Intuitively, one can understand the method by which the brain works out which left parenthesis is the match for a given right parenthesis.

*Read the input from left to right. As soon as you encounter a right parenthesis (we know that its matching left parenthesis should have occurred earlier, and our intuitive definition of matching parenthesis tells us that this matching left parenthesis is the one nearest to the right one that we read), the matching left parenthesis is the one that is exactly one index before it. *

[In our example, this would translate to reading the input left to right from Seq [1] to Seq [3]. At Seq [3], we encounter a right parenthesis, and decide that its matching left parenthesis is Seq [2]. )

*Now, since we’ve matched a left and right parenthesis, we don’t care about them anymore. So we assume that they have been removed from the sequence. *

[ In our example, this would mean removing Seq [2] and Seq [3] from our sequence.]

*We continue reading the input from left to right, and as soon as we encounter a right parenthesis, we decide that its matching left parenthesis based on the facts that the left parenthesis that we are finding (1) occurs before it, and (2) is the one closest to the right parenthesis under consideration (from among all the left parenthesis that we have not yet removed from consideration). *

[In our example, this would translate to reading : Seq [4] , Seq [5] and Seq [6]. We encounter a right parenthesis at Seq [6], and search for its matching left parenthesis. We see that Seq [5] is a left parenthesis that has not yet been matched (i.e. is still part of the sequence), occurs before Seq [6], and is the closest such left parenthesis to Seq [6].

Once, we’ve done that, we remove Seq [5] and Seq [6] from our sequence. ]

*We continue in the same vein — reading the input from left to right from the point where we left off, stopping at a right parenthesis, searching for its matching left parenthesis, and, on finding such a match, removing both of the matched parentheses from the sequence, and continuing the same process again. *

[In our example, this becomes:

Read Seq [7]. Now, since, Seq [5] has already been matched (i.e. no longer part of our sequence), the correct match for Seq [7] becomes Seq [4].

Remove both Seq [7] and Seq [4] from the sequence.

Read Seq [8]. Stop, and search for its match.

We see that Seq [1] is the match (all other left parentheses have been removed).

We match Seq [1] and Seq [8], and remove both from the sequence.

We now find that the sequence is empty, and the input is also over, which means that we have matched all parentheses and that the sequence is balanced. Of course, if the sequence weren’t balanced, this wouldn’t be the case.]

—-

Now, that we have understood the idea, it should be easy enough to understand the method used to determine whether or not a sequence a balanced.

1. Scan the input from left to right, and while doing so, perform the following actions:

(i) if you encounter a left parenthesis, push it into a stack. (This stack is initially empty.)

(ii) If you encounter a right parenthesis, pop the top-most left parenthesis (if it exists) from the stack.

2. If there are no more symbols to be read from the input :

(i) and the stack is empty, the sequence is balanced; end the algorithm.

(ii) and the stack is not not empty (i.e. contains some left parentheses), the sequence is not balanced; end the algorithm.

3. If you encounter a right parenthesis, and the stack is empty, then the sequence is not balanced; end the algorithm.

—

The above algorithm using stacks is based exactly on the approach we saw earlier, and helps us to decide whether or not a sequence is balanced.