Software engineering interview questions – Intersection of two arrays

A few approaches are presented herein for solving the question on computing the intersection of two arrays. What follows is also attached in the pdf format here – algorithm1


Strings in C++ : Find the first non-repeated character in a string – 3

APPROACH 3

We can approach this problem using a lookup table. Using a lookup table, the running time can be significantly reduced.

For simplicity, let us assume that our input string can only contain characters from the English alphabet i.e. only 26 characters are possible, though each may be repeated any number of times.

The idea is to construct a lookup table having space for 26 entries. Each entry in the lookup table corresponds to a character in the input string.

We will proceed as follows— Create a lookup table of size 26. Initialize each entry in the lookup table to be 0. The entries in the lookup table correspond to the 26 characters in the English alphabet, and are equal to the number of times that particular character appears in the input string.

Traverse the input string. Say the string that you are given is “elephant”. The first character is ‘e’. Go to the lookup table entry corresponding  to ‘e’ and increment the count in that entry by 1. But which is the entry in the lookup table corresponding to ‘e’?

Let us call the lookup table L.

L[0] will store the count of ‘a’. L[1] will store the count of ‘b’. … L[25] will store the count of ‘z’.

So, for ‘e’, we go to L[4] and increment that count by 1.

Now, once we have finished traversing the string, and have the counts of all characters available to us in L, we need to output the first non-repeated character.

For this, we proceed as follows:

Traverse the input string again. Take the first character in the input string. Go to the entry corresponding to that character in L. (Note that we can do this in O(1) time.) Check if the count is 1. If so, output it. Else, go to the next character in the input string, and repeat the same procedure.

————

Now, what would be the running time of this procedure?

Initializing the look up table takes O(1) [since the size of the lookup table is a constant.]

Traversing the string to populate the lookup table with appropriate counts takes O(n) time. (n is the length of the input string.)

Traversing the string again, to output the first non-repeated character takes O(n) time.

Hence, the total running time is O(n).

The memory requirement is that of a lookup table of constant size, which is O(1).

———

In C++, we can code it as follows:

——–

#include <iostream.h>

//func f returns the first non-repeated character in s.

char f ( char *s)

{

//check for null string.

if ( ! s)

return ”;

int L [26] = {0}; //lookup table, initially all entries are 0.

//we now traverse the string to populate L.

char *t = s;

while (  *t != ”)

{

L [ *t - 'a']++; // e.g. if *t is ‘e’, we need to increment L[4] which is L ['e' - 'a'].

t++;

}

//we have now populated L with counts.

//we now traverse s and using L, output the first non-repeated char.

t = s;

while ( *t != ”)

{

if ( L [ *t - 'a' ] == 1)

return *t;

}

//no non-repeated char in s.

return ”; //we may also write: return 0; (using the ASCII value of NUL)

}

———-

Now, what if remove the simplifying assumption made earlier and say  that s can contain any character. In this case, the lookup table needs to have 128 entries (one for each ASCII character).

For the preceding program, we use L [ ch - 'a' ] to access the lookup table entry corresponding to ch. In the new lookup table having 128 entries, we can use the ASCII values of the characters as direct indices into the lookup table, i.e. we can use L[ch] to access the count of character ch.

The remaining part of the code would remain more or less the same as above.

———-

Strings in C++ : Find the first non-repeated character in a string – 2

APPROACH 2

The previous approach consider each character of the string and then compared it with each of the remaining characters (albeit only those following it) to check if it gets repeated anywhere. Hence it took O(n^2) time. We can do better than that by sorting the string. Once the string is sorted, checking if any given character in that string gets repeated becomes a simple matter of checking the elements adjacent to it.

What we propose to do is this:

- Copy the original string into a new string.

- Sort this new string. (This will take O(n \log n) time. )

But now we are faced with a problem.

We need to determine a non-repeated character in the original or new string. If we wanted just any non-repeated character, that would have been simple. — We would traverse through the sorted new string, and in O(n) time, we would have been able to output a non-repeated character in that string. Hence, the total running time would have been O(n \log n). However, what we want is  the first non-repeated character.

One idea could be to proceed as follows: Take the first character in the original string. Using  the sorted string, check if that character is being repeated. If not, return that character. Else, go to the next character in the original string and repeat the same process again.

But again, we are faced with a problem — How do we determine the location of the first character of the original string in the new sorted string in constant time? (We can, of course, do so in O(n) time. But that would only lead to an O(n^2) algorithm, which would mean that even after sorting we weren’t able to improve the running time.)

—–

From the above, we observe 2 things –

1. Sorting would help in finding repeated / non-repeated characters in a string.

2. But sorting, as such, would not help us efficiently find the first non-repeated character, because we cannot perform a constant time lookup for any given character in the sorted string.

The solution is somewhat clear given the above observations. Do a sort akin to bucket sort. i.e. keep a bucket for each character that a string can possibly have. Once you have that array of buckets each corresponding to one possible character, you can perform constant time lookup in that array for any given character. i.e. we can construct a lookup table. This leads us to the approach detailed in the next post.

Strings in C++ : Finding the first non-repeated character in a string

We are given a string, and told to find the first non-repeated character in it. Let’s look at an example: let’s say our string is “alligator”. The first non-repeated character in this string is “i”.

Now, how do we go about finding this first non-repeated character programmatically?

APPROACH 1

There is a very direct algorithm which we can employ — Take the first character of the given string (which is ‘a’ in our example string “alligator”). Compare it with each of the other characters in the string. If that character does not appear anywhere else, return that character and terminate the algorithm. Else, if the character does appear somewhere else in the string (as in the case of the ‘a’ in “alligator”), go to the 2nd character of the string, and repeat the same process (i.e. check if the 2nd character of the string gets repeated anywhere). Continue this process for each character of the string (i.e. in the k th iteration, we would compare the k th character of the string with each character in the string to determine if it appears anywhere else). If, at the end of the algorithm, we find that all characters get repeated somewhere, i.e. there is no non-repeating character, print a message to that effect.

———

The above algorithm is simple to understand and implement. It will run in O(n^2) time where n is the length of the input string.

——

In pseudocode, the algorithm would look as follows:

——-

Input: String s of length n.

Output: First non-repeated character in s.

Algorithm:

For i going from 0 upto n-1:

flag = true; // flag is a boolean variable initialized to true.

For each j going from 0 upto n-1:

If ( s[i] == s[j] )

flag = false;

End For. //end the inner For loop.

If ( flag == true ) //i.e. if flag is true when we come out of the inner For loop

return s[i];  //character s[i] is the 1st non-repeated character in s.

End For. // end the outer For loop.

//If the algorithm reaches this statement, it means that there is no non-repeated character in s.

// Hence we return a default character, which would not otherwise be returned.

// We choose the null character — NUL for this purpose.

return NUL;

———

So that was the pseudocode for the algorithm. We can translate the above code into C++ as follows:

———

#include <iostream.h>

// function f takes a string s, and returns the first non-repeated character in s.

// In case there is no non-repeated character, it returns the null character ” (having ASCII value 0)

char f (char *s)

{

//we first check if s is the null string.

if ( !s)

return ”; // return the null character is s is the null string.

int i = 0, j;

bool flag;

while ( s[i] ! = ”)

{

flag = true;

j = 0;

while ( s[j] ! = ”)

{

if ( s[i] == s[j] )

flag = false;

j ++;

}

if (flag)

return s[i];

i++;

}

return ”; //i.e. no non-repeated character was found.

}

————


Dynamic programming – Probability, gambling and a die

Let’s say we have a fair 6-sided die. Player A can earn money based on what shows up when the die is rolled. The rule of the game is as follows: Player A will roll a die upto 3 times. After each roll, Player A has two choices: (1) He can take money in dollars equal to the number showing on the upturned face of the die. (i.e. if 5 shows on the die, he can take $5.)  In this case, the die is not rolled any more, and the game ends. (2) He can decide to discard this roll of the die, and decide to roll the die again. Note that the die can be rolled at most 3 times.

Question — What is the expected payoff for Player A?

——–

Let’s say that instead of 3 rolls of the die, the game allowed only for 1 roll. What is the expected payoff for Player A in this case? It is clearly equal to the expected value of the number that shows up when the die is rolled. And since the die is fair, this expectation is equal to (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5. And Player A has no choice but to accept whatever shows up on the die because there is only one roll possible. Hence, in this game, the expected payoff for Player A is $3.5.

Now, consider what happens if instead of 1 roll, we allow for upto 2 rolls.

Now, if the 1st roll is a 6, obviously Player A would pocket $6 and decide to end the game there itself. Also, if the 1st roll is a 1, Player A can always decide to roll again [because irrespective of what shows up on the 2nd roll, it cannot be less than $1, and hence Player A has nothing to lose by deciding to discard the 1st roll and opting to roll again.]

Now consider what happens if the 1st roll is a 3. Now, Player A is faced with a predicament. What should he do? Should he decide to stop there and go home with $3, or should he discard this 3 in the hope of higher returns in the 2nd and final roll?

This is where having learnt probability well in school helps.

We know that the expected value of the number that shows up when the die is rolled once is 3.5.

Hence, if in the 1st roll, Player A gets a number greater than 3.5 i.e. either 4, 5 or 6, he can decide to stop there and take the corresponding amount of money.

However, if he gets either 1, 2 or 3 in his 1st roll, he should discard that, and decide to roll again, because the expected value of the 2nd roll is 3.5.

Hence, the expected payoff for Player A from a game which allows for at most 2 rolls is: \frac{3}{6} 3.5 + \frac {1}{6} 4 + \frac{1}{6} 5 + \frac{1}{6} 6 = 4.25.

(Here, 3/6 is the probability of getting a 1, 2 or 3 on the 1st roll, in which case the expected payoff becomes 3.5. )

——-

Now, coming back to our original game, we can decide as to how we should proceed after the die is rolled once.

We know that the expected payoff from 2 rolls of the die is $4.25.

Hence the strategy of Player A should be the following:

- If the 1st roll of the die yields either a 5 or a 6, take the money, and end the game.

- Else, discard the 1st roll, and continue the game.

The expected payoff can be computed easily based on the preceding observations:

Expected payoff for Player A = \frac{4}{6} 4.25 + \frac{1}{6} 5 + \frac{1}{6} 6 = \frac {14}{3}.

———-

This is a classic question using the principle of dynamic programming. Let’s say the the game allows for the die to be tossed upto n times.

Let E[k] denote the expected payoff for Player A in a game that allows upto n-k rolls of the dice.

The strategy for Player A after roll k should be the following:

– If the number showing on the dice is greater than E[k], take the money and stop the game.

– Else, discard that roll, and decide to continue on to the next roll.

——-

The recurrence relation for computing the E[k] values can be derived in terms of E[k+1] as was done in the case of the 3-roll example that we saw earlier. Once we have done that, we can calculate the values in the order, E[n-1], E[n-2] and so on uptil E[0].

——-

Dynamic Programming – Maximum sum contiguous subsequence

We are given an array A having n integers. What we want to find is a contiguous subseqnence (A[i], A[i+1], …, A[j]) such that the sum of the elements in that subsequence is maximized. (Note that, in general, there may be more than one such optimal subsequence.)

Note that if all elements in A are non-negative, we could directly return the entire array as the required optimal subsequence.

APPROACH 1

A simple brute-force method could be to compute the sums of all possible subseqnences in A. How many subsequences are there? There are n+ (n-1) + (n-2) + ... + 1 = O(n^2) possible subsequences. Computing the sum of a subsequence takes O(n) in the worst-case. Hence, the running time of the algorithm would be O(n^3).

In C++, we could write the following function to do what was explained above:

// start and end are the starting and ending indices of an optimal subsequence.

void f ( int* A, int n, int &start, int& end)

{

int sum, max = A[0];

for (int i = 0; i < n ; i++)

for (int j = i; j < n; j++)

{

sum = 0;

for (int k = i; k <=j; k++)

sum+= A[k];

if (sum >= max)

{

start = i;

end = j;

}

}

}

————

APPROACH 2:

We can improve the running time to O(n^2) by being a bit more clever in computing the sums of different subsequences. We observe that the sum of the subsequence A[i ... j+1] = A[j+1] + sum of the subsequence A[i ... j].

In C++, we could write as follows:

void f (int *A, int n, int &start, int &end)

{

int sum, max = A[0];

for (int i = 0; i < n; i++)

{

sum = 0;

for (int j = i; j < n; j++)

{

sum + = A[j];

if (sum >= max)

{

start = i;

end = j;

}

}

}

}

———–

APPROACH 3:

Using dynamic programming, we can solve the problem in linear time.

We consider a linear number of subproblems, each of which can be solved using previously solved subproblems in constant time, this giving a running time of O(n).

Let S[k] denote the sum of a maximum sum contiguous subsequence ending exactly at index k.

Thus, we have that:

S[k+1] = max \{S[k] + A[k+1], A[k+1] \} (for all 1 \leq k \leq n-1)

Also, S[0] = A[0].

——–

Using the above recurrence relation, we can compute the sum of the optimal subsequence for array A, which would just be the maximum over S[i] for 0 \leq i \leq n-1.

Since we are required to output the starting and ending indices of an optimal subsequence, we would use another array T where T[i] would store the starting index for a maximum sum contiguous subsequence ending at index i.

In prseducode form, the algorithm would look thus:

Create arrays S and T each of size n.

S[0] = A[0];

T[0] = 0;

max = S[0];

max_start = 0, max_end = 0;

For i going from 1 to n-1:

// We know that S[i] = max { S[i-1] + A[i], A[i] .

If ( S[i-1] > 0)

S[i] = S[i-1] + A[i];

T[i] = T[i-1];

Else

S[i] = A[i];

T[i] = i;

If ( S[i] > max)

max_start = T[i];

max_end = i;

EndFor.

Output max_start and max_end.

———–

The above algorithm takes O(n) time and O(n) space.

——–

We can however improve upon the space requirements, reducing it to O(1). The key idea is this: for computing S[i] and T[i], all we need are the values of S[i-1] and T[i-1] apart from the given array A. Hence, there is no need to store all n values of the S and T arrays.

We could proceed as follows:

max = A[0];

max_start = 0;

max_end = 0;

S = A[0];

T = 0;

For i going from 1 to n-1:

// S = max { S + A[i], A[i] )

if ( S > 0)

S = S + A[i];

Else

S = A[i];

T = i;

If ( S > max)

max_start = T;

max_end = i;

End For.

Output max_end and max_start.

———–

The above algorithm takes O(n) time and O(1) space.

———-

Greedy algorithms – Coin changing using minimum number of coins

We have coins of denominations 1 cent, 5 cents and 25 cents with us. What we would like to do is represent a total of n cents using these coins, such that a minimum number of coins are used.

For example, if we want to represent 31 cents, we could do that by using 1 25-cents coin, 1 5-cents coin and a single coin of 1-cent, thereby utilizing a total of 3 coins. And this indeed is the bast we can do. Why is that so? Let’s look at it this way — the current solution uses one 25-cent coin, one 5-cent coin and one 1-cent coin. One way in which we can try to modify this solution would be to decide to not use any 25-cent coin at all (instead of the one coin that we are currently using). That would mean that we would have to represent the shortage of 25 cents that we are now faced with, using coins of denominations 1 and 5-cents. We can see that we would need at least five coins of denomination 5-cents in order to represent 25-cents. Hence, using a 25-cent coin was actually a wise choice enabling us to save on the number of coins used.

Now, what if we had to represent 57 cents. The best solution would be to use two coins of 25-cents, one coin of 5-cents and two coins of 1-cent.

Again, we can intuitively infer its optimality by observing that:

1. if we use lesser number of 25-cents coins than what we are currently using, we would actually end up using more coins. —(If we use just one 25-cent coin, then we would require 5 coins of 5-cents each to cover up the deficit of 25-cents. Similarly if we do not use any 25-cents coins at all, we would need 10 coins of 5-cents each to cover the deficit.)  Hence, the idea is to use as many 25-cent coins as we possibly can.

2. Similarly, once we have used the maximum number of 25-cent coins, we are left with k cents (where k < 25) to represent using 1 and 5-cent coins. Analogous to what we observed above, we can infer that we must use as many 5-cent coins as we can in order to cover the deficit of k cents. Whatever is left can be represented using 1-cent coins.

———

Hence, the algorithm would look something like this.

//Represent n cents using the least number of total coins where coins are of denominations – 1, 5 and 25-cents.

1. Divide n by 25 to get quotient q1 and remainder k1.

2. Use q1 coins of 25-cents each.

3. If k1 == 0, we are done. Else, divide k1 by 5 to get quotient q2 and remainder k2.

4. Use q2 coins of 5-cents, and k2 coins of 1-cent.

———-

Now, if we have coins at our disposal having denominations 1, k, k^2, k^3-cents and we had to represent n cents with the minimum number of coins, we could do the following:-

———–

i = 3;

num = n;

While ( exponent >= 0)

{

Divide num by k^i to get quotient q and remainder r.

Use q coins of denomination k^i cents.

If ( r == 0)

break;

num = r;

i – - ;

}

———

Now, take the following example, represent 31 cents using coins of denominations 1, 10 and 25 cents such that the least number of coins are used.

Here, the idea behind the greedy algorithm of using the maximum possible number of coins of the highest denomination would not work. That approach would get us a solution that uses 6 coins : one 25-cent coin, and 6 1-cent coins.

In contrast, we can get a better solution using 4 coins: 3 coins of 10-cents each and 1 coin of 1-cent.

In the problems presented at the beginning of this post, the greedy approach was applicable since for each denomination, the denomination just smaller than it was a perfect divisor of it. (i.e. 25 was a factor of 125; 5 was a factor of 25; 1 was a factor of 5}. However, that not being true in this case {10 is not a factor of 25} leads to the failure of the greedy approach.

Recursion in C++ : The Coin Changing Problem

Let’s say that we have an infinite supply of pennies ( 1 cent ), nickels ( 5 cents ), dimes (10 cents ) and quarters (25 cents). (That would probably be a nice thing — having an infinite number of coins, whatever be the denomination. On a different note, the fact that this question assumes such an infinite supply of coins should convince everyone of the hypothetical nature of this and similar problems. That said, it is sometimes useful to get a fair idea of how to go about solving seemingly contrived problems in order to tackle realistic ones.)

Coming back to the problem — We buy a commodity and are required to pay n cents for it. Now, the interesting bit is that we are required to pay exactly n cents, i.e. we cannot pay more and get change in return. So, the question is — How many ways can we possibly pay n cents using our infinite supply of coins?

The question could be rephrased thus:

Given the equation: n = x + 5 y + 10 z + 25 w where x, y, z, w are non-negative integers, find the total number of possible solutions to the equation.

——-

Let’s take an example to clarify things — say. n = 91.

Let f (n) denote the number of ways n can be represented in terms of coins having denomination 1, 5, 10 and 25.

Thus, our task is to compute f(91).

The coin of the highest denomination is that of 25 cents.

We have 2 choices before us: (1) use at least one 25-cent coin, or  (2) do not use 25-cent coins at all.

f(n) is the sum of the number of ways we can perform (1) and (2).

(1) If we decide to use at least one 25-cent coin:

We can see that we can use either 1, 2 or 3 such coins.

If we use 1 25-cent coin, we are left with 66 cents to be represented using coins of denominations 1, 5 and 10. So, we now see that we have a smaller subproblem — that of representing 66 cents using 1, 5 and 10 cent coins.

We digress here to make a remark about notation.

—-

Let us denote by f(n, k) the total number of ways in which we can represent n cents using coins of denomination \leq k with the added condition that at least 1 coin of denomination k is used.

Thus, we have that f(n) = f(n, 25) + f(n, 10) + f(n, 10) + f(n, 1).

(The 1st term denotes the number of ways we can represent n cents using 1, 5, 10 and 25-cent coins such that at least 1 25-cent coin is used. The 2nd term — f(n,10) denotes the number of ways wherein we do not use any 25-cent coins and use at least 1 10-cent coin. Similarly, we can work out the meaning of the remaining terms. It can be observed that the events represented by each of the these terms are in some sense disjoint, which is why we can simply add these terms to get our desired result.)

——–

Going back to our example, the number of ways we can represent 66 cents using coins of denomination only 1, 5 or 10 cents is equal to f(66,10) + f(66,5) + f(66,1).

—-

We again digress to make another remark on notation.

We use g(n,k) to denote the number of ways in which we can represent n cents using coins of denominations \leq k.

Hence, our original task, which was to compute f(n) is the same as computing g(n, 25).

——–

Again, coming back to the example, we see that the number of ways to represent 66 cents using 1, 5  and 10-cent coins is simply g(66, 10).

Now, what if we decide to use 2 25-cent coins in representing 91 cents. Then, we are left with representing 41 cents using 1, 5 and 10-cent coins, which can be done in g(41, 10) ways.

Similarly, we can see that we can represent 91 cents using 3 25-cent coins in g(16, 10) ways.

—-

A little reflection would get us the following equation:-

g(91, 25) = f(91,25) + f(91, 10) + f(91, 5) + f(91, 1)

= [ g(66,10) + g(41, 10) + g(16, 10) ] + [ g(81, 5) + g(71, 5) + ... + g(1, 5) ] + [ g (86, 1) + g(81, 1) + ... + g(1, 1) ] + [ 1 ].

———

We can now write the method discussed above in the form of a function in C++.

——-

#include <iostream.h>

// Let’s say we have a global array A[4] = {25, 10, 5, 1}.

//func F takes a non-negative input n, and an index into the array A.

// It returns f (n, A[i]) [where the notation f(n,k) is as discussed earlier.]

int F (int n, int index) ;

// func G takes non-negative input n, and an index into the array A, and returns g(n,A[i]) as discussed above.

int G (int n, int index) // the initial call from the main ( ) function would be G(n,0).

{

int i = index;

int ans = 0;

for ( ; i < 4; i++)

ans + = F (n, i);

return ans;

}

//Now we give the implementation of func F.

int F ( int n, int index )

{

if ( n < A[index] )

return 0;

if ( index == 3)

return 1; // because f(n,1) == 1.

int ans = 0, quotient = n/ A[index];

for (int i = 1; i <= quotient; i++)

{

ans + = G (n – i*A[index], index+1)

}

return ans;

}

———

Minimizing waiting time – Greedy Algorithms – 2

Question.We are given n requests which need to be serviced. Assume that all requests are given to us at time t=0. It takes time, t [i] to service request i.

The waiting time of request i, denoted by W[i] is defined as :

W[i] = t[i] + \sum_{k} t[k]. (where the summation is over all requests serviced prior to servicing request i.)

We want to service the requests such that the total waiting i.e. the sum of the waiting times of the n requests, is minimized.

Solution.

This is a simple application of the greedy paradigm.

As is intuitively clear, we would like to service shorter requests first, before moving on to requests with a larger t [i] value.

We make the intuition concrete as follows:

——–

Sort the n requests in non-decreasing order of t[i] values.

Service the requests in this order.

——–

Intuitive proof of correctness of algorithm:

Consider what happens when you have to decide which request to service, and there are more than one requests. Consider any 2 such requests. If you schedule the request with the larger t[i], it would mean that the other request would have to wait for that larger amount of time before being serviced, thus adding that large quantity to the total waiting time. On the other hand, in case you service the smaller request first, the addition to the total waiting time would only be equal to the t[i] value of the smaller request. Hence, whenever you have a choice, go for the request with the smaller t[i] value.

This directly translates to the above algorithm.

——–

Scheduling Tutors – Greedy Algorithms – 1

Question.We are a tutor-scheduling organization. We have at our disposal an infinite number of tutors using whom we service the tutoring requests that come our way.

Consider any particular day. We get a certain number of tutoring requests. Each request has a specified starting time, and the duration of the tutoring session is exactly 30 minutes. Each tutor can work for any continuous period of 2 hours in a day. (The tutor need not actually be tutoring for the whole of the 2 hours; it is just that he/she will certainly not tutor outside of that 2 hour period.) We are the ones who decide the starting time of the 2 hour working period for each tutor.

We also need to remember that a tutor cannot service more than one request simultaneously.

Our aim is to service the tutoring requests using a minimum number of tutors. How can we achieve that?

Solution.

We can tackle this problem greedily.

We order the requests in non-descending order on the basis of their starting times. Take the first request. Schedule a tutor to start his/her 2 hour period at exactly the starting time of the 1st request. Now, look at the 2nd request. Ask yourself the following question: Can the 1st tutor (i.e. the one who has already been scheduled to work) also service this request? If yes, let him/her do so. If not, schedule a 2nd tutor to start his/her 2 hour period at exactly the starting time of this new request. Continue in the same manner for the succeeding requests. i.e. Do the following: For each new request, check if that request can be serviced by any of the tutors already scheduled to work. If yes, schedule the tutor who has the shortest time remaining in his/her 2 hr period, to service this request. If not, schedule a new tutor to start his/her 2 hour period at exactly the starting time of this new request.

——-

In pseudo code form, we would have:

Sort the requests in non-decreasing order of starting times.

count = 0; // this is the # of tutors assigned thus far.

flag = UNSERVICED. // an indicator for whether a particular request has been serviced or //not.

While (one or more requests remain to be serviced)

Remove the 1st request from our sorted list.

For ( each of the tutors scheduled thus far )

//Note: we iterate through the already assigned tutors in the order in which they were

// assigned (so as to be able to schedule the tutor with the least remaining time in

// his/her 2 hr period.

If ( that tutor can service this request )

Schedule him/her to service this request.

Come out of the For loop.

Set: flag = SERVICED.

If (flag == UNSERVICED)

// means that no already scheduled tutor was able to service the new request.

// We need a new tutor.

Assign a new tutor to start his/ her 2 hr period at the starting time of the new request.

Set: flag = UNSERVICED.

count ++ ;

End of While loop.

End.

———-

Analysis of running time:

- Say there are n requests. Sorting them will take O(n log n) time.

- In the worst case, we might need to schedule n tutors to service n requests.In such a scenario, the time required for assigning tutors will be O(n^2) (this is because of the fact that before we assign a new tutor we check with each of the already assigned tutors as to whether they can service the new request).

- Hence, the algorithm will run in O(n^2) time.

———-

Intuitive proof of correctness of the algorithm:

- Consider the 1st request (in sorted order). Obviously, since no tutors have been assigned thus far, we need to schedule a new tutor for this request. Now, consider the choice we have wrt when to schedule this new tutor. Certainly, we cannot schedule him/her to start his/her 2 hr period after the starting time of the 1st request. If we schedule him/her to start working before the starting time of the 1st request, the time that elapses between the moment the tutor’s 2 hr period starts and the starting time of the 1st request would be effectively wasted. Hence, the best thing would be to schedule the 1st tutor to start his/her 2 hr period at exactly the starting time of the 1st request.

- Similarly, one can argue that for each newly assigned tutor, the algorithm chooses exactly the right time for him/her to start their working period.

- Also, we observe that the algorithm does not assign any new tutors unless there is a need to do so.

- And finally, the algorithm always schedules a tutor with the least remaining time in their schedule to service a new request (when selecting among already assigned tutors). This results in maximum number of requests being serviced by already assigned tutors without having the need to assign a new one.

————–

Selection Sort – Sorting in C++ – 2

Selection sort works as follows: we select the max element in the array and place it at the last position. In the next iteration, we select the 2nd largest element of the array, and place it at the 2nd last position. Continue doing so until the array is sorted.

————–

void sel_sort (int A [ ], int n)

{

int max, max_index;

for (int i = 0; i < n-1; i++)

{

max = A[0];

max_index = 0;

for (int j = 1; j < n-i; j++)

{

if ( A[j] > max)

{

max = A[j];

max_index = j;

}

}

//now swap A[n-i] and A[max_index].

temp = A[n-i];

A[n-i] = A[max_index];

A[max_index] = temp;

}

}

———–

Running time : for all cases – O(n^2).

———–

Bubble Sort – Sorting in C++ – 1

Given an array of elements, the bubble sort algorithm for sorting the elements in no-descending order proceeds as follows:

Compare the first element with the 2nd. If the 1st element is greater, swap. Now, compare the 2nd and the 3rd elements. If the 2nd is greater than the 3rd, swap. Continue doing this until you reach the end of the array, at which point the largest element has been placed at the last position. Now do the same to place the 2nd largest element at its rightful position, then the 3rd largest at its rightful position, and so on.

———

void bubble_sort (int A [ ], int n)

{

int temp;

for (int i = 0; i < n-1; i++)

for (int j = 0; j < n-1-i; j++)

{

if (A[j] > A[j+1])

{

//swap

temp = A[j];

A[j] = A[j+1];

A[j+1] = temp;

}

}

}

——–

Running time: Average case/ Worst case / Best case : O(n^2).

Kite-cutting : More on dynamic programming – 4

Question. We are given a sheet of rectangular paper of length L and width W (where L and W are integral). We need to cut it in order to make rectangular kites. We have a set of n types of kites, each of integral dimensions, which we can make. Kite i has length l_i, width w_i and sells for a profit of p_i. As is obvious, we would like to maximize the profit we can make from our rectangular sheet of paper.

Now, we are given a cutting device to cut the sheet. The cutting device can cut a sheet of paper only either horizontally or vertically. What this means is that we cannot cut a sheet into arbitrary shapes. Given a rectangular sheet to start with, the above condition implies that we will always obtain 2 smaller rectangles from it. The smaller rectangles can be obtained by either cutting along the width or along the length.

Also, we are told that there is no restriction on the number of kites we are allowed to make of any particular type.

Solution.

We can solve this in a fairly straightforward fashion using dynamic programming.

Let’s denote by OPT(x,y) the maximum profit we can derive out of a sheet of paper of length x and width y.

Thus, our aim is to compute OPT(L,W).

Now, we know that we can cut the x \times y dimension sheet only either along its length or along its width (or we can also choose not to cut the sheet at all).

If we cut along the length, the 2 smaller rectangles obtained are of sizes: x \times w and x \times (y-w) where w lies in the range \{1, y-1\}.

Similarly, one can cut width-wise.

Also, let us denote by P(x,y) the profit we would get if by directly selling (i.e. without cutting) our x \times y sheet of paper. (Note that this means that there exists some kite-type which has exactly these dimensions.)

——

We obtain the following recurrence based on the above observations:

OPT (x,y) = \max \{ Term 1, Term 2, P(x,y) \}.

Term1 = \{ OPT(x, w) + OPT(x,y-w) : w \in \{1, \ldots, y-1\} \}.

Term2 = \{ OPT(l, y) + OPT(x-1, y) : l \in \{1, \ldots, x-1\} \}.

——-

Further, we have the following obvious results:

OPT(0, y) = 0 for all y.

OPT (x, 0) = 0 for all x.

——–

Using the above recurrence, we can compute the OPT table as follows:

——

// in C++

//we assume we have already computed P[x][y].

int OPT[L]{W];

for (int i = 0; i < L; i++)

OPT[i][W] = 0;

for ( i =0; i < W; i++)

OPT [L][i] = 0;

int j, k, temp1 = 0, temp2 = 0, max;

for (i = 0; i < L; i++)

for (j=0; j < W; j++)

{

for (k =1; k < i ; k++)

{

if (OPT[k][j] + OPT[i-k][j] > temp1)

//temp1 stores the max value of term1

temp1 = OPT[k][j] + OPT[i-k][j];

}

for ( k = 1; k < j; k++)

{

if ( OPT[i][k] + OPT [i][j-k] > temp2)

//temp 2 stores the max value of term2

temp2 = OPT[i][k] + OPT [i][j-k];

}

//now take the max of temp1, temp2 and P[i][j].

max = (temp1 > temp2) ?temp1 : temp2;

OPT[i][j] = (max > P[x][y]) ? max : P[x][y];

}

———

The answer that we want is OPT[L][W].

The above code would run in time O(L^2 W^2).

(Note that we assume P[x][y] values to be part of the input.)

——-



Frog crossing – More on dynamic programming – 3

Question. There is a river n meters wide. A frog wants to cross the river. There were originally n-1 stones across the river at a distance of 1 meter from one another. However, some of these stones were removed recently.

The frog has a peculiar jumping rule. If it jumps x meters on one jump, the next jump has to lie in the range {x-1, x, x+1}. Further, the 1st jump that the frog makes is of exactly 1 meter. (We assume that the stone at a distance of 1 meter from the frog’s end was not removed.)

Given which of the stones have been removed, how can you tell whether or not it is possible for the frog to reach the other end.

Solution.

Let us define an array, Stone [1 .. n] as:

Stone[i] = true, iff there is a stone at a distance of i meters from the frog’s end.

Note that Stone[1] and Stone[n] are already given to be true.

——

We define another quantity, can_reach[d,s] as:

can_reach [d,s] = true,

if all of the following conditions hold true:

1. Stone[d] = true.

2. Stone[s] = true.

3. Given the previous jumps of the frog, the frog can reach the stone at distance d from the bank, by directly jumping from the stone at distance s from the bank.

——

Our aim is to figure out if can_reach[n,s] is true for any value of s i.e. for any of s = {1, …, n-1}.

—–

We make some observations about can-reach [d,s]:

1. can_reach [1,0] = true.

2. can_reach [d,0] = false, if d > 1.

3. can_reach [d,s] = false, if s >= d.

4. We also have the following recurrence:

If d > 1, can_reach [d,s] =

Stone[d] AND Stone[s] AND [ can_reach [s, (d-s)-1] OR can_reach[s,d-s] OR can_reach[s,(d-s)+1] ].

——-

Using the above equations, we can compute a table of can_reach values as follows:

——-

can_reach [1,0] = true;

For (d going from 2 to n)

can_reach [d,0] = false;

For (d going from 1 to n)

For (s going from d+1 to n)

can_reach[d,s] = false;

For (d going from 2 to n)

For (s going from 1 to d-1)

can_reach[d,s] is as given by equation (4) above.

//answer

For (s going from 1 to n-1)

If (can_reach[n,s] == true)

Output: “Can reach the other bank”;

Output: “Cannot reach the other bank”;

———–

As can be seen, the algorithm runs in O(n^2) time.

Maximum Sum Subarray in C++ : More on dynamic programming – 2

Question. We are given an array A of n integers. We wish to find a contiguous subarray of A having the greatest sum.

Solution.

We note first that the required subarray can end at any of the n indexes , 0 through n-1. Hence, let us define V[k] as the value (i.e. sum) of the optimal subarray ending at index k.

Let OPT denote the value of the required optimal subarray.

Therefore, we have that:

OPT = max_{0 \leq k \leq n-1} V[k].

We can compute OPT using the V[k] values in O(n) time.

——-

We now look at computing the V[k] values.

We have the following recurrence:

V[k]=max(A[k]+V[k-1],A[k])

(Since the subarray has to end at index k, A[k] is always part of it.)

——–

Further, in order to compute the optimal subarray, we need to keep track of the starting index of the subarray. We store the starting index of the optimal subarrays for our subproblems in an array of length n called Index.

———

We can compute the V array and the Index array using the above recurrence in a bottom up manner.

Pseudocode for computing V and Index:

V[0] = A[0];

Index[0] = 0;

For (i going from 1 to n-1)

//Use the above recurrence to compute V[i].

If ( V[i-1] > 0)

V[i] = V[i-1] + A[i];

Index[i] = Index[i-1];

Else

V[i] = A[i];

Index[i] = i;

End of algorithm.

——————————

We can write the above pseudocode in C/ C++ as follows:

// 1. computing the V and Index arrays. (array A of length n is given to us.)

int V[n], Index[n], OPT;

V[0] =A[0];

Index[0] = 0;

for ( int i = 1; i < n; i ++ )

{

//use the recurrence to find V[i] and Index[i].

if ( V[i-1] > 0)

{

V[i] = A[i] + V[i-1];

//starting index of opt. subarray ending at i is Index[i-1].

Index [i] = Index[i-1];

}

else

{

V[i] = A[i];

//starting index of opt subarray ending at i is i itself.

Index[i] = i;

}

}

—————-

// 2. Now, we can compute OPT using V in O(n) time.

OPT = V[0];

int final_index = 0; //final_index will store the ending index of the opt subarray in A.

for (i = 1; i < n; i++)

{

if (V[i] > OPT)

{

OPT = V[i];

final_index = i;

}

}

————–

// 3. We can compute the optimal subarray (the starting and ending indices) from Index.

//ending index is final_index (which we have already computed).

int begin_index = Index[final_index];

———-

That completes our algorithm.

Analysis of the algorithm:

1. Computing V and Index arrays takes O(n) time.

2. Computing OPT and final_index from V takes O(n) time.

3. Computing begin_index from Index takes O(1) time.

————

Longest non-decreasing subsequence : More on dynamic programming – 1

Question. We are given an array A of n integers. We wish to find the longest subsequence such that if the indices in the subsequence are <i_1, i_2, \ldots, i_k> (where i_1 < \ldots <I_k), we want that A[i_1] \leq \ldots A[i_k].

Solution.

Let us define L(k) to be the length of the longest non-decreasing subsequence ending at index k.

Now if OPT denotes the length of the longest non-decreasing subsequence in A, then we have that:

OPT = \max_{0 \leq k \leq n-1} L(k).

We can do the above step in O(n) time.

——

Let us now look at computing L(k) values.

We have the following recurrence:

L(k) = 1 + \max_i L(i).

In the above equation, the maximum is taken over all indices 0 \leq i \leq k-1 for which A[i] \leq A[k]. (i.e. for which we can safely add A[k] to the already existing longest non-decreasing subsequence ending at i.)

In case there exists no such index i, L(k) = 1.

——-

Now that we have the recurrence, we can compute the L(k) values in a bottom up fashion iteratively. We will also keep a record of the index i (as defined above) for each L(k) that we compute. This will help us when we want to reconstruct the optimal solution.

—–

Pseudo-code for computing L(k) values:

L[0] = 1;

For ( k going from 1 to n-1)

Compute L[k] using the above recurrence.

Index[k] = i;

End.

——

Coding in C/ C++ would give us:

int L[n], Index[n], OPT = 0, final_index = -1;

//final_index is the index of the last term in the optimal subsequence. (Note: there can be //more that one optimal subsequence. We consider one of them.

L[0] = 1;

for (int k=1; k<n; k++)

{

int temp_max = 0, temp_index = k; //default values of L[k]-1 and Index[k].

for (int i = 0; i<k; i++)

{

if (A[i] <= A[k]) //it is safe to include A[k] in this subsequence.

{

if (L[i] > temp_max)

{

temp_max = L[i];

temp_index = i;

}

}

}

L[k] = 1 + temp_max;

Index[k] = temp_index;

}

//now, to compute OPT.

for (k=0; k<n; k++)

{

if (L[k] > OPT)

{

OPT = L[k];

final_index = k;

}

}

//Now, we construct the optimal subsequence using our Index array.

// We do so recursively using a function Print_Solution.

// void Print_Solution ( int Index [ ], int n);

void Print_Solution ( int Index [ ], int n, int final_index)

{

int prev_index = Index[final_index];

if (prev_index == final_index) //this is where the subsequence begins; end recursion.

return;

Print_Solution (Index, n, prev_index);

cout << A [final_index] << endl;

}

———–

Analysis of the algorithm:

- Computing the L and Index arrays takes O(n^2) time.

- Computing OPT from L values takes O(n) time.

- Constructing the optimal subsequence from Index values takes O(n) time.

———–


Dynamic Programming – 6 : Optimum sum of tree nodes

Optimum sum of tree nodes

We are given a rooted tree T with root r. Each node x has a real value v(x) associated with it. We are to find a subset of nodes of the tree, such the sum of the values of the selected nodes is maximized, subject to the condition that if we choose a node x, we cannot choose any of its children or its parent.

Solution.

Notation. Let OPT (x) denote the optimum value on a sub-tree rooted at node x. Let OPT (1,x) denote the optimum value on a sub-tree rooted at node x subject to the condition that x is selected as part of the optimal subset. Let OPT(0,x) denote the optimum value on a sub-tree rooted at x subject to the condition that x is not selected as part of the optimal subset.

Then, we have that:

OPT(x) = max \{OPT(1,x), OPT(0,x)\}.

OPT(1,x) = v(x) + \sum_y OPT (0,y) (where the summation is over all children y of x.)

OPT(0,x) = \sum_y OPT (y) (where the summation is over all children y of x.)

Also, if x is a leaf, then OPT(1,x) = v(x), and OPT(0,x) = 0.

Pseudocode for computing OPT(root)

We can compute OPT(root) in a bottom up fashion starting from the leaves. The recurrence equations for a node x involve the OPT values for its children. Thus, before computing the OPT value for node x, we compute the OPT values for its children.

Let the nodes of the tree be arranged in levels from Level 0 till Level L such that the root is at Level 0.

For i going from L downto 0:

For all nodes x in Level i:

Compute OPT[1,x], OPT[0,x] and OPT[x] using the above recurrence.

Set Soln[x] = 1 if OPT[1,x] >= OPT [0,x]. Else, set Soln[x] = 0.

End.

—-

The above code has a running time of O(n) where n represents the number of nodes in the tree. [ We compute OPT values for n nodes. For each node, we compute the OPT values by summing up the OPT values over all of its children. Aggregating over all nodes, and observing the each node can have at most one parent, we find that the summing up operation costs O(n). Therefore, the total cost is O(n+n) which is O(n). Another way to look at it is that the aggregate cost of the summing up operations over all nodes is O(m) [where m represents the number of edges in the tree], which for a tree is O(n). ]

—-

Computing the optimal subset:

We do so using the Soln[x] values.

Function: Print_Solution (x)

If (Soln[x] == 1)

Print x.

For all grand-children y of x:

Print_Solution (y);

Else

For all children y of x:

Print_Solution (y);

End.

—-

Print_Solution(root) has a running time of O(n).

—-

Dynamic Programming – 5 : Matrix chain multiplication

Matrix chain mnultiplication

We are given a sequence of n matrices <M_1, \ldots, M_n >, such that matrix M_i has dimensions a_i \times a_{i+1}. Our task is to compute an optimal parenthesization of this sequence of matrices such that the sequence is fully parenthesized.

Terminology: A sequence of matrices is fully parenthesized if the sequence is either a single matrix or a product of 2 fully parenthesized sequences.

Optimality here refers to minimization of the number of scalar multiplications involved in computing the product of the matrix sequence.

Note 1:

// C code for multiplying A (m x n) with B (n x p). Product is C (m x p).

for (i=0; i < m; i++)

for (j=0; j < p; j++)

{

C[i][j] = 0;

for (k=0; k<n; k++)

C[i][j] + = A[i][k] * B[k][j];

}

//End of code.

Note 2: From the above, we can see that multiplying an (m x n) matrix with a (n x p) matrix requires m x n x p scalar multiplications.

Note 3: Matrix multiplication is associative. So, we can parenthesize the given sequence of n matrices whichever way we want as long as we do not disturb the sequence itself.

—-

Solution.

If n > 1, we know that the required parenthesization divides the n matrix sequence into 2 smaller fully parenthesized sequences.

Let this division occur at matrix M_k, i.e. the number of scalar multiplications in <M_1, \ldots, M_n> equals the sum of the number of scalar multiplications in <M_1, \ldots, M_k> and <M_{k+1}, \ldots, M_n> plus the value a_1 \times a_{k+1} \times a_n.

Notation: Let OPT(i,j) denote the number of scalar multiplications in the optimal parenthesization of the sequence <M_i, \ldots, M_j>.

From the above, we have that:

OPT(1,n) = \min_{1 \leq k \leq n-1}\{ OPT (1, k) + OPT(k+1, n) + a_1 \times a_{k+1} \times a_{n+1} \}.

Generalizing this, we can say that, for 1 \leq i < j \leq n:

OPT(i, j) = \min_{i \leq k \leq j-1}\{ OPT (i, k) + OPT(k+1, j) + a_i \times a_{k+1} \times a_{j+1} \}

Further, OPT(i,i) = 0 for all possible i.

—-

Pseudocode for computing optimum number of scalar multiplications:

For i from 1 to n:

For j from 1 to (n-i):

Compute OPT [i,j] according to the above recurrence.

Set Soln[i,j] = k; // k is the index that optimizes OPT[i,j] in the recurrence.

End of algorithm.

The above code will take O(n^3) to find OPT[i,n] (in the process, computing the OPT[i,j] and Soln[i,j] table values).

Computing the optimal solution i.e. parenthesization

We can do so using the Soln [i,j] table.

Function: Print_Solution (i, j)

If ( i == j)

Print A_i

Else

Print (

Print_Solution (i, Soln[i,j] );

Print_Solution ( Soln[i,j] + 1, j);

Print )

End.

—-

This function takes O(n) time for the call Print_Solution (1, n).

Dynamic Programming – 4 : Sequence Alignment

Sequence Alignment

We are given 2 sequences X = <x_1, \ldots, x_m> and Y = < y_1, \ldots, y_n>. Our task is to produce a valid matching of the elements of X and Y such that the cost of the matching is minimized.

Terminology:

A valid matching denotes a one-to-one correspondence of (some subset of) elements of X and Y with the added condition that there should be no crossing in the matching.i.e. if x_i is matched to y_j, and x_p is matched to y_q, then i < p implies j < q.

e.g.

X = < e l e p h a n t >

Y = < m o n k e y >

The matching M = \{ (1,2), (3,3), (6, 5) \} refers to the valid matching where we match x_1, y_2 and x_3, y_3 and x_6, y_5. Note that there are no crossings in M.

Cost of a matching M is the sum of:

(1) matching cost m_{x_i,y_j} for each (i,j) \in M.

(2) gap cost d for each element of X or Y which is not part of M.

Solution.

Given X and Y, we have 2 choices: either include (m,n) in the optimum matching, or exclude it from the optimum matching.

Case 1:

If (m,n) is part of optimum matching, then the optimum cost would be m_{x_m, y_n} plus the optimum cost of matching the sequences <x_1, \ldots, x_{m-1} and < y_1, \ldots, y_{n-1}.

Notation: Let OPT(i,j) denote the optimum cost of matching <x_1, \ldots, x_i> and <y_1, \ldots, y_j>.

Therefore, we have:

OPT (m,n) = m_{x_m, y_n} + OPT(m-1, n-1)

Case 2:

If (m,n) is not part of the optimum matching, then one of the following should hold true (because of the no crossing condition):

(1) x_m is not part of the optimum matching.

or, (2) y_n is not part of the optimum matching.

Therefore, we have:

OPT(m,n) = min \{ OPT(m-1, n) + d, OPT (m,n-1) + d\}

—-

Generalizing the above recurrence, we have, for 1 \leq i \leq m and 1 \leq j \leq n:

OPT (i, j) = min \{ m_{x_i, y_j} + OPT (i-1, j-1), OPT (i-1, j) + d, OPT (i, j-1) + d\}

Also, we have the base conditions that:

OPT (i, 0) = i \times d for all 0 \leq i \leq m; and analogously for OPT (0, j).

—-

Pseudocode for computing optimum value:

Initialize OPT [i, 0] and OPT [0,j] as given above.

For i from 1 to m:

For j from 1 to n:

Compute OPT [i,j] using the given recurrence.

End of algorithm.

—-

The above algorithm computes the optimum value OPT (m,n) in O(mn) time.

—-

Computing the optimum solution using OPT [i,j] values:

Let us denote :

(1) m_{x_i, y_j} + OPT (i-1, j-1) by Expr1.

(2) d + OPT (i, j-1) by Expr2.

(3) d + OPT (i-1, j) by Expr3.

Function: OPT_Solution (i,j)

If (i = = 0) OR ( j = = 0)

Return;

If Expr1 <= both Expr2 and Expr3

OPT_Solution (i-1, j-1);

Print: Index i of X is matched to Index j of Y.

Else if: Expr2 <= both Expr1 and Expr3

OPT_Solution (i, j-1);

Else:

OPT_Solution (i-1, j);

End of function.

—-

What we want to compute is OPT_Solution (m, n). The above function computes that in time O(m+n).

—-

Dynamic Programming – 3 : Subset Sum

Subset Sum

We are given n items \{I_1, \ldots, I_n \}, each having non-negative integral weight w_i. We are also given a non-negative integral weight bound W. Our task is to find a valid subset of items which maximizes the total weight of the items in the subset.

Terminology: A valid subset T is one such that the total weight of items in T is at most W.

Solution.

For item I_n, we have 2 choices : either to include it in the optimum solution, or to not include it.(Note that if w_n > W, then we have just one choice  — exclude I_n from the optimum solution.)

Case 1:

If I_n is included in the optimum solution, the optimum solution would be w_n plus the optimum solution obtained from the set of items \{I_1, \ldots, I_{n-1}\} with the weight bound as W-w_n.

Notation: Let OPT (i, w) denote the value of the optimum solution obtained on the set \{I_1, \ldots, I_i\} with weight bound w.The original task, then, is to compute OPT(n, W).

We have that, if I_n is included in the optimum solution,

OPT (n, W) = w_n + OPT (n-1, W-w_n).

Case 2:

If I_n is not included in the optimum solution, then we have that:

OPT (n, W) = OPT (n-1, W).

—-

Generalizing the above, we have that, for 1 \leq i \leq n and 0 \leq w \leq W:

OPT ( i, w) = OPT (i-1, w) [if w_i > w]

Else,

OPT (i, w) = max \{ w_i + OPT (i-1, w- w_i), OPT (i-1, w) \}

—-

Also, OPT (0, w) = 0 for all w.

—-

Pseudocode for computing the optimum value:

Based on the above recurrence, we can construct an algorithm.

For i from 1 to n:

For w from 0 to W:

Compute OPT[i,w] using the above recurrence.

End.

—-

The above code runs in O(nW) time [since O(1) time is spent for computing each OPT[i,w] value].

—-

Computing the Optimum solution:

We can compute the optimum solution recursively given the table of OPT[i,w] values.

Function: Print_OPT (k, w)

If k == o

Return;  //Breaking the recursion. We can also have the condition as k == 0 OR w == 0

If w_k > w

Print_OPT (k-1, w);

Else if w_k + OPT (k-1, w-w_k) > OPT (k-1, w)

Print_OPT (k-1, w- w_k);

Print I_k

Else

Print_OPT (k-1, w);

End of function.

The optimum solution is printed on calling Print_OPT (n, W). The time taken is O(n).

—-


Dynamic Programming – 2 : Longest Common Subsequence

Longest Common Subsequence (LCS)

We are given 2 sequences X = < x_1, \ldots, x_m > and Y = <y_1, \ldots, y_n >. Our task is to find the LCS of X and Y.

Terminology: A subsequence of X is a set of elements that can be obtained from the elements of X such that no 2 two elements of the subsequence are out of order wrt the ordering defined by X. e.g. if X = < E L E P H A N T >, then < E P H T>, < E E A > etc are subsequences of X.

A common subsequence of X and Y is a sequence Z that is a subsequence of both X and Y. The longest such Z is an LCS.

Solution.

Let OPT (m, n) denote the length of an LCS of X and Y. (Note that there may be more than one LCS’s, each of them having the same optimum length.)

Either of these 2 cases is true : either x_m = y_n or x_m \neq y_n.

Case 1:

If x_m = y_n then we can certainly include that element in the LCS and proceed to finding the LCS of <x_1, \ldots, x_{m-1} and <y_1, \ldots, y_{n-1}.

Notation: Let X_k denote <x_1, \ldots, x_k. Y_k is defined analogously.

Notation: Let OPT (i, j) denote the length of the LCS of X_i and Y_j.

Therefore, we have that:

OPT (m,n) = 1 + OPT (m-1, n-1) (if x_m = y_n)

Case 2:

If x_m \neq y_n, then we have that:

OPT (m, n) = max \{ OPT (m-1, n), OPT (m, n-1) \}.

—–

Generalizing the above, we have that, for i > 0 and j> 0:

OPT (i, j) = 1 + OPT (i-1, j-1) (if x_i = y_j)

and

OPT (i, j) = max \{ OPT (i, j-1), OPT (i-1, j) \} (if x_i \neq y_j)

—–

We can use the above recurrence to compute the optimum length in a bottom up fashion.

Pseudocode for computing the optimum length:

Initialize OPT (0, i) and OPT (j, 0) to be 0 for all 1 <= i <= n, and 1 <= j <= m.

For i from 1 to m:

For j from 1 to n:

Compute OPT (i,j) and OPT (j,i) using the above recurrence.

End of algorithm.

—-

What we want is the length of OPT (m,n). The above algorithm computes OPT(m,n) in O(mn) time.

—-

Using the m \times n table of OPT (i,j) values, we can compute the LCS recursively.

Function: OPT_Solution (i,j) :=

If (i == 0) OR ( j == o)

// Exit; Break recursion.

Return;

If x_i == y_j

OPT_Solution (i-1, j-1);

Print x_i

Else if OPT (i-1, j) \geq OPT (i, j-1)

OPT_Solution (i-1, j);

Else

OPT_Solution (i, j-1);

End of algorithm.
—-
The above code for printing the LCS runs in O(m+n) time.

—–


Dynamic Programming – 1 : Weighted Interval Scheduling

In general, problems on dynamic programming can be solved by first expressing the required optimum solution in a recursive fashion, and then solving the recurrence in a bottom-up fashion.

Weighted Interval Scheduling

We are given n intervals, with each interval having start time S_i, finish time F_i and non-negative weight W_i ( for i from 1 to n).

Our task is to find a compatible subset T  of these intervals so as to maximize \sum W_I for the intervals in T.

(A compatible subset is one where no two intervals ever overlap. )

—-

Solution.

Let us assume that the n intervals are sorted in the order of their finish times. Let this ordered list of intervals be I_1, \ldots, I_n.

Let OPT (n) denote the value of optimum solution. (i.e. the sum of weights of intervals in an optimum solution.)

Consider interval I_n.

The optimum solution may either include I_n or not include I_n.

If the optimum solution does not include I_n, then we have:

OPT (n) = OPT (n-1) .

Notation: Here, OPT ( k ) denotes the optimum value computed from the set of intervals \{I_1, \ldots, I_k\}.

If the optimum solution does includes I_n, then, apart from I_n, the optimum solution can only include those intervals which do not overlap with it, i.e. only those intervals, i for which F_i \leq S_n.

Notation Let p ( k ) denote the index of the interval, i for which (1) F_i \leq S_k, and (2) F_{i+1} > S_k. Also, if all intervals have finish times > S_k, then p_k = 0.

Thus, we have that, if the optimum solution includes I_n, then:

OPT (n) = W_n + OPT (p(n)).

Therefore,

OPT (n) = max \{ OPT (n-1), W_n + OPT (p(n)) \}

Generalizing, we have that, for 1 \leq i \leq n :

OPT ( i ) = max \{ OPT (i-1), W_i + OPT (p(i)) \}

and that, OPT (0) = 0.

—–

Using the above recurrence, we can write an algorithm to solve it in a bottom-up fashion:

/* Assuming that intervals are sorted according to F_i values we have p_i values for all intervals. */

OPT (0) = 0;

For i from 1 to n:

Compute OPT(i) using the above recurrence.

Return OPT (n).
End.

—–

The above algorithm takes O(n) time.

Sorting the intervals by F_i values takes O(n \log n) (using merge sort or quick sort), following which we can compute p_i values for all intervals in O(n \log n) time (by calling a slightly modified binary search routine n times).

Therefore, the algorithm will run in O(n \log n) time.

——

Computing the optimal solution from the optimal value:

Once we have the OPT values, we can compute the optimal solution (i.e. the sets of intervals, not just the value) recursively as follows:

Function: OPT_Solution ( k ) :=

If ( k == 0 )

//Do nothing. Break recursion.

Return;

For i decreasing from k to 1:

If [ OPT ( p(k) ) + W_k >= OPT (k-1) ]

OPT_Solution ( p(k) );

// Include interval k in Optimum solution

Print k.

Else

OPT_Solution (k-1);


End of function.

—-

The function OPT_Solution (n) has a worst-case running time of O(n) [because there can be at most n recursive calls (since the argument passed to the function calls strictly decreases), and the time taken to process the non-recursive part of each call is O(1) ).

—-

 

More on the Dial-a-Ride problem

We present here a so-called Structure Theorem given by Gupta et al. [1] for the non-preemptive Dial-a-Ride problem on an n-vertex metric space (V,d) having m demand pairs, and a specified start vertex, r.

We say that a vehicle is in a pick-up phase during some portion (segment) of a tour, if it only performs pick-ups during that time interval. Similarly, we say that a vehicle is in a delivery phase during a particular segment if it only performs deliveries in that segment. If the optimal tour had a structure which allowed it to be split into pick-up and delivery phases, i.e. if the optimal tour could be split as follows: the vehicle first picks up a set of at most k items; it then delivers each of those items to their destinations; it again picks up a set of at most k items, and so on, – we would have been able to apply the greedy paradigm for the problem. We could select a best-possible set of (at most k) items to service during a segment.

However, the optimal tour may not exhibit such a structure. Gupta et al. show that a near-optimal tour displays this property.

Theorem 1. Given an instance of non-preemptive Dial-a-Ride, there exists a feasible tour, \tau within O(\log {m}) of optimal, such that \tau can be split into a set of segments – \tau = S_1. S_2 ... S_t , wherein each segment, S_i services a set, O_i of at most k demand pairs (items) by first picking up all items in O_i and then delivering them to their respective destinations.

Proof. Consider an optimal tour \sigma. Represent \sigma as a line L such that every edge in L corresponds to an edge traversal in \sigma. The i^{th} edge of L corresponds to the i^{th} edge traversed in \sigma. (That is, L is simply the tour \sigma, wherein vertices visited multiple times in \sigma are represented that many times in L.)

By the triangle inequality, we have that the number of edges in \sigma, denoted by | \sigma | (equal to the number of edges in L) is at most 2m + 1.

Let d(\sigma) denote the length of \sigma.

Define stretch of item i as the number of edges traversed in \sigma after it is picked up, and before it is delivered (i.e. the number of edges it travels on the vehicle).

Divide the m items into groups G_j, j = 1, ..., \lceil \log {(2m)} \rceil, such that G_j is the set of all items having stretch between 2^{j-1} and 2^j.

We shall service items in each group separately.

Consider group G_j. Denote by \tau_j the portion of the tour where items in G_j are serviced.

We first prove a lemma relating to \tau_j.

Lemma 2. There exists a tour \tau_j of length at most 6 d(\sigma), such that \tau_j can be divided into a set of segments, each servicing at most k items in a manner similar to that stated in Theorem 1.

Proof. Let r = 2^{j-1}. Divide L into segments having r edges each. There are \lceil \frac { |\sigma|} {r} such segments- the first r edges in L form the first segment (denoted by T_{1,j}), the next set of r edges forms segment T_{2,j}, and so on.

Denote the set of items originating in segment T_{l,j} by O_{l,j}. (i.e. Items in O_{l,j} have sources between vertices (l-1)r and lr -1. Note that items in O_{l,j} are exactly those items that cross edge (lr-1, lr) in \sigma. This implies that |O_{l,j}| \leq k. Note also that items in O_{l,j} have destinations between the vertices lr and (l+2)r - 1.

Consider set O_{l,j}.

We service it as follows: Start the vehicle (initially empty) at vertex (l-1)r. Pick up items in O_{l,j} originating in T_{l,j}. Deliver these items (\leq k) to their destinations in T_{l+1,j} and T_{l+2,j}. Return with the empty vehicle to vertex lr. Denote by \omega_{l,j} this portion of the tour where items in O_{l,j} are serviced.

By concatenating the portions \omega_{l,j} for possible values of l, we obtain a tour, \tau_j that services all items in G_j.

We make the following observations:

  • Each edge in L is a part of at most 3 portions/segments: \omega_{l,j}.
  • Each edge is traversed at most twice in any segment/portion.
  • The number of items carried by the vehicle at any time in any \omega_{l,j} is at most k.

The above observations make it clear that \tau_j has length at most 6 d(\sigma) proving Lemma 2.

We service items in each of the \lceil \log {(2m)} \rceil groups, G_j similarly. We thus get a tour that is O(\log {m}) d(\sigma), completing the proof of Theorem 1.

References:

[1] A. Gupta et al – Dial-a-Ride from k-forest, 2010.

On the Dial-a-Ride problem – 3

We present here a polynomial-time 2-approximation algorithm given by Charikar and Raghavachari [CR 2001] for the Capacitated Preemptive version of Dial-a-Ride on trees. For ease of visualization, we consider rooted trees, though the algorithm works as such for any tree.

The items are assumed to be located on the leaves of the tree, T. Consider a demand pair \{s_i, t_i\}. We will call the least common ancestor of s_i and t_i (the least-depth vertex in the unique path between s_i and t_i) as the turning point for that demand pair. Given the m demand pairs, all turning points can be easily computed in polynomial time. The algorithm that follows makes use of these pre-computed turning points.

The algorithm modifies the usual depth-first search, and runs two modified traversals in sequence. In the first traversal, called the upsweep, all items are shifted upwards from their sources to their respective turning points. In the second traversal, called the downsweep, the items are shifted downwards to their respective destinations.

Consider any edge e.

  • In the upsweep phase, the algorithm recursively explores the sub-tree rooted at e at the end of which all items that must necessarily pass upwards through e are shifted to the lower end point of e. The total number of such items is flow_u(e). The items are then shifted upwards across e. Clearly, edge e is traversed 2 max \{ 1, \lceil \frac{flow_u(e)}{k} \rceil \} times.
  • In the downsweep phase, items that need to pass downwards across e are shifted to the upper end point of e before the subtree rooted at e is recursively explored. Again, it is clear that e is traversed 2 max \{1, \lceil \frac{flow_d(e)} {k} \rceil \} times in the downsweep.

Notes:

  • Preemption is used in carrying items across edge e in the above description. The vehicle travels back and forth across e carrying items to the other end point.
  • Clearly, the algorithm runs in polynomial time.
  • This is a 2-approximation algorithm. This is obtained directly from the Steiner tree, and the flow lower bounds.

References:

  • [CR 2001] M. Charikar, B. Raghavachari: The Finite Capacity Dial-a-Ride Problem, 2001.

On the Dial-a-Ride problem – 2

We consider here the capacitated Dial-a-Ride problem on the tree metric. The m demand pairs \{(s_i, t_i)\}_{i=1}^{m} are assumed to be on the leaves of the tree, T.

  • For each demand pair (s_i, t_i), we have a unique path p_i from s_i to t_i in T.
  • Call all vertices which are on any of the paths \{p_i\}_{i=1}^{m} as interesting.
  • Call all edges which are on any of the paths \{p_i\}_{i=1}^{m} as essential.
  • Clearly, any tour that satisfies all demand pairs must cover all essential edges, and hence, all interesting vertices.

We present two lower bounds on the number of edge traversals in the optimal tour.

1. Steiner tree lower bound:- Let Q be the least cost Steiner tree spanning all interesting vertices. Q would include all essential edges, and if \cup_{i=1}^{m} p_i is not connected, Q will include other edges too. Clearly, any tour that satisfies all demand pairs must traverse every edge of Q at least twice. We call this the Steiner tree lower bound.

2. Flow lower bound:- With every edge e in T, we associate two quantities flow_u(e) and flow_d(e). For every edge e = \{u,v\}, fix, for the purpose of computation of flow_u(e) and flow_d(e), an ordering of the end points. flow_u(e) equals the number of paths, p_i which require an item to pass from vertex u to vertex v. flow_d(e) corresponds to the number of paths, p_i which require an item to pass from vertex v to vertex u. (Here, ‘u’ denotes ‘upwards’, and ‘d’ denotes ‘downwards’.)

Clearly, for a vehicle of capacity k, edge e must be traversed at least 2 max \{ \lceil \frac{flow_u(e)}{k}\rceil, \lceil \frac{flow_d(e)}{k}\rceil \} in any feasible Dial-a-Ride tour. We call this the flow lower bound.

Notes:

  • The flow lower bound provides a positive lower bound only for essential edges. The Steiner tree lower bound provides a positive constant lower bound of 2 for all essential edges, and, in case \cup_{i=1}^{m} p_i is disconnected, for certain non-essential edges.
  • The flow lower bound is always at least as strong as the Steiner tree lower bound for essential edges.
  • The two bounds can be combined in the obvious manner to give a stronger lower bound.
  • For infinite or arbitrarily high capacity k, the flow lower bound translates to a lower bound of 2 for all essential edges and zero for non-essential edges. The Steiner tree bound remains 2 for all edges in Q.
  • For k=1, the flow bound translates expectedly to 2 max\{flow_u(e), flow_d(e)\}. Intuitively, it is clear that as the capacity of the vehicle decreases, the Steiner tree lower bound, which does not take this into account, becomes progressively weaker as compared to the flow bound for essential edges.
  • Intuitively, the flow bound is concerned with the capacity of the vehicle for the flow of items across essential edges. The Steiner tree bound is concerned with the connectivity requirement in order to complete the tour using a least cost superset of essential edges.

References:

M. Charikar, B. Raghavachari: The Finite Capacity Dial-a-Ride Problem, 2001.

On the Dial-a-Ride problem

Dial-a-Ride problem: Given an undirected graph G = (V,E) with a non-negative length function l : E \rightarrow \mathbb{R}^{+}, a set of m ordered source-destination pairs (demand pairs) \{(s_i, t_i)\}_{i=1}^{m} \subset V \times V (each requiring an item to be carried from s_i to t_i), and a vehicle with capacity k, find a minimum length tour which satisfies all demand pairs such that the number of items carried by the vehicle during any point in the tour does not exceed k.

  • The problem definition may or may not specify a particular starting vertex for the tour.
  • Dial-a-Ride is a restricted version of k-delivery TSP. k-delivery TSP allows an item from any particular source to be carried to any of the specified destinations, and hence generalizes Dial-a-Ride.
  • Dial-a-Ride generalizes TSP, and hence is NP-hard. Consider a TSP instance, G=(V,E). Mark any one of the vertices in V as the source vertex, and mark the remaining vertices |V|-1 vertices as destination vertices, to obtain |V|-1 demand pairs. Set the capacity of the Dial-a-Ride tour vehicle to be arbitrarily high (infinite), and set the starting vertex as the one selected to be the source. A Dial-a-Ride tour on the reduced instance corresponds to an optimal TSP tour on the original instance.
  • There are two versions of the Dial-a-Ride problem. The preemptive version allows an item to be dropped off by the vehicle at an intermediate location (different from its destination). The non-preemptive version is more restrictive and does not allow this freedom.
  • Metric version: Here the underlying graph is an n-vertex metric space (V,d).
  • Dial-a-Ride on the tree metric: The source and destination points are assumed to be on the leaves. Apart from this restriction, the problem statement remains the same as before.

On the TSP and related problems – 2

3. k-MST: (general (as opposed to metric) version) :- Given an undirected graph G= (V,E), with non-negative edge lengths l : E \rightarrow \mathbb{R}^{+}, find a tree of minimum length that spans at least k vertices.

Notes on the k-MST problem:

  • There are two versions of the k-MST problem. The rooted version specifies a root vertex, r, which must necessarily be included in the tree. The unrooted version does not place any such restriction.
  • An \alpha-approximation algorithm for unrooted k-MST gives an \alpha-approximation algorithm for rooted k-MST. Consider an instance of rooted k-MST. Add |V| additional vertices to G, and an edge of zero length between the root and each of them. Clearly, an optimal solution of the unrooted version requiring the tree to span at least |V| + k vertices in the reduced instance, is an optimal solution for the rooted version for k vertices. Hence, an \alpha-approximation algorithm for unrooted k-MST implies an \alpha-approximation algorithm for rooted k-MST.
  • An \alpha-approximation algorithm for rooted k-MST gives an \alpha-approximation algorithm for unrooted k-MST. Consider an instance of unrooted k-MST. Run the \alpha-approximation algorithm for rooted k-MST on all vertices. The least cost solution among these gives an \alpha-approximate solution to the unrooted version. ( The least cost optimal solution over all vertices (as roots) for the rooted version is an optimal solution for the unrooted version. Hence, the least cost \alpha-approximate solution over all vertices for the rooted version gives a solution that is within \alpha-factor of the optimal for the unrooted version.)
  • Budget version of k-MST: Given an undirected graph G= (V,E), a non-negative length function l : E \rightarrow \mathbb{R}^{+}, and a budget B, find a tree that spans the maximum number of vertices subject to the condition that the length of the tree is at most the upper bound B.
  • Garg [2005] gave a 2-approximation algorithm for k-MST.
  • Metric version of k-MST: Given an n-vertex metric space, (V,d), and a number k, find a minimum length tree that spans at least k vertices.

4. k-forest: Given an undirected graph G= (V,E) with non-negative length function l : E \rightarrow \mathbb{R}^{+}, a set of m demand pairs \{s_i, t_i\}_{i=1}^{m} \subset V \times V, and a number k, find a least cost subgraph that connects (satisfies) at least k demand pairs.

  • Rooted k-MST reduces to k-forest. (k-forest generalizes rooted k-MST.) Consider a rooted k-MST instance on G= (V,E) with root r. Fix |V| - 1 demand pairs \{r,v\} where v \in V - \{r\}. An optimal solution satisfying k-1 demand pairs in this reduced instance is also an optimal solution for rooted k-MST on the original instance. Also, clearly, an \alpha-approximation algorithm for k-forest gives an \alpha-approximation algorithm for rooted k-MST.
  • Unrooted k-MST reduces in polynomial time to k-forest. This follows from the reductions from the unrooted k-MST to rooted k-MST, and from rooted k-MST to k-forest. Again, an \alpha-approximation algorithm for k-forest gives an \alpha-approximation algorithm for unrooted k-MST.
  • Metric k-forest: Given an n-vertex metric space (V,d), a set of m demand pairs \{s_i, t_i\}_{i=1}^{m} \subset V \times V, and a number k, find a least cost subgraph satisfying at least k demand pairs.
  • Metric versions of rooted and unrooted k-MST reduce in polynomial time to metric k-forest.

References:

  • Naveen Garg: Saving an epsilon: A 2-approximation for the k-MST problem in graphs, 2005.

Notes on the TSP and related problems

Here are the problem definitions and some notes on the metric versions of the Traveling Salesman Problem and related problems.

A metric space is defined as an ordered pair (M,d), where M is a set and d: M \times M \rightarrow \mathbb{R} is a distance function satisfying the following properties:

  • d(x,y) \geq 0 \forall x, y \in M.
  • d(x,y) = 0 iff x = y.
  • d(x,y) = d(y,x).
  • d(x,y) + d(y,z) \geq d(x,z).

1. TSP: Given a n-vertex metric space (V,d) and m points in V, find a minimum length tour that covers all m points.

Notes on the TSP:

  • No point that is not one of the specified m points will be visited by the optimal TSP tour. This follows directly from the triangle inequality (visiting a point, v that is not one of the m specified points, can be short cut by bypassing v to obtain a tour that is shorter than one that visits v).
  • The optimal tour is a cycle, i.e. no point is visited more than once. This, again, follows from the triangle inequality.
  • TSP is NP-hard.

2. k-delivery TSP: Given an n-vertex metric space (V,d), a Source set S \subset V, a Destination set T \subset V, and a number k, such that S \cap T = \phi, \sum_{s \in S} x_s = m and \sum_{t \in T} y_t = m, where x_s represents the number of items available (i.e. capacity) at source point s, and y_t represents the requirement at destination point, t, find a minimum length tour of a vehicle that transports all m items from source points to the destination points such that the vehicle carries at most k items at all times during the tour.

Notes on the k-delivery TSP:

  • Implicit in the problem statement are the assumptions that all source points in S are identical, and that all destination points in T are identical, and that all items are identical. In other words, any item can be transported from any source to any destination.
  • The problem can require the tour to start at a particular point. Intuitively, it seems that, unlike in the case of metric TSP, having a fixed starting point can result in a longer tour than what may be obtained given the freedom to choose any starting point.
  • There are two versions of this problem. The preemptive version allows the vehicle to drop off an item at an intermediate point that is not a destination point. The non-preemptive version does not allow this freedom.
  • For the non-preemptive version of the problem, the optimal tour would involve only the points in S \cup T. This follows from the triangle inequality.
  • k-delivery TSP generalizes TSP and hence is NP-hard.
    Consider a TSP instance requiring a least cost tour covering m specified points. Choose an arbitrary point, say r,  among the specified points, designate it as a source point with capacity m-1, and designate the remaining m-1 points as destinations with unit requirement. An optimal k-delivery TSP tour for this instance corresponds to an optimal TSP tour.
  • An application of k-delivery TSP: Suppose that Coke has a certain number of storehouses in a particular town, from where it needs to transport cartons of bottles to stores across the town. A carton can go from any storehouse to any store, and there is only one vehicle for carrying the cartons, and it can carry at most a certain number of cartons at any time.

Undecidability

REGULAR_{TM} = { <M> | M is a Turing Machine and L(M) is regular}.

Q. Prove that REGULAR_{TM} is undecidable.

Solution. We will use the fact that A_{TM} = { <M,w> | M is a Turing machine and M accepts w} is undecidable.

Assume, for the sake of contradiction, that REGULAR_{TM} is decidable.

We will construct a Turing machine, M' such that L(M') is regular if and only if M accepts w.

Let M' operate as follows on input x:-

If x is of the form 0^n1^n, then “Accept”.

Else, simulate M on w, and “Accept” if M accepts w.

This ends the construction of M'.

We can see that :-

1. If M accepts w, then L(M') is the set of all strings, and hence, is regular.

2. If M does not accept w, then L(M') is not regular.

That is, L(M') is regular iff M accepts w. \therefore  If REGULAR_{TM} is decidable, then A_{TM} is decidable.

Contradiction.

Extended Longest Path problem

These are slides from a talk on the Longest Path Problem at the Penn State Theory seminar.

Talk_longest path

In the Longest Path Problem, we are given an undirected graph, G =(V,E), and integer k \geq 0 , and are asked to find out whether or not there exists a simple path of length k in G.

The Extended Longest Path Problem is a variant of the above. Here, we are asked to find out all pairs of vertices that have a simple path of length k between them.

Read more of this post

Longest Path: Method of Random Orientations

In this post, we shall consider an NP-complete problem, namely that of determining whether a simple path of length k exists in a given graph, G = (V,E).

Formally,

LONGEST PATH PROBLEM

Input: undirected graph, G = (V,E), integer k \geq 0 .

Parameter: k

To find: Does G contain a simple path of length, k?

Read more of this post

Vertex Cover and Set Cover – 4

Question: We are given an instance of the Vertex Cover problem. We are also given an algorithm for the Set Cover problem. Using this algorithm as a black box, solve the given instance of the Vertex Cover problem.

Solution:

Given instance of Vertex Cover problem: undirected graph, G = (V, E); positive integer k.
Question: Does G have a vertex cover of size at most k?

===

Read more of this post

Vertex Cover and Set Cover – 3

Consider the Vertex Cover problem.

To recapitulate:
Given: Undirected graph, G = (V, E), positive integer, k.
Question: Does G have a Vertex Cover of size at most k?

Now, we shall look at a problem related to the above question.

How do we verify the validity of a answer to the Vertex Cover problem? Read more of this post

Vertex Cover and Set Cover – 2

Having looked briefly at the Vertex Cover problem, here we shall define the Set Cover problem.

As the name suggests, we have a set, and we need to “cover” something.

Consider the following example:

We have a set of cities, S = {Madras, Delhi, Pilani, Los Angeles, London, Paris}.

Read more of this post

Vertex Cover and Set Cover – 1

Here we will look at two very interesting problems. One is known as the Vertex Cover Problem, and the other as the Set Cover Problem. Both are quite simple to define, and visualize (and equally difficult to solve efficiently :) ).

Let us consider the Vertex Cover Problem first. As the name suggests, we have to “cover” something. We are given an undirected graph, G = (V, E). We need to select a set of vertices from V, so that all edges are “covered” by this set of vertices.

What exactly do we mean by covering? A vertex “covers” all edges that are incident upon it. This does seem to be quite a natural definition.

Read more of this post

Multiway cut – 2

In the previous post on this topic, we saw an algorithm with an approximation guarantee of 2 for the multiway cut problem. Some guidelines were also given on developing a factor (2-2/k) approximation algorithm based on the ideas of the original algorithm.

Let us briefly review that.

Problem: Given a set, S = {s_1, s_2, ..., s_k} of k vertices, find a cut of minimum size (in terms of weights of edges) that separates each s_i from all other vertices in S.

Algorithm:

1. Find an isolating cut for each s_i . Call each such isolating cut C_i .

2. Output the union of the k-1 lightest isolating cuts. (i.e. exclude the heaviest of the C_i's and output the union of the rest.)

The above set of edges is a multiway cut.

We saw how the set of edges C_1 \cup C_2 \cup ... \cup C_k formed a 2 – factor approximation to the optimal cut. Using this fact, convince yourself that the above algorithm outputs a (2-2/k) factor approximation of the optimum.

Another multiway cut algorithm:

We now present a mulitway cut algorithm that performs no worse than the above (2-2/k) factor algorithm, and indeed performs better than the above in examples which we will see.

The algorithm inherits the idea of finding an isolating cut for each vertex in the set, S. Then, it performs a greedy choice, and picks the isolating cut, C_i having minimum size. It then removes the edges of cut, C_i from the graph, G to get a new graph G’. The vertex s_i is removed from the set, S. The same procedure is repeated on this new graph. (i.e. the isolating cuts are again computed for each of the k-1 vertices left in S, and the minimum cut is selected for removal from the graph.)

Once the above procedure is repeated k-1 times, we get the required multiway cut, which has an approximation guarantee of (2-2/k), and which additionally performs better than the above (2-2/k) factor algorithm in specific test cases.

The algorithm is repeated below step-wise.

1. Let G_1 = G. and S_1 = S.

Initially, i = 1. The required multiway cut, C is initially the Null Set.

Repeat the following (Steps 2-7) k-1 times.

2. Compute isolating cuts (wrt vertices in S_i ) – C_i for each vertex in S_i.

3. Let C_j be the minimum of the cuts computed in the previous step.

4. Remove the edges of C_j from G_i to get a new graph, G_{i+1} .

5. Add the edges of C_j to C. i.e. C = C \cup C_j.

6. Remove vertex, s_j from S_i to get a new set, S_{i+1}.

7. Increment i by one.

End of algorithm.

The set, C is the required multiway cut.

—–

Things to think about:

1. Convince yourself that k-1 iterations are sufficient to generate a multiway cut. (i.e. after k-1 iterations, each of the vertices in set, S is disconnected from all other vertices in S.)

2. Can you see why this algorithm will generate a cut that is no worse than the cut produced by the earlier (2-2/k) factor algorithm.

At each step , we remove a particular vertex from the set, S, and recompute the values of the isolating cuts for the remaining vertices, based on this new S. Can you see why this could lead to a better isolating cut for a particular vertex in S than was possible under the earlier algorithm.

Let us consider an example:

multiway_alg2

Example for multiway cut: the edge weights are indicated alongside the edges.

In the above example:

The minimum isolating cut for s_3 has value = 20. (One such candidate for C_3 is the set of edges { {s2,s3), {s3,C}}.
The minimum isolating cut for s_1 has value = 15. There can be more than one such cut with value = 15. Let the choose this one: {{A,B}, {A,C}}.
Similarly, value of minimum isolating cut for s_2 is 15; and one such cut is given by: {{s2,B}, {s2,s3}}.

Now, how would the first algorithm proceed?

The union of the 2 lightest isolating cuts is outputted. Hence, the multiway cut produced by the algorithm is: {{A,B}, {A,C}, {s2,B}, {s2,s3}}. The value of this cut is 30.

How does the 2nd algorithm perform in this example?

Take a minimum value cut from the set of isolating cuts and add it to C. We can add either C_1 or C_2 since both have equal minimum value. Suppose, we add C_1 to C. Now, C = {{A,B}, {A,C}}.
Remove edges of C_1 from G, and recompute minimum isolating cuts for s_2 and s_3 on the new set, S = {s2, s3}.
We get a minimum isolating cut of value = 10 for both s_2 and s_3 . This is given by the edge {s2,s3}.
Add this edge to C, and we are done.

Therefore, C = {{A,B}, {A,C}, {s2,s3}}.

Value of this multiway cut = 25.

Hence, we can see how the 2nd algorithm performs better than the first in this example.

On a separate note, convince yourself that the second algorithm will perform no worse than the first on any input graph.

———

Another example (here the second performs asymptotically twice as better as the first):

Example for multiway cut

Example for multiway cut

Note: In the above diagram, the letter e has been used to denote \epsilon > 0 .

The set, S consists of k vertices, s_1 through s_k as shown. (The diagram is for k=5)

Performance of first algorithm on above instance:
Value of C_1 = 2 - 2 \epsilon .
For all other vertices in S, value of C_i = 2 - \epsilon.

Therefore, the value of cut produced by algorithm = (k-2) \times (2-\epsilon) + (2 - 2 \epsilon).

Performance of second algorithm on above instance:
Value of minimum C_i is 2- 2 \epsilon given by C_1 . On removing C_1 from G and s_1 from S, we find that the values of all other isolating cuts become equal to one. (See the diagram.) We proceed through the algorithm as specified to get cut, C.

Value of cut, C thus produced = (k-2) \times 1 + (2 - 2 \epsilon) .

Verify that for large values of k, as \epsilon —> 0, the ratio of values of the two cuts approaches 2.

Hence, the second algorithm performs twice as better asymptotically in this case.

3. Consider this example:

A tight example for both algorithms.

A tight example for both algorithms.

(Note: The letter, e has been used in the diagram to represent, \epsilon >0 .

Verify for yourself the following:

1. The optimal multiway cut for the above instance is of value = k. (Consider removing all edges of the cycle.)

2. The first algorithm produces a cut of value = (k-1) \times (2 - \epsilon) .

3. The second algorithm produces a cut of value = (k-1) \times (2 - \epsilon).

4. Asymptotically, both algorithms achieve an approximation factor of (2-2/k).

(I’m sorry for any errors in this post.)

Multiway Cut

A cut, C in a graph, G= (V, E) is a set of edges, whose removal from G disconnects G. i.e. G’ = (V, E – C) is disconnected. (A graph is said to be connected if there exists a path between any two vertices in the graph. Hence, a graph is said to be disconnected if there exist two vertices in the graph, say u and v, such that there exists no path between u and v in the graph.)

An s-t cut, C is a set of edges whose removal from the graph, disconnects s and t.

(Note: Throughout this discussion, we shall talk only about undirected, edge-weighted, connected graphs. Also, all edge weights are integers.)

Size of a cut: This means the total weight of all edges forming the cut.

We want to disconnect the graph; but we want to do so by removing as few (in terms of weight) edges as possible. That, we wish to find a cut of minimum size. Such a cut is called a min – cut.

Example for min – cut:

Min Cut example: All edges in the graph have weight = 20, other than the 3 highlighted edges. Those 3 edges form the min cut.

Min Cut example: All edges in the graph have weight = 20, other than the 3 highlighted edges. Those 3 edges form the min cut.

An s-t min cut, as is clear from the name, is a cut of minimum size that separates vertex s from vertex t.

Example for s- t min cut:

Example for s-t min cut

Example for s-t min cut

(The highlighted edges form an s-t min cut of value = 40. Note that the min cut value for this graph is 5 (obtained by removing edge {a,b}.)

=========

A multiway cut is a generalization of an s-t cut.

Here, instead of two vertices, s and t, we have to separate a set of k vertices, {s_1, s_2, ..., s_k} from each other.

In other words, find a set of edges, say C, such that on removing C from the graph, there exists no path between s_i and s_j for any i, j \in {1,2, ..., k} .

Our task is to find a multiway cut of minimum size.

Example of multiway cut:

By removing all edges shown in the figure, we can separate vertices, s1, s2, .., sk, from one another.

By removing all edges shown in the figure, we can separate vertices, s1, s2, .., sk, from one another. (Note that within each ellipse around each si, there may be a number of vertices and edges between those vertices. They have not been shown for sake of clarity.)

(The above example shows a multiway cut.)

========

Question: How do you find a multiway cut of minimum size?

It turns out that it is hard to find such a cut in polynomial time. But that’s not the end of the story. What we can do efficiently (in general, an algorithm is efficient if it runs in polynomial time) is to find a multiway cut that is within a factor of 2 from the optimal. In other words, if the value of the optimal multiway cut of a graph, G is OPT (G) , our algorithm will produce a cut of value \leq 2 \times OPT (G) for every graph, G.

(So, even though we don’t find an optimal cut, we find one that is guaranteed to be “close” to optimal.)

========

Even though, we do not know of any efficient algorithm, for solving a multiway cut problem exactly for arbitrary values of k, we do know how to solve it efficiently for k = 2. (Note that for k = 2, the multiway cut problem is just the s-t cut problem. An optimal s-t cut can be found efficiently. [It is actually quite simple to do it; but that is not our concern here.] )

We will use this efficient s-t min cut algorithm to find our approximate multiway cut.

————–

Let us review once again what we want. Let us denote by S the set of k vertices, {s_1, s_2, ..., s_k} .

We want to remove a set of edges, such that on their removal:
- there exists no path from s_1 to s_2 , s_1 to s_3 , and so on till s_1 to s_k .
- there exists no path from s_2 to s_1 , s_2 to s_3 , and so on till s_2 to s_k .
- there exists no path from s_3 to any of the other vertices in S.
- and so on, for each vertex in the set, S.

The above explanation of the problem actually provides us with a straightforward algorithm for the same.

Take vertex, s_1 . Find a set of edges, removing which would disconnect s_1 from every other vertex in S.
Let C_1 be such a set of minimum size (i.e. minimum total weight).

How do we find C_1? Simple; we want a min cut that separates s_1 from all other vertices of S. What we’ll do is merge all these other vertices of S into one single vertex, call it t. Find an s-t min cut with s_1 as the vertex, s, and the merged vertex as t.

(Merging: When we merge any two or more vertices, the edges between these merged vertices disappear. The other edges that were incident upon these vertices, remain as such, and are now incident upon the new merged vertex.

Example for merging vertices:

The graph on the right is obtained by merging vertices, C, D, E and F into one "super-vertex" denoted by {C,D,E,F}. Note that there are multiple edges between the "super-vertex" and some other vertices of the graph.

The graph on the right is obtained by merging vertices, C, D, E and F into one "super-vertex" denoted by {C,D,E,F}. Note that there are multiple edges between the "super-vertex" and some other vertices of the graph.

(Each of the edges {F,G} and {E,G} translates into an edge between the super-vertex and G. Hence, there are 2 edges between {C,D,E,F} and G.)

Therefore, using the efficient algorithm for finding an s-t min cut, we can compute C_1 . On removing the edges of C_1 , vertex s_1 will be disconnected from all other vertices in S.

So, the first part of our job is done (disconnecting s_1 from all others in S.)

(Note: We call C_1 as an “isolating cut” for vertex, s_1 .)

Similarly, find isolating cut C_2 corresponding to vertex s_2 . On removing the edges of C_2 , s_2 is disconnected from every other vertex of S.

Therefore, by removing the union of C_1 and C_2 , we can disconnect s_1 and s_2 from every other vertex in S (and obviously, from each other also).

Now, you get an idea of how to proceed, right?

Find an isolating cut C_i corresponding to each vertex, s_i in the set, S.

The union of all these isolating cuts, i.e. C_1 \cup C_2 \cup ... \cup C_k is a multiway cut separating s_1, s_2, ..., s_k from one another.

=========

Question: We had claimed that the cut produced by the algorithm is always within a multiplicative factor of 2 within the optimal. How do we prove this?

Let us call the cut produced by the above algorithm as C. i.e. C = C_1 \cup C_2 \cup ... \cup C_k .

Let us call the optimal cut as A.

Now, on removing the edges of A from G, we would get k (disjoint) connected components in the graph, one connected component for each vertex, s_i .

Let A_1 be the subset of edges from A, that is incident upon the connected component of s_1 .

Graph showing A1. (A1 is the set of dashed edges.) (Note that within the connected components of vertices other than s1 have not been shown. Also, the edges between the various connected components other than the one containing s1 have not been shown, for sake of clarity.)

Graph showing A1. (A1 is the set of dashed edges.) (Note that vertices and edges within the connected components of vertices other than s1 have not been shown. Also, the edges between the various connected components other than the one containing s1 have not been shown, for sake of clarity.)

As is quite obvious, A = A_1 \cup A_2 \cup ... \cup A_k .

It is also clear that each edge of A appears in exactly two A_i 's . (If an edge of A is between the connected component of vertex s_i and s_j , then it appears in the sets A_i and A_j . )

From this, we have that:

\sum_{i=1}^{k} w(A_i) = 2 \times w(A) .  —————————Eqn. (1)

Here, w(X) denotes the total weight of edges in set, X.

Now, we know that C_1 is an isolating cut of minimum size for s_1 .

Therefore, w(C_1) \leq w(A_1) .

Similarly, w(C_i) \leq w(A_i) for all i \in {1, 2, ..., k} .

Therefore, \sum_{i=1}^{k} C_i \leq \sum_{i=1}^{k} A_i .

Combining the above equation with Eqn. (1), we get that:-

\sum_{i=1}^{k} C_i \leq 2 \times w(A) .

i.e. the cut produced by our algorithm is always within a factor of 2 of the optimal multiway cut.

In other words, we are done. :)


Points to ponder:

1. Do we necessarily need to take the union of all k sets C_1 till C_k to obtain a multiway cut? What if we take the union of only k-1 of these sets? Will that union be a valid multiway cut?

2. As an extension of the above point: which k-1 sets should we choose in order to minimize the size of the cut produced? (Hint: you can discard the heaviest among the C_i 's and output the union of the rest.) We know that outputting the union of all k sets results in a factor-2 approximation algorithm. What is the approximation factor for this improved algorithm? (Hint: Prove that the approximation factor is (2 – 2/k). )

(The topic of multiway cut is dealt in Chap. 4 of “Approximation Algorithms” by Vijay Vazirani.)

Maximum Alternating Sum Subsequence

We are given a sequence of positive integers, say [x_1, x_2, ..., x_n].

We need to find a subsequence of maximum alternating sum.  S = [x_{i1}, x_{i2}, ..., x_{ik}] is said to be a subsequence if i1 < i2 < ... < ik. Alternating sum of subsequence, S is simply x_{i1} - x_{i2} + x_{i3} - x_{i4} + ... +/- x_{ik}. .

So, our task is to find a subsequence, say S^* which maximizes the above sum. (Note that there may be more than one such optimal subsequence, S^* .)

Example:

Sequence, X = [10, 2, 3, 8, 9, 1, 10] ——->  S^* = [10, 2, 9, 1, 10]  (Alternating sum = 10 – 2 + 9 – 1 + 10 = 26.)

Sequence, X = [1, 1, 4, 10, 3, 1, 10, 20] ——-> S^* = [10, 1, 20] (Alternating sum = 29.) (Another possible optimal subsequence is [1, 1, 10, 1, 20].)

Observation:-

1. The length of the optimal subsequence will be odd. (It is quite clear why this is so. If we have an even subsequence, the last term of the subsequence is subtracted in the alternating sum. So, by removing one element from this sequence, we get a sequence of higher alternating sum. [This is true since all elements of the sequence are positive.]. So, for any even subsequence, we can always find an odd subsequence of higher alternating sum.)

Development of the algorithm:

1. Now, how do we get an Odd Subsequence? Simple, by adding one element to an even subsequence. Again, how do we get an Even Subsequence? And again, by adding one element to an Odd Subsequence.

Suppose we want to find an Odd Subsequence of Maximum Alternating Sum (MAS) having its last element as x_k . How do we get that?

Answer: The element prior to x_k in this required Odd Subsequence is an element having index \leq k-1. If it is x_{k-3} then we would find an Even Subsequence of MAS ending at x_{k-3} and add x_k to that.

How do we decide which would be the index of the element preceding x_k ?

Simple. Consider an Even Subsequence of MAS ending at x_k-1 , and add x_k to that. This is one possibility. What is the Alternating Sum of this Odd Subsequence? Answer: MAS(Even Subsequence ending at x_{k-1}) + x_k.

We have to maximize the Alternating Sum of the Odd Subsequence. Therefore, the Odd Subsequence of MAS ending at x_k is precisely one that maximizes: MAS(Even Subsequence ending at x_j ) + x_k over all indices ‘j’ ranging from ’0′ to ‘k-1′. (Since, x_k is added to each of these sums, we can select an optimal one by taking the one that maximizes MAS(Even Subsequence ending at x_j ). )

Notation:

Let OPT_Odd (j) denote MAS of Odd Subsequence ending at x_j . Analogously, let OPT_Even (j) denote MAS of Even Subsequence ending at x_j .

(The solution to the problem is given by: max_{j: '1' \space to 'n'} OPT_Odd (j). )

—–

Based on the above discussion, we get the following recursion:

OPT_Odd (j) = max_{i: '0' \space to 'j-1'} OPT_Even (i) + x_j .

and, OPT_Even (j) = max_{i: '1' \space to 'j-1'} OPT_Odd (i) + x_j .

Base case values:

OPT_Odd (0) = 0; OPT_Odd (1) = x_1

OPT_Even (0) = 0; OPT_Even (1) = 0.

—–

How the algorithm computes OPT_Odd and OPT_Even values:

It computes OPT_Odd (2) using base values of OPT_Even (0) and OPT_Even (1). Similarly, it computes OPT_Even (2) using base values of OPT_Odd (0) and OPT_Odd (1).
Then, we compute OPT_Odd (3) using OPT_Even values for 0, 1 and 2. Similar is the case with OPT_Even.

We proceed this way until we compute OPT_Odd (n).

——-

As pointed above, the value of the Odd Subsequence of MAS is given by:

max_{i: '1' \space to 'n'} OPT_Odd (i).

——-

Points to ponder:

1. You can try writing the pseudocode for the algorithm, indicating the way it computes the OPT_Odd and OPT_Even values.

2. What is the running time of the implementation of the algorithm given by your pseudocode? Is it O(n^2)?

3. The above algorithm computes the Value of the optimal solution. How do we get an Optimal Subsequence itself? You might have to modify the algorithm to keep track of the index of the Even Subsequence that maximizes a particular Odd Subsequence, and also do the same for Even Subsequences. Are there other ways to find an optimal subsequence? (Computing Optimal Solutions (as opposed to optimal value) would be given in any standard textbook on algorithm design. Eg. Kleinberg/Tardos Chap. 6)

(There might be errors in the above algorithm/anlysis. I apologize for them. It would be great if you point out such errors.)

Bipartite Matching: An application

Firstly, we define the concept of a matching.

Suppose we are given a graph, G = (V, E). As the name suggests, a matching means a pairing of vertices of the graph. In other words, each vertex is “matched” to exactly one other vertex in the graph to which it had been connected by an edge in E. We say that such edges, which “match” two vertices with one another, are part of the matching.

Formally, a matching is a set of edges, S \subseteq E such that no two edges in S are adjacent to each other. (Two edges are adjacent if share a common end-vertex.)

Example:

Matching Example

Matching Example : Bold edges {(1,2), (4,5)} form a matching.

Maximum matching:

A matching of maximum cardinality is called a maximum matching. (Note that a graph may have more than one matching of maximum cardinality.)

Example:

{ (1,2), (3,4), (5,6) } form a maximum matching

{ (1,2), (3,4), (5,6) } form a maximum matching

Clearly, we can not get a matching having more than 3 edges for the above graph. (There are only 6 vertices; for each pair of vertices there can be at most one edge in the matching; hence there can not be a matching with greater than 3 edges.) Hence, this is a maximum matching.

Perfect Matching:

We were able to pair up all vertices of G in the above example. Such a matching is called a perfect matching.

A perfect matching may not exist for many graphs.

Example:

Maximum matching is of size one. Two vertices are unpaired in maximum matching.

Maximum matching is of size one. Two vertices are unpaired in maximum matching.

If the graph we consider is a bipartite graph, then the matching in such a graph is termed as a bipartite matching.

Practical application of bipartite matching: Job recruitment process

Suppose there are ‘n’ companies competing to hire students from a university, and that ‘m’ students are available for hiring. Assume that each company has only one job opening, and hence can hire at most one student.

After the tests and interviews, each company shortlists certain candidates. The company sees no distinction among its shortlisted candidates, and is willing to hire any one of them to fill its opening.

It is clear how this is a problem of bipartite matching.

Let L be the set of companies, and R be the set of students.

An edge exists between, l \in L and r \in R if r is one of the candidates shortlisted by the company.

Each company can hire at most one (by our assumption; either it hires a student, or goes home empty-handed.)

Quite obviously, each student can work in at most one company (either the student is selected by one company; or goes home empty-handed.)

Hence, this is a clear case of bipartite matching between the set of companies and the set of students.

The university would certainly like to find out the maximum number of students who can get placed through this recruitment process.

In other words, the university wishes to find out the size of the maximum bipartite matching possible for the company-student graph.

There exist polynomial time algorithms for computing a maximum bipartite matching. Hence, the problem can be solved by running any of those algorithms on the given instance of the company-student graph.

Points to ponder:

1. When does a bipartite graph have a perfect matching? What is the common structure of such bipartite graphs, which enable them to have a perfect matching? In particular, there is a very interesting theorem known as Hall’s Marriage Theorem. The theorem has two parts to it: an “if” part, and an “only if” part. One of these is simple to visualize, while the other is somewhat non-intuitive. You could read up on that.

2. There is a concept of maximal matching.

A maximal matching of a graph G = (V, E) is a matching to which no other edge of E can be added without violating the matching property. (Matching property requires that all edges of a matching be non-adjacent.)

Example:

Maximal matching: No edge can be added to the matching { (1,6), (3,4) } without violating the matching property.

Maximal matching: No edge can be added to the matching { (1,6), (3,4) } without violating the matching property.

The size of this particular maximal matching of G is 2. (Unlike maximum matchings of a graph which are all of the same size, the maximal matchings of a graph can be of different sizes.)

But we can certainly do better than this in terms of getting a matching of higher cardinality.

A Maximum matching for G: { (1,2), (3,4), (5,6) }

A Maximum matching for G: { (1,2), (3,4), (5,6) }

(You can also observe that the above matching is also a perfect matching of G (all vertices are paired).)

Question: Can you find a relation between the size of a maximal matching and the size of a maximum matching. Let l be the size of a particular maximal matching, and let m denote the size of any maximum matching (all maximum matching are, by definition, equal in size).

We know that l \leq m .

Claim: There exists a constant factor,k > 1 , such that m \leq l \times k , for all possible l .

(Notes: How can we get a matching of greater size using a given maximal matching? Clearly, we can not add an edge to it (by definition of maximality, the resultant set will not be a matching.)  So, we need to remove an edge from the maximal matching. Suppose we remove a particular edge, (u,v) . We are, in effect, freeing two vertices, u and v , which are available to be paired up with some other free vertices. Hence, on removing one particular edge from a maximal matching, we can hope to add at most two new edges in its place. This suggests that the size of a maximum matching is at most twice the size of a maximal matching (i.e. k = 2).

I’m not sure about the correctness of the above explanation, as it omits certain details, and may not be rigorous in terms of exploring all possible alternatives for building a maximum matching. On an informal level, this explanation might suffice.)

Max-Cut Problem

The Maximum Cut Problem asks us to find an edge-cut of maximum cardinality. (We are considering undirected, unweighted graphs.) In other words, we are required to find a partition of the vertex set, V into (X^*, V-X^*) such that the number of edges crossing the partition is maximized.

The algorithm presented here is based on a technique known as local search.

Algorithm:

Take any arbitrary partition of the vertex set, (X, V-X). Find a vertex which when flipped to the other set of the partition would have more of its incident edges crossing the cut.

Flip the vertex to the other set of the partition. Continue searching for such a vertex in this new partition. Continue this procedure generating new partitions until no such vertex satisfying the above condition exists.

—-

Consider this example:

graph4

We are given an arbitrary cut as shown above. Consider vertex, 1. Out of its 3 incident edges, 2 are incident on vertices of S1 (its current set), and one is incident on a vertex of set S2. Hence, by moving vertex, 1 to set, S2, we can improve the cardinality of the cut by one.

This is what we get by moving vertex,1 :-

Cut improved by flipping vertex,1
Cut improved by flipping vertex,1

Observations on the algorithm:

1. Each “flip” of a vertex from one set of the partition to the other increases the size of the cut by at least one.  (If vertex, 1 is incident on x_1 vertices in set S_1 , and x_2 vertices in set S_2 , (and x_1 > x_2 ) then flipping vertex, 1 from S_1 to S_2 would increase the size of the cut by x_1 - x_2. )

2. The above point shows why the algorithm runs in polynomial time. The maximum possible size of a cut of G=(V,E) is |E|. From Observation 1, we can conclude that the algorithm makes at most |E| flips. Further, each flip can be done in polynomial time. Hence, the algorithm is of polynomial time complexity.

3. What is the minimum number of edges that the algorithm returns in its cut? In other words, give a lower bound on the size of the cut returned by the algorithm.

This is simple to compute.

Let (X, V-X) be the cut returned after the termination of the algorithm. (Termination of the algorithm implies that flipping any vertex from one set of the partition to the other would not produce a better cut.)

Consider any vertex, v in this cut. At least half of the edges incident on vertex, v in graph G would be crossing (X, V-X). (since otherwise it would mean that flipping v would improve the cut returned by the algorithm, indicating that the algorithm has not yet terminated i.e. contradiction.)

Therefore, at least half of all edges of G cross the cut returned by the algorithm.

|C| \geq |E|/2

where |C| represents the size of the cut returned by the algorithm.

4. Consider the following example:

For the above graph, let an arbitrary cut be given by:

Arbitrary cut used as starting point for algorithm
Arbitrary cut used as starting point for algorithm

(Edges within sets S1 and S2 have been omitted for clarity.)

If we start our algorithm with this cut as a starting point, we find that we can find no vertex which is suitable for flipping.  ( Consider vertex, a: Flipping it would make it incident  on n/2 vertices of S1. But it is already incident on n/2 vertices on S2. Flipping it will not improve the cut. Same is the case with vertex, b. Similarly flipping any of 1,2,..,n would improve the cut.)

Hence, this is the best cut that the algorithm can produce when provided with this particular cut as a starting point.

Size of cut produced by algorithm = n

But this is certainly not a maximum cut for the graph. Consider the following cut :-

A Max cut for the graph
A Max cut for the graph

Size of above cut = 2n.

This is clearly the maximum cut possible for the graph, since |E| itself is 2n.

Hence, our algorithm produces a cut that is factor of 1/2 of the optimal in this particular instance.

( Please note that the output of the algorithm for a particular input graph varies according to the arbitrary cut that is chosen as its starting point. )

5. The best possible cut can be of size |E|.

From Observation 3, it is known that the algorithm produces a cut of size at least |E| / 2.

Hence, the algorithm is a factor- 1/2 approximation algorithm. (i.e. it produces a cut that is always within a factor of 1/2 of the optimal)

Observation 4 indicates that the factor of 1/2 is indeed a tight bound on the performance of the algorithm through an example where the algorithm performs no better than than a factor of 1/2. (Such examples are known as “tight examples“).

On the Shortest Path and the MST Problems

This is a pre-submission copy of my undergraduate thesis “On the Shortest Path and Minimum Spanning Tree Problems”.

On the Shortest Path and Minimum Spanning Tree problems

This is more like a restricted survey on some algorithms for computing shortest paths and MSTs, and is quite elementary.

Vertex cover Problem

Here we’ll see a simple algorithm that finds a near-optimal vertex cover of a graph.

Firstly, what is a vertex cover? A vertex cover is a set of vertices, call it S, such that the following condition holds: every edge of the graph has at least one end-point in S.

( i.e. we are “covering” all the edges of the graph through the vertex cover.)

Example: If vertex, x is adjacent to vertices, a, b, and c, then it “covers” edges {x,a}, {x,b} and  {x,c}.

What we ideally want is a vertex cover having the minimum number of vertices, i.e. we want to “cover” all the edges of the graph using as few vertices as possible.

That would an optimal vertex cover. So, what is meant by “near-optimal”? This means that the vertex cover we find is not exactly one of the minimum possible cardinality. However, it also means that this vertex is close in size, say by a constant multiplicative factor, to the optimal one.

Why should we settle for a solution that is not as good as optimal? The reason is simple: some questions are inherently difficult to solve exactly. Also, for some problems, a near-optimal solution is good enough.

(Note that we only consider graphs that do not have any isolated vertices.)

We digress here a bit to introduce another concept.

Consider a graph, G. Pick an edge of G. Now, pick another edge of G, that is not adjacent to the edge picked earlier (i.e. the two edges do not have any common end-points). These two edges are said to be disjoint.

A matching in a graph is a set of disjoint edges.

What we would like to see is how many edges we can possibly have in a matching.  A maximum matching is a matching having the maximum number of edges that any matching of that particular graph can possibly have. (Note that we say “a” maximum matching, instead of “the” maximum  matching, since a graph can have multiple maximum matchings.)

The vertex cover algorithm is based on two simple ideas.

Idea 1If we take the end-vertices of all edges forming a maximum matching of a graph, G, then this set of vertices, say S, forms a vertex cover of G.

This is simple to prove. Assume there is an edge {x,y} which is not “covered” by S, i.e. neither x nor y is in S. In that case, we can add edge {x,y} to get a new matching. This matching has a size greater than that of our maximum matching, which is not possible. Hence, all edges of G are “covered” by S.

As stated earlier, we are interested in finding a vertex cover having minimum number of vertices. How does set S fare on this account?

Idea 2: The size of a maximum matching of G is a lower bound on the size of a vertex cover of G.

i.e. The best possible vertex cover of G (one with least possible number of vertices) will have at least ‘k’ vertices, where ‘k’ is the number of edges in a maximum matching of G. ( k = |S| / 2, in our  terminology.)

This is also easily proved.  Let’s call the vertex cover as T. Now, every edge of the maximum matching, call it M, must be “covered” by T. So, every edge of M must have at least one end-vertex in T. Also, there is no overlap whatsoever among the end-vertices of any two edges of M. Therefore, T must contain at least |M| vertices.

We can’t obtain a vertex cover with lesser number of vertices.

“Idea 1″ showed how to obtain a vertex cover of size, |S| = 2 |M|.  And “Idea 2″  showed that a minimum vertex cover needs to have at least |M| vertices.

So, what we obtained through “Idea 1″ is a vertex cover that is within a factor of two from the optimal. i.e. The number of vertices it contains is no more than twice the number of vertices contained in an optimal vertex cover.

Such an algorithm is known as a factor-2 approximation algorithm.

Notes:

1.  There exist algorithms that can find a maximum matching in polynomial time. (In algorithmic terms, a polynomial running time is considered to be “efficient”.) On the other hand, we do not know of any such efficient algorithm for obtaining an optimal vertex cover. Hence, we use the matching problem to develop an efficient approximation algorithm.

2. Suppose a network of roads in a town is modeled as a graph, with the streets representing edges, and intersections between streets being the vertices. We are asked to place policemen at the street intersections such that all streets are monitored by at least one policeman. ( A policeman can monitor all streets that begin/end at his intersection.) We want to use as few policemen as possible. This is the same as the optimal vertex cover problem.

Does Kruskal produce all MSTs?

The area of Theoretical Computer Science has so many beautiful questions. Among these are a few questions, that can be called as “cute” questions. The problem raised here is one such question. :)

Here’s the question: Given a weighted connected graph, G, does Kruskal’s algorithm produce all possible Minimum Spanning Trees of G?

A couple of lines on Kruskal’s algorithm: In order to find the MST, you take the edges and order them in non-decreasing order. Then traverse this sorted sequence of edges, picking edges one by one, keeping in mind that the picked edge should not create a cycle with any of the previously picked edges.

There’s also another interesting fact about MSTs: If a graph has unique edge weights, i.e. no two edge weights are same, then that graph also has a unique MST. Also, if a graph has repetitions of certain edge weights, then there may exist more than one MST for the graph.

Now, back to the original question.

Let us take an example graph to understand the arguments that follow.

Suppose our connected graph, G has the following ordered sequence of edge weights:

1 (e1), 2 (e2), 3 (e3), 4 (e4), 5 (e5), 5 (e6), 5 (e7), 6 (e8), 7 (e9), 8 (e10), 8 (e11), 8 (e12), 8 (e13), 9 (e14), 10 (e15), …. , (em).

(The parenthesized symbols are the edge names)

Now, we apply Kruskal’s algorithm on G to get some MST. Let’s call this as Tk.

Also, let T be any random MST of G. We shall compare T with Tk to see whether T can be produced by Kruskal’s algorithm.

Simple till now, right?  :)

Now, T and Tk will generally have some common edges. (Even if they don’t the ensuing arguments will hold just as well.)

Let us consider the edges one by one (in sorted order). Let’s say that edge, e1 is part of both T and Tk. Similarly, edges, e2 and e3 are both of both T and Tk. Also, let us assume that edge, e4 is not part of either T or Tk. So far so good; all edges thus far between T and Tk are common.

Case 1: Now, let us assume that Tk contains edge, e5, but not e6; while T contains edge, e6, but not e5.

We now have to prove that Kruskal’s algorithm can produce an MST which contains e6 but not e5.

(We know that Kruskal’s algorithm selects an edge to be a part of the MST in sorted order. When two or more edges have equal weight, it randomly picks one of them and checks whether that edge can be a part of the MST. Only then does it go to the next edge of equal weight to test whether that is compatible with the tree constructed thus far.)

So, in our example, after Kruskal’s algorithm successfully processes edge, e4, it has to make a choice between e5 and e6. Let us suppose that it picks edge, e6. We know that e6 would be compatible with the previously made selections of e1 through e4 (this is because, T contains the same selection as Kruskal for e1 through e4, and also contains e6. Hence, e6 is compatible with the earlier selections i.e. it does not form a cycle with those edges).

So, one problem is sorted out. There does exist a Kruskal-produced MST, call it Tk1, having edge, e6. But, we need to ensure that this MST does not contain e5. This can be easily proved.

Here’s the proof: We know that  { e1, e2, e3 } is compatible with both e5 as well as e6. (e4 is part of neither T nor Tk, in our example). Now, Kruskal’s algorithm always tests an edge along the ordered sequence in order to accept or reject it. Since, e6 was not part of Tk, it means that { e1, e2, e3, e5, e6} form a cycle. Coming back to Tk1: e6 can’t be part of Tk1 since it forms a cycle with the previously picked edges {e1, e2, e3, e5}.

So, we have proved that if a Kruskal-produced MST and a random MST differ in an edge of the same weight, then we can apply Kruskal’s algorithm to rectify this difference.

Case 2: Let’s get back to the sample edge sequence (reproduced here for our convenience):

1 (e1), 2 (e2), 3 (e3), 4 (e4), 5 (e5), 5 (e6), 5 (e7), 6 (e8), 7 (e9), 8 (e10), 8 (e11), 8 (e12), 8 (e13), 9 (e14), 10 (e15), …. , (em).

Again, let Tk be a Kruskal-produced MST, and T be a random MST of G.

Let’s assume that T and Tk agree on edge selections until edge, e7. Also, let us assume that Tk does not contain e8, while T contains e8.

We can easily prove that this is not possible. The two facts that T agrees with Tk on edges e1 through e7, and that e8 is part of T, imply that e8 is compatible with Kruskal’s selection edges from e1 through e7. By definition, Kruskal’s algorithm must accept e8. Hence, the fact that e8 is not part of Tk is not possible. Hence, case 2 is an impossibility.

Case 3: Consider the same edge sequence:

1 (e1), 2 (e2), 3 (e3), 4 (e4), 5 (e5), 5 (e6), 5 (e7), 6 (e8), 7 (e9), 8 (e10), 8 (e11), 8 (e12), 8 (e13), 9 (e14), 10 (e15), …. , (em).

T and Tk notations are also inherited from above.

Assume that T and Tk agree on edge selections from e1 through e9. Also, assume that T contains two edges of weight, 8, say e10 and e11. On the other hand, assume that Tk contains only one edge of weight, 8, say e12.

To rectify the fact that Tk contains e12 but not e10, we apply the same arguments as in Case 1. Hence, we get that there can be a Kruskal-produced MST, call it TK1, that contains e10 but not e12.

Now, can e11 be made to be a part of a Kruskal-produced MST that contains e10 but not e12? The arguments here are similar in nature to those in Case 2. We can prove that such a case can not exist.

Case 4: Here’s the same sequence:

1 (e1), 2 (e2), 3 (e3), 4 (e4), 5 (e5), 5 (e6), 5 (e7), 6 (e8), 7 (e9), 8 (e10), 8 (e11), 8 (e12), 8 (e13), 9 (e14), 10 (e15), …. , (em).

Using the same meaning for T and Tk, we shall assume that T and Tk agree on edge selections from e1 through e8. Also, assume that Tk contains e9, while T does not contain e9.

Add edge, e9 to T. We get a cycle. It can be observed that this cycle will contain an edge with index greater than 9. i.e. one edge from among the set {e10, e11, …, em}. (This is  because, if e9 formed a cycle composed exclusively of edges selected by Kruskal from e1 through e8, it would not have been a part of Tk in the first place. Hence, the presence of a higher-indexed edge is essential in the cycle).

Now, we can remove the higher-indexed, and hence, higher-weighted edge, say  f, from the cycle and replace it in the MST, T by e9. This new tree, T – {f} + {e9}, is lower in weight than T. This is not possible. Hence, Case 5 is also an impossibility.

Case 5: Here’s the same edge sequence:

1 (e1), 2 (e2), 3 (e3), 4 (e4), 5 (e5), 5 (e6), 5 (e7), 6 (e8), 7 (e9), 8 (e10), 8 (e11), 8 (e12), 8 (e13), 9 (e14), 10 (e15), …. , (em).

Assume T and Tk agree on edge selections from e1 through e9. There can be a possibility that Tk contains 2 or more edges of weight 8, while T contains just one of them. We can show by arguments similar to those in Case  4 that such a possibility does not exist.

The cases that were considered were chosen so as to represent all possible types of interesting situations that might arise. There may be other cases possible, but hopefully it would reduce to one of these.

Explanation of this problem through an example makes, I believe, it easier for the reader and certainly, the writer. :) It is indeed a very long explanation, and probably one may want to think of the answer themselves.

So, Kruskal’s Algorithm does produce all possible MSTs of a graph G.

Some Practical Implications of this Result:

Whenever one is asked to prove that a given spanning tree is not an MST, we just need to show that the given tree can not be produced by Kruskal’s algorithm.

We had seen an interesting fact at the beginning of this post, that uniqueness of edge weights implies uniqueness of MST. (Note that we did not use this fact anywhere in showing that Kruskal’s algorithm produces all MSTs.) This is easily proven since Kruskal’s algorithm produces only one MST when run on a graph with unique edge weights, implying that only one MST exists for such a graph.

Weighted Interval Scheduling (Dynamic Programming)

Dynamic Programming is a very convenient methodology to compute in polynomial time a solution that seemingly requires exponential time. For an optimization problem, it searches through the space of possible solutions, which may be exponential in size. Now, how can it test all such solutions in polynomial time? The answer is that it builds upon the solution stage-by-stage, and for computing values at each stage, it uses the pre-computed values of the previous stages.

Does the divide-and-conquer paradigm also not do the same, i.e. both of them solve smaller sub-problems in order to find the solution to the given problem. As can be seen via examples, there is marked difference in the problems that subject themselves to dynamic programming algorithms, and those coming under divide-and-conquer algorithms. While the former reduces a possibly exponential time complexity to a polynomial one, the latter tries to improve upon the running time of an algorithm that is already of polynomial time complexity.

Let us look at a very interesting example.

We are organizing a series of Theory lectures by speakers from all over the globe. Depending on the topic being presented, and the speaker presenting it, each talk is given an importance rating, which is a positive number (with a larger value denoting higher importance). We have already decided on the exact times at which each talk would begin, and also the exact time at which it would end. Now comes the twist: we have only one lecture hall. Quite obviously, we can not have talks that have any overlap among them with respect to timing. (It’s not fun listening to two people speaking simultaneously from a single podium). The solution we opt for is to schedule the talks such that the sum total of the importance ratings of the scheduled talks is maximized.

This is known as the Weighted Interval Scheduling problem. We are given a set, S of ‘n’ events (each with a fixed start and end time) and an associated weight, w(k) for each event, k. We need to schedule the events (i.e. find a subset, T of S) such that :-
(1). there is no overlap whatsoever among the scheduled events;
and (2) the sum of weights of scheduled events is maximized.

Let us look at a solution to this problem.

For making our lives easier, we shall order the events by their finishing times (from the earliest finishing time, to the last one). That is, we have numbered the events as 1, 2, 3, …, n, such that f(1) < f(2) < f(3) < …. < f(n). [Here, f(k) denotes finish time of event, k.]

Now, assume that we schedule an event, ‘x’. Let us try to find out what all events we can schedule before ‘x’. Clearly, we can not schedule any event that overlaps with ‘x’. Hence, the finish time of any such event, ‘y’ must be  before the start time for x. i.e. f(y) < s(x). [Here, s(k) denotes start time of event, k.] Let ‘z’ be the event having the highest finish time among all such possible events,y that can be scheduled prior to ‘x’.  Let us denote the event, ‘z’ by the symbol, P(x).

From our ordering of events on the basis of finish times, we can observe that all events that have an index less than that of P(k) (i.e. come before P(k) in the ordering) can be scheduled before event, ‘k’.

Let us consider an optimal solution to the problem, O. Now, event ‘n’ (the last one) either belongs to the set, O or it does not belong to O.

If ‘n’ belongs to O :-  We can only schedule those events having index < = P(n). Also since O is optimal, the resulting subset of events chosen from among events 1, 2, .., P(n) should be optimal themselves. (Else, we could replace that subset with another one to produce a higher valued solution.)

If ‘n’ does not belong to O :- We can schedule any set of events having index <= n-1. Again since O is optimal, we will choose a subset which is optimal for the set of events 1, 2, …, (n-1).

The optimal value of the solution ( for a set of ‘n’ events), denoted by OPT (n) is then given by:

OPT (n) = max {  w(n) + OPT (P(n)),  OPT (n-1) }

Now, OPT (1) = w (1). We can quite easily compute the P(k) for each event. Using the value of P(2) and OPT(1), we get the value of OPT(2). Similarly, we get the value of OPT(3) using pre-computed OPT values for events numbered < 3 and P(k) values.

Hence, we see that computing OPT(k) for any given ‘k’ can be done in constant time given P(k) and the OPT values of previously computed indices < k.

Hence, we have achieved a running time that is linear in ‘n’. [ Computing P(k) for  all 1<=k<=n is O(n); similarly computing OPT (k) for all 1<=k<=n is again O(n).]

Thus we have seemingly miraculously achieved an O(n) time complexity algorithm for a problem that does not seem very amenable at first sight. That is the power of dynamic programming.

The above post is based on an example from Ch. 6 of the excellent textbook on “Algorithm Design” by Kleinberg and Tardos. Enjoy the book.

A Reading List for Theoretical CS

Some nice books on Algorithms /Theory of Computation and related topics:-

1. Introduction to Algorithms  (CLRS)

2. Algorithm Design (Kleinberg/Tardos)

3. Algorithms (Dasgupta/Papadimitriou/Vazirani)

4. Design and Analysis of Computer Algorithms (Aho/Hopcroft/Ullman)

5. Theory of Computation (Sipser)

6. Elements of the Theory of Computation (Papadimitriou/Lewis)

7. Randomized Algorithms (Motwani/Raghavan)

8. Combinatorial Optimization: Algorithms and Complexity (Papadimitriou/Steiglitz)

9. Combinatorial Optimization: Theory and Algorithms (Korte/Vygen)

10. Approximation Algorithms (Vazirani)

11. Quantum Computation and Quantum Information (Nielsen/Chuang)

12. Convex Optimization (Boyd/Vandenberghe) this is available online at this Stanford site.

13. The Probabilistic Method (Alon/Spencer/Erdos)

14. Elements of Information Theory (Cover/Thomas)

15. Computational Complexity: A Modern Approach (Arora/Barak) A draft of the book is available at this Princeton site.

16. Probability and Computing (Mitzenmacher/Upfal)

17. Concrete Mathematics (Graham/Knuth/Patashnik)

18. Graph Theory (Diestel)

19. Computational Complexity (Papadimitriou)

Min-cut Problem

In this post, we consider the min-cut problem. A min-cut is a cut of minimum size, i.e. the minimum number of edges whose removal disconnects the graph (also called edge connectivity). It is represented as (X, V-X) where X is a subset of V in the multi-graph G=(V, E).  (Multigraph is a graph where parallel edges are allowed.)

Now, how do we find a min-cut? We’ll approach this through a randomized algorithm. It is very straightforward.

What we want ultimately is to have a partition of V into two disjoint vertex sets, X and V-X, where each of X and V-X is connected. With this in mind, we’ll proceed by successively decreasing the number of vertices in the graph, until only two vertex sets are left.

Select an edge of G uniformly at random. Contract the end points of this edge. That is, merge the 2 end points into one single vertex. If x and y are the 2 vertices merged into a new vertex, say z, then all edges of the form (v,x) and (v,y) are replaced by edges of the form (v,z). As we keep on contracting one edge after the other (edges chosen independently and uniformly at random), we will ultimately end up with just 2 vertex sets, i.e. a cut. (This takes n-2 iterations (|V| = n.), since each contraction reduces the number of vertices by one.)

Now, what is the guarantee that the cut produced is indeed a min-cut?

Assume that the original graph has a min-cut of size, k. Let K be the set of edges corresponding to one such min-cut.

We’ll find the probability that the cut obtained in the algorithm is K. For that to happen, no edge of K must be contracted in any of the iterations. Consider the first iteration.

The min-cut size is k. Hence, the degree of each of the vertices is at least k, implying that the total number of edges is at least n*k/2. The probability that an edge of K is contracted in the first iteration is at most k/ (n*k/2) = 2/n. Hence, the probability that no edge of K is contracted is at least 1- 2/n = (n-2)/n.

Now, consider the 2nd iteration. The number of vertices is n-1. Assume that the first iteration did not contract any edge of K. Therefore, K is a cut of this contracted graph, and is also a min-cut. Again all vertices have a degree of at least k, meaning that the number of edges is at least (n-1)k/2. Similarly, the probability that no edge of K is contracted in this iteration is at least (n-3)/(n-1).

Proceeding in the same manner, we get that the probability that no edge of K is contracted in any of the iterations is at least (n-2)* (n-3) *(n-4) …. 2*1 / n*(n-1) *(n-2) …4*3 = 2 / n*(n-1) = 1/C(n,2).

Now, this is a small probability. To improve the chances of obtaining K as our min-cut, we can run the algorithm multiple times.

Running time: The algorithm can be implemented in O(n^2) steps. Now, how is that possible for a multigraph which can possibly have any unlimited number of edges. The solution is to assign a weight to one of a set of parallel edges, equal to the number of parallel edges. This way the graph has effectively at most C(n,2) edges.

For more on the min-cut problem, see Ch. 10 of Randomized Algorithms by Motwani and Raghavan.

Probabilistic Method

This is a method by which we can prove that a combinatorial object exists with non-zero probability. The interesting thing is that we need not even find any combinatorial object in order to prove its existence.

We’ll consider 2 problems here: the max-SAT and the max-cut problem.

(1) Max-cut: Find a cut in the graph with the maximum number of edges crossing the cut.

This problem is NP-hard. But we can accomplish another thing easily. We can prove that there exists a cut with at least m/2 edges crossing it. (m and n are the edges and vertices in the graph, respectively).

How do we prove that? Simple. Use the probabilistic method.

Assign each vertex of the graph, G independently and equiprobably to one of two vertex subsets. Now, consider any edge (u,v) belonging to G. What is the probability that its ends are in different subsets? Half. (Why? Because, there are 4 cases,(1) u is in subset, A and v is in subset, B; (2) u is in subset, A, and v is in subset, A; (3) u is in subset, B, and v is in subset, A; (4) u is in subset, B, and v is in subset, B. Each of these 4 events has probability 1/4. )

So now, what is the expected number of edges of G  crossing the cut? Answer: m/2 (by linearity of expectation.)

We know one simple fact: If the expected number of edges is m/2, then there exist cuts having at least m/2 edges crossing the cut. (Same as saying: if class average is 25, then  someone got at least 25).

Using the above fact, what have we just proved? There exists a cut having at least m/2 edges.

(2) Max-SAT: We have a set of m clauses, each in conjunctive normal form. What is the maximum number of clauses that can be satisfied?

We wont solve that. Instead, we’ll prove that there exists an assignment of truth values such that at least m/2 clauses are satisfied.

Proof: There are n variables in all, and m clauses. Assign truth values independently and equiprobably to each variable. Consider any one clause which has k variables. What is the probability that this clause is not satisfied? Answer: At most 2^(-k).

So, the probability that the clause is satisfied is at least 1 – 2^(-k), which is greater than or equal to 1/2.

Again, by linearity of expectation, the expected number of satisfied clauses is at least m/2.

Using the probabilistic method, we’ve proved the existence of two combinatorial objects. For more on this, read Ch. 5 of Randomized Algorithms by Motwani and Raghavan.

Shortest Paths

A path, as the name suggests connects 2 vertices in a graph. A shortest path is a path of minimum length.

If we have a particular vertex, say s, and need to find shortest paths from s to all other vertices of the graph. It is called the Single Source Shortest Path Problem (SSSP).

We may also want to find shortest paths between every possible pair of vertices in the graph. This is the All Pairs Shortest Paths Problem (APSP).

How do we find a shortest path between 2 vertices, say s and v?

Simple. Proceed in a manner that would remind us of Prim’s algorithm, which was discussed in the previous post.

We find the shortest path by sort of growing a cloud. Initially, we got only s. Now, choose that vertex which is closest to s. So now the cloud has 2 vertices, and the “diameter” of the cloud is the distance of this newly-added vertex, say x,  from s.

Next step. Now, find a vertex that is father from s than x, but is closer to s than any of the other vertices (or at least no farther than any of them). Call this vertex, y. Now, the path from s to y, will either be direct (i.e. one edge) or would be the addition of edge (x,y) to the path from s to x.

We continue in the same manner. Lets see one more step.

Suppose that the most recent vertex added to the cloud is d, and that the shortest path from s to d is s-> a -> b -> c -> d. How do we choose the next vertex to be added to the cloud?

Select the vertex that is farther from s than d (or at least no closer) and is closer to s than any other vertex outside the cloud (or at least no farther).  This vertex, say e, will have a shortest path from s, that is either a direct path (s->e), or an indirect one (s -> a -> e, or s-> a ->b ->e, or s-> a -> b ->c ->e, or s-> a -> b -> c-> d-> e).

Keep on adding vertices to the cloud until vertex v, also enters the cloud.

For the SSSP problem, keep adding vertices until the all vertices have been added.

This is known as Dijkstra’s algorithm.

It’s quite simple and also very well-known. Incidentally, it was discovered 50 years ago (1959). :)

Minimum Spanning Tree

A Minimum Spanning Tree is a tree that includes all the vertices of the graph. (Tree is a graph that is connected and has no cycles.)

So how do we find a MST of a graph? Standard textbook approach would be to use a greedy algorithm.

(Greedy algorithm: An algorithm where thinking greedily helps you to find the solution :) i.e. local optimal solutions lead to a global optimal solution.)

We’ll see three ways to find the MST.All are greedy.

1. Take the entire graph and start deleting edges one by one. Now, we have to be greedy. So, select the heaviest edge that is left in the graph, and delete it. Now, we also have to ensure that the resultant graph spans the graph. So, do not delete any edge that causes the graph to become disconnected; instead check for the next lightest edge to delete.

So, keep deleting edges starting from the heaviest to the lightest keeping in mind the above rules.

Till when do we continue this deletion??  Well, do it till you can delete no more edges without disconnecting the graph. Thats it. We have a MST.

2. Initially MST is empty. Add edges to this to get the MST. How to add edges? Again, greedily. Take the lightest edge. Add it to the MST. Now, we also need to ensure that the resultant MST remains a tree, i.e. there should be no cycles. So, while adding edges, do not add edges that cause the MST to have a cycle. Instead, go to the next heavier edge to add.

So, keep adding edges successively starting from the lightest to the heaviest, keeping in mind the above rules.

When do you stop? Simple; when addition of any more edge to the MST creates a cycle.

3. Initially the MST is empty. And again we’ll add edges greedily. Add the lightest edge to the MST. You get a connected component of 2 vertices, right? Now add the lightest edge that is incident upon this connected component. Ensure that the added edge does not create a cycle in the MST. In case the edge creates a cycle, skip adding it and proceed to the next heavier edge incident upon the connected component.

Keep adding till no more edges can be added without creating a cycle. There’s the MST.

The 2nd algorithm is called Kruskal’s algorithm, and the 3rd is called Prim’s algorithm.

Follow

Get every new post delivered to your Inbox.