## Software engineering interview questions – Intersection of two arrays

A few approaches are presented herein for solving the question on computing the intersection of two arrays. What follows is also attached in the pdf format here – algorithm1

## Greedy algorithms – Coin changing using minimum number of coins

We have coins of denominations 1 cent, 5 cents and 25 cents with us. What we would like to do is represent a total of $n$ cents using these coins, such that a minimum number of coins are used.

For example, if we want to represent 31 cents, we could do that by using 1 25-cents coin, 1 5-cents coin and a single coin of 1-cent, thereby utilizing a total of 3 coins. And this indeed is the bast we can do. Why is that so? Let’s look at it this way — the current solution uses one 25-cent coin, one 5-cent coin and one 1-cent coin. One way in which we can try to modify this solution would be to decide to not use any 25-cent coin at all (instead of the one coin that we are currently using). That would mean that we would have to represent the shortage of 25 cents that we are now faced with, using coins of denominations 1 and 5-cents. We can see that we would need at least five coins of denomination 5-cents in order to represent 25-cents. Hence, using a 25-cent coin was actually a wise choice enabling us to save on the number of coins used.

Now, what if we had to represent 57 cents. The best solution would be to use two coins of 25-cents, one coin of 5-cents and two coins of 1-cent.

Again, we can intuitively infer its optimality by observing that:

1. if we use lesser number of 25-cents coins than what we are currently using, we would actually end up using more coins. —(If we use just one 25-cent coin, then we would require 5 coins of 5-cents each to cover up the deficit of 25-cents. Similarly if we do not use any 25-cents coins at all, we would need 10 coins of 5-cents each to cover the deficit.)  Hence, the idea is to use as many 25-cent coins as we possibly can.

2. Similarly, once we have used the maximum number of 25-cent coins, we are left with $k$ cents (where $k < 25$) to represent using 1 and 5-cent coins. Analogous to what we observed above, we can infer that we must use as many 5-cent coins as we can in order to cover the deficit of $k$ cents. Whatever is left can be represented using 1-cent coins.

———

Hence, the algorithm would look something like this.

//Represent n cents using the least number of total coins where coins are of denominations – 1, 5 and 25-cents.

1. Divide n by 25 to get quotient q1 and remainder k1.

2. Use q1 coins of 25-cents each.

3. If k1 == 0, we are done. Else, divide k1 by 5 to get quotient q2 and remainder k2.

4. Use q2 coins of 5-cents, and k2 coins of 1-cent.

———-

Now, if we have coins at our disposal having denominations $1, k, k^2, k^3$-cents and we had to represent $n$ cents with the minimum number of coins, we could do the following:-

———–

i = 3;

num = n;

While ( exponent >= 0)

{

Divide num by k^i to get quotient q and remainder r.

Use q coins of denomination k^i cents.

If ( r == 0)

break;

num = r;

i – – ;

}

———

Now, take the following example, represent 31 cents using coins of denominations 1, 10 and 25 cents such that the least number of coins are used.

Here, the idea behind the greedy algorithm of using the maximum possible number of coins of the highest denomination would not work. That approach would get us a solution that uses 6 coins : one 25-cent coin, and 6 1-cent coins.

In contrast, we can get a better solution using 4 coins: 3 coins of 10-cents each and 1 coin of 1-cent.

In the problems presented at the beginning of this post, the greedy approach was applicable since for each denomination, the denomination just smaller than it was a perfect divisor of it. (i.e. 25 was a factor of 125; 5 was a factor of 25; 1 was a factor of 5}. However, that not being true in this case {10 is not a factor of 25} leads to the failure of the greedy approach.

## Recursion in C++ : The Coin Changing Problem

Let’s say that we have an infinite supply of pennies ( 1 cent ), nickels ( 5 cents ), dimes (10 cents ) and quarters (25 cents). (That would probably be a nice thing — having an infinite number of coins, whatever be the denomination. On a different note, the fact that this question assumes such an infinite supply of coins should convince everyone of the hypothetical nature of this and similar problems. That said, it is sometimes useful to get a fair idea of how to go about solving seemingly contrived problems in order to tackle realistic ones.)

Coming back to the problem — We buy a commodity and are required to pay $n$ cents for it. Now, the interesting bit is that we are required to pay exactly $n$ cents, i.e. we cannot pay more and get change in return. So, the question is — How many ways can we possibly pay $n$ cents using our infinite supply of coins?

The question could be rephrased thus:

Given the equation: $n = x + 5 y + 10 z + 25 w$ where $x, y, z, w$ are non-negative integers, find the total number of possible solutions to the equation.

——-

Let’s take an example to clarify things — say. n = 91.

Let $f (n)$ denote the number of ways $n$ can be represented in terms of coins having denomination 1, 5, 10 and 25.

Thus, our task is to compute $f(91)$.

The coin of the highest denomination is that of 25 cents.

We have 2 choices before us: (1) use at least one 25-cent coin, or  (2) do not use 25-cent coins at all.

$f(n)$ is the sum of the number of ways we can perform (1) and (2).

(1) If we decide to use at least one 25-cent coin:

We can see that we can use either 1, 2 or 3 such coins.

If we use 1 25-cent coin, we are left with 66 cents to be represented using coins of denominations 1, 5 and 10. So, we now see that we have a smaller subproblem — that of representing 66 cents using 1, 5 and 10 cent coins.

We digress here to make a remark about notation.

—-

Let us denote by $f(n, k)$ the total number of ways in which we can represent $n$ cents using coins of denomination $\leq k$ with the added condition that at least 1 coin of denomination $k$ is used.

Thus, we have that $f(n) = f(n, 25) + f(n, 10) + f(n, 10) + f(n, 1)$.

(The 1st term denotes the number of ways we can represent $n$ cents using 1, 5, 10 and 25-cent coins such that at least 1 25-cent coin is used. The 2nd term — f(n,10) denotes the number of ways wherein we do not use any 25-cent coins and use at least 1 10-cent coin. Similarly, we can work out the meaning of the remaining terms. It can be observed that the events represented by each of the these terms are in some sense disjoint, which is why we can simply add these terms to get our desired result.)

——–

Going back to our example, the number of ways we can represent 66 cents using coins of denomination only 1, 5 or 10 cents is equal to $f(66,10) + f(66,5) + f(66,1)$.

—-

We again digress to make another remark on notation.

We use $g(n,k)$ to denote the number of ways in which we can represent $n$ cents using coins of denominations $\leq k$.

Hence, our original task, which was to compute $f(n)$ is the same as computing $g(n, 25)$.

——–

Again, coming back to the example, we see that the number of ways to represent $66$ cents using 1, 5  and 10-cent coins is simply $g(66, 10)$.

Now, what if we decide to use 2 25-cent coins in representing 91 cents. Then, we are left with representing 41 cents using 1, 5 and 10-cent coins, which can be done in $g(41, 10)$ ways.

Similarly, we can see that we can represent 91 cents using 3 25-cent coins in $g(16, 10)$ ways.

—-

A little reflection would get us the following equation:-

$g(91, 25) = f(91,25) + f(91, 10) + f(91, 5) + f(91, 1)$

$= [ g(66,10) + g(41, 10) + g(16, 10) ] + [ g(81, 5) + g(71, 5) + ... + g(1, 5) ] + [ g (86, 1) + g(81, 1) + ... + g(1, 1) ] + [ 1 ].$

———

We can now write the method discussed above in the form of a function in C++.

——-

#include <iostream.h>

// Let’s say we have a global array A[4] = {25, 10, 5, 1}.

//func F takes a non-negative input n, and an index into the array A.

// It returns f (n, A[i]) [where the notation f(n,k) is as discussed earlier.]

int F (int n, int index) ;

// func G takes non-negative input n, and an index into the array A, and returns g(n,A[i]) as discussed above.

int G (int n, int index) // the initial call from the main ( ) function would be G(n,0).

{

int i = index;

int ans = 0;

for ( ; i < 4; i++)

ans + = F (n, i);

return ans;

}

//Now we give the implementation of func F.

int F ( int n, int index )

{

if ( n < A[index] )

return 0;

if ( index == 3)

return 1; // because f(n,1) == 1.

int ans = 0, quotient = n/ A[index];

for (int i = 1; i <= quotient; i++)

{

ans + = G (n – i*A[index], index+1)

}

return ans;

}

———

## Arrays in C++ – Finding appropriate pairs in an array – 4

APPROACH 3 contd.

Now, what if we remove the 2 assumptions we made earlier –

1. all elements in A are distinct.What if duplicates are now allowed?

2. Array A contains elements only in the range 0 – 99. What if the ints in A can have any possible value?

—–

If we remove the first restriction, we would possibly need to print a pair more than once depending on the number of times each element of the pair appears in A.

For example, if 10 appears thrice and 25 appears twice, and our target is 35, then we would need to print the 10, 25 pairs 6 times.

—–

Let’s now look at removing the 2nd restriction.

Earlier we were able to declare a lookup array L of a constant size =100 because of the simplifying assumption on the range of possible values in A.

Now, if all values are allowed, the size of the lookup table would have to be $2^32$ (assuming sizeof(int) == 4).

——–

Making these 2 modifications, we would be able to tackle the problem.

——-

There are a couple of observations on running time complexity that we would like to make here —

1. Whenever we want to decide which algorithm to go for, it is a good idea to look at both the time and space complexities of the algorithms concerned. Also, for a very large input size, the asymptotic i.e. big-O analysis provides a fairly accurate estimate of the complexity. However, for small input sizes, one should be careful to see what the hidden constants in the big-O notation are, as it is possible that for small input sizes these hidden constants may be the dominating factor in deciding the actual time it takes to run the algorithm.

2. If we are going to perform a similar type of query a number of times, then it makes sense to construct a sort of lookup table which can be used with minimal overhead for future queries. The time/ space required to construct the lookup table can be made to seem insignificant if the number of queries based on that table is large. However, if we are just going to make a single query of a particular type, then one would need to carefully decide if it is worth allocating a large amount of memory for a lookup table and using up a large amount of time to construct it, given that we would be using it only once.

————

## Arrays in C++ – Finding appropriate pairs in an array – 3

APPROACH 3

Having seen $O(n^2)$ and $O( n \log n)$ algorithms for this problem, the natural question to ask is whether we can do better than that.

Firstly, we observe that to output all pairs that sum to target, we would need to access all elements of the array. That gives a lower bound of $O(n)$ for any algorithm that we may think of.

So, the question is whether we can actually achieve that lower bound and devise an algorithm with a time complexity of $O(n)$.

——–

To simplify our task let us assume that the array A only contains integers in the range 0 – 99, and that A contains only distinct elements.

For example, A = { 10, 20, 70, 40, 65, 55} and target = 75.

Now, for element 10, we would like to see if 65 is contained in A. Since the ints are from a given fixed range : 0-99, we can construct a lookup table of size 100, containing the counts of the number of time each corresponding int appears in A.

If the lookup table is called L, L[i] would contain the count of the integer i.

Thus, when processing element 10, we would simply check if L[65] is 1. This takes O(1) time.

Similarly, we would lookup some values corresponding to each of the other elements in A. Hence, the total time complexity is $O(n)$.

Also, the space complexity is $O(1)$ since the size of L is constant.

———–

In pseudocode form, the above method would look like this:-

Create int array L of size 100. //we could as well use a boolean array, and indeed that would save memory.

Initialize L to all zeros.

For i going from 0 to n-1:

L [ A[i] ] ++; //increment count of element A[i].

//now print the pairs.

For i going from 0 to n-1:

If A[ i ] < (target – A [ i ] )// We need this to prevent a pair twice (e.g. as {10, 25} and {25,10} ).

If ( L [target – A[i] ] == 1 )

Print A[i], target – A[i].

————

We could code it in C++ like this:

#include <iostream.h>

void f (int *A, int n, int target)

{

int L [100] = {0};

for ( int i = 0; i < n ; i++)

L [ A[i] ] ++;

for ( i =0 ; i < n ; i++)

{

if ( A[i] < target – A[i])

if ( L [target – A[i] ] == 1 )

cout << A[i] << ” and ” << target – A[i] <<endl;

}

}

———–

## Arrays in C++ : Finding appropriate pairs in an array – 1

We are given an array A of ints, say of length n, and an integer target. Our task is to output all pairs of elements in A which add up to target.

For example, if target = 90, and the array A = {40, 20, 30, 50, 35, 60}, the output should be {40, 50}, {30, 60}. (The output is allowed to be in any order we like as long as all relevant pairs appear in it exactly once.)

————

APPROACH 1

A direct method could be to check all possible pairs of elements in A. For each pair that satisfies the required condition, we output that pair. For an array of length n, there are $n \choose 2$ possible pairs. The processing time for each pair is $O(1)$. Hence, the running time is $O(n^2)$. Also, the memory requirement is $O(1)$.

——

In pseudo code, we could write it as follows:

For i going from 0 to n-1

For j going from i+1 to n-1

If A[i] + A[j] == target

Print {A[i], A[j]}

——

Transcribing that in C++, we would get:-

——

#include <iostream.h>

// function f takes an int array A, length n of A, and an int target.

//It prints all pairs of elements in A that sum to target.

void f (int *A, int n, int target)

{

for ( int i = 0; i < n-1; i++)

for (int j = i+1; j < n-1; j++)

if (A[i] + A[j] == target)

cout << endl << “{ ” << A[i] << “, ” << A[j] << ” }”;

}

———–

## Dynamic Programming – 6 : Optimum sum of tree nodes

Optimum sum of tree nodes

We are given a rooted tree $T$ with root $r$. Each node $x$ has a real value $v(x)$ associated with it. We are to find a subset of nodes of the tree, such the sum of the values of the selected nodes is maximized, subject to the condition that if we choose a node $x$, we cannot choose any of its children or its parent.

Solution.

Notation. Let $OPT (x)$ denote the optimum value on a sub-tree rooted at node $x$. Let $OPT (1,x)$ denote the optimum value on a sub-tree rooted at node $x$ subject to the condition that $x$ is selected as part of the optimal subset. Let $OPT(0,x)$ denote the optimum value on a sub-tree rooted at $x$ subject to the condition that $x$ is not selected as part of the optimal subset.

Then, we have that:

$OPT(x) = max \{OPT(1,x), OPT(0,x)\}$.

$OPT(1,x) = v(x) + \sum_y OPT (0,y)$ (where the summation is over all children $y$ of $x$.)

$OPT(0,x) = \sum_y OPT (y)$ (where the summation is over all children $y$ of $x$.)

Also, if $x$ is a leaf, then $OPT(1,x) = v(x)$, and $OPT(0,x) = 0$.

Pseudocode for computing OPT(root)

We can compute OPT(root) in a bottom up fashion starting from the leaves. The recurrence equations for a node $x$ involve the OPT values for its children. Thus, before computing the $OPT$ value for node $x$, we compute the $OPT$ values for its children.

Let the nodes of the tree be arranged in levels from Level 0 till Level L such that the root is at Level 0.

For i going from L downto 0:

For all nodes x in Level i:

Compute OPT[1,x], OPT[0,x] and OPT[x] using the above recurrence.

Set Soln[x] = 1 if OPT[1,x] >= OPT [0,x]. Else, set Soln[x] = 0.

End.

—-

The above code has a running time of $O(n)$ where $n$ represents the number of nodes in the tree. [ We compute OPT values for n nodes. For each node, we compute the OPT values by summing up the OPT values over all of its children. Aggregating over all nodes, and observing the each node can have at most one parent, we find that the summing up operation costs O(n). Therefore, the total cost is O(n+n) which is O(n). Another way to look at it is that the aggregate cost of the summing up operations over all nodes is O(m) [where m represents the number of edges in the tree], which for a tree is O(n). ]

—-

Computing the optimal subset:

We do so using the Soln[x] values.

Function: Print_Solution (x)

If (Soln[x] == 1)

Print x.

For all grand-children y of x:

Print_Solution (y);

Else

For all children y of x:

Print_Solution (y);

End.

—-

Print_Solution(root) has a running time of $O(n)$.

—-