In this post, we shall consider an NP-complete problem, namely that of determining whether a simple path of length k exists in a given graph, G = (V,E).
Formally,
LONGEST PATH PROBLEM
Input: undirected graph, G = (V,E), integer
.
Parameter: k
To find: Does G contain a simple path of length, k?
(Note: length refers to edge length here. In particular, a simple path of length k has k edges.)
The problem has a natural extension:-
EXTENDED LONGEST PATH PROBLEM
Input: undirected graph, G = (V,E), integer
.
Parameter: k
To find: All pairs of vertices
, such that there exists a simple path of length k between i and j in G.
=======
This article is based on the seminal paper on Color-Coding by Noga Alon, Raphael Yuster, Uri Zwick [1995]. The method of random orientations is mentioned in the paper prior to a description of the color-coding approach.
=======
We concern ourselves with the Extended Longest Path problem.
Consider the adjacency matrix, A of input graph, G. (A is an
matrix, with 0 / 1 entries, which we interpret as integers.)
We have,
iff there exists an edge
.
(Note:
represents the entry at the
row and
column of matrix A.)
Consider the matrix,
.
iff there exists a path of length two between i and j in G. (Why? Verify for yourself.)
Generalizing, we can make the following statement for the matrix
.
iff there exists a path of length k between i and j in G. (Why? use induction to prove this.)
Question: Do all paths represented in the matrix entries correspond to simple paths of G?
In other words, is the following statement true:
iff there exists a simple path of length k between i and j in G.
–
By a little observation, it can be seen that for the matrix entry
to be non-zero, the path between vertices i and j need not necessarily be simple. In other words, the path may contain cycles.
Hence, we can not infer from the fact of having
, the conclusion that there exists a simple path of length k between i and j in G.
—
How do we fix this problem?
Consider a graph, H that does not have any cycles (i.e. acyclic graph). Clearly, all paths in H are necessarily simple. (For a path to not be simple, the path needs to contain a cycle, which is precluded by the fact of H being acyclic.)
Let B denote the adjacency matrix of acyclic graph, H.
We can make the following statements:-
iff there exists a path of length k between i and j in H.
And since all paths of H are necessarily simple, the above statement is equivalent to:-
iff there exists a simple path of length k between i and j in H.
Therefore, if the input graph is acyclic, raising the adjacency matrix to the power k solves the Extended Longest Path Problem.
—-
Question: How do we obtain an acyclic graph, H from the given input graph, G?
Prior to answering this question, we shall digress briefly to the look at the notion of a Directed Acyclic Graph (abbreviated as DAG).
—
Consider an undirected graph, G = (V,E) on
vertices. We shall construct a directed acyclic version of G.
Let
be a permutation of the vertices i.e.
, (
is one-one and onto, i.e. bijective.)
Call the directed version of G as
. The vertex set of
is V (same as G). For every edge of G, we have a corresponding directed edge in
.
For each edge
of G, add edge
to
iff
.
(i.e. Orient every edge of G from the vertex having lower
value to the vertex having greater
value. )
Claim 1.
is acyclic.
Proof. Assume to the contrary that
has a cycle,
.
Hence, we have the following
inequalities:-


…

.
From the first
inequalities, we obtain:-
, which is contrary to the
inequality.
Hence,
is acyclic. QED.
==
Claim 2 (Converse of Claim 1.). Given a Directed Acyclic Graph, G = (V,E), we can obtain a bijection
, such that for every edge
, the following condition holds:
.
Proof-sketch.
Given any directed acyclic graph, G = (V,E), we have:-
such that
.
(In other words, we can find a vertex in V, which has no incoming edges.)
Assign
for one such vertex, v.
Remove
from
to get a new directed acyclic graph. Find a vertex, u that satisfies the above condition in this new graph, assign
.
Proceed in this manner until all vertices are assigned a value under the function,
.
End of proof-sketch.
===
We can now return to the question raised prior to the discussion on directed acyclic graphs.
To restate:
Question: How do we obtain an acyclic graph, H from the given input graph, G?
Solution: Take a random (What does “random” mean?) permutation
. Create an directed acyclic version of G as outlined in the preceding discussion. Call this graph,
.
(By “random”, we mean a permutation selected uniformly at random from the set of n! possible permutations of the vertex set, V.)
Denote the adjacency matrix of
by B.
As seen earlier, we have:-
iff there exists a simple path in between i and j in $latex G_{dir} $ of length k.
Since every simple path in
is also a simple path in G (Why is this true?), we have:-
If
, then
a simple path between i and j in G of length k.
Question: Is the converse of this above statement true?
In other words, is the following statement true:-
If there exists a simple path between i and j in G of length k, then
.
Answer: No.
The above question can also be rephrased as:
Are all simple paths in G carried over into
?
(Take a simple path, p in G, and figure out for yourself how the edges of p need to be oriented for path, p to exist in
. In particular, what does this imply about the
values of the vertices along the path?)
==
Now, we can ask the question:
Question: What is the probability that a particular simple path,
of length k in G gets preserved in
given that we take a random permutation,
to obtain
from
?
Solution: There are only two ways in which path
can get preserved in
-
either as the directed path,
,
or as the directed path,
.
Path
exists only if
.
Path
exists only if
.
Note that we are not concerned with the
values of any of the other
vertices of G.
Hence, we obtain the following:
Probability that a particular path, p in G, of length k, is preserved in
= 2 / (k+1)!.
(How do we get that?)
===
The above probability implies that to preserve any particular simple path of G (having length k), we need to take an expected number of (k+1)! / 2 random permutations.
===
Running time of the algorithm:
We can state the expected running time of the algorithm. We need to perform the following steps.
1. Given a particular random permutation,
.
Raise the adjacency matrix, B of
to the power k.
This can be done via
matrix multiplications, each of which takes
time. (Here
is the exponent for matrix multiplication
. )
2. Repeating the same for an expected number of
random permutations.
3. Keep track of of the
pairs of vertices, for each of which we need to output a boolean Yes or No value (corresponding to whether or not a simple path of length k exists in G between that pair of vertices.) (Note: n = |V|).
Initialize all
values of this global list to No at the beginning of the algorithm. For each permutation, if the matrix entry,
and if the value corresponding to pair (i,j) is listed as No in the global list, change it to Yes.
Hence, the expected running time of the algorithm is
.
.
(We have ignored certain polynomial (in the size of the graph) factors in the running time. That is justified owing to the “supremacy” of the (k+1)! term over other factors.)