Input description: A directed graph G=(V,E).
Problem description: For transitive closure, construct a graph G'=(V,E') with edge iff there is a directed path from i to j in G. For transitive reduction, construct a small graph G'=(V,E') with a directed path from i to j in G' iff .
Discussion: Transitive closure can be thought of as establishing a data structure that makes it possible to solve reachability questions (can I get to x from y?) efficiently. After the preprocessing of constructing the transitive closure, all reachability queries can be answered in constant time by simply reporting a matrix entry. Transitive closure is fundamental in propagating the consequences of modified attributes of a graph G. For example, consider the graph underlying any spreadsheet model, where the vertices are cells and there is an edge from cell i to cell j if the result of cell j depends on cell i. When the value of a given cell is modified, the values of all reachable cells must also be updated. The identity of these cells is revealed by the transitive closure of G. Many database problems reduce to computing transitive closures, for analogous reasons.
There are three basic algorithms for computing transitive closure:
Another related algorithm, discussed in the references, runs in the same time as matrix multiplication. You might conceivably win for large n by using Strassen's fast matrix multiplication algorithm, although I for one wouldn't bother trying. Since transitive closure is provably as hard as matrix multiplication, there is little hope for a significantly faster algorithm.
Transitive reduction (also known as minimum equivalent digraph) is essentially the inverse operation of transitive closure, namely reducing the number of edges while maintaining identical reachability properties. The transitive closure of G is identical to the transitive closure of the transitive reduction of G. The primary application of transitive reduction is space minimization, by eliminating redundant edges from G that do not effect reachability. Transitive reduction also arises in graph drawing, where it is important to eliminate as many unnecessary edges as possible in order to reduce the visual clutter.
Although the transitive closure of G is uniquely defined, a graph may have many different transitive reductions, including G itself. We want the smallest such reduction, but there can be multiple formulations of the problem:
One catch with this heuristic is that it can add edges to the transitive reduction of G that are not in G. This may or may not be a problem for your application.
A heuristic for finding such a transitive reduction is to consider each edge successively and delete it if its removal does not change the transitive reduction. Implementing this efficiently means minimizing the time spent on reachability tests. Observe that directed edge (i,j) can be eliminated whenever there is another path from i to j avoiding this edge.
Implementations: LEDA (see Section ) provides an implementation of transitive closure in C++ using O(n m) time [GK79].
Combinatorica [Ski90] provides Mathematica implementations of transitive closure and reduction, as well as the display of partial orders requiring transitive reduction. See Section .
Notes: Van Leeuwen [vL90a] provides an excellent survey on transitive closure and reduction, including 33 references. Good expositions of Warshall's algorithm [War62] include [Baa88, CLR90, Man89]. The equivalence between matrix multiplication and transitive closure was proven by Fischer and Meyer [FM71], with good expositions including [AHU74].
The equivalence between transitive closure and reduction, as well as the reduction algorithm, was established in [AGU72]. Empirical studies of transitive closure algorithms include [SD75].
Estimating the size of the transitive closure is important in database query optimization. A linear-time algorithm for estimating the size of the closure is given by Cohen [Coh94].
Related Problems: Connected components (see page ), shortest path (see page ).