Data Structures for Graphs

Selecting the right data structure to represent graphs can have an enormous impact on the performance of an algorithm. Your two basic choices are adjacency matrices and adjacency lists, illustrated in Figure

An adjacency matrix is an

matrix M where (typically) M[i,j] = 1 if there is an edge from vertex i to vertex j and M[i,j]=0 if there is not. Adjacency matrices are the simplest way to represent graphs. However, they doom you to using

space no matter how many edges are in the graph. For large graphs, this will kill you. Remember that

1,000,000, and work up from there. Although there is some potential for saving space by packing multiple bits per word or simulating a triangular matrix for undirected graphs, these cost some of the simplicity that makes adjacency matrices so appealing.

Beyond simplicity, there are certain algorithmic operations that prove faster on adjacency matrices than adjacency lists. In particular, it takes

time to test whether edge (i,j) is in a graph represented by an adjacency matrix. All we must do is read the appropriate bit.

An adjacency list consists of an n-element array of pointers, where the ith element points to a linked list of the edges incident on vertex i. To test whether edge (i,j) is in the graph, we search the ith list for j. This takes

, where

is the degree of the ith vertex. For a complete or almost complete graph,

, so testing the existence of an edge can be very expensive relative to adjacency matrices. However,

can be much less than n when the graph is sparse. Most of the graphs that one encounters in real life tend to be sparse. Recall the friendship graph as an example. Further, a surprising number of the most efficient graph algorithms can be and have been designed to avoid such edge-existence queries. The key is processing the edges in a systematic order like breadth-first or depth-first search.

For most applications, adjacency lists are the right way to go. The main drawback is the complexity of dealing with linked list structures. Things can be made arbitrarily hairy by adding extra pointers for special purposes. For example, the two versions of each edge in an undirected graph, (i,j) and (j,i), can be linked together by a pointer to facilitate deletions. Also, depending upon the operations you will perform on each list, you may or may not want it to be doubly linked, so that you can move backwards as easily as you move forwards.

It is a good idea to use a well-designed graph data type as a model for building your own, or even better as the foundation for your application. We recommend LEDA (see Section

) as the best-designed general-purpose graph data structure currently available. It may be more powerful (and hence somewhat slower/larger) than what you need, but it does so many things right that you are likely to lose most of the potential do-it-yourself benefits through clumsiness.

In summary, we have the following tradeoffs between adjacency lists and matrices:

Comparison Winner

Faster to test if (x, y) is in graph? adjacency matrices

Faster to find the degree of a vertex? adjacency lists

Less memory on small graphs? adjacency lists (m+n) vs.

Less memory on big graphs? adjacency matrices (a small win)

Edge insertion or deletion? adjacency matrices O(1) vs. O(d)

Faster to traverse the graph? adjacency lists vs.

Better for most problems? adjacency lists

Comparison	Winner
Faster to test if (x, y) is in graph?	adjacency matrices
Faster to find the degree of a vertex?	adjacency lists
Less memory on small graphs?	adjacency lists (m+n) vs.
Less memory on big graphs?	adjacency matrices (a small win)
Edge insertion or deletion?	adjacency matrices O(1) vs. O(d)
Faster to traverse the graph?	adjacency lists vs.
Better for most problems?	adjacency lists