
Best, Worst, and Average-Case Complexity

Using the RAM model of computation, we can count how many steps our algorithm will take on any given input instance by simply executing it on the given input. However, to really understand how good or bad an algorithm is, we must know how it works over all instances.
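To make this concrete, here is a minimal sketch, not from the text: an insertion sort instrumented to count key comparisons, a simple stand-in for RAM-model steps. The choice of insertion sort and of comparisons as the unit of work are my own illustration; the section names no particular algorithm or step measure.

```python
# A minimal sketch (illustrative; the text names no particular algorithm):
# insertion sort instrumented to count key comparisons as a stand-in for
# RAM-model steps on one given input instance.
def count_comparisons(items):
    """Sort a copy of items; return the number of key comparisons made."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):       # insert a[i] into the sorted prefix a[0..i-1]
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1         # count each test of key against an element
            if a[j] > key:
                a[j + 1] = a[j]      # shift the larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

print(count_comparisons([3, 1, 4, 1, 5]))  # step count for this one instance
```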

To understand the notions of the best, worst, and average-case complexity, one must think about running an algorithm on all possible instances of data that can be fed to it. For the problem of sorting, the set of possible input instances consists of all the possible arrangements of all the possible numbers of keys. We can represent every input instance as a point on a graph, where the x-axis is the size of the problem (for sorting, the number of items to sort) and the y-axis is the number of steps taken by the algorithm on this instance. Here we assume, quite reasonably, that it doesn't matter what the values of the keys are, just how many of them there are and how they are ordered. It should not take longer to sort 1,000 English names than it does to sort 1,000 French names, for example.

Figure: Best, worst, and average-case complexity

As shown in the figure above, these points naturally align themselves into columns, because only integers represent possible input sizes. After all, it makes no sense to ask how long it takes to sort 10.57 items. Once we have these points, we can define three different functions over them:

The worst-case complexity of the algorithm is the function defined by the maximum number of steps taken on any instance of size n.

The best-case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n.

Finally, the average-case complexity of the algorithm is the function defined by the average number of steps taken over all instances of size n.
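The following sketch makes these three functions concrete, again assuming insertion sort and comparison counts as stand-ins for an algorithm and its RAM-model step count (the section prescribes neither). For each problem size n, it brute-forces every permutation of n distinct keys, producing one column of points, then takes the minimum, maximum, and mean of that column:

```python
# A minimal sketch (not from the text): best-, worst-, and average-case
# comparison counts of insertion sort, found by brute force over every
# permutation of n distinct keys -- one "column" of points for each n.
from itertools import permutations

def count_comparisons(items):
    """Sort a copy of items; return the number of key comparisons made."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:          # key belongs just after a[j]
                break
            a[j + 1] = a[j]          # shift the larger element right
            j -= 1
        a[j + 1] = key
    return comparisons

for n in range(1, 7):
    counts = [count_comparisons(p) for p in permutations(range(n))]
    print(f"n={n}: best={min(counts)} worst={max(counts)} "
          f"average={sum(counts) / len(counts):.2f}")
```

Brute-force enumeration is only feasible for tiny n (there are n! permutations), but it shows exactly how each of the three functions is read off a column of points.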

In practice, the most useful of these three measures proves to be the worst-case complexity, which many people find counterintuitive. To illustrate why worst-case analysis is important, consider trying to project what will happen to you if you bring n dollars to gamble in a casino.   The best case, that you walk out owning the place, is possible but so unlikely that you should place no credence in it. The worst case, that you lose all n dollars, is easy to calculate and distressingly likely to happen. The average case, that the typical person loses 87.32% of the money that they bring to the casino, is difficult to compute and its meaning is subject to debate. What exactly does average mean? Stupid people lose more than smart people, so are you smarter or dumber than the average person, and by how much? People who play craps lose more money than those playing the nickel slots. Card counters at blackjack do better on average than customers who accept three or more free drinks. We avoid all these complexities and obtain a very useful result by just considering the worst case.

The important thing to realize is that each of these time complexities defines a numerical function, representing time versus problem size. These functions are as well-defined as any other numerical function, be it $y = x^2 - 2x + 1$ or the price of General Motors stock as a function of time. Time complexities are complicated functions, however. In order to simplify our work with such messy functions, we will need the big Oh notation.
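For instance, the worst-case column in the sketch above fits the exact formula $T(n) = n(n-1)/2$ and the best-case column fits $T(n) = n - 1$; these are standard facts about insertion sort rather than claims from this section. Both formulas are precise but awkward to state and compare, whereas big Oh notation will let us summarize them simply as $O(n^2)$ and $O(n)$.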

