Next: War Story: Going Nowhere Up: Combinatorial Search and Heuristic Previous: War Story: Annealing Arrays

Parallel Algorithms

Two heads are better than one, and more generally, n heads are better than n-1. In our era of computing plenty, parallel processing seems like an exciting technique for solving combinatorial optimization problems. Today many facilities contain networks of workstations, most of them idle at night and underutilized during the day. Why not put them to work?

Parallelism seems like the easy way out of hard problems. Indeed, sometimes, for some problems, parallel algorithms are the most effective solution. High-resolution, real-time graphics applications must render thirty frames per second for realistic animation. Assigning each frame to a distinct processor, or dividing each image into regions assigned to different processors might be the only way to get the job done in time. Large systems of linear equations for scientific applications are routinely solved in parallel.

However, there are several pitfalls associated with parallel algorithms that one should be aware of:

There is often a small upper bound on the potential win - Suppose that you have access to twenty workstations that can be devoted exclusively to your job. Potentially, these could be used to speed up the fastest sequential program by up to a factor of twenty. That is nice, but much greater performance gains are potentially possible by finding a better sequential algorithm and implementing that. Your time spent parallelizing a code might well be better spent enhancing the sequential version. Performance-tuning tools such as profilers are better developed for sequential machines than for parallel models.
Speedup means nothing - Suppose my parallel program runs 16 times faster on a 16-processor machine then it does on one processor. That's great, isn't it? If you always get linear speedup and have an arbitrary number of processors, you will eventually beat any sequential algorithm. However, a carefully designed sequential algorithm can often beat an easily parallelized code running on a typical parallel machine. The one-processor parallel version of your algorithm is likely to be a crummy sequential algorithm, so measuring speedup typically provides an unfair test of the benefits of parallelism.
The classic example of this occurs in the minimax game-tree search algorithms used in computer chess programs. Brute-force tree search is embarrassingly easy to parallelize; just put each subtree on a different processor. However, a lot of work gets wasted because the same positions get considered on different machines. Moving from brute-force search to the more clever alpha-beta pruning algorithm can easily save over 99.99% of the work, thus dwarfing any benefits of parallel brute-force search. Alpha-beta can be parallelized, but not easily, and speedups are typically limited to a factor of six or so regardless of how many processors you have.
Parallel algorithms are tough to debug - Unless your problem can be decomposed into several independent jobs, the different processors will have to communicate with each other in order to end up with the correct final result. Unfortunately, the non-deterministic nature of this communication makes parallel programs notoriously difficult to debug. Perhaps the best example is Deep Blue, the world-champion chess computer. Although it beat Kasparov, over the years it has lost several games in embarrassing fashion due to bugs, mostly associated with its extensive parallelism.

I recommend considering parallel processing only after repeated attempts at solving the problem sequentially prove too slow. Even then, I would restrict attention to algorithms that parallelize the problem by partitioning the input into distinct tasks, where no communication is needed between the processors, except in collecting the final results. Such large-grain, naive parallelism can be simple enough to be readily implementable and debuggable, because it really reduces to producing a good sequential implementation. Still, there can be pitfalls in this approach, as discussed in the war story below.

Next: War Story: Going Nowhere Up: Combinatorial Search and Heuristic Previous: War Story: Annealing Arrays

Algorithms
Mon Jun 2 23:33:50 EDT 1997