Two heads are better than one, and more generally, n heads are better than n-1. In our era of computing plenty, parallel processing seems like an exciting technique for solving combinatorial optimization problems. Today many facilities contain networks of workstations, most of them idle at night and underutilized during the day. Why not put them to work?
Parallelism seems like the easy way out of hard problems. Indeed, sometimes, for some problems, parallel algorithms are the most effective solution. High-resolution, real-time graphics applications must render thirty frames per second for realistic animation. Assigning each frame to a distinct processor, or dividing each image into regions assigned to different processors might be the only way to get the job done in time. Large systems of linear equations for scientific applications are routinely solved in parallel.
However, there are several pitfalls associated with parallel algorithms that one should be aware of:
The classic example of this occurs in the minimax game-tree search algorithms used in computer chess programs. Brute-force tree search is embarrassingly easy to parallelize; just put each subtree on a different processor. However, a lot of work gets wasted because the same positions get considered on different machines. Moving from brute-force search to the more clever alpha-beta pruning algorithm can easily save over 99.99% of the work, thus dwarfing any benefits of parallel brute-force search. Alpha-beta can be parallelized, but not easily, and speedups are typically limited to a factor of six or so regardless of how many processors you have.
I recommend considering parallel processing only after repeated attempts at solving the problem sequentially prove too slow. Even then, I would restrict attention to algorithms that parallelize the problem by partitioning the input into distinct tasks, where no communication is needed between the processors, except in collecting the final results. Such large-grain, naive parallelism can be simple enough to be readily implementable and debuggable, because it really reduces to producing a good sequential implementation. Still, there can be pitfalls in this approach, as discussed in the war story below.