next up previous index CD Book Algorithms
Next: Lecture 2 - asymptotic Up: No Title Previous: No Title

Lecture 1 - analyzing algorithms

Listen To Part 1-7

Lecture Schedule

subject topics reading
Preliminaries Analyzing algorithms 1-32
" Asymptotic notation 32-37
" Recurrence relations 53-64
Sorting Heapsort 140-150
" Quicksort 153-167
" Linear Sorting 172-182
Searching Data structures 200-215
" Binary search trees 244-245
" Red-Black trees:insertion 262-272
`` Red-Black trees:deletion 272-277
MIDTERM 1
Comb. Search Backtracking
" Elements of dynamic programming 301-314
" Examples of dynamic programming 314-323
Graph Algorithms Data structures 465-477
for graphs
" Breadth/depth-first search 477-483
" Topological Sort/Connectivity 485-493
" Minimum Spanning Trees 498-510
" Single-source shortest paths 514-532
" All-pairs shortest paths 550-563
MIDTERM 2
Intractability P and NP 916-928
" NP-completeness 929-939
" NP-completeness proofs 939-951
" Further reductions 951-960
" Approximation algorithms 964-974
" Set cover / knapsack heuristics 974-983
FINAL EXAM

Listen To Part 1-8

What Is An Algorithm?

Algorithms are the ideas behind computer programs.  

An algorithm is the thing which stays the same whether the program is in Pascal running on a Cray in New York or is in BASIC running on a Macintosh in Kathmandu!

To be interesting, an algorithm has to solve a general, specified problem. An algorithmic problem is specified by describing the set of instances it must work on and what desired properties the output must have.  

Example: Sorting

Input: A sequence of N numbers tex2html_wrap_inline13209

Output: the permutation (reordering) of the input sequence such as tex2html_wrap_inline13211 .

We seek algorithms which are correct and efficient.

Correctness

For any algorithm, we must prove that it always returns the desired output for all legal instances of the problem.  

For sorting, this means even if (1) the input is already sorted, or (2) it contains repeated elements.

Correctness is Not Obvious!

The following problem arises often in manufacturing and transportation testing applications.

Suppose you have a robot arm equipped with a tool, say a soldering iron. To enable the robot arm to do a soldering job, we must construct an ordering of the contact points, so the robot visits (and solders) the first contact point, then visits the second point, third, and so forth until the job is done.   

Since robots are expensive, we need to find the order which minimizes the time (ie. travel distance) it takes to assemble the circuit board.

tex2html_wrap13295 tex2html_wrap13297
You are given the job to program the robot arm. Give me an algorithm to find the best tour!

Listen To Part 1-10

Nearest Neighbor Tour

A very popular solution starts at some point tex2html_wrap_inline13213 and then walks to its nearest neighbor tex2html_wrap_inline13215 first, then repeats from tex2html_wrap_inline13217 , etc. until done.  


Pick and visit an initial point tex2html_wrap_inline13219

tex2html_wrap_inline13221

i = 0

While there are still unvisited points

i = i+1

Let tex2html_wrap_inline13227 be the closest unvisited point to tex2html_wrap_inline13229

Visit tex2html_wrap_inline13231

Return to tex2html_wrap_inline13233 from tex2html_wrap_inline13235

This algorithm is simple to understand and implement and very efficient. However, it is not correct!

tex2html_wrap13299
tex2html_wrap13301
Always starting from the leftmost point or any other point will not fix the problem.

Listen To Part 1-11

Closest Pair Tour

Always walking to the closest point is too restrictive, since that point might trap us into making moves we don't want.  

Another idea would be to repeatedly connect the closest pair of points whose connection will not cause a cycle or a three-way branch to be formed, until we have a single chain with all the points in it.


Let n be the number of points in the set

tex2html_wrap_inline13237

For i=1 to n-1 do

For each pair of endpoints (x,y) of partial paths

If tex2html_wrap_inline13243 then

tex2html_wrap_inline13245 , tex2html_wrap_inline13247 , d = dist(x,y)

Connect tex2html_wrap_inline13251 by an edge

Connect the two endpoints by an edge.

Although it works correctly on the previous example, other data causes trouble:

tex2html_wrap13303 tex2html_wrap13305
This algorithm is not correct!

Listen To Part 1-12

A Correct Algorithm

We could try all possible orderings of the points, then select the ordering which minimizes the total length:  


tex2html_wrap_inline13253

For each of the n! permutations tex2html_wrap_inline13257 of the n points

If tex2html_wrap_inline13259 then

tex2html_wrap_inline13261 and tex2html_wrap_inline13263

Return tex2html_wrap_inline13265

Since all possible orderings are considered, we are guaranteed to end up with the shortest possible tour.

Because it trys all n! permutations, it is extremely slow, much too slow to use when there are more than 10-20 points.  

No efficient, correct algorithm exists for the traveling salesman problem, as we will see later.

Listen To Part 1-13

Efficiency

"Why not just use a supercomputer?"

Supercomputers are for people too rich and too stupid to design efficient algorithms!  

A faster algorithm running on a slower computer will always win for sufficiently large instances, as we shall see.

Usually, problems don't have to get that large before the faster algorithm wins.

Expressing Algorithms

We need some way to express the sequence of steps comprising an algorithm.

In order of increasing precision, we have English, pseudocode, and real programming languages. Unfortunately, ease of expression moves in the reverse order.

I prefer to describe the ideas of an algorithm in English, moving to pseudocode to clarify sufficiently tricky details of the algorithm.  

Listen To Part 1-14

The RAM Model

Algorithms are the only important, durable, and original part of computer science because they can be studied in a machine and language independent way.

The reason is that we will do all our design and analysis for the RAM model of computation:   

We measure the run time of an algorithm by counting the number of steps.

This model is useful and accurate in the same sense as the flat-earth model (which is useful)!  

Listen To Part 1-15

Best, Worst, and Average-Case

The worst case complexity of the algorithm is the function defined by the maximum number of steps taken on any instance of size n.  

tex2html_wrap13307
The best case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n.  

The average-case complexity of the algorithm is the function defined by an average number of steps taken on any instance of size n.  

Each of these complexities defines a numerical function - time vs. size!

Insertion Sort

One way to sort an array of n elements is to start with tex2html_wrap_inline13269 empty list, then successively insert new elements in the proper position:  

displaymath13203

At each stage, the inserted element leaves a sorted list, and after n insertions contains exactly the right elements. Thus the algorithm must be correct.

But how efficient is it?

Note that the run time changes with the permutation instance! (even for a fixed size problem)

How does insertion sort do on sorted permutations?

How about unsorted permutations?

Exact Analysis of Insertion Sort

Count the number of times each line of pseudocode will be executed.

Line InsertionSort(A) #Inst. #Exec.
1 for j:=2 to len. of A do c1 n
2 key:=A[j] c2 n-1
3 /* put A[j] into A[1..j-1] */ c3=0 /
4 i:=j-1 c4 n-1
5 while tex2html_wrap_inline13271 do c5 tj
6 A[i+1]:= A[i] c6
7 i := i-1 c7
8 A[i+1]:=key c8 n-1

The for statement is executed (n-1)+1 times (why?)

Within the for statement, "key:=A[j]" is executed n-1 times.

Steps 5, 6, 7 are harder to count.

Let tex2html_wrap_inline13275 the number of elements that have to be slide right to insert the jth item.

Step 5 is executed tex2html_wrap_inline13277 times.

Step 6 is tex2html_wrap_inline13279 .

Add up the executed instructions for all pseudocode lines to get the run-time of the algorithm:

tex2html_wrap_inline13281 tex2html_wrap_inline13283 tex2html_wrap_inline13285 tex2html_wrap_inline13287

What are the tex2html_wrap_inline13289 ? They depend on the particular input.

Best Case

If it's already sorted, all tex2html_wrap_inline13291 's are 1.

Hence, the best case time is

displaymath13204

where C and D are constants.

Worst Case

If the input is sorted in descending order, we will have to slide all of the already-sorted elements, so tex2html_wrap_inline13293 , and step 5 is executed

displaymath13205


next up previous index CD Book Algorithms
Next: Lecture 2 - asymptotic Up: No Title Previous: No Title

Algorithms
Mon Jun 2 09:21:39 EDT 1997