# ANALYSIS OF ALGORITHMS CS 325

This 171-page set of Class Notes was uploaded by Mrs. Maximo Lueilwitz on Monday, October 19, 2015. The Class Notes belong to CS 325 at Oregon State University, taught by Staff in Fall. Since its upload, it has received 225 views. For similar materials see /class/224509/cs-325-oregon-state-university in Computer Science at Oregon State University.


**A Brief Introduction to Algorithms**

Paul Cull
Department of Computer Science
Oregon State University
September 25, 2005

## Contents

- 1 INTRODUCTION
  - 1.1 What Is An Algorithm?
  - 1.2 What Is Analysis Of Algorithms?
    - 1.2.1 Proofs of correctness
    - 1.2.2 Comparing Algorithms
    - 1.2.3 Best Algorithms
- 2 TOWERS OF HANOI: AN ANALYSIS OF A PROBLEM
  - 2.1 The Towers of Hanoi Problem
  - 2.2 A Recursive Algorithm
  - 2.3 HANOI is correct
  - 2.4 HANOI makes the fewest possible moves
    - 2.4.1 Bottleneck
  - 2.5 Time and space used
  - 2.6 Improved Algorithms
- 3 SOME INDUCTIONS
  - 3.1 What Is Induction?
  - 3.2 Inductions for easy Sums
  - 3.3 Euclid's Algorithm
- 4 DIVIDE-AND-CONQUER
  - 4.1 What is Divide-and-Conquer?
  - 4.2 The General Form for Divide-and-Conquer Algorithms
  - 4.3 Divide-and-Conquer Difference Equations
  - 4.4 Some Divide-and-Conquer Examples
    - 4.4.1 Mergesort
    - 4.4.2 Polynomial Multiplication
    - 4.4.3 Matrix Multiplication
    - 4.4.4 Solving a Linear System of Equations
    - 4.4.5 Newton's Method
  - 4.5 Polynomial Multiplication and Fast Fourier Transform
  - 4.6 How Practical Are Divide-and-Conquer Algorithms?
- 5 AVERAGE CASE
  - 5.1 What is Average Case?
  - 5.2 Some Examples of Average Case Behavior
- 6 LOWER BOUNDS
  - 6.1 Trivial Lower Bounds
  - 6.2 Lower Bounds on Specific Operations
  - 6.3 Lower Bounds on Sorting
- 7 EXHAUSTIVE SEARCH
  - 7.1 Straightforward Exhaustive Search
  - 7.2 Backtracking
  - 7.3 Knight's Tour
  - 7.4 The n-Queens Problem
- 8 HARD PROBLEMS
  - 8.1 Classification of Algorithms and Problems
  - 8.2 Nondeterministic Algorithms
  - 8.3 NP and Reducibility
  - 8.4 NP-Complete Problems
  - 8.5 Dealing with Hard Problems
  - 8.6 Approximation Algorithms
  - 8.7 The World of NP
  - 8.8 Exercises
- 9 DIFFERENCE EQUATIONS
  - 9.1 First Order Equations
  - 9.2 Uniqueness Theorem
  - 9.3 kth Order Equations
  - 9.4 Non-Negative Difference Equations
    - 9.4.1 Homogeneous Nonnegative Difference Equations
    - 9.4.2 Equations with Forcing Functions
  - 9.5 Homogeneous, Periodic, and Primitive
  - 9.6 Summary: Nonhomogeneous Nonnegative Equations
- 10 Worked Examples
  - 10.1 All Simple Roots
  - 10.2 One Multiple Root
  - 10.3 One Multiple Root, Several Simple Roots
  - 10.4 The Input is p1(n) + p2(n)
- 11 Further Reading
  - 11.1 Books
  - 11.2 Articles

## Chapter 1. INTRODUCTION

### 1.1 What Is An Algorithm?

An
algorithm is a procedure which solves a problem and is suitable for implementation as a program for a digital computer. This informal definition makes two important points. First, an algorithm solves a problem. A computer program may not solve a problem; for example, there are computer programs which never terminate. It would be very difficult to say that such a program does anything, let alone solves a problem. Second, each step of the algorithm should be well defined and be representable, at least in principle, by a program. For example, "s := P/q" would not be well defined if q could be 0. "Find the smallest x such that P(x) is true" would not be well defined if no such x existed, or if x were allowed to range over the positive and negative integers and P(x) were true for all negative x. "Add a dash of salt" would not be an acceptable instruction in a computer program unless one had created a model world and a mapping which would translate the instruction into the modification of some location in a computer's memory.

For our purposes, and for many purposes, the above informal definition of algorithm is sufficient, but we should point out that a formal definition of algorithm can be created. This has been done by Turing, Markov, and others (Rogers, 1967). The satisfying thing about all these formal definitions is that they define the same class of algorithms: if we have an algorithm in one of these senses, then there is an equivalent algorithm in any one of the other senses. Further, the existence of a formal definition means that if we run into difficulties with the informal definition, we could use the formal definition to unravel the difficulty.

### 1.2 What Is Analysis Of Algorithms?

The task of analysis of algorithms is threefold:

1. To produce provably correct algorithms, that is, algorithms which not only solve the problem they are designed to solve, but which also can be demonstrated to solve the problem.
2. To compare algorithms for a problem with respect to various measures of resources, e.g. time and space, so
that one can say when one algorithm is better than another.
3. To find, if possible, the best algorithm for a problem with respect to a particular measure of resource usage.

Each of these tasks will be examined in more detail later, but first I want to say a few words about where analysis of algorithms fits in the worlds of mathematics and computer science. Mathematics has two great branches: pure mathematics and applied mathematics. Mathematics also has two major techniques: proof and computation. It would be tempting to suggest that pure mathematics uses proof and applied mathematics uses computation, but both pure and applied mathematics use both proof and computation. The difference between the two branches is the world they work in. Pure mathematics works in an abstract world; it needs no reference to a real world. Applied mathematics, on the other hand, assumes there is a real world, and then constructs an abstract world. The mapping, or correspondence, between entities of the real and abstract worlds forms an essential part of applied mathematics.

Where does computer science fit in? Computer science is the science which deals with computation and computing devices. In its theoretical branches, computer science uses the mathematical techniques of proof and computation, while in its more practical branches, computer science uses engineering techniques and experimental techniques. Analysis of algorithms, like applied mathematics, assumes a real world. This real world contains actual computers and actual computer programs. From this real world, an abstract world of abstract computers and abstract programs is constructed. This construction is usually informal, and exact definitions of abstract entities are often not stated, but a number of real-world limitations disappear in the abstract world. For example, real computers have a fixed finite memory size, and they have an upper bound on the size of numbers which can be represented. In the abstract world these limitations do not exist: finite but unbounded memories
are assumed to exist, and no bound on the size of numbers exists. Again like applied mathematics, analysis of algorithms uses the techniques of proof and computation to deal with the entities in its abstract world, and, like applied mathematics, one must be cautious in applying results from the abstract world to the real world. There are examples of algorithms which would work very quickly if arbitrarily large numbers could be used, but implementing these algorithms on real computers results in algorithms which are much slower. As another example, there are algorithms which can be shown to be quicker than other algorithms, but only if the input is astronomically large. These examples make perfect sense in the abstract world but have little or no relevance for the real world.

Analysis of algorithms is the applied mathematics of computer science. Whether it should be called mathematics or computer science usually depends on who is doing it. If a mathematician does analysis of algorithms, it may be called mathematics, but if a computer scientist does analysis of algorithms, it is usually called computer science.

#### 1.2.1 Proofs of correctness

The first of the three tasks of analysis of algorithms is to produce provably correct algorithms. This task suggests that we need a methodology both to produce algorithms and to produce proofs of the correctness of these algorithms. Later we will consider two design strategies:

(a) to solve a problem, break it into smaller problems of the same kind;
(b) to solve a problem, search an answer space until you find the answer to the problem you have.

Proving correctness of algorithms based on a search strategy is usually relatively easy: show that the answer to your problem is in the space, and show that your algorithm searches the whole space, or that, if your algorithm decides not to search a portion of the space, then your algorithm has established that your answer cannot lie in the portion not to be searched.

Proving correctness of algorithms based on breaking a problem
into smaller problems can most readily be done by using mathematical induction. You establish that your algorithm correctly solves all small enough cases of the problem. Your algorithm will usually have a section which deals with these small cases, but, particularly when the small cases may be of size 0 and no action by the algorithm is required, the section for small cases may not exist. The other section of your algorithm will deal with larger cases by breaking them into smaller cases and combining the solutions to the smaller cases to give the solution to the larger problem. You then have to show that if the smaller problems are solved correctly, then your algorithm combines these solutions correctly to yield the solution to the larger problem.

The proof procedure just outlined assumes that your algorithm is recursive, that is, to solve large problems the algorithm calls itself to solve smaller problems; but proofs by mathematical induction are not limited to recursive algorithms. Inductive proofs are also natural when dealing with iterative algorithms which are based on loops, for example WHILE loops, FOR loops, DO loops, or REPEAT-UNTIL loops. Inductive proofs of correctness of loop programs involve the creation of a *loop invariant*, a truth-valued function which is true on each iteration of the loop and which states that the loop has completed the desired computation when the loop is exited. Such a proof also requires one to show that the loop does in fact terminate. When there are several loops, the correctness of each loop must be established, and then it must be shown that the correctness of each loop implies the correctness of the whole algorithm. The most difficult part of such proofs is usually the identification of the loop invariants. Without a knowledge of how and why the algorithm was designed, such proofs are nearly impossible. Some programming languages, like EUCLID, now allow the specification of the loop invariants and other propositions, so that, at least in
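As a concrete illustration of a loop invariant (a hypothetical example of mine, not from the text), consider an iterative procedure that sums the first n positive integers. The invariant, total = i(i+1)/2, holds on every iteration, and at exit it states that the loop has computed the desired sum:

```python
def sum_first(n):
    """Sum 1 + 2 + ... + n with a WHILE loop, asserting a loop invariant."""
    total, i = 0, 0
    while i < n:
        # Loop invariant: total holds the sum of the first i positive
        # integers, i.e. total == i * (i + 1) / 2, on every iteration.
        assert total == i * (i + 1) // 2
        i += 1
        total += i
    # The loop terminates because i increases by 1 toward n.  On exit
    # i == n, so the invariant says total == n * (n + 1) / 2.
    assert total == n * (n + 1) // 2
    return total
```

Proving the invariant by induction on the number of iterations, together with the termination argument, is exactly the proof pattern described above.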
principle, it would be possible to mechanically establish the correctness of a fully annotated program.

While it is often easier to establish the correctness of a recursive algorithm, the overwhelming majority of programs are iterative rather than recursive. Why is this true? One reason is that a number of common programming languages, like FORTRAN, BASIC, and COBOL, do not permit recursion. Another reason is that many programmers believe that recursive programs are always slower than iterative programs. While this may have been true in languages like ALGOL and PL/1, and when recursive programs in ALGOL were compared to iterative programs in FORTRAN, the speed advantage of iterative over recursive programs has disappeared in languages like PASCAL and C.

#### 1.2.2 Comparing Algorithms

For any solvable problem there will be an infinity of algorithms which solve the problem. How do we decide which is the best algorithm? There are a number of possible ways to compare algorithms. We will concentrate on two measures: time and space.

We would like to say that one algorithm is faster (uses less time) than another algorithm if, when we run the two algorithms on a computer, the faster one will finish first. Unfortunately, to make this a fair test, we would have to keep a number of conditions constant. For example, we would have to code the two algorithms in the same programming language, compile the two programs using the same compiler, run the two programs under the same operating system on the same computer, and have no interference with either program while it is running. Even if we could practically satisfy all these conditions, we might be chagrined to find that algorithm A is faster under conditions C but that algorithm B is faster under conditions D. To avoid this unhappy situation, we will only calculate time *to order*. We let n be some measure of the size of the problem and give the running time as a function of n. For example, we could use the number of digits as the measure of problem size if the
problem is the addition or multiplication of two integers; we could use the number of elements if the problem is to sort several elements; we could use the number of edges or the number of vertices if the problem is to determine if a graph has a certain property.

We do not distinguish running times of the same order. For our purposes, two functions f(n) and g(n) have the same order if, for some N, there are two positive constants c1 and c2 so that

c1 |g(n)| ≤ |f(n)| ≤ c2 |g(n)| for all n ≥ N.

We symbolize this relation by f(n) = Θ(g(n)), read "f(n) is order g(n)" or "f(n) is Big Theta of g(n)". Thus we will consider two algorithms to take the same time if their running times have the same order. In particular, we do not distinguish between algorithms whose running times are constant multiples of one another.

If we find that algorithm A has a time order which is strictly less than algorithm B, then we can be confident that for any large enough problem, algorithm A will run faster than algorithm B, regardless of the actual conditions. On the other hand, if algorithms A and B have the same time order, then we will not predict which one will be faster under a given set of actual conditions.

The space used by an algorithm is the number of bits the algorithm uses to store and manipulate data. We expect the space to be an increasing function of n, the size of the problem. This space measurement ignores the number of bits used to specify the algorithm, which has a fixed constant size independent of the size of the problem. Since we have chosen bits as our unit, we can be more exact about space than we can be about time. We can distinguish an algorithm which uses 3n bits from an algorithm which uses 2n bits. But we will not distinguish an algorithm which uses 3n - 1 bits from an algorithm which uses 3n + 1 bits, because we can hide a constant number of bits within the algorithm itself.

#### 1.2.3 Best Algorithms

The third task of analysis of algorithms is to find the best algorithm with respect to a particular measure of resource usage. This involves proving
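The definition of Θ can be checked numerically on a small example (an illustration of mine, not from the text): f(n) = 3n^2 + 5n is Θ(n^2), with witnesses c1 = 3, c2 = 4, and N = 5.

```python
def f(n):
    return 3 * n * n + 5 * n   # running time to be classified

def g(n):
    return n * n               # candidate order

# Witnesses for f(n) = Theta(g(n)): c1 = 3, c2 = 4, N = 5.
# 3*g(n) <= f(n) for all n, and f(n) <= 4*g(n) once 5n <= n^2, i.e. n >= 5.
assert all(3 * g(n) <= f(n) <= 4 * g(n) for n in range(5, 5000))
```

Of course a finite check is not a proof; the inequality 5n ≤ n^2 for all n ≥ 5 supplies the actual argument, and the threshold N matters (the upper bound fails at n = 4).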
"lower bounds," that is, showing that every algorithm which solves the problem must use at least so much of the particular resource. To establish a best algorithm, one must have both a proof of a lower bound and an algorithm which uses no more than this lower bound.

Here a distinction should be made between bounds for an algorithm and bounds for a problem. If one establishes an upper bound on a particular resource used by an algorithm for a problem, then one has an upper bound both for the algorithm and for the problem. If one establishes a lower bound for a problem, then one has a lower bound for all algorithms which solve the problem. But demonstrating a lower bound for one algorithm for a problem does not establish a lower bound for the problem.

## Chapter 2. TOWERS OF HANOI: AN ANALYSIS OF A PROBLEM

### 2.1 The Towers of Hanoi Problem

In this section we will demonstrate the threefold task of analysis of algorithms using the Towers of Hanoi problem. We have chosen the Towers of Hanoi problem because each of the tasks can be demonstrated rather easily, and a best algorithm can be discovered. Material in this section is based on Cull and Ecklund (1985).

In the Towers of Hanoi problem, one is given three towers, usually called A, B, and C, and n disks of different sizes. Initially the disks are stacked on tower A in order of size, with disk n (the largest) on the bottom and disk 1 (the smallest) on the top. The problem is to move the stack of disks to tower C, moving the disks one at a time, in such a way that a larger disk is never stacked on top of a smaller disk. An extra constraint is that the sequence of moves should be as short as possible. An algorithm solves the Towers of Hanoi problem if, when the algorithm is given as input n (the number of disks) and the names of the towers, the algorithm produces the shortest sequence of moves which conforms to the above rules.

### 2.2 A Recursive Algorithm

The road to a best algorithm starts with some algorithm, which one then attempts to improve. One often uses
some sort of strategy to create an algorithm. A very useful strategy is to look at the problem and see if the solution can be expressed in terms of the solutions of several problems of the same kind but of smaller size. This strategy is usually called divide-and-conquer. If the problem yields to the divide-and-conquer approach, one can construct a recursive algorithm which solves the problem. This construction also gives, almost immediately, an inductive proof that the algorithm is correct. Time and space analyses of a divide-and-conquer algorithm are often straightforward, since the algorithm directly gives difference equations for time and space usage. While these divide-and-conquer algorithms have many nice properties, they may not use minimal time and space. They may, however, serve as a starting point for constructing more efficient algorithms.

Consideration of the Towers of Hanoi problem leads to the key observation that moving the largest disk requires that all of the other disks are out of the way. Hence the n - 1 smaller disks should be moved to tower B; but this is just another Towers of Hanoi problem, with fewer disks. After the largest disk has been moved, the n - 1 smaller disks can be moved from B to C; again, this is a smaller Towers of Hanoi problem. These observations lead to the following recursive algorithm:

```
PROCEDURE HANOI(A, B, C, n)
    IF n = 1
        THEN move the top disk from tower A to tower C
    ELSE
        HANOI(A, C, B, n - 1)
        move the top disk from tower A to tower C
        HANOI(B, A, C, n - 1)
```

Is this the best algorithm for the problem? We will show that this algorithm has minimal time complexity but does not have minimal space complexity. First, though, we prove that the algorithm correctly solves the problem, the first task of analysis of algorithms as outlined in the Introduction.

### 2.3 HANOI is correct

**Proposition 1.** The recursive algorithm HANOI correctly solves the Towers of Hanoi problem.

In the following proof we will use the two rules for this puzzle:

1. Only one disk is moved at a time.
2. A larger disk is never placed
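The recursive procedure transcribes directly into Python. Here the moves are collected in a list rather than printed, a convenience I have added for checking; it is not part of the pseudocode.

```python
def hanoi(a, b, c, n, moves=None):
    """Move n disks from tower a to tower c, with b as the spare tower,
    recording each move as a (source, destination) pair."""
    if moves is None:
        moves = []
    if n == 1:
        moves.append((a, c))            # base case: move the one disk
    else:
        hanoi(a, c, b, n - 1, moves)    # clear n-1 disks onto the spare
        moves.append((a, c))            # move the largest disk
        hanoi(b, a, c, n - 1, moves)    # restack the n-1 disks onto c
    return moves

# Three disks take 7 moves, the first being A -> C.
print(hanoi("A", "B", "C", 3))
```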
on top of a smaller disk.

For the algorithm to be correct, it must obey these two rules as well as accomplish its task of moving n disks from one tower to another.

The proof is by induction (see Chapter 3) on the number of disks. To apply induction, we first need an inductive hypothesis. Let us call this hypothesis HYP(n) and define it as follows:

HYP(n): "HANOI(A, B, C, n) correctly moves the n disks 1, 2, ..., n from tower A to tower C."

Here, as stated above, "correctly" means that rules 1 and 2 are obeyed. The proof has two parts: a base case (here we will use n = 1 as the base case) and an induction step, showing that if the algorithm HANOI works correctly for n - 1 disks, then it works correctly for n disks. A minor but significant point is that the induction goes backwards; that is, we consider what the algorithm will do with n disks, and, following the steps, we find that the algorithm deals with n - 1 disks first, and we can assume by HYP(n - 1) that the algorithm correctly (obeying rules 1 and 2) moves n - 1 disks. Notice further that HYP(n) is a truth-valued function: HYP(n) can only have the value TRUE or the value FALSE. HYP(n) doesn't *do* anything; instead it simply observes whether or not HANOI does its job. Now to the proof.

BASE CASE. We will show that HYP(1) is true. From the definition of HYP, HYP(1) says that HANOI(A, B, C, 1) correctly moves disk 1 from tower A to tower C. Now we look at the algorithm HANOI with the inputs A, B, C, 1. The IF condition "n = 1" is TRUE, so the algorithm executes the THEN part and does MOVE(A, C). Assuming that MOVE(A, C) moves the top disk off tower A and puts this disk on top of tower C, and assuming that the top disk on tower A was disk 1, then HANOI(A, B, C, 1) has moved disk 1 from tower A to tower C; it has obeyed rule 1, because only one disk was moved, and it has obeyed rule 2, because there is no disk smaller than disk 1. Luckily, at this point HANOI runs out of instructions, and so it terminates. Putting these observations about HANOI(A, B, C, 1) together, we conclude that HYP(1) is true.

INDUCTIVE STEP. The base case was the
easy part. Next we want to show that for each and every n > 1, HYP(n - 1) implies HYP(n). This implication is a little trickier, because we use the assumption that HYP(n - 1) is TRUE on two separate occasions.

Consider the algorithm HANOI with the inputs A, B, C, n, where n > 1. The IF condition "n = 1" is FALSE, so the algorithm executes the ELSE part. The first statement in the ELSE part is HANOI(A, C, B, n - 1). By our assumption that HYP(n - 1) is TRUE, this recursive call to HANOI will correctly move the disks 1, 2, ..., n - 1 from tower A to tower B. Notice that our definition of HYP(n) doesn't really refer to moving disks to tower C; rather, it says that the disks are moved to the tower whose name is the third input. The next instruction is MOVE(A, C), which we assumed moves the top disk from tower A to tower C. Now the top disk on tower A was disk n, so disk n is moved from tower A to tower C. Clearly this obeys rule 1, since only one disk is being moved. But why is rule 2 obeyed? Well, the only disks smaller than disk n are disks 1, 2, ..., n - 1, and by the previous recursive call all these disks were moved onto tower B, so putting disk n on tower C cannot put it on top of a smaller disk. The next instruction is the recursive call HANOI(B, A, C, n - 1), and by HYP(n - 1) this call correctly moves disks 1, 2, ..., n - 1 from tower B to tower C. Notice that this movement obeys rule 1, because by HYP(n - 1) only one disk at a time is being moved. Further, rule 2 is obeyed because only disks 1, 2, ..., n - 1 are being moved; by HYP(n - 1) none of the disks from among 1, 2, ..., n - 1 is ever placed on a larger disk from among 1, 2, ..., n - 1, and since any disks on towers A and C at the beginning of this recursive call are larger than n - 1, there can be no problem in placing a disk from among 1, 2, ..., n - 1 on top of the disks that began on towers A and C.

In sum, the action of the first recursive call, the move, and the second recursive call results in the disks 1, 2, ..., n - 1 being moved from tower A to tower B, then disk n being moved
to tower C, and finally the disks 1, 2, ..., n - 1 being moved from tower B to tower C. At this point the algorithm runs out of instructions and terminates, and so we conclude that if HYP(n - 1) is TRUE then HYP(n) is TRUE. This is the END of the proof.

We may want to notice that proving that this algorithm terminates (finishes, stops) was easy because of the recursive structure. As long as a recursive algorithm calls itself a finite number of times on smaller inputs, the algorithm must terminate. We should contrast this with some iterative algorithms with loops: for such algorithms it may not be obvious that the loops will in fact finish, so a separate proof of termination may be necessary. We should also note that we had to check to make sure that rules 1 and 2 were obeyed. We had to make sure that moving disk n and then moving the n - 1 smaller disks did not cause a violation of these rules.

### 2.4 HANOI makes the fewest possible moves

In a later section we will calculate the minimum number of moves and use this to argue a minimum time for any algorithm which correctly solves the Towers of Hanoi puzzle. Here we want to argue that the algorithm HANOI does use the fewest possible moves, even though we don't know yet what that minimum is. We make use of a bottleneck argument. Clearly, to move disk n off tower A, all of the disks 1, 2, ..., n - 1 must be off tower A. Further, to move disk n from tower A, there must be a tower other than tower A which contains none of the disks 1, 2, ..., n - 1, since we are not allowed to place disk n on top of any of the disks 1, 2, ..., n - 1. [Figure: the two possible configurations, with disk n alone on tower A and disks 1, ..., n - 1 stacked entirely on tower B or entirely on tower C.] But we notice that either of these situations, if it is part of a minimal-move solution, involves the minimal number of moves for a Towers of Hanoi problem with n - 1 disks. Clearly it takes at least 1 move to move disk n to tower C, and just before that move we must have one of these two configurations. So after disk n is moved, we still have to solve a Towers of Hanoi problem with n - 1 disks. Putting these possibilities together, we see that the
minimal sequence of moves involves the following configurations of the puzzle. [Figure: the disks 1, ..., n - 1 are moved from tower A to tower B, disk n is moved from tower A to tower C, and then the disks 1, ..., n - 1 are moved from tower B to tower C.] But this is exactly the sequence of moves followed by HANOI.

Here we are really doing an inductive proof. Our inductive hypothesis MIN(n) is: "HANOI(A, B, C, n) makes the fewest possible moves in moving n disks from tower A to tower C." Our argument so far is that if MIN(n - 1) is true, then MIN(n) is true. To finish the induction we need a base case. But it is easy to see that MIN(1) is true, because at least one move is needed to move one disk from tower A to tower C, and HANOI(A, B, C, 1) does (I hope clearly) use exactly one move.

#### 2.4.1 Bottleneck

As we said, this is a bottleneck argument. Where's the bottleneck? The two configurations above can be considered as the bottleneck, because any minimal-move solution must use these two configurations in sequence, one after the other. We could also say that the configurations with disk n on tower A and the disks 1, ..., n - 1 stacked entirely on tower B or entirely on tower C would constitute a bottleneck, since any solution must use at least one of these configurations to move disk n off of tower A. In general, we can say that a subproblem SP is a bottleneck for problem P if every solution to P must also solve SP. But this can get a little dicey. For example, if P is the problem of SORTING n items, we could let SP be the problem of SORTING the smallest n - 1 items, since to SORT n items we clearly have to SORT the n - 1 smallest items. The dicey part is that the n - 1 smallest items may not be sorted until all n items are sorted.

### 2.5 Time and space used

We would like to calculate the running time of HANOI, but we don't know how long various operations will take. How long will it take to move a disk? How long will it take to subtract 1 from n? How long will it take to test if n = 1? How long will it take to issue a procedure call? Because we only wish to calculate time to order, we don't have to answer these questions exactly, but we do have to make a distinction between operations which take a constant amount of time independent of n and operations whose running time depends on n.

One
possibility is to assume that each operation takes constant time independent of n. AHU (Aho, Hopcroft, and Ullman, 1974) calls this assumption the *uniform cost criterion*. With this uniform cost assumption, and letting T(n) be the running time for n disks, we have the difference equation

T(n) = 2 T(n - 1) + c,

because there are 2 calls to the same procedure with n - 1 disks, and c is the sum of the constant running times for the various operations. Letting T(1) be the running time of the algorithm for 1 disk, we find

T(n) = (T(1) + c) 2^(n-1) - c,

which can be verified by direct substitution. This gives T(n) = Θ(2^n), since

(T(1)/2) 2^n ≤ T(n) ≤ ((T(1) + c)/2) 2^n.

Another possibility is to assume that some of the operations have running times which are a function of n. But which function of n should we use? Each of the numbers in the algorithm is between 1 and n, and the disks can also be represented by numbers between 1 and n. Since such numbers can be represented using about log n bits, it seems reasonable to assume that each operation which manipulates numbers or disks has running time which is a constant times log n. AHU calls this the *logarithmic cost criterion* and suggests using it when the numbers used by an algorithm do not have fixed bounds. Using the logarithmic cost criterion, we have the difference equation

T(n) = 2 T(n - 1) + c log n

for the running time of the algorithm. This difference equation has the solution

T(n) = 2^n ( T(1)/2 + c Σ_{j=2}^{n} (log j) / 2^j ),

which can be verified by substitution. Since the summation in this solution converges (as one can demonstrate by the ratio test), and assuming that the constants are positive, we have T(n) = Θ(2^n). Since both cost criteria give the same running time, we conclude:

**Proposition 2.** The algorithm HANOI has running time Θ(2^n).

Although we have established the running time for a particular algorithm which solves the Towers of Hanoi problem, we have not yet established the time complexity of the *problem*. We need to establish a lower bound, so that every algorithm which solves the problem must have running time greater than or equal to the lower bound. We
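The uniform-cost solution can be sanity-checked numerically. With illustrative constants T(1) = 1 and c = 5 (values chosen arbitrarily for the check), the closed form (T(1) + c) 2^(n-1) - c does satisfy T(n) = 2 T(n - 1) + c, and it stays inside the Θ(2^n) bounds quoted above:

```python
T1, c = 1, 5   # illustrative constants; any positive values work

def T(n):
    """Closed-form solution of T(n) = 2*T(n-1) + c with T(1) = T1."""
    return (T1 + c) * 2 ** (n - 1) - c

# Verify the closed form against the difference equation for small n,
# and check the Theta(2^n) sandwich bounds.
for n in range(2, 30):
    assert T(n) == 2 * T(n - 1) + c
    assert (T1 / 2) * 2 ** n <= T(n) <= ((T1 + c) / 2) * 2 ** n
assert T(1) == T1
```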
establish Θ(2^n) as the lower bound in the proof of the following proposition.

**Proposition 3.** The Towers of Hanoi problem has time complexity Θ(2^n).

Proof. Following the proof of Proposition 1, a straightforward induction shows that the minimal number of moves needed to solve the Towers of Hanoi problem is 2^n - 1. Since each move requires at least constant time, we have established the lower bound on time complexity. An upper bound for the time complexity of the problem comes from Proposition 2. Since the upper bound and lower bound are equal to order, we have established the Θ(2^n) time complexity of the problem. ∎

Now that we know HANOI's time complexity, we would like to consider its space complexity. First we will establish a lower bound on space, which follows from the lower bound on time.

**Proposition 4.** Any algorithm which solves the Towers of Hanoi problem must use at least n - constant bits of storage.

Proof. Since the algorithm must produce 2^n - 1 moves to solve the problem, the algorithm must be able to distinguish 2^n different situations. If the algorithm did not distinguish this many situations, then the algorithm would repeat a situation. But the algorithm would behave the same way the second time it reached the repeated situation as it had the first time; so the algorithm would repeat the situation again and again and again, and never halt. The number of situations distinguished by an algorithm is equal to the number of storage situations times the number of internal situations within the algorithm. Since the algorithm has a fixed finite size, it can have only a constant number of different internal situations. The number of storage situations (states) is 2 to the number of storage bits. Thus c · 2^BITS ≥ 2^n, and so BITS ≥ n - log c = n - constant. ∎

In order to discuss the space complexity of the recursive algorithm, let us now consider the data structures used. Two possible data structures are the array and the stack. An array is a set of locations indexed by a set of consecutive integers, so that the information
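The move count used in the proof can be read off from HANOI's structure (my restatement): each call makes one move plus two recursive calls on n - 1 disks, giving M(n) = 2 M(n - 1) + 1, whose closed form is the 2^n - 1 of Proposition 3. A quick check:

```python
def min_moves(n):
    """Moves made by HANOI on n disks: two recursive calls plus one move."""
    return 1 if n == 1 else 2 * min_moves(n - 1) + 1

# The induction in Proposition 3 gives the closed form 2^n - 1.
assert all(min_moves(n) == 2 ** n - 1 for n in range(1, 20))
```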
stored at a location in the array can be referenced by indicating the integer which indexes the location. For example, the information at location I in the array ARRAY would be referenced by ARRAY[I]. A stack is a linearly ordered set of locations in which information can be inserted or deleted only at the beginning of the stack.

The towers could each be represented by an array with n locations, and each location would need at most log n bits. So an array data structure with Θ(n log n) bits would suffice. Alternately, each tower could be represented by a stack. Each stack location would need log n bits, so again this is an Θ(n log n) bit structure. Actually, a savings could be made. Since only n disks have to be represented, the stack structure needs only n locations, versus the 3n locations used by the array structure. Another possible structure is an array in which the ith element holds the name of the tower on which the ith disk is located. This structure uses only Θ(n) bits. Yet another possibility is to not represent the towers at all, but to output the moves in the form (FROM, TO). Thus we could use no storage for the towers.

The recursive algorithm still requires space for its recursion stack. When a recursive algorithm calls itself, the parameters for this new call will take the places of the previous parameters, so these previous parameters are placed on a stack from which they can be recalled when the new call is completed. Also placed on the stack is the return address, the position in the algorithm at which execution of the old call should be resumed. All of this information, the parameters and the return address for a single call, is referred to as a stack frame. At most n stack frames will be active at any time, and each frame will use a constant number of bits for the names of the towers and log n bits for the number of disks. So the recursive algorithm will use Θ(n log n) bits whether or not the towers are actually stored. We summarize these observations in the following proposition.

Proposition 5  The recursive
algorithm HANOI correctly solves the Towers of Hanoi problem and uses Θ(2^n) time and Θ(n log n) space.

The recursive algorithm uses more than minimal space. We are faced with several possibilities:

1. Minimal space is only a lower bound and is not attainable by any algorithm.
2. Minimal space can only be achieved by an algorithm which uses more than minimal time.
3. Some other algorithm attains both minimal time and minimal space.

By developing a series of iterative algorithms, we will arrive at an algorithm which uses both minimal time and minimal space.

2.6 Improved Algorithms

As a first step in obtaining a better algorithm, we will consider an iterative algorithm which simulates the recursive algorithm for n ≥ 2. In this algorithm, RECURSIVE_SIM, we have chosen to explicitly keep track of the stack counter, because this will aid us in finding an algorithm using even less space.

PROCEDURE RECURSIVE_SIM(A, B, C, n)
  L1[1] := A; L2[1] := C; L3[1] := B
  NUM[1] := n - 1; I := 1
  PAR[1] := 1; PAR[0] := 1
  WHILE I ≥ 1 DO
    IF NUM[I] > 1 THEN
      L1[I+1] := L1[I]; L2[I+1] := L3[I]; L3[I+1] := L2[I]
      NUM[I+1] := NUM[I] - 1
      PAR[I+1] := 1
      I := I + 1
    ELSE
      MOVE FROM L1[I] TO L3[I]
      WHILE PAR[I] = 2 DO I := I - 1
      IF I ≥ 1 THEN
        MOVE FROM L1[I] TO L2[I]
        PAR[I] := 2
        TEMP := L1[I]; L1[I] := L3[I]; L3[I] := L2[I]; L2[I] := TEMP

The names of the towers are stored in the three arrays L1, L2, L3; the number of disks in a recursive call is stored in NUM; and the value of PAR indicates whether a call is the first or the second of a pair of recursive calls. RECURSIVE_SIM sets up the parameters for the call HANOI(A,C,B,n-1). When the last move for this call is made, the arrays will contain the parameters for calls with 1 through n-2 disks, where each of these calls will have PAR = 2. The arrays will still contain the parameters for the (A,C,B,n-1) call with PAR = 1. The inner WHILE loop will pop each of the calls with PAR = 2, leaving the array counter pointing at the (A,C,B,n-1) call. Since I will be 1 at this point, the IF condition is satisfied, and the MOVE FROM L1[1] TO L2[1] accomplishes the MOVE FROM A TO C of the recursive algorithm HANOI. The following assignment statements set up the
call (B,A,C,n-1) with PAR = 2. So when the moves for this call are completed, all of the calls in the array will have PAR = 2, and the inner WHILE loop will pop all of these calls, setting I to 0. Then the IF condition will be false, so no operations are carried out; and the outer WHILE condition will be false, so the algorithm will terminate.

Proposition 6  The RECURSIVE_SIM algorithm correctly solves the Towers of Hanoi problem and uses Θ(2^n) time and Θ(n log n) space.

Proof: Correctness follows since this algorithm simulates the recursive algorithm, which we have proved correct. The major space usage is in the arrays. Since each time I is incremented the corresponding NUM[I] is decremented, and since NUM[I] never falls below 1, there are at most n - 1 locations ever used in an array. The four arrays L1, L2, L3, and PAR use only a constant amount of space for each element, but NUM must store a number as large as n - 1, so it uses log n bits for an element. Thus the arrays use Θ(n log n) bits. Now we have to argue about time usage. Most of the operations deal with constant-sized operands, so these operations will take constant time. The exceptional operations are incrementing, decrementing, assigning, and comparing numbers which may have log n bits. A difference equation for the time is

  T(n) = 2T(n-1) + C log n,

where T(n) is the time to solve a problem with n disks and C log n is the time for manipulating the numbers with log n bits. As in the proof of Proposition 2, we have T(n) = Θ(2^n).

Notice that this algorithm does not improve on the recursive algorithm, but study of this form can lead to a saving of space. Storing the array NUM causes the use of Θ(n log n) space. If we did not have to store NUM, the algorithm would use only Θ(n) space. Do we need to save NUM? NUM is used as a control variable, so it seems necessary. But if we look at NUM[I] + I, we get n. When NUM[I+1] is set, it is set equal to NUM[I] - 1, but then NUM[I+1] + (I+1) = NUM[I] + I = n. Thus the information we need about NUM is stored in I and n. So if we replace the test on NUM[I] > 1 with a test on I ≠ n - 1, we
can dispense with storing NUM and improve the space complexity from Θ(n log n) to Θ(n). This replacement does not increase the time complexity of any step in the algorithm, so the time complexity remains Θ(2^n). Our new procedure is:

PROCEDURE NEW_SIM(A, B, C, n)
  L1[1] := A; L2[1] := C; L3[1] := B
  I := 1; PAR[1] := 1; PAR[0] := 1
  WHILE I ≥ 1 DO
    IF I ≠ n - 1 THEN
      L1[I+1] := L1[I]; L2[I+1] := L3[I]; L3[I+1] := L2[I]
      PAR[I+1] := 1
      I := I + 1
    ELSE
      MOVE FROM L1[I] TO L3[I]
      WHILE PAR[I] = 2 DO I := I - 1
      IF I ≥ 1 THEN
        MOVE FROM L1[I] TO L2[I]
        PAR[I] := 2
        TEMP := L1[I]; L1[I] := L3[I]; L3[I] := L2[I]; L2[I] := TEMP

From the above observation we have

Proposition 7  NEW_SIM correctly solves the Towers of Hanoi problem and uses Θ(2^n) time and Θ(n) space.

Although we have reached Θ(n) space, we would like to decrease the space even further, hopefully to n + constant bits. If we look at the array PAR, we find that the algorithm scans PAR to find the first element not equal to 2, replaces that element by 2, and then replaces all the previous 2's by 1's. This is analogous to the familiar operation of adding 1 to a binary number, in which we find the first 0, replace it by a 1, and replace all the previous 1's by 0's. So it seems that we can replace the array PAR by a simple counter. The number of bits in the counter will, of course, depend on n. So far this has not resulted in any saving of space. Will there be enough information in the counter to determine from which tower we should move a disk? The affirmative answer will enable us to achieve a minimal space algorithm.

To motivate the design of our minimal space algorithm, we will examine the sequence of 31 moves needed to solve the problem with 5 disks. This sequence is shown in Table 1.

Table 1: Towers of Hanoi solution for 5 disks (for each of the 31 moves: the decimal count, the binary COUNT, the disk moved, and the FROM and TO towers).

Every other move in the solution involves moving disk 1. So if we know which tower contains disk 1, we would know from which tower to move in alternate moves, but we might not know which
tower to move to. When we consider the three towers to be arranged in a circle, we see from Table 1 that disk 1 always moves in a counterclockwise direction when we have an odd number of disks. Similarly, disk 1 always moves in a clockwise direction when we have an even number of disks. Thus, by keeping track of the tower which contains disk 1 and whether n is odd or even, we would know how to make every other move. For the moves which do not involve disk 1, we know that the move involves the two towers which do not contain disk 1. Looking again at Table 1, we see that the odd numbered disks always move in the same direction as disk 1 and that the even numbered disks always move in the opposite direction. So knowing the towers involved and whether the disk to be moved is odd or even would allow us to decide which way to move. Can we determine from a counter whether the disk being moved is odd or even? If we look at the COUNT column of Table 1, we see that the position of the rightmost 0 tells us the number of the disk to be moved. Thus a single counter with n bits is sufficient to solve the Towers of Hanoi problem. We use these facts to construct the algorithm which follows.

PROCEDURE TOWERS(n)
  T := 0          /* TOWER NUMBER, COMPUTED MODULO 3 */
  COUNT := 0      /* COUNT HAS n BITS */
  P := +1 IF n IS EVEN, -1 IF n IS ODD
  WHILE TRUE DO
    MOVE DISK 1 FROM T TO T+P; T := T+P
    COUNT := COUNT + 1
    IF COUNT = ALL 1'S THEN RETURN
    IF RIGHTMOST 0 IN COUNT IS IN AN EVEN POSITION
      THEN MOVE DISK FROM T-P TO T+P
      ELSE MOVE DISK FROM T+P TO T-P
    COUNT := COUNT + 1
  ENDWHILE

The storage used for COUNT has n bits, and we have called the rightmost bit position 1. The positions from right to left are then odd, even, odd, even, and so on.

Remarks: We can still improve this algorithm by removing the first COUNT := COUNT + 1 statement and deleting the rightmost bit of COUNT. This would also require changing the numbering of the bits in COUNT, so that the rightmost bit is bit 0. An algorithm similar to our TOWERS has been published by T. R. Walsh (1982).

We have to show that TOWERS correctly solves the
Towers of Hanoi problem. We do this by proving that a certain sequence of moves has been accomplished when COUNT contains a number of the form 2^k - 1, so that when k = n the sequence of moves for HANOI(A,B,C,n) has been completed and the procedure will terminate, since COUNT contains all 1's.

Proposition 8  When COUNT = 2^k - 1, that is, COUNT = 0...01...1 with k 1's, then: if k and n have opposite parity, the correct moves for HANOI(A,C,B,k) have been completed and T contains 1, which represents B; if k and n have the same parity, the correct moves for HANOI(A,B,C,k) have been completed and T contains 2, which represents C.

Proof: If k = 1, the single move T to T+P has been completed, which is A to C if n is odd and A to B if n is even; and T contains T+P, which is 2 if n is odd and 1 if n is even. This agrees with our claim. Notice that COUNT can only take on the value 2^k - 1 immediately before the IF ... RETURN statement. Assume that the moves for either HANOI(A,B,C,k) or HANOI(A,C,B,k) have been completed. If k and n have opposite parity, the next move will be A to C, since by assumption T now contains 1: if k is odd (so n is even and P = +1), the rightmost 0 of COUNT is in the even position k+1, so the move is T-P to T+P, which is 1-1 to 1+1, that is, 0 to 2, representing A to C; and if k is even (so n is odd and P = -1), the rightmost 0 is in the odd position k+1, so the move is T+P to T-P, which is again 0 to 2, representing A to C. If k and n have the same parity, the next move will be A to B, since by assumption T now contains 2: if k is odd (so n is odd and P = -1), the move is T-P to T+P, which is 2+1 to 2-1 (computed modulo 3), that is, 0 to 1, representing A to B; and if k is even (so n is even and P = +1), the move is T+P to T-P, which is again 0 to 1, representing A to B. Next, COUNT will be incremented to 0...010...0, i.e., a 1 followed by k trailing 0's. When COUNT reaches 2^(k+1) - 1, the algorithm will have repeated the same sequence of moves as before, since it only "sees" the rightmost information in COUNT, with the difference that T will have started with a different value. The different starting value of T will result in a cyclic permutation of the labels. If k and n have opposite parity, then the completed moves will be HANOI(A,C,B,k), A to C, HANOI(B,A,C,k), giving HANOI(A,B,C,k+1), where k+1 and n have the same parity, and T will contain 2. If k and n have the same parity, then the completed moves will be HANOI(A,B,C,k), A to B, HANOI(C,A,B,k), giving HANOI(A,C,B,k+1), where k+1 and n have opposite parity, and T will contain 1.
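Propositions 8 and 9 can be checked concretely. The sketch below implements the counting scheme in Python under a few assumptions not fixed by the text: towers are numbered 0, 1, 2 for A, B, C, and, instead of reading the rightmost 0 of the counter, the forced non-smallest move is found by tracking the tower contents explicitly. The whole point of TOWERS is that this extra storage is avoidable, so this is a cross-check of the move sequence, not a minimal-space implementation.

```python
def towers(n):
    """Iterative Towers of Hanoi driven by a move counter (a sketch, not
    the minimal-space TOWERS procedure).  Towers are numbered 0, 1, 2 for
    A, B, C; disk 1 is the smallest.  Returns a list of (from, to) moves."""
    # Disk 1 moves clockwise (+1) when n is even, counterclockwise (-1)
    # when n is odd, as observed from Table 1.
    p = 1 if n % 2 == 0 else -1
    # Explicit tower contents, used only to find the forced move below.
    state = [list(range(n, 0, -1)), [], []]
    t = 0          # tower currently holding disk 1
    moves = []
    for count in range(2 ** n - 1):
        if count % 2 == 0:
            # Every other move shifts disk 1 one tower in direction p.
            dest = (t + p) % 3
            state[dest].append(state[t].pop())
            moves.append((t, dest))
            t = dest
        else:
            # The remaining moves are forced: the unique legal move
            # between the two towers not holding disk 1.
            a, b = (i for i in range(3) if i != t)
            if not state[b] or (state[a] and state[a][-1] < state[b][-1]):
                src, dest = a, b
            else:
                src, dest = b, a
            state[dest].append(state[src].pop())
            moves.append((src, dest))
    return moves
```

For example, towers(3) produces the 7 moves of the 3-disk solution, ending with all disks on tower 2 (C).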
Proposition 9  The algorithm TOWERS uses Θ(2^n) time and n + constant bits of space.

Proof: For space usage, there are n bits in COUNT, and a constant number of bits are used for T and P. For time, the initialization takes Θ(n), and the WHILE loop is iterated 2^(n-1) times. If each iteration took a constant amount of time, we would have Θ(2^n); but the test and increment instructions on COUNT could take time Θ(n), giving only the bound O(n 2^n). So we have to show that only Θ(2^n) time is used. If the value in COUNT is even, then incrementing and testing will only require looking at one bit. If the value in COUNT is odd and (COUNT - 1)/2 is even, then the algorithm only looks at 2 bits. In fact, the algorithm will look at k bits in COUNT in 2^(n-k) cases. Thus the time used will be

  Σ_{k=1}^{n} k 2^(n-k) = 2^n Σ_{k=1}^{n} k/2^k = Θ(2^n),

since Σ_{k=1}^{∞} k/2^k converges.

We summarize these results in the following theorem.

Theorem 1  Any algorithm which solves the Towers of Hanoi problem for n disks must use at least Θ(2^n) time and n - constant bits of storage. The algorithm TOWERS solves the problem and simultaneously uses minimum time and minimum space.

Exercises

Ex 2.1  For the following algorithm for the Towers of Hanoi problem, prove that the algorithm is correct and compute its time and space.

PROCEDURE HANOI_ITERATIVE(A, B, C, n)
  IF n MOD 2 = 0 THEN MOVE[1] := A TO B
                 ELSE MOVE[1] := A TO C
  K := 1; M := 1
  WHILE M < n DO
    M := M + 1
    IF (n - M) MOD 2 ≠ 0
      THEN MOVE[K+1] := A TO B; L1 := C; L2 := A; L3 := B
      ELSE MOVE[K+1] := A TO C; L1 := B; L2 := C; L3 := A
    FOR I := 1 TO K DO
      CASE MOVE[I] OF
        A TO B: MOVE[K+1+I] := L1 TO L2
        A TO C: MOVE[K+1+I] := L1 TO L3
        B TO A: MOVE[K+1+I] := L2 TO L1
        B TO C: MOVE[K+1+I] := L2 TO L3
        C TO A: MOVE[K+1+I] := L3 TO L1
        C TO B: MOVE[K+1+I] := L3 TO L2
    K := 2K + 1

Hints: For correctness, you may want to introduce a new variable and prove a statement which says that on each iteration of the WHILE loop the new variable increases (or, if you want, decreases), and that at the end of each iteration a Hanoi problem whose size depends on the new variable has been solved. You will need to give the tower names for the problem which has been solved. You will also need to specify the value
of the new variable. For space, you should know that the algorithm is storing each move in the array MOVE. For time, you may want to consider both the uniform and the logarithmic cost measures.

Ex 2.2  (Buneman and Levy, 1980) The following is an algorithm for the Towers of Hanoi problem. Prove its correctness and compute its time and space.

MOVE SMALLEST DISK ONE TOWER CLOCKWISE
WHILE A DISK OTHER THAN THE SMALLEST CAN BE MOVED DO
  MOVE THAT DISK
  MOVE THE SMALLEST DISK ONE TOWER CLOCKWISE
ENDWHILE

Hints: For correctness, you should be careful, since this algorithm only solves the original Towers of Hanoi problem when the number of disks is even. You will probably want to introduce a new variable and prove a statement about the configuration of the disks when the number of moves completed is a specific function of your new variable. For time and space, the above algorithm is incomplete, since it doesn't specify the data structure used to determine if a disk can be moved. You might consider representing each tower by a stack of integers, with the integers representing the disks on the tower. Alternately, you might consider representing the information by an array DISK, so that DISK[i] contains the name of the tower which contains the ith largest disk. You may also find it useful to show that the ith disk is moved 2^(n-i) times.

Chapter 3

SOME INDUCTIONS

3.1 What Is Induction?

The most important method for proving things in computer science is induction. This method is sometimes called the method of mathematical induction, to distinguish it from the philosophical or empirical method of induction. In empirical induction, we note that a certain proposition is true in individual instances, and we generalize (or induce, or induct) that the proposition is true for all instances. For example, a philosopher notices that the sun has risen for each and every morning of the philosopher's life, and so the philosopher inducts that the proposition "the sun rises every morning" is always true. After framing this proposition, the
philosopher predicts that the sun will rise the next morning, and when the sun does in fact rise, the philosopher is happy to say "I told you so!" The philosopher spends his mornings, after the sun has come up, bird watching. He has noted that he often sees swans and that every swan he has ever seen has been a white swan. He therefore makes the statement "All swans are white." As in the sun example, he predicts that the next swan he sees will be white, and he is gratified when the next swan he sees is indeed white. Of course, the philosopher is willing to admit that if he ever sees a black swan, then his inductively verified proposition will be shown to be false. But for some reason the philosopher still firmly believes that the sun will rise tomorrow. We will now take leave of the philosopher, and leave to your philosophy professor the problem of explaining how the two empirically verified propositions, "the sun rises every morning" and "all swans are white", differ from a philosophical perspective.

Mathematical induction differs from philosophical induction in that mathematical induction is about a well-defined mathematical object, the set of natural numbers. If we can prove something by mathematical induction, there is no possibility that the proved statement might be false tomorrow or that it could be proved false by an empirical observation. Mathematical induction deals with propositions. A proposition is a statement that is either true or false. To apply mathematical induction, we need a propositional function P(n) which takes as input a natural number n and gives as output one of the two values true or false. Mathematical induction says that if

1. P(0) is true,
2. and for all n such that n ≥ 0, P(n) implies P(n+1),

then P(n) is true for all n ≥ 0. This is not an empirical observation. It is, in fact, part of the definition of natural number. In the following sections we will give some simple examples of mathematical induction.

3.2 Inductions for easy Sums

A sum Σ_{i=0}^{n} f(i) may be equivalent to a closed form
expression. For example, Σ_{i=0}^{n} 1 = n + 1. To prove this by induction, one must first decide what one is trying to prove. Here one is given a sum and an expression, and the claim is that the sum and the expression represent the same function. That is, if I put in a value, say 10, for n and evaluate the sum, I will get some number, say S(10); and when I put 10 into the expression and evaluate it, I will get a number g(10). The claim is that S(10) = g(10); that is, I get the same value whether I use the sum or the expression. So to frame the inductive hypothesis we use P(n) = "Σ_{i=0}^{n} f(i) = g(n)". Notice that the right side, delimited by the quotation marks, is a relational expression which claims that what is on the left side of the relational operator = is equal to what is on the right side of the =. To confuse things as much as possible, the two = signs serve different roles. While the = inside the quotes is a relational operator, the = after P(n) indicates a definition. So while the inside = could have the value true or false, depending on whether or not Σ_{i=0}^{n} f(i) does equal g(n), the outside = just says that P(n) is defined by the whole quoted expression. So P(n) can be true or false, depending on the truth or falsity of the quoted expression. Of course, P(n) is a function of the natural number variable n, so, for example, P(0) could be true, P(1) could be false, P(2) could be false, and P(3) could be true. What we are trying to prove is that P(n) is always true; that is, for whatever value of n we choose, when we evaluate P(n) we will get true. At this point we may want to throw up our hands, because we are faced with an infinite number of tasks, that is, showing that P(0) is true, P(1) is true, P(2) is true, and so forth. The principle of mathematical induction allows us to break this infinite amount of work into a few pieces, and if we can finitely show each of these pieces, we will have, almost magically, accomplished an infinite amount of work. The principle of mathematical induction states:

1. if P(0) is true,
2. and if for all n ≥ 0, P(n) implies P(n+1),
3. then P(n) is true for all n ≥ 0.

Clearly
this form isolates an infinite amount of work in step (2). We have to figure out how to do this infinite work with a finite, and we hope small, amount of effort. The trick is that for some P(n)'s we can treat n as an unknown variable and use a finite amount of work to show P(n+1) from P(n). If we don't have to say what value n has, we will really be accomplishing an infinite amount of work with finite effort.

Let us return now to our sum example and let P(n) = "Σ_{i=0}^{n} 1 = n + 1". For the base step we want to show that P(0) is true. Of course, P(0) = "Σ_{i=0}^{0} 1 = 0 + 1", after we have substituted 0 for n. Now Σ_{i=0}^{0} 1 evaluates to 1, and 0 + 1 also evaluates to 1. Hence P(0) has the same truth value as the expression "1 = 1", and since we know that 1 equals 1, we conclude that P(0) is true. To handle the infinite amount of work for step (2), which is called the inductive step, we write out P(n+1), which is P(n+1) = "Σ_{i=0}^{n+1} 1 = (n+1) + 1". Now we assume that P(n) is true and try to use P(n) to show P(n+1). P(n) states that

  Σ_{i=0}^{n} 1 = n + 1.

With a little luck, or insight, we will try to look for something in P(n+1) that will simplify if we can use this relation. Consider Σ_{i=0}^{n+1} 1; we can break this sum apart because

  Σ_{i=0}^{n+1} 1 = (Σ_{i=0}^{n} 1) + 1.

Now we can use P(n) to replace the term in the parentheses, and so

  Σ_{i=0}^{n+1} 1 = (n + 1) + 1.

But, miracle of miracles, this is just the relation that we had to show to show that P(n+1) is true. Notice that we were able to treat n as an unevaluated variable, except that we really needed n ≥ 0 to expand the summation; so we have shown that for all n ≥ 0, P(n) implies P(n+1), and so we can conclude by the principle of mathematical induction that P(n) is true for all n ≥ 0. Said another way, we conclude that Σ_{i=0}^{n} 1 = n + 1 for all n ≥ 0.

Basically, all summation examples work like this simple example. We start with the inductive hypothesis P(n) = "Σ_{i=0}^{n} f(i) = g(n)". We then show the base case P(0) by demonstrating that f(0) = g(0), which is what we want because Σ_{i=0}^{0} f(i) = f(0). Then we proceed to show that P(n) implies P(n+1) by writing

  Σ_{i=0}^{n+1} f(i) = (Σ_{i=0}^{n} f(i)) + f(n+1)

and using P(n) to replace Σ_{i=0}^{n} f(i) by g(n), which gives g(n) + f(n+1). We are now left with showing that g(n) + f(n+1) = g(n+1).
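Before attempting such an induction, a conjectured closed form can be spot-checked numerically on small cases. The helper below is hypothetical (not part of the text) and is only a sanity check, not a proof: it compares the running sum Σ_{i=0}^{n} f(i) against g(n) for n = 0, ..., n_max.

```python
def check_sum_formula(g, f, n_max=50):
    """Check numerically that sum_{i=0}^{n} f(i) == g(n) for n = 0..n_max.
    A spot check on small cases only -- not a substitute for induction."""
    total = 0
    for n in range(n_max + 1):
        total += f(n)          # running value of the sum up to n
        if total != g(n):
            return False
    return True
```

For instance, check_sum_formula(lambda n: n + 1, lambda i: 1) confirms the example above, while a wrong conjecture such as g(n) = n*n for f(i) = i already fails at n = 2.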
This step is usually a matter of algebraic manipulation. For example, to show that Σ_{i=0}^{n} i = n(n+1)/2, we would use g(n) = n(n+1)/2, g(n+1) = (n+1)(n+2)/2, and f(n+1) = n+1. Then a simple algebraic manipulation shows that

  g(n) + f(n+1) = n(n+1)/2 + (n+1) = (n+1)(n+2)/2 = g(n+1).

Ex 3.1  Use induction to show that Σ_{i=0}^{n} i^2 = n(n+1)(2n+1)/6.

Ex 3.2  From the previous examples, conjecture a formula g(n) for Σ_{i=0}^{n} i^3. Use mathematical induction to show that your formula is correct.

Ex 3.3  Use induction to show Σ_{i=0}^{n} 2^i = 2^(n+1) - 1.

Ex 3.4  Use induction to show Σ_{i=0}^{n} 3^i = (3^(n+1) - 1)/2.

Ex 3.5  Use the last two exercises to conjecture a formula for Σ_{i=0}^{n} 4^i. Verify your formula using induction.

Ex 3.6  Use these examples to conjecture a formula for Σ_{i=0}^{n} a^i, where a is any number except 1. Verify your formula by induction.

3.3 Euclid's Algorithm

One of the oldest algorithms is Euclid's method for computing the greatest common divisor of two nonnegative integers. The greatest common divisor of a and b is the largest integer that divides both a and b. When we say that d divides a, we mean that the quotient a/d is an integer and that the remainder is 0. Said another way, d divides a when a is an integer multiple of d; that is, there is an integer n so that a = nd. This definition does not work in the very special case when a = b = 0, because every integer divides 0. For this special case, we define the greatest common divisor of 0 and 0 to be 0. This should not cause any difficulty, because 0 is not a divisor of any other number. Euclid's algorithm can be written in recursive form as

  GCD(a, b) = a                 if b = 0
            = GCD(b, a mod b)   if b ≠ 0

We need to explain the operation a mod b. We assume that a is a nonnegative integer and that b is a positive integer. We do not allow b = 0. We want a mod b to give us the remainder when a is divided by b. If a < b, then a mod b = a. If a can be written as a = nb + r, where 0 ≤ r ≤ b - 1 and n ≥ 0, then a mod b = r. The traditional division algorithm will produce the desired r. Notice that if a is a multiple of b, then r = 0, i.e., a mod b = 0. We now use mathematical induction to show that Euclid's algorithm does compute the greatest common divisor. For notational purposes, let g
be the greatest common divisor of a and b. We want to show that GCD(a, b) produces g as output. As a first step in the inductive proof, we need to choose an inductive variable and an inductive hypothesis. Recall that a proposition is a statement that is either true or false. For example, 2 + 2 = 4 is a proposition which is true, while 2 + 2 = 5 is a proposition which is false. There are, of course, English sentences, like "Do your homework", that are not propositions, because one does not assign a true or false value to these sentences. (The confusion between propositions and non-propositions forms the basis for Abbott and Costello's famous "Who's on first?" routine.) Since Euclid's algorithm has two variables, a and b, it may not be obvious which one should be chosen as the inductive variable; but the form of the algorithm, having an output for b = 0 and a recursive call for b ≠ 0, suggests that b should be chosen as the inductive variable. We then frame the inductive hypothesis as a function of b. The inductive hypothesis will take a nonnegative integer as input and produce an output which is in {true, false}. For Euclid's algorithm we can use the inductive hypothesis E(b) = "GCD(a, b) calculates the greatest common divisor of a and b", i.e., "GCD(a, b) = g". (I've put in quote marks so that it's easier to see the proposition.) Notice that "GCD(a, b) = g" has a true or false value. We want to show that this proposition has the value true for all natural numbers a and b. Now that we have the inductive hypothesis E(b), we want to show that E(b) is true for small values of b. This is called proving the base case. In simple situations like this one, we need only consider one small value of b, in this case b = 0. So our base case is E(0), and we want to show that E(0) is true. If b = 0, what is the largest integer which divides both a and b? Since b = 0, all integers divide b. If a > 0, the largest integer which divides a is a. So the largest integer which divides both a and b is a. In the special case when a = 0, every integer divides a, but we have defined
the greatest common divisor to be 0 for this special case. Now, when b = 0, Euclid's algorithm outputs a. In the special case a = 0, this agrees with our definition of greatest common divisor. Also, when a > 0, a is the greatest common divisor of a and b, and Euclid's algorithm correctly outputs a. Hence E(0) is true, and we have proved the base case.

Now we proceed to the inductive step. We want to show that for all b ≥ 0, E(b) is true. We do so by showing that E(0) and E(1) and E(2) and ... and E(b-1) all being true allows us to conclude that E(b) is true. We assume that b > 0 and look at what the algorithm does. In this case, GCD(a, b) calls GCD(b, a mod b). What do we know about a mod b? We know that a mod b is an integer and that 0 ≤ a mod b < b. So E(a mod b) is somewhere among E(0), E(1), E(2), ..., E(b-1), and we have assumed that each of these propositions is true. Hence we may assume that E(a mod b) is true; that is, regardless of what the first argument is, GCD(first argument, a mod b) produces the greatest common divisor of the first argument and a mod b. We are left with showing that the greatest common divisor of b and a mod b is in fact the same as the greatest common divisor of a and b. We can write a as

  a = nb + (a mod b).

Now, if d divides the left side of this equation, then d also divides the right side of this equation, and vice versa. Further, if d divides x + y and d divides x, then d must also divide y. Hence any divisor of a and b must also be a divisor of b and a mod b. Conversely, any divisor of b and a mod b must also be a divisor of a and b. So the set of divisors of a and b is the same as the set of divisors of b and a mod b, and the largest element in this set is both the greatest common divisor of a and b and the greatest common divisor of b and a mod b. Hence we have shown that if E(0), E(1), E(2), ..., E(b-1) are all true, then E(b) must also be true. Now, since we have shown that E(0) is true, and that E(0), E(1), E(2), ..., E(b-1) implies E(b), we can conclude by the principle of mathematical induction that E(b) is true for all b ≥ 0, i.e., for all nonnegative integers b.
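The recursive form of Euclid's algorithm translates almost verbatim into executable code. A minimal sketch in Python (whose % operator agrees with the a mod b described above for nonnegative a and positive b):

```python
def gcd(a, b):
    """Euclid's algorithm, as given in the text:
         GCD(a, b) = a                 if b = 0
                   = GCD(b, a mod b)   if b != 0
    a and b are nonnegative integers; GCD(0, 0) is defined to be 0,
    which the b = 0 case handles automatically."""
    if b == 0:
        return a
    return gcd(b, a % b)
```

Note that the induction above is on the second argument: each recursive call replaces b by a mod b, which is strictly smaller, so the recursion terminates.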
Ex 3.7  Show that the least common multiple of a and b can be computed easily if you know the greatest common divisor of a and b.

Ex 3.8  What's wrong with the following induction? To show that all horses are the same color: BASE: P(0) is true because all the horses in a set with no horses are the same color as all the other horses in that set. INDUCTIVE STEP: P(n) implies P(n+1). Consider a set of n+1 horses. Remove one horse, leaving n horses in the set. By P(n), all these n horses have the same color, say brown. Next remove another horse and put the first back in the set. Again by P(n), these n horses have the same color, say brown. So all the n+1 horses have the same color, brown, and P(n+1) is true.

Ex 3.9  A hypercube of dimension n has 2^n vertices, where each vertex is labeled by one of the n-bit binary vectors. Two vertices have an edge between them exactly when their vectors differ in exactly one bit position. Use induction to show that for n ≥ 2, the n-dimensional hypercube has a Hamiltonian circuit.

Chapter 4

DIVIDE-AND-CONQUER

4.1 What is Divide-and-Conquer?

The algorithm design strategy which breaks a given problem into several smaller problems of the same type is usually called the divide-and-conquer strategy. In the previous section, the recursive algorithm for the Towers of Hanoi problem is an example of a divide-and-conquer algorithm. Given a problem with n disks, this algorithm converts it into two problems with n-1 disks. Each of the subproblems is successively broken into subproblems until problems which can be solved immediately are reached. The Towers of Hanoi algorithm continues forming problems with fewer disks until it reaches problems with 1 disk, which can be solved immediately. After the subproblems are solved, the divide-and-conquer algorithm then combines the solutions of the subproblems to give a solution to the original problem. In the Towers of Hanoi example there is no explicit combining, because the necessary combination is simply to solve one
subproblem after the other subproblem has been solved. This combination is handled by the ordering of the statements in the algorithm. The recursive structure of a divide-and-conquer algorithm leads directly to an inductive proof of correctness, and also gives directly a difference equation for the running time of the algorithm.

Consider designing, by divide-and-conquer, an algorithm to sort the elements of an n element array. One way to do this is to find the largest element in the array, interchange it with the last element of the array, and then sort the remaining n-1 element array. This algorithm could be written as

PROCEDURE SORT(n)
  IF n > 1 THEN
    LARGEST(n)
    SORT(n-1)

where LARGEST is an algorithm which handles the largest element. If LARGEST works correctly, then it is easy to prove that SORT works correctly. Similarly, if we know how many comparisons LARGEST used, then we could compute the number of comparisons used by SORT from the formula

  S(n) = S(n-1) + L(n),

where S(n) is the number of comparisons used by SORT(n), S(n-1) is the number of comparisons used by SORT(n-1), and L(n) is the number of comparisons used by LARGEST(n). It is easy to design LARGEST(n) so that it uses exactly n - 1 comparisons. This gives the difference equation S(n) = S(n-1) + n - 1. When there is only 1 element in the array, SORT does nothing. This gives the initial condition S(1) = 0. It is easy to check that S(n) = n(n-1)/2 satisfies both the difference equation and the initial condition.

This recursive sorting algorithm can be easily converted to an iterative algorithm, because the recursive algorithm is tail-recursive; that is, the only time the algorithm calls itself is at the end of the algorithm. The corresponding iterative algorithm is

FOR I := n DOWNTO 2 DO
  LARGEST(I)

It is still easy to write an inductive proof of the correctness of this algorithm. The number of comparisons used by this iterative algorithm can be computed by

  Σ_{I=2}^{n} L(I) = Σ_{I=2}^{n} (I - 1) = n(n-1)/2.

Both the recursive and iterative sorting algorithms use the same number of comparisons. They also both
use space to store the original array. The recursive algorithm has the disadvantage that it uses a stack to keep track of the recursion. This stack requires some space. Further, the recursive algorithm spends some time in manipulating this stack. So in this case the iterative algorithm would be preferred to the recursive algorithm.

In the above example we have broken a problem of size n into a single problem of size n-1. Instead, we could try to break the problem of size n into two problems of size n/2. If we could solve the two problems of size n/2, then we would be left with the problem of combining two sorted sequences of size n/2 to form a single sorted sequence of size n. Let us assume that the algorithm MERGE takes as input two sorted sequences and outputs a single sorted array which contains all the elements of the input. From the MERGE algorithm we can construct a divide-and-conquer algorithm MERGESORT:

PROCEDURE MERGESORT(A, n)
  IF n > 1 THEN
    BREAK A INTO TWO ARRAYS A1 AND A2, EACH OF SIZE n/2
    MERGESORT(A1, n/2)
    MERGESORT(A2, n/2)
    MERGE(A1, A2)

As usual, it is easy to construct an inductive proof of the correctness of this algorithm. To calculate the number of comparisons used by this sort, we need to know the number of comparisons used by MERGE. Although this number will depend on which elements are actually in the two subarrays A1 and A2, it is clear that at most n - 1 comparisons are used, because each comparison results in an element being merged into its proper place in the output array. So, in worst case, the number of comparisons used by MERGESORT is given by the difference equation

  C(n) = 2C(n/2) + n - 1,

because the algorithm with input of size n calls itself twice with input of size n/2, and MERGE uses at most n - 1 comparisons. For an initial condition we have C(1) = 0, since the algorithm does nothing when n = 1. The solution to this equation is C(n) = n log n - n + 1, where log means logarithm to the base 2. The solution can be easily verified using induction. This solution can also be written as C(n) = Θ(n log n).
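MERGESORT can be sketched directly from the pseudocode. The version below returns the sorted list together with the number of comparisons used, so the bound C(n) = n log n - n + 1 can be checked on small inputs; the BREAK step is done with list slicing, a detail the pseudocode leaves unspecified.

```python
def merge_sort(a):
    """A sketch of MERGESORT: returns (sorted_list, comparisons_used)."""
    n = len(a)
    if n <= 1:
        return list(a), 0                    # nothing to do; 0 comparisons
    mid = n // 2
    left, c1 = merge_sort(a[:mid])           # BREAK A into A1 and A2
    right, c2 = merge_sort(a[mid:])
    merged, comps, i, j = [], 0, 0, 0
    while i < len(left) and j < len(right):  # MERGE: at most n - 1 comparisons
        comps += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, c1 + c2 + comps
```

For n = 8, the worst-case bound is C(8) = 8·3 - 8 + 1 = 17 comparisons, and no 8-element input exceeds it.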
the number of comparisons is the measure of resource usage, then MERGESORT is preferred to the previous sorting algorithms, because MERGESORT is Θ(n log n) while the other sorting algorithms are Θ(n^2).

4.2 The General Form for Divide-and-Conquer Algorithms

In SORT we decreased the size of the problem by 1, whereas in MERGESORT we divided the problem in half. Algorithms like SORT are sometimes called subtract-and-conquer. Algorithms like MERGESORT are considered to be true divide-and-conquer algorithms. In most of our examples a problem will be split into two subproblems of the same type that are half the original size. Here we generalize this slightly and assume that a problem of size n is split into several, say a, subproblems, each of size n/c for some constant c. For many divide-and-conquer algorithms c = 2 holds, but the same analysis works for general c. In some cases the splitting process is easy and has negligible computational cost, but we want to allow for the possibility that splitting takes some time. Usually the time-consuming part of a divide-and-conquer algorithm is the combining step, when the answers to subproblems are used to compute the answer to the whole problem. Here is the general form of a divide-and-conquer algorithm:

PROCEDURE DandC(DATA, n, SOLUTION)
  IF n is small THEN solve by some special algorithm
  SPLIT(DATA, n) into D1 of size n/c
  DandC(D1, n/c, S1)
  D2 := f2(DATA, D1, S1)
  DandC(D2, n/c, S2)
  D3 := f3(DATA, D1, S1, D2, S2)
  ...
  Da := fa(DATA, D1, S1, D2, S2, ..., D(a-1), S(a-1))
  DandC(Da, n/c, Sa)
  SOLUTION := f(a+1)(DATA, D1, S1, D2, S2, ..., Da, Sa)
  RETURN SOLUTION

Often the various subproblems are solved independently, but here we've allowed for the results of previous subproblems to be processed and then fed into the next subproblem. The function f(a+1) performs the final combination of results to produce the solution.

4.3 Divide-and-Conquer Difference Equations

Consider a divide-and-conquer algorithm which breaks a problem of size n into subproblems, each of size n/c. Assume that there are a such subproblems
and that the algorithm takes bn^m time to split the original problem into subproblems and to combine the solutions of the subproblems to give the solution to the original problem. A difference equation for the time used by the algorithm is

T(n) = aT(n/c) + bn^m.

The solution for the time used by the algorithm is

T(n) = Θ(n^m)           if a < c^m
T(n) = Θ(n^m log n)     if a = c^m
T(n) = Θ(n^(log_c a))   if a > c^m.

The derivation of the solution is not too complicated, but it involves a number of details, so we will not give it in full. To derive the solution, you can first convert the divide-and-conquer equation to a standard linear constant coefficient difference equation by the substitution n = c^r and T(n) = t_r. Use standard techniques to solve this difference equation. Determine which term in the solution will be largest, and from the assumptions that a, b, and c are positive and the initial condition is nonnegative, show that the coefficient of this largest term is positive.

The above solution for the time used by the algorithm also holds for the divide-and-conquer equation

T(n) = aT(n/c) + P(n)

where P(n) is a polynomial of degree m. This holds because the equation is linear, so the solutions due to the various terms in the polynomial can be linearly combined, and therefore the solution due to the highest power term in the polynomial will dominate.

Time here should be taken in a broad sense. It could mean actual time as measured by a clock, but it could also mean the number of times a particular operation is used. For example, when we considered sorting, time was the number of comparisons used. In many numerical algorithms, time is the number of multiplications and divisions used.

4.4 Some Divide-and-Conquer Examples

4.4.1 Mergesort

For the running time of MERGESORT we have the equation

T(n) = 2T(n/2) + bn.

This equation holds for either the number of comparisons or for the total running time. Since a = 2, c = 2, m = 1, we have a = c^m, and the solution is T(n) = Θ(n log n).

4.4.2 Polynomial Multiplication

A fairly common numeric problem is calculating the product of two polynomials. The input is the coefficients of the two polynomials, and the desired output is the coefficients of the product polynomial. The usual algorithm for this problem proceeds iteratively, by multiplying each coefficient of the first polynomial by the first coefficient of the second polynomial, then multiplying each coefficient of the first polynomial by the second coefficient of the second polynomial and adding these products to the appropriate partial coefficients of the product polynomial; this process is continued until all of the coefficients of the second polynomial have been used. If each of the polynomials has n coefficients, then this algorithm will use n^2 multiplications and Θ(n^2) total operations.

The polynomial multiplication problem can also be solved by a divide-and-conquer algorithm which breaks each polynomial in half, multiplies 4 half-size polynomials, and then adds the half-size products in the appropriate way to give the product polynomial. More explicitly, let P(x) and Q(x) be two polynomials with n coefficients each. Write

P(x) = P0(x) + P1(x) x^(n/2)
Q(x) = Q0(x) + Q1(x) x^(n/2)

where P0(x), P1(x), Q0(x), and Q1(x) are all polynomials of degree n/2 - 1. Then

P(x)Q(x) = P0(x)Q0(x) + (P0(x)Q1(x) + P1(x)Q0(x)) x^(n/2) + P1(x)Q1(x) x^n.

For the number of multiplications M(n) we have M(n) = 4M(n/2), and since a = 4, c = 2, m = 0, we have a > c^m, and the solution is M(n) = Θ(n^(log_2 4)) = Θ(n^2). For the total number of operations T(n) we have T(n) = 4T(n/2) + bn, because all of the polynomial additions can be carried out with a multiple of n coefficient additions, and the multiplications by powers of x simply shift a sequence of coefficients. Since a = 4, c = 2, m = 1, we have a > c^m and T(n) = Θ(n^2).

There seems little point to this divide-and-conquer approach, since it gives us the same running time as the usual algorithm, but it suggests that all 4 of the half-size multiplications might not be necessary, because we only need the sum P0(x)Q1(x) + P1(x)Q0(x) rather than both these products. This sum can be computed using only one multiplication if we have P0(x)Q0(x) and P1(x)Q1(x), because

P0Q1 + P1Q0 = (P0 + P1)(Q0 + Q1) - P0Q0 - P1Q1.

Our new divide-and-conquer algorithm is

P(x)Q(x) = P0(x)Q0(x)(1 - x^(n/2)) + (P0(x) + P1(x))(Q0(x) + Q1(x)) x^(n/2) + P1(x)Q1(x)(x^n - x^(n/2)).

This algorithm uses only 3 half-size multiplications, so M(n) = 3M(n/2) and T(n) = 3T(n/2) + bn, because the number of additions and subtractions is still proportional to n. For M(n) we have a = 3, c = 2, m = 0, so a > c^m and M(n) = Θ(n^(log_2 3)). For T(n) we have a = 3, c = 2, m = 1, so a > c^m and T(n) = Θ(n^(log_2 3)). This divide-and-conquer algorithm will be faster than the usual Θ(n^2) algorithm because log_2 3 < 2.

4.4.3 Matrix Multiplication

Another problem in which divide-and-conquer leads to a faster algorithm is matrix multiplication. If n x n matrices are broken into four n/2 x n/2 matrices, then the problem is to compute the matrix C, where

( C11 C12 )   ( A11 A12 ) ( B11 B12 )
( C21 C22 ) = ( A21 A22 ) ( B21 B22 ).

The straightforward algorithm is

C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22.

The equation for the running time of this algorithm is T(n) = 8T(n/2) + bn^2, because there are 8 half-size multiplications and the additions can be done in time proportional to n^2. This equation has a = 8, c = 2, m = 2, so a > c^m and T(n) = Θ(n^(log_2 8)) = Θ(n^3).

A faster divide-and-conquer algorithm was designed by Strassen (1969). This faster algorithm uses only 7 half-size multiplications. Strassen's algorithm is

M1 = (A12 - A22)(B21 + B22)
M2 = (A11 + A22)(B11 + B22)
M3 = (A11 - A21)(B11 + B12)
M4 = (A11 + A12) B22
M5 = A11 (B12 - B22)
M6 = A22 (B21 - B11)
M7 = (A21 + A22) B11

C11 = M1 + M2 - M4 + M6
C12 = M4 + M5
C21 = M6 + M7
C22 = M2 - M3 + M5 - M7.

It is a simple exercise in algebra to prove this algorithm correct. The very real difficulty was discovering the algorithm in the first place. The running time of Strassen's algorithm obeys the equation T(n) = 7T(n/2) + bn^2, because there are 7 half-size multiplications and the additions and subtractions of matrices can be carried out in time proportional to n^2. Thus Strassen's algorithm has Θ(n^(log_2 7)) running time.

4.4.4 Solving a Linear System of Equations

As the final example of this section, consider solving for X the system of linear equations AX = B. A divide-and-conquer approach to this problem would attempt to break this problem into several subproblems of the same kind. By adding multiples of some n/2 of the rows of A
to the other n/2 rows of A, A can be reduced to the form

( A1  A2 )
( 0   A3 )

and by using the same transformations, the vector B will be transformed to (B1; B2). If we also split X into (X1; X2), the solution to the original problem can be given as the solutions to

A3 X2 = B2
A1 X1 = B1 - A2 X2.

The equation for the running time of this algorithm is T(n) = 2T(n/2) + bn^3, because we have two half-size subproblems, and it takes time proportional to n^3 to reduce A to the required special form. Since a = 2, c = 2, m = 3, we have a < c^m and T(n) = Θ(n^m) = Θ(n^3). There is nothing that particularly recommends this algorithm (it has the same running time as the standard Gaussian elimination algorithm), but it is an example of a divide-and-conquer algorithm with a < c^m. An example of a divide-and-conquer algorithm with a = c^m is MERGESORT. Examples of divide-and-conquer algorithms with a > c^m are the polynomial multiplication and matrix multiplication algorithms.

4.4.5 Newton's Method

Newton's method is often used to find the root of a function. It makes use of the idea of tangent sliding, that is, approximating the function by a straight line (its tangent) and then finding where this straight line crosses the axis. If a function f has a root R, then

0 = f(R) ≈ f(x) + (R - x) f'(x)

or, rewriting,

R ≈ N(x) = x - f(x)/f'(x).

Under reasonable hypotheses, starting from a good initial approximation, Newton's method will double the number of correct digits. So if you have an approximation, say x, and x agrees with R on the first 5 bits, then N(x) will agree with R on about the first 10 bits. Of course, this procedure can be iterated, so N(N(x)) will agree with R on about the first 20 bits, and so forth.

Newton's method can also be used to calculate functions. In particular, Newton's method can be used to calculate square root and reciprocal. To compute sqrt(A), use the function f(x) = x^2 - A, which clearly has sqrt(A) as a root. Here f'(x) = 2x, and so

N(x) = x - (x^2 - A)/(2x) = x/2 + A/(2x) = (1/2)(x + A/x).

So if you know how to divide, you can quickly compute square root. Let T(n) be the time to compute sqrt(A) correct to n bits. Then the runtime equation is

T(n) = T(n/2) + bn^2

because Newton's formula will double the number of correct bits, and the operations involved in the formula are a division by 2 (which is merely a shift), a division of A by x (which will take O(n^2) to compute n bits), and an addition (which will take O(n) bit operations). So the time in bit operations to use the Newton formula will be bounded above by bn^2, at least for big enough n. The runtime equation can be solved by our usual method, where a = 1, c = 2, and m = 2. Since a < c^m, T(n) = Θ(n^m) = Θ(n^2).

A similar procedure will allow us to divide using Newton's method and multiplication. Since A/B = A(1/B), if we can compute 1/B, then one multiplication suffices to compute A/B. To compute 1/B, we use f(x) = 1/x - B, which clearly has 1/B as its only root. Since f'(x) = -1/x^2,

N(x) = x - (1/x - B)/(-1/x^2) = x + x^2(1/x - B) = 2x - Bx^2 = x(2 - Bx).

Let T(n) be the time in bit operations to compute 1/B correct to n bits; then

T(n) = T(n/2) + bn^2.

As in the case of square root, T(n/2) is the time to compute the answer correct to n/2 bits, and bn^2 is an upper bound on the time to compute N(x). Notice that to compute N(x), the formula uses a subtraction, which takes O(n), and two multiplications, B * x and x * (2 - Bx), and each multiplication can use O(n^2) bit operations. So again bn^2 is an upper bound on the time to compute N(x) correct to n bits. As before, we note a = 1, c = 2, and m = 2, and since a < c^m, T(n) = Θ(n^m) = Θ(n^2).

It is worth noting that these uses of Newton's method demonstrate that division and square root can be computed in the same time order as multiplication. Although we have only demonstrated this result for standard Θ(n^2) multiplication, notice that a < c^m will hold if m >= 1, so even if we could multiply n bit numbers in Θ(n) time we would still have T_SQRT(n) = T_DIV(n) = Θ(T_MULT(n)). The technique of the next section will allow multiplication in time only slightly worse than Θ(n), and this faster multiplication can be used to speed up both division and square root.

4.5 Polynomial Multiplication and Fast Fourier Transform

In practice, the polynomial multiplication algorithms of the last section are not used, because there is a much faster
algorithm. The faster algorithm is based on the idea that there are two ways to represent a polynomial. A polynomial with n coefficients may be represented by its coefficients, or it may be represented by the values of the polynomial at n distinct points. Either representation can be calculated from the other representation. This can be represented schematically by

                 evaluation
  coefficients  ----------->  values
                <-----------
                interpolation

If we have the coefficients of two polynomials, each with n/2 coefficients, and we want the n - 1 coefficients of the product polynomial, we could evaluate each of the input polynomials at the same n - 1 points and multiply the corresponding values to obtain the values of the product polynomial at these n - 1 points. To obtain the desired coefficients, we could interpolate a polynomial with n - 1 coefficients through these points.

The difficulty with this approach is that the standard methods for evaluation and interpolation are both Θ(n^2), so using them would result in a Θ(n^2) method for multiplying polynomials. On the other hand, there is nothing in the above discussion which forces us to use any particular set of points as evaluation and interpolation points. It is easy to evaluate a polynomial at certain points. For example, the value of a polynomial at 0 is simply one coefficient of the polynomial, and the value of a polynomial at 1 is simply the sum of the coefficients. It is difficult to see how to generalize the idea of evaluation at 0, but the idea of evaluation at 1 can be generalized if we are willing to allow complex numbers.

A complex number w is an nth root of unity if w^n = 1. We know by the fundamental theorem of algebra that there are n complex numbers which are nth roots of unity. A primitive nth root of unity is a complex number w such that w^n = 1 and w^j ≠ 1 for 1 <= j <= n - 1. A primitive root is useful because its powers w^0, w^1, w^2, ..., w^(n-1) give us the n numbers which are nth roots of unity. We can use as our primitive roots the complex numbers e^(2πi/n), which can be calculated by

w = cos(2π/n) + i sin(2π/n).

These numbers only need to be calculated once and stored in a table. This table will also be useful because a primitive n/2-th root of unity is w^2, which will already be in your table.

To evaluate a polynomial a0 + a1 x + ... + a(n-1) x^(n-1) at the n roots of unity, we should compute

( w^0   w^0      ...   w^0           ) ( a0     )
( w^0   w^1      ...   w^(n-1)       ) ( a1     )
( w^0   w^2      ...   w^2(n-1)      ) ( a2     )
( ...                                ) ( ...    )
( w^0   w^(n-1)  ...   w^(n-1)(n-1)  ) ( a(n-1) )

Unfortunately, if we do this by the obvious algorithm, it will take time Θ(n^2), even if we have already calculated the entries in the matrix. However, the matrix has very special structure, and if we can make use of this structure we may be able to construct a faster algorithm.

To display the structure of the matrix, we will permute some of the columns. Since we are planning to break things in half, we will assume that n is a power of 2. Also, we will number the rows and columns from 0 to n - 1, so the indices can be represented using log n bits. Our permutation will interchange column j with column Rev(j), where Rev(j) is the number formed by reading the bits of j in reverse order. In our permuted matrix, location (i, j) will contain w^(i Rev(j)).

In the lower left quadrant the matrix will have i >= n/2, j < n/2, and Rev(j) will be even. So

w^(i Rev(j)) = w^((n/2) Rev(j)) w^((i - n/2) Rev(j)) = (-1)^Rev(j) w^((i - n/2) Rev(j)) = w^((i - n/2) Rev(j))

and the lower left quadrant will be identical to the upper left quadrant. The lower right quadrant has i >= n/2, j >= n/2, and Rev(j) is odd. So

w^(i Rev(j)) = -w^((i - n/2) Rev(j))

and the lower right quadrant is the negative of the upper right quadrant. The upper right quadrant has i < n/2, j >= n/2, and Rev(j) - 1 = Rev(j - n/2). So

w^(i Rev(j)) = w^i w^(i (Rev(j) - 1)) = w^i w^(i Rev(j - n/2))

and the upper right quadrant is identical to the upper left quadrant multiplied by the diagonal matrix whose ith diagonal entry is w^i. The upper left quadrant has i < n/2, j < n/2, and Rev(j) is even. So

w^(i Rev(j)) = (w^2)^(i Rev(j)/2).

The half-size permuted matrix will contain w^2 raised to the i Rev(j) power, because w^2 is a primitive n/2-th root of unity. Of course, in the half-size matrix Rev(j) will have log n - 1 bits. In the larger matrix, Rev(j)/2 simply
removes the low order bit, which is 0. Thus the upper left quadrant will be identical to the half-size permuted matrix.

From these considerations we have

F_2n = ( F_n    D F_n )
       ( F_n   -D F_n )

where D is a diagonal matrix whose kth entry is w^k, where w is a principal 2n-th root of unity. As we have seen, the matrix can be used to evaluate a polynomial at all the 2n roots of unity, and since it turns out that this evaluation gives the coefficients of the discrete Fourier expansion of the polynomial, the matrix is called the permuted Fourier matrix.

The form of the above matrix suggests a divide-and-conquer algorithm to compute the discrete Fourier transform of a vector. Since the algorithm will be faster than a method based directly on the definition of the Fourier transform, the algorithm is usually called the Fast Fourier Transform. Assume X is a vector with 2n components. Let X1 be the first n components of X, and let X2 be the last n components of X. Then

F_2n X = ( F_n X1 + D F_n X2 )
         ( F_n X1 - D F_n X2 ).

Notice F_n X1 and F_n X2 need only be computed once each. Their values can then be combined to give F_2n X. To obtain the nice form, the columns of the original matrix had to be permuted, so to obtain the effect of the original matrix on a vector, the vector must be permuted before the permuted matrix is applied. The whole algorithm to compute the discrete Fourier transform of Y is

PERMUTE Y into ( X1 )
               ( X2 )
COMPUTE F_n X1 and F_n X2 by recursive calls
COMBINE into ( F_n X1 + D F_n X2 )
             ( F_n X1 - D F_n X2 )

In this process the matrix F never really needs to exist. The F's in the above scheme are simply recursive calls to the procedure F. The nonzero elements of D are powers of w, so the powers of w can be computed once, stored in an array, and used from the array when necessary.

The running time for this algorithm can be considered in two parts: the time to permute, and the time for the recursive procedure F. The permutation uses the bit reversals of the numbers 0 through 2n - 1. Since these numbers can be represented using log n + 1 bits, the permutation can be computed in Θ(n log n). The
permutation can be done in Θ(n) if the bit reversals are precomputed. The running time for the procedure for F obeys the difference equation

T(2n) = 2T(n) + bn

because there are two half-size recursive calls, and the multiplication by D and the additions and subtractions can be carried out in time proportional to n. Thus T(2n) = Θ(n log n).

To use this fast algorithm for polynomial multiplication, we need a fast way to interpolate, or what is the same, a fast way to multiply a vector by the inverse of the F matrix. For the inverse procedure we need the idea of conjugate. The conjugate of a complex number a + bi is a - bi. We represent the conjugate of the complex number Z by Z*. The product ZZ* = a^2 + b^2, which is the square of the length of the vector which represents Z. For a root of unity w, the product ww* = 1, because the length of w is 1. The conjugate of a complex matrix is the transpose of the matrix with each element replaced by its conjugate. The inverse of the F matrix is its conjugate divided by the dimension, because

F_2n F_2n* = ( F_n    D F_n ) ( F_n*       F_n*    )
             ( F_n   -D F_n ) ( F_n* D*   -F_n* D* )

           = ( F_n F_n* + D F_n F_n* D*    F_n F_n* - D F_n F_n* D* )
             ( F_n F_n* - D F_n F_n* D*    F_n F_n* + D F_n F_n* D* )

           = ( 2nI   0   )
             ( 0     2nI ).

So F^(-1) X can be quickly computed, because F^(-1) X = (1/2n) F* X and F* X = (F^T X*)* = (P F P X*)*, where F^T is the transpose of F and P is the bit-reversal permutation matrix. The last equality follows because F^T = P F P.

Finally, the whole algorithm to compute the coefficients of the product polynomial is:

1. Place the n coefficients of the first polynomial in the first n components of the vector X of dimension 2n. The other n components of X will contain 0.
2. Similarly place the n coefficients of the second polynomial in the vector Y.
3. Permute both X and Y.
4. Use the recursive algorithm to compute both FX and FY.
5. Componentwise multiply FX and FY to obtain a vector Z.
6. Permute and conjugate Z.
7. Use the recursive algorithm to compute FZ.
8. Conjugate and divide each component of FZ by 2n.

The resulting vector will contain the 2n - 1 coefficients of the product polynomial as its first 2n - 1 components. The last component should contain 0.
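The evaluate, multiply pointwise, interpolate scheme above can be sketched in Python. This is a minimal illustration rather than the text's in-place bit-reversal procedure: it uses a simple recursive FFT, it realizes the inverse transform by negating the sign of the root (which has the same effect as the conjugation trick), and the names `fft` and `poly_multiply` are our own.

```python
import cmath

def fft(a, invert=False):
    # Recursive radix-2 FFT; len(a) must be a power of 2.
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    result = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # kth power of the root
        result[k] = even[k] + w * odd[k]
        result[k + n // 2] = even[k] - w * odd[k]
    return result

def poly_multiply(p, q):
    # Coefficients are listed in increasing order of degree.
    # Pad to a power of 2 at least as large as the product's length.
    n = 1
    while n < len(p) + len(q) - 1:
        n *= 2
    fp = fft([complex(c) for c in p] + [0j] * (n - len(p)))
    fq = fft([complex(c) for c in q] + [0j] * (n - len(q)))
    fz = [x * y for x, y in zip(fp, fq)]   # pointwise multiply the values
    z = fft(fz, invert=True)               # interpolate via the inverse FFT
    return [round((c / n).real) for c in z[:len(p) + len(q) - 1]]
```

For example, `poly_multiply([1, 2], [3, 4])` computes the coefficients of (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2. The rounding in the last line assumes integer coefficients; with floating point input the raw real parts would be returned instead.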
Since this algorithm will likely be carried out using floating point arithmetic, the last component will likely not be exactly 0, but the value in this component will give us an estimate of how exact the other components are.

4.6 How Practical Are Divide-and-Conquer Algorithms?

After creating algorithms in the abstract world of analysis of algorithms, we should now look back to the real world of programs and ask if the divide-and-conquer algorithms which are theoretically faster will be practically faster, and whether they will be practical at all. There are several reasons to be skeptical about the practicality of these algorithms. In our abstract world, recursion was available at no cost; in the real world, recursion may be unavailable, or it may have a high cost. In the abstract world we computed time only to order, not distinguishing 1023n from 2n^2, but in the real world these functions are very different.

Let us address recursion first. The tempting thing for a theoretician is to say that all modern programming languages have recursion, so there is no problem. Unfortunately, large amounts of programming are done in older languages which do not support recursion. So there is still some reason for considering converting recursive algorithms to iterative algorithms. Even when recursion is available, it may not be a good idea to use it. Theoretically, we often assume that passing data to procedures is free. In the analysis in this section, we have ignored space usage, and we have particularly ignored the extra space used to maintain stacks for the recursive procedures. In the real world, neither of these should be ignored. Sometimes by doing a space analysis we find that the recursive stack does not get very big, so we are justified in ignoring it. In other cases we find that the recursive stack gets very large because we are putting copy after copy of the same data on the stack. This problem can often be overcome by having only a single global copy of the data and, instead of passing the data to the recursive
procedure, passing only a pointer to the particular part of the global copy that is needed. In some cases we find that data passed to the recursive routine may be replaced by a much simpler global structure. In the Towers of Hanoi example, the number of disks was repeatedly passed to the recursive procedure, but the equivalent information could be maintained in a global counter.

There are a number of methods which will suffice to convert a recursive routine to an iterative routine, but if they are general enough to work for all recursive routines, they are too general to result in any saving in time or space. Special features of a particular recursive routine can often be exploited in producing a more efficient iterative routine. We have seen how special features can be exploited in the Towers of Hanoi example. As another example, the MERGESORT algorithm works from the top down, splitting a big array and passing each half to the recursive routine. This routine can be made iterative by working bottom-up. Consider each element as a sorted array of size 1, and merge these 2 by 2 until you have sorted arrays of size 2. Again merge these 2 by 2 until you have sorted arrays of size 4. Continue merging until the entire array is sorted. This can be accomplished without passing any arrays to procedures, by keeping track of the size and the beginning and ending indices of the subarrays you are merging. In summary, it is often worthwhile to convert recursive algorithms to iterative algorithms, but you should exploit the special features of the problem to create an efficient iterative algorithm.

Are faster algorithms always faster? Probably not. In our analysis we have concentrated on asymptotic time order. We expect our faster algorithms to be faster than slower algorithms for big enough inputs, but the slower algorithms may well be faster for small inputs. For example, in sorting, there are Θ(n^2) algorithms which are faster than a Θ(n log n) algorithm for n < 20, but for n >= 20 the Θ(n log n) algorithm is faster.
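The bottom-up conversion just described can be sketched in Python. This is an illustrative sketch with names of our own choosing; for simplicity it keeps an explicit list of sorted runs and merges them pairwise, doubling the run length on each pass, rather than tracking begin and end indices within a single array as the text suggests.

```python
def merge(left, right):
    # Merge two sorted lists into one sorted list.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def mergesort_iterative(a):
    # Bottom-up mergesort: treat each element as a sorted run of size 1,
    # then merge runs 2 by 2, doubling the run size on each pass.
    runs = [[x] for x in a]
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs) - 1, 2):
            merged.append(merge(runs[k], runs[k + 1]))
        if len(runs) % 2 == 1:        # an odd run is carried to the next pass
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []
```

Each pass does Θ(n) merging work and halves the number of runs, so there are about log n passes, matching the Θ(n log n) bound without any recursion stack.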
In this case, unless we are dealing with very small data sets, the faster algorithm really is faster. On the other hand, there are Θ(n^2.5) algorithms for matrix multiplication. These algorithms don't become faster than the standard Θ(n^3) algorithm until one is dealing with 50,000 x 50,000 matrices. Since no one is presently trying to multiply matrices this large, the Θ(n^2.5) algorithms are only theoretically interesting.

The algorithms we have designed may fail to be practical in other ways. The algorithms are not foolproof; that is, the algorithms assume that they will receive the type of data they are expecting, and their behavior on unexpected data may well be rather strange. Practical programs should check the input data to make sure it is of the expected kind, and issue a warning if the data is not of the expected kind. The algorithm may also not be practical if it solves the wrong problem. For example, the FFT based polynomial multiplication algorithm is designed for dense polynomials, that is, polynomials in which all or almost all the coefficients are nonzero, but many large scale polynomial multiplication problems arise in which the polynomials are sparse, that is, have almost all the coefficients equal to zero. For such sparse problems, there are algorithms which will easily outperform the FFT based algorithm. Similarly, there are special algorithms for sorting which will be slow in general but will be very fast when the input data is almost sorted.

In summary, there are practical issues which should be considered before a theoretically good algorithm is used as a practical program.

Exercises

Ex 4.1 You have a large number of coins and a pan balance. You may put any number of coins in each pan of the balance. The balance will tell you if the set of coins in one pan weighs the same as the set of coins in the other pan, or it will tell you which set is heavier. Somewhere among your coins is one coin which has a different weight from the other coins. All the coins except this odd coin have exactly
the same weight. The problem is to find the odd coin. Design a divide-and-conquer algorithm to solve this problem. You may assume that the number of coins is a power of 3. Prove that your algorithm is correct. Give and solve a difference equation for the number of times your algorithm uses the balance.

Ex 4.2 Design a divide-and-conquer algorithm to find the two largest elements in an array. Prove that your algorithm is correct. Calculate the number of comparisons it uses. Show by example that your algorithm uses more comparisons than necessary.

Ex 4.3 Two strings C1 and C2 commute, that is, C1C2 = C2C1, iff there is a string w and two natural numbers k1 and k2 so that C1 = w^k1 and C2 = w^k2. Start from an inductive proof that w exists, and construct an algorithm to find w. Here w^k means ww...w, that is, k copies of w.

Chapter 5

AVERAGE CASE

5.1 What is Average Case?

Up till now we have considered the running time of an algorithm to be a function of the size of the input, but what happens when there are several different inputs of the same size? An algorithm may treat all inputs of the same size in the same way, or it may handle some inputs more quickly and some inputs more slowly. The maximum of the running time over all inputs of the same size is called the worst case running time. The minimum running time over all inputs of the same size is called the best case running time. The running time averaged over all inputs of the same size is called the average case running time. The average case running time is not the same as the average of the worst case and the best case. It is often difficult to calculate the average case time, because the probability associated with each of the various inputs of a particular size is unknown. For definiteness and simplicity, it is often assumed that each input of the same size is equally likely to occur. With this assumption, average case can be calculated.

5.2 Some Examples of Average Case Behavior

As an example of average case behavior, consider the following algorithm:

PROCEDURE LARGETWO(B)
  FIRST := B[1]
  SEC := B[2]
  FOR I := 2 TO n DO
    IF B[I] > FIRST THEN
      SEC := FIRST
      FIRST := B[I]
    ELSE IF B[I] > SEC THEN
      SEC := B[I]

This algorithm should find the two largest elements in an array. We will consider the number of comparisons of array elements it uses to accomplish this task. In the FOR loop, for each I, the algorithm makes either 1 or 2 comparisons. In best case the algorithm makes 1 comparison for each I, giving a total of n - 1 comparisons. In worst case the algorithm makes 2 comparisons for each I, giving a total of 2(n - 1) comparisons.

Let A(n) be the number of comparisons used on average by this algorithm. Since the algorithm proceeds in one direction across the array, we may reasonably assume that for the first n - 1 elements the algorithm will on average use A(n - 1) comparisons, that is, the same number it would use if the last element did not exist. For the last element it will use at least one comparison. It will use a second comparison exactly when B[n] <= FIRST, but since FIRST will be the largest of the first n - 1 elements, the algorithm will use a second comparison as long as B[n] is not the largest element in the array. The probability that B[n] is not the largest element is (n - 1)/n, if we assume that each element is equally likely to be the largest element. From these considerations we have

A(n) = A(n - 1) + 1 + (n - 1)/n = A(n - 1) + 2 - 1/n.

For an initial condition we have A(2) = 3/2, since for two elements the algorithm is equally likely to use one or two comparisons. The solution to this difference equation is

A(n) = 2n - 1 - Σ_{i=1}^{n} 1/i

and for large n, A(n) → 2n - 1 - log n + const. So we have that the average case of this algorithm is very close to the algorithm's worst case. Notice that if we took (worst + best)/2, we would get (3/2)(n - 1), which would be a severe underestimate of the average case. As an aside, we should mention that this is a poorly set up algorithm, since it assumes that the array has at least two locations. If this algorithm were used with an array containing a single element, the results would be unpredictable.

The QUICKSORT algorithm
has a more complicated average case analysis. The algorithm is

PROCEDURE QUICKSORT(A)
  PICK AN ELEMENT a OF A AT RANDOM
  DIVIDE A INTO
    A1, THE ELEMENTS OF A WHICH ARE LESS THAN a
    A2, THE ELEMENTS OF A WHICH EQUAL a
        (WE WILL ASSUME THERE IS ONLY ONE SUCH ELEMENT)
    A3, THE ELEMENTS OF A WHICH ARE GREATER THAN a
  RETURN QUICKSORT(A1) ∘ A2 ∘ QUICKSORT(A3)

where ∘ denotes concatenation of the sorted pieces. The worst case for QUICKSORT occurs when A1 or A3 contains n - 1 elements. If we let W(n) stand for the worst case running time of QUICKSORT, we have

W(n) = W(n - 1) + bn

because the separation into A1, A2, A3 takes time proportional to n. From this we have W(n) = Θ(n^2). The best case for QUICKSORT occurs when A1 and A3 each contain approximately n/2 elements. If we let B(n) be the best case running time, we have approximately

B(n) = 2B(n/2) + bn

and B(n) = Θ(n log n).

For the worst case to occur, each recursive splitting must have one of the sets A1 or A3 empty. For the best case to occur, each recursive splitting must have A1 and A3 of approximately equal size. Will the average case be like the worst case or like the best case? We might expect nearly equal splits to occur more frequently than one-sided splits, so we could guess that average case is probably closer to best case than to worst case. If we let A(n) be the average case running time and assume that every split is equally likely, we have

A(n) = (1/n) Σ_{k=1}^{n} [ A(k - 1) + A(n - k) ] + bn + 1

where bn + 1 is the time to split and combine. Then

nA(n) - (n - 1)A(n - 1) = Σ_{k=1}^{n} A(k - 1) - Σ_{k=1}^{n-1} A(k - 1)
                        + Σ_{k=1}^{n} A(n - k) - Σ_{k=1}^{n-1} A(n - 1 - k)
                        + (bn + 1)n - (b(n - 1) + 1)(n - 1)
                        ≈ 2A(n - 1) + 2bn

and so nA(n) = (n + 1)A(n - 1) + 2bn, and

A(n)/(n + 1) = A(n - 1)/n + 2b/(n + 1).

Letting Z(n) = A(n)/(n + 1), we have

Z(n) = Z(n - 1) + 2b/(n + 1)

which has the solution Z(n) = c1 + 2b Σ_{i=1}^{n} 1/(i + 1). But then Z(n) = Θ(c2 + c3 log n), so A(n) = Θ(c2(n + 1) + c3(n + 1) log n), giving A(n) = Θ(n log n).

In summary, the average case behavior of an algorithm is somewhere between the worst case and the best case behavior, but it may be very close to the best case behavior or very close to the worst case behavior.

Exercises

Ex 5.1 A number of useful tricks were used in deriving the average case behavior of QUICKSORT. For the procedure LARGETWO, set up a difference equation with a
summation, by assuming that each element is equally likely to be the largest element. Then employ the techniques used on QUICKSORT to obtain the simple difference equation we used for LARGETWO.

Ex 5.2 For the following procedure, do an average case analysis. Is the average case nearer to worst case or to best case?

PROCEDURE BIGTWO(B)
  FIRST := B[1]
  SEC := B[2]
  FOR I := 2 TO n DO
    IF B[I] >= SEC THEN
      IF B[I] > FIRST THEN
        SEC := FIRST
        FIRST := B[I]
      ELSE
        SEC := B[I]

Chapter 6

LOWER BOUNDS

6.1 Trivial Lower Bounds

A lower bound is a statement that every algorithm which solves a particular problem must use at least so much of a particular resource. We have already met some lower bounds in the Towers of Hanoi example. There we argued that every algorithm for the problem must take at least Ω(2^n) time, because the output must contain 2^n - 1 moves. Such a lower bound is called a trivial lower bound, because it is obvious that an algorithm must take at least as much time as it takes to print its output. Although such a bound is called trivial, it may not be trivial to compute the length of the output. For example, it was not immediately obvious that the output for the Towers of Hanoi must have length 2^n - 1.

Trivial lower bounds can also depend on the input. If we can show that to compute the correct answer an algorithm must read all of its input, then the algorithm must use time which is at least as great as the length of the input. For example, if an algorithm is supposed to multiply an n x n matrix times an n component vector, we have the trivial lower bound of Ω(n^2), because if the algorithm does not read some of the input, then there are vectors and matrices for which the algorithm gives the wrong answer. As another example, consider an algorithm which is supposed to find the largest element in an n element array. This algorithm must take time at least Ω(n), since if it doesn't look at all the elements in the array, there are arrays for which the algorithm will give an incorrect answer. There are problems in which an algorithm does not have to
look at the whole input. For example, to determine if a list is empty, an algorithm only has to look at the first element of the list; it does not need to look at all elements of the list.

Input and output lower bounds can vary widely. Sometimes the output bound will be larger than the input bound. Sometimes the input bound will be larger than the output bound. In the Towers of Hanoi problem, the output bound is 2^n, but the input bound is only log n because n and the names of the towers can be specified in log n bits. In the variant of the Towers of Hanoi in which one is asked if a given configuration is used in moving the disks from tower A to tower C, the output bound is only 1, since the yes or no answer can be specified with a single bit, while the input bound is n because n bits are needed to specify which tower contains which disk.

So far our lower bounds have been best case bounds; that is, even in best case, every algorithm must use at least the time specified by the lower bound. Consider the problem of determining if a list contains a particular element. The desired element could be the first element in the list, so the best case lower bound is 1. On the other hand, to be sure that the desired element is not in the list, an algorithm must look at every element, so the worst case lower bound is n.

6.2 Lower Bounds on Specific Operations

The trivial lower bounds merely state that an algorithm must do the required input and output; they say nothing about any computation the algorithm must perform. To obtain lower bounds involving specific operations, the operations which the algorithm is allowed to use must be specified. In this section we will consider algorithms in which the only allowed operations on input elements are comparisons. This will suffice for the problems considered. For other problems, the operations of addition, subtraction, and multiplication might be allowed. We refer the interested reader to Winograd [1980].

Let us consider deriving a lower bound on the number of comparisons needed
to find the largest element in an array. We use an input bound argument. If some element is not compared to any other element, then either the algorithm says that the uncompared element is the largest element, which will be false in some cases, or the algorithm says that some other element is largest, which will be false when the uncompared element is largest. Since elements can be compared two at a time, we have that any algorithm which correctly finds the largest using only comparisons must use at least n/2 comparisons.

We have that n/2 comparisons are needed, but we expect that n − 1 comparisons are needed, because our methods for finding the largest use n − 1 comparisons. We can consider an algorithm as forming a directed graph. Each time a comparison is made, a directed edge from the larger to the smaller element is put into the graph. When the largest element has been found, then for every other element there will be a directed path in the graph from the largest to the other element. If not, we could replace the element without a path by an element larger than the element that the algorithm reports to be largest. Since the algorithm is assumed to use only comparisons, the algorithm cannot detect this replacement. So if not all these paths exist, then the algorithm will give incorrect answers. If all these paths exist, then the graph formed from the directed graph by replacing the directed edges with undirected edges will be connected. Since a connected graph has at least n − 1 edges, and since each edge corresponds to a comparison, n − 1 comparisons are needed.

A different way to obtain this lower bound is to think of finding the largest as a dynamic system. As an algorithm proceeds, it classifies each element as belonging to one of three sets:

- the set U, which contains elements which have been compared to no other element;
- the set S of elements which have been compared to some other element and have been found to be smaller;
- the set L of elements which have been compared to some other element
and have been found to be larger than any element they have been compared to.

The states of the dynamic system will be vectors with three integer components giving the sizes of the sets U, S, and L. When the algorithm starts, the dynamic system will be in state (n, 0, 0), and when the algorithm correctly terminates, the dynamic system will be in state (0, n − 1, 1). In each comparison of the algorithm, the state changes in one of the six following ways, where the sets from which the elements to be compared are drawn are indicated above the arrows:

(U, S, L) --U,U--> (U − 2, S + 1, L + 1)
(U, S, L) --U,S--> (U − 1, S, L + 1) or (U − 1, S + 1, L)
(U, S, L) --U,L--> (U − 1, S + 1, L)
(U, S, L) --S,S--> (U, S, L)
(U, S, L) --S,L--> (U, S + 1, L − 1) or (U, S, L)
(U, S, L) --L,L--> (U, S + 1, L − 1)

Since we have a model of the action of any algorithm which finds the largest element using only comparisons, we can establish a lower bound for the algorithm by establishing a lower bound on the number of transitions in the dynamic system required to go from state (n, 0, 0) to state (0, n − 1, 1). Consider the middle component. It must increase from 0 to n − 1, but this component can increase by at most 1 in any of the transitions. Hence at least n − 1 transitions in the dynamic system, and n − 1 comparisons in the algorithm, are required.

There is an even simpler argument that n − 1 comparisons are needed. This argument is easy to remember because, like many fables, it makes use of a story about animals. In a farmyard there are n chickens, and they need to decide who is Boss chicken. Being chickens, the only way they can decide is by having fights between two chickens. What is the fewest number of fights needed to find the Boss? Clearly, when the Boss is found, every other chicken must be known not to be Boss, and a chicken cannot be Boss if it has lost a fight. Hence each of n − 1 chickens must have lost a fight, and since each fight has only one loser, there must have been at least n − 1 fights. This chicken argument translates immediately into the find largest problem. The chickens are elements. The fights are comparisons between elements. So by the
chicken argument, at least n − 1 comparisons are needed to find the largest among n elements.

It is important to notice that this result really does depend on the type of algorithm being considered. By allowing different operations, the largest element can be located using only log n comparisons. Consider the following algorithm:

FUNCTION BIG(i, j)
  IF j = i + 1 THEN
    IF a[i] >= a[j] THEN BIG := i ELSE BIG := j
  ELSE
    IF NUM[i] + ... + NUM[(i+j)/2] >= NUM[(i+j)/2 + 1] + ... + NUM[j] THEN
      BIG := BIG(i, (i+j)/2)
    ELSE
      BIG := BIG((i+j)/2 + 1, j)
END FUNCTION

FOR I := 1 TO n DO NUM[I] := n ** a[I]
LARGEST := a[BIG(1, n)]

In this algorithm we are assuming that the elements are distinct positive integers. The ** indicates exponentiation. Let C(n) be the number of comparisons used by the algorithm with n elements; then C(n) = C(n/2) + 1 and C(2) = 1. Thus C(n) = log n. (If you also want to count the comparison between j and i + 1, there are 2 log n comparisons.) The trick here is to make sure that the largest element is in the half-size subset with the largest sum. Consider the case in which LARGEST is in the first half. Then

NUM[n/2 + 1] + ... + NUM[n] <= n + n^2 + ... + n^(LARGEST − 1) < n^LARGEST <= NUM[1] + ... + NUM[n/2],

so this algorithm will work correctly. The use of exponentiation allows us to make the largest element so large that we can do a binary search through the subsets and locate the largest element quickly.

This example should indicate that lower bound proofs are sensitive to the operations allowed in the algorithm. It is thus important in lower bound arguments to indicate the class of algorithms being considered.

6.3 Lower Bounds on Sorting

A very traditional and ubiquitous problem is sorting: given n elements from a totally ordered set, put them in order from the smallest to the largest. We have already encountered several algorithms for this problem. Some had n^2 worst case running time. Some had n log n worst case running time. Some had n log n average case running time. Some had n log n best case running time. From these examples, one might conjecture that an n log n lower bound could be established for best case, and hence for average and worst
cases, at least for algorithms which sort by using only comparisons. Unfortunately, this conjecture is false. If the elements are already sorted, an algorithm could check in only n − 1 comparisons that the elements were in order. It is not essential that the elements actually be in order; for any particular arrangement of the elements, an algorithm can be created which tests for this arrangement using n − 1 comparisons, and if the elements are in this particular arrangement, the algorithm will sort them with no additional comparisons. So any best case lower bound for sorting must be less than or equal to n − 1 comparisons. But no smaller lower bound is possible, because if an array is sorted, then the largest element can be found using no comparisons, since the largest element is the last element in the array. Since we have already demonstrated a best case lower bound of n − 1 comparisons to find the largest element, n − 1 comparisons is also the best case lower bound for sorting.

It still is reasonable to believe that a lower bound of n log n may be possible for worst case and average case. To obtain these bounds, we introduce a model of computation called the comparison tree. Each internal node of a comparison tree contains an expression like a : b, which means compare a to b. From a comparison node there are two arrows to other nodes. One arrow is labeled a <= b and the other arrow is labeled a > b. There are also external nodes, or leaves, which indicate the termination of the tree. The comparison tree should capture the idea of an algorithm which uses only comparisons. The algorithm starts at the root of the tree, does the comparison there, and then follows the appropriate arrow. This process continues until a leaf is reached. The following picture shows a comparison tree for an algorithm which sorts 3 elements. In this example there are 6 leaves because there are 6 possible orderings of 3 elements. A comparison tree for sorting n elements will have n! leaves. In this example there are sets
which will be sorted using only 2 comparisons, but there are also sets for which this algorithm uses 3 comparisons. Let us define the depth of the root of a tree as depth 0, and the depth of a node in the tree as 1 + the depth of its parent node. Then the maximum number of comparisons to sort any set using a particular comparison tree algorithm is the maximum depth of a node in the tree. In the example there is a node (a leaf) of depth 3, and there is a set on which this algorithm uses 3 comparisons.

To obtain a worst case lower bound, we will consider the depth of comparison trees for sorting. Since there are only two arrows from each node, the number of nodes can only double when the depth is increased by 1. So there are at most 2^D nodes at depth D, and summing the nodes at every depth, there are at most 2^(D+1) nodes in a tree with maximum depth D. Since there are n! possible orderings of n things, a comparison tree for sorting must have at least n! leaves. From these two facts we have 2^(D+1) >= n!. Taking logs and using Stirling's approximation, that n! ~ n^n e^(−n) sqrt(2 pi n), we have D >= log n! ~ n log n, and hence n log n is a worst case lower bound for the number of comparisons used by an algorithm which sorts using only comparisons.

The comparison tree model can also be used to produce an average case lower bound. The average number of comparisons will be the average depth of a leaf of the comparison tree. By average here we mean that we assume that each ordering of the n elements is equally likely, and hence each ordering has probability 1/n!. Now the average depth of a leaf will be at least as great as the average depth of a node. To minimize the average depth of a node, we will consider the full tree with total depth L, where log n! − 1 < L <= log n!. By a full tree we mean a tree which has exactly 2^i nodes at depth i. This full tree will give us a lower bound, because any comparison tree for sorting has at least n! leaves, and we are making their depths as small as possible. The average depth of a node in our full
tree is at least

(1/2^(L+1)) (0·2^0 + 1·2^1 + ... + L·2^L) >= L − 1 > log(n!) − 2.

Again using Stirling's approximation, this is asymptotic to n log n, and n log n is an average case lower bound for sorting. We summarize these results in the following proposition.

Proposition 10 Any algorithm which sorts using only comparisons must use at least
1. n − 1 comparisons in best case,
2. n log n comparisons in average case,
3. n log n comparisons in worst case.

We close this section with two reminders. One: average case here assumes that all n! orderings are equally likely. If there are significantly fewer orderings which are very likely to occur, then the above average case bound does not hold. For example, if the inputs are almost-sorted, then there are n average case algorithms. Two: the bounds in the proposition apply for algorithms which sort using only comparisons. If different types of operations are allowed, then these bounds may not hold.

Exercises

Ex 6.1 Prove a worst case lower bound of (3/2)n − 2 comparisons for any algorithm which finds the largest and smallest elements in an array by using only comparisons. Devise a divide-and-conquer algorithm for this problem. Show that your algorithm uses (3/2)n − 2 comparisons.

Ex 6.2 Create an algorithm which finds the largest and second largest elements in an array and uses only n + log n − 2 comparisons. HINT: Keep track of the elements which are potentially the largest, and for each such element keep track of those elements which have lost only to this element.

Chapter 7

EXHAUSTIVE SEARCH

7.1 Straightforward Exhaustive Search

The algorithms which we have considered so far have been based on the divide-and-conquer strategy; that is, try to break the problem into several smaller problems of the same kind. An exhaustive search is a different strategy for designing algorithms. This strategy is based on the idea of trying all possible answers. Either the solution is located or no solution exists. A simple problem for which this strategy is reasonable is: given a list and a value, is there an element of the
list which contains the given value? An exhaustive search algorithm would look at each element in turn until either one with the given value is found or until all of the elements have been viewed. If there are n elements in the list, then this algorithm will take 1 comparison in best case and n in worst case, and these are the best possible running times for any algorithm for this problem.

There are also problems for which straightforward exhaustive search gives very poor algorithms. Consider sorting. A straightforward exhaustive search would try each possible ordering until the sorted ordering is found. Unfortunately, there are n! possible orderings, so in worst case, when the last ordering is the correct one, this algorithm will take at least n!, which is much, much greater than the n log n taken by algorithms we have already found.

Satisfiability is another problem to which exhaustive search can be applied. The satisfiability problem is: given a Boolean expression, is there an assignment of true and false to the variables which makes the expression true? If there are n variables, there are 2^n possible true-false assignments for the variables, and since a Boolean expression of length L can be evaluated in L time, the exhaustive search algorithm will take L 2^n time in worst case. Unlike sorting, it is not clear that there are faster algorithms for this problem. Also, unlike the find-value problem, it is not clear that this is the best running time possible for this problem.

7.2 Backtracking

While straightforward exhaustive search may be useful in some problems, there is usually additional information available which will allow an algorithm to eliminate some of the possibilities. One way to make use of this additional information is called backtracking. In a backtrack algorithm the search is broken into stages, and at each stage only the possibilities which can still lead to a solution are considered. The name backtracking comes from a method for finding a way through a maze. At a choicepoint in the maze,
pick an arbitrary path which has not yet been used. Continue making such choices until you reach a dead end or a choicepoint at which all paths have already been used. Then go back on the path you have come down to the previous choicepoint and continue the method. This last step, going back on a path, is called backtracking, and this name is also given to the general method. So in the general backtracking method:

- the search is broken into stages;
- at each stage a choice is made;
- when the algorithm determines that none of the choices at a stage can lead to a solution, then the algorithm backtracks by returning to the previous stage and taking one of the unused choices.

A problem to which backtracking can be directly applied is Hamiltonian path: given a graph, find a sequence of vertices which contains each vertex exactly once and such that adjacent vertices in the sequence are connected by an edge in the graph. At each stage in the backtrack algorithm, an unused vertex sharing an edge with the current vertex is chosen. If it is impossible to choose such a vertex, the algorithm backtracks to the previous vertex and tries to choose a vertex other than the one that led to the dead end. If the graph contains a Hamiltonian path, this algorithm will eventually find it, because the algorithm generates all sequences of vertices with adjacent vertices connected by an edge and no repeated vertices. A straightforward exhaustive search for this problem would generate a permutation of the vertices and then test to see if the vertices adjacent in the permutation were also adjacent in the graph. In worst case, all permutations would be generated, and the exhaustive search algorithm would take at least n! time. The backtrack algorithm in worst case would generate at most the product of the d(i) paths, where d(i) is the degree of vertex i. So in worst case we expect the backtrack algorithm to be faster than the exhaustive search algorithm. The backtrack algorithm also has the advantage that it uses information about the graph
in generating a path. If the backtrack algorithm uses more information about the graph in picking the next vertex, we can reasonably expect the backtrack algorithm to work much more quickly in average case.

We will next consider exhaustive search and backtracking for the satisfiability problem. Consider the Boolean expression

X1 (~X1 ∨ X2) (~X3 ∨ X2)

Since this expression has 3 variables, there are 8 possible truth assignments, which we show in the following table:

X1 X2 X3   EXPRESSION
F  F  F    F
F  F  T    F
F  T  F    F
F  T  T    F
T  F  F    F
T  F  T    F
T  T  F    T
T  T  T    T

An exhaustive search algorithm would generate 7 out of 8 of these assignments before it found that the expression was satisfiable. A backtrack algorithm could proceed according to the following tree. It picks the first variable X1 and assigns it F; then X1 = F is substituted in the expression, yielding the expression F. Since F is not satisfiable, the algorithm backtracks and tries the assignment X1 = T. When this is substituted into the expression, it yields an expression which simplifies to X2. A reasonably clever algorithm would notice that this expression can be satisfied by the assignment X2 = T, and hence that the whole expression is satisfiable. A less clever algorithm might try X2 = F and have to backtrack once more before finding a satisfying assignment. In either case, it seems that the backtrack algorithm will be better than the exhaustive search algorithm.

The backtrack algorithm should simplify the expression at each stage. While this might seem complicated in general, it is very easy when the expression is in clause form. In clause form, a Boolean expression is a set of clauses ANDed together, and within each clause is a set of literals, that is, complemented and uncomplemented variables, ORed together. When a variable X is assigned the value T, all the clauses containing the literal X become true and can be removed from the expression, while all the clauses containing ~X are simplified by the removal of ~X. Similarly, if X is assigned the value F, then each clause containing ~X can be removed from the expression, and each clause containing
X can be simplified by the removal of X. One special case should be noted: if a clause containing a single variable has that variable removed, then the whole expression becomes false.

How should the backtrack algorithm choose which variable to assign at each stage? An algorithm could try assigning X1, then X2, and so forth, but this method does not take advantage of the structure of the expression. If an expression contains a clause with a single literal, then that literal must be set to true for the entire expression to be true. If an expression contains a variable X which only appears in the form X and never in the form ~X, then X can be assigned true, and all the clauses containing X can be removed without affecting the satisfiability of the expression. These two rules can be used to simplify the expression. There is one complication: if the expression contains a clause with the single literal X and another clause with the single literal ~X, then the expression is not satisfiable.

After the expression has been simplified as much as possible by the above rules, a method for choosing which assignment to make next is still needed. A method which seems reasonable is to pick the literal which appears in the most clauses and to make this literal true. This method is not guaranteed to find a satisfying assignment, but it will simplify the expression. Such a method, which is reasonable but not guaranteed to lead to a solution, is called a heuristic. A backtrack algorithm will still have to backtrack when the heuristic does not lead to a solution, but a well-chosen heuristic can greatly reduce the number of times the algorithm has to backtrack.

7.3 Knight's Tour

We have already mentioned the Hamiltonian path problem. A specialization of this problem is the knight's tour problem. In the knight's tour problem, the vertices of the graph are the squares of an n × m chessboard. Two squares are connected by an edge iff a knight can legally move from one square to the other. The knight can move two squares in one
direction and one square in an orthogonal direction. The knight's tour problem is to find a Hamiltonian path in this graph, or equivalently, move the knight legally around the chessboard so that it visits each square exactly once.

If n·m is odd, then from some starting squares there is no knight's tour. The traditional chessboard has squares of two colors, so that orthogonally adjacent squares are of different colors. If n·m is odd, then there are more squares of one color. Since the knight always moves from one color to the other, a tour starting from a square of the less numerous color is impossible.

To find a knight's tour, one could use the backtrack algorithm for Hamiltonian path, but as the last section suggests, some sort of heuristic should be used to choose the next square. One plausible idea is based on the degree of a square. The degree of a square is the number of unused squares the knight could go to if the knight were on this square. If the square has large degree, then there are many ways into and out of the square, and it is not too important to deal with this square immediately. If a square has degree 2, then there is one way in and one way out of the square, and this square should probably be used as soon as possible. So the heuristic is: next visit a possible square of lowest degree.

Even without this heuristic, a backtrack algorithm will find a knight's tour if one exists. With this heuristic, the backtrack algorithm may work faster. As well as picking the next square, the degree can also be used in deciding when to backtrack. Clearly, a backtrack is called for when the tour reaches a dead end, a square from which no unused square can be reached in one step. Just before the move into this dead end, the dead end square had degree 1. A square of degree 1 should be entered only if it is the last square on the board. So if the lowest degree among possible next squares is 1 and there is more than one unused square, then the algorithm should backtrack. Further, if there are ever two unused
squares of degree 1, then the algorithm should backtrack, since both of these squares cannot be the last square. Initially, squares near the center of the board will have degree 8, squares nearer an edge of the board will have degree less than 8, and the corner squares will have degree 2. As a partial tour is constructed, the degrees of the squares will decrease.

How well will this heuristic work? Is backtracking ever needed? We will answer these questions in a moment, but first we want to consider a special case. In the special case, n = m = 4k + 1 for k a positive integer, and the tour starts in one of the corner squares. For this special case, a knight's tour can be found using a simple rule with no backtracking. The simple rule is: choose as the next square the possible square nearest the edge of the board. The following picture shows a 5 × 5 board with each square containing a number indicating the order in which the knight visits the squares.

 1 14  9 20  3
24 19  2 15 10
13  8 25  4 21
18 23  6 11 16
 7 12 17 22  5

Notice that the knight starts in a corner and finishes the tour at the center of the board. It is clear that the simple rule needs some amplification. How did the knight at the square numbered 2 decide to go to the square numbered 3 rather than the square numbered 21 or the square numbered 13, and how did the knight at square 8 decide to go to square 9 rather than square 17? The extra tie-breaking rules are: choose the square with lowest initial degree; choose the first square in clockwise order. With these rules, one can prove inductively that a knight's tour will be found in this special case. The induction considers a (4(k+1) + 1) × (4(k+1) + 1) board as a (4k + 1) × (4k + 1) board surrounded by a two square border.

From this special case, we can see that in the general case we will also need tie-breaking rules to determine which square should be picked when two possible next squares have the same degree. From the special case, we could try the rule which says: choose the first such square in clockwise order.
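The lowest-degree heuristic is easy to experiment with. The following Python sketch (ours, not the text's program; the function name is our own, and ties are broken by move order rather than by the clockwise rule above) runs the heuristic with no backtracking and reports either a tour or failure:

```python
def knight_tour(n, m, start=(0, 0)):
    """Attempt a knight's tour of an n-by-m board using the lowest-degree
    heuristic, with no backtracking. Returns the list of visited squares
    in order, or None if the heuristic reaches a dead end."""
    moves = [(1, 2), (2, 1), (2, -1), (1, -2),
             (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    visited = {start}
    tour = [start]

    def neighbors(sq):
        # Unused squares reachable by one knight move from sq.
        r, c = sq
        return [(r + dr, c + dc) for dr, dc in moves
                if 0 <= r + dr < n and 0 <= c + dc < m
                and (r + dr, c + dc) not in visited]

    while len(tour) < n * m:
        options = neighbors(tour[-1])
        if not options:
            return None                  # dead end: a backtrack would be needed
        # The heuristic: next visit a possible square of lowest degree.
        nxt = min(options, key=lambda sq: len(neighbors(sq)))
        visited.add(nxt)
        tour.append(nxt)
    return tour
```

As the text goes on to note, the heuristic with a simple tie-breaking rule can fail for some boards and starting squares, so a full program would combine it with backtracking.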
Unfortunately, it can be shown that for some starting squares this tie-breaking rule does not give a knight's tour. Another possible tie-breaking rule is to "look ahead": choose the next possible square of lowest degree which has the lowest-degree following square. But again, this tie-breaking rule may fail. Further look-ahead would be possible, but it would take considerably more computation.

In spite of the fact that the heuristic, even with tie-breaking rules, does not work in all cases, it still works in a large majority of cases. Some empirical studies show that a backtracking algorithm using the heuristic uses no backtracks in most cases, one backtrack in a few cases, and 2 backtracks in rare cases, and never uses more than 2 backtracks. A proof of any of these empirical results would be interesting, but it is probably very difficult.

When heuristics work so well on a problem, one should consider whether there is a reasonable nonexhaustive algorithm for the problem. Cull & De Curtins [1978] proved that if min(n, m) >= 5, then there is a knight's tour from at least one starting square on an n × m board, and if n·m is even, there is a knight's tour starting from any square on the board. Their proof technique is constructive and takes n·m time to construct the knight's tour. Their paper does not contain a proof that if n·m is odd, then there is a knight's tour starting from every square of the more numerous color.

Could knight's tour still be difficult? Yes, if we allow boards with holes. Consider the 9 × 9 chessboard. We could remove the center 5 × 5 board and leave a 2 square wide border with a large hole in the middle. In spite of this hole, it is easy to start in a corner square and to move the knight so that it visits every square in the border exactly once. On the other hand, if we took the 5 × 5 chessboard and removed the center 3 × 3 board, the remaining board would not have a knight's tour. For these two simple boards, the existence of a knight's tour is still an easy question, but this question
becomes more difficult on general boards with holes. To make boards with holes, we allow any set of squares to be removed from the chessboard. The removed squares do not need to be contiguous or to form one hole. Recently, McGown, Leininger, and Cull [2004] showed that the existence of a knight's tour on a chessboard with holes is an NP-complete problem. For an explanation of NP-complete, see the next chapter.

7.4 The n Queens Problem

Backtracking is an effective method for the n-queens problem. Can n queens be placed on an n × n board so that no queen can take another queen; that is, can n objects be placed on an n × n board so that no two objects are in the same row, the same column, or the same diagonal? The answer to this question is no when n = 2 or 3, and yes otherwise. The problem becomes more difficult when one is asked for all the different ways to place n nontaking queens on an n × n board. Two ways are different if one cannot be obtained from the other by rotating and/or reflecting the board.

Since one queen will be placed in each row, we will take placing a queen in a row as a stage in a backtracking algorithm. The algorithm starts by placing a queen in column 1 of row 1; then it successively places queens in the lowest numbered allowed column of each successive row until either every row has a queen or there is no allowed place for a queen in the present row. If there is no allowed place to put a queen in the present row, then the algorithm backtracks and tries to place a queen in the next allowed place in the previous row. If a queen has been placed in each row, then this solution is output and the algorithm backtracks. The algorithm terminates when it has backtracked to row 1 and the queen in row 1 is in column n. This algorithm can be written as either a recursive or an iterative program. Since the algorithm is tail-recursive, we suggest writing it as an iterative algorithm. We leave the actual program as an exercise and refer the reader to Wirth [1976]. The algorithm we have outlined
will produce all allowed placements, not all different allowed placements. We still need a method which determines if two placements are different. Two placements are the same if one is the rotation/reflection of the other. The rotation by 90 degrees and the reflection generate the 8-element group of the symmetries of a square. By applying each element of this group to a placement, one will obtain the 8 placements which are equivalent. If one had a table of the different placements found so far, then one could look in the table and see if any of these 8 equivalent placements had already been found. But there is an easier way. Since a placement will consist of n numbers indicating the columns in which each of the n queens are placed, and each column number will be between 1 and n, a solution can be viewed as a single number in base n + 1. The algorithm is set up so that it generates the placements in numeric order. Thus, to see if a placement is different, apply to it each of the 7 nonidentity elements of the group and determine if any of these equivalent placements gives a lower number. If one of these equivalent placements gives a lower number, then this placement is equivalent to an already generated placement. If each of the 7 equivalent placements gives a number greater than or equal to the present placement, then the present placement is different from all previous placements.

How fast is this backtrack algorithm for n queens? A straightforward exhaustive search would generate all permutations of n things and hence take about n! time. Empirical studies suggest that the backtrack algorithm runs in time proportional to n!/2^n. This is much, much smaller than n!. A proof that this algorithm runs in the time suggested by the empirical studies would be very interesting.

Exercises

Ex 7.1 Use the heuristics of section 7.3 to construct a knight's tour of a 6 × 6 board.

Ex 7.2 Pick starting squares and show that the heuristics of section 7.3 do not find a knight's tour of a 5 × 5 board even when such a tour is
possible.

Ex 7.3 Determine all the different placements of 8 nontaking queens on an 8 × 8 board.

Chapter 8

HARD PROBLEMS

8.1 Classification of Algorithms and Problems

We have encountered various algorithms, particularly of the divide-and-conquer type, which have running times like n, n log n, and n^2, where n is some measure of the size of the input. We have also encountered algorithms, particularly of the exhaustive search type, which have running times like 2^n and n!. For algorithms of the first type, doubling the size of the input increases the running time by a constant factor, while for algorithms of the second type, doubling n increases the running time by a factor proportional to the running time. If we double the speed of the computer we are using, then the largest input size which our computer can solve in a given time will increase by a constant factor if we have an algorithm of the first type; for an algorithm of the second type, the input size our computer can solve will only increase by an additive constant at best. These considerations led Edmonds to propose that algorithms of the first type are computationally reasonable, while algorithms of the second type are computationally unreasonable. More specifically, he suggested the definitions:

- reasonable algorithm: an algorithm whose running time is bounded by a polynomial in the size of the input.
- unreasonable algorithm: an algorithm whose running time cannot be bounded by any polynomial in the size of the input.

This suggests that we should try to replace unreasonable algorithms by reasonable algorithms. Unfortunately, this goal is not always attainable. For the Towers of Hanoi problem, any algorithm must have running time at least 2^n. Since some problems do not have reasonable algorithms, we should classify problems as well as algorithms. Corresponding to Edmonds' definition for algorithms, Cook and Karp suggested the following definition for problems:

- easy problem: a problem which has a polynomial time bounded algorithm.
- hard problem: a
problem for which there is no polynomial time bounded algorithm.

An easy problem may have unreasonable algorithms. For example, we have seen an n! exhaustive search algorithm for sorting, but sorting is an easy problem because we also have an n log n sorting algorithm. A hard problem, on the other hand, can never have a reasonable algorithm. Some problems, like Towers of Hanoi, are hard for the trivial reason that their output is too large. To avoid such output-bound problems, Cook suggested considering only "yes-no" problems, that is, problems whose output is limited to be either yes or no. One such yes-no problem is the variant of the Towers of Hanoi problem in which the input is a configuration and the question is: is this configuration used in moving the disks from tower A to tower C? Furthermore, this variant is an easy problem. Usual easy problems can be transformed into easy yes-no problems by giving as the instance of the yes-no problem both the input and the output of the usual problem and asking if the output is correct. For example, sorting can be converted into a yes-no problem when we give both the input and the output and ask if the output is the input in sorted order. As another example, matrix multiplication can be converted into a yes-no problem in which we give three matrices A, B, and C and ask if C = AB. There are also other ways to transform usual problems into yes-no problems. Examples are the above variant of the Towers of Hanoi and the variant of sorting in which we ask if the input is in sorted order. To avoid problems which are hard only because of the length of their output, Cook defined:

• P: the class of yes-no problems which have polynomial time algorithms. Some authors call this class PTIME.

For many problems it is easy to show that they are in P. One gives an algorithm for the problem and demonstrates a polynomial upper bound on its running time. It may be quite difficult to show that a problem is not in P, but it is clear that there are problems which are not in P.
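The yes-no version of sorting described above can be sketched as a short verification routine. This is an illustrative sketch, not from the text, and the function name is my own:

```python
from collections import Counter

def is_sorted_output(xs, ys):
    """Yes-no version of sorting: is ys the input xs in sorted order?"""
    # ys must be nondecreasing ...
    if any(ys[i] > ys[i + 1] for i in range(len(ys) - 1)):
        return False
    # ... and ys must contain exactly the same elements as xs.
    return Counter(xs) == Counter(ys)

print(is_sorted_output([3, 1, 2], [1, 2, 3]))  # True
print(is_sorted_output([3, 1, 2], [1, 2, 2]))  # False
```

The check runs in linear time, so this yes-no version of sorting is comfortably in P.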
For example, the halting problem, which asks if an algorithm ever terminates when given a particular input, is not in P, because the halting problem has no algorithm and therefore certainly no polynomial time algorithm.

Classification of problems would not be very useful if we could only say that some problems have polynomial time algorithms and some problems have no algorithms. We would like a finer classification, particularly one that helps classify problems which arise in practice. In the next section we introduce some machinery which is needed for a finer classification.

8.2 Nondeterministic Algorithms

Up to this point we have discussed only deterministic algorithms. In a deterministic algorithm there are no choices; the result of an instruction determines which instruction will be executed next. In a nondeterministic algorithm choices are allowed; any one of a set of instructions may be executed next. Nondeterministic algorithms can be viewed as "magic": if there is a correct choice, the magic forces the nondeterministic algorithm to make this correct choice. A less magic view is that if there are several possibilities, the nondeterministic algorithm does all of them in parallel. Since the number of possibilities multiplies at each choice-point, there may be arbitrarily many possibilities being executed at once. Therefore a nondeterministic algorithm gives us unbounded parallelism. This view of nondeterminism as unbounded parallelism makes clear that nondeterminism does not take us out of the realm of things which can be computed deterministically, because we could build a deterministic algorithm which simulates the nondeterministic algorithm by keeping track of all the possibilities. However, a nondeterministic algorithm may be faster than any deterministic algorithm for the same problem. We define the time taken by a nondeterministic algorithm as the fewest instructions the nondeterministic algorithm needs to execute to reach an answer. This definition is not precise, since what an instruction means is undefined. This could be made precise by choosing a model of computation, like the Turing machine, in which an instruction has a well-defined meaning, but this imprecise definition should suffice for our purposes. With this definition of time, a nondeterministic algorithm could sort in n time, because it always makes the right choice, whereas a deterministic algorithm would take at least n log n time in some cases. In some sense we are comparing the best case of the nondeterministic algorithm with the worst case of the deterministic algorithm, so it is not surprising that the nondeterministic algorithm is faster.

For yes-no problems we give nondeterministic algorithms even more of an edge. We divide the inputs into yes-instances and no-instances. The yes-instances eventually lead to yes answers. The no-instances always lead to no answers. We assume that our nondeterministic algorithms cannot lead to a yes as a result of some choices and to a no as a result of some other choices. For a yes-instance, the running time of a nondeterministic algorithm is the fewest instructions the algorithm needs to execute to reach a yes answer. The running time of the nondeterministic algorithm is the maximum over all yes-instances of the running time of the algorithm for the yes-instances. We ignore what the nondeterministic algorithm does on no-instances. As an example of nondeterministic time, consider the yes-no problem: given a set of n numbers, each containing log n bits, are there two identical numbers in the set? In a yes-instance, a nondeterministic algorithm could guess which two numbers were identical and then check the bits of the two numbers, so the running time for this nondeterministic algorithm is log n. On the other hand, even in a yes-instance, a deterministic algorithm would have to look at almost all the bits in the worst case, so the running time for any deterministic algorithm is at least n log n.

The definition of nondeterministic time may seem strange, but it does measure an interesting quantity.
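The duplicate-numbers example can be sketched by separating the nondeterministic "guess" from its cheap verification. The function names are my own, and the deterministic simulation below is the naive quadratic one, not the n log n sorting method:

```python
def verify_duplicate(nums, i, j):
    """Verify a guessed certificate: two distinct positions holding
    equal numbers.  Only the two guessed numbers are examined."""
    return i != j and nums[i] == nums[j]

def has_duplicate(nums):
    """Deterministic simulation of the nondeterministic algorithm:
    try every possible guess (quadratic; sorting would give n log n)."""
    return any(verify_duplicate(nums, i, j)
               for i in range(len(nums))
               for j in range(i + 1, len(nums)))

print(has_duplicate([5, 3, 9, 3]))  # True
print(has_duplicate([5, 3, 9, 1]))  # False
```

The verification touches only the two guessed numbers; all the work of the deterministic version goes into trying the guesses.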
If we consider a yes-no problem, the yes-instances are all those objects which have a particular property. The nondeterministic time is the length of a proof that an object has the property. We ignore no-instances because we are not interested in the lengths of proofs that an object does not have the property.

8.3 NP and Reducibility

Now that we have a definition of the time used by a nondeterministic algorithm, we can, in analogy with the class P, define:

• NP: the class of yes-no problems which have polynomial time nondeterministic algorithms.

It is immediate from this definition that P ⊆ NP, because every problem in P has a deterministic polynomial time algorithm, and we can consider a deterministic algorithm as a nondeterministic algorithm which has exactly one choice at each step. It is not clear, however, whether P is properly contained in NP or whether P is equal to NP. Are there problems in NP which may not be in P? Consider the yes-no version of satisfiability: given a Boolean expression, is there an assignment of true and false to the variables which makes the expression true? This problem is in NP because, if there is a satisfying assignment, we could guess the assignment, and in time proportional to the length of the expression we could evaluate the expression and show that the expression is true. We discussed exhaustive algorithms for satisfiability because these seem to be the fastest deterministic algorithms for satisfiability. No polynomial time deterministic algorithm for satisfiability is known. Similarly, the problem "given a graph, does it contain a Hamiltonian path?" seems to be in NP but not in P. Hamiltonian path is in NP because we can guess the Hamiltonian path and quickly, i.e., in polynomial time, check to see if the guessed path really is a Hamiltonian path. Hamiltonian path does not seem to be in P because exhaustive search algorithms seem to be the fastest deterministic algorithms for this problem, and these search algorithms have worst-case running times which are at least 2^n and hence cannot be bounded by any polynomial.
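The guess-and-check structure of the Hamiltonian path problem can be sketched as a polynomial time verifier for a guessed path. This is an illustrative sketch; the graph representation (a list of undirected edges on vertices 0..n-1) is my own choice:

```python
def verify_hamiltonian_path(n, edges, path):
    """Polynomial-time check that 'path' is a Hamiltonian path in the
    graph on vertices 0..n-1 with the given undirected edge list."""
    edge_set = {frozenset(e) for e in edges}
    if sorted(path) != list(range(n)):        # every vertex exactly once
        return False
    return all(frozenset((path[k], path[k + 1])) in edge_set
               for k in range(n - 1))         # consecutive vertices adjacent

edges = [(0, 1), (1, 2), (2, 3)]
print(verify_hamiltonian_path(4, edges, [0, 1, 2, 3]))  # True
print(verify_hamiltonian_path(4, edges, [0, 2, 1, 3]))  # False
```

Finding the path may require exhaustive search, but checking a guessed path takes only polynomial time, which is all that membership in NP requires.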
Another problem which is in NP but may not be in P is composite number: given a positive integer n, is n the product of two positive integers which are both greater than 1? Clearly this is in NP, because we could guess the two factors, multiply them, and show that their product is n. Why isn't this problem clearly in P? Everyone knows the algorithm which has running time at most n and either finds the factors or reports that there are no factors. This well-known algorithm simply tries to divide n by 2 and by each odd number from 3 to n − 1. While this algorithm is correct, its running time is not bounded by a polynomial in the size of the input. Since n can be represented in binary or in some other base, the size of the input is only log n bits, and n cannot be bounded by any polynomial in log n. This example points out that we have been too loose about the meaning of "size of input." The official definition says that the size of the input is the number of bits used to represent the input. According to the official definition, the size of the input for this problem is log n if n is represented in binary. But if n were represented in unary, then the size of the input would be n. So the representation of the input can affect the classification of the problem.

There are a great variety of problems which are known to be in NP but are not known to be in P. Some of these problems may not seem to be in NP because they are optimization problems rather than yes-no problems. An example of this kind of optimization problem is the traveling salesman problem: given a set of cities and distances between them, what is the length of the shortest circuit which visits each city exactly once and returns to the starting city? This optimization problem can be changed into a yes-no problem by giving an integer B as part of the input. The yes-no question becomes: is there a circuit which visits each city exactly once and returns to the starting city and has
length at most B? At first glance, the optimization problem seems harder than the yes-no problem, because we can solve the yes-no problem by solving the optimization problem, while solving the yes-no problem does not give a solution to the optimization problem. But we can use the yes-no problem to solve the optimization problem:

• Set B to n times the largest distance; then the answer to the yes-no problem is yes.

• Set STEP to B/2.

• Now set B to B − STEP and STEP to STEP/2.

• Now do the yes-no problem with this new B and this new STEP.

• If the answer is yes, set B to B − STEP and STEP to STEP/2.

• If the answer is no, set B to B + STEP and STEP to STEP/2.

• Continue this process until STEP = 0. The last value of B will be the solution to the optimization problem.

How long will this take? Since STEP is halved at each call to the yes-no procedure, the number of calls will be the log of the initial value of STEP. But STEP is initially about n times the largest distance, so the number of calls is log n + log(largest distance), which is less than the size of the input. Thus the optimization problem can be solved by solving the yes-no problem a number of times which is less than the length of the input.

This example suggests the idea that two problems are equally hard if both of them can be solved in polynomial time whenever either one of them can be solved in polynomial time. This equivalence relation on problems also suggests a partial ordering of problems. A problem A is no harder than problem B if a polynomial time deterministic algorithm for B can be used to construct a polynomial time deterministic algorithm for A. We symbolize this relation by A ≤ B. If A is the yes-no traveling salesman problem and B is the traveling salesman optimization problem, then we have both A ≤ B and B ≤ A. If A and B are any two problems in P, then we have both A ≤ B and B ≤ A, because we could take the polynomial time deterministic algorithm for one problem and make it a subroutine of the polynomial time deterministic algorithm for the other problem and never call the subroutine.
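The halving procedure described above is an ordinary binary search over B. A minimal sketch, where `oracle` stands in for the yes-no traveling salesman routine (the sketch and its names are mine, and it assumes the oracle is monotone: a yes at B implies a yes at every larger B):

```python
def least_yes(oracle, upper):
    """Smallest B in [0, upper] for which oracle(B) answers yes."""
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(mid):
            hi = mid          # a yes: the optimum is at most mid
        else:
            lo = mid + 1      # a no: the optimum exceeds mid
    return lo

# With optimum circuit length 37 and bound upper = 1000, about
# log2(1000), roughly 10, oracle calls suffice.
print(least_yes(lambda b: b >= 37, 1000))  # 37
```

The number of oracle calls is the log of the initial bound, which is why the whole procedure stays polynomial in the length of the input.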
While in these examples the relation ≤ works both ways, there are cases in which ≤ only works one way. For example, let A be any problem in P and let HALT be the halting problem; then A ≤ HALT, but not HALT ≤ A, because we don't need an algorithm for HALT to construct a polynomial time algorithm for A, while the polynomial time algorithm for A cannot help in constructing any algorithm for HALT, let alone a polynomial time algorithm for HALT. This relation A ≤ B, which we are calling "A is no harder than B," is usually called polynomial time reducibility and is read "A is polynomial time reducible to B." There are many other definitions of reducibility in the literature. We refer the interested reader to Garey and Johnson (1979) and Hartley Rogers (1967).

The notion of a partial ordering on problems should aid us in our task of classifying problems. In particular, it may aid us in saying that two problems in NP are equally hard. Further, it suggests the question: is there a hardest problem in NP? We consider this question in the next section.

8.4 NP-Complete Problems

In this section we will consider the relation ≤ defined in the last section and answer the question: is there a hardest problem in NP? Since we have a partial order ≤, we might think of two very standard instances of partial orders: the total ordering of the integers, which has no maximal element, and the partial ordering of subsets, which has a maximal element. From these examples we see that our question cannot be answered on the basis that we have a partial order. If we consider ≤ applied to all problems, is there a hardest problem? The answer is no, because the Cantor diagonal proof always allows us to create harder problems. On the other hand, one may recall that the halting problem is the hardest recursively enumerable (RE) problem. So, on the analogy with RE, there may be a hardest problem in NP. Cook (1971) proved that there is a hardest problem in NP. A problem which is a hardest problem in NP is called an NP-complete problem. Cook proved the more specific result:

Theorem 2 (Cook's Theorem) Satisfiability is NP-complete.

The proof of this theorem requires an exact definition of nondeterministic algorithm, and since we have avoided exact definitions, we will only be able to give a sketch of the proof. We refer the interested reader to Garey and Johnson (1979) for the details. The basic idea of the proof is to take any instance of a problem in NP, which consists of a nondeterministic algorithm, a polynomial that gives the bound on the nondeterministic running time, and an input for the algorithm, and to show how to construct a Boolean expression which is satisfiable iff the nondeterministic algorithm reaches a yes answer within the number of steps specified by the polynomial applied to the size of the input. The construction proceeds by creating clauses which can be interpreted to mean that at step 0 the algorithm is in its proper initial state. Then, for each step, a set of clauses is constructed which can be interpreted to mean that the state of the algorithm and the contents of the memory are well-defined at this step. Further, for each step, a set of clauses is constructed which can be interpreted to mean that the state of the algorithm and the contents of memory at this step follow from the state and contents at the previous step by an allowed instruction of the algorithm. Finally, some clauses are constructed which can be interpreted to mean that the algorithm has reached a yes answer. The polynomial time bound is used to show that the length of this Boolean expression is bounded by a polynomial in the length of the input.

An interesting consequence of the proof is that satisfiability of Boolean expressions in clause form is NP-complete. This result can be refined to show that satisfiability in clause form with exactly 3 literals per clause is also NP-complete.

After Cook's result, Karp (1972) quickly showed that a few dozen other standard problems are NP-complete. Garey and Johnson's book
contains several hundred NP-complete problems. Johnson also writes a column for the Journal of Algorithms which contains even more information on NP-complete problems.

Why is this business of NP-complete problems so interesting? The NP-complete problems are the hardest problems in NP, in the sense that A ≤ B for any problem A in NP and any NP-complete problem B. So if there were a polynomial time deterministic algorithm for any NP-complete problem, there would be a polynomial time deterministic algorithm for every problem in NP; that is, P and NP would be the same class. Conversely, if P ≠ NP, then there is no point in looking for a polynomial time deterministic algorithm for an NP-complete problem. Simply knowing that a problem is in NP, without knowing that it is NP-complete, leaves open the question of whether or not the problem has a polynomial time bounded algorithm, even on the supposition that P ≠ NP. The fact that a number of NP-complete problems have been well-known problems for several hundred years, and no one has managed to find a reasonable algorithm for any one of them, suggests to most people that P ≠ NP and that the NP-complete problems really are hard.

One of the virtues of Cook's theorem is that to show that an NP problem A is NP-complete, you only have to show that satisfiability ≤ A. As a catalog of NP-complete problems is built, the task of showing that an NP problem A is NP-complete gets easier, because you only have to pick some NP-complete problem B and show that B ≤ A.

We have already mentioned the traveling salesman problem (TSP). This problem is NP-complete. Let us show that if Hamiltonian circuit is NP-complete, then TSP is NP-complete. The Hamiltonian circuit problem (HC) is: given a graph, is there a circuit which contains each vertex exactly once and uses only edges in the graph? Given an instance of HC, we create an instance of TSP by letting each vertex from HC become a city for TSP and defining the distance between cities by d(i,j) = 1 if there is an edge in HC between vertices i and j, and d(i,j) = 2 if there is no such edge.
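The distance assignment just described can be sketched directly; the function name and the edge-list graph representation are my own:

```python
def hc_to_tsp(n, edges):
    """Transform a Hamiltonian circuit instance (n vertices, undirected
    edge list) into a TSP instance: distance 1 for edges of the graph,
    2 for non-edges, together with the bound B = n."""
    edge_set = {frozenset(e) for e in edges}
    dist = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1 if frozenset((i, j)) in edge_set else 2
            dist[i][j] = dist[j][i] = d
    return dist, n   # the distance matrix and the bound B
```

A circuit of total length n in the resulting instance can use only distance-1 edges, so it is exactly a Hamiltonian circuit of the original graph.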
Now, if there were n vertices in HC, we use B = n as our bound for TSP. If the answer to TSP is yes, then there is a circuit of length n; but this means that the circuit can only contain edges of distance 1, and hence this circuit is also a Hamiltonian circuit of the original graph. Conversely, if there is a Hamiltonian circuit, then there is a TSP circuit of length n. To complete our proof we must make sure that this transformation from HC to TSP can be accomplished in time polynomial in the size of the instance of HC. Since HC has n vertices and TSP can be specified by giving the n(n − 1)/2 distances between the n cities, we only have to check, for each of the n(n − 1)/2 distances, whether or not it corresponds to an edge in the original graph. Even with a very simple algorithm this can be done in at worst n^4 time, which is bounded by a polynomial in the size of the HC instance. This is a very simple example of proving that one problem in NP is NP-complete by reducing a known NP-complete problem to the problem. We refer the reader to Garey and Johnson (1979) for more complicated examples.

8.5 Dealing with Hard Problems

In the theoretical world there are hard problems. Some of these hard problems are NP-complete problems, and there are other problems which are harder than NP-complete problems. How can these hard problems be handled in the real world?

The simplest way to handle hard problems is to ignore them. Many practical programmers do not know what hard problems are. Their programming involves tasks like billing and payroll, which are theoretically trivial but practically quite important. Ignoring hard problems may be a reasonable strategy for these programmers.

Another way to handle hard problems is to avoid them. To avoid hard problems you have to know what they are. One of the major virtues of lists of NP-complete problems is that they help the programmer to identify hard problems and point out that no reasonable algorithms for these problems are known. It is often, unfortunately, the case
that a programmer is approached with a request for a program, and the requester has tried to remove all the specific information about the problem and generalize the problem as much as possible. Overgeneralization can make a problem very hard. If the programmer can get the specific information, she may be able to design a reasonable algorithm for the real problem and avoid the hard generalization.

Sometimes real problems are really hard but not too big. For example, many real scheduling problems turn out to be traveling salesman problems with 30 to 50 cities. For these situations an exhaustive algorithm may still solve the problem in reasonable time. The programmer should still try to tune the algorithm to take advantage of any special structure in the problem and to take advantage of the instructions of the actual computer which will be used. Exhaustive search is a way to handle some NP-complete problems when the size of the input is not too large.

Heuristics are another way to deal with hard problems. We have already mentioned heuristics in discussing backtrack algorithms. A heuristic is a method to solve a problem which doesn't always work. To be useful, a heuristic should work quickly when it does work. The use of heuristics is based on the not unreasonable belief that the real world is usually not as complicated as the worst case in the theoretical world. This belief is supported by the observation that creatures which seem to have less computing power than computers can make a reasonable living in the real world. Artificial intelligence has been using heuristics for years to solve problems like satisfiability. These heuristics seem to be very effective on the instances of satisfiability which arise in artificial intelligence contexts. Heuristics are also widely used in the design of operating systems. Occasional failures in these heuristics lead to software crashes. Since we usually see only a couple of such crashes per year, these heuristics seem to be very effective. Sometimes the
behavior of heuristics can be quantified, so that we can talk about the probability of the heuristic being correct. For example, a heuristic to find the largest element in an n-element array is to find the largest element among the first n − 1 elements. If the elements of the array are in random order, then this heuristic fails with probability 1/n, and as n → ∞ the probability that this heuristic gives the correct answer goes to 1. Such probabilistic algorithms are now being used for a wide variety of hard problems. For example, large primes are needed for cryptographic purposes. While it seems to be hard to discover large primes, there are tests which are used so that if a number passes all the tests, then the number is probably a prime.

Many hard problems can be stated as optimization problems: find the smallest or largest something which has a particular property. While actually finding the optimum may be difficult, it may be much easier to find something which is close to optimum. For example, in designing a computer circuit one would like the circuit with the fewest gates which carries out a particular computation. This optimization problem is hard. But from a practical point of view, no great disaster would occur if you designed a circuit with 10% more gates than the optimum circuit. For various hard problems, approximation algorithms have been produced which give answers close to the optimum answer. We will consider an example of approximation in the next section.

8.6 Approximation Algorithms

Let us consider the traveling salesman optimization problem: given a set of cities and distances between them, find the shortest circuit which contains each city exactly once. This problem arises in many real scheduling situations. We will try to approximate the shortest circuit. To make an approximation possible, we will assume that the distances behave like real distances; that is, the distances obey the triangle inequality d(i,j) ≤ d(i,k) + d(k,j), so that the distance from city i to city j is no longer than the
distance from city i to city k plus the distance from city k to city j.

A simpler task than finding the minimum circuit is finding the minimum spanning tree. The minimum spanning tree is a set of links which connects all the cities and has the smallest sum of distances. In the minimum spanning tree the cities are not all directly connected; several links may have to be traversed to get from city i to city j. There is a reasonable algorithm for the minimum spanning tree because the shortest link is always in this tree. So one can proceed to find this tree by putting in the shortest link and continuing to add the shortest link which does not complete a cycle. A circuit can be constructed from the minimum spanning tree by starting at some city, traversing the links of the tree to visit every other city, and returning to the starting city. This circuit is twice as long as the sum of the distances of the links in the minimum spanning tree. But this circuit may visit some cities more than once. To "clean up" this circuit, we use this circuit while no city is repeated, and if city j is the first repeated city, and if city i is the city before city j, and if city k is the next city after city j which has not yet been visited, we connect city i to city k. We continue to use this procedure to produce a circuit in which each city is visited exactly once. From the triangle inequality we have that the length of this new circuit is at most as long as the circuit with cities repeated, and hence that this new circuit is no longer than twice the length of the minimum spanning tree. The optimum circuit must be at least as long as the minimum spanning tree because the optimum circuit connects every city. Thus we have

OPT ≤ ALG ≤ 2·OPT

where OPT is the length of the optimum circuit and ALG is the length of the circuit produced by the approximation algorithm.

It would be pleasant if all hard optimization problems had approximation algorithms. Unfortunately, this is not the case.
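The spanning-tree method described above can be sketched as follows; this is a minimal implementation of my own, assuming a symmetric distance matrix obeying the triangle inequality. Prim's algorithm builds the tree, and the depth-first walk that visits each city only the first time it appears plays the role of the "clean up" step:

```python
import heapq

def tsp_approx(dist):
    """Approximate metric TSP: returns a circuit (list of city indices,
    starting and ending at city 0) of length at most twice the optimum."""
    n = len(dist)
    # Prim's algorithm for a minimum spanning tree rooted at city 0.
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    heap = [(0, 0, 0)]                      # (distance, city, parent)
    while heap:
        d, u, p = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        if u != p:
            children[p].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(heap, (dist[u][v], v, u))
    # Depth-first walk of the tree; skipping already-visited cities is
    # justified by the triangle inequality.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]

# Four cities on a line at positions 0, 1, 2, 3.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
print(tsp_approx(dist))  # [0, 1, 2, 3, 0]
```

On this small instance the walk happens to find the optimum; in general the guarantee is only ALG ≤ 2·OPT.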
We really needed the triangle inequality to produce an approximation for the traveling salesman problem. Consider an instance of Hamiltonian circuit with n vertices. We can convert this to an instance of traveling salesman without the triangle inequality by assigning distance 1 to all the edges which are in the original graph and assigning distance n + 2 to all the edges which were not in the original graph. Now, if there were a Hamiltonian circuit in the original graph, then there would be a traveling salesman circuit of length n. If we could approximate this traveling salesman problem within a factor of 2, the traveling salesman circuit would have length at most 2n exactly when the original graph had a Hamiltonian circuit, because if the traveling salesman circuit used even one of the edges not in the original graph, it would have length at least 2n + 1. So approximating the traveling salesman problem without the triangle inequality is as hard as Hamiltonian circuit. This example can be generalized to show that no approximation within a factor of f(n) is possible, by assigning each edge not in the graph a distance greater than f(n)·n − n + 1. To make sure the transformation can be carried out in time polynomial in the size of the instance of Hamiltonian circuit, f(n) must be bounded by 2^p(n), where p(n) is a polynomial. Thus no reasonable approximation to the traveling salesman problem is possible unless the Hamiltonian circuit problem can be solved quickly, that is, unless P = NP.

8.7 The World of NP

The fact that various NP-complete problems have resisted attempts to find reasonable algorithms for them suggests to many people that P ≠ NP. Even if we accept this belief, there are still other open questions about classes of problems associated with NP. The class NP is defined in terms of the yes-instances of its problems. A class could also be defined in terms of no-instances. In correspondence with NP, we define the class coNP as the problems whose complements are in NP; that is, for each problem in coNP there is a nondeterministic algorithm which has
polynomial bounded running time for the no-instances of the problem. The notion of ≤ we have used before will lump NP and coNP together. To tell them apart we will need a finer notion, called polynomial time many-one reducibility, and, as is traditional in computer science, we will overload the symbol ≤ to stand for this new partial ordering. This new ordering will be defined on sets, but we will also use the ordering for problems, where we ambiguously use the same name for the problem and for the set of yes-instances of the problem. For example, we will use SAT to mean the satisfiability problem, and we will also use SAT to mean the set of Boolean formulas which can be satisfied. With these preliminaries we define:

A ≤ B iff there exists a polynomial time computable function f so that for every input instance x, x ∈ A iff f(x) ∈ B.

With this ordering it is still true that all nontrivial sets in P are mutually equivalent, i.e., for all nontrivial A, B ∈ P, A ≤ B and B ≤ A. On the other hand, sets in NP and coNP do not have to be related. For example, let SAT mean all the Boolean expressions that are true for at least one setting of the variables, and let coSAT mean all the Boolean expressions that are false for all settings of the variables. We have already seen that SAT ∈ NP. Now notice that coSAT ∈ coNP, because if the answer to the coSAT question is no, then there is a setting which makes the expression true (that is, not false), and a nondeterministic algorithm could quickly guess the setting and verify that the setting makes the expression true. While SAT and coSAT seem to be very closely related, this notion of polynomial time many-one reducibility ≤ does not seem to see them as related. Notice that for an expression E we have E ∈ coSAT iff E ∉ SAT, but our notion of ≤ does not allow negation; it requires that E ∈ coSAT iff f(E) ∈ SAT. While it is possible to construct such an f, the only ways I know would give, at least for some E's, an f(E) whose length is exponentially larger than the length of E. So we expect that this notion of ≤ will be able to distinguish
NP from coNP.

The above definition of coNP suggests the questions:

• Does NP = coNP?

• Does P = NP ∩ coNP?

Unfortunately, these questions are unsolved. While the above questions are unsolved, many people believe that the answer to each of these questions is no. The basis for this belief is the analogy between NP and RE. For RE, the following diagram can be shown to be valid:

[Diagram: the classes coRE and RE, with the coRE-complete and RE-complete problems at their extremes, overlapping in the class RECURSIVE.]

If the analogy between NP and RE is valid, the world of NP should look the same:

[Diagram: the classes coNP and NP, with the coNP-complete and NP-complete problems at their extremes, overlapping in NP ∩ coNP.]

A minor difference in the two diagrams is that RE ∩ coRE has the name RECURSIVE, but NP ∩ coNP has not been assigned a name. If this diagram is correct, then there are several types of hard problems which are not NP-complete. In particular, there may be problems which are in NP ∩ coNP but are not in P. Until recently, CompositeNumber was a candidate problem for this status. We know that CompositeNumber is in NP, because we can show that a number is composite by guessing a factor and then proving that it is a factor by dividing, which takes polynomial time in the number of digits. The complement of CompositeNumber is Primes. While it is not immediately obvious, for every prime number there is a proof that the number is prime, and the length of the proof is bounded by a polynomial in the log of the number. So the complement of CompositeNumber is also in NP, and CompositeNumber is in NP ∩ coNP. But everyone, including the National Security Agency, assumes that composite and prime do not have reasonable algorithms. Unfortunately, proving that composite/prime is not in P is probably very difficult, since this would imply P ≠ NP.

The status of Primes has recently been clarified by Agrawal et al. (2002). Interestingly, this result was discovered by one professor working with two undergraduates. They showed that there is a polynomial time algorithm which determines whether or not a number is a prime. At the moment their algorithm is not considered fast enough to be a practical algorithm. How does PRIMES ∈ P affect cryptographic
schemes? As you may know, several cryptographic schemes, like RSA, depend on the difficulty of factoring for their security. The new algorithm does not at present affect RSA's security, because the algorithm can show that a number is composite without giving a factor. Even if we know that n has a factor, we don't know how to find the factor quickly. Hence RSA is still secure. But many people expect that a fast composite algorithm which produces a factor will be found soon. The paranoids assume that NSA has such a factoring algorithm but wants to keep it secret so only they can break RSA-coded messages.

We are now without a reasonable candidate for a problem in NP ∩ coNP but not in P. The problems that were known to be in NP ∩ coNP have each been shown to be in P. For example, LinearInequalities was known to be in NP ∩ coNP, and it has now been shown to be in P.

LINEAR INEQUALITIES
INPUT: A set of inequalities of the form a_i1·x_1 + a_i2·x_2 + · · · + a_in·x_n ≤ b_i for i = 1 through m.
QUESTION: Is there a set of rational numbers x_1, x_2, ..., x_n which simultaneously satisfy all m inequalities?

We conclude by mentioning that our diagram of the world of NP may be incorrect, but there is some reasonable circumstantial evidence to support it.

8.8 Exercises

Ex 8.1 Show that if Hamiltonian circuit is NP-complete, then Hamiltonian path is NP-complete.

Ex 8.2 Show that if Hamiltonian path is NP-complete, then Hamiltonian circuit is NP-complete.

Ex 8.3 For the following graph, use the minimum spanning tree method to construct a short traveling salesman circuit. Assume that any missing edges have the shortest distance consistent with the triangle inequality. How close to the minimum circuit is your constructed circuit?

Ex 8.4 Show that the following variant of the Towers of Hanoi is in the class P.
INPUT: A configuration, CON, of the Towers of Hanoi puzzle with n disks.
QUESTION: Does CON occur in the sequence of configurations used to move n disks from tower A to tower C using the minimal number of moves?

Ex 8.5 Show that the
Chapter 9

DIFFERENCE EQUATIONS

9.1 First-Order Equations

A first-order linear constant coefficient difference equation has the form

X_n = A X_{n−1} + f(n),

where A is a constant and f(n) is a function of n. We would like to find the solution X_n written as a function of n. If we knew the value, X_0, of X_n when n = 0, we could calculate

X_1 = A X_0 + f(1),

then we could compute

X_2 = A X_1 + f(2) = A² X_0 + A f(1) + f(2),
X_3 = A X_2 + f(3) = A³ X_0 + A² f(1) + A f(2) + f(3).

We can summarize these computations by

X_n = A^n X_0 + Σ_{i=1}^{n} A^{n−i} f(i),

which we will call ANS. This formula is correct when n = 1, for then it says

X_1 = A¹ X_0 + Σ_{i=1}^{1} A^{1−i} f(i) = A X_0 + f(1),

since A¹ = A. Assuming the formula ANS is correct for n − 1, we get

X_n = A X_{n−1} + f(n) = A ( A^{n−1} X_0 + Σ_{i=1}^{n−1} A^{n−1−i} f(i) ) + f(n) = A^n X_0 + Σ_{i=1}^{n} A^{n−i} f(i),

verifying by induction that ANS is correct. We now have a solution to the initial value problem for a first-order linear constant coefficient difference equation. The initial value problem is: given X_0 and a difference equation, find X_n as a function of n. We will have to show that initial value problems for difference equations of this type have only one solution. Then we will be sure that ANS gives us the solution, not just one of several possible solutions.

9.2 Uniqueness Theorem

Theorem 3 An initial value problem for a difference equation of the form

X_n = f(X_{n−1}, X_{n−2}, ..., X_{n−k}, n)

with initial conditions X_1 = α_1, X_2 = α_2, ..., X_k = α_k has a unique solution.

Proof: From the formula and the initial values we can compute X_{k+1} by

X_{k+1} = f(X_k, X_{k−1}, ..., X_1, k+1) = f(α_k, α_{k−1}, ..., α_1, k+1).

So given any k consecutive values X_{n−1}, X_{n−2}, ..., X_{n−k}, we can uniquely compute X_n. Thus by induction we have that the initial value problem has a unique solution. □

REMARK: The one essential ingredient in this proof is the assumption that f is a function, that is, given an input value (or vector of values), f returns exactly one value. More general recurrence relations are possible, and it may not be possible to put them in the above form and satisfy the assumption that f is a function.
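The formula ANS can be checked numerically. The short Python script below (an addition to the notes) compares the closed form against direct iteration of X_n = A X_{n−1} + f(n) for an arbitrarily chosen forcing function:

```python
# Check the closed form ANS:  X_n = A^n X_0 + sum_{i=1}^{n} A^{n-i} f(i)
# against direct iteration of  X_n = A X_{n-1} + f(n).

def iterate(A, x0, f, n):
    x = x0
    for i in range(1, n + 1):
        x = A * x + f(i)          # one step of the recurrence
    return x

def ans(A, x0, f, n):
    return A ** n * x0 + sum(A ** (n - i) * f(i) for i in range(1, n + 1))

A, x0 = 3, 5
f = lambda i: i * i + 1           # an arbitrary forcing function
for n in range(12):
    assert iterate(A, x0, f, n) == ans(A, x0, f, n)
```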
REMARK: You may notice that the solution of an initial value problem for a linear constant coefficient difference equation is analogous to the solution of an initial value problem for a linear constant coefficient differential equation, with the sum playing the same role for the difference equation as the integral does for the differential equation. But the theory for the difference equation is much simpler. The proof of the Uniqueness Theorem is trivial, and no assumptions other than being defined have to be placed on the function f to assure a unique solution.

9.3 k-th Order Equations

If we consider a k-th order linear constant coefficient difference equation, we can write it in the form

X_n = c_1 X_{n−1} + c_2 X_{n−2} + ... + c_k X_{n−k} + g(n).

We can write this in the compact form L[X] = g(n), where

L[X] = X_n − c_1 X_{n−1} − c_2 X_{n−2} − ... − c_k X_{n−k}.

We can consider L to be a linear operator, analogous to a matrix, which operates on functions defined on the natural numbers. These functions, also called sequences, are analogous to vectors on which a matrix would operate. Recall that when we work with vectors, we define the operation of addition of vectors to be the componentwise sum of the two vectors, i.e. (X+Y)_i = X_i + Y_i, where (X+Y)_i is the i-th component of the sum vector, and X_i and Y_i are respectively the i-th components of the vectors X and Y. This operation of vector addition can be extended to sequences, where we view the sequences as infinite dimensional vectors with a component for each natural number. As well as vector addition, we also have the operation of scalar multiplication when we are dealing with vectors. That is, we consider the components of the vector v to be numbers from some field, like the real or complex numbers, and if we take an element a, called a scalar, from the same field, then we can define the scalar product av to be the vector whose i-th component is a times the i-th component of v. Of course, the multiplication of each component by the scalar is carried out in the field we are using. Again, this operation of scalar multiplication can be extended to sequences by considering the sequences as infinite dimensional vectors.

Now that we have defined vector addition and scalar multiplication, the formula aX + bY is well defined, where X and Y are sequences and a and b are scalars. We can now define what we mean by saying that L is a linear operator:

L is a linear operator iff L[aX + bY] = aL[X] + bL[Y] for all sequences X and Y and all scalars a and b.

Returning to the operator L defined for our linear difference equation, we find that if X = aY + bZ then

L[X] = L[aY + bZ] = (aY_n + bZ_n) − c_1(aY_{n−1} + bZ_{n−1}) − ... − c_k(aY_{n−k} + bZ_{n−k});

rearranging the right hand side of the equation using the usual rules for addition and multiplication, we find that

L[aY + bZ] = a(Y_n − c_1 Y_{n−1} − ... − c_k Y_{n−k}) + b(Z_n − c_1 Z_{n−1} − ... − c_k Z_{n−k}) = aL[Y] + bL[Z].

And thus L is, in fact, a linear operator on the space of sequences.

To understand what we are doing, consider a specific operator L1[X] = X_n − X_{n−1}. If X_n = n, then L1[X] = n − (n−1) = 1. So we see that the effect of this operator on the function n is the production of the function which is constantly 1. We also find that L1[1] = 1 − 1 = 0, and since L1 is a linear operator, L1[a·1 + bX] = aL1[1] + bL1[X] = bL1[X].

We want to be able to solve the k-th order linear constant coefficient difference equation L[X] = g(n). Notice that the Uniqueness Theorem we have proved assures us that the initial value problem for these equations has a unique solution. Let us first consider the Homogeneous Equation

L[X] = 0 with initial conditions X_0 = α_0, X_1 = α_1, ..., X_{k−1} = α_{k−1}.

If we take X_n − c_1 X_{n−1} − c_2 X_{n−2} − ... − c_k X_{n−k} = 0 and replace X_{n−i} by λ^{k−i}, we get the characteristic equation of L:

λ^k − c_1 λ^{k−1} − ... − c_k = 0.

Let λ_i be the i-th root of the characteristic equation; then

X_n = Σ_{i=1}^{k} a_i λ_i^n

is a solution of the Homogeneous Equation, since

L[ Σ_{i=1}^{k} a_i λ_i^n ] = Σ_{i=1}^{k} a_i L[λ_i^n] = Σ_{i=1}^{k} a_i · 0 = 0,

where L[λ_i^n] = 0 follows from the fact that λ_i is a solution to the Characteristic Equation. Notice that Σ a_i λ_i^n is a solution for any values we assign to the a_i's. To find the values for the a_i's, we must look at the initial values:

X_0 = Σ_{i=1}^{k} a_i λ_i^0 = α_0,
X_1 = Σ_{i=1}^{k} a_i λ_i^1 = α_1,
...
X_{k−1} = Σ_{i=1}^{k} a_i λ_i^{k−1} = α_{k−1}.

This is a system of k linear equations in k unknowns. We can write it in matrix form as A a = α, where a = (a_1, ..., a_k)^T, α = (α_0, ..., α_{k−1})^T, and A is the k×k Vandermonde matrix whose entry in row j and column i is λ_i^{j−1}:

    [ 1          1          ...  1          ]
    [ λ_1        λ_2        ...  λ_k        ]
    [ ...                                   ]
    [ λ_1^{k−1}  λ_2^{k−1}  ...  λ_k^{k−1}  ]

If the λ's are distinct, then the matrix A is nonsingular. If the matrix were singular, there would be a vector b = (b_0, b_1, ..., b_{k−1}) such that bA = 0. But this would mean that the degree k−1 polynomial b_0 + b_1 λ + ... + b_{k−1} λ^{k−1} would have k distinct roots, which is impossible. So if the λ's are distinct, we can uniquely solve this system of equations and calculate the a_i's. If two λ's are identical, then two columns of the matrix are identical, the matrix is singular, and the α's do not uniquely specify the a_i's. In general, the solution is

X_n = Σ_{i=1}^{k} a_i D^{m_i} λ_i^n,

where m_i is the number of roots with indices less than i which have the same value as λ_i, and D^{m_i} is the m_i-th derivative, that is, D^{m_i} λ^n = n(n−1)···(n+1−m_i) λ^{n−m_i}. Using this general solution, the matrix for computing the a_i's from the α's is nonsingular. Otherwise there would be a vector b such that bA = 0, but this means there would be a degree k−1 polynomial with k roots counting multiplicity, which is impossible.

Returning to the Nonhomogeneous Equation L[x] = g(n), we notice that the Uniqueness Theorem holds. So we know that the initial value problem for a Nonhomogeneous Equation has a unique solution. One method for solving these problems is to find a particular solution v such that L[v] = g(n), but v is not required to satisfy the initial conditions. Let h be any solution of the Homogeneous Equation, that is, L[h] = 0. Then L[v + h] = L[v] + 0 = g(n). Thus to solve the initial value problem, we pick that h which causes v + h to satisfy the initial conditions. We can view h as the general solution of the Homogeneous Equation. So h will contain k unspecified coefficients a_1, a_2, ..., a_k. We then solve a system of linear equations for the a_i's. This system of equations is specified by the requirement that v + h satisfy the initial conditions.

The method we have just described is not an algorithm, since it does not tell us how to find the particular solution v. In general, there is not an algorithm which will always give us a particular solution in closed form.
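For a concrete instance of solving the Vandermonde system for the a_i's, take the Fibonacci equation. The sketch below (added; not in the original notes) solves the 2×2 system by hand and recovers the familiar closed form:

```python
# Coefficients a_i of X_n = a1*l1^n + a2*l2^n for X_n = X_{n-1} + X_{n-2}
# with X_0 = 0, X_1 = 1, from the Vandermonde system
#   a1 + a2 = 0,  a1*l1 + a2*l2 = 1.
import math

l1 = (1 + math.sqrt(5)) / 2        # roots of l^2 - l - 1 = 0
l2 = (1 - math.sqrt(5)) / 2
a1 = 1 / (l1 - l2)                 # solve the 2x2 system directly
a2 = -a1

def fib_closed(n):
    return a1 * l1 ** n + a2 * l2 ** n

fibs = [0, 1]
for _ in range(20):
    fibs.append(fibs[-1] + fibs[-2])
for n, f in enumerate(fibs):
    assert abs(fib_closed(n) - f) < 1e-6
```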
There is an algorithm which gives us the particular solution as a summation. But for a very common class of difference equations there is an easy method to obtain a particular solution, as the following theorem states.

Theorem 4 Consider the difference equation L[x] = p(n) λ^n, where p(n) is a polynomial of degree r and λ is a root of multiplicity m of the characteristic polynomial of L. This equation has a particular solution of the form q(n) λ^n, where q(n) is a polynomial of degree r + m.

REMARK: If λ is not a root of the characteristic polynomial of L, then we consider λ to have multiplicity 0. Thus if λ is not a root of the characteristic polynomial, then the difference equation has a particular solution of the form q(n) λ^n, where q(n) has the same degree as p(n).

Proof: Consider L[D^j λ^n], where D^j is the j-th derivative with respect to λ. L[D^j λ^n] = D^j L[λ^n], since the derivative operator can be moved across the sums and constants in L. Now L[λ^n] = λ^{n−k} ch(λ), where ch(λ) is the characteristic polynomial of L and k is the order of the difference equation. At a root λ of multiplicity m,

L[D^j λ^n] = 0 if m > j, and L[D^j λ^n] = q_{j−m}(n) λ^{n−k} if j ≥ m,

where q_{j−m}(n) is a polynomial of degree j − m in n (constant factors of λ are absorbed into q_{j−m}). This is because ch(λ) and its first m−1 derivatives are 0 when λ has multiplicity m, while each derivative applied to λ^{n−k} contributes a factor that is linear in n.

As a particular solution of the difference equation, we consider v = Σ_{j=m}^{J} a_j D^j λ^n. From the above,

L[v] = Σ_{j=m}^{J} a_j L[D^j λ^n] = Σ_{j=m}^{J} a_j q_{j−m}(n) λ^{n−k}.

For this to equal p(n) λ^n, we need Σ_{j=m}^{J} a_j q_{j−m}(n) = p(n) λ^k. Since the right hand side is a polynomial of degree r, the left hand side must have degree r, so J − m = r and J = r + m. Finally, we notice that Σ_{j=m}^{J} a_j D^j λ^n can be written in the form q(n) λ^n, where q(n) is a polynomial of degree r + m. □

To use this theorem to find a particular solution of L[x] = p(n) λ^n, we determine r, the degree of p(n), and determine m, the multiplicity of λ. Then we take a polynomial q(n) of degree r + m with unknown coefficients and substitute q(n) λ^n for X_n in the difference equation. This will give a system of linear equations, one for each power of n. We then solve the system to determine the unknown coefficients of q(n).

If we want to find a particular solution v for the difference equation L[x] = f_1(n) + f_2(n), and we can find y and z such that L[y] = f_1(n) and L[z] = f_2(n), then we can use v = y + z, since L[v] = L[y + z] = L[y] + L[z] = f_1 + f_2.

Algorithm for solving linear constant coefficient difference equations

INPUT: x_n = c_1 x_{n−1} + ... + c_k x_{n−k} + g(n); x_0, x_1, ..., x_{k−1}.

1. Find the roots of the polynomial λ^k − c_1 λ^{k−1} − ... − c_k. Call these roots λ_1, λ_2, ..., λ_k.

2. Find a particular solution v, so that v_n = c_1 v_{n−1} + ... + c_k v_{n−k} + g(n). (v does not have to satisfy the initial conditions. The last theorem may help in finding v.)

3. The general solution is

   x_n = v_n + Σ_{i=1}^{k} a_i D^{m_i} λ_i^n,

   where m_i is the number of roots with indices less than i which have the same value as λ_i, and D^{m_i} is the m_i-th derivative, that is, D^{m_i} λ^n = n(n−1)···(n+1−m_i) λ^{n−m_i}.

4. Find a_1, a_2, ..., a_k by solving the set of linear algebraic equations:

   x_0 − v_0 = Σ_{i=1}^{k} a_i D^{m_i} λ_i^0
   x_1 − v_1 = Σ_{i=1}^{k} a_i D^{m_i} λ_i^1
   ...
   x_{k−1} − v_{k−1} = Σ_{i=1}^{k} a_i D^{m_i} λ_i^{k−1}

Exercises

Ex 9.1 Use the methods described above to solve the following initial value problems:

1. x_n = 2x_{n−1}, x_0 = 1
2. x_n = 2x_{n−1} + 1, x_0 = 0
3. x_n = 2x_{n−1} + 1, x_0 = 1
4. x_n = 2x_{n−1} + 1, x_0 = −1
5. f_n = f_{n−1} + f_{n−2}, f_1 = f_2 = 1
6. f_n = f_{n−1} + f_{n−2} + 1, f_1 = f_2 = 0
7. f_n = f_{n−1} + f_{n−2} + 1, f_1 = f_2 = 1
8. f_n = f_{n−1} + f_{n−2} − 1, f_1 = f_2 = 0
9. x_n = 2x_{n−1} + n, x_0 = 0
10. x_n = 2x_{n−1} + n, x_0 = −1
11. x_n = 2x_{n−1} − n, x_0 = 0
12. x_n = 2x_{n−1} + 2^n, x_1 = 2
13. x_n = 2x_{n−1} + 2^n, x_0 = 2
14. f_{n+1} = f_n + f_{n−1} + 4n + n², f_1 = 1, f_2 = 4
15. x_n = 2x_{n−1} + n², x_1 = 2
16. An AVL tree is a binary tree in which, at each vertex, the heights of the right and left subtrees of that vertex differ by at most 1. If the AVL tree is of height h, what is the maximum possible number of vertices in the tree? What is the minimum number of vertices? Derive difference equations for the maximum and the minimum number of vertices in an AVL tree. Find appropriate initial conditions for these difference equations. Solve both these initial value problems, and thus obtain bounds on the number of vertices in an AVL tree of height h.
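As a worked check of the four-step algorithm (this code is an addition, not part of the original notes), take exercise 2 above: x_n = 2x_{n−1} + 1 with x_0 = 0. Step 1 gives the characteristic root λ = 2; step 2, with forcing 1·1^n and 1 not a root, gives a constant particular solution v = 2v + 1, so v = −1; steps 3-4 give x_n = a·2^n − 1 with x_0 = 0 forcing a = 1:

```python
# Closed form from the algorithm:  x_n = 2^n - 1  for
# x_n = 2 x_{n-1} + 1, x_0 = 0.

def closed_form(n):
    return 2 ** n - 1

x = 0
for n in range(1, 30):
    x = 2 * x + 1                  # iterate the recurrence
    assert x == closed_form(n)     # agree at every step
```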
9.4 Non-Negative Difference Equations

In this section we will consider nonnegative difference equations, that is, equations which can be written in the form

x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k} + g(n),

where c_i ≥ 0 for each i and g(n) ≥ 0. We will discover some facts that make these equations easier to solve than the equations we have dealt with above. In particular, we will give an easy method to determine the asymptotic solution of these nonnegative equations.

9.4.1 Homogeneous Nonnegative Difference Equations

First we will consider the special case when g(n) = 0. As usual, we call these equations homogeneous. The term homogeneous is used because we could replace each occurrence of x by ax and then divide the equation by a to obtain the original equation. A homogeneous nonnegative difference equation can be written as

x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k},

where c_i ≥ 0 for each i. We will also assume that our equations are aperiodic in the sense that gcd{ i | c_i > 0 } = 1. This assumption is usually satisfied by the equations that arise in practice, and the assumption allows the results to be written more simply.

Theorem 5 An aperiodic nonnegative difference equation has a solution of the form x_n = λ_0^n, where λ_0 is the unique positive real root of λ^k − c_1 λ^{k−1} − ... − c_k. Further, for any other root λ_i of this polynomial, λ_0 > |λ_i|.

Theorem 6 If x_0, x_1, ..., x_{k−1} are the initial values for an aperiodic nonnegative difference equation, then the solution is x_n = Θ(λ_0^n) if x_i ≥ 0 for each i ∈ {0, 1, ..., k−1} and, for at least one such i, x_i > 0.

For example, we can apply these theorems to the Fibonacci numbers, which have the difference equation f_n = f_{n−1} + f_{n−2} and the initial conditions f_0 = 0, f_1 = 1. Here the coefficients are c_1 = c_2 = 1, and the initial conditions have f_0 ≥ 0 and f_1 > 0. So the theorems say that f_n = Θ(λ_0^n), where λ_0 is the unique positive real root of λ² − λ − 1. For this example it is easy to calculate that λ_0 = (1 + √5)/2, to verify that λ_1 = (1 − √5)/2, and to see that λ_0 > |λ_1|. Notice that f_n is always an integer, but λ_0^n is an irrational number.

Another simple example is x_n = x_{n−1} + 2x_{n−2} with x_0 = 1, x_1 = 2. The solution is x_n = 2^n, which is clearly Θ(2^n).
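Theorem 6 can be seen numerically for the Fibonacci example: the ratio f_n / λ_0^n settles to a positive constant (in fact 1/√5, the coefficient from Binet's formula). The check below is an addition, not part of the original notes:

```python
# f_n = Theta(l0^n) with l0 = (1+sqrt(5))/2: the ratio f_n / l0^n
# converges to 1/sqrt(5).
import math

l0 = (1 + math.sqrt(5)) / 2
f = [0, 1]
for _ in range(50):
    f.append(f[-1] + f[-2])
ratio = f[50] / l0 ** 50
assert abs(ratio - 1 / math.sqrt(5)) < 1e-9
```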
Consider this equation with initial conditions x_0 = 0, x_1 = 1.5, and show that the solution is still Θ(2^n). Consider the same equation with initial conditions x_0 = −4, x_1 = 7. Show that x_n > 0 if n ≥ 3, and that x_n = Θ(2^n) still holds. On the other hand, consider the equation with initial conditions x_0 = −1, x_1 = 1, and show that x_n ≠ Θ(2^n).

9.4.2 Equations With Forcing Functions

Next we want to consider nonnegative difference equations with nonnegative forcing functions, that is,

x_n = Σ_{i=1}^{k} c_i x_{n−i} + g(n).

If we assume that g(n) > 0, or x_i ≥ 0 for each i ∈ {0, 1, ..., k−1}, then it is easy to see that x_n > 0 for all n ≥ k. This assumption will make the proofs simpler. There are three situations we want to consider:

• g(n) is much less than λ_0^n;
• g(n) is λ_0^n times a polynomial in n;
• g(n) is much greater than λ_0^n.

We will show that the solutions behave almost as one would expect. So if g(n) is much less than λ_0^n, x_n = Θ(λ_0^n), and if g(n) is much greater than λ_0^n, then x_n = Θ(g(n)). The behavior in the remaining case may be a little unexpected. If g(n) = n^d λ_0^n, then x_n = Θ(n^{d+1} λ_0^n); that is, the response to forcing by a polynomial times λ_0^n is a polynomial of one higher degree times λ_0^n. The most common situation of this is g(n) = λ_0^n and x_n = Θ(n λ_0^n). Notice that in this situation you don't see the polynomial: g(n) = 1·λ_0^n, and the polynomial 1 is invisible. Of course, 1 is a polynomial of degree 0, and so the response will involve a polynomial of degree 1. Here again there is a problem of invisibility, because n = n¹ and we usually don't write the exponent.

To prove these results we will make use of the following lemma about nonnegative polynomials.

Lemma 1 Let p(λ) = λ^k − Σ_{i=1}^{k} c_i λ^{k−i} be the characteristic polynomial for a nonnegative difference equation. Then p has a single positive real root λ_0, and p(λ) < 0 for all λ ∈ (0, λ_0), and p(λ) > 0 for all λ > λ_0.

The next three lemmas consider the various growth rates for g(n) and find the asymptotic order of the solution of a nonnegative difference equation with g(n) as the forcing function.

Lemma 2 The nonnegative difference equation

x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k} + g(n)

with g(n) = O(λ_1^n), for some λ_1 with 0 < λ_1 < λ_0, has the solution x_n = Θ(λ_0^n).

Proof: Clearly x_n ≥ Σ_{i=1}^{k} c_i x_{n−i}, since g(n) ≥ 0. Since x_n > 0 for all n ≥ k, one can find a b > 0 so that x_n ≥ b λ_0^n holds for n in {k, ..., 2k−1}. Then

x_n ≥ Σ_{i=1}^{k} c_i x_{n−i} ≥ Σ_{i=1}^{k} c_i b λ_0^{n−i} = b λ_0^{n−k} Σ_{i=1}^{k} c_i λ_0^{k−i} = b λ_0^{n−k} λ_0^k = b λ_0^n,

and by induction x_n ≥ b λ_0^n for all n ≥ k. For the upper bound,

x_n ≤ Σ_{i=1}^{k} c_i x_{n−i} + g(n) ≤ Σ_{i=1}^{k} c_i x_{n−i} + B λ_1^n

for all big enough n and some constant B. Now assume that x_{n−i} ≤ a λ_0^{n−i} − γ λ_1^{n−i} for big enough n and i in {1, ..., k}; then

x_n ≤ Σ_{i=1}^{k} c_i (a λ_0^{n−i} − γ λ_1^{n−i}) + B λ_1^n = a λ_0^n − γ λ_1^n + λ_1^{n−k} ( γ p(λ_1) + B λ_1^k ).

So we can choose γ large enough that γ p(λ_1) + B λ_1^k ≤ 0, which is possible since, by the lemma on polynomials, p(λ_1) < 0; thus x_n ≤ a λ_0^n − γ λ_1^n ≤ a λ_0^n, and the induction goes through. Putting the upper and lower bounds together gives x_n = Θ(λ_0^n). □

Lemma 3 The nonnegative difference equation x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k} + g(n) with g(n) = Θ(λ_2^n), for some λ_2 with 0 < λ_0 < λ_2, has the solution x_n = Θ(λ_2^n).

Proof: This proof is left as an exercise. □

Lemma 4 The nonnegative difference equation x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k} + g(n) with g(n) = n^d λ_0^n has the solution x_n = Θ(n^{d+1} λ_0^n).

Proof: To simplify the proof, divide the difference equation by λ_0^n to get

x_n / λ_0^n = Σ_{i=1}^{k} (c_i / λ_0^i)(x_{n−i} / λ_0^{n−i}) + n^d.

Next define y_n = x_n / λ_0^n and b_i = c_i / λ_0^i to obtain

y_n = Σ_{i=1}^{k} b_i y_{n−i} + n^d,

with the condition that Σ_{i=1}^{k} b_i = 1 (this is just the characteristic equation λ_0^k = Σ c_i λ_0^{k−i} divided by λ_0^k). For the upper bound, assume that you can find an a so that y_{n−i} ≤ a(n−i)^{d+1}. Obviously, if n is big enough, a can be chosen so that this bound holds for k consecutive y_i's. Then

y_n ≤ Σ_{i=1}^{k} b_i a(n−i)^{d+1} + n^d ≤ a(n−1)^{d+1} Σ_{i=1}^{k} b_i + n^d = a(n−1)^{d+1} + n^d,

because (n−i) ≤ (n−1) and Σ b_i = 1. We are left with showing that a(n−1)^{d+1} + n^d ≤ a n^{d+1}, or equivalently a(n−1)^{d+1} ≤ a n^{d+1} − n^d. Factoring the right hand side leaves a(n−1)^{d+1} ≤ n^d (an − 1); but clearly (n−1)^d < n^d, and so we only need a(n−1) ≤ an − 1, and this relation holds for all a ≥ 1. Notice that for the base of this induction we had to choose a large enough to give the bound y_{n−i} ≤ a(n−i)^{d+1}, and now we find that we also need a ≥ 1. So we can certainly pick a big enough to satisfy both of these conditions.

For the lower bound, assume that we can find a β > 0 so that y_{n−i} ≥ β(n−i)^{d+1} holds for k consecutive y_i's. Then

y_n ≥ Σ_{i=1}^{k} b_i β(n−i)^{d+1} + n^d ≥ β(n−k)^{d+1} + n^d,

because n−i ≥ n−k and Σ b_i = 1. We are left with showing that β(n−k)^{d+1} + n^d ≥ β n^{d+1}, or equivalently n^d ≥ β( n^{d+1} − (n−k)^{d+1} ). The trick here is to rewrite n^{d+1} as ((n−k) + k)^{d+1} and use the binomial expansion:

n^{d+1} − (n−k)^{d+1} = Σ_{i=1}^{d+1} C(d+1, i) (n−k)^{d+1−i} k^i ≤ (n−k)^d Σ_{i=1}^{d+1} C(d+1, i) k^i ≤ (n−k)^d (k+1)^{d+1}.

In this calculation we have changed the limits of the summation, and we have used the fact that (n−k)^{1−i} ≤ 1 for i ≥ 1 and n ≥ k+1. Now n^d ≥ β (n−k)^d (k+1)^{d+1} ≥ β( n^{d+1} − (n−k)^{d+1} ) by choosing β small enough, since (n−k)^d ≤ n^d. Notice that to start this induction we only had to choose β small enough to give a lower bound on the y_{n−i}, so our choices of β are consistent. Putting the upper and lower bounds together, we have y_n = Θ(n^{d+1}), and retranslating this result to x_n, we have x_n = Θ(n^{d+1} λ_0^n). □

Because our difference equations are linear, we have the familiar superposition principle; that is, if x_n is a solution to

x_n = Σ_{i=1}^{k} c_i x_{n−i} + g(n)

and y_n is a solution to

y_n = Σ_{i=1}^{k} c_i y_{n−i} + f(n),

then z = x + y is a solution to

z_n = Σ_{i=1}^{k} c_i z_{n−i} + g(n) + f(n).

Using the superposition principle, we can restate the previous lemma as:

Lemma 5 The nonnegative difference equation x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k} + h(n) λ_0^n, where h(n) is a polynomial of degree d in n and λ_0 is the positive real root of λ^k − Σ_{i=1}^{k} c_i λ^{k−i}, has the solution x_n = Θ(n^{d+1} λ_0^n).

The special case when the polynomial has degree zero arises so often in practice that we restate this special case.

Lemma 6 The nonnegative difference equation x_n = c_1 x_{n−1} + c_2 x_{n−2} + ... + c_k x_{n−k} + b λ_0^n, where λ_0 is the positive real root of λ^k − Σ_{i=1}^{k} c_i λ^{k−i}, has the solution x_n = Θ(n λ_0^n).

9.5 Homogeneous: Periodic and Primitive

Homogeneous nonnegative recurrences can be written in the form

(HNN) s_n = c_1 s_{n−1} + c_2 s_{n−2} + ... + c_k s_{n−k},

where all c_i ≥ 0 and c_k ≠ 0. Since the characteristic polynomial of this equation is a nonnegative polynomial, there is a dominant eigenvalue, which we call λ_0. Let g = gcd{ i | c_i > 0 }. If g = 1, the equation is primitive, and for all other eigenvalues λ_j, λ_0 > |λ_j|. If g > 1, the equation is periodic and has g eigenvalues λ_0, ..., λ_{g−1}, all with absolute value equal to |λ_0|.
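The primitive/periodic distinction is easy to see numerically. The sketch below (an addition, not in the original notes) runs s_n = 4 s_{n−2}, which has g = gcd{ i | c_i > 0 } = 2, and then adds a positive c_1 term to make the equation primitive:

```python
# Periodic vs. primitive second-order nonnegative recurrences.

def run(c1, c2, s0, s1, steps):
    s = [s0, s1]
    for _ in range(steps):
        s.append(c1 * s[-1] + c2 * s[-2])
    return s

# s_n = 4 s_{n-2} with s_0 = 1, s_1 = 0: periodic (g = 2), so every
# odd-indexed term stays 0 -- s_n is O(2^n) but not Theta(2^n).
periodic = run(0, 4, 1, 0, 20)
assert all(periodic[n] == 0 for n in range(1, 21, 2))       # odd terms 0
assert all(periodic[n] == 2 ** n for n in range(0, 21, 2))  # even terms 2^n

# Adding any positive c_1 makes g = 1 (primitive); the sequence is
# then eventually positive, as the corollary below asserts.
primitive = run(1, 4, 1, 0, 20)
assert all(v > 0 for v in primitive[2:])
```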
An example of a simple periodic nonnegative difference equation is s_n = 4 s_{n−2}, which has periodic characteristic polynomial ch(x) = x² − 4 and dominant root λ_0 = 2. With initial conditions s_0 = 1 = s_1, the solution is s_n = 4^{⌊n/2⌋} and s_n = Θ(2^n). When the initial conditions are s_0 = 1, s_1 = 0, the solution is

s_n = 4^{n/2} when n is even, and s_n = 0 when n is odd,

and s_n = O(2^n), but s_n ≠ Θ(2^n), since every odd position in the sequence is 0. The next theorem and its corollary elucidate the difference between periodic and primitive equations.

Theorem 7 Let λ_0 be the dominant eigenvalue of the homogeneous nonnegative recurrence (HNN) with nonnegative initial values s_0, ..., s_{k−1}. If <s_n> is the solution, then s_n = O(λ_0^n). If, in addition, there are k consecutive elements in <s_n> which are positive, then s_n = Θ(λ_0^n).

For example, the Fibonacci recurrence is a homogeneous nonnegative recurrence in which all terms after the first are positive. Therefore this result says that f_n = Θ(λ_0^n), where λ_0 = (1 + √5)/2 is the unique positive eigenvalue. Another simple example is s_n = s_{n−1} + 2s_{n−2} with s_0 = 1, s_1 = 2. Since ch(x) = x² − x − 2 = (x + 1)(x − 2), and the first two conditions are positive, the solution is Θ(2^n).

Corollary 1 Let <s_n> be the solution to a homogeneous nonnegative recurrence with nonnegative initial conditions, where not all initial conditions are zero. If the characteristic polynomial is primitive, then <s_n> is eventually positive and s_n = Θ(λ_0^n).

We should note that a periodic equation whose index of imprimitivity is g > 1 (such as s_n = 4s_{n−2}, the example at the beginning of this section, with g = 2) can be expressed as a system of g primitive equations, because

s_n = c_g s_{n−g} + c_{2g} s_{n−2g} + ... + c_{rg} s_{n−rg}

can be written as the primitive system

t_n^(j) = c_g t_{n−1}^(j) + c_{2g} t_{n−2}^(j) + ... + c_{rg} t_{n−r}^(j), for j = 0, 1, ..., g−1,

where the initial conditions for each <t^(j)> are the initial conditions of <s_n> whose subscripts equal j mod g. If all initial conditions for a particular j are zero, the sequence <t^(j)> is the zero sequence. Otherwise, at least one initial condition for <t^(j)> is positive, and by the corollary t_n^(j) = Θ(λ_0^{gn}). Translating this back to the original sequence <s_n>, we see that for a fixed j, the subsequence whose subscripts satisfy the arithmetic progression n ≡ j mod g can either be the zero sequence or it will grow like λ_0^n. In some sense we can therefore regard <s_n> as a periodic sequence with period g, or under some special circumstances the period of <s_n> may be a divisor of g.

9.6 Summary: Nonhomogeneous Nonnegative Equations

In this section we look at nonhomogeneous nonnegative equations, which have one of three types of nonzero inputs.

Theorem 8 Consider a nonnegative difference equation

(9.1) s_n = c_1 s_{n−1} + c_2 s_{n−2} + ... + c_k s_{n−k} + g(n),

where g is a function such that the value of g(n) is nonnegative for each natural number n. Let λ_0 be the dominant eigenvalue of the recurrence. If s_n > 0 for all sufficiently large n, then

• s_n = Θ(λ_0^n) when g(n) = O(λ_1^n) for some positive λ_1 < λ_0;
• s_n = Θ(g(n)) when g(n) = Θ(λ_2^n) for some λ_2 > λ_0;
• s_n = Θ(n^{d+1} λ_0^n) when g(n) = f(n) λ_0^n for a polynomial f with deg(f) = d.

These three types of nonnegative forcing functions respectively correspond to the situations when g(n) is much less than λ_0^n, much greater than λ_0^n, and equal to λ_0^n times a polynomial in n. For the last two types of forcing functions, the hypothesis requiring s_n to be eventually positive is superfluous. In the first type, the required eventual positivity follows once there are k consecutive positive terms, and that can result from positive initial conditions or positive g(n) or some combination of these forms of positiveness. Notice that the solutions asymptotically behave about as one would expect, except that the behavior in the third case may be a little unexpected. In that case the response to forcing by a polynomial times λ_0^n results in a polynomial of one higher degree times λ_0^n. The most common occurrence of this is with g(n) = λ_0^n, for which s_n = Θ(n λ_0^n) holds. The remainder of this section is occupied with proving the theorem.
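The middle case of Theorem 8 can be checked numerically. In the example below (added; not from the original notes), the recurrence has dominant eigenvalue λ_0 = 3 (from x² − x − 6) while the forcing 4^n grows faster, so the theorem predicts s_n = Θ(4^n); the ratio s_n / 4^n converges to 8/3, the coefficient of the particular solution (8/3)·4^n:

```python
# s_n = s_{n-1} + 6 s_{n-2} + 4^n  with  l0 = 3 < 4 = l2,
# so s_n = Theta(4^n); s_n / 4^n -> 8/3.

s = [1, 1]
for n in range(2, 60):
    s.append(s[-1] + 6 * s[-2] + 4 ** n)
ratio = s[59] / 4 ** 59
assert abs(ratio - 8 / 3) < 1e-6
```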
The following special case arises often in practice.

Corollary 2 For any positive constant b, any nonnegative difference equation of the form

s_n = c_1 s_{n−1} + c_2 s_{n−2} + ... + c_k s_{n−k} + b λ_0^n

has solution s_n = Θ(n λ_0^n), where λ_0 is the dominant eigenvalue.

There are many variations of the results in Theorem 8. Because these variations are proved in essentially the same way as the results in the theorem, we simply state them here and leave their proofs as exercises.

• If g(n) = O(λ_1^n) for λ_1 < λ_0, then s_n = O(λ_0^n).
  (a) If the difference equation is primitive and g(n) = O(λ_1^n) for λ_1 < λ_0, then s_n = Θ(λ_0^n).
  (b) If s_n > 0 for k consecutive values of n and g(n) = O(λ_1^n) for λ_1 < λ_0, then s_n = Θ(λ_0^n).
• Let G(n) = max{ g(n), g(n−1), ..., g(1) }. If g(n) = Ω(λ_2^n) for λ_2 > λ_0, then
  (a) s_n = O(G(n));
  (b) if G(n) = O(g(n)), then s_n = Θ(g(n)).
• If g(n) = h(n) λ_0^n for some nonnegative h(n), then s_n = O(n h(n) λ_0^n).
• If g(n) = h(n) λ_0^n for some nonnegative h(n) which is nondecreasing, then s_n = Ω(n h(n) λ_0^n).

Chapter 10

Worked Examples

Here we consider k-th order linear recurrences, that is, equations of the form

(L) s_n − c_1 s_{n−1} − c_2 s_{n−2} − ... − c_k s_{n−k} = ψ(n) for n ≥ k,

where the c_i are complex scalars with c_k ≠ 0, and ψ is a complex-valued function. We can find a nice solution for the special cases in which the forcing function has the form ψ(n) = λ^n p(n), where λ is a fixed scalar and p is a polynomial with complex coefficients. The initial value problems we consider here have this form. Throughout, λ_1, ..., λ_t are the different eigenvalues of the recurrence.

10.1 All Simple Roots

All examples in this section are second-order linear equations whose characteristic polynomial is

ch(x) = x² − x − 6 = (x − 3)(x + 2),

which has the simple roots λ_1 = 3 and λ_2 = −2. The general solution of the homogeneous system has the form

s_n = α_1 3^n + α_2 (−2)^n

for some α_1, α_2 ∈ C, and for any ψ the equation s_n = s_{n−1} + 6 s_{n−2} + ψ(n) has the general solution

s_n = α_1 3^n + α_2 (−2)^n + v_n

for some particular solution v_n. The constants α_1, α_2 are determined by the initial conditions. To find a particular solution to the nonhomogeneous equation s_n = s_{n−1} + 6 s_{n−2} + λ^n p(n), where p is a nonzero polynomial and λ is a scalar, we use the following rule.
Solving (L) when there are no multiple roots. When ch(x) has no multiple roots, then

(10.1) s_n − c_1 s_{n−1} − c_2 s_{n−2} − ... − c_k s_{n−k} = λ^n p(n)

if and only if

(10.2) s_n = Σ_{i=1}^{k} α_i λ_i^n + λ^n n^δ q(n),

where α_1, ..., α_k ∈ C, q(n) is a polynomial with deg(q) = deg(p), and

(10.3) δ = 0 if λ ∉ {λ_1, ..., λ_t}, and δ = 1 if λ ∈ {λ_1, ..., λ_t}.

A particular solution is v_n = λ^n n^δ q(n).

Let's first analyze the homogeneous initial value problem.

Example 10.1.1 For any initial value problem with equation s_n = s_{n−1} + 6 s_{n−2}, we have s_n = α_1 3^n + α_2 (−2)^n, and the initial conditions s_0, s_1 give

(10.4) s_0 = α_1 + α_2 and s_1 = 3α_1 − 2α_2.

Multiplying the first equation by 3 and subtracting the second from it, we obtain 3s_0 − s_1 = 5α_2 and α_2 = (3s_0 − s_1)/5. Inserting this value for α_2 into the first equation of (10.4), we have α_1 = (2s_0 + s_1)/5, and the solution to the homogeneous initial value problem is

s_n = ((2s_0 + s_1)/5) 3^n + ((3s_0 − s_1)/5) (−2)^n.

Example 10.1.2 Consider the second-order equation

s_n = s_{n−1} + 6 s_{n−2} + 2^n,

which has deg(p) = 0 and λ = 2 ∉ {3, −2}, and so deg(q) = 0 and δ = 0. The above formula gives the particular solution v_n = c 2^n, and the recurrence can be used to solve for c:

v_{n−1} + 6 v_{n−2} + 2^n − v_n = c 2^{n−1} + 6c 2^{n−2} + 2^n − c 2^n = 2^{n−2}(2c + 6c + 4 − 4c) = 2^{n−2}(4c + 4).

Therefore 4c + 4 = 0, which gives the particular solution v_n = −2^n, and

(10.5) s_n = α_1 3^n + α_2 (−2)^n − 2^n

for some α_1, α_2 ∈ C. We can write this in terms of the initial values s_0, s_1; namely,

(10.6) s_0 = α_1 + α_2 − 1 and s_1 = 3α_1 − 2α_2 − 2,

which can be solved for α_1, α_2 as in the previous example. Multiplying the first equation by 2 and adding it to the second, we have 2s_0 + s_1 = 5α_1 − 4, and α_1 = (2s_0 + s_1 + 4)/5. Substituting this value of α_1 into the first equation of (10.6), we obtain α_2 = (3s_0 − s_1 + 1)/5, and (10.5) becomes

s_n = ((2s_0 + s_1 + 4)/5) 3^n + ((3s_0 − s_1 + 1)/5) (−2)^n − 2^n.

To satisfy ourselves that this has been done correctly, let us use the last expression to verify that

s_2 = (9(2s_0 + s_1 + 4) + 4(3s_0 − s_1 + 1))/5 − 4 = 6s_0 + s_1 + 4,

which is consistent with the recurrence, since s_2 = s_1 + 6s_0 + 2².
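The closed form of Example 10.1.2 can also be checked by machine for several sets of initial values. The script below (an addition to the notes) uses exact rational arithmetic so that the comparison is not clouded by rounding:

```python
# Verify Example 10.1.2:  for  s_n = s_{n-1} + 6 s_{n-2} + 2^n,
#   s_n = ((2 s0 + s1 + 4)/5) 3^n + ((3 s0 - s1 + 1)/5) (-2)^n - 2^n.
from fractions import Fraction

def closed(s0, s1, n):
    a1 = Fraction(2 * s0 + s1 + 4, 5)
    a2 = Fraction(3 * s0 - s1 + 1, 5)
    return a1 * 3 ** n + a2 * (-2) ** n - 2 ** n

for s0, s1 in [(0, 0), (1, 2), (-3, 7)]:
    s = [s0, s1]
    for n in range(2, 15):
        s.append(s[-1] + 6 * s[-2] + 2 ** n)
    assert all(closed(s0, s1, n) == s[n] for n in range(15))
```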
Example 10.1.3 For the second-order equation

(10.7) s_n = s_{n−1} + 6 s_{n−2} + n 2^n,

let us first check that there is no constant c for which w_n = c 2^n is a particular solution. We have

w_{n−1} + 6 w_{n−2} + n 2^n − w_n = 2^{n−2}(2c + 6c + 4n − 4c) = 2^{n−2}(4c + 4n),

which cannot be zero for all n when c is a constant. Rather, since deg(p) = 1 and λ = 2 ∉ {3, −2}, the formula says that there exists a particular solution of the form v_n = (an + b) 2^n for some constants a, b. The recurrence gives

v_{n−1} + 6 v_{n−2} + n 2^n − v_n = (a(n−1) + b) 2^{n−1} + 6(a(n−2) + b) 2^{n−2} + n 2^n − (an + b) 2^n = 2^{n−1}((2a + 2)n − 7a + 2b),

which must equal zero for all n; that is, 2a + 2 = 0 and −7a + 2b = 0. This gives a = −1, b = −7/2, and v_n = −(2n + 7) 2^{n−1} is a particular solution of (10.7), and the general solution has the form

s_n = α_1 3^n + α_2 (−2)^n − (2n + 7) 2^{n−1},

where α_1, α_2 are constants in C. For initial values s_0, s_1 we obtain

s_0 = α_1 + α_2 − 7/2 and s_1 = 3α_1 − 2α_2 − 9.

Simultaneously solving this system of equations yields

α_1 = (2s_0 + s_1 + 16)/5 and α_2 = (6s_0 − 2s_1 + 3)/10,

which gives

s_n = ((2s_0 + s_1 + 16)/5) 3^n + ((6s_0 − 2s_1 + 3)/10) (−2)^n − (2n + 7) 2^{n−1}.

Verifying this calculation for n = 2 shows that s_2 = 6s_0 + s_1 + 30 − 22 = 6s_0 + s_1 + 8 = s_1 + 6s_0 + 2·2², as required.

Example 10.1.4 For s_n = s_{n−1} + 6 s_{n−2} + 3^n, deg(p) = 0 and λ = λ_1, which means that δ = 1 and the formula gives v_n = 3^n b n for some b ∈ C. Note that w_n = c 3^n is a solution to the homogeneous equation and so cannot be a particular solution here. (Also, v_n = 3^n(bn + a) could be used here, but the constant term can be absorbed into the earlier coefficient of 3^n.) To find b, use

v_{n−1} + 6 v_{n−2} + 3^n − v_n = b(n−1) 3^{n−1} + 6b(n−2) 3^{n−2} + 3^n − bn 3^n = 3^{n−2}(9 − 15b),

and from this b = 3/5 and v_n = n 3^{n+1}/5. Therefore

s_n = α_1 3^n + α_2 (−2)^n + n 3^{n+1}/5,

and so s_0 = α_1 + α_2 and s_1 = 3α_1 − 2α_2 + 9/5, which gives

α_1 = (2s_0 + s_1 − 9/5)/5 and α_2 = (3s_0 − s_1 + 9/5)/5.

Hence s_n = ((2s_0 + s_1 − 9/5)/5) 3^n + ((3s_0 − s_1 + 9/5)/5) (−2)^n + n 3^{n+1}/5.

Example 10.1.5 For s_n = s_{n−1} + 6 s_{n−2} + n(−2)^n, deg(p) = 1 and λ = λ_2, from which we obtain δ = 1 and v_n = (−2)^n q(n), where q(n) = cn² + bn for some b, c ∈ C. Then

v_{n−1} + 6 v_{n−2} + n(−2)^n − v_n = (−2)^{n−2}(20c·n − 22c + 10b − 4n)·(−1)... that is, (−2)^{n−2}((20c − 4)n + (10b − 22c))

must equal zero for all integers n ≥ 2. In particular, matching the coefficient of n and the constant term (equivalently, using n = 2 and n = 3), we obtain c = 1/5 and b = 11/25. Therefore q(n) = n(5n + 11)/25, v_n = n(5n + 11)(−2)^n / 25, and

s_n = α_1 3^n + α_2 (−2)^n + n(5n + 11)(−2)^n / 25.

Substituting for n = 0 and n = 1, we have s_0 = α_1 + α_2 and s_1 = 3α_1 − 2α_2 − 32/25, and from this we obtain

α_1 = (2s_0 + s_1 + 32/25)/5 and α_2 = (3s_0 − s_1 − 32/25)/5.

10.2 One Multiple Root

Solving (L) when there is a single eigenvalue. When there is a single eigenvalue λ_1, it has multiplicity k, and the rule for particular solutions simplifies:
v_n = λ^n n^δ q(n) is a particular solution, where q is a polynomial with deg(q) = deg(p) and

(10.8) δ = 0 if λ ≠ λ_1, and δ = k if λ = λ_1.

If λ ≠ λ_1, the solution is s_n = λ_1^n q_1(n) + λ^n q(n), where deg(q) = deg(p), deg(q_1) ≤ k − 1, and the coefficients of q_1 are determined from the initial conditions. If λ = λ_1, the solution is s_n = λ_1^n q_1(n) + λ_1^n n^k q(n) = λ_1^n Q(n), with deg(Q) = k + deg(p).

The five examples in this section are second-order equations, and each has the characteristic polynomial ch(x) = x² − 4x + 4 = (x − 2)². This has the double root λ_1 = 2, which means that the general solution is s_n = (α_1 n + α_2) 2^n + v_n, where v_n is a particular solution. As in the last section, v_n depends on ψ, and α_1, α_2 can be calculated from the initial conditions.

Example 10.2.1 For any initial value problem with recurrence s_n = 4 s_{n−1} − 4 s_{n−2}, we have s_n = (α_1 n + α_2) 2^n, where the constants α_1 = (s_1 − 2s_0)/2 and α_2 = s_0 can be computed from the initial conditions s_0, s_1, and the solution is

s_n = (s_1 − 2s_0) n 2^{n−1} + s_0 2^n.

Example 10.2.2 Consider the second-order equation s_n = 4 s_{n−1} − 4 s_{n−2} + 3^n, where deg(p) = 0 and λ = 3 is not an eigenvalue of the recurrence. From the formula, a particular solution is v_n = c 3^n for some constant c, and

4 v_{n−1} − 4 v_{n−2} + 3^n − v_n = 3^{n−2}(12c − 4c + 9 − 9c) = 3^{n−2}(9 − c),

which equals zero for c = 9. This gives v_n = 3^{n+2} and general solution s_n = (α_1 n + α_2) 2^n + 3^{n+2} for some α_1, α_2 ∈ C. For the initial conditions s_0, s_1, computation gives

s_n = (s_1 − 2s_0 − 9) n 2^{n−1} + (s_0 − 9) 2^n + 3^{n+2}.

Example 10.2.3 The second-order equation s_n = 4 s_{n−1} − 4 s_{n−2} + n 3^n has deg(p) = 1, and λ = 3 is again not an eigenvalue. Therefore there are constants a, b such that v_n = (an + b) 3^n is a particular solution. Also,

4 v_{n−1} − 4 v_{n−2} + n 3^n − v_n = 3^{n−2}(12(an − a + b) − 4(an − 2a + b) + 9n − 9(an + b)) = 3^{n−2}((9 − a)n − 4a − b),

which equals zero for all integers n ≥ 2 when a = 9 and b = −4a = −36. Therefore v_n = (9n − 36) 3^n = (n − 4) 3^{n+2} is a particular solution, and the general solution has the form s_n = (α_1 n + α_2) 2^n + (n − 4) 3^{n+2}. For initial values s_0, s_1 we obtain

s_n = (s_1 − 2s_0 + 9) n 2^{n−1} + (s_0 + 36) 2^n + (n − 4) 3^{n+2}.
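The double-root closed form of Example 10.2.1 can be confirmed mechanically; this check is an addition, not part of the original notes:

```python
# Verify Example 10.2.1:  for  s_n = 4 s_{n-1} - 4 s_{n-2},
#   s_n = (s1 - 2 s0) n 2^(n-1) + s0 2^n.

def closed(s0, s1, n):
    if n == 0:
        return s0
    return (s1 - 2 * s0) * n * 2 ** (n - 1) + s0 * 2 ** n

for s0, s1 in [(1, 1), (0, 3), (2, -5)]:
    s = [s0, s1]
    for _ in range(2, 20):
        s.append(4 * s[-1] - 4 * s[-2])
    assert all(closed(s0, s1, n) == s[n] for n in range(20))
```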
Then 4vn71 7 4vn72 2 7 Un 2n2an 712 7 an 7 22 17 an2 2 72a 1 104 implying a Which means that vn n22n 1 is a particular solution The general solution therefore has the form 3n n2 mm a22n717 and from the initial values so 31 we obtain 3n n2 31 717 230n 2802nili Example 101215 The second7order equation 3n 43n1 7 437 2 n has degp 17 and A 2 is again the double eigenvalue of the recurrence7 Which means that vn 2 qn is a particular solution for some qn an3bn2i Then 4vn1 7 41 n2 7 Un 2712401 1 401 i 2 n 40017 Which must equal zero for all values of n 2 2 From 24n717qn72n7qn 0 for n 273 we obtain a 16 and b 3a 121 Therefore7 vn n3 n22 1 is a particular solution7 and the general solution has the form n3 3n n2 mm aggt2nili lnitial values so 31 give 3 4 3n n2 31 7 230 7 n 230 2nili Since from 1018 the solution is With a polynomial of degree 37 the same solution could have been obtained by solving a system of four linear equations to nd the coef cients of To obtain these coef cients as functions of only so and 31 the recurrence 1025 could be used twice to give 32 and 33 in terms of so an 81 105 103 One Multiple Root Several Simple Roots Solving L when there is a multiple eigenvalue When m1 2 2 an m2 mt l are the respective multiplicities of the distinct eigenvalues A1 A2 i i A of the recurrence L then 1 Ann6qn is a particular solution where q is a polynomial with degq degp and o wwlmm 109 5 m1 ifA A1 1 ifA 2iHt Since each of the last ve examples had only one double root The last alternative of 109 did not occur The next example illustrates this case Example lOiSil The second7order equation sn 4sn1 7 5572 2sn3 2 has degp 0 and A 2 Since the characteristic polynomial factors as chz 13 7 412 51 7 2 I 7 l2z 7 2 A 2 is a simple root of chz and 6 l Which gives the particular solution 1 an2ni hen 411771 7 511772 211773 2 7 1 2 28an 71 7 5an 7 2 an 7 3 4 7 4an 2 24 7 a and 1 4n2 n2n2 is a particular solution and the general solution is sn aln a2 a32 n2n2 aln a2 a3 4n2 i For initial 
values s₀, s₁, s₂ we obtain s₀ = a₂ + a₃, s₁ = a₁ + a₂ + 2a₃ + 8, and s₂ = 2a₁ + a₂ + 4a₃ + 32, which implies s_n = (−2s₀ + 3s₁ − s₂ + 8)n + (2s₁ − s₂ + 16) + (s₀ − 2s₁ + s₂ − 16)2^n + n 2^{n+2}.

10.4 The Input is λ₁^n p₁(n) + λ₂^n p₂(n)

For this form of input one can find a particular solution by taking the sum of particular solutions to two related equations. Of course, there is nothing special about 2: if the input is a sum of j terms, one can find a particular solution by taking the sum of particular solutions to j related equations.

Example 10.4.1. For the second-order equation

(10.10)  s_n = 4s_{n−1} − 4s_{n−2} + 3^n n + 2^n,

we break this equation into two equations, one for the input 3^n n and one for the input 2^n:

(10.11)  v_n = 4v_{n−1} − 4v_{n−2} + 3^n n,
(10.12)  w_n = 4w_{n−1} − 4w_{n−2} + 2^n.

From Example 10.2.3 we have that v_n = (n − 4)3^{n+2} is a particular solution to (10.11), and from Example 10.2.4, w_n = n² 2^{n−1} is a particular solution to (10.12). Combining these, we have that (n − 4)3^{n+2} + n² 2^{n−1} is a particular solution to (10.10). The general solution to (10.10) is then s_n = (a₁n + a₂)2^n + (n − 4)3^{n+2} + n² 2^{n−1}, where a₁ and a₂ depend on the initial conditions s₀ and s₁ and can be found by solving two linear equations.

Chapter 11. Further Reading

11.1 Books

- A.V. Aho, J.E. Hopcroft, and J.D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, MA, 1974.
- Paul Cull, Mary Flahive, and Robby Robson. Difference Equations: From Rabbits to Chaos. Springer-Verlag, New York, 2005.
- M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, San Francisco, CA, 1979.
- D.H. Greene and D.E. Knuth. Mathematics for the Analysis of Algorithms. Birkhäuser, Boston, MA, 1981.
- D. Harel. Algorithmics: The Spirit of Computing. Addison-Wesley, Reading, MA, 1987 (various editions).
- E. Horowitz and S. Sahni. Fundamentals of Computer Algorithms. Computer Science Press, Potomac, MD, 1978.
- J. Kleinberg and E. Tardos. Algorithm Design. Addison-Wesley, Boston, 2005.
- D.E. Knuth. The Art of Computer Programming. Vol. 1: Fundamental Algorithms; Vol. 2: Seminumerical Algorithms; Vol. 3: Sorting and Searching. Addison-Wesley, Reading, MA.
- A. Levitin. The Design and Analysis of Algorithms. Addison-Wesley, Boston, MA, 2003.
- V. Pan. How to Multiply Matrices Faster. Lecture Notes in Computer Science 179, Springer-Verlag, Berlin, 1984.
- H. Rogers. Theory of Recursive Functions and Effective Computability. McGraw-Hill, New York, NY, 1967.
- R. Sedgewick. Algorithms. Addison-Wesley, Reading, MA, 1983.
- J.F. Traub and H. Wozniakowski. A General Theory of Optimal Algorithms. Academic Press, New York, NY, 1980.
- S. Winograd. Arithmetic Complexity of Computations. SIAM, Philadelphia, PA, 1980.
- N. Wirth. Algorithms + Data Structures = Programs. Prentice-Hall, Englewood Cliffs, NJ, 1976.

11.2 Articles

- M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. 2002. www.cse.iitk.ac.in, primality.pdf.
- P. Buneman and L. Levy. The Towers of Hanoi Problem. Information Processing Letters 10 (1980), pp. 243-244.
- S.A. Cook. The Complexity of Theorem-Proving Procedures. Proc. 3rd ACM Symposium on Theory of Computing (1971), pp. 151-158.
- P. Cull and J. DeCurtins. Knight's Tour Revisited. Fibonacci Quarterly 16 (1978), pp. 276-285.
- P. Cull and E.F. Ecklund, Jr. Towers of Hanoi and Analysis of Algorithms. American Mathematical Monthly 92 (1985), pp. 407-420.
- R.M. Karp. Reducibility Among Combinatorial Problems. In Complexity of Computer Computations, ed. R.E. Miller and J.W. Thatcher, Plenum, New York, 1972, pp. 85-104.
- K. McGown, A. Leininger, and P. Cull. Knight's Tour on Boards with Holes is NP-complete. Bulletin of the Institute of Combinatorics and its Applications, 2005.
- V. Strassen. Gaussian Elimination is Not Optimal. Numerische Mathematik 13 (1969), pp. 354-356.
- T.R. Walsh. The Towers of Hanoi Revisited: Moving the Rings by Counting the Moves. Information Processing Letters 15 (1982), pp. 64-67.

[Scanned handwritten worked example: polynomial multiplication via the FFT. Two polynomials are evaluated at roots of unity, the values are multiplied pointwise, and the product's coefficients are recovered by interpolation; the numerical details are not recoverable from the scan.]
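The evaluate/multiply/interpolate scheme in the scanned example can be sketched in a few lines. This is an illustrative Python version of the standard recursive radix-2 method, not the notes' own code; the function names are mine:

```python
# Polynomial multiplication via the FFT: evaluate both polynomials at the
# n-th roots of unity, multiply pointwise, then interpolate with the
# inverse transform (divide by n).
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def poly_mult(p, q):
    """Coefficients (lowest degree first) of the product polynomial p*q."""
    n = 1
    while n < len(p) + len(q) - 1:  # pad to a power of two
        n *= 2
    fp = fft(p + [0] * (n - len(p)))
    fq = fft(q + [0] * (n - len(q)))
    prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
    coeffs = [round(c.real / n) for c in prod]  # divide by n for the inverse
    return coeffs[: len(p) + len(q) - 1]


# (1 + 2x)(3 + x) = 3 + 7x + 2x^2
print(poly_mult([1, 2], [3, 1]))  # -> [3, 7, 2]
```

The rounding step assumes integer coefficients; with O(n log n) evaluation and interpolation, the whole multiplication runs in O(n log n) instead of the O(n²) of the schoolbook method.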
Greedy Algorithms

A greedy algorithm builds a solution in small steps, at each step making the choice that looks best at the moment. Interval scheduling: we have a single resource and a set of requests; request i has a start time s(i) and a finish time f(i), and the resource can be used by only one person at a time. Two requests are compatible if they do not overlap in time. Goal: output a compatible subset of the requests of maximum possible size.

Design a greedy algorithm: use a simple rule to select one request i; once i is selected, reject all requests incompatible with i; repeat until no requests remain. What rule should we use? Choosing the request that starts earliest can be arbitrarily bad; choosing the smallest interval, or the request with the fewest conflicts, also fails on some inputs. The rule that works: choose the request that finishes first, i.e., the request with minimum f(i), so that the resource is freed as soon as possible.

Greedy schedule:
  Input: R, a set of requests with start and finish times.
  Output: A, a compatible set of requests of maximum size.
  A <- empty set
  While R is not empty:
    choose the request i in R with the smallest finish time f(i)
    add i to A
    delete from R all requests that are not compatible with i
  Return A

Claim: A is optimal; in other words, for any compatible solution O, |A| >= |O|. Basic idea of the proof ("greedy stays ahead"): we compare the partial solutions constructed by the greedy rule with an optimal solution and show that the greedy algorithm stays ahead step by step. Let A = {i_1, ..., i_k} be the set of selected requests, in the order the greedy algorithm added them, and let O = {j_1, ..., j_m} be an optimal solution, its requests ordered by finish time. Our goal is to prove that k = m.
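The earliest-finish-time rule above can be written out as a short program (an illustrative Python sketch; the notes themselves give only pseudocode):

```python
# Greedy interval scheduling: repeatedly accept the compatible request
# with the smallest finish time. Requests are (start, finish) pairs.
def greedy_schedule(requests):
    accepted = []
    last_finish = float("-inf")
    # Sorting by finish time lets each greedy choice be read off in order.
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:  # compatible with everything accepted so far
            accepted.append((start, finish))
            last_finish = finish
    return accepted


jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
print(len(greedy_schedule(jobs)))  # -> 3
```

Sorting once and scanning replaces the repeated "delete all incompatible requests" step of the pseudocode, giving the O(n log n) running time discussed below.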
Lemma (greedy stays ahead): for all indices r <= k, f(i_r) <= f(j_r). Proof by induction on r. For r = 1 the statement is true, because the greedy rule first selects the request with the smallest finish time overall. Assume the statement is true for r - 1, so f(i_{r-1}) <= f(j_{r-1}). Since O is compatible, f(j_{r-1}) <= s(j_r), hence f(i_{r-1}) <= s(j_r) and request j_r is still in R when the greedy algorithm selects i_r. Because greedy selects the request in R with the smallest finish time, f(i_r) <= f(j_r).

Now prove that |A| = |O|, i.e., A is optimal. Suppose A is not optimal; then m > k. By the lemma, f(i_k) <= f(j_k). Since m > k, there is a request j_{k+1} in O with s(j_{k+1}) >= f(j_k) >= f(i_k), so j_{k+1} is compatible with every request in A and is still in R after i_k is selected. But the greedy algorithm stops only when R is empty, a contradiction.

Implementation and running time: sort the n requests by finish time in O(n log n); then make a single pass through the sorted list, remembering the finish time f of the last accepted request and accepting request i whenever s(i) >= f, in O(n). Total: O(n log n).

[Scanned lecture notes: Analysis of Algorithms, course overview. Purpose of the class: design and analysis of algorithms for several prototypical problems, with mathematical analysis of correctness and running time; most of this page is not recoverable from the scan.]
[Scanned lecture notes: measuring running time.] The exact running time depends on the machine, so instead we count the number of basic steps an algorithm performs on an input of size n, ignoring machine-dependent constant factors. A function f(n) is O(g(n)) if there are constants c > 0 and n0 such that f(n) <= c*g(n) for all n >= n0; g is then an asymptotic upper bound for f. Constant factors and lower-order terms do not matter asymptotically; for example, 25n² + 43n is O(n²). Symmetrically, f is Ω(g(n)) if f is bounded from below by a positive constant multiple of g, and f is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)).

[Scanned lecture notes: single-source shortest paths.] Problem: given a directed graph G = (V, E), where each edge e = (u, v) has a length l(e) >= 0 representing the distance between nodes u and v, and a start node s, determine a shortest path from s to every reachable node. The length of a path is the sum of the lengths of its edges. The algorithm maintains a set S which contains all nodes whose shortest-path distance d(u) from s has already been determined.
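The procedure these notes go on to develop from this setup is Dijkstra's algorithm; with a binary heap it can be sketched as follows (an illustrative Python version assuming an adjacency-list dictionary; names are my own, not the notes'):

```python
# Dijkstra's algorithm with a binary heap: repeatedly explore the
# unexplored node with the smallest tentative distance d'(v).
import heapq


def dijkstra(graph, s):
    """graph: {node: [(neighbor, edge_length), ...]} with lengths >= 0.
    Returns {node: shortest-path distance from s} for reachable nodes."""
    dist = {s: 0}
    heap = [(0, s)]  # (tentative distance, node); stale entries are skipped
    explored = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in explored:
            continue  # a cheaper entry for u was already processed
        explored.add(u)
        for v, length in graph.get(u, []):
            new_d = d + length
            if v not in dist or new_d < dist[v]:
                dist[v] = new_d
                heapq.heappush(heap, (new_d, v))
    return dist


g = {"s": [("u", 1), ("v", 4)], "u": [("v", 2), ("x", 6)], "v": [("x", 3)]}
print(dijkstra(g, "s"))  # -> {'s': 0, 'u': 1, 'v': 3, 'x': 6}
```

Pushing a fresh entry and skipping stale ones stands in for the ChangeKey operation described below; the asymptotic cost is the same up to the heap holding O(m) entries.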
Start with S = {s} and d(s) = 0. For each unexplored node v not in S, maintain an upper bound d'(v) = min over edges (u, v) with u in S of d(u) + l(u, v): the length of the shortest path to v whose intermediate nodes all lie in S. Greedy rule: pick the unexplored node v with the minimum upper bound d'(v), add v to S, and set d(v) = d'(v). Why is this safe? Any other path from s to v must leave S along some edge (u, y) with u in S, and since all edge lengths are nonnegative its length is at least d(u) + l(u, y) >= d'(y) >= d'(v); so no path to v can be shorter than d'(v). [The scanned worked example (a small graph explored one node at a time, with the upper bounds updated at each step) is not recoverable from the scan.]

Implementation with a priority queue: a heap H is a data structure holding elements with associated numerical values (keys) that supports adding a new element (Insert), deleting and returning the element of minimum key (ExtractMin), and decreasing the key of an element (ChangeKey).

Dijkstra(G, s):
  d'(s) <- 0; d'(v) <- infinity for all v != s; insert all nodes into H keyed by d'
  While H is not empty:
    u <- ExtractMin(H); add u to S; d(u) <- d'(u)
    For each edge (u, v) with v not in S:
      If d(u) + l(u, v) < d'(v) then d'(v) <- d(u) + l(u, v) and ChangeKey(H, v)

With a binary heap each operation takes O(log n), so with n ExtractMins and at most m ChangeKeys the running time is O((n + m) log n).

Algorithms
Prasad Tadepalli

Computers and Computer Science. "Computer Science is no more about computers than Astronomy is
about telescopes." - E.W. Dijkstra (1930-2002). The field of Algorithms, which is at the heart of Computer Science, is much older than computers and Computer Science.

Father of Algorithms. Around 820 AD, Al-Khwarizmi invented and documented many algorithms we now use for operations on numbers. He was a Persian who worked in Baghdad's House of Wisdom. The word "algorithm" comes from his name, as does the word "algebra".

Algorithms: Prehistory. 325-265 BC: Euclid of Alexandria, sometimes called the grandfather of Algorithms, invented geometry and number theory. ~600 AD: the decimal system was invented in India. 780-850 AD: Al-Khwarizmi. 1170-1250 AD: Leonardo Fibonacci popularized algorithms in Europe. 1448: Gutenberg invented the printing press.

Turing Machine and Effective Procedures. Alan Turing (1912-1954) is considered the Father of Computer Science. In 1936 he formalized the notion of an effective procedure as a set of rules that can be mechanically followed by a machine. The rules must be precise and unambiguous. For a procedure to be an algorithm, the machine must eventually stop with the correct answer.

First Digital Computers. ENIAC (Electronic Numerical Integrator and Computer) was the first general-purpose computer (Eckert and Mauchly). Built at the University of Pennsylvania, 1943-46; cost $500K. Used for computing artillery tables. Programmed by manipulating switches and cables.

First Stored-Program Computers. EDVAC (von Neumann, Eckert, and Mauchly, 1949): 1000 44-bit words; addition 846 microseconds, multiplication 2900 microseconds. Pilot ACE (Turing, 1950): 1 MHz clock, fastest at that time.

Some High-Level Programming Languages (all Turing-equivalent): Fortran (John Backus, 1957), Lisp (John McCarthy, 1959), Cobol (Grace Hopper et al., 1959), Algol 60 (John Backus, 1960), Pascal (Niklaus Wirth, 1970), C (Dennis Ritchie, 1972), Prolog (Alain Colmerauer and Philippe Roussel, 1972), C++ (Bjarne Stroustrup, 1986), Java (Sun Microsystems, 1995).

What are algorithms? A precise, mechanical procedure to compute correct answers to all instances of a problem. Must terminate on all possible inputs. Are
described in pseudocode that can be translated to any programming language.

Three questions we ask about every algorithm: Is it correct? How fast does it run? Can we do better?

Applications of Algorithms. Finding a fastest route to an address. Safe encryption of messages. Making faster computer chips. Analyzing social networks to detect social groups. Routing messages optimally in the internet.

Computational Problems. Computational problems are mathematical functions from inputs to outputs. Shortest Path Problem: Input: a map, a starting address, and a destination; Output: directions for a shortest route. Sorting numbers: Input: a sequence of numbers, e.g., 5, 8, 1, 2; Output: the sorted sequence, e.g., 1, 2, 5, 8. A problem instance is one particular possible input, e.g., 5, 8, 1, 2. The size of the instance is the length of its encoding, e.g., the number of numbers to be sorted. We use n to denote problem size.

Do all problems have algorithms? No. Not all problems have algorithms. Problems that have algorithms are called decidable. There are many undecidable problems, e.g., does an arbitrary program ever terminate? Do two programs compute the same function? Some problems most likely do not have efficient algorithms; these are called NP-complete problems. In this class we will only study decidable problems. In CS 321 you will have seen instances of undecidable problems.

How to measure the time? In general, the bigger the problem, the longer it takes to solve it, so we make the time complexity t(n) a function of the problem size n. Even for the same size, some problems are more difficult than others, so we analyze the hardest problem of that size (worst-case complexity). The exact time taken depends on: the programming language used to implement the algorithm; the compiler; the computer hardware; what other jobs are running on the machine; how much memory one has; network speed; etc.

O-Notation. Problem: how do we compare times in a way that does not depend on the details of implementation? Insight: implementation differences only
make a constant-factor difference in time. We say a function f = O(g) if there is a constant c such that f(n) <= c*g(n) for all n. A function f = Ω(g) if g = O(f). A function f = Θ(g) if f = Ω(g) and f = O(g).

Complexity in O-notation. Multiplicative and additive constants can be dropped: 20n² + 10 = Θ(n²). n^b dominates n^a if b > a, so n³ + n² = Θ(n³). Any exponential dominates polynomials: 3ⁿ + n⁷⁰⁰ = O(3ⁿ), and 3ⁿ + 2ⁿ = Θ(3ⁿ). A polynomial dominates a logarithm: n + log n = O(n), and log(n²) = 2 log n = Θ(log n).

[Plots (omitted): growth of y = x², y = x³, and y = 2^x on a small range and a large range; the exponential 2^x eventually dominates both polynomials.]

Efficient usually means polynomial-time, although for some applications O(n⁵) or even O(n³) is not considered efficient, and for some applications exponential worst-case is still acceptable.

Fibonacci Numbers. F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2) if n > 1.

Function Fib1(n):
  If n = 0 then return 0
  Else if n = 1 then return 1
  Else return Fib1(n-1) + Fib1(n-2)

Three Questions. 1. Is Fib1 correct? Yes: it directly translates the definition into pseudocode. 2. How long does it take? Suppose it takes t(n) on input n. We have the recurrence relation t(0) = 1, t(1) = 2, t(n) = t(n-1) + t(n-2) + 3. So t(n) > F(n). This is bad, because F(n) ≈ 1.618ⁿ.

Why is Fib1 so bad? Fib1(4) = Fib1(3) + Fib1(2) = (Fib1(2) + Fib1(1)) + (Fib1(1) + Fib1(0)) = ((Fib1(1) + Fib1(0)) + 1) + (1 + 0) = ((1 + 0) + 1) + (1 + 0) = 3. Fib1(2) is calculated twice. To calculate Fib1(5), Fib1(3) will be called twice, Fib1(2) will be called 3 times, and Fib1(1) will be called 5 times. The number of redundant calculations increases exponentially with n.

3. Can we do better? We can cache the values that are already computed and reuse them. This is the key idea behind dynamic-programming algorithms.

Function Fib2(n):
  Array Fib2[0..n]
  Fib2[0] = 0; Fib2[1] = 1
  For i = 2 through n do Fib2[i] = Fib2[i-1] + Fib2[i-2]
  Return Fib2[n]

Fib2 Analysis. Is it correct? Yes (prove by induction). How long does it take? It takes O(n). Can we do better? Yes (exercise).

Summary. Algorithms are at the core of Computer Science and are older than computers. They have many, many applications and are essential ingredients of successful projects. Algorithms must
be correct and efficient. Efficiency is measured in O-notation. O-notation helps analysis by abstracting away from the details of the implementation.

Difference Equations
1/11/2002

Solve f(n) = f(n-1) + f(n-2), f(0) = 2, f(1) = 1.

Step 1: Find roots. f(n) - f(n-1) - f(n-2) = 0 gives the characteristic equation λ² - λ - 1 = 0, so λ = (1 ± √5)/2: λ₁ = (1 + √5)/2, λ₂ = (1 - √5)/2.

Step 2: Find the general solution f(n) = a₁λ₁ⁿ + a₂λ₂ⁿ; substitute in the initial conditions and solve. n = 0: f(0) = 2 = a₁λ₁⁰ + a₂λ₂⁰ = a₁ + a₂, so a₂ = 2 - a₁. n = 1: f(1) = 1 = a₁λ₁ + a₂λ₂ = a₁λ₁ + (2 - a₁)λ₂, so a₁(λ₁ - λ₂) = 1 - 2λ₂ = √5; since λ₁ - λ₂ = √5, a₁ = 1 and a₂ = 1. Therefore f(n) = ((1 + √5)/2)ⁿ + ((1 - √5)/2)ⁿ.

Solve x(n) = x(n-1) + 2x(n-2), x(0) = 1, x(1) = 1.

Step 1: Find roots. x(n) - x(n-1) - 2x(n-2) = 0 gives λ² - λ - 2 = 0, so λ = (1 ± √9)/2 = (1 ± 3)/2: λ₁ = 2, λ₂ = -1.

Step 2: General solution x(n) = a₁λ₁ⁿ + a₂λ₂ⁿ; substitute in the initial conditions and solve. n = 0: 1 = a₁ + a₂, so a₂ = 1 - a₁. n = 1: 1 = 2a₁ - a₂ = 2a₁ - (1 - a₁) = 3a₁ - 1, so a₁ = 2/3 and a₂ = 1/3. Therefore x(n) = (2/3)2ⁿ + (1/3)(-1)ⁿ.

Solve x(n) = 2x(n-1) + 3x(n-2), x(0) = 1, x(1) = 1.

Step 1: Find roots. λ² - 2λ - 3 = 0, so λ = (2 ± √16)/2 = (2 ± 4)/2: λ₁ = 3, λ₂ = -1.

Step 2: General solution x(n) = a₁λ₁ⁿ + a₂λ₂ⁿ; substitute in the initial conditions. n = 0: 1 = a₁ + a₂. n = 1: 1 = 3a₁ - a₂, so 2 = 4a₁, giving a₁ = 1/2 and a₂ = 1/2. Therefore x(n) = (1/2)3ⁿ + (1/2)(-1)ⁿ.

Forcing Functions. Equations of the form x(n) = Σᵢ cᵢ x(n-i) + g(n), where g(n) is the forcing function. The solution is the sum of a homogeneous part h(n) and a particular part v(n), and the output of the equation grows like either the homogeneous solution or the forcing function, whichever is larger.

Solve x(n) = 3x(n-1) + 5, x(0) = 1. Write x(n) = h(n) + v(n), with h(n) = 3h(n-1) and v(n) = 3v(n-1) + 5.

Step 1: Find roots. h(n) - 3h(n-1) = 0 gives λ - 3 = 0, so λ = 3 and h(n) = a₁3ⁿ.

Step 2: Find a particular solution to v(n) = 3v(n-1) + 5. Guess a v(n) of the same general form as the forcing function: v(n) = c, a constant. Substituting the guess: c = 3c + 5, so -2c = 5 and c = -5/2. Thus v(n) = -5/2.

Step 3: General solution x(n) = h(n) + v(n) = a₁3ⁿ - 5/2; substitute in the initial condition. n = 0: x(0) = 1 = a₁ - 5/2, so a₁ = 7/2. Therefore x(n) = (7/2)3ⁿ - 5/2.
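Closed forms like these can be checked mechanically against their recurrences. A small Python sketch (my own illustration, not part of the slides) verifies the last example and the Lucas-style closed form from the first:

```python
# Verify x(n) = (7/2)*3^n - 5/2 solves x(n) = 3*x(n-1) + 5 with x(0) = 1,
# and check f(n) = f(n-1) + f(n-2), f(0)=2, f(1)=1 against phi^n + psi^n.
from fractions import Fraction


def x_recurrence(n):
    x = 1  # x(0)
    for _ in range(n):
        x = 3 * x + 5
    return x


def x_closed(n):
    return Fraction(7, 2) * 3**n - Fraction(5, 2)


assert all(x_recurrence(n) == x_closed(n) for n in range(10))


def f_recurrence(n):
    a, b = 2, 1  # f(0), f(1)
    for _ in range(n):
        a, b = b, a + b
    return a


# f(n) = phi^n + psi^n with phi, psi = (1 +/- sqrt(5))/2 (checked in floats)
phi, psi = (1 + 5**0.5) / 2, (1 - 5**0.5) / 2
assert all(abs(f_recurrence(n) - (phi**n + psi**n)) < 1e-9 for n in range(20))
```

The first check is exact (rational arithmetic); the second uses floats with a tolerance, since √5 is irrational.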
Solve x(n) = 3x(n-1) + 2n + 4, x(0) = 1. Write x(n) = h(n) + v(n), with h(n) = 3h(n-1) and v(n) = 3v(n-1) + 2n + 4.

Step 1: Find roots. h(n) - 3h(n-1) = 0 gives λ = 3, so h(n) = a₁3ⁿ.

Step 2: Find a particular solution to v(n) = 3v(n-1) + 2n + 4. Guess a v(n) of the same general form as the forcing function: v(n) = c₁n + c₀. Substituting the guess: c₁n + c₀ = 3(c₁(n-1) + c₀) + 2n + 4 = (3c₁ + 2)n + (3c₀ - 3c₁ + 4). Pulling out terms by their power of n: c₁ = 3c₁ + 2, so c₁ = -1; and c₀ = 3c₀ - 3c₁ + 4 = 3c₀ + 7, so c₀ = -7/2. Thus v(n) = -n - 7/2.

Why can we break up by powers? If an³ + bn² + cn + d = en³ + fn² + gn + h for all n, then a = e, b = f, c = g, and d = h; otherwise the polynomials wouldn't be equal.

Step 3: General solution x(n) = a₁3ⁿ - n - 7/2; substitute in the initial condition. n = 0: x(0) = 1 = a₁ - 7/2, so a₁ = 9/2. Therefore x(n) = (9/2)3ⁿ - n - 7/2.

Solve x(n) = 3x(n-1) + 2ⁿ, x(0) = 1. Write x(n) = h(n) + v(n), with h(n) = 3h(n-1) and v(n) = 3v(n-1) + 2ⁿ.

Step 1: Find roots. λ - 3 = 0, so λ = 3 and h(n) = a₁3ⁿ.

Step 2: Find a particular solution to v(n) = 3v(n-1) + 2ⁿ. Guess v(n) = c·2ⁿ. Substituting: c·2ⁿ = 3c·2ⁿ⁻¹ + 2ⁿ; dividing both sides by 2ⁿ⁻¹ gives 2c = 3c + 2, so c = -2 and v(n) = -2·2ⁿ = -2ⁿ⁺¹.

Step 3: General solution x(n) = a₁3ⁿ - 2ⁿ⁺¹; substitute in the initial condition. n = 0: 1 = a₁ - 2, so a₁ = 3. Therefore x(n) = 3·3ⁿ - 2ⁿ⁺¹ = 3ⁿ⁺¹ - 2ⁿ⁺¹.

[Scanned lecture notes: exhaustive search and backtracking.] To solve a problem by search we systematically enumerate candidate partial solutions, extending a partial solution one choice at a time; this generates a search tree whose leaves are complete candidate solutions. If a partial solution cannot possibly be extended to a valid (or better) solution, we can prune it and cut off that whole branch of the search tree without exploring it.
[Scanned lecture notes continue: backtracking and pruning.] How can we make the exhaustive search more effective? Prune parts of the search space that cannot contain a better solution. At each node of the search tree we branch on one decision; for example, when searching over sets of vertices we can pick a vertex v and branch on two cases: either v is in the solution (delete v and its neighbors and recurse) or v is not in the solution (delete v and recurse), returning the better of the two answers.

Summarized backtracking schema: given an initial problem P₀, let S be a set of open subproblems containing P₀. While S is not empty: grab a subproblem P from S; if P is a complete solution, record it; otherwise expand P into smaller subproblems, prune the ones that cannot lead to a solution, and add the rest to S. In designing a backtracking algorithm one needs to decide: (1) how to choose the next subproblem to expand, (2) how to expand a subproblem into smaller ones, (3) what pruning rule to use, and (4) how to recognize when a solution has been found.
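The include/exclude branching described in these notes can be written as a tiny backtracking program. The sketch below (illustrative Python; a minimal version that omits the pruning rules discussed above) applies it to the maximum independent set problem:

```python
# Backtracking for maximum independent set: pick a vertex v and branch on
# "v in the solution" (drop v and its neighbors) vs. "v not in the solution".
def max_independent_set(vertices, adj):
    """vertices: frozenset of nodes; adj: {node: set of neighbors}."""
    if not vertices:
        return set()
    v = next(iter(vertices))
    # Branch 1: exclude v.
    best = max_independent_set(vertices - {v}, adj)
    # Branch 2: include v; its neighbors become ineligible.
    with_v = {v} | max_independent_set(vertices - {v} - adj[v], adj)
    return with_v if len(with_v) > len(best) else best


# A 4-cycle a-b-c-d: the maximum independent set has size 2.
adj = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(len(max_independent_set(frozenset(adj), adj)))  # -> 2
```

Adding a pruning rule, for instance giving up on a branch whose remaining vertices cannot beat the best set found so far, turns this plain search into true branch-and-bound.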
