Introduction to Algorithms (COT 5407)
These 83 pages of class notes for COT 5407 at Florida International University, taught by Tao Li in Fall, were uploaded by Connie Schmeler on Monday, October 12, 2015.
Chapter 16: Greedy Algorithms

Greedy is a strategy that works well on optimization problems with the following characteristics:

1. Greedy-choice property: a global optimum can be arrived at by selecting a local optimum.
2. Optimal substructure: an optimal solution to the problem contains optimal solutions to subproblems.

The second property may make greedy algorithms look like dynamic programming, but the two techniques are quite different.

1 An Activity-Selection Problem

Let S = {1, 2, ..., n} be a set of activities that compete for a resource. Each activity i has a start time s_i and a finish time f_i with s_i <= f_i; if selected, activity i occupies the resource during the interval [s_i, f_i). No two activities can share the resource at any point in time. We say that activities i and j are compatible if their time intervals are disjoint. The activity-selection problem is the problem of selecting a largest set of mutually compatible activities.

Greedy Activity Selection Algorithm. First sort the activities by finish time, from earliest to latest, breaking ties arbitrarily. Then go down the list, picking each activity that is compatible with the activities selected so far.

What is the running time of this method? It depends on which sorting algorithm you use: the sorting part can be as small as O(n log n), and the scan is O(n), so the total is O(n log n).
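The greedy selection just described can be sketched in Python (the function name and the input format, a list of (start, finish) pairs, are my own):

```python
def select_activities(activities):
    """Greedy activity selection over (start, finish) pairs.

    Returns a largest set of mutually compatible activities."""
    chosen = []
    last_finish = float("-inf")
    # Sort by finish time, earliest first; ties broken arbitrarily.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        # Compatible iff this activity starts no earlier than the last
        # chosen one ends (intervals are half-open).
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

For example, select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10)]) keeps (1, 4) and (5, 7).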
Theorem A. Greedy-Activity-Selector solves the activity-selection problem.

Proof. The proof is by induction on n. For the base case, n = 1, the statement trivially holds. For the induction step, let n >= 2 and assume the claim holds for all smaller values of n. We may assume the activities are already sorted by finish time. Let p be the number of activities in each optimal solution for {1, ..., n-1}, and let q be the number for {1, ..., n}. Then p <= q, because every optimal solution for {1, ..., n-1} is also a solution for {1, ..., n}. Moreover, p >= q - 1. To see this, assume to the contrary that p <= q - 2. Let W be any optimal solution for {1, ..., n}; let W' = W - {n} if W contains n, and W' = W otherwise. Then W' does not contain n and is a solution for {1, ..., n-1} with at least q - 1 > p activities, contradicting the assumption that optimal solutions for {1, ..., n-1} have p activities.

Optimality Proof. We must first note that the greedy algorithm always finds some set of mutually compatible activities.

Case 1: p = q. Then each optimal solution for {1, ..., n-1} is optimal for {1, ..., n}. By the induction hypothesis, when activity n-1 has been examined, an optimal solution for {1, ..., n-1} has been constructed. There will be no addition after this (otherwise there would be a solution of size greater than q), so the algorithm outputs a solution of size p, which is optimal.

Case 2: p = q - 1. Then every optimal solution for {1, ..., n} contains n. Let k be the largest i, 1 <= i <= n-1, such that f_i <= s_n. Since f_1 <= ... <= f_n, every i with 1 <= i <= k is compatible with n, and every i with k+1 <= i <= n-1 is incompatible with n. This means each optimal solution for {1, ..., n} is the union of {n} and an optimal solution for {1, ..., k}, so each optimal solution for {1, ..., k} has p activities; in particular, no optimal solution for {1, ..., k} can be extended by any of k+1, ..., n-1. Let W be the set of activities the algorithm holds when it has finished examining k. By the induction hypothesis, W is optimal for {1, ..., k}, so it has p activities. The algorithm then adds no activities between k+1 and n-1 to W, but adds n, and outputs W ∪ {n}. This output has q = p + 1 activities and thus is optimal for {1, ..., n}. ∎

2 Knapsack

The 0-1 knapsack problem is the problem of finding, given an integer W >= 1, items 1, ..., n with values v_1, ..., v_n and weights w_1, ..., w_n, a selection I ⊆ {1, ..., n} that maximizes Σ_{i∈I} v_i under the constraint Σ_{i∈I} w_i <= W.

An example: Algo Boy is going to a desert island. He is allowed to carry one bag, and the bag holds no more than 16 pounds, so he can't bring everything he wants. So he weighed and valued each item he wants to bring. What should he put in the bag?
    item                                 weight (lb)   value
    CD player with Bernstein Mahler box  8             20
    CLRS 2nd Ed                          10            25
    Twister Game                         2             8
    SW Radio                             4             12
    Harmonica                            1             5
    Roller Blades                        4             6
    Inflatable Life-Size R2-D2 Doll      1             8

There is an O(nW)-step algorithm based on dynamic programming. A greedy approach might be: sort the items in decreasing order of value, then scan the sorted list and grab whatever still fits. This approach does not work, because it does not take into consideration that a combination of less valued items may weigh less and still have a larger total value:

    item                    weight (lb)   value
    CLRS 2nd Ed             10            25
    CD player with Mahler   8             20
    SW Radio                4             10
    Twister Game            2             8
    R2-D2 Doll              1             8
    Harmonica               1             5
    Roller Blades           4             2

With W = 10, the greedy strategy picks only the CLRS 2nd Ed (value 25), but there is a better combination: the CD player, the R2-D2 doll, and the harmonica weigh 10 pounds together and are worth 33.
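The O(nW) dynamic program mentioned above can be sketched as follows (a standard one-dimensional formulation; the function name is mine):

```python
def knapsack(weights, values, W):
    """0-1 knapsack: best[w] is the maximum value achievable with
    total weight at most w, over the items processed so far."""
    best = [0] * (W + 1)
    for wt, val in zip(weights, values):
        # Scan capacities downward so each item is used at most once.
        for w in range(W, wt - 1, -1):
            best[w] = max(best[w], best[w - wt] + val)
    return best[W]
```

On the second table with W = 10 (weights [10, 8, 4, 2, 1, 1, 4], values [25, 20, 10, 8, 8, 5, 2]) it returns 33, beating the greedy choice of value 25.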
3 Huffman Coding

Storage space for files can be saved by compressing them, i.e., by replacing each symbol with a unique binary string. Here the codewords can differ in length, so they need to be prefix-free, in the sense that no codeword is a prefix of another; otherwise decoding is impossible. The character-coding problem is the problem of finding, given an alphabet C = {a_1, ..., a_n} and its frequencies f_1, ..., f_n, a prefix-free binary code W = {w_1, ..., w_n} that minimizes the average code length

Σ_{i=1}^{n} f_i |w_i|.

Depict a prefix-free binary code using a binary tree, where each left branch corresponds to the bit 0, each right branch corresponds to the bit 1, and the leaves are uniquely labeled by the symbols in C. The codeword of a symbol a in C is the concatenation of the edge labels encountered on the path from the root to a. Each node v is labeled by the frequency sum of the symbols in subtree(v).

Huffman coding is a greedy method for obtaining an optimal prefix-free binary code, which uses the following idea. For each i, 1 <= i <= n, create a leaf node v_i corresponding to a_i, having frequency f_i. Let D = {v_1, ..., v_n}. Repeat the following until |D| = 1: select from D the two nodes x and y with the lowest frequencies; create a node z having x as the left child and y as the right child; set f(z) to f(x) + f(y); remove x and y from D and add z to D. The replacement forces the codewords of x and y to be that of z followed by a 0 or a 1, respectively.

An example (frequencies as best recoverable from the slide): a = 1, b = 3, c = 2, d = 4, e = 5. In the first step, a and c, the two lowest, are merged into a node x with f(x) = 3. The idea can be implemented using a priority queue keyed on f.

The Correctness of the Greedy Method.

Lemma B. If x and y have the lowest frequencies in an alphabet C, then C has an optimal prefix code in which x and y are siblings.

Proof. Let T be an optimal code tree and let h be its height. There are two leaves at depth h that are siblings. If they are not x and y, exchange those nodes with x and y. This does not increase the average code length. ∎

Lemma C. Create an alphabet D from C by replacing x and y with a single letter z such that f(z) = f(x) + f(y). Then there exists a one-to-one correspondence between (a) the set of code trees for D in which z is a leaf, and (b) the set of code trees for C in which x and y are siblings.

Proof omitted. ∎

Suppose that x and y are letters with the lowest frequencies in C. Obtain an optimal code T for D and replace the leaf z by a depth-one binary tree with x and y as its leaves. Then we obtain an optimal code for C.
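The construction can be sketched with Python's heapq as the priority queue (representation choices, such as tuples for internal nodes, are my own):

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build an optimal prefix-free code from {symbol: frequency}.

    Internal nodes are (left, right) tuples; returns {symbol: codeword}."""
    tick = count()  # tie-breaker so heapq never compares tree nodes
    heap = [(f, next(tick), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)  # the two lowest-frequency nodes
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tick), (x, y)))

    code = {}
    def walk(node, word):
        if isinstance(node, tuple):    # internal node
            walk(node[0], word + "0")  # left branch: bit 0
            walk(node[1], word + "1")  # right branch: bit 1
        else:
            code[node] = word or "0"   # single-symbol alphabet edge case
        return code
    return walk(heap[0][2], "")
```

On the example frequencies {a: 1, b: 3, c: 2, d: 4, e: 5}, the total cost Σ f_i |w_i| of the resulting code is 33, regardless of how ties are broken.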
Chapter 24: Single-Source Shortest Paths

The shortest path is the path with the smallest edge-weight sum. Let δ(u, v) denote the shortest-path length from u to v. Variants of the Shortest Path Problem (SPP):

- Single-Source: compute δ(u, v) for a fixed u and all v.
- Single-Destination: a fixed v and all u.
- Single-Pair: fixed u and v.
- All-Pairs: all u and v.

Negative-weight edges can create negative-weight cycles, which make shortest paths undefined.

Theorem A (Optimal Substructure). If p = (v_1, v_2, ..., v_k) is a shortest path from v_1 to v_k, then for all i, j with 1 <= i <= j <= k, the subpath p_ij = (v_i, ..., v_j) is a shortest path from v_i to v_j.

Two facts about δ:

1. δ(u, v) = min over all a of (δ(u, a) + w(a, v)).
2. δ(u, v) <= δ(u, a) + w(a, v) for any a.

Idea: use a variable d(u, v) to compute δ(u, v). Relaxation with respect to a triple (a, b, c) is the update d(a, c) <- min(d(a, c), d(a, b) + w(b, c)).

A general strategy:
1. Set d(u, v) = infinity for all u and v.
2. Repeat until no update is possible: pick (a, b, c) and relax d(a, c) with d(a, c) <- min(d(a, c), d(a, b) + w(b, c)).
3. Output d(u, v) as δ(u, v).

Strategy for SSSP: fix u to the source node s, and omit the first component from d since it is fixed to s.
1. Set d(v) = infinity for all v; set d(s) = 0.
2. Repeat until no update is possible: pick b and c and relax d(c) with d(c) <- min(d(c), d(b) + w(b, c)).
3. Output d(v) as δ(s, v).

Dijkstra's Algorithm. Here all weights are nonnegative. S is the set of nodes that are finished (i.e., δ(s, v) = d(v)), and Q = V - S is a priority queue keyed on d.

1. Set d(s) = 0 and d(v) = infinity for all v != s; set S = ∅ and Q = V.
2. While Q != ∅: u <- Extract-Min(Q); then
   a. add u to S;
   b. for each v with (u, v) in E, set d(v) <- min(d(v), d(u) + w(u, v)). Each time d(v) is updated, set π(v) = u, the predecessor of v.

How many times is each edge examined? How many calls to Extract-Min, and how many to Decrease-Key? Each edge is examined once; Extract-Min is called at most |V| times, and Decrease-Key is called at most |E| times. With Fibonacci heaps, the total running time is O(E + V lg V).

The Bellman-Ford Algorithm. Here negative weights are allowed, and the algorithm detects whether there are negative-weight cycles.

1. Repeat |V| - 1 times: for each edge (u, v), relax with respect to (u, v).
2. If for some edge (u, v) we have d(v) > d(u) + w(u, v), then output "negative-weight cycle".

Each edge is examined |V| - 1 times, so the running time is O(VE). (The slides work through two example input graphs, with edges processed in decreasing order of weight, showing that relaxation may still be possible after the first round.)
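A sketch of Dijkstra's algorithm using Python's heapq binary heap (lazy deletion of stale entries replaces Decrease-Key, which heapq lacks; the adjacency-list format is my own):

```python
import heapq

def dijkstra(graph, s):
    """graph: {u: [(v, w), ...]} with nonnegative weights w.

    Returns {v: shortest-path length from s} for all reachable v."""
    dist = {s: 0}
    done = set()          # the set S of finished nodes
    pq = [(0, s)]         # priority queue keyed on d
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue      # stale entry; skip (lazy deletion)
        done.add(u)
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w               # relax edge (u, v)
                heapq.heappush(pq, (d + w, v))
    return dist
```

With lazy deletion the queue can hold up to O(E) entries, so this variant runs in O(E log V) time rather than the O(E + V lg V) of the Fibonacci-heap version.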
Let μ(v) be the minimum number of edges in a shortest s-v path; μ(v) <= |V| - 1 for all v.

Theorem B. If G has no negative-weight cycles, then for all v, after the μ(v)-th round, d(v) equals δ(s, v).

Proof. By induction on μ(v). The base case μ(v) = 0 is trivial, because then v = s. Induction step: let m >= 0 and let μ(v) = m + 1; suppose the claim holds for all u with μ(u) <= m. Pick an (m+1)-edge shortest s-v path p and let u be the node preceding v on p. Then μ(u) = m and δ(s, v) = δ(s, u) + w(u, v). So after the m-th round, d(u) = δ(s, u). Thus in the (m+1)-st round, d(v) becomes at most δ(s, u) + w(u, v), and this is the smallest possible value that d(v) can acquire. ∎

Theorem C. If G has a negative-weight cycle, then there exist some u and v such that after the (|V| - 1)-st round, d(v) > d(u) + w(u, v) holds. Thus if G has a negative-weight cycle, the algorithm outputs "negative-weight cycle".

Proof. Suppose G has a negative-weight cycle (v_1, v_2, ..., v_k, v_1) whose length is L < 0. Then

L = w(v_1, v_2) + w(v_2, v_3) + ... + w(v_{k-1}, v_k) + w(v_k, v_1).

Assume to the contrary that d(v) <= d(u) + w(u, v) holds for all u and v after the (|V| - 1)-st round. In particular, d(v_i) <= d(v_{i-1}) + w(v_{i-1}, v_i) for all i, 1 <= i <= k, where v_0 = v_k. Summing these inequalities over all i, we have

Σ_{i=1}^{k} d(v_i) <= Σ_{i=1}^{k} d(v_i) + L.

This implies L >= 0, a contradiction. ∎

SSSP for a DAG:
1. Obtain a topological sort of the nodes.
2. For each v != s, set d(v) = infinity; set d(s) = 0.
3. For each node v in sorted order and each u with (u, v) in E, set d(v) = min(d(v), d(u) + w(u, v)).

What is the running time? O(V + E).
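The DAG algorithm above is short enough to sketch directly (the caller supplies the topological order; names are mine):

```python
def dag_shortest_paths(order, graph, s):
    """order: the nodes in topological order; graph: {u: [(v, w), ...]}, a DAG.

    One relaxation pass in topological order suffices: O(V + E)."""
    inf = float("inf")
    d = {v: inf for v in order}
    d[s] = 0
    for u in order:
        if d[u] == inf:
            continue                     # u is not reachable from s
        for v, w in graph.get(u, []):
            d[v] = min(d[v], d[u] + w)   # relax edge (u, v)
    return d
```

Negative edge weights are fine here, since a DAG cannot contain any cycle at all, let alone a negative-weight one.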
282 Math Preview
Chris Brown
September 8, 2005

Contents

1 Why This
2 Logarithms
  2.1 Basic Identities
  2.2 Basic Consequences
3 Sums of Series
  3.1 Arithmetic Series
  3.2 Polynomial Series
  3.3 Geometric Series
  3.4 Harmonic Series
  3.5 Arithmetic-Geometric Series
  3.6 Using Integration
  3.7 Who Knew?
4 Bounds
  4.1 Big Bounds by Limits
  4.2 Putting Limits to Work
  4.3 Little Bounds by Limits
  4.4 Properties
5 Simple Recurrence Techniques
  5.1 Brute Force Substitution
  5.2 Telescoping
  5.3 The Master Method

1 Why This

I've noticed that often the only math in an algorithms book is presented in one or two chapters and then is referred to for the next 1100 or so pages. So it's good to get this material nailed down early. This is mostly reiteration of content from CLRS chapters 3 and 4, but I've used material from a couple of other algorithms textbooks. This is meant to fill in some CLRS gaps, but I'm not sure the gaps are really there; maybe it's just been useful for me to pull this stuff together in one spot. If it helps anyone else, great, but it's certainly not meant to be any sort of self-contained tutorial. Still, please report typos, bugs, thinkos, and unclarities, as well as suggestions for more content, to brown@cs.rochester.edu, and maybe we can improve its chances of being useful.

2 Logarithms

2.1 Basic Identities

Definition: for b > 1 and x > 0, log_b x is the real number L such that b^L = x.

Properties following easily from the definition:

- log_b is a strictly increasing function: x > y implies log_b x > log_b y.
- log_b is one-to-one: log_b x = log_b y implies x = y.
- log_b 1 = 0 for any b, since b^0 = 1.
- log_b b = 1, which pretty much restates the definition.
- log_b (xy) = log_b x + log_b y. If b^L = x and b^M = y, then b^L b^M = b^(L+M) = xy, and by the previous result log_b (xy) = L + M = log_b x + log_b y.
- log_b (x^a) = a log_b x. Use the previous result a - 1 times with y = x.
- x^(log_b y) = y^(log_b x). Use the 2nd property above to justify taking logs on both sides: log_b y * log_b x = log_b x * log_b y.
- log_c x = log_b x / log_b c. This important base-changing identity establishes that logs to two different bases are related by a constant multiple. Let L = log_c x, M = log_b x, and N = log_b c. Then c^L = x, b^M = x, and b^N = c. But if c = b^N, then x = c^L = b^(NL), and also x = b^M, so b^M = b^(NL), M = NL, and L = M/N.

2.2 Basic Consequences

There are a number of little corollaries and techniques flowing from these identities that might cause some head-scratching when first seen. They are useful if you want to compare functions that are written using different bases. For instance, one can rewrite an exponentiation to a different base by first taking logs and then exponentiating to the base of the log, e.g., n^2 = 2^(2 lg n), or the more exciting n^(lg lg n) = 2^(lg n lg lg n), which, if you're normal, isn't obvious at first blush. These examples do not address the complications introduced by arbitrary bases (e.g., e, 10, 3) that would force the use of the base-changing identity, thus introducing a base-changing multiplier into the picture.
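The identities are easy to spot-check numerically (floating point, hence math.isclose; the sample values are arbitrary):

```python
import math

b, c, x, y = 2.0, 10.0, 7.0, 3.0

# log_b(xy) = log_b x + log_b y, and log_b(x^a) = a log_b x
assert math.isclose(math.log(x * y, b), math.log(x, b) + math.log(y, b))
assert math.isclose(math.log(x ** 4, b), 4 * math.log(x, b))

# x^(log_b y) = y^(log_b x)
assert math.isclose(x ** math.log(y, b), y ** math.log(x, b))

# Base change: log_c x = log_b x / log_b c
assert math.isclose(math.log(x, c), math.log(x, b) / math.log(c, b))
```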
3 Sums of Series

3.1 Arithmetic Series

Σ_{i=1}^{n} i = n(n+1)/2.

Why? Draw an n x (n+1) rectangle and a jaggy diagonal and count the squares in one half. Or think of the numbers 1 to n and pair them up, first to last, 2nd to penultimate, etc. You get n/2 sums, each equal to n + 1. Questions: what if n is odd? Does this trick work for other summation limits? These questions should lead you to think that maybe you can use the above counting tricks for linear variants on the sum. Indeed: consider 2 + 5 + 8 + ... + (3k - 1). Rewritten as 3(1 + 2 + 3 + ... + k) - (1 + 1 + ... + 1), the sum is obviously 3k(k+1)/2 - k. Or you can again add the first and last, 2nd and next-to-last, etc., to get k/2 pairs of pair-sums, each equal to 3k + 1, or k(3k+1)/2, which is the same answer, nicht wahr?

3.2 Polynomial Series

The most familiar instance is Σ_{i=1}^{n} i^2; more generally, Σ_{i=1}^{n} i^k, summing over the exponentiated index. You might remember that the sum of the first n squares has a cubic term in the answer; that turns out to be no accident. In general, Σ_{i=1}^{n} i^k ∈ Θ(n^(k+1)). This is justified by approximation to an integral (see later), but for any specific k (k = 2 is a classroom favorite), the exact formula, involving a (k+1)-st degree polynomial, can be proved by induction.

3.3 Geometric Series

The old familiar

Σ_{i=0}^{k} 2^i = 2^(k+1) - 1

is of course visualized best by thinking of each term 2^i as a 1 bit in a binary number. Thus the sum is 111...1 (k+1 one-bits), and if you add 1 you get 1000...0 = 2^(k+1). Related is the special case

Σ_{i=1}^{k} 1/2^i = 1 - 1/2^k.

Bearing a close resemblance is the more general

Σ_{i=0}^{k} b r^i = b (r^(k+1) - 1) / (r - 1),

with r often known as the ratio. The last is a trivial generalization of

Σ_{i=0}^{k} r^i = (r^(k+1) - 1) / (r - 1),

which when 0 < r < 1 leads to

Σ_{i=0}^{k} r^i -> 1/(1 - r) as k -> infinity.

In this tending-to-infinite case, the sum is quite easy to derive. Let S be the infinite sum of powers. Then S = 1 + r + r^2 + r^3 + r^4 + ..., so rS = r + r^2 + r^3 + r^4 + ..., and subtracting these two equations (only allowable if they are convergent), an infinite amount of right-hand side vanishes to leave S - rS = 1, so S = 1/(1 - r). Cute, eh? This trick can generalize to more complex summand terms if one is careful; see the telescoping technique in Section 5.2.
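The closed forms above can be spot-checked for small cases (the sample values of n, k, and r are arbitrary):

```python
n, k, r = 17, 9, 3

# Arithmetic series and its linear variant 2 + 5 + ... + (3k - 1)
assert sum(range(1, n + 1)) == n * (n + 1) // 2
assert sum(3 * i - 1 for i in range(1, k + 1)) == 3 * k * (k + 1) // 2 - k

# Geometric series, base 2 and general ratio r
assert sum(2 ** i for i in range(k + 1)) == 2 ** (k + 1) - 1
assert sum(r ** i for i in range(k + 1)) == (r ** (k + 1) - 1) // (r - 1)
```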
3.4 Harmonic Series

We saw that the general case for polynomial series is Σ i^k ∈ Θ(n^(k+1)), which does not work when k = -1. That case is the Harmonic series:

H(n) = Σ_{i=1}^{n} 1/i ≈ ln n + γ,

where γ ≈ 0.57721566 is Euler's constant.

3.5 Arithmetic-Geometric Series

In this example sum, the index appears both as coefficient and exponent in the summand terms. Unfortunately, our solution relies on the base being 2:

Σ_{i=1}^{k} i 2^i = (k - 1) 2^(k+1) + 2.

The solution for this type of sum is an analog of integration by parts, which counts on producing sub-sums that cancel except for first and last terms, plus the sum of a leftover term that is easy to evaluate:

Σ_{i=1}^{k} i 2^i = Σ_{i=1}^{k} i (2^(i+1) - 2^i)

(here is our reliance on base 2), and now

Σ_{i=1}^{k} i 2^i = Σ_{i=1}^{k} i 2^(i+1) - Σ_{i=0}^{k-1} (i + 1) 2^(i+1).

We're aiming for a sum of terms of the form 2^(i+1) to emerge from the RHS, and sure enough,

Σ_{i=1}^{k} i 2^(i+1) - Σ_{i=0}^{k-1} i 2^(i+1) - Σ_{i=0}^{k-1} 2^(i+1) = k 2^(k+1) - 0 - (2^(k+1) - 2) = (k - 1) 2^(k+1) + 2,

with a little care on the last sum.
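Both formulas from the last two sections can be checked numerically (γ truncated to the digits quoted above; the harmonic error is about 1/(2n)):

```python
import math

# Arithmetic-geometric: sum of i * 2^i for i = 1..k equals (k - 1) * 2^(k+1) + 2
for k in range(1, 12):
    assert sum(i * 2 ** i for i in range(1, k + 1)) == (k - 1) * 2 ** (k + 1) + 2

# Harmonic: H(n) is close to ln n + gamma
gamma = 0.57721566
n = 10 ** 6
H = sum(1.0 / i for i in range(1, n + 1))
assert abs(H - (math.log(n) + gamma)) < 1e-6
```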
3.6 Using Integration

With some simple ideas rendered mathematically precise, we can translate results from continuous mathematics into results for the discrete sums we use. One simple idea is monotonic (non-decreasing) functions, e.g., log x, x^2, 2^x if x > 0, or the discontinuous and antimonotonic (non-increasing) functions, e.g., 1/x. Convex functions are those that never "curve downward," like x^2 - 6x - 4. (These puppies have always looked concave to me; I have to remember "convex down.") A function can be convex but not monotonic, or could be monotonic but not convex; log x, for instance, is not convex. A discontinuous function cannot be convex. Convexity can be proved by showing, for a real function, that the average value of the function at two points is above the function of their average. Convexity for a function on the integers can be proved by showing the same thing for all adjacent sets of three integers: f(n+1) is at most the average of f(n) and f(n+2). An integer function f(n) can be extended to a real function f(x) simply by linear interpolation. Then we have some useful properties:

1. f(n) is monotonic (convex) implies f(x) is monotonic (convex).
2. f' (the first derivative) exists and is nonnegative implies f is monotonic.
3. f' exists and is monotonic implies f(x) is convex.
4. Thus, f'' exists and is nonnegative implies f is convex.

See CLRS Appendix A.2 on integral bounds, with nice pictures explaining why they work; the idea is to put upper and lower bounds on discrete sums by pairs of definite integrals on the corresponding functions.

Some useful integration formulae (the Theory Math Cheat Sheet linked off the main course page has many more):

∫_0^n x^k dx = n^(k+1)/(k+1),
∫ e^(ax) dx = e^(ax)/a,
∫_1^n x^k ln x dx = (n^(k+1)/(k+1)) ln n - (n^(k+1) - 1)/(k+1)^2.

CLRS (p. 1067) show a bound for the harmonic series of Section 3.4. Another example is to use a simple case of the last formula to get a lower bound for the sum of logarithms, which is what the logarithm of a factorial is:

lg(n!) = Σ_{i=1}^{n} lg i >= ∫_1^n lg x dx.

Changing bases, ∫ lg x dx = lg e ∫ ln x dx = lg e (x ln x - x), so

∫_1^n lg x dx = lg e (n ln n - n + 1) = n lg n - n lg e + lg e.

Since lg e ≈ 1.4427, we get

Σ_{i=1}^{n} lg i >= n lg n - 1.4427 n.

3.7 Who Knew?

It seems that the last example is related to the derivation of the very useful Stirling's formula, which bounds n!:

sqrt(2 π n) (n/e)^n <= n! <= sqrt(2 π n) (n/e)^n e^(1/(12n)) for n >= 1.
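Both the Stirling bounds and the lg(n!) lower bound derived above can be verified directly:

```python
import math

# Stirling: sqrt(2*pi*n)(n/e)^n <= n! <= sqrt(2*pi*n)(n/e)^n * e^(1/(12n))
for m in range(1, 20):
    lo = math.sqrt(2 * math.pi * m) * (m / math.e) ** m
    hi = lo * math.exp(1.0 / (12 * m))
    assert lo <= math.factorial(m) <= hi

# Lower bound on the sum of logs: lg(n!) >= n lg n - n lg e
n = 1000
lg_fact = sum(math.log2(i) for i in range(1, n + 1))
assert lg_fact >= n * math.log2(n) - n * math.log2(math.e)
```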
4 Bounds

4.1 Big Bounds by Limits

Not much to add on bounds. Besides the traditional "there's a c and an n_0 such that for n > n_0 ...", one can use infinite limits to determine the order of functions, and doing so is often easy because of L'Hopital's rule; so here's the idea.

A function f ∈ O(g) if

lim_{n->infinity} f(n)/g(n) = c < infinity

for nonnegative c, including the case c = 0. So if the limit exists and is not infinite, f grows no faster than g, and so f ∈ O(g). If the limit is infinity, then f grows faster than g (see below).

A function f ∈ Ω(g) if

lim_{n->infinity} f(n)/g(n) > 0,

including the case that the limit is infinity. This leaves: a function f ∈ Θ(g) if

lim_{n->infinity} f(n)/g(n) = c

for some constant 0 < c < infinity.

4.2 Putting Limits to Work

L'Hopital's Rule: if f and g are differentiable functions with derivatives f' and g', and lim_{n->infinity} f(n) = lim_{n->infinity} g(n) = infinity, then

lim_{n->infinity} f(n)/g(n) = lim_{n->infinity} f'(n)/g'(n).

So let's check what happens with some favorite functions, like f(n) = n^2 and g(n) = n lg n. We expect that f is not in O(g) but g ∈ O(f). In the ratio of the functions, a factor of n drops out, so we're interested in

lim_{n->infinity} n / lg n.

But we don't like differentiating lg n, so we change base by the last identity in Section 2 to get the friendlier ln function: lg n = ln n / ln 2. So by L'Hopital's rule,

lim_{n->infinity} (n ln 2) / ln n = lim_{n->infinity} ln 2 / (1/n) = lim_{n->infinity} n ln 2 = infinity.

Since the infinity case isn't allowed in the O() test, f is not in O(g); but since the limit of the inverted ratio must go to 0, g is in O(f).

4.3 Little Bounds by Limits

For strictly smaller and greater growth rates, we have the little-oh and little-omega concepts. You're probably way ahead of me, since the limit forms of these definitions are obvious. With f and g functions from the nonnegative integers into the positive reals, o(g) is the set of functions f such that

lim_{n->infinity} f(n)/g(n) = 0,

and ω(g) is the set of functions f such that

lim_{n->infinity} f(n)/g(n) = infinity.

It's easy to remember that little-oh functions are the smaller functions in Big-Oh. Less intuitively, the little-omega functions are the LARGER ones in Big-Omega.

4.4 Properties

CLRS have lots of fun exercises so you can prove properties of these bounding functions. For instance:

1. Membership in O, o, ω, Ω, Θ is transitive in each case.
2. f ∈ O(g) if and only if g ∈ Ω(f).
3. f ≡ g iff f ∈ Θ(g) defines an equivalence relation, whose classes are called complexity classes.
4. O(f + g) = O(max(f, g)), with similar equations for Ω and Θ.

Our buddy L'Hopital lets us prove lg n ∈ o(n^a) for a > 0: the log grows more slowly than any positive power of n, like the 0.00001 power. Similarly, n^k ∈ o(2^n) for k > 0: powers of n grow more slowly than 2^n, in fact more slowly than any b^n where b > 1.

The asymptotic order of some common summations is easy to determine. CLRS has Appendix A.2 entirely devoted to this topic; I don't know how the following fits into what it says, but it does have some good examples of calculating bounds and some important caveats, including a neat proof that the sum of the first n integers is Θ(n^2). So some summation factoids follow. Let d be a nonnegative constant and let r != 1 be a positive constant.

- The sum of a polynomial series: Σ_{i=1}^{n} i^d ∈ Θ(n^(d+1)).
- The sum of a geometric series Σ_{i=0}^{b} r^i is in Θ of its largest term. Remember we disallow r = 1, and note that usually b is some function of the problem size n.
- The sum of a logarithmic series: Σ_{i=1}^{n} log i ∈ Θ(n log n).
- The sum of a polynomial-logarithmic series: Σ_{i=1}^{n} i^d log i ∈ Θ(n^(d+1) log n).

The proof for the geometric series follows from the formula for its sum (Section 3.3), and the proofs for the other cases rest on the fact that the functions are monotonic and can be bounded by upper and lower rectangles. E.g., for Σ i^d, use the upper rectangle with corners (0, 0) and (n, n^d), of area n * n^d = n^(d+1), and the lower rectangle with corners (n/2, 0) and (n, (n/2)^d), of area (n/2)(n/2)^d = n^(d+1)/2^(d+1). It turns out that these upper-bound rectangles and lower-bound rectangles both grow at the same rate, hence the function has to grow at that rate too. A picture would be good, but I'm too lazy.

5 Simple Recurrence Techniques

CLRS presents some sophisticated techniques for solving recurrences but de-emphasizes a couple of simple ones.

5.1 Brute Force Substitution

Probably the most obvious thing to do with a recurrence is just to use it to rewrite itself. In the canonical mergesort example, start with the original

T(n) = 2T(n/2) + n.

So what is 2T(n/2)? We simply substitute n/2 in the equation and get 2T(n/2) = 2(2T(n/4) + n/2) = 4T(n/4) + n, so

T(n) = 4T(n/4) + 2n.

Plodding onwards, 4T(n/4) = 4(2T(n/8) + n/4) = 8T(n/8) + n, so

T(n) = 8T(n/8) + 3n.

So we see, and can prove by induction if we wish,

T(n) = 2^k T(n/2^k) + kn.

We know there will only be k = lg n terms, so T(n) = nT(1) + n lg n = n lg n + n = O(n log n). The ugliness of those right-hand-side partial results is the target of the next technique, which with a little foresight aims to make the rewriting trivial.
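The closed form just derived can be confirmed against the recurrence itself:

```python
# T(1) = 1, T(n) = 2 T(n/2) + n; claim: T(n) = n lg n + n for n a power of 2.
def T(n):
    return 1 if n == 1 else 2 * T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == n * k + n  # n lg n + n, since lg n = k
```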
5.2 Telescoping

The telescoping technique is mentioned in CLRS Appendix A.1. The crucial observation is that for any sequence a_0, a_1, a_2, ..., a_n,

Σ_{k=1}^{n} (a_k - a_{k-1}) = a_n - a_0.

We used this idea for our infinite-series solving trick in Section 3.3. Continuing with the mergesort example, T(1) = 1 and T(n) = 2T(n/2) + n, but this is not in the correct form to telescope: we want the right-hand use of T to have exactly the same form as the left-hand side. We notice the pesky 2 and the pesky n messing up the right-hand side; we want the RHS to be a function of n/2. Hmmmm. Inspiration strikes, and we divide both sides by n to obtain

T(n)/n = T(n/2)/(n/2) + 1.

Fair warning: this sort of pre-adjustment is an aspect of the telescoping method that is a bit creative and varies from recurrence to recurrence. Anyway, now we're set. Our equation now works for any n that is a power of two (removing this assumption is possible and gives almost the same answer), and we can keep rewriting thus:

T(n/2)/(n/2) = T(n/4)/(n/4) + 1,
T(n/4)/(n/4) = T(n/8)/(n/8) + 1,

down to

T(2)/2 = T(1)/1 + 1.

Finally, we can add up all these equations. There are canceling pairs of terms for all but the first term on the LHS and the last on the RHS. There are lg n rewrites, since we divide n by 2 every time, so

T(n)/n = T(1)/1 + lg n,

and multiplying by n gives us T(n) = n lg n + n = O(n log n). Now, the idea of adding up all the equations is nice conceptually, but in practice we usually see telescoping presented and used just as rewriting a RHS term: above, just rewriting T(n/2)/(n/2) as T(n/4)/(n/4) + 1, say.
5.3 The Master Method

The master method (a.k.a. the Master Theorem) is for solving recurrences of the form

T(n) = aT(n/b) + f(n),

which is best visualized by a recursion tree that splits the problem into a smaller versions (smaller by the factor b) and puts together subproblems at the current level at a cost of f(n). The sum of the non-recursive costs (the f(n)'s) at every level is the row sum. CLRS seems to rely on their MM proof for intuition as to why it works. They spend a fair amount of time and space on it (their Section 4.4), with some helpful graphics, and some books have even hairier proofs. Here I just want to make sure I'm understanding what's going on. The two big questions, regarding CLRS Theorem 4.1 (their Section 4.3), seem to be:

1. What do those three cases mean?
2. Where did that log_b a come from?

OK. The problem size drops by a factor of b for every unit depth increase. Thus we hit the leaves of the tree at depth D such that n/b^D = 1. Rearranging and taking logs yields D = lg n / lg b, which is in Θ(log n). What the MM is telling us, in fact the answer to question 1, is that the row sums are not the same at all depths. Hold that thought.

Let's figure out how many leaves the tree has. With branching factor a, the number of nodes at depth D is L = a^D. Taking logs gives lg L = D lg a = (lg a / lg b) lg n, using our expression for D derived just above. The ratio E = lg a / lg b relating these tree properties is important. In fact, it is the answer to question 2, since by the last property of logs in Section 2 (the base-changing formula), E = log_b a. Call E the critical exponent. Thus the number of leaves in the recursion tree is about L = n^E. Also, we know the depth of the tree (the number of row sums) is about D = lg n / lg b. The row sum for the row of leaves at depth D is Θ(L) = Θ(n^E), or simply n^E if the base-case cost is 1.

Last, the value T(n) we are looking for is the sum of the non-recursive costs of all the nodes in the tree, which is the sum of the row sums. Often the row sums form a geometric series, or can be approximated (closely bounded) by geometric series; here, a series of the form Σ_{i=0}^{D} c r^i (see Section 3.3). The ratio r is important in determining how the series acts. In particular, we know (Section 4.4) that if r != 1, the sum is in Θ of its largest term. From what we know and that fact, we can deduce the following.

Little Master Theorem.
1. (Roughly the r > 1 case.) If the row sums form an increasing geometric series, with row 0 at the root of the tree, then T(n) ∈ Θ(n^E), with E the critical exponent, a.k.a. log_b a. The cost is proportional to the number of leaves in the recursion tree, since the sum is dominated by the largest member of the series, the last term.
2. (Roughly the r = 1 case.) If the row sums are about constant, then T(n) ∈ Θ(f(n) log n), since there are Θ(log n) equal terms.
3. (Roughly the r < 1 case.) If the row sums form a decreasing geometric series, then T(n) ∈ Θ(f(n)); that is, all the real work is done at the root, in putting together the subproblems, and the sum is dominated by the first term.

That's the intuition behind the MM in terms of geometric series, which are simple and easy to believe.
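A concrete instance of case 1 (my example, not from the handout): T(n) = 4T(n/2) + n has a = 4, b = 2, so E = log_2 4 = 2, the row sums (4/2)^d n increase geometrically, and the prediction is Θ(n^2). For powers of two with T(1) = 1, the exact solution is T(n) = 2n^2 - n, which matches:

```python
# T(n) = 4 T(n/2) + n: increasing row sums, so the Little Master Theorem
# predicts Theta(n^E) = Theta(n^2); the exact closed form is 2n^2 - n.
def T(n):
    return 1 if n == 1 else 4 * T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n * n - n
```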
The MM is a more general result with the same intuitions. The graphics in CLRS are quite helpful, I think. If you don't like CLRS's proof, there's a nice long one in Johnsonbaugh and Schaefer.

CB's Answers, Asst. 2, CSC 172, 8 Sept 1997

Read the handout on Mathematical Preliminaries.

1. How many different 5-card draw poker hands may be dealt from a 52-card deck? (52 choose 5), i.e., C(52,5) = 52!/(47! 5!).

2. How many different 5-card draw poker hands may be dealt from a 54-card deck (one with two identical jokers)? Since the number of permutations of the cards obeys the law in the handout for sets with identical members, it seemed to me that it would be 54!/(49! 5! 2!), but I much prefer the argument that there are C(52,5) ways to choose a hand without jokers, C(52,4) ways to choose a hand with 1 joker, and C(52,3) ways to choose a hand with 2 jokers. These two answers don't agree, so something's wrong with the first one. To be continued.

3. If you have an unfair coin with probability p of coming up heads (with p not necessarily equal to 1/2, and thus q = 1 - p probability of coming up tails), what is the probability of tossing the coin n times and getting k heads? This is pretty easy to figure out from first principles, and in fact is also the general term of the binomial probability density function, an important one in statistics:

C(n, k) p^k q^(n-k).

4. Explain the fact that 11^4 = 14641 in terms of Pascal's triangle. Think of this as (10 + 1)^4, which, because there are no carries, can be written as 1*10000 + 4*1000 + 6*100 + 4*10 + 1, which is just the binomial coefficients multiplying the powers of 10. Because of pesky carries, the phenomenon is obscured for 11^5.

5. What is the relation between a^(log_c b) and b^(log_c a)? They're equal; just take the log to base c of both sides.

6. Prove that F_{n+1} and F_n are relatively prime. This is easy once you prove by induction that, with F_n a Fibonacci number,

F_{n+1} F_{n-1} - F_n^2 = (-1)^n.

You can verify this relation for the first few F's. Then you want to prove that thenceforth, adjacent versions of the difference have the same magnitude and different signs, preserving the relationship.
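The claimed identity, and the coprimality it is used to prove, can be spot-checked before proving it:

```python
from math import gcd

fib = [0, 1]                    # F_0 = 0, F_1 = 1
for _ in range(30):
    fib.append(fib[-1] + fib[-2])

for n in range(1, 30):
    # F_{n+1} F_{n-1} - F_n^2 = (-1)^n
    assert fib[n + 1] * fib[n - 1] - fib[n] ** 2 == (-1) ** n
    # adjacent Fibonacci numbers are relatively prime
    assert gcd(fib[n], fib[n + 1]) == 1
```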
So we want F_n F_{n+2} - F_{n+1}^2 = -(F_{n+1} F_{n-1} - F_n^2). The trick is to expand each side using the Fibonacci recurrence, F_{n+2} = F_{n+1} + F_n and F_{n+1} = F_n + F_{n-1}:

F_n F_{n+2} - F_{n+1}^2 = F_n (F_{n+1} + F_n) - F_{n+1} (F_n + F_{n-1}) = F_n^2 - F_{n+1} F_{n-1} = -(F_{n+1} F_{n-1} - F_n^2).

I don't claim this is the most elegant way to do this. Once the relation is proved, it is easy to see that F_n and F_{n+1} are relatively prime, since any common divisor would have to be a divisor of 1.

7. Express log_10 x in terms of log_2 x and log_2 10. We did this already: log_10 x = log_2 x / log_2 10.

8. If n is a 14-digit integer, will the value of n fit in a computer word with a capacity of 47 bits plus sign? Yes indeed, since log_2 n = log_10 n / log_10 2 < 14 / 0.30103 < 47.

9. A prime number is an integer greater than one which has no exact divisors other than 1 and itself. Using this definition and mathematical induction, prove that every positive integer greater than one may be written as a product of prime numbers. (Hint: don't induce by adding 1 to n.) If n is prime, it is definitely the product of itself and 1. If it isn't prime, it has two factors p and q such that n = pq, and both p and q are less than n. Use the inductive argument on p and q, noting that since they are decreasing with every application, there will come a time when they are either prime or 1.

Chapter 6: Heapsort

A complete binary tree is a binary tree in which each nonleaf has two children and all the leaves are at the same depth from the root. A nearly complete binary tree is a binary tree constructed from a complete binary tree by eliminating a number of nodes (possibly none) from the right at the leaf level. A heap is a node-labeled nearly complete binary tree with a special property.

Implementing Node-Labeled Nearly Complete Binary Trees Using Arrays. The array indices start from 1. The nodes are enumerated level-wise, going from the root to the leaf level and from left to right within each level.
the leaf level and going from left to right within each level.

Justification for Using the Array Implementation

With this implementation, accessing the parent and the children is easy:

- For every i, 2 ≤ i ≤ n, the parent of the ith node is ⌊i/2⌋.
- For every i, 1 ≤ i ≤ ⌊n/2⌋, the left child of the ith node is 2i.
- For every i, 1 ≤ i ≤ ⌊(n-1)/2⌋, the right child of the ith node is 2i + 1.

The Special Property of a Heap

A max-heap is a node-labeled nearly complete binary tree, where the label is called the key, and the keys satisfy the following max-heap property:

- For every node other than the root, its key is less than or equal to its parent's key.

A min-heap is defined with "greater than or equal to" in place of "less than or equal to". In this case the property is called the min-heap property. Here we study only max-heaps.

The Height of a Heap

Q: What is the height of an n-node heap?
A: I think it's ⌈log n⌉.
Q: Close — you're off by one. It is ⌈log(n+1)⌉ - 1. Equivalently, it is ⌊log n⌋.

Basic Operations on Max-Heaps

MaxHeapify: An input to the procedure is a heap and a node i. We set k̂ to i and then execute the following loop:

- If k̂ is a leaf, then quit the loop now.
- If the key of each child of k̂ is at most the key of k̂, then quit the loop.
- If k̂ has only one child, then set k to the unique child. Otherwise, set k to the child with the greater key than the other.
- Exchange the keys between k̂ and k.
- Set k̂ to k.

(Figure: an example heap, BEFORE and AFTER MaxHeapify.)

The Pseudocode

MaxHeapify(A, n, i)
1  k̂ ← i
2  while k̂ ≤ ⌊n/2⌋ do
3      k ← 2k̂                            ▷ Set k to the left child, for now
4      if k + 1 ≤ n and A[k] < A[k+1]
5          then k ← k + 1                ▷ If the right child exists and it has a larger key
                                         ▷ than the left child, k is the right child
6      if A[k̂] ≥ A[k] then quit the loop
7      exchange A[k̂] and A[k]            ▷ Swap A[k̂] and A[k] if A[k̂] < A[k]
8      k̂ ← k                             ▷ Move to k

Q: Why is the loop terminated when k̂ exceeds ⌊n/2⌋?
A: I don't know.
Q: It's because the nodes beyond ⌊n/2⌋ are leaves, and thus are without children. Now, what's the running time?
A: It's proportional to the
height of node i.

BuildMaxHeap

This operation turns a given array into a max-heap. To do this, we first turn each of the subtrees of the root into a max-heap. Then there is only one spot where violation of the max-heap property may exist, which is the root. If we execute MaxHeapify from the root, then such violation will be eliminated.

BuildMaxHeap(A, n)
1  BuildMaxHeap0(A, n, 1)

BuildMaxHeap0(A, n, i)
1  if 2i ≤ n then BuildMaxHeap0(A, n, 2i)
2  if 2i + 1 ≤ n then BuildMaxHeap0(A, n, 2i + 1)
3  MaxHeapify(A, n, i)

Unrolling the Recursion

The algorithm is a collection of calls to MaxHeapify. Since MaxHeapify(A, n, i) does not change the keys of the nodes outside subtree(i), we can reorder the calls any way we want, so long as the call for a node comes after the calls for its descendants. So the following does the same:

1  for i ← n downto 1 do
2      MaxHeapify(A, n, i)

Since MaxHeapify(A, n, i) does nothing if i > ⌊n/2⌋, the initial value of i can be ⌊n/2⌋:

1  for i ← ⌊n/2⌋ downto 1 do
2      MaxHeapify(A, n, i)

(Figure: an example of BuildMaxHeap.)

Q: What is the running time of BuildMaxHeap(A, n)?
A: The height of the heap is O(log n), so each call to MaxHeapify costs O(log n). Since there are O(n) calls, the total cost is O(n log n).
Q: Correct, but the analysis can be more rigorous.
A: For each h, 0 ≤ h ≤ ⌊log n⌋, the number of nodes having height h is at most ⌈n / 2^(h+1)⌉. So the total cost is at most

    Σ_{h=0}^{⌊log n⌋} ⌈n / 2^(h+1)⌉ · O(h) = O(n · Σ_{h=0}^{⌊log n⌋} h / 2^h).

Note that Σ_{h=0}^{∞} h / 2^h = 2. Thus the running time is O(n).
Q: Can you argue that the running time is actually Θ(n)?
A: That's easy. MaxHeapify is called ⌊n/2⌋ times, so the running time is Ω(n). Since the running time is O(n), this implies that the running time is Θ(n).
Q: Good.

Heapsort: Sorting Using a Max-Heap

First we turn the input array into a max-heap. By the max-heap property, the root has the largest key, so we record that key as the largest key in the input array. Now we replace the key of the root by the key of the last node and decrement the size of the heap by one. This may generate a violation of the max-heap property, but that can be resolved by MaxHeapify from the root. Thus we
find the second largest key. We will repeat the replacement process to find the third largest, the fourth largest, and so on.

The Pseudocode

HeapSort(A, n)
1  BuildMaxHeap(A, n)
2  for i ← n downto 1 do
3      B[i] ← A[1]              ▷ Identify the ith smallest element
4      A[1] ← A[i]              ▷ Replace A[1]
5      MaxHeapify(A, i - 1, 1)  ▷ Heapify
6  for i ← 1 to n do
7      A[i] ← B[i]

Q: What's the running time?
A: Well, MaxHeapify runs in time O(log n), and there are n calls to it. So the running time in question is O(n log n).
Q: That's correct.

An Example

The input numbers are 10, 6, 1, 8, 7, 9, 3, 2, 4, 11. (Figures: the heap after BuildMaxHeap, then the successive heaps as the last key replaces the root — 6 replaces 11, 4 replaces 10, 2 replaces 9, 3 replaces 8 — each step followed by heapifying from the root.)

A Priority Queue

A data structure for maintaining elements that are assigned numeric values.

Operations on Priority Queues

- Maximum: To obtain an element with the largest key.
- Extract-Max: To take out the element with the largest key.
- Insert: Insert an element.
- Delete: Delete an element.
- Decrease-Key: Decrease the key of an element.
- Increase-Key: Increase the key of an element.

Implementation of a Priority Queue Using a Max-Heap

Maximum(A): Return A[1].

Extract-Max(A, n): This is the core of HeapSort.

Insert(A, n, x): To add a key x to A, whose size is n, first increment n by one. Then insert x as the nth element. Then resolve the violation from the inserted key towards the root.

(Figure: an example of Insert.)

The Pseudocode

Insert(A, n, x)
1  n ← n + 1
2  A[n] ← x
3  UpwardFix(A, n)

UpwardFix(A, k)
1  i ← k
2  while i ≥ 2 do
3      j ← ⌊i/2⌋
4      if A[j] ≥ A[i] then quit the loop
5      exchange A[i] and A[j]   ▷ Swap A[i] and A[j] if A[j] < A[i]
6      i ← j                    ▷ Move to j

The running time is O(log n).

Delete(A, i): Exercise 6.5-7.

Increase-Key(A, i, x): What needs to be done is to replace A[i] by x if x > A[i]. To do the replacement, set A[i] to x and call UpwardFix(A, i).

Decrease-Key(A, i, x): Use MaxHeapify.
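The priority-queue Insert and its UpwardFix helper translate almost line for line into Python. This is a minimal sketch under the notes' 1-indexed convention (A[0] is an unused slot); the snake_case names are simply transliterations of the pseudocode names:

```python
def upward_fix(A, k):
    """Restore the max-heap property along the path from node k to the root.

    A is 1-indexed (A[0] is unused); only A[k] may violate the property.
    """
    i = k
    while i >= 2:
        j = i // 2                   # parent of node i
        if A[j] >= A[i]:
            break                    # heap property holds here; quit the loop
        A[i], A[j] = A[j], A[i]      # swap A[i] and A[j]
        i = j                        # move up to the parent

def heap_insert(A, x):
    """Insert key x into the max-heap A[1..n], growing the array by one."""
    A.append(x)                      # x becomes the new last node
    upward_fix(A, len(A) - 1)        # resolve violations towards the root
```

Since the path to the root has length at most the height of the heap, this is O(log n), matching the analysis above.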
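For concreteness, here is a Python transliteration of MaxHeapify, the unrolled BuildMaxHeap, and HeapSort — a sketch, again assuming the 1-indexed convention (A[0] unused). The auxiliary array B follows the pseudocode above rather than the usual in-place formulation:

```python
def max_heapify(A, n, i):
    """Sift the key at node i down within the heap A[1..n] (A[0] unused)."""
    k_hat = i
    while k_hat <= n // 2:               # nodes beyond n//2 are leaves
        k = 2 * k_hat                    # left child, for now
        if k + 1 <= n and A[k] < A[k + 1]:
            k += 1                       # right child exists and has the larger key
        if A[k_hat] >= A[k]:
            break                        # max-heap property holds here; quit
        A[k_hat], A[k] = A[k], A[k_hat]  # swap A[k_hat] and A[k]
        k_hat = k                        # move down to k

def build_max_heap(A, n):
    """Turn A[1..n] into a max-heap by heapifying every non-leaf, bottom up."""
    for i in range(n // 2, 0, -1):
        max_heapify(A, n, i)

def heap_sort(A, n):
    """Sort A[1..n] into increasing order, using the auxiliary array B."""
    build_max_heap(A, n)
    B = [None] * (n + 1)
    for i in range(n, 0, -1):
        B[i] = A[1]                      # A[1] is the largest key of A[1..i]
        A[1] = A[i]                      # the last key replaces the root
        max_heapify(A, i - 1, 1)         # re-establish the max-heap property
    for i in range(1, n + 1):
        A[i] = B[i]
```

The textbook's in-place variant swaps A[1] with A[i] instead of copying through B, which saves the extra array without changing the O(n log n) bound.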
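Looping back to problem 9 from the earlier problem set: the inductive proof is constructive, so it translates into a short recursive procedure. A minimal Python sketch (the name prime_factors is mine; trial division up to √n finds a proper divisor whenever one exists):

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for integer n > 1.

    Mirrors the induction: if n has no divisor d with 1 < d <= sqrt(n),
    then n is prime; otherwise n = d * (n // d) with both factors
    smaller than n, so recurse on the two smaller factors.
    """
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return prime_factors(d) + prime_factors(n // d)
    return [n]   # no proper divisor, so n itself is prime
```

The recursion terminates for the same reason the proof does: both factors strictly decrease, so eventually every branch reaches a prime.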