Fundamental Algorithms CS 231
Popular in Course
Popular in ComputerScienence
This 46 page Class Notes was uploaded by Jordon Hermiston on Thursday October 29, 2015. The Class Notes belongs to CS 231 at Wellesley College taught by Randy Shull in Fall. Since its upload, it has received 26 views. For similar materials see /class/230934/cs-231-wellesley-college in ComputerScienence at Wellesley College.
Reviews for Fundamental Algorithms
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/29/15
M Finding the Minimum Element MinimumA min lt Al for i lt 2 to length A do if min gt Ai then min lt Ai return min a M Is That the Best We Can Do Remark A decision tree for the MIN problem must have at least n leaves since any one of the n keys may be the output M kt That Don39t Impress Me Much The decision tree model yields a 9lg n lower bound for finding the minimum value Surely any algorithm that returns a minimum value requires at least 901 steps a a M ht Lower Bound On Minimum Lemma Any algorithm to find the min of n keys by comparisons of keys must do at least nl comparisons Proof In any list with n distinct entries 11 entries are not minimum An entry is not minimum if it is larger than at least one other entry on the list Hence n l entries must be quotwinnerquot in comparisons done by algorithm1 Each comparison has one only winner i 5 Simultaneous Minimum and Maximum Remark Finding the minimum and maximum elements independently requires 2n 2 comparsons Challenge Can we do better Strategy for Simultaneous MinMax 1 Maintain min and max elements seen so far 2 Compare each pair of input elements to each other 3 Compare smaller of two inputs to min amp update 4 Compare larger of two inputs to max amp update s M Selection in Expected Linear Time To find iCh smallest element p lt Randomized PartitionAlohi hi 17 largest elements l0 17 hi I l p l0 smallest elements if i p lo1 then return Ap elseif i lt p lo1 then search bottom half of array for i else search top half of array for i p lo1 RandomizedSelect Randomized Select A 10 hi i if 10 hi then return Alo p lt Randomized PartitionAlohi if i p lo1 then return Ap elseif i lt p lo1 then return Randomized SelectAlop 1 i else return Randomized Select Ap1hii p lo1 s An Unhappy Choice in the Worst Case lo phi n I elements AHappy Choice on Average 772 712 Selection in WorstCase Linear Time SelectA k 1 Divide the keys into sets of size five and find median of each set y3 2 Apply Select recursively to find median m of nS medians M Smaller to Left Laiger to Right 3 Compare each key in A and D to m Sl 6 C U keys from AJD lt m 32 6B U keys from AJD gt m r groups of 5 r groups of 5 4 if sl l k then return m else if k s Sl then return SelectSlk else return Select k Sl l Stack Operations Each operation runs in 01 time MultiPop S k MultiPopS k while not Stack EmptyS and k 2 0 do PopS k e k l MulthopS3 gt S s M WorstCase Time In a sequence of n stack operations starting with an empty stack what is the worstcase cost of a mul t iPop operation Mu1t1P0pSn gt S S Worstcase cost of n stack operations Amortized Cost 9 a S S S S S S PushSA PushSB PushSC MultiPopt92 PushSB MultiPODS5 Aggregate Method Each object can be popped at most once for each time it is pushed s M The Accounting Method a M Actual Costs Push s A 1 Pop S 1 MultiPop S k mink sizeS Amortized Costs Overcharge Peter Push sA 2 to pay Paul Pop 839 0 S Paul MultiPopSk 0 4 Money 1n the Bank B 1 B 1 gt gt A m gt mag S S S S S S S PushSA PushSB PushS C MulthOP S 2 PUShSB Mu1t1P0pSE Pay 2 2 2 0 2 0 No bounced checks s Potential Energy Push S C Push 313 Empty MultiPOP 32 Stad MulthopS 5 Associate a potential energy with the data structure at each stage Amortized Change in potential cost A due to operation Ci Ci Di 39 DiZ R Actual cost Total Amortized Cost 215139er Ci 215139er Di Di 1 Cn cIDn IXDn I Cn I Dn Z Dn2 39 39 39 1 13D1 13D0 21m cl ltltIgtltDngt ltIgtltD0gtgt Thus if we define I so that 19Di 2 19D0 2 0 for all i the total amortized cost is an upper bound on the total cost 3 For Example I size of S Push sc Push 53 Push SB MultiPop S 2 Empty PushSA MultiPopS5 Stack 0 1 2 3 1 2 n 9I A A A A A I S S S S S S S Ripple Counter Software Ripple Counter Increment A i e 0 while iltlengthA and Ail do Ai e 0 i lt il if iltlengthA then Ai e 1 Cost of an increment is linear in the number of ipped bits Total number of ips for a sequence of n increments M Amortized Cost We seek a potential function I so that ltIgtDl 2 ID0 2 0 5 Total Amortized Cost Suppose the ith Increment operation resets t1 bits so C s til The number of one s in counter after 1th operation is bl S bi1 li1 The potential di erence is CIgtD ltIgtDi 1 b bl1 blJ ti l bl1 l ti IA The amortized cost is therefore 8139 Ci Day 39 Di 1 s 1111 41 2 a a M Int Aggregate Method Flips every fourth call Flips every call Flips every other call s Wellesley College 0 CS231 Algorithms 0 October 11 1996 Handout 18 ORDER STATISTICS Reading CLR Chapter 10 Terminology Let S be a set of n elements necessarily distinct The 1th order statistic of S is the 1th smallest element ie the element that is larger than exactly i 1 other elements in the set Such an element is said to have rank 139 The minimum of S is the rst order statistic element with rank 1 The maximum of S is the nth order statistic element with rank n The medians ofS is are the e1ements with rank n12J and l n12 l Example B 43 5 17 91 2 42 19 72 37 3 Minimum of B Maximum of B Medians of B Rank of 17 The Selection Problem Speci cation Select A i Return the ith order statistic from an array of n distinct elements Trivial algorithm Select A i Sort A return A i Can obviously be done in n lgn time The main question of today39s lecture Can we do better Important special cases Can find mininum or maximum with n l comparisons Can find minimum and maximum with 3 l nZ l comparisons For any fixed k 2 1 can select kth or n kth element in n time by k applications of minimummaximum deletion Selection in Expected Linear Time Key idea Use randomized quicksort like partitioning but only explore one partition Randomized SelectA 10 hi i if 10 hi then return Alo i guaranteed split lt Randomized PartitionA lowerlength lt split 10 if i S lowerlength then return Randomized SelectA 10 Split i return Randomized SelectA split 1 hi i lowerlength 10 hi at this point 10 hi 1 2 n l Can derive that the expected running time is Tn E ZTk n k Hm See CLR for details Can use the substitution method to show that Tn S cn is a solution to the above recurrence Notes The substitution method uses a form of induction If we assume that the smaller problems have a given running time and we can show that the whole problem also has that running time then that running time is a valid solution of the recurrence n4 While the substitution method shows that Tn S cn is a solution to Tn E ZTk n k Rm 2 n it cannot show that this is a solution to the quicksort recurrence Tn E ZTk n k1 Try it and see Selection in Worst Case Linear Time Presentation in CLR is confusing I prefer the following explanation We will consider the median of median of c algorithm where c is an integer constant 2 l MedianofMedianofcc A i 1 View input array Aln as atwodimensional array Blc lnc Ifc does not evenly divide n can always pad the array with extra large elements that won t affect results Sort each column using any method even quicksort Since each column contains c elements this step is linear not quadratic After this step the row B l cl2 lnc are medians of their respective columns Use MedianofMedianofcc B l cl2 lnc l ncl2 to find median of medians mm At least 14 elements are 2 mm at least 34 elements are lt mm Can see this by imagining that the columns are sorted from left to right by their medians Important we only imagine this sorting we don t actually do it N Partition A around mm guaranteeing that mm ends up in the high partition Let k be the number of elements in the low partition and n k the number of elements in the high partition LA 4 If i S k use MedianofMedianofc with index i on the low partition Ifi gt k use MedianofMedianofc with index i k on the high partition Analysis M Binary Code 1mm 39 tr co ecuon f I andmles 0 f g g mmofs 39 I mes quoti 5 r as it Coded lnd2imal ASCII 99 III 1 ml at 107 111 Ill ll III a M 111 III I 101 Ill 5 ll Ill 31 I III 97 ll ll J 116 117 III III 115 32 1X1 102 HZ Us 121 115 116 m 159 32 Ill 1112 32 quot5 III I 116 ll Ill 1m 32 ll 1 115 quot5 uuwn mun 1mm mom umum unmn 1mm mama mum unmo mum moon noun nnnun Homo mom nounu mama mum mun mum man nnnn mom mmmx nnmo uunma mnmm mum 111nm mum mam umnu mrmnnn unuu mom woman man umn nunn1 mnnn mmnn mum uan lumnu mun mono manna mnnu ll l ul mnmn nmna mum unma mum ammo 1mm mam 11mm moon Improving on ASCII ASCII uses eight bits in order to store 28 distinct characters If our file uses fewer characters we might do better w1th a di erent code For example rather than store quotABRACADABRAquot in eight b1t ASCII code we store it in our own personal five b1t b1nary code A 00001 B 00010 0000100010100100000100011000010010000001000101001000001 gt This represents a 375 savings over the ASCII encoding But we can do better If we glve up the 110th11 of xed encoding length m VariableLength Encoding Assign shortest bit strings to the most commonly used letters A0 B1 R01 C10 D11 So ABRAC 010101001101010 This uses only 15 bits compared to 55 in our preVious scheme But it does required blanks to separate the characters ADABRA becomes Lt a M Epiphany Without the spaces the encoding of ABRACADABRA is ambiguous 010101001101010 For example is the first bit an A or the start of the pair 01 representing an R But delimiters aren t needed if no character code is the pre x of another RecallA0 131 R01 010 1311 s M Prefix Codes improved code for the message ABRACADABRA is given B C D R by 11 00 010 10 011 The message ABRACADABRA is translated as the 25bit string 1100011110101110110001111 a M Coding Trees Decoding needs a convenient representation for the prefix code For example 1100011110101110110001111 Recall A11 1300 R011 0010 1310 s M Cost of Tree T De nition Let T be the code tree corresponding to a prefix code Define Bar Zea fc dTrc wherefc is the frequency of character 0 and dTc denotes C s depth in T What is the cost of the ABRACADABRA tree given in the previous slide M Can We Do Better ABRACADABRA tree yielding 23 bit string 01101001111011100110100 M Huffman Codes Huffman encoding is a method for constructing a tree which leads to a bit string of minimal length for any given message The first step is to find the frequency counts for the message For example ABRACADABRA Frequency Table 2 l l 2 WUOUUDgt a M Build Tree Bottom Up Maintain a priority queue Q keyed on frequency lCrlllD1B2R2A5 Remove the two trees with smallest root values and merge them into a new tree s M N Repeat 3 4 m To Obtain a M a M Huffman Code HuffmanC n M Q C for i lt l to n l do 2 lt Allocate Node X lt leftz lt Extract MinQ y lt rightz lt Extract MinQ fz lt fx fy InsertQz return Extract MinQ GreedyChoice Property Let x y E C have minimum frequencies Then there exists an optimal prefix code for C in which the codewords for xy have the same length and differ only in the last bit Proof of GreedyChoice an Choose a and b be sibling leaves at maximum depth in T Since x is min frequency x s a and costT 5 costT I M OptimalSubstructure Property Let x y E C have minimum frequencies and T be any optimal prefix code tree for C in which x y are siblings Tree T obtained from Tby replacing the parent of leaves x y by leaf 2 Withfz fx y is an optimal prefix code tree for C C xy U z a 5 And Conversely Let x y E C have minimum frequencies and T be any optimal prefix code tree for C C xy U z where fZ fx fy Tree T obtained from T by replacing leaf 2 by interior node having x y as children is an optimal prefiX code tree for C M Sometimes Binary Trees Let Us Down Repeated insertion of elements from a sorted le results in a skinny tilted tree and terrible search times a M Balancing a Tilted Tree Riga a s M Easier Said than Done About which node of a binary search tree do we pivot in order to obtain balance We require global information to restore balance 3 M RedBlack Trees Every simple path from a node to a descendant leaf contains the Every red n0de has two same number of black nodes black children Every leaf WIL is black 395 M N RedBlack Trees are Roughly Balanced Lemma 41 A redblack tree with n intemal nodes has height at most 2 lgn l We show subtree rooted at each node x has at least 21Wquot 1 internal nodes I V M kl Implications Search Inseit Delete Successor Predecessor Max Min RedBlack Tree i 5 M Tree Insert T 643 Tree Insert T 2 find z s parent node y in T with empty leaf attach 2 as left or right child of y Maintaining Balance We change the pointer structure through rotation which preserves the inorder structure of the redblack tree RightRotate y s Left Rotate T x No colors yet a M Rotation Code Left RotateTX y e rightX right X e lefty if 1efty e NIL then plefty e X py e pX if pX NIL then root T 6 Y else if X left pX Attach y s left subtree to X39S right subtree Link X39S parent to y then leftpX 6 Y else right pX 6 Y lefty e X pfo e y Put X on y s left s RB Insert T 643 Case 1 Both x39s parent and uncle are red There are two symmetric sets of cases px leftppx and px rightppx V M M Case 1 is Easy color x39s grandparent red color both x s parent and uncle black 395 M Case 2 Requires a Bit More W0rk Case 2 x is the right child 0 a re parent but now x39s uncle is black Coloring x39s parent black won39t help Instead we rotate px and x Case 3 Always Results from Case 2 Case 3 x is the left child ofa red parent and x39s uncle is still black But may arise on its own as well 5 Restoring Balance We are over balanced on the left side of the tree ltpxJ le ppx We RightRotate about ppx to restore c3155 2 msures that X 5 ght balance s1blmg 15 black The Results After RightRotate about I t f former x pam mg ormerp x pp black and painting former ppx red Tree DeleteT 603 Tree Delete Tz if leftz or rightz is a leaf then y lt z else y lt successorz copy y s key amp info into node z delete node y As long as the deleted node y is red all is well Try treeedelete T 655 RB DeleteT 655 The easy case RB Tree DeleteTz if leftz or rightz is a leaf then y lt z else y lt successorz copy y s key amp info into node z delete node y if colory black then RB Delete FixupT X z39s nonNIL child if it had one Lt RB Delete FixupT 643 Deleting a black node 2 may unbalance the RBtree However there is an easy fix if z former child is red RB Delete T 799 More challenging case s RB Delete Fixup T NIL Leftirotate T 812 and recolor to restore balance We think of x as carrying an extraquot black obtained from 2 when 2 was spliced out Case 4 of Four Cases Nodes 812 956 and 999 were recolored to restore balance In the interests of time not to mention sanity we read the remaining cases from the text in the privacy of our own rooms 395 M Graphs De nition A graph GVE is a collection Vofvertices and E of edges C C Vertices are simple objects that can have names and other properties an edge is a connection between two vertices M Graph Terminology A graph may have many representations Figure l A graph Graph terminology path r simple path cycle tree forest spanning tree a WTHIIHDgaqoaV 0000001100111uv 00000000000118 00000000001010 0000000111000G 00000011110003 0000000111001 00000010100019 0000110000000H 00001100000001 1111000000000FA 0011000000000vniA 11010000000007A 1101000000000W Adj acencyMatrix of G V E De ning Properly A tree 1s a graph with n nodes and n1 edges connecting any two nodes A tree is a graph in which there is exactly one path De nition An Old Friend M N Advantages and Disadvantages Adjacencymatriees require 01 steps to determine whether u v E E D GH 0 10 0 00 0 00 1 00 1 10 1 00 0 10 0 01 0 01 0 00 0 00 0 00 00 OOOOHHOOOOOOOH t t t t OOOOOOOOOH OOHHOOOOOOOOOW L 0 0 0 0 0 0 0 0 0 1 0 1 1 OOOOOOOt H H OOH TJ Hwowoooooooooz E 0 0 0 1 1 1 1 0 0 0 0 0 0 OOOOOOOOOOHOHO B 1 1 0 0 0 0 0 0 0 0 0 0 0 ZWWHHEQmmU0wgt oooooooojgt 0 However the representation requires 0V2bits of storage and 0V2 steps to initialize This can be a problem if the graph is not dense I V M ht Adjacency List of GM E i 5 Nobody39s Perfect Some simple operations are not easily supported by the adj acencylist representation For example how do we delete a node and all the edges connected to it 696 M BreadthFirst Search BFSGS for v in vertices G do colorv lt white dv oo parent v lt nil colors lt gray dS lt O parents lt nil Q lt EnqueuesEmpty Queue while not Empty QueueQ f lt DequeueQ for g in Adj f do if colorg white then color 9 lt gray dg lt df l parentg lt f Enqueueg Q colorf black do 5 BreadthFirst Search inAction EIEIEIEIEI Contents of queue during search HE HE E Picking Up a Graph to Explore It M Shortest Path De nition The shortestpath distance 6sv from sto v is de ned to be the minimum number of edges in any path from vertex s to vertex v M Correctness of BreathFirst Search Theorem 234 BFS discovers every vertex v E Vthat is reachable from the source s and upon determination dv 6sv s M N DepthFirst Search Use a stack rather than a queue as the data structure to hold vertices yields a much different search order E E E E E HEIEIE HEIEIEE EEIEIE EIEIEH HE HEIH GWE Contents of slack during depth rst Search DepthFirst Forest DepthFirst Forest of G i 5 Amazing lt gt Depth rst search was rst stated formally hundreds of years ago as a method for traversing mazes gr
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'