Adv Data Structures
Adv Data Structures CS 6310
These 34-page class notes were uploaded by Lisette Hodkiewicz on Wednesday, September 30, 2015. The notes belong to CS 6310 at Western Michigan University, taught by Ajay Gupta in Fall. Since upload, they have received 46 views. For similar materials see /class/216878/cs-6310-western-michigan-university in Computer Science at Western Michigan University.
Date Created: 09/30/15
[WMU-CS lecture slides by Ajay Gupta, based on Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest & Stein; slide dates 23 September 2004 and 11/22/2004. The slide text did not survive extraction. Legible fragments include a Chapter 29 title slide, the TREE-MINIMUM and TREE-MAXIMUM procedures, and MAKE-SET.]

UC Berkeley, CS 170: Efficient Algorithms and Intractable Problems
Handout 12
Lecturer: David Wagner
March 11, 2003

Notes 12 for CS 170

1 Disjoint Set Union-Find

Kruskal's algorithm for finding a minimum spanning tree used a structure for maintaining a collection of disjoint sets. Here we examine efficient implementations of this structure. It supports the following three operations:

• MAKESET(x): create a new set containing the single element x.
• UNION(x, y): replace the two sets containing x and y by their union.
• FIND(x): return the name of the set containing the element x. For our purposes this will be a canonical element in the set
containing x.

We will consider how to implement this efficiently, where we measure the cost of doing an arbitrary sequence of m UNION and FIND operations on n initial sets created by MAKESET. The minimum possible cost would be $O(m + n)$, i.e., cost $O(1)$ for each call to MAKESET, UNION, or FIND. Our ultimate implementation will be nearly this cheap, and indeed will be this cheap for all practical values of m and n.

The simplest implementation one could imagine is to represent each set as a linked list, where we keep track of both the head and the tail. The canonical element is the tail of the list (the final element reached by following the pointers in the other list elements), and UNION simply concatenates lists. In this case FIND has maximal cost proportional to the length of the list, since following each pointer costs $O(1)$, and UNION has cost $O(1)$, to point the tail of one set to the head of the other. The worst-case cost is attained by doing n UNIONs to get a single set, and then m FINDs on the head of the list, for a total cost of $O(mn)$, much larger than our target $O(m + n)$.

To do a better job, we need a more clever data structure. Let us think about how to improve the above simple one. First, instead of taking the union by concatenating lists, we simply make the tail of one list point to the tail of the other, as illustrated below. That way the maximum cost of FIND on any element of the union will be proportional to the maximum of the two list lengths (plus one, if both have the same length), rather than the sum.

[Figure: UNION of two lists, with the tail of one list pointing to the tail of the other.]

More generally, we see that a sequence of UNIONs will result in a tree representing each set, with the root of the tree as the canonical element. To simplify coding, we will mark the root by setting the pointer in the root to point to itself. This leads to the following initial implementations of MAKESET and FIND:

    procedure MAKESET(x)    ... initial implementation
        p(x) := x

    function FIND(x)    ... initial implementation
        if x ≠ p(x) then return FIND(p(x))
        else return x

It is convenient to add a fourth operation
LINK(x, y), where x and y are required to be two roots. LINK changes the parent pointer of one of the roots, say x, and makes it point to y. It returns the root of the composite tree, y. Then UNION(x, y) = LINK(FIND(x), FIND(y)).

But this by itself is not enough to reduce the cost: if we are so unlucky as to make the root of the bigger tree point to the root of the smaller tree, n UNION operations can still lead to a single chain of length n and the same cost as above. This motivates the first of our two heuristics: UNION BY RANK. This simply means that we keep track of the depth (or RANK) of each tree, and make the root of the shorter tree point to the root of the taller tree; code is shown below. Note that if we take the UNION of two trees of the same RANK, the RANK of the UNION is one larger than the common RANK, and otherwise it is equal to the max of the two RANKs. This will keep the RANK of a tree of n nodes from growing past $O(\log n)$, but m UNIONs and FINDs can then still cost $O(m \log n)$.

    procedure MAKESET(x)    ... final implementation
        p(x) := x
        RANK(x) := 0

    function LINK(x, y)
        if RANK(x) > RANK(y) then swap x and y
        if RANK(x) = RANK(y) then RANK(y) := RANK(y) + 1
        p(x) := y
        return(y)

The second heuristic, PATH COMPRESSION, is motivated by observing that since each FIND operation traverses a linked list of vertices on the way to the root, one could make later FIND operations cheaper by making each of these vertices point directly to the root:

    function FIND(x)    ... final implementation
        if x ≠ p(x) then
            p(x) := FIND(p(x))
            return(p(x))
        else return(x)

We will prove below that any sequence of m UNION and FIND operations on n elements takes at most $O((m + n) \log^* n)$ steps, where $\log^* n$ is the number of times you must iterate the log function on n before you get a number less than or equal to 1. Recall that $\log^* n \le 5$ for all $n \le 2^{2^{16}} = 2^{65536} \approx 10^{19728}$. Since the number of atoms in the universe is estimated to be at most $10^{80}$, which is a conservative upper bound on the size of any computer memory (as long as each bit is at least the size of an atom), it is unlikely that you will ever have a graph with
this many vertices, so $\log^* n \le 5$ in practice.

2 Analysis of Union-Find

Suppose we initialize the data structure with n makeset operations, so that we have n elements each forming a different set of size 1, and let us suppose we do a sequence of k operations of the type union or find. We want to get a bound on the total running time to perform the k operations. Each union performs two find operations and then does a constant amount of extra work. So it will be enough to get a bound on the running time needed to perform $m \le 2k$ find operations.

Let us look at how the data structure looks at the end of all the operations, and let us see what the rank of each of the n elements is. First, we have the following result.

LEMMA 1. If an element has rank k, then it is the root of a subtree of size at least $2^k$.

PROOF. An element of rank 0 is the root of a subtree that contains at least itself (and so is of size at least 1). An element u can have rank $k + 1$ only if, at some point, it had rank k and it was the root of a tree that was joined with another tree whose root had rank k. Then u became the root of the union of the two trees. Each tree, by inductive hypothesis, was of size at least $2^k$, and so now u is the root of a tree of size at least $2^{k+1}$. □

Let us now group our n elements according to their final rank. We will have a group 0 that contains elements of rank 0 and 1, group 1 contains elements of rank 2, group 2 contains elements of rank in the range {3, 4}, group 3 contains elements of rank between 5 and 16, group 4 contains elements of rank between 17 and $2^{16}$, and so on. (In practice, of course, no element will belong to group 5 or higher.) Formally, each group contains elements of rank in the range $(k, 2^k]$, where k itself is a power of a power ... of a power of 2. We can see that these groups become sparser and sparser.

LEMMA 2. No more than $n/2^k$ elements have rank in the range $(k, 2^k]$.

PROOF. We have seen that if an element has rank r, then it is the root of a subtree of size at least $2^r$. It follows that there cannot be more than $n/2^r$ elements of
rank r. The total number of elements of rank between $k + 1$ and $2^k$ is then at most

$$\sum_{r=k+1}^{2^k} \frac{n}{2^r} < n \sum_{r=k+1}^{\infty} \frac{1}{2^r} = \frac{n}{2^k} \sum_{i=1}^{\infty} \frac{1}{2^i} = \frac{n}{2^k}.$$ □

By definition, there are no more than $\log^* n$ groups.

To compute the running time of our m operations, we will use the following trick. We will assign to each element u a certain number of tokens, where each token is worth $O(1)$ running time. We will give out a total of $n \log^* n$ tokens.

We will show that each find operation takes $O(\log^* n)$ time, plus some additional time that is paid for using the tokens of the vertices that are visited during the find operation. In the end, we will have used at most $O((m + n) \log^* n)$ time.

Let us define the token distribution. If an element u has (at the end of the m operations) rank in the range $(k, 2^k]$, then we will give (at the beginning) $2^k$ tokens to it.

LEMMA 3. We are distributing a total of at most $n \log^* n$ tokens.

PROOF. Consider the group of elements of rank in the range $(k, 2^k]$: we are giving $2^k$ tokens to each of them, and there are at most $n/2^k$ elements in the group, so we are giving a total of at most n tokens to that group. In total we have at most $\log^* n$ groups, and the lemma follows. □

We need one more observation to keep in mind.

LEMMA 4. At any time, for every u that is not a root, rank(u) < rank(p(u)).

PROOF. After the initial series of makeset operations, this is an invariant that is maintained by each find and each union operation. □

We can now prove our main result.

THEOREM 5. Any sequence of operations involving m find operations can be completed in $O((m + n) \log^* n)$ time.

PROOF. Apart from the work needed to perform find, each operation only requires constant time (for a total of $O(m)$ time). We now claim that each find takes $O(\log^* n)$ time, plus time that is paid for using tokens (and we also want to prove that we do not run out of tokens).

The accounting is done as follows: the running time of a find operation is a constant times the number of pointers that are followed until we get to the root. When we follow a pointer from u to v (where v = p(u)), we charge the cost to find if u and v belong to different
groups, or if u is a root, or if u is a child of a root; and we charge the cost to u if u and v are in the same group (charging the cost to u means removing a token from u's allowance). Since there are at most $\log^* n$ groups, we are charging only $O(\log^* n)$ work to find.

How can we make sure we do not run out of tokens? When find arrives at a node u and charges u, it will also happen that u will move up in the tree and become a child of the root (while previously it was a grandchild or a farther descendant); in particular, u now points to a vertex whose rank is larger than the rank of the vertex it was pointing to before. Let k be such that u belongs to the group with rank range $(k, 2^k]$; then u has $2^k$ tokens at the beginning. At any time, u either points to itself (while it is a root) or to a vertex of higher rank. Each time u is charged by a find operation, u gets to point to a parent node of higher and higher rank. Then u cannot be charged more than $2^k$ times, because after that the parent of u will have moved to another group. □

[WMU-CS lecture slides, 23 September 2004: Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest & Stein, Chapter 4. The slide text did not survive extraction.]
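Returning to the union-find handout above, its final MAKESET, LINK, FIND, and UNION procedures can be sketched in Python roughly as follows. This is a minimal illustration, not the handout's own code: the class name `DisjointSet`, the method names, and the helper `log_star` are assumptions made for this example. Two deliberate departures are labeled in the comments: `find` is written iteratively (two passes) rather than recursively, to stay clear of Python's recursion limit, and `link` returns early when both arguments are the same root, a case the handout's LINK does not address.

```python
import math


class DisjointSet:
    """Union-find with union by rank and path compression (illustrative sketch)."""

    def __init__(self):
        self.parent = {}  # plays the role of p(x) in the handout
        self.rank = {}    # plays the role of RANK(x) in the handout

    def make_set(self, x):
        # A new element is its own root (points to itself) and has rank 0.
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        # Assumption: iterative variant of the handout's recursive FIND.
        # Pass 1: follow parent pointers up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Pass 2: path compression, repoint every visited node at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def link(self, x, y):
        # x and y are required to be roots, as in the handout.
        if x == y:
            return x  # added guard: already the same set, leave ranks alone
        if self.rank[x] > self.rank[y]:
            x, y = y, x  # make x the root of the shorter tree
        if self.rank[x] == self.rank[y]:
            self.rank[y] += 1
        self.parent[x] = y
        return y

    def union(self, x, y):
        # UNION(x, y) = LINK(FIND(x), FIND(y))
        return self.link(self.find(x), self.find(y))


def log_star(n):
    """Number of times log2 must be applied to n to reach a value <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


if __name__ == "__main__":
    ds = DisjointSet()
    for v in range(6):
        ds.make_set(v)
    ds.union(0, 1)
    ds.union(2, 3)
    ds.union(1, 3)
    print(ds.find(0) == ds.find(2))  # 0 and 2 are now in the same set
    print(ds.find(4) == ds.find(5))  # 4 and 5 were never merged
    print(log_star(65536))           # 65536 -> 16 -> 4 -> 2 -> 1
```

Storing parents and ranks in dictionaries keeps the sketch independent of the element type; for elements numbered 0 to n - 1, plain lists would work equally well.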