Algorithm Design and Analysis
Algorithm Design and Analysis CSE 565
Popular in Course
Popular in Computer Science and Engineering
This 0 page Class Notes was uploaded by Libby Kuhlman on Sunday November 1, 2015. The Class Notes belongs to CSE 565 at Pennsylvania State University taught by Staff in Fall. Since its upload, it has received 43 views. For similar materials see /class/233118/cse-565-pennsylvania-state-university in Computer Science and Engineering at Pennsylvania State University.
Reviews for Algorithm Design and Analysis
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 11/01/15
Algorithm Design and Analysis LECTURE 9 m Greedy Algorithms Minimum Spanning Tree Kruskal s Prim s ReverseDelete Clustering Sofya Raskhodnikova Wm m Minimum spanning tree MST In ut A connected undirected graph G V E E gt R P with weight function w For now assume all edge weights are distinct Output A spanning tree T i a tree that connects all vertices i of minimum weight wT Zwuv uvET Wm m Greedy Algorithms for MST Kruskal39s Start with T o Consider edges in ascending order of weights Insert edge e in T unless doing so would create a cycle ReverseDelete Start with T E Consider edges in descending order of weights Delete edge e from T unless doing so would disconnect T Prim39s Start with some root node s Grow a tree T from s outward At wch step add to T the chapest edge e with exactly one endpoint in T Wm gl Cut and Cycle Properties Cut property Let S be a subset of nodes Let e be the min weight edge with exactly one endpoint in S Then the MST contains e Cycle property Let C be a cycle and let f be the max weight edge in C Then the MST does not contain f ersmthe MST fisnutmthe MST Wm A M W Review Questions Let G be a connected undirected graph with distinct edge weights Answer true or false Let e be the cheapest edge in G Some MST of G contains e Let e be the most expensive edge in G No MST of G contains e Wm W Review 39 E Questions Let G be a connected undirected graph with distinct edge weights Answer true or false Let e be the cheapest edge in G Some MST of G contains e AnswerzYes by the Cut Property Let e be the most expensive edge in G No MST of G contains e Answer False Counterexample if G is a tree all its edges are in the MST Wm A M 8 Prim39s Algorithm Correctness Prim39s algorithm Jarni k 1930 Prim 1959 ilnitialize S any node iApply cut property to S iAdd to T min cost edge in cutset corresponding to S and add one new explored node 11 to S Wm A m l Implementation of PrimGw IDEA Maintain V 7 S as a priority queue Q ala Dijkstra Key each vertex in Q with the Weight of the last Weight edge connecting it to avertex in S e V keyv e w for all v E V kEyLV lt 0 for some arbitrary E V while Q 9 do u e EXTRACTVMIMQ do ifv E Q and Wm v lt keyv then keyv e wu v gt DECREASEKEY 1 V lt 14 At the end v TLV forms the MST Wm A W gl Demo of Prim s algorithm Wm Wm A W Wm Demo of Prim s algorithm Wm Wm gl Demo of Prim s algorithm oeS oeVis Wm m Demo of Prim s algorithm Wm 065 6 Wm g Demo of Prim s algorithm QeS oeVis Wm Demo of Prim s algorithm Wm Demo of Prim s algorithm Wm Demo of Prim s algorithm Wm gl Analysis of Prim Q lt V 80 keyv e no for all v e V mtal keys lt O for some arbitrary S e V While Q Q do u e EXTRACTMINQ for wch v e Adju n times 3734 doif v 6 Q and wu v lt keyv mes then keyv e Mu v 7Lv lt u Handshaking Lemma gt m implicit DECREASEKEY s Time as in Dijkstra WW A W g Analysis of Prim while Q Q do u e EXTRACTMINQ n for wch v e Adju times degreeOt do if v 6 Q and wu v lt keyv tim then keyv e wot v Handshaking Lemma 3 m implicit DECREASEKEY s Wm W l Correctness of Kruskal KruskaL 1956 Consider edges in ascending order of Weight 7 Case 1 If adding e to T creates a cycle discard e according to cycle property Easel 7 Case 2 Othenxxise inserte u v into T according to cut property Where S set of nodes in u39s connected compon ent m Implementation of Kruskal Use the UnionFind data structure 7 Build set T of edges in the MST 7 Maintain a set for each connected component means m I sort edges weights so that w s w 5 s w 1 v q awash u s v make a set ContaJang smgleton u WW We th 122332131123w go through edges 0 sorted order Jf u and v are An afferent sets T 4 T J merge the sets contaJang u and v mm 1 Mummme Wm m The UnionFind Data Structure Operations MAKEUNIONFINDS creates the data structure puts all elements in S into separate se s On time Where nS FINDu returns the representative of the set containing u Olog n time UNIONAB merge sets AB into a single set 01 time gl Linked List Representation For wch set keep the size the had the tail of the list 7 for each element keep a pointer to the representative head 39 MAKEUNIONFIND 01 time per element FINDX return a pointer to representative 01 time Wm A W gl Linked List Representation UNIONAB keep the representative of the larger set 4 39 E E I an II nunnu Ill Wm Linked List Representation UNIONAB keep the representative of the larger set E Excluding updates to representatives 01 time Wm A W El Analysis of UnionFind Theorem A sequence ofk UNION operations takes Ok log k time Proof We bound the number of updates to representatives Initially n singleton sets are created One UNION merges at most two of the singletons After k UNIONs all but 2k singletons are untouched At every UNION representativex is updamd only if the size of X s set at least doubles After k UNIONs x is in the set of size 3 2k Representativex is updamd S log2 2k times At most 2k elements are touched by k UNIONs Total time on updating representatives Ok log k Wm m Forest Representation Each element is a node Each tree represents one set store its size The root is the representative MAKEUNIONFIND create roots 7 01 time per element a o UNIONAB point the root of the m a smaller tree to the root of the larger tree 7 01 time Wm g FIND operation F1NDX follow the links to the root 9 Theorem FIND takes Olog n time a 9 Proof Time to evaluate FINDx o number of predecessors ofx number of times x changes representatives Every time x changes representatives the size of its set at last doubles It mn happen g logZ n times Wm m An Improvement to FIND Path Compression update every pointer on the way to the root 9 e FINDX 0 an 49 o o o o a a Theorem n FIND operations take 0n un time Where 001 is inverse Ackerman function Wm A W 3 Implementation of Kruskal Build set T of edges in the MST Maintain a set for each connected component KmskaltG m lt Sort adng weights so that w s was s w r o q mmmmonmrm v forearm eago uv qo through edges 1n sorted order lt m 1 u g umommwml mmml gt 5 Wm I M WWW t a t m Muran mama commas l Sorting 0m log m 0m log n2 0m log n Union7Find operations 0m log n 0m10g H mm Wm A M LeXicographic Tiebreaking To remove the assumption that all edge costs are distinct perturb all edge costs by tiny amounts tlo break any ties Li i fi l332 f fi2239fl 39 x Im act Kruskal and Prim only interact with costs via pairwise comparisons If perturbations are sufficiently small MST with perturbed costs is MST with original costs Implementation Can handle arbitrarily small perturbations implicitly by braking ties lexicographically according to in ex Wm g MST Algorithms in 2007 Deterministic comparison based algorithms 70m log n Jarn1 k Prim Dijkstra Kruskal Boruvka 7 0m 0t m n Chazelle 2000 Holy grail 0rn Related 7 0m randomized KargerKleinTarj an 1995 7 0m veri cation DixonRauchTarjan 1992 Wm Given a set U of 11 objects eg photos documents microorganizms labeled p1 pn classify them into coherent groups Ou rbreak of cholera deaThs in London in 1850s Reference Nina Mishra UniversiTy of Virginia 9172007 S Raskhodnikova based on slides by E Demaine C Leiserson A Smith K Wayne Clustering Distance function for each pair of objects Close number of corresponding pixels whose inTensiTies differ by some Threshold Fundamental problem Divide into clusters so that objects in different clusters are far apart Routing in mobile ad hoc networks Identify patterns in gene expression Document categorization for web search Similarity searching in medical image databases Skycat cluster 109 sky objects into stars quasars galaxies 9172007 S Raskhodnikova based on slides by E Demaine C Leiserson A Smith K Wayne Clustering of Maximum Spacing 0kclustering Divide objects into k non empty groups Distance function satisfies dpi pj 0 iff pi pj identity of indiscernibles dpi pj 2 O nonnegativity c1031 pj dpj pi symm try 1 Spacing Minimum distance between spacmg any pair of points in different clusters 32 OGoal Given an integer k find o a k clustering of maximum spacing k 4 9172007 S Raskhodnikova based on slides by E Demaine C Leiserson A Smith K Wayne Greedy Clustering Algorithm Singlelink kclustering algorithm Form a graph on the vertex set U corresponding to 11 clusters Find the closest pair of objects such that each object is in a different cluster and add an edge between them Repeat n k times until there are exactly k clusters Key observation This procedure is precisely Kruskal39s algorithm except we stop when there are k connected components Remark Equivalent to finding an MST and deleting the k l most expensive edges 9172007 S Raskhodnikova based on slides by E Demaine C Leiserson A Smith K Wayne Theorem Let C be the clustering C1 Ck formed by deleting k l most expensive edges of a MST C is a k clustering of max spacing Proof Let C be some other clustering C1 Ck The spacing of CgtXlt is the length dgtXlt of klSt most expensive edge Let pi p be in the same cluster in C say Ci but different clusters in C say C and Ct Some edge p q on pipj path in Cgtlltr S spans two different clusters in C All edges on pipj path have length S dgtXlt since Kruskal chose them Spacing of C is S dgtllt since p and q are in different clusters 9172007 S Raskhodnikova based on slides by E Demaine C Leiserson A Smith K Wayne