Popular in Course
Popular in ComputerScienence
This 11 page Class Notes was uploaded by Jonas Bartell on Monday October 26, 2015. The Class Notes belongs to CS1510 at University of Pittsburgh taught by Staff in Fall. Since its upload, it has received 28 views. For similar materials see /class/229396/cs1510-university-of-pittsburgh in ComputerScienence at University of Pittsburgh.
Reviews for ALGORITHMDESIGN
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/26/15
DPsummary 11b CSlSlO Design and Analysis of Algorithms Fall 2006 Patchrawat Uthaisombut Dynamic Programming general coin change problem Let Cv be the minimum number of coins of denominations 4 3 I needed to make change for v dollars Cv minCv 4l Cv 3l Cv ll knapsack problem Let P139 w be the maximum profit that obtainable with weight at most w from items in the set 1 2 1 P139 w maxP139 l w P139 l w wz p1 chained matrix multiplication LetM139 j be the minimum number of multiplications needed to compute A AM A M0 139 mintgkgrilMO k Mk 1 J 1714 1117 optimal binary search tree Let C1 j be the cost of an optimal binary searchtree containing real keys 139 j and dummykeys139 l j C1 j minlgg C1 r l cr l j w139 longest common subsequence LetL139 j be the length of a longest common subsequence of x1 x2 x and y1 yz y L1 lj ll ifxy L1 J maxL1 l L1 l 1fx1 yj sequence alignment LetC139 j be the cost of an optimal alignmentof x1 x2 x andyl yz y L1 l j layj C1jmax L1j 16 L1 l j 6 Floyd39s shortest path algorithm LetL139 j k be the length of a shortestpath from 1 to jwhere the internal nodes are from the set 1 2 k We want L1 j n for all 139 j L1 j k minL1 j k l L1 k k l Lk j k 1 bitonic euclidean TSP Label the vertices using positive integer from left to right For139 lt j letL139 j be the length of a shortestbitonic spanning path forvertices in the setl 2 1 Where one path ends at vertex139 and the other path ends at vertex j We want minlgj 4 Ln j dm L1 J39 miniLU 1 J dlilp mjnlgkger Li 1 k dim maximum contiguous sublist DPsummary 11b Let V139 be the value of a maximum contiguous sublist of s1 s2 s ending with s We want maxlglgn V139 V139 max V139 l s SI Longest Increasing Subsequence LIS LetL139 be the length of a longest increasing subsequence of x1 x2 x ending Withx We want maxlglgn L1 L1 max L39 l Where the maximum is over allj suchthat 0 5 j lt 1 andx lt x making change with limited number of coins Let C1 v be the size of a smallestmultisetthat is a subset of d1 d2 dto make change forv dollars Let C1 v be 00 if there is no such set C1 v minC1 l v C1 l v d l pricecollecting path Let S139 be the largest weight of any shortest path from node 0 0 to node 1 j in grid graph G S139 maxS139 j l S139 l j w rocket assembly Let C1 k be the cost of an optimal assembly sequence to assemble pieces 139 C03 k mung 14m 13 CJ391k f0 JL k k word segmentation Let C1 be the quality of an optimal segmentation for the string containing letters s1 s2 s C i maxo 1171 C 139 qualitys 11 S 12 st printing neatly Let C1 be the penalty of an optimal layout for words 139 n We want to compute C1 C1 min C0 1 L 21 wkf Where the minimum is taken over all j suchthat 1 5 j 5 n l and 2 wk 5 L Initial condition C1 0 if 22 wk 5 L 1 Parallel and Distributed Computation Parallel computation Tightly coupled processors that can communicate al most as quickly as perform a computation Distributed computation Loosely couple processor for which communication is much slower than computation 2 PRAM Model A PRAM machine consists of m synchronous processors with shared memory This model ignores synchronization problems and communication issues7 and concentrates on the task of parallelization of the problem One gets various variations of this model depending on how various processors are permitted to access the same memory location at the same time 0 ER Exclusive Read7 only one processor can read a location in any 1 step 0 CR Concurrent Read7 any number of processors can read a location in a step 0 EW Exclusive write7 only one processor can write a location in any 1 step 0 CW Concurrent Write7 any number of processors can write a location in a step What it 2 processors try to write different values 7 Common All processors must be trying to write the same value Arbitrary An arbitrary processor succeeds in the case of a write con ict 7 Priority The lowest number processor succeeds We de ne Tn7 m to be parallel time for the algorithm under considera tion with m processors on input of size 71 Let Sn be the time complexity of the best known sequential time algorithm for a particular problem Then the e ciency of a parallel algorithm is de ned by 571 Enm W Ef ciencies can range from 0 to 1 The best possible ef ciency you can have is 1 Generally we prefer algorithm whose ef ciency is The folding principle states that you can always use fewer processors and get the same ef ciency 3 OR Problem INPUT bits b17 by OUTPUT The logical or of the bits One can obtain an EREW Algorithm with Tn7 n logn using a divide and conquer algorithm that is perhaps best understood as a binary tree The leaves of the binary tree are the bits Each internal node is a processor that OR7s the output of its children The ef ciency of the EREW Algorithm is n 1 nlogn logn One can obtain an EREW algorithm for the OR problem with En7m nlog n 917 by partitioning the input into 10 partitions of logn bits each7 and precomputing the OR of the bits in these partitions One can also obtain a CROW Algorithm with Tnn 91 In this algorithm each processor Pl sets a variable answer to 07 then if bl 17 Pl sets answer to 1 The ef ciency of this algorithm is E01771 91 4 MIN Problem See section 10217 and section 1022 lNPUT lntegers 1 2 OUTPUT The smallest The results are essentially the same as for the OR problem There is an EREW divide and conquer algorithm with En7n log n 1 Note that this technique works for any associative operator both OR and MlN are associative There is an CROW algorithm with Tn7 m n2 1 and En7 m n2 171 Here7s code for processor PM 1 S 2397j S j for the CROW algorithm to compute the location of the minimum number If Xi lt Xj then Tij 1 else Ti jO Andi1 If Ti j 0 then Andi0 If Andi1 then Answer i What happens when the above code is run in the various CW models if there are two smallest numbers What happens in the various CW models if there are two smallest num bers and you just want to compute the value of the smallest number7 that is if the last line is changed to If Andi1 then Answer Xi 5 Parallel Pre x Problem lNPUT lntegers 1 J Let Sl 21 xi OUTPUT 17Sn We give a divide and conquer algorithm Solve the problem for the even as and odd as separately Then 52139132i71242i and 52121 1 3 2121 2 4 2122 This gives an algorithm with time Tnn logn on EREW PRAM This can easily be improved to Tnnlog n log 717 thus giving 91 ef ciency Note that divide and conquer into the rst half and last half is more dif cult because of the sum for the rst half becomes a bottleneck that all of the last half of the processors want to access Also note that this technique works for any associative operator 6 All Pairs Shortest Path Problem lNPUT A directed edge weighted graph G7 with no negative cycles OUTPUT The length of the shortest path Dz j between any pair of points 239 and j First consider the following sequential code 3 For i 1 to 11 do For j1 to 11 do Dijweight of edge i j Repeat log n times For i 1 to 11 do For j1 to 11 do For m1 to 11 do DijminDij DimDmj The correctness of this procedure can be seen using the following loop invariant After t times through the repeat loop7 for all 239 and j7 if the length of the shortest path between vertices 239 and j that has 2 or less edges is equal to Dlz397j Note that the outer loop is de nitely not parallizable Lets try to parallize the innter 3 loops to see what happens Repeat log n times For all ijm in parallel DijminDij DimDmj But note that this would require a concurrent write machine that always writes the smallest value So we try Repeat log n times For all ijm in parallel Tim jminDij DimDmj DijlminT11jl T1njl This runs in time T017713 logzn on an CREW This gives an ef ciency something like Enm n3 nil713 log2 n llogz n 7 OddEven Merging See section 1021 INPUT Sorted lists 1 mm and y1 7y OUTPUT The two lists merged into one sorted list 217 22 We give the following divide and conquer algorithm Mergex1 Xn y1 yn Mergex1 X3 X5 y2 y4 y6 to get a1 an Mergex2 X4 X6 y1 y3 y5 to get b1 bn for i1 to 11 do z2i 1minai bi z2i maXai bi This can be implemented on an EREW PRAM to run in time Tn7 n logn thus giving ef ciency 1log The following argument establishes the correctness of the algorithm Each ai is greater than or equal 117 ai Each 11 239 gt 1 is larger than bi1 Hence7 ai 2 22 1 Each bl is greater than or equal 11bi Each bi 239 gt 17 is larger than ll1 Hence7 bl 2 22 1 This same argument shows ll1 2 ZZZ1 and bi 2 ZZZ1 So ZZZ1 and 221 must be ai and bi 8 OddEven Merge Sorting See section 1021 We give the following divide and conquer algorithm Sortx1 Xn MergeSortX1 Xn2 Sort Xn2 Xn This can be implemented on an EREW PRAM to run in time Tn7 n log2 71 thus giving ef ciency 10 1 log nlog2 n 9 Pointer Doubling Problem In the pointer doubling problem each of the 71 processors is given a pointer to a unique element of a singly linked list of 71 items The goal is for each processor to learn its location in the linked list7 eg the the processor with the 17th element in the list should know that it is 17th in the list for i 1 to n 1 in parallel do dil1 d n 0 repeat log n times for each item 1 in parallel if next 1 nil then didi dnexti next 1 next next 1 The correctness of the code follows from the following loop invariant The position ofz39 equals di dnext 1 dnext nexti Note that this is essentially solving the parallel pre x problem7 with the work done before the recursion instead of after To solve the parallel pre x problem we would replace the initialization for i 1 to n 1 in parallel do dil1 by for i 1 to n 1 in parallel do dilxil 10 Eulerian Tour Technique to Find Tree Depths The input is a binary tree with one processor per node Assume that each processor knows the location of one node The problem is to compute the depth of each node in the tree We show by example how to reduce this to pointer doubling From the following tree We create the list the rst line 11 11 1 11 1 A B D B E B A C A and call pointer doubling with dlil initialized to either 1 or 1 appropriately The depth of a node is then computed by looking at the sum up to the point shown in the second line 11 Expression Evaluation The input is an algebraic expression in the form of a binary tree7 with the leaves being the elements7 and the internal nodes being the algebraic oper ations The goal is to compute the value of the expression Some obvious approaches won7t work are 1 Evaluate nodes when both values of children are known7 and 2 parallel pre x The rst approach wont give you a speed up if the tree is unbalanced The second approach wont work if the operators are not be associative First assume that the only operation is subtraction We label edges by functions We now de ne the cut operation If we have a subtree that looks like I hX fX gx constant c A B and cut on the root of this subtree we get I hfx gc If we have a subtree that looks like I hX fx gx constant c A B and cut on the root of this subtree we get I hfc gx Thus we are left with nding a class of functions7 with the base elements being constants7 that are closed under composition7 subtraction of constants7 and subtraction from constants This class is the functions of the form as b7 for realrational a and b Note that in one step we can apply cuts to all nodes with an odd numbered left child that is a leaf Note that in one step we can apply cuts to all nodes with an odd numbered right child that is a leaf This leads to the following algorithm Repeat log n times For each internal node v in parallel if v has odd numbered left child that is a leaf then cut at v if v has odd numbered right child that is a leaf then cut at v renumber the leaves using pointer doubling Note that in logn steps we will down to a constant sized tree since each iteration of the outer loop reduces the number of leaves by one half So Tn n logn 12 A Problem that is Hard to Parallelize No one knows a fast parallel algorithm for the following problem known as the Monotone Circuit Value Problem INPUT A Boolean formula or circuit it doesnt matter F consisting of the connectives AND and OR and a truth assignment A for the variables in F OUTPUT 1 if A satis es F that is makes F true and 0 otherwise More precisely no one knows of a parallel algorithm that runs in time 0l09kn for some k with a polynomial number of processors Here n is the size of the formula
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'