Analysis of Algorithms
Analysis of Algorithms ECS 222A
Popular in Course
Popular in Engineering Computer Science
This 13 page Class Notes was uploaded by Ashleigh Dare on Tuesday September 8, 2015. The Class Notes belongs to ECS 222A at University of California - Davis taught by Staff in Fall. Since its upload, it has received 52 views. For similar materials see /class/191688/ecs-222a-university-of-california-davis in Engineering Computer Science at University of California - Davis.
Reviews for Analysis of Algorithms
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 09/08/15
Discussion 3 Dr Nina Amenta Thursday January 27 ECS 222A Winter 2005 Hash Tables and Tries Since hash tables and tries are the current topics in lecture we will brie y introduce su x trees a partic ular type of trie and demonstrate an application utilizing that data structure We will work an example concerning hash tablesi Su ix Trees A suffix tree is a trie that contains all of the suffixes of a given string in lexicographic order For example consider the string cat note this is not the world s most interesting stringi Mississipi is the classic choice but the tree is very big ln lexicographic order the suffixes are at cat ti The dollar sign indicates the end of the string cat at cat t Note this is a really lousy drawing of a suffix tree The labels cat etc are actually meant to be on the edges and the nodes are labeled with the suffix number like 1 for cat 2 for at 3 for t and 4 for i For a given string of length n a suffix tree can actually be constructed in linear time To read more about this result look at the MCCREIGHT algorithmi The timing analysis is very easy to follow see Su x Trees and Su x Arrays by Aluru but it is too long to present in section Su ix Tree Example Given a circular string 5 we are asked to nd the lexicographically least linearization arbitrary cut of S in On time where n l For example in trusted mississippi the lexicographically least linearization would be imississippii Our algorithm is as follows H Make an arbitrary cut in the string Double the string Build a suf x treei F90 DFS our suffix tree in lexicographic order The rst string of length m that we nd is the lexicograph ically least linearizationi This algorithm runs in On time because each of the steps takes at most Cutting and doubling the string are clearly linear in the size of the string Building the suffix tree is 0n using known algorithms and a DFS of a suffix tree is also We justify the 0n bound for the DFS by noting that there are n 1 leaf nodes corresponding to each of the n l suffixes including the endofstring character and therefore 2n is a very generous upper bound for the total nodes in tree because each internal node has at least one child Since our graph is a tree there are at most V7 1 edges in the tree and thus our wellknown running time OV E for DFS is clearly CLRS 1151 Let pnm be the probability that there are no collisions when inserting n keys into a table of size ml We want to nd a relation between n and m that minimizes collisionsi For this problem we make two assumptions about our hash function 1 The hash function is universali Thus Prhz S lmi Discussion 1 Dr Nina Amenta Thursday January 12 ECS 222A Winter 2005 Asymptotic Notation We begin by stating a few useful definitions 1 1f gn then 3 positive constants 01027L0 such that 0 S 019n S S 029n for all n 2 not E0 If Ogn then 3 positive constants 0 no such that 0 S S 0gn for all n 2 not CA3 1f Qgn then 3 positive constants 0 no such that 0 S 0gn S for all n 2 not F If 0gn then 3 positive constants 0 no such that 0 S lt 0gn for all n 2 not 5 1f Qgn then 3 positive constants 0 no such that 0 S 0gn lt for all n 2 not Proof by Substitution It is easy to complete a proof by substitution and a bit harder to ascertain if you have found the best possible upper bound For example we will show that TnS an T910n is Onlogn which is not the best boundi In order to prove this we will apply the substitution method which is basically inductioni We will assume that the asymptotic running time bound holds for small n assume it is true for all n S n and then show that it is true for all n gt n i Thus in this case we assume Tn S 0n log n Tn an T910n 1 S an 09n10 log 9n10 2 an 09n10 log 910 log n 3 S an 09n10logn 4 S anlogn 09n10 logn 5 S nlog na 9100 6 But what we want is to show that Tn S 0n log n Thus we want a 9100 S 0 7 a S 010 8 10a S 0 9 Thus we have bounded 0M as long as 0 2 10a then Tn Onlog But this is not the sharpest upper bound Now using the same methodology we show that Tn Tn an T910n 10 S an 09n10 11 na9100 S 0n 12 a9100S 0 13 a S 010 14 10a S 0 15 Thus again we have bound By choosing 0 2 10a then Tn Note when doing proofs by substitution it is always important to remember the precise goali For example consider this INCORRECT PROOF that Tn an 2Tn2 is Tn an2Tn2 16 S an2cn2 17 S a cn On 18 This is an absolutely true statement a cn However what we needed to show was that a cn S on for some choice of or This will never be true When we assume that Tn Onlogn the correct proof is as follows Tn an 2Tn2 19 S an 2cn2logn2 20 an cnlogn 7 log2 21 an cnlogn 71 22 an cnlogn 71 S cnlogn 23 aclogn71 S clogn 24 a7cclognSclogn 25 a S c 26 Thus this equation is true for all c 2 a Thus Tn an 2Tn2 is Onlog Random Search The purpose of this example is to dervive some classic results in a very simple and clear way In RANDOM SEARCH we always have an array of size n and we are looking for a particular element 1 in that array or an index i such that z a For this analysis let us assume that we choose and replace In other words we choose a number i at random from between 1 and n each time we toss the elements into a bag and pluck them out one by 1 returning them each time If I we return else we continue to choose and replace elements In order to analyze the expected running time of our algorithm let Z be a random variable such that Z equals the number of indices picked before I is found Since we allow indices to be choosen more than once EZ can be calculated as follows EZ Z ZPTZZ 27 Em11PTZ12PTZ23PTZ3 2s EZ12 T 13 T 1 30 EZZj1n1 31gt We can simplify the expression for EZ by using the following result 21111 I Z N 32 390 7 33 171 i 1H1 0 011 34 x H o x 1 7 35 17 If lt gt Therefore the can be written in the following way 1 1 E Z 7 36 l l n 17 lt gt Em 42 37gt Wotan 1 1 E Z if 38 l l n El lt gt EZ n 39 This ia a bit surprising Searching an array of size n deterministically takes 0n time of course But now we see that searching at random takes expected time 0n as well b Now nd the expected number of indices picked when each indice is inspected only once In other words we toss our elements into a bag choose them 1 by 1 and discard each time if f I Let Z be a random variable such that Z equals the number of indices picked before I is found EZ can be calculated as follows EZ1PTZ12PTZ23PTZ3nPTZ7L 40 1 n71 1 n71n72 1 n71 mi 1 Em 39fg39T39ni g39 n 39n4139n42 39 39 n mirm 41 1 1 1 1 EZ123n 42 1 EZ712n 43 71 nn1 ElzligT 44 n1 EZT 45 This is to be expected In order to nd the desired element we have to look at about half c For this analysis we assume that z is not unique in the array In fact we assume that the array A contains Is Is As in part a we choose and replace Again let Z be a random variable such that Z equals the number of indices picked before I is found Since there can now be multiple elements k in the array that equal 1 we modify the expression for EZ in the following way and use the result from part a to simplify EZ1PTZ12PTZ23PTZ3 46 EZ 12 k3lt kgt 48 E121 Zltj1gtlt kgt 49gt k 1 50 k 1 121272 51gt 6 EZ 52 Again7 this is to be expected d In order to determine the expected number of indices picked until the entire array has been searched7 de ne the following variables Let Z be a random variable such that Z equals the number of indices picked until the entire array has been searched Let equal the number of indices picked since the i th indice was found and until the 1th indice is found Therefore ZY0Y1Y2quot39Yn4 53 ElzlEYoY1Y2Yn1 54 EZ ElYo EY1 EY2 EYn1 55 We can calculate the directlyi Remember that is the number of tries to nd the 1th indice after we have already found 239 indices Em1PTm12PT1223PT123 56 lt51 E1321 T ilt12g3lt gt gt 58gt Em 20 0G 59 nii 1 EYi 72 60 11 n 17 H nii 1 EYZ 2 61 11 n w H Eliib 62gt Now we can calculate the However7 a closed form of this particular solution is not known to me at this time and so I will leave it in the unsimpli ed form EZ 7 63 Discussion 4 Dr Nina Amenta Thursday February 3 ECS 222A Winter 2005 Single Source Shortest Paths This problem is de ned as follows given a graph G we want to nd a shortest path from a given source vertex 8 E V to each vertex v E V We have a plethora of algorithms available to us for solving this problem and each one runs only under a certain set of conditions The different algorithms are described in detail in CLRS Chapter 24 but for the purposes of this course we are only interested in DIJKSTRA7S algorithm 0 DIJKSTRA This algorithm takes as its input a weighted directed graph where all of the edge weights are nonnegative The algorithm is very similar to MST PRIM and also uses a priority queue The algorithm runs in OE log V and depends on the implementation of the priority queue If a Fibonacci heap is used the running time can be improved to OVlogV DIJKSTRAGws 1 lNITIALIZE SINGLE SOURCE 2 SHQ 3 Qev 4 whileQa Q 5 do u e EXTRACT MINQ 6 S lt7 S U 7 for each 1 E Adj 8 do RELAXuvw RELAXuvw 1 if dv gt d u wuv 2 then dv du wuv i u u lNITIALIZE SINGLE SOURCEGs 1 for V1 2 do dv lt7 oo 3 u NIL The running time of this algorithm is calculated as follows The minpriority queue Q is maintained by three priorityqueue operations INSERT implicit in line 3 Olog n EXTRACT MIN line 5 Olog n and DECREASE KEY implicit in RELAX in line 8 Olog Thus we see that the running time of DIJKSTRA7S algorithm is 0V E log V which is OE log V for dense graphs Dynamic Programming Dynamic programming solves problems by combining the solutions to subproblems In order to apply the dynamic programming method to a particular problem the problem must exhibit optimal substructure and overlapping subproblems Longest Increasing Subsequence Consider an unsorted array of integers A31261478 Our goal is to nd the Longest Increasing Subsequence hereafter abbreviated LIS within A In this case increasing subequences are 3 6 7 8 or 1 6 or 11 4 7 8 depending on whether or not the subsequences are strictly increasing Notice that the subsequences are not contiguous meaning that 16 is an increas ing subsequence even though 1 and 6 are not adjacent in the array In this case a LIS is 1 2 6 78 or 11 4 7 8 depending on the convention First consider how dif cult the solution is to calculate by brute force We need to consider not only the larger elements that come after a given element but what order they appear in For the rst element in the array we need to check the n7 1 other elements and the determine all other possible increasing subsequences that start with the rst element which may or may not be the optimal starting point Thus the brute force method clearly takes exponential time But dynamic programming comes to our rescue First we introduce a function which will come in very handy 607 1 ifAlz Am 0 otherwise 1 Next we will de ne our data structure We will create an array L of n elements such that is equal to the size of the LIS that terminates in Clearly L1 1 because the A1 will always be a LIS of size 1 This is our base case We ll the array according the following recursive formu a LU 19560ij le 1 2 To summarize we simply walk the array from the beginning up to element AM and take the maximum possible subsequence that terminates in element For our example we ll the array L in this manner L11232345 To nd our maximum we simply walk our array and take the maximum which in this case is 5 To nd our solution we can simply walk backwards through our list and recreate the solution or store a pointer to the previous element in another list In either case this algorithm runs in On2 3Partition Given integers a1 an we want to determine whether it is possible to nd a partition of elements 1 n into three disjoint subsets I JK Q 1 n such that 1 n 2azajak ai lt3 i6 je keK 11 For example 1 2 3 4 4 5 8 is a YESinstance because there is the partition 1 8 4 5 2 3 4 while 2235 is a NOinstance because 2 2 3 5 12 and there are not 3 disjoint subsets that sum to 4 Consider the following dynamic programming solution to this problem On input a1 an let A 13E1ai and de ne a boolean matrix M of size A 1 X A 1 X n1 with the meaning that Mz y k is true if and only if there are two disjoint subsets I J E 1 k such that 216 ai z and Ejej aj y Once we construct the matrix the answer to the 3 PARTITION problem is in the entry MA A The recursive de nition is Mlzyyyil Mlzyyyiill VMlzyyiamill VMlziahyyiill 4 with base cases M000 1 and Mz y0 0 for z y gt 0 If we index off the table we count that as a false value There are 0A2n entries in the matrix each of which can be lled in 01 time Thus this algorithms takes 0A2n time Here is a very simple example on the set of integers 1233 Since the sum of this set of integers is 9 a 3 PARTITION would be a subset that sums to 3 eryyyll eryy l eryy73l eryy74l Notice that the matrixes are all symmetric around the diagonal and also that once a cell is set to true it will remain set to true throughout the duration of the algorithm Thus we can see that we can easily use only 0A2 space and that we only need to ll in the upper triangle of the matrix However our overall running time will still be 0A2n Subset Sum SUBSET SUM is a very simple variation of the knapsack problem However it gures prominently in the NPCompleteness reduction tree and is therefore worth looking at in its own right The problem is de ned as follows Given an array A is it possible to nd a subset that sums exactly to a bound B Notice that this is a decision problem meaning the answer is yes or no A3245371310611 Consider the above array A In this case is it possible to nd a subset of A that sums to exactly 17 The answer is yes because the sum of 4 13 17 and therefore 413 is such a subset What about 19 The answer is again yes 2 710 What about 22 or 26 After we derive the dynamic programming recursive formula run it yourself and see We will de ne an n X B matrix where n is the number of elements in our array and B is the desired sum We de ne the cell Mij to contain 1 if it is possible to nd a subset of the integers 1 through i that sum to exactly j and 0 otherwise Again as in knapsack we have a Oindex column but this time we initialize that column to 1 meaning that it is always possible to have a subset that sums to 0 Therefore the following recursive formula presents itself Mz j MAXMi71jMi71j7Ai 5 Exactly as in the knapsack algorithm we justify this solution accordingly We check if we can get the sumj by not including the element 239 in our subset and we check if we can get the sum j by including i by checking if the sum j 7 Ai exists without the ith element This is identical to knapsack except that we are only storing a yesno answer Note that the base case of initializing the Ocolumn to 1 allows us to always include the element because when Ai j then 7 1j 7 Ai 7 10 1 Consider the following example to determing whether or not the above array A contains a subset that sums to 5 0 1 2 3 4 5 0 1 0 0 0 0 0 3 1 0 0 1 0 0 2 1 0 1 1 0 1 4 1 0 1 1 1 1 5 1 0 1 1 1 1 M 7 3 1 0 1 1 1 1 7 1 0 1 1 1 1 13 1 0 1 1 1 1 10 1 0 1 1 1 1 6 1 0 1 1 1 1 11 1 0 1 1 1 1 We nd our solution by checking cell Mn B in this case M115i 1f Mn B 1 then it is possible to nd a subset of A that sums to B If not then no subset is possible To nd our optimal subset we simply recurse backwards through the matrix For example suppose that MnB 1 We want to know if An is included in our set and so we check the two cells Mn 71 B and Mn71 n7 Anli 1f Mn7 1 B 1 then we do not include An in our set and we continue to recursei 1f Mn 7 1 n 7 An 1 then we do include Ani If both equal 1 we can choose which subset to take There are n X B cells in our matrix and therefore this algorithm runs in 0nB timer Notice that this is not polynomial because our running time depends on two variables and one could easily be an exponential function of the other This is consistent with the result that SUBSET SUM is NPCompletei Discussion 7 Thursday March 10 Dr Nina Amenta ECS 222A Winter 2005 Summary These notes contain two examples of primals and duals as well as two more example reductions as well as hints for homework problem 1c Duality We will look at the formal de nition of taking the dual of a linear program and then show an easy example The original problem is called the pn39mal problem and is stated as follows Maximize Subject to n Zai jgbi i12m j1 szO jl2n The dual of the above primal given in standard form is de ned as Minimize in Z biyi i1 Subject to m ZaijinCj j1727wn i1 yizo i12mmgt Consider the following linear programming problem Maximize 2 211 312 Subject to 311 212 S 2 11 212 S 5 411 12 S 1 11712 2 0 We can more clearly see how to take the dual by expressing this program in its matrix form 1 2 3 4 5 6 7 8 9 10 11 Maximize 223 12 Subject to 3 2 2 712 s 13gt 4 1 2 1 1112 2 0 14 Then we can see that the dual problem becomes Minimize yi 2 2 51 w 15 93 Subject to 91 3 71 4 2 2 211g2 3131 16 93 91792ny 2 0 17 Now we will look at another example of a linear program and its corresponding dual but this will be a 77real life77 example and a 77real life77 dual A hospital menu consists of two foods A and B each of which contains certain amounts of fats carbohydrates and proteins Constituent A B Fat 1 4 Carbohydrate 3 2 Protein 5 3 It is necessary that each hospital meal provide at least 12 units of fat 15 units of carbohydrates and 22 units of protein Each ounce of food A costs 30 cents and each ounce of food B costs 25 cents and obviously the hospital wants to minimize its expenses and yet still provide all of its patients with the appropriate amount of nutrition Thus this gives rise to the following linear program Minimize 2 301A2013 18 Subject to IA4IB 212 19 31A2IB 215 20 5IA3IB 222 21 IA 13 2 0 22 Now we will think of ourselves as directors of a rm selling arti cial foods F C and P obviously indicating Fats Carhybohydrates and Proteins where one unit of F provides 1 unit of fat etc As directors of this rm we want to provide the same nutritional value as foods A and B and yet not exceed the costs of those foods We also want to maximize our pro ts Thus based on these constraints we can formulate a linear program which just happens to be the dual of the above primal Maximize 2 12yF 15yc 22M 23 Subject to yF 390 5yP S 30 24 49F 290 3yP S 25 25 yFyycny 2 0 26 Now we see the canonical economic relationship between duals and primals minimizing expenses for a buyer is the dual of maximizing pro ts for a seller This example is taken from Linear Programming by Calvert and Voxmani Bonnie and Clyde Bonnie and Clyde have just robbed a bank They have a bag of money and want to divide it up For each of the following scenarios either give a polynomialtime algorithm or prove that the problem is NP COMPLETEi The input in each case is a list of the n items in the bag along with the value of each a There are n checks which are in an amazing coincidence made out to 77 Bonnie or Clydelli They wish to divide the checks so that they each get the exact same amount of money Solution NP COMPLETE o NP This problem is clearly in NB The witness algorithm consists of the bag of checks and an assignment of each of the checks to Bonnie or Clydei Then simply sum Bonniels and Clyde7s checks and verify that the sum is the same This is clearly polynomial time o NP HARD Even though SET PARTITION is a trivial reduction we will reduce from SUBSET SUM just to be dif cult An instance of SUBSET SUM is a set of integers X and a bound B where in a yes instance X contains a subset of integers that sums to exactly B Thus to create an instance of BONNIE AND CLYCE CHECKS BNCC we let T lnzi and let each integer zl correspond to a check of value Ii and then we add one more check zn1 such that 7 If B 2 T2 then choose a c such that T c 2B and let zn1 c 2B 7 Ti 7 If B lt T2 then choose a c such that B c T c2 and let zn1 c T 7 2Bi This is clearly a polynomial time conversion algorithm since we are adding only one variable and performing only a few trivial arithmetic operations Thus the proof 7 gt Assume X is a yes instance of SubsetSumi Thus there exists a subset of X that sums to exactly B 96 If B 2 T2 then all of the integers that sum to B are on one side of the partition and c is on the other giving T c 7 B T 2B 7 T 7 B B and the remaining integers in the set plus 5 will sum to B and thus there exists a partition 9 If B lt T2 then c is on the same side of the partition as all of the other integers that sum to B and B c B T 7 2B T 7 B and thus a partition exists 7 lt Assume BNCC is a yes instance Then there exists a partition and by de nition 5 will either be on one side or the other 96 If c 2B 7 T then T c T 2B 7 T 2B and both sides of the partition must sum to B Thus the other side of the partition will consist only of integers in X and will sum to B 96 If c T7 2B then Tc TT7 2B 2T7 B and each side of the partition must sum to T 7 B Thus 5 B T 7 2B B T 7 B and the other elements on the 5 side of the partition are all elements from the original X and sum to i Thus a yes instance of SUBSET SUM maps to a yes instance of BNCC and vice versa Thus since BNCC E NP and BNCC E NP HARD then this implies that BNCC E NP COMPLETEi i There are n checks as above but this time Bonnie and Clyde are willing to accept a split in which the difference is no larger than 100 Solution NP COMPLETE This time we do the trivial reduction from SET PARTITION only when we convert to the corresponding Bonnie and Clyde problem we multiply I by 1000 Thus if the algorithm produces a solution that differs by only 100 then the two solutions must be equal because if we divide by 1000 we get the difference in the original problem
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'