### Create a StudySoup account

#### Be part of our community, it's free to join!

Already have a StudySoup account? Login here

# Data Struct&Algor EECS 281

UM

GPA 3.8

### View Full Document

## 29

## 0

## Popular in Course

## Popular in Engineering Computer Science

This 399 page Class Notes was uploaded by Ophelia Ritchie on Thursday October 29, 2015. The Class Notes belongs to EECS 281 at University of Michigan taught by Sugih Jamin in Fall. Since its upload, it has received 29 views. For similar materials see /class/231525/eecs-281-university-of-michigan in Engineering Computer Science at University of Michigan.

## Reviews for Data Struct&Algor

### What is Karma?

#### Karma is the currency of StudySoup.

#### You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 10/29/15

Outline Last time 0 Intro what this course is about 0 Binary search and simple backoftheenvelope performance analysis Today 0 Operation Count 0 FindMax ComputeRank Sugih Jamin Gamineecs umich edu Mirror mirror on the wall Given two algorithms how do you determine which is more efficient Runtime computation name search Population size l Linear l Binary EECS 281 100 10 ms 07 ms lookup 7 names UM 40000 4 secs 15 ms Washtenaw 35 secs 18 ms MI 9 mil 15 mins 23 ms US 290 mil 8 hours 28 ms World 6 billion 7 days 33 ms Assume can search 10 namesms An 8year old may take longer Sugih Jamin lamineecsumich edu Time Complexity Operation Count Using wallclock time to measure complexity can be tricky 0 depends on cpu speed 0 depends on cpu oad 0 but we ll come back to this later Given the uncertainty of wallclock based measurement we use instead operation count as performance measure a pick one operation ADD SUB MUL DIV CMP LOAD STORE that is performed most often 0 count the number of times the operation is performed Sugih Jamin lamineecsumich edu Roofing Problem A builder can carry two shingles on his shoulder as he climbs up the ladder He then climbs down and carries two more shingles up the ladder Each round trip up and down the ladder costs 2 Questions 1 What is the cost to build a tree house 8 shingles 2 What is the cost to build a shed 128 shingles 3 What is the cost to build a house 2048 shingles Answers a 8 shingles 2 i82i 40 o 128 shingles 2 1282 2 128 o 2048 shingles 2 20482 2048 What is the ONE operation we counted in this case Sugih Jamin lamineecsumich edu Operation Count Example Find Max Given an array of N elements find the index of the largest element Questions 0 What s the fixed cost of the algorithm 0 Which operation should we count to compute the variable cost 0 How many times is this operation executed in the worst case Sugih Jamin lamineecsumich edu What is the Time Complexity of this Function Vold f1rit N mt p p an array N mtegers 111t 1 3 k 1 for 1 o 1 lt N 1 113 u gt1 01ltN1 r k 1 k lt N j J p313 return 5mmquot Umineecs m an Four Operation Count Accounting Rules Rule 1 Consecutive Statements 81 82 S3 SN The runtime R of a sequence of statements is the runtime of the statement with the max runtime maxRSl R092 RSN Rule 2 Conditional Execution if 31 SQ ese S3 The runtime of a conditional execution is maXRSl R092 RSs Sugih Jamin Gamineecsumich edu Four Accounting Rules cont Rule 3 IterationLoop for 81 82 S3S4 The runtime of a loop with n iterations is maxRSl n gtllt R82 n gtllt RS3 n gtllt RS4 Rule 4 Nested Loop for 31 S2 S3 for S4 S5 36 S7 Count inside out The runtime is the max of the runtime of each statement multiplied by the total number of times the statement is executed Sugih Jamin Gamineecsumich edu Aside Sum What is the sum of all integers from o O to 4 inclusive o 1 to 25 inclusive o 3 to 12 inclusive Sugih Jamin Gamineecsumich edu Operation Count Example Compute Rank Given an unsorted array a of N elements compute an array r where rz is the rank of az39 in a The rank of an element is defined as the number of all elements in the array that is smaller than it plus the number of elements to its left that is equal to it Example a4 3 9 3 7 gives r2 04 1 3 Time Complexity of rank how many CMPs in total Sugih Jamin lamineecsumich edu Aside Pseudocode o highlevel description of an algorithm 0 more structured than English prose 0 less detailed than a program 0 hides program design issues 0 mathematical formatting allowed 0 special notations for assignment for equality testing 0 To ease grading you MUST use this syntax whenever pseudocode is asked for in this course Sugih Jamin lamineecsumich edu Outline Last time 0 Finished up hashing o Binary search divideandconquer o Recursive function and recurrence relation Today 0 Tree Terminology 0 Tree Traversal o N ary Tree Sugih Jamin lamineecsumich edu Divide and Conquer Divide and conquer doesn t work with linked list unfortunately So why use linkedlist at all How can we speed up search and use dynamically allocated space Unsorted dictionary Hash table Sorted dictionary Tree Sugih Jamin lamineecsumich edu Trees Arboreal examples birch cherry maple oak pine plum willow Sugih Jamin Gamineecsumich edu Organizational Examples Orgchart CEO I VP Product VP VP Development Marketing Sales lBedrooml N America l Europe Asia Bathroom Sugih Jamin lamineecsumich edu Organizational Examples cont Family tree I grandparent I I uncle I I parent I I1stcom Ibrother I You I sisterJ I2nd cousin niece so I I daughter nephew I I grandchild I Sugih Jamin lamineecsumich edu Organizational Examples cont Object inheritance array I vector I I I stack I IBSTI IHeapI I Trie kdtree I IBTree I Note no multiple inheritance or it won t be a tree Sugih Jamin lamineecsumich edu What is a Tree Tree a set of nodes storing T5 elements in a parentchild re lationship such that a a there is one root the top most node 0 the root node has no par ent 0 all other nodes have ex actly one parent 0 a o parentchild relationship is denoted by direct link in tree a 0 there is a unique path from one node to another Sugih Jamin Gamineecsumich edu Tree Terminology T a r root node of tree T k is a child of r k is a parent of m m is a grandchild of r j 13 z are siblings of m j l n 13 z are leaf nodes degree of a tree max l a number of children each 2 node can have this tree is of degree 2 a binary tree Sugih Jamin Gamineecsumich edu Tree Terminology cont r root node of tree T Tr right subtree of r Tl left subtree of r a path the set of nodes visited to get from a node higher up on the tree to a node lower down 0 path from r to l is 19771 l o the path length of r to l is 3 hops o path length may be 0 i i is a path Sugih Jamin Gamineecsumich edu Tree Terminology cont T o ancestor k and m are ancestors of n there is a path from k to m 0 and m to n 0 each nodez is its own ancestor 0 o z is a proper ancestor of j if the path length from 139 to j is not 0 o descendant m and n are de 0 a scendants of k there is a path from k tOm and kto n c j is a proper descendant of 139 if a the path length from j toz 7E O Sugih Jamin lamineecsumich edu Tree Terminology cont o depth of node 139 is the length of path from the root to z o depthr 0 depthl 3 0 all nodes on a level of the tree T have the same depth the root is 0 at level 0 o the depth of a tree is the max 0 depth of all nodes this tree is of depth 3 0 height of node 139 is the longest 0 a path from 139 to a leaf node 0 heightl heightz 0 0 heightm heighty 1 0 heightk 2 heightr 3 Sugih Jamin lamineecsumich edu Binary Tree Binary tree every node has 0 1 or 2 children Proper binary tree every node has 0 or 2 children Perfect binary tree every level is fully populated Complete binary tree every level except the lowest is fully populated the lowest level is populated left to right Sugih Jamin lamineecsumich edu Binary Tree Representation A binary tree can be represented either as a linked structure a as an ordered set 7 167 339 my 1 NH 317 3 Z a or as an array starting at index 1 1 r k y j m X Z I For a binary tree a node at 7 index 139 has its children at which indices A node at index 139 has its parent at which index Sugih Jamin Gamineecsumich edu Example Use of Binary Tree Expression Tree Encode 01 c de as an Tree traversals expression tree a depth first 0 preorder visit node Tl TV 0 o postorder visit Tl Tr node 0 inorder visit Tl node TV 0 a a breadth first level by level Sugih Jamin Gamineecsumich edu Binary Tree Traversal Expression Tree 01 c de Print the expression tree in the nor mal order 0 cow c c0 em Which kind of tree traversal will you 0 G need a a Give me a recursive and an iterative implementation Sugih Jamin lamineecsumich edu N ary Trees A tree may be empty External node an empty node with no children Internal node a node with children Leaf node an internal node with only external nodes as children How many external nodes does an N ary tree with n internal nodes have Sugih Jamin lamineecsumich edu Characteristics of N ary Trees How many external nodes does an N ary tree with n internal nodes have binary tree tertiary tree 4ary tree 0712011 0712011 0712011 0712112 0712113 on114 0712213 0712215 0712217 Every new internal node replenishes one external node and brings with it N 1 new external nodes For n internal nodes we have nN 1 1 original external nodes For binary tree n internal nodes means n 1 external nodes gt max 712 leaf nodes How many internal nodes does an N ary tree with m external nodes have Sugih Jamin Gamineecsumich edu Outline Today a Review of PNP and 2 approximate solution to TSP 0 Dynamic Programming 0 01 Knapsack o Travelling Salesperson Sugih Jamin lamineecsumich edu P NP and NPComplete If there s an algorithm to solve a problem that runs in polynomial time the problem is said to be in the set P If the outcome of an algorithm to solve a problem can be veri ed in polynomial time the problem is said to be in the set NP nondeterministic polynomial the nondeterminism refers to the outcome of the algorithm not the verification There is a set of problems in NP for which if there s a polynomial solution to one there will be a polynomial solution to all The set is called NPComplete Sugih Jamin lamineecsumich edu NPComplete NPHard If you can show that a problem is equivalent can be reduced to a known NPComplete problem you may as well not try to find an efficient solution for it unless you re convinced you re a genius If such a polynomial solution exists P NP It is not known whether P C NP or P NP NPhard problems are at least as hard as an NPcomplete problem but NPcomplete technically refers only to decision problems whereas NPhard is used to refer to optimization problems Sugih Jamin lamineecsumich edu Examples of NPComplete Problems Hamiltonian Cycle Problem Traveling Salesman Problem 01 Knapsack Problem Graph Coloring Problem can you color a graph using is 2 3 colors such that no adjacent vertices have the same color Sugih Jamin Gamineecsumich edu A 2 Approximation for a Special Case of TSP If distances satisfy the triangle inequality ie for edges um 2210 Wu in G Cuw Cvw 2 Cmw and G is a complete graph we can build a 2 approximate TSP kapproximate means that the result is off by a factor of is from the optimal 2 approximate TSP 1 construct an MST ofthe graph using Prim s for example 2 construct an Eulercycle ofthe MST 3 replace all edges u v and 12111 for which 1 is visited more than once with edge u 111 Running time OE log V for MST OV for Euler OE log V as opposed to OV V by brute force Sugih Jamin lamineecsumich edu Euler Cycle An Euler cycle in a graph G is a path from v to 1 that contains all of the edges and all of the vertices of G but no repeated edges The Konisbergbridge problem the first graph theoretic problem A graph G has an Euler cycle iff G is connected and the degree of every vertex is even Given an MST to construct the Euler cycle 0 consider each edge as 2 edges in separate directions so each vertex has an even number of edges 0 do a preorder traversal a running time there s one edge per node a tree has at most V 1 edges each nonleaf node is visited 3 times OV Sugih Jamin lamineecsumich edu 2 Approximate Proof Let 0 Tom the optimal TSP minus the last edge which must be a spanning tree 0 M be an MST o E the Euler tour visit every edge in M twice once per direction 0 T the nonoptimal TSP obtained from E Claim T is at most 2 times longer than Tom Proof 0 CT g CE by triangle inequality Cu7w S Cuyv Cuw o CE 2CM o CM g CTopt by definition of MST So CT g 20T0pt Sugih Jamin lamineecsumich edu Dynamic Programming a used when a problem can be divided into subproblems that overlap o solve each subproblem once and store the solution in a table 0 if run across the subproblem again simply look up its solution in the table 0 reconstruct the solution to the original problem from the solutions to the subproblems o the more overlap the better as this reduces the number of subproblems Origin of name Bellman 1957 programming planning decision making by a tabular method dynamic multistage timevarying process Sugih Jamin lamineecsumich edu Devide et impera Divideandconquer o for base cases solve problem directly 0 do recursively until base cases reached 0 divide problem into 2 or more subproblems o solve each subproblem independently 0 solutions to subproblems combined into a solution to the original problem Works fine when subproblems are nonoverlapping othenvise overlapping subproblems must be solved more than once as with the Fibonacci sequence Sugih Jamin lamineecsumich edu Computing the Fibonacci Sequence contd What is a Fibonacci sequence f0 Zoifl lifnzfn 1fn 2an2 2 Recursive implementation Iterative version int int rfibint n ifibint n assume n gt 0 assume n gt 2 return 11 lt 1 39 n int i fn rfibn lrfibn 2 fO Ofl 1 gm fori2ton Running time 92 fi filfi2 Preiss 343 return f n Running time n Preiss 1432 1441 Sugih Jamin Gamineecsumich edu Computing the Fibonacci Sequence contd Why is the recursive version so slow The number of computations grows exponentially Each rfib i z lt n 1 computed more than once Tree size grows almost 2quot Actually the number of base case computations in computing fn is fn Since fn gt quot 1 see Preiss Thm 39 complexity is Qquot Why is the iterative version so fast Instead of recomputing duplicated subproblems it saves their results in an array and simply looks them up as needed Can we design a recursive algorithm that similary look up results of duplicated subproblem Sugih Jamin Gamineecsumich edu Memoized Fibonacci Computation int fibmemon 0 l l l int mfibint n fibmemo assume n gt 0 and left to right evaluation if fibmemon lt O fibmemon mfibn 2 fibmemo mfibn l fibmemo return fibmemon Memoization or tabulation use a result table with an otherwise inefficient recursive algorithm Record in table values that have been previously computed Memoize only the last two terms int rfib2int fn2 fnl n assume n gt 0 return n lt 1 n rfib2fnl fn2fnl n main return rfib20 l n Sugih Jamin Gamineecsumich edu Example of Problems Solved Using Dynamic Programming o AllPair Shortest Path Floyd s algorithm 0 01 Knapsack o Travelling Salesperson Sugih Jamin lamineecsumich edu 01 Knapsack Problem Preiss 1412 1425 handouts You are a poor Mayan trader carying wares on your back in a tattered knapsack back and forth between Chichen Itza and Palenque Each item has both a weight and a pro t The tattered knapsack cannot carry more than its capacity or it will rip Choose a set of items that doesn t rip the knapsack while maximizing your profit Sugih Jamin Gamineecsumich edu 01 Knapsack Problem More formally a n items a wii weight of itemz 0 pi profit for itemz o 0 capacity of knapsack in total weight Let S 1 x2 x3 xn be the selection set xi 1 if item 139 is selected 0 otherwise hence 01 Given 1111 w2 w3 mm and p1p2p3 pn our objective is to maximize 21 19 er subject to constraint 21 wixi g C Sugih Jamin lamineecsumich edu How to Solve It BruteForce BranchandBound what are the bounding factors Greedy DynamicProgramming 0 can the problem be decomposed into overlapping subproblems o is optimal solution to problem guaranteed to consist only of optimal solutions to subproblems Sugih Jamin lamineecsumich edu 01 Knapsack Brute Force Solution Exhaustiver enumerate all feasible solution and pick one with the highest total profit Running time Sugih Jamin Gamineecsumich edu 01 Knapsack BnB Solution Let Sk 331332333 xk k lt n be a partial solution Bounding factors 3k is feasible only if 2le wixi g C If Sk is infeasible so are all solutions that include Sk a total potential profit from a solution that includes Sk is 251 19 er MAXZ Lk1 pi prune branch if maximum potential profit for branch is less than current maximum profit Sugih Jamin lamineecsumich edu 01 Knapsack Greedy Solution Put item into knapsack one by one no removal of item already chosen Possible greedy criterion o by weight lowest first a by profit highest first a by value profitweight highest first Optimal solution not guaranteed see Preiss Tbl 141 Sugih Jamin Gamineecsumich edu 01 Knapsack Dynamic Programming Solution Letn3u51020and1gt5060140C73O Solution table Pz cj cj step increments to C piltuil iH ol 5i 1ol 15 l 20l 25 l 30 50 5 1 0 50 50 50 50 50 50 60 10 2 0 50 110 110 110 110 140 20 3 0 50 Can the problem be decomposed into overlapping subproblems ls optimal solution to problem guaranteed to consist only of optimal solutions to subproblems Sugih Jamin lamineecsumich edu 01 Knapsack Dynamic Programming Solution Letn3u51020and1gt5060140C30 Solution table Pz cj piqui 1H 01 5t 10 15 i 20 25 i 30 5o 5 1 o 50 5o 50 5o 50 5o 60 1o 2 o 50 60 110 110 110 110 140 20 3 o 50 60 110 140 190 200 Subproblems of Pz cj o if w gt cj cannot add item 139 so Pz cj Pz 1 cj o if w g cj considerboth adding and not adding item 139 o if item 139 not added Pz cj Pz 1 cj o if item 139 added Pz cj p Pz 1 cj mi and take the maximal of the two options diff from greedy Sugih Jamin Gamineecsumich edu Running Time of 01 Knapsack Dynamic Programming Solution Running time to compute table 0010 Polynomial Worse than BF Sugih Jamin lamineecsumich edu Refined 01 Knapsack Dynamic Programming Solution Observation do not need to compute every entry of the table Pz cj only needs Pz 1 cj and Pz39 1 Cj wi recursively Running time 1 2 22 2n 1 2 Running time of algorithm OMNnC 2quot Sugih Jamin lamineecsumich edu Dynamic Programming for TSP Observation on TSP Let the optimal shortest TSP path from node 121 be vla vka vkIla vkIQa 39 39 39 avkITw vl The subpath 2 vk1 vk2 vkn 121 must be the shortest path from 22k to 121 that visits each of the other vertices exactly once for 0k1 0k2 0kn7 01 0k2 0kn7 01 etc down to 0kn 01 So what s the problem Sugih Jamin lamineecsumich edu DP for TSP Problem is we don t know which permutation of vim 24641 24642 vkn gives the shortest path We have to try them all out W7 Uk17 Uk27 7 7 7 7Ukn 27Ukn 17Ukn W7 Uk17 Uk27 7 7 7 7Ukn 17Ukn 27Ukn W7 Uk27 Uk17 7 7 7 7Ukn 27Ukn 17Ukn Uk17 W7 Uk27 7 7 7 7Ukn 27Ukn 17Ukn Uk27 W7 Uk17 7 7 7 7Ukn 27Ukn 17Ukn W7 Uk17 Uk27 7 7 7 7 Ukn 27 Ukn7 Ukn 1 111m vk17 vk27 7 7 7 7 vkn7 vkn 27 Ukn 1 and all other permutations thereof Note the amount of duplicated subproblems Sugih Jamin Gamineecsumich edu Computing TSP using DP Example Given the following graph Store the edge weights in an adja cency matrix with 00 as the weight of a nonexistent edge gowa 058 40 ooxioww gomcoo coogt8 amp Let Cvw the weight of edge 2210 and Duv the length of a path fromutov Sugih Jamin lamineecsumich edu Computing TSP using DP 1 hop 00 6 0 G Consider all alternatives not greedy Dba Cba 1 Dc a Cc a 00 Dd a Cd a 6 Sugih Jamin lamineecsumich edu Computing TSP using DP 2 hops 0 Consider all permutations not greedy DC7b7a Dbca CbcDca 6oooo Ddca CdcDca oooooo Dbda Cbd Dda 2 46 10 Dcda Ccd Dda 2 86 14 A 9 9 9 0 0000 9 00 Sugih Jamin Gamineecsumich edu Computing TSP using DP 3 hops Consider all permutations when adding new vertex D d7 C7 b7 DC7 d7 b7 D d7 b7 C7 D b7 d7 C7 DC7 b7 d7 Db7 C7 d7 Cdc I Dcba oo8oo Ccd I Ddba 84 12 CdbDbca 3oooo CbdDdca 4oooo CcbDbda 71o 17 CbcDcda6142O G G Sugih Jamin Gamineecsumich edu Computing TSP using DP full path QVG 6 6 0 A 6 N 9 o 0 0 0 0 O C DC7 d7 b7 a7 0a b Db 6 d a 0a c Dc b d a MIN9 12 220 9 17 21 u 9 3 Extracted path TSP a c a a 6 d a a c d b a Sugih Jamin Gamineecsumich edu Outline Announcements 0 PA1 will be online latertoday Due Thu 104 100 pm online submission Last time o ComputeRank in N log N time o BubbleSort 0 Review of asymptotic analysis 0 Empirical performance evaluation Today 0 Review of asymptotic analysis 0 Review of foundational data structures Sugih Jamin Gamineecsumich edu Asymptotic Algorithm Analysis An algorithm with complexity fn eg fn n fn 2 n2 etc is said to be not slower than another algorithm with complexity gn if fn is bounded by gn for large n See Fig 41 Commonly written as fn Ogn read fn is bigOh gn aka the asymptotic or bigOh notation Definitions fn Ogn iffEI c gt 0710 2 O Vnn Z n0fn g 6 901 O9nfn13 c gt 0710 2 0 Wm 2 7100 fn c 900 and c is a constant ie it doesn t change with n So more accurately fn E Ogn But NOT fn S O9n Sugih Jamin Gamineecsumich edu BigOh Fallacies Part I Fallacy 0 If fn Ogn gt fn NOT Fallacy 1 Let f1n hn2 and f2n hm gt f1n f2n Therefore if f1n 0012 and f2n 0012 gt f1n f2n NOT See Fig 41 Fallacy 2 fn Ogn gt 901 01fn NOT There s no such thing as a 0 10 function Landmark Functions Wellknown functions you compare the complexity of your algorithm to in order of increasing complexity see Figs 42 43 constant 01 logarithmic Oog n square root O linear or Ohn 0n nlogn On log n quadratic 0n2 cubic 0n3 in general polynomial 001k 1 2 1 exponential 0aquot a gt 1 usually a 2 factorial 0n Sugih Jamin Gamineecsumich edu Tight Bound While fn O9n 75 K71 901 you nevertheless want to pick a gn as close to fn as possible then it is called a tight bound Sugih Jamin Gamineecsumich edu BigOh Rules RUe1f1n O91n7f2n 09201 the f1n f2n 0ma91n792n Example f1n n3 E 0013 f2n 712 E 0012 19 f1n f2n 0013 Rule 2 f1n 091n f2n Og2n then f1n f2n 091n 9201 If your code calls a function within a loop the complexity of your code is the product of your code s complexity times the complexity of the function you call Rule 3 n O9ngn 00101 the n 00101 Sugih Jamin Gamineecsumich edu BigOh Fallacies Part II Fallacy 3 Let 9101 gtlt 9201 If H S C 9101 and C 9201 is H O91n Fallacy 4 Let f1n 0910 0 f2n 092n DOGS 9101 lt 9201 gt f1n lt f2n Siblings of BigOh bigOmega 2 lower bound For fn gt O v n Z Ofn Qgn if Elna gt Ocgt OVn Z n0fn Z cgn h1n 001201 ltgt h2n 9072101 bigTheta fn gn iff fn O9n and fn Q9n fn grows as fast as gn Cousins of BigOh littleoho fn 0gn if fn Ogn but fn 7E gn That is if gn is not an asymptotically tight upper bound of fn In long hand fn 0gn if v c gt O Vn n 2 no gt fn g cgn In contrast to O o is forall c gt 0 whereas 0 only requires there exists c gt 0 Example 2712 0012 is asymptotically tight but 2n 0012 is not So 0 is sloppier than 00 which is why we use it more often littleomegaw fn wgn iffgn 0fn Sugih Jamin Gamineecsumich edu Relatives of BigOh cont D068 n gn gt 901 9001 D068 n gn gt fn 90 Does fn gn gt fn is in the same order as gn In the Limit FYI 00 fn 0gn cgt fn c1 mm and Iimnm c1 90 fn mm cgt fn 2 c2 mm and Iimnm c2 90 fn gn ltgt both Iimnaoo g c1 and v Iimnaoo g CQ 00 fn ogn lt Iimnaoo o v wo fn wagon lt Iimnmgg g 00 v Sugih Jamin Gamineecsumich edu Summary asymptotic analysis deals only with LARGE n 0 provides a shorthand to express upper bound it is not an exact notation be careful how big c is be careful how big no must be 0 run experiments to check the common case behavior 0 tradeoff space vs time o be scalable but optimize only when and where necessary don t lose the forest 0 think outside the box Done with Preiss Chs 2 and 3 with the exceptions per Syllabus Sugih Jamin Gamineecsumich edu Evaluation Questions 1 Let fn n 1n2c 712 can we say that fn 001 If so why If not what is the bigOh of fn Why 2 Is log n On log n 0012 Which is a tigher bound 3 Given 4 consecutive statements 31 SQ S3 84 Let 31 Oog n 32 Oog n3 S3 001 S4 03n What is the bigOh time complexity of the four consecutive statements together Prove it mathematically using the definition of bigOh 4 You ran two programs to completion Both have running time of 10 ms Can you say that the two programs have the same bigOh time complexity Why or why not Sugih Jamin Gamineecsumich edu Foundational Data Structures Array and linked list Why are they called foundational data structures What are abstract data types For example Sugih Jamin Gamineecs umichedu Array What is an array Name 2 advantages of using an array What are the disadvantages of using array Sugih Jamin Gamineecsumich edu Linked List What is a linked list Why use linked list Name three disadvantages of linked ist Sugih Jamin Gamineecsumich edu Array vs Linked List Timing complexity of findJength prepend append deleteJast accessith searchkey replaceith insertafterith deleteith Sugih Jamin Gamineecsumich edu Linked List Optimization o tail pointer Section 42 advantage a circular list with or without sentinel Section 42 advantage 0 freelist with elements allocated in batches Section 132 Sugih Jamin Gamineecsumich edu Outline Last time a Review of basic ADTs o PA1 Walk Through Today a Continue PA1 Walk Through a Dictionary ADT o Hashing Sugih Jamin Gamineecsumich edu Dictionary ADT Table lookup look up a key in a table The key is associated with some other information value Key space is usually more structuredregular than value space so easier to search An entry is a ltkey valuegt pair For example lttitle mp3filegt Usually the ADT stores pointers to entries Sugih Jamin lamineecsumich edu Searching for Items Two types of search 0 entry search pointer comparison 0 key search value comparison Types of dictionary 0 unordered list 0 ordered list 0 unsorted list 0 sorted list What is the difference between an ordered and a sorted list Adding entry to an ordered sorted ist must retain the ordered sorted property of the list Sugih Jamin Gamineecsumich edu Runtime Unsorted list a insert 0 a search 0 Sorted list a insert 0 a search 0 Sugih Jamin Gamineecsumich edu Dictionary ADT Application Implementing Function Call Consider 10 foo does something 20 bar does something else 32 main foo bar How does main know where foo and bar are Sugih Jamin Gamineecsumich edu Symbol Table Example Symbol Memory Location foo 10 bar 20 main 32 Sugih Jamin lamineecsumich edu Symbol Table Symbol table 0 constructed by the assembler 0 maps symbols to binary addresses 0 used by the linkerto resolve symbol references perhaps across different object files 0 used by the debugger to find where the symbols are g gacc The g option tells g to save the symbol table Othenvise it will be removed to save disk space Sugih Jamin lamineecsumich edu Implementing Symbol Table Which ADT to use for symbol table How shall we implement this ADT As array As linked list Sugih Jamin lamineecsumich edu Implementing Symbol Table cont What operations are used with the symbol table Time complexity of the different implementations o array 0 linked list Can we do both of these operations in 01 time Sugih Jamin lamineecsumich edu Hash Table and Hashing Reference items in a table the hash table by keys Use arithmetic operations to transform keys into tabe addresses buckets Hash function h transforms the search key into a bucket o hash code tkey gt intmap maps the key which could be a string etc into an integer intmap o compression map cintmap gt bucket maps the intmap into an address within the table ie into the range 0 M 1 for table of size M Given a key hkey gt ctkey gt bucketindex Sugih Jamin Gamineecsumich edu Hashing Example and Complexity Example 1 hash table of size 5 want to insert 4 3 6 22 How does insert unordered and search on a hash table work What would be an example hash function What are their time complexities Example 2 hash table of size 4 want to insert 4 3 6 22 What to do with keys that hash into the same bucket Moral Pick a table size M that doesn t have common factors with the key Prime numbers are good candidates Sugih Jamin lamineecsumich edu Hash Functions Webster Main Entry hash Pronunciation hash Function transitive verb Etymology French hacher from Old French hachier from hache battleax of Germanic origin akin to Old High German hAppa sickle akin to Greek koptein to cut more at CAPON 1a to chop as meat and potatoes into small pieces Criteria for a good hash function 0 easy and quick to compute 0 must compute a hash for every key 0 must compute the same hash for the same key 0 should distribute keys evenly in hash table 0 minimize collisions Sugih Jamin Gamineecsumich edu Hash Functions cont Common parts of hashing function 0 truncation hash code remove parts common across keys or extract bits from keys that are more likely to be unique acros keys Examples 0 734 763 1583 o 1412138193 0 celloeecsumichedu o folding hash code 0 modulo arithmetic compression map What if the keys are not integers 0 string use the ASCII encoding of each char 0 float treat it as a string of bits 0 in general treat the representation as a bitstring and sum up or extract parts of it Sugih Jamin Gamineecsumich edu String Hashing Example symbol table Bucket Symbol Memory Location h0bd30 md 10 1 2 h bar 3 bar 20 4 5 6 hCmaW37 ma 32 8 What is my hash function int hashchar 5 int M S String to be hashed M table Size H H H H H II Consider these strings stop tops pots spot How would they hash into my table using the above hash function Sugih Jamin lamineecsumich edu A Better String Hash Code Polynomial hash code t takes positional info into account 13Oak1 131aJk2 13k 2a ak 1 If a 33 the mtmap s are c t tops o t pots Good choices of a for English words 33 37 3941 What does it mean for a to be a good choice Why are these particular values good This is folding partition the key into several parts and combine them in a convenient way Sugih Jamin lamineecsumich edu Compression Mapping o division method mtmap mod M M table size prime 0 MAD multiply and divide method a mtmap b mod M a and b nonnegative integers Sugih Jamin Gamineecsumich edu Hashing Collision Resolution With hashing collision is inevitable What do you do when two items hash into the same bucketbin Methods of collision resolution 0 separate chaining o scattertable o coalesced chaining 0 open addressing linear probing quadratic probing double hashing o extensible hashing dynamic hashing Sugih Jamin lamineecsumich edu Collision Resolution Separate chaining let each bin points to a linked list of elements see Fig 83 Coalesced chaining 0 store linked list in unused portion of the table 0 if item hashes to an already occupied bin follow the linked list and add item to the end coalesced chaining see Fig 84 o hash table can only hold as many items as table size Open addressing if there s a collision apply another hash function repeatedly until there s no collision Set of hash functions 10 ill 12 Sugih Jamin lamineecsumich edu Open Addressing To probe to look at an entry and compare its key with target Linear probing hikey h0key 139 mod M do a linear search from h0key until you find an empty slot see Fig 86 compare against Fig 84 Clustering when contiguous bins are all occupied Why is clustering undesirable Assuming input size N table size 2N What is the bestcase cluster distribution What is the worstcase cluster distribution What s the average time to find an empty slot in both cases Sugih Jamin lamineecsumich edu 8 1 DATA STKU CTUKES AND ALCOKITHMS Discussion slides November 14 2007 The University of Michigan Agenda Questions on anything Review Graph terminology Shannon Switching Game Graph problems Branch amp Bound 1122 DATA STRUCTURES AND ALGORITHMS Questions Any questions in general gagg DATA smuches AND ALGORITHMS Graph Terminology What is a graph formally What kinds of graphs can we have 66quot8 DATA STRUCTURES AND ALGORITHMS Graph Terminology Order of a graph Order of a node Complete graph How many edges does Kn have contrary to yesterday39s lecture Examples Spanning subgraph Induced subgraph Example of spanning subgraph which is not induced Example of induced subgraph which is not spanning Example of spanning induced subgraph 6612 8 1 DATA STRUCTURES AND ALGORITHMS Graph Terminology Parity of a node Isomorphism examples Are A and B isomorphic Degree sequence If two graphs have the same degree sequence are they isomorphic If two graphs are isomorphic do they have the same degree sequence Examples 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Graph Terminology Closed and open walks X0X1X2Xm Eulerian trail Konigsberg bridge problem Let39s try it out Can we find any generalizations When does such a trail exist 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Graph Terminology Eularian trail continued Closed all even degree vertices Open exactly two even degree vertices Hamilton cycle with nodes this time 6amp2 8 1 DATA STRUCTURES AND ALGORITHMS Graph Terminology Bipartite graph When is a graph bipartite Why 66quot8 DATA STRUCTURES AND ALGORITHMS Shannon Switching Game Yeah he39s that guy outside the building the statue Rules Two players positive and negative Let GV E U and V have been distinguished Negative player goal Destroy paths Positive player goal Connect U and V Never ends in draw Why Let39s try playing for a bit 11281 DATASTKUCTUKES 10 AND ALGORITHMS Shannon Switching Game Postive game neutral game negative game What39s an easy way to convert a neutral game to a positive game When does G produce a positive game 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Shannon Switching Game Postive game neutral game negative game What39s an easy way to convert a neutral game to a positive game When does G produce a positive game Solution Iff there is a subset U containing u and v of the vertex set V such that the induced multisubgraph X has two spanning trees T1 and T2 with no common edges What does this mean 6amp2 81 DATASTKUCTUKES 12 AND ALGORITHMS Graph Questions Draw a connected graph whose degree sequence equals 5 4 3 3 3 3 3 2 Does a graph of order n with at least n1n22 1 edges have to be connected Is it possible to have a disconnected graph of order n with one fewer edge Why 6amp2 81 DATASTKUCTUKES 13 AND ALGORITHMS Branch and Bound How is BFS better than DFS How is DFS better than BFS Sometimes called Bestfirst search 6612 I DATA STRUCTURES 39 AND ALGORITHMS Branch and Bound Basic idea Start off with infinite cost and null solution Try to search most promising configurations first Compare with current best Update if necessary Backtrack if necessary A search How does that work What39s the purpose of Branch and Bound and A Implications for PA3 gags 1 DATA STRUCTURES AND ALGORITHMS Arbitrary Objects 1 00000301803000000000 395 compression map 00000 012quot M1 Figure 254 The two parts of a hash function a hash code and a compression map X quotExxu quot313 quotPAQu quot339 nngsu mazes x 1 an 9A3 I o quot 33 II a ll 3 u uses um ll 139 3333 8 S x nAIOQu quotnamequot OTQu notuu x xeau nmagu 2925quot 3913 I II OI 6 x Z 19mm Sansone em liq Hangs se 2 4 1 z uogaun e eugep am esoddns z s uAIoq uvAtau IIOIQII uoluu ll i Figure 145 Hash functions for character strings mese diagrams show the disper sion for a set of English words the rst 1000 distinct words of Melville s Moby Dick using Pro gram 747 with M 96 and a 128 top M 97 and a 128 centerand M 96 and a 127 bottom Poor dispersion in the rst instance results from the combination of un even usage of the letters and the common factor 32 in the table size and multiplier which preserves the unevenness The other two in stances appear random because the table size and the multiplier are relatively prime Swat k t Outline Announcement out of town Thu 118 Tue 1113 Tue 1120 There will be guest lectures John Umbaugh and Chris Kiekentveld No office hours Thu 118 Fri 119 Tue 1113 Tue 1120 Last time Sorting a Selection sort a Heap sort a lnsertion sort Today More sorting o lnsertion sort optimizations o Merge sort Sugih Jamin Gamineecs umich edu Insertion Sort Algorithm see Fig 152 0 given a pile of items in an array 0 insert into an already sorted array one item at a time using linear search 0 nonadaptive version continues linear search till end of sorted array 0 adaptive version stops linear search once an insertion point has been found Contrast to selection sort 0 selection sort select item from pile in order add to the front end of sorted list a insertion sort pick next item from pile insert in order into sorted list Sugih Jamin lamineecsumich edu Insertion Sort Sort array a5 2 3 8 5 1 Nonadaptive Adaptive void void isortnaint a int N isortaint a int N Is it stable Is it stable Sugih Jamin Gamineecsumich edu Running Time and Optimizations Running time Optimizations a binary search instead of linear search a move instead of swap saves 0012 copies a use of sentinel saves 0n2 CMPs Sugih Jamin Gamineecsumich edu Insertion Sort Move Instead of Swap Slide items down instead of swapping them to move down 1 steps only need is 2 copies instead of 3k copies Sort array a5 2 3 8 5 1 void isortamint a int N Saves 0012 copies but total running time is still 0012 Is it stable Sugih Jamin lamineecsumich edu Insertion Sort Use of Sentinel Sort array a5 2 3 8 5 1 Can use sentinel to do 0012 less comparisons 0 move smallest item to the leftmost position 001 operation 0 then it s not necessary to check j gt o in the inner loop saves 0n2 CMPQ Total running time is still 0012 void isortasint a int N int i j tmp for j l i jl i lt N i if aj gt ai ji now move smallest to aO if aO gt aj swapa0 aj for i 2 i lt N i for j i tmpai aj l gt tmp j aj ajl aj tmp return Sugih Jamin lamineecsumich edu Inversion A sorted list is but a permutation of an unsorted one Let S 1 2 3 n and P 191192 pn be a permutation of S An inversion in P is a pair of items ppj where p gt pj but 139 lt j eg P 14 3 2 has 3 inversions 4 3 4 2 3 2 The job of a sorting algorithm is to correct inversions Sugih Jamin Gamineecsumich edu Insertion Sort Average Running Time Let PR be P in reverse ie PR 2 pnpn1 p2p1 A pair of items pbpj is either an inversion in P or in PR with equal probability There are W pairs in a sequence of n items W W inversions in a given On average we expect sequence P Insertion sort corrects one inversion at a time so the average running time of insertion sort is W 0012 Sugih Jamin Gamineecsumich edu Merge Sort Sort a85 24 63 45 17 31 96 50 using merge sort 85 24 63 45 85 24 63 45 85 24 63 45 24 63 45 24 85 45 63 24 45 63 85 17 24 31 45 85 17 31 96 50 17 31 96 50 17 31 96 50 17 31 96 50 17 31 50 96 17 31 50 96 5O 63 85 96 Sugih Jamin Gamineecsumich edu Merge Sort Algorithm see Preiss Fig 1510 a spilt list into two roughly equal parts a mergesort each part recursively a merge the two sorted parts void mergesortltint a int left right int mid if left lt right return mid leftright2 mergesorta left mid mergesorta midl right mergea left mid right return Sugih Jamin Gamineecsumich edu m Splitting list into two roughly equal parts is easy How to merge the two sorted parts Simplest way use scratch space of equal size see Preiss Fig 159 void mergeint a left mid right Running time of merge Need an extra 0N space which may not be available for large data set or small devices handhelds eg Sugih Jamin Gamineecsumich edu Merge Sort Recurrence relation Running time Disadvantages o Nonadaptive for presorted sequence worse than insertion sort 0 Not inplace needs an extra 001 space with attending copy operation so slower than other On log n sort a Stack space recursion uses up Oog n stack space Advantage can be stable if merge is stable Sugih Jamin lamineecsumich edu Removing Recursion Iterative Merge Sort Bottomup 0 consider original list as N sublists of size 1 o scan thru list performing 1by1 mergers to pro duce N2 ordered sub lists of 2 elements 0 scan thru list performing 2 by2 mergers to pro duce N4 ordered sub lists of 4 elements 0 etc void imsortint a left right int size i n for size l n right left size lt n size 2 for i left i lt right size i 2size mergea i isize l mini2size lright return Sugih Jamin lamineecsumich edu Inefficiencies of merge H0n1mergeint a left mid rightk l for ikleft jmidl k lt right k 2 if i gt mid ak tmpj continue 3 if j gt right ak tmpi continue 4 ak tmpj lt tmpi tmpj tmpi 5 The boundary checks in lines 2 and 3 return false most of the time Can we remove them by using a sentinel as in insertion sort Sugih Jamin Gamineecsumich edu Insertion Sort Use of Sentinel Sort array a5 2 3 8 5 1 Can use sentinel to do 0012 less comparisons 0 move smallest item to the leftmost position 001 operation 0 then it s not necessary to check j gt o in the inner loop saves 0n2 CMPQ Total running time is still 0012 void isortasint a int N int i j tmp for j l i jl i lt N i if aj gt ai ji now move smallest to aO if aO gt aj swapa0 aj for i 2 i lt N i for j i tmpai aj l gt tmp j aj ajl aj tmp return Sugih Jamin lamineecsumich edu Merge Using Bitonic Sort void bsmergeint a left mid right int ijktmpMAXN memcpytmp ampaleft mid leftl for k mid leftl j right j gt mid j k tmpk aj copy in reversed order for ikleft jright k lt right k ak tmpj lt tmpi tmpj tmpi return But bsmerge iS not stable Why Sugih Jamin Gamineecsumich edu Outline Last time a Review of basic ADTs o PA1 Walk Through Today a Continue PA1 Walk Through a Dictionary ADT o Hashing Sugih Jamin Gamineecsumich edu Dictionary ADT Table lookup look up a key in a table The key is associated with some other information value Key space is usually more structuredregular than value space so easier to search An entry is a ltkey valuegt pair For example lttitle mp3filegt Usually the ADT stores pointers to entries Sugih Jamin lamineecsumich edu Searching for Items Two types of search 0 entry search pointer comparison 0 key search value comparison Types of dictionary 0 unordered list 0 ordered list 0 unsorted list 0 sorted list What is the difference between an ordered and a sorted list Adding entry to an ordered sorted ist must retain the ordered sorted property of the list Sugih Jamin Gamineecsumich edu Runtime Unsorted list a insert 0 a search 0 Sorted list a insert 0 a search 0 Sugih Jamin Gamineecsumich edu Dictionary ADT Application Implementing Function Call Consider 10 foo does something 20 bar does something else 32 main foo bar How does main know where foo and bar are Sugih Jamin Gamineecsumich edu Symbol Table Example Symbol Memory Location foo 10 bar 20 main 32 Sugih Jamin lamineecsumich edu Symbol Table Symbol table 0 constructed by the assembler 0 maps symbols to binary addresses 0 used by the linkerto resolve symbol references perhaps across different object files 0 used by the debugger to find where the symbols are g gacc The g option tells g to save the symbol table Othenvise it will be removed to save disk space Sugih Jamin lamineecsumich edu Implementing Symbol Table Which ADT to use for symbol table How shall we implement this ADT As array As linked list Sugih Jamin lamineecsumich edu Implementing Symbol Table cont What operations are used with the symbol table Time complexity of the different implementations o array 0 linked list Can we do both of these operations in 01 time Sugih Jamin lamineecsumich edu Hash Table and Hashing Reference items in a table the hash table by keys Use arithmetic operations to transform keys into tabe addresses buckets Hash function h transforms the search key into a bucket o hash code tkey gt intmap maps the key which could be a string etc into an integer intmap o compression map cintmap gt bucket maps the intmap into an address within the table ie into the range 0 M 1 for table of size M Given a key hkey gt ctkey gt bucketindex Sugih Jamin Gamineecsumich edu Hashing Example and Complexity Example 1 hash table of size 5 want to insert 4 3 6 22 How does insert unordered and search on a hash table work What would be an example hash function What are their time complexities Example 2 hash table of size 4 want to insert 4 3 6 22 What to do with keys that hash into the same bucket Moral Pick a table size M that doesn t have common factors with the key Prime numbers are good candidates Sugih Jamin lamineecsumich edu Hash Functions Webster Main Entry hash Pronunciation hash Function transitive verb Etymology French hacher from Old French hachier from hache battleax of Germanic origin akin to Old High German hAppa sickle akin to Greek koptein to cut more at CAPON 1a to chop as meat and potatoes into small pieces Criteria for a good hash function 0 easy and quick to compute 0 must compute a hash for every key 0 must compute the same hash for the same key 0 should distribute keys evenly in hash table 0 minimize collisions Sugih Jamin Gamineecsumich edu Hash Functions cont Common parts of hashing function 0 truncation hash code remove parts common across keys or extract bits from keys that are more likely to be unique acros keys Examples 0 734 763 1583 o 1412138193 0 celloeecsumichedu o folding hash code 0 modulo arithmetic compression map What if the keys are not integers 0 string use the ASCII encoding of each char 0 float treat it as a string of bits 0 in general treat the representation as a bitstring and sum up or extract parts of it Sugih Jamin Gamineecsumich edu String Hashing Example symbol table Bucket Symbol Memory Location h0bd30 md 10 1 2 h bar 3 bar 20 4 5 6 hCmaW37 ma 32 8 What is my hash function int hashchar 5 int M S String to be hashed M table Size H H H H H II Consider these strings stop tops pots spot How would they hash into my table using the above hash function Sugih Jamin lamineecsumich edu A Better String Hash Code Polynomial hash code t takes positional info into account 13Oak1 131aJk2 13k 2a ak 1 If a 33 the mtmap s are c t tops o t pots Good choices of a for English words 33 37 3941 What does it mean for a to be a good choice Why are these particular values good This is folding partition the key into several parts and combine them in a convenient way Sugih Jamin lamineecsumich edu Compression Mapping o division method mtmap mod M M table size prime 0 MAD multiply and divide method a mtmap b mod M a and b nonnegative integers Sugih Jamin Gamineecsumich edu Hashing Collision Resolution With hashing collision is inevitable What do you do when two items hash into the same bucketbin Methods of collision resolution 0 separate chaining o scattertable o coalesced chaining 0 open addressing linear probing quadratic probing double hashing o extensible hashing dynamic hashing Sugih Jamin lamineecsumich edu Collision Resolution Separate chaining let each bin points to a linked list of elements see Fig 83 Coalesced chaining 0 store linked list in unused portion of the table 0 if item hashes to an already occupied bin follow the linked list and add item to the end coalesced chaining see Fig 84 o hash table can only hold as many items as table size Open addressing if there s a collision apply another hash function repeatedly until there s no collision Set of hash functions 10 ill 12 Sugih Jamin lamineecsumich edu Open Addressing To probe to look at an entry and compare its key with target Linear probing hikey h0key 139 mod M do a linear search from h0key until you find an empty slot see Fig 86 compare against Fig 84 Clustering when contiguous bins are all occupied Why is clustering undesirable Assuming input size N table size 2N What is the bestcase cluster distribution What is the worstcase cluster distribution What s the average time to find an empty slot in both cases Sugih Jamin lamineecsumich edu Outline Last time 0 Tree Terminology 0 Tree Traversal o N ary Tree Today 0 Binary Search Tree BST 0 Binary Space Partition BSP Tree 0 Heap and Priority Queue Sugih Jamin Gamineecs umich edu Binary Search Tree BST What is the difference between a binary tree and a binary search tree Definition of BST a all values in Tl is lt value on root a all values in TV is gt value on root o where lt and gt can be user defined Search algorithm search for key starting from Treat a if Troot is null not found if key 2 T root value found a if key gt T root Value search TT7a o if key lt T root Value search TTl Sugih Jamin lamineecsumich edu BST Search Time Complexity Averagecase search time l successful l unsuccessful Expressed in terms of depth or height successful search on BST takes 0 unsuccessful search takes 0 linked list hash table BST Worstcase successful search time on BST 0 Worstcase unsuccessful search time on BST 0 Sugih Jamin Gamineecsumich edu BST Insertion Insertion of new element a starting from root find appropriate place for new 0 insert new 0 example build a BST consisting ofk b m f v z o p a 0 will the appropriate place always be a leaf node Given a BST where are the smallest and largest elements How do you remove an element from a BST Remove 2 b m from the above BST Sugih Jamin lamineecsumich edu BST Removal After removal of a node tree must remain a BST Removal of an element elt 0 search for elt o if elt is a leaf node just remove it 0 what if elt is nonleaf o swap with either the smallest element of TV or the largest element of Tl o if elt is now leaf remove it o if nonleaf it should have only one child replace parent s pointer to this node with pointer to child and remove the node Sugih Jamin lamineecsumich edu Problem Collision Detection Blue wants to go to object 6 Which objects are in the way E E o a Sugih Jamin Gamineecsumich edu Binary Space Partitioning BSP Tree What is a BSP A spatial data structure used to help organize geometry in n dimensional space Roughly a BST that sorts objects in space by their coordinates Used to speed up queries about whether geometric objects overlap or about the spatial ordering of geometric objects Example uses 0 computer graphics whether an object is behind another to determine display 0 robot motion planning whetherthere is obstruction in a path 0 computer games collision detection 0 in general occlusion culling and collision detection Sugih Jamin Gamineecsumich edu BSP Partitions Space Blue wants to go to object 6 Which objects are in the way x30 x52 1 3 E x45 l D c t m 2 E y30 J k E Sugih Jamin Gamineecsumich edu Creating a BSP Tree Algorithm 0 use a plane to divide space in 2 0 objects to the left of plane are put in Tl 0 objects to the right of plane are put in TV 0 repeat for Tl and Tr Types of BSP 0 polygonaligned pick existing objects polygons in space as dividing planes 0 axisaligned dividing planes placed along 1 and y axes Sugih Jamin Gamineecsumich edu BSP Creation B BSP Found Colliding Object Blue wants to go to object 8 Which objects are in the way Next check for collision against items 0 and m ESP and Occlusion Culling What can Blue see a E El MinHeap MinHeap an ordered tree with the following properties 0 every subtree of T is a heap o the key in the root of T is g the key in every subtree of T Binary heap must be a complete binary tree see Preiss Fig 114 note that it is not a representation of the ternary tree in Fig 113 A complete binary tree can be efficiently stored as an array Sugih Jamin Gamineecsumich edu Binary MinHeap findmin time complexity O enqueue must maintain binary heap properties Fig 76 GTM 0 insert new item as the rightmost leaf of the complete tree a percolate up swap new with parent as long as news key lt parent s key time complexity O dequeuemin Fig 77 GTM 0 remove root 0 move rightmost item elt of lowest level to root 0 percolate down swap elt with children with the smallest key until elt s key is smaller than children s or elt is a leaf node 0 how to find the rightmost item time complexity O Sugih Jamin lamineecsumich edu Priority Queue Priority Queue a list of items with priorities Examples 0 shortest job first print or cpu queue o simulation of events in games for example 0 scheduling in general Only need the following operations 0 enqueue o findmin or findmax o dequeuemin or dequeuemax Implementation as a sorted linked list enqueue takes 0 What data structure would be good to implement a priority queue Sugih Jamin Gamineecsumich edu Outline Today 0 Quick sort and improvements 0 Distribution sort Sugih Jamin Gamineecsumich edu Quick Sort Like mergesort it is a divideandconquer algorithm Mergesort easy division complex combination mergesortleft half mergesortright half mergeleft right Quicksort complex division easy combination partitionleft right quicksortleft partition quicksortright partition Sugih Jamin Gamineecsumich edu Quick Sort Algorithm see Fig 154 0 choose an element of list as pivot 0 put all elements lt pivot to the left of pivot 0 put all elements 2 pivot to the right of pivot 0 move pivot to its correct place on the list 0 sort left and right subarrays recursively not including pivot void qsortint a left right int split index of where the pivot is if left gt right return Split partitiona left right qsorta left split l qsorta Splitl right Sugih Jamin Gamineecsumich edu Partition Lomuto int partitionint a left right int i j pivot pivot aleft for i left j il j lt right j continue to look for elt lt pivot if aj lt pivot i swap with first gt elt swapai aj put pivot in place swap with last elt lt pivot swapaleft ai return i Running time Sugih Jamin Gamineecsumich edu Running Time of Quicksort Think of it as building a binary tree Best case 0 partition ON 0 height of tree Oog N 0 total ON log N Worst case when array is presorted o partition ON 0 height of tree ON 0 total 0N2 Average case ON log N Sugih Jamin Gamineecsumich edu Pivot Selection Medianof three o ideally pivot is the median ofthe elements values but to find median need to sort 0 heuristic choose the middle value of the first last and middle elements as pivot Randomized choose a random element as pivot running time 0N log N with high probability Sugih Jamin lamineecsumich edu Quicksort is lnplace but Recursion uses up stack space Worst case needs N stack space Small list 0 do insertion sort if array size is below threshold 0 but quicksort works by reducing array size 0 terminate quick sort when array is below threshold leave unsorted subarrays in place 0 subarrays are sorted relative to each other do insertion sort on the whole array as last step Quicksort is not stable Sugih Jamin lamineecsumich edu TailRecursion Removal Tail recursion a when the last statement of a recursive function calls itself a can be replaced by an itera tive loop Examples int countlink list if llist return 0 return lcountlist gtnext void walklink list if llist return visitlist walklist gtnext void reversewalklink list if llist return reversewalklist gtnext visitlist Sugih Jamin lamineecsumich edu Tailrecursion Removed Quicksort Observations o quicksort calls itself twice 0 we can remove the last call 0 the first call still uses up stack space 0 the order of doing left partition first or right partition first does not matter Call quicksort recursively only on the smaller partition Sort the larger partition iteratively Sugih Jamin lamineecsumich edu Improved Quicksort void qsortint a left right qsorttra left right isorta right left void qsorttrint a left right while right left gt smallsize Split partitiona left right if split lt leftright gtgt 1 qsorttra left split l left splitl right Stays else qsorttra splitl right right split l left Stays return Sugih Jamin Gamineecsumich edu Running Times of Sorting Algorithms Algorithm Best Average Worst Linear insertion 001 0012 0012 Binary insertion On log n 0012 0012 Bubble 0n2 0n2 0n2 Quicksort On log n On log n 0n2 Straight selection 0012 0012 0012 Heapsort On log n On log n On log n Mergesort On log n On log n On log n Sugih Jamin Gamineecsumich edu Intuition on Sorting Algorithms Performance How can mergesort and quicksort achieve On log n when insertion sort and selection sort run at 0012 Mergesort and quicksort move elements far distances correcting multiple inversions at a time Why is quicksort worstcase 0012 while mergesort has no such problem The choice of pivot determines size of partitions whereas mergesort cuts array in half everytime Is On log n the best we can do Sugih Jamin Gamineecsumich edu Decision Tree and The Information Theoretic Lower Bound Is On log n the best we can do Yes if we use binary comparison on the key values N elements in a decision tree result in N permutations of the elements at the leaves see Preiss Fig 1511 Let h be the height oftree 2h N h log Ni log Ni Iog2 N IOg2123N N Z Iog2z i1 N Iog21og22Iog23og2 N N IO 2 QNogN II II gN 22 Sugih Jamin Gamineecsumich edu DistributionDigital Sort To escape the Information Theoretic Lower bound instead of comparing keys use them as data 0 bucketcounting sort 0 radix sort 0 radix exchange sort Sugih Jamin Gamineecsumich edu BucketCounting Sort o if keys are small nonnegative integers use them as indices into a countbucket array 0 example to sort a 0 n 0 first allocate bucket 0 k where k is the maximum key value c to sort simply increment bucket a O bucket a l bucket a n 0 walk down bucket i 0 g 139 g k and for each 139 print 139 out bucket i times Running time 001 but 1 cannot be too large Instead of simple integer each element can be a pointer to record then instead of incrementing the count of each bucket distribute the records into their appropriate buckets Bucketcounting sort is stable if the distribution step is stable Sugih Jamin lamineecsumich edu Radix Sort Useful when key values are large but consist of baseradix eg digits or chars that each can be sorted using bucket sort Examples 0 hello is a string of 5 characters radix26 o 429 is a string of 3 digits radix10 Algorithm 0 do bucket sort on the least significant rightmost radix c then sort the resulting list on the next significant radix o algorithm works only if the bucket sort used is stable 0 not inplace Example sort CX AX BZ BY AZ BX AY 0 first pass CX AX BX BY AY BZ AZ 0 second pass AX AY AZ BX BY BZ CX Sugih Jamin lamineecsumich edu Radix Sort Running time let d be the string length assume equal length strings N number of strings and R the radix size algorithm makes d passes over the strings 0dN at each pass the ith element of the strings are sorted 0dR total running time OdR N ON for d and R small but note that if keys are treated as binary numbers and radix2 is used algorithm degenerates into ON log N each binary number becomes a binary comparison for 32bit key N 232 and d 32 is d log N Sugih Jamin lamineecsumich edu Outline Today a Edit Distance and Approximate Matching o KnuthMorrisPratt and BoyerMoore pattern matching Sugih Jamin Gamineecsumich edu Edit Distance Edit aka Levenshtein distance of two words 5 and t dist s t is the smallest number of operations selected from R D I that turns 5 into t Three allowed operations each at unit cost 0 R x y replace character 1 with y o Dx delete a o I x insertx Example dist quotghostquot quothousequot 3 ghost host Dg houst Iu house Rte Sugih Jamin lamineecsumich edu Computing Edit Distance Worst case delete all s characters of s and insert t characters of t in order dist st g s t Dynamic programming solution to compute dist s t first compute the edit distance between prefixes of s and t ie dist s 0 i tioj0 z lts0 jltt Sugih Jamin Gamineecsumich edu Edit Distance Computation Example First row dists0t0 distquotpquot quotpquot O dists0t0l distquotpquot quotpequot l ie pI 8 dists0t02 distquotpquot quotpesquot 2 8 presto ie pIeIs tpeseta dists0t0j distquotpquot quotpquot j ie pI 1 Second column dists0itol i l i gt 2 ie pDreDsD st h B U h CD D D D gt 4gt F 4gt Ch Ch 9 Ch Second row dists0lt0j j j gt 1 ie pRreIsIeI D43 dists04t03 2 distquotprestquot quotpesequot 2 3 ie pDresRte AwN x x xm x Ch gt 00 h A CD 15 CD Ch 4gt D h A CD C F U D 391 13 D55 distquotprestoquot quotpesetaquot 3 ie pDresIetRoa Sugih Jamin Gamineecsumich edu DP for Edit Distance LetDz39j dists0 i tO j in otherwords dist s t Ds 1 t 1 How to compute the D table algorithmically using dynamic programming Running time Omnm sn t Sugih Jamin Gamineecsumich edu WagnerFischer Consider the final step How does 8 1 becomes t j fsi tjturnsoi 1 intot0j 1 gt takes g Dz 1j 1 edit operations Othenvise do the cheapest of the three edit operations 0 Rsi tj andturn sO 1 1 intotO j 1 gt takes g 1 Dz 1j 1 edit operations a Dsi andturn sOi 1int0t0j gt takes g 1 Dz 1j edit operations 0 Itj andturns0i int0tOj 1 gt takes g 1 Dz j 1 edit operations Sugih Jamin Gamineecsumich edu WagnerFischer more formally Dz 1j 1 emitsi ifsi tj MM Dz 1j 1 doRSitj I 1 MIN Dz 1j do Ds i otherwnse Dz j 1 d0Itj D 1 0 1 2 3 4 5 D 1Ijsmallest number of edit 8v p e S e t a Ioperatlons to turn an empty string 1 0 1 2 3 4 5 6 IntOtOjjIl o p 1 o 1 2 3 4 5 similarlyDi 1i1 1 r 2 1 1 2 3 4 5 D00D 1 1O 2 e 3 2 1 2 2 3 4 D011MIN0121 f g i g g g g D111MIN0111 5 O 6 5 4 3 3 3 3 D14 1MN334 4 D2 1 D10 1 Sugih Jamin Gamineecsumich edu Edit Operations distquotprestoquot quotpesetaquot 3 which 3 edit operations DH Lj Di L 1 Dh z 1MW DEL DMj H D 1 0 1 2 3 4 5 5v p e s e t a 1 0 1 2 3 4 5 6 0 p 1 0 1 2 3 4 5 1 r 2 1 1 2 3 4 5 2 e 3 2 1 2 2 3 4 3 S 4 3 2 1 2 3 4 4 t 5 4 3 2 2 2 3 5 o 6 5 4 3 3 3 3 emitsi ifsi tj d0Rsitj do D s i otherwise d0Itj D55 1 D44 gtRs5t5 Roa D44D33gts4 t D331D32gtIt3 Ie D32 D21 gt 8 3 s D21 D10 gt 8 2 e D1o 1DoogtDs1 Dr D00 D 1 1 gt 8 O p Sugih Jamin Gamineecsumich edu Approximate Matching Given a text T and a pattern P find a substring w Tlj such that dist Pw is minimal 10 is called an approximate match of P and dist P w is a measure of how good the match is smaller better Example T retreive retreeve retreev P retrieve w retreeve TlOl7 distPw l Rie How to find the best approximate match 10 Brute force Sugih Jamin Gamineecsumich edu Best Approximate Match Brute force a compute the edit distance between P and all substrings of T o choose the substring with the smallest edit distance Form 2 P n T there are 0012 substrings of T and each substring takes 0mn time to compare Running time 0mn3 Sugih Jamin Gamineecsumich edu Computing Best Approximate Match Observations 1 the BF method recomputes some comparisons multiple times PO TO PO TO1PO1 TOPO1 TO 1 etc gt Use dynamic programming 2 computing the edit distance between POz and TOj already compares POz against all prefixes of TOj we only need to further compare POz against suffixes of TOj ie against Tlj 0 1 g j 1 l j 1 is the empty string ADz j MNdistPOz TljO g1 g j 1 Sugih Jamin Gamineecsumich edu DP for Best Approximate Match Compute AD using dynamic programming similar to computing D ADz 1j 1 ifsi tj ADz 1j 1 AD 2 W 1MN App 13 otherwise The only difference is when P is the empty string it matches all substrings of T No need to output the edit operations Running time 0mn Sugih Jamin Gamineecs umich edu Best Approximate Match Example ADz 1j 1 ifsi tj ADM 1 MIN App 1j otherwise ADMj AD 1 0 1 2 3 4 5 6 7 8 9 A j a m p i e t e a 1 0 0 0 0 0 0 0 0 0 0 0 0 0 p 1 1 1 1 1 01 1 1 1 1 1 1 e 2 2 2 2 21 1 1 2 21 2 2 i 3 3 3 3 3 212 2 3 2 2 Best match ADmj minimal w T4 5 2 pi Sugih Jamin Gamineecs umich edu Multiple Best Approximate Matches AD 1 0 1 2 3 4 5 6 7 8 9 A a p i a n t p o t 1 0 0 0 0 0 0 0 0 0 0 0 0 0 p 1 1 0 1 1 1 1 1 0 1 1 1 1 2 2 1 0 1 2 2 2 1 0 1 2 2 a 3 2 2 1 1 1 2 3 2 1 1 2 3 4 3 3 2 1 2 2 3 3 2 2 2 4 n 5 4 4 3 2 2 2 3 4 3 3 3 Best matches ADmj minimal wl T13 pli DnDa 7112 T14 plia Rna Da 1113 T15 plian Di i Sugih Jamin Gamineecsumich edu kApproximate Match Substring win T is a kapproximate match for P if dist Pw g k The best approximate match algorithm can be used to determine if a kapproximate match exists in time 0mn Sugih Jamin Gamineecsumich edu KnuthMorrisPratt Algorithm Idea Both the BruteForce and SimplifiedBM algorithms completely throw away comparisons already made and restart from scratch KMP s idea why not save the comparisons we have made and not redo them whenever possible Sugih Jamin lamineecsumich edu quotUH KnuthMorris Pratt Intuition I OO NSDSD WQQ C a d C C a d C a C a d C a d b C a d b left to right 4 5 6 7 quotcadcadquot matches prefix quotcadquot C a d C a d b 8 9 quotcadcquot matches prefix quotcquot C a d C a d b 10 C matches only itself c a d c a d b 11 16 quotcadcaquot matches prefix quotcaquot c a d c a d b 17 quotcaquot matches only itself c a d c a d b 18 24 match found Sugih Jamin Gamineecsumich edu KMP Algorithm KMP Match a match P against T left to right a if Pj 7E Tz shift P to the right until prefix of P matches T up to Tz 1 Tzcagac cagat LO Q a Pzagacagat agaca Sugih Jamin Gamineecsumich edu kmpmatch int index of matching start in T kmpmatchchar T char P int i int 0 int n sizeofT int m sizeofP int prefixlen computeprefixlenP for i O i lt n i while j gt O ampamp Ti Pj j prefixlenj 1 if Ti Pj J if j m return i ml return NOTFOUND Sugih Jamin Gamineecsumich edu KMP prefixlen Construction How many of the tail elements in P match its head unlike when matching T which matches only up to Tz 1 here matches all the way to Pj ForPc a d c a d b prefixlen 0001230 P P P C C C a C a 00000000 OSDSDSDSDSDSD OSDQQQQ 0DQI0U I by def no match no match match match match no match no match no match no match prefixlen00 prefixlenl0 prefixlen20 prefixlen3l prefixlen42 prefixlen53 if brute force prefixlen60 Sugih Jamin Gamineecsumich edu KMP Match Example index0123456789A EFGHI T c a d c a d c c a d c a c a d c a d b prefixlen O O O l 2 3 O P c a d c a d b left to right l 2 3 4 5 6 7 Ti6 l Pj6 jprefixlenj l3 c a d c a d b Ti6 Pj3 j i 8 9 Ti7 l Pj4 jprefixlen3l c a d c a d b 10 Ti7 l Pjl jprefixlen00 C a d C a d b Ti7 Pj0 j i ll 16 TiC l Pj5 jprefixlen42 c a d c a d b 17 TiC l Pj2 jprefixlenl0 c a d c a d b 18 24 match found Sugih Jamin Gamineecsumich edu KMP prefixlen Construction 2 ForPa g a c a g a t prefixlen 00101230 P P P a a a SDLQ SDSDSDSD SD OSDLQO SDSDSDSDSDSDSDSD SDLQLQLQLQLQLQ SDLQSDSDSDSD SDLQSDOFI39 O I by def no match match no match no match match match match no no no no match match match match prefixlen00 prefixlenl0 prefixlen2l prefixlen30 prefixlen4l prefixlen52 prefixlen63 prefixlen70 Sugih Jamin Gamineecsumich edu KMP Match Example 2 1ndex T preflxlen P 0 c 0 a K 30 1 1 V V V jpref1xlen63 5mmquot jammeecs m mu KMP prefixlen Construction Algorithm Content Of prefixlen j o prefixlen O O by definition a if match remember how many consecutive elements have matched a if mismatch look backwards in P for a match by consulting the partially constructed prefix table dynamic programming Sugih Jamin Gamineecsumich edu computeiprefixlen CLRS p 926 int return prefixlen computeprefixlenchar P int m sizeofP int prefixlen new int m int i index into Pl m l int j O index into P to determine prefix prefixlenO O for i l i lt m i while j gt O ampamp Pi Pj j prefixlenj l rewinds prefix back Steps on matched prefix until a matched element is found if Pi Pj ji prefixleni j return prefixlen Sugih Jamin Gamineecsumich edu KMP computepref ixl en Example For P a g a c a g a t Pfa Pprefixhead prefixlen 00101230 index 0 l 2 3 4 5 6 7 P39 a g a C a g a t by definition a Pfxj0 Pi1 a Pfxj0 Pi2 a g Pfxjl Pi3 a Pfxj0 Pi3 a Pfxj0 Pi4 a g Pfxjl Pi5 a g a Pfxj2 Pi6 a g a C Pfxj3 Pi7 a g Pfxjl Pi7 a Pfxj0 Pi7 prefixlen00 prefixlenl0 prefixlen2l jprefixlenj l prefixlen30 prefixlen4l prefixlen52 prefixlen63 jprefixlenj l jprefixlenj l prefixlen70 j Sugih Jamin Gamineecsumich edu KMP computeprefixlen Example 2 ForPc a d c a d b PfxPprefix prefixlen 0 O O 1 2 3 O PfXjO 39 Pfxj0 PfXjO Pfxjl Pfxj2 Pfxj3 Pfxj0 0000 999 99 00quot by definition Pil Pi2 Pi3 Pi4 Pi5 Pi6 Pi6 prefixlen00 prefixlenl0 prefixlen20 prefixlen3l prefixlen42 prefixlen53 jprefixlen2 prefixlen60 Sugih Jamin Gamineecsumich edu KMP computeiprefixlen Example 3 ForPa g a c a g a g a c PfxPprefix prefixlen index P O 1 a g a SDN SDSD SDLQ 000 4 a SDSDSDSD LQLQLQ SDSDSDSDSD LQLQLQO 0 O 1 O 1 2 3 2 3 4 U1 by definition Pfxj0 Pfxj0 Pfxjl Pfxj0 Pfxj0 Pfxjl Pfxj2 Pfxj3 Pfxjl Pfxj2 Pfxj3 1 Pi1 Pi2 1 Pi3 1 Pi3 Pi4 Pi5 Pi6 Pi7 Pi7 Pi8 Pi9 prefixlen00 prefixlenl0 prefixlen2l jprefixlenj l prefixlen30 prefixlen4l prefixlen52 prefixlen63 jprefixlenj l prefixlen72 prefixlen83 prefixlen94 j Sugih Jamin Gamineecsumich edu KMP Time Complexity What is KMP s time complexity Consider for i0 iltn i while j gt O ampamp Pj Ti j prefixlenj l rewind Ti j if Pj o the outer for loop advances 139 at most n times 0 j advances only when 139 advances o the inner while loop rewinds j but at most by the number of advances 1 since the last rewind 0 since j can advance only as much as i n the sum of all of its rewinds is also at most n 0 worst case 001 nfrom z 2n fromj Sugih Jamin lamineecsumich edu KMP Time Complexity cont Example index T P prefixlen i j rewind 0123456789ABCDEFGHI aaababbaaaaabaaaaaa aaaaaa 012345 012333345556789ABCCCCCCDEFGHI 01232101210001234543210012345 OOOlllOOllOOOOOOOlllllOOOOOOO k 10 rewinds Time complexity of kmpmatch is 2n n a mthatOfcomputegprefixleniS2nL KMP S time complexity is 001 m Sugih Jamin lamineecsumich edu BoyerMoore Algorithm a match right to left a precompute the last table as in SBM o precompute suffixlen table that is similarto KMP s prefixlen table but matches the pattern against its own suffix tail instead a on mismatch move to the larger of last and suffixlen a note that both last and suffixlen tables depend only on the pattern and can be stored with it across searches Sugih Jamin Gamineecsumich edu Outline Today a Design Patterns BruteForce and Greedy 0 Counting Change Problem 0 Pattern Matching Algorithms as Examples of 2 Design Patterns 0 Brute Force 0 Simplified BoyerMoore Sugih Jamin lamineecsumich edu Design Patterns What is a design pattern Design patterns we look at in this course brute force divide and conquer recursive amortized greedy usuay involving heuristics branch and bound backtracking dynamic programming Sugih Jamin Gamineecsumich edu Problem Counting Change Cashier has a collection of coins of various denominations Want return a specified sum using the smallest number of coins Formally o A sum to be returned 0 n coins the coins P 1914927193 7pm 0 the denominations D dp1 d192 dp3 dpn can have repetition two dimes three pennies o the change 0 C P o the selection 8 5 1 if p E C O othenvise a Want minimize Z s of coins such that Z dcz A Sugih Jamin Gamineecsumich edu Counting Change Example o A 43 o n 13 o P 2 coins of different sizes D 1o11251o15111511 C 1o112515 o S 1111011000000 Sugih Jamin Gamineecsumich edu Solution Bruteforce Approach Try all subsets ofP o 811O1OO1 o 820111OO o 831O111O O o How many possible subsets are there Feasible solution set all 81 s for which E dcz A Objective function the Si that minimizes E 5i What is the time complexity to compute the sums Total time complexity of this approach 0 worst case 0 best case Sugih Jamin lamineecsumich edu Bruceforce Algorithm Solves a problem in the most simple direct or obvious way 0 does not take advantage of structure or pattern in the problem 0 usually involves exhaustive search of the solution space 0 pro often simple to implement 0 con usually not the most efficient way Sugih Jamin lamineecsumich edu Greedy Approach Pick coin with largest denomination first a return largest coin p from P such that dpz g A o A 2 dpi 0 find next largest coin What is the time complexity of the algorithm Solution not necessarily optimal o considerA 20 and D 15101011111 o greedy returns 6 coins optimal requires only 2 coins Solution not guaranteed 0 considerA 20 and D 151010 o greedy picks 15 and finds no solution Sugih Jamin lamineecsumich edu Greedy Approach Algorithm decides what is the best thing to do at each step local maxima and never reconsiders its decisions 0 pro may run significantly fasterthan bruteforce 0 con may not lead to the optimal or even correct solution global maxima Usually requires some initial precomputation to set up the problem to take advantage of special structurepattern in the problem or solution space Sugih Jamin lamineecsumich edu Pattern Matching Given a text string T of length n T and a pattern string P of length m determine if P is a substring of T P E A match means Tz 2 POTz 1 2 P1 Tz m 1 2 Pm 1 If match found return 139 first match Example strings o The quick brown fox jumped over the lazy dog a cagacagacagata o 10111100001010111000111100101 Sugih Jamin Gamineecsumich edu Alphabet Space The string doesn t have to consist only of alphabets in a human language Alphabet space X o English language The quick brown foxjumped over the lazy dog a DNA sequence cagacagacagata 0 binary data 10111100001010111000111100101 Alphabet size Z o English language 26 alphabets a DNA sequence 4charactersc g a t a binary data 2 digits 1 0 Sugih Jamin lamineecsumich edu Pattern Matching Algorithms o Bruteforce o Simplified BoyerMoore greedy but falls back to bruteforce o KnuthMorrisPratt memoized o Original BoyerMoore memoized Sugih Jamin Gamineecsumich edu Bruteforce Pattern Matching T a a b P a C b l 2 a C 3 4 a 5 int index bfmatChChar T C a b 01 b a a l d a a a a b C C a C b a a C ll 12 a C b a a C l3 14 a C b a a 15 16 a C b a 17 C b a a C a C b 18 b a a C a C 19 a C b a a C 9 10 1 0 aC aaC b a a C aC I of matching Start Char P T C a C a a C b a a C 24 in T text P pattern What is the time complexity of the algorithm worst case best case Sugih Jamin Gamineecsumich edu Simplified BoyerMoore Intuition T a a b C b d a a a a b C a C b a a C P a C b a a C right to left l a C b a a C Skip no matCh 32 a C b a a C Skip to rightmost match 654 a C b a a C falls baCk to brute force a C b a a C Skip to rightmost matCh Compare against 24 CMPs with bruteforce What is a heuristic Origin heuriskein Gr to find to discover Archimedes Heureka A process that may solve a given problem but offers no guarantee of doing so in terms of time andor quality of solution the opposite of algorithm a technique that improves the average case but not necessarily the worstcase performance Sugih Jamin Gamineecsumich edu SimplifiedBM Algorithm Heuristics used by the SBM algo T i i i i rithm 0 j m1 o SBM0 match pattern backwards a when Tk 7E Pj SBM1 ifTk gz P shift P to the right past Tk restart matching from T k m SBM2 ifTk 2 Pl and Tk gz Pl 1 m 1 rightmost l SBM21 ifl lt j shift P right to align Pl with Tk restart matching from Tk m 1 1 shift right by l SBM22 esel gt j shift P right by 1 restart matching from T k m j fall back to bruteforce Sugih Jamin Gamineecsumich edu SimplifiedBM Example T P a a b c b d a a a c b a a c 0quot Q d 1 c by SBM O by SBM l by SBM 2l by SBM 22 by SBM 2l T P a a b c b d a a a c b a a c 0quot O Q Q o l Sugih Jamin Gamineecsumich edu SimplifiedBM last Computation How do you determine l Just as in the greedy countchange algorithm precompute the information heuristics you need In this case precompute l for every letter of the alphabet store these in last 1 o initialize each member of last Z to 1 0 go thru P in reverse to determine the last occurrence of each alphabet Forthe example P in previous slide last Sugih Jamin Gamineecsumich edu Outline Today 0 P NP NPComplete NPhard PSPACE definitions 0 Graph Coloring Problem 0 Convex Hull 0 Dynamic Programming Allpair shortest path Sugih Jamin Gamineecsumich edu P NP and NPComplete If there s an algorithm to solve a problem that runs in polynomial time the problem is said to be in the set P If the outcome of an algorithm to solve a problem can be veri ed in polynomial time the problem is said to be in the set NP nondeterministic polynomial the nondeterminism refers to the outcome of the algorithm not the verification There is a set of problems in NP for which if there s a polynomial solution to one there will be a polynomial solution to all The set is called NPComplete Sugih Jamin lamineecsumich edu NPComplete NPHard If you can show that a problem is equivalent can be reduced to a known NPComplete problem you may as well not try to find an efficient solution for it unless you re convinced you re a genius If such a polynomial solution exists P NP It is not known whether P C NP or P NP NPhard problems are at least as hard as an NPcomplete problem but NPcomplete technically refers only to decision problems whereas NPhard is used to refer to optimization problems Sugih Jamin lamineecsumich edu PSPACE If a problem can be solved by an algorithm that uses an amount of space polynomial in the size of its input the problem is said to be in the set PSPACE It is known that P C PSPACE and NP C PSPACE but not whether P 7E PSPACE Sugih Jamin lamineecsumich edu Examples of NPComplete Problems Hamiltonian Cycle Problem Traveling Salesman Problem 01 Knapsack Problem Graph Coloring Problem can you color a graph using is 2 3 colors such that no adjacent vertices have the same color Sugih Jamin Gamineecsumich edu Graph Coloring Problem Bruteforce BnB algorithm a branch a bound Running time Theorem any planar graph can be colored using 4 colors Planar graph a graph that can be drawn on a plane such that no two edges cross each other Sugih Jamin lamineecsumich edu Application of Graph Coloring Problem Register allocation a want local variables in registers a each local variable represented as a node a if the lifetimescope of 2 variables overlap draw an edge between the two nodes 0 can the variables be stored in the is available registers Example for a 4register CPU can we store all local variables inside the for loop in the registers f int 11 int i j for i O i lt n i int u v Statements involving i u v juv Sugih Jamin lamineecsumich edu Graph Exercises 1 Strongly connected graph a Given the graph in Fig 1 how many minimal number of additional edges are needed to make the graph strongly connected b Draw the graph with the additional edges Fig 1 39 Fig 2 2 MST and SPF a Given the graph in Fig 2 assuming one builds an MST and an SPF starting from node A assign minimal nonnegative integer weights to the edges such that the MST is different from the SPF b Draw the resulting MST and SPF Sugih Jamin Gamineecsumich edu 2D Graphics Primitives Point xy Line 0 two end points 0 line formed by drawing all points in between the two end points Polygon 0 defined by vertices 0 closed all lines connected 0 draw one line at a time 0 color shading Sugih Jamin lamineecsumich edu Lines Two points 1341 2342 form a line y yl 92 91 513 5131 5132 5131 92 91 9 x x1y1 332 331 y mx I b 312 311 where m inf and b yl mxl Careful that we are usually only dealing with a line segment y3 mx3 b is on the above line segment iff MINlt 1315132 S 133 S MAXlt3315132 and MINQJL 312 S 93 S MAXQJL 312 Sugih Jamin lamineecsumich edu Line Intersection and Relative Position If y mlx b1 intersects y mgx 192 at 25 92 52 51 m ml mg and yr m1m2 5131 Given a line segment between 1341 and x2 y2 a point at x3 y3 is to the right of the line segment if 313 yl332 x1 3 1y2 311 lt O 313 311 312 311 that IS the slope in IS smallerthan the slope in Sugih Jamin Gamineecsumich edu Orientation of Three Points Given an ordered triplet p q r of points if going from p to q to r o the angle that stays on the left hand side is lt 7r they are said to make a left turn or is oriented counterclockwise o the angle that stays on the right hand side is lt 7r they are said to make a right turn or is oriented clockwise o the angle is 7r they are collinear 191x1 ylp22 312 and p33 93 make a left turn if 93 92 gt 92 91 3 2 2 1 Line intersection can also be determined by checking the orientations of three of the four end points Sugih Jamin lamineecsumich edu Convex Hull A polygon is simple if all its edges intersect only at its vertices A polygon is convex if it is simple and all its internal angles are lt 7r The convex hull of a set of points is the boundary of the smallest convex region that contains all the points in the region or on the boundary Think of tying a rubber band around a set of pegs nailed on a plank Useful for collision detection and path planning by robots including software robots in games for example Sugih Jamin lamineecsumich edu Jarvis s March or Gift Wrapping Algorithm Algorithm o identify a the anchor point of the convex hull with minimum ycoordinate and minimum xcoordinate if there are ties o the next convex hull vertex b is the point with the smallest polar angle with respect to a in case of ties pick the point with the largest xcoordinate a similarly the next vertex c has the smallest polar angle with respect to b etc What is the running time complexity of the algorithm Sugih Jamin Gamineecsumich edu Gift Wrapping Time Complexity Finding the anchor point takes 001 time Radial comparator compare the polar angles of two points with respect to a third by checking the orientation of the three points the next vertex on the convex hull will have a left turn to all other vertices with respect to the current vertex running time 01 For each vertex it takes 001 comparisons to find the smallest polar angle There are h vertices to the convex hull so the algorithm runs in 00171 time Worst case h n and running time is 0012 Such algorithms are said to be output sensitive Sugih Jamin lamineecsumich edu Graham Scan Algorithm Not output sensitive Algorithm 1 identify a the anchor point of the convex hull with minimum ycoordinate and minimum xcoordinate if there are ties 2 sort the remaining points using the radial comparator with respect to a 3 let H be the convex hull initially H a 4 consider the points in sorted order for each new point p o ifp forms a left turn with the last two points in H or if H contains less then 2 points add p to H 0 else remove that last point in H and repeat the test forp 5 stop when all points have been considered H contains the convex hull Sugih Jamin Gamineecsumich edu Running Time of Graham Scan Finding the anchor point takes 001 time Sorting the points takes On log n using heapsort for example Adding a new point takes at most 2n times Total time is On log n If h H lt log n Gift Wrapping algorithm is faster Sugih Jamin lamineecsumich edu Computing the Fibonacci Sequence What is a Fibonacci sequence How would you generate up to the nth Fibonacci numbers Sugih Jamin Gamineecsumich edu Computing the Fibonacci Sequence contd What is a Fibonacci sequence f0 Zoifl lifnzfn 1fn 2an2 2 Recursive implementation Iterative version int int rfibint n ifibint n assume n gt 0 assume n gt 2 return 11 lt 1 39 n int i fn rfibn lrfibn 2 fO Ofl 1 gm fori2ton Running time 92 fi filfi2 Preiss 343 return f n Running time n Preiss 1432 1441 Sugih Jamin Gamineecsumich edu Computing the Fibonacci Sequence contd Why is the recursive version so slow Why is the iterative version so fast Sugih Jamin Gamineecsumich edu Computing the Fibonacci Sequence contd Why is the recursive version so slow The number of computations grows exponentially Each rfib i z lt n 1 computed more than once Tree size grows almost 2quot Actually the number of base case computations in computing fn is fn Since fn gt quot 1 see Preiss Thm 39 complexity is Qquot Why is the iterative version so fast Instead of recomputing duplicated subproblems it saves their results in an array and simply looks them up as needed Can we design a recursive algorithm that similary look up results of duplicated subproblem Sugih Jamin Gamineecsumich edu Memoized Fibonacci Computation int fibmemon 0 l l l int mfibint n fibmemo assume n gt 0 and left to right evaluation if fibmemon lt O fibmemon mfibn 2 fibmemo mfibn l fibmemo return fibmemon Memoization or tabulation use a result table with an otherwise inefficient recursive algorithm Record in table values that have been previously computed Memoize only the last two terms int rfib2int fn2 fnl n assume n gt 0 return n lt 1 n rfib2fnl fn2fnl n main return rfib20 l n Sugih Jamin Gamineecsumich edu Devide et impera Divideandconquer o for base cases solve problem directly 0 do recursively until base cases reached 0 divide problem into 2 or more subproblems o solve each subproblem independently 0 solutions to subproblems combined into a solution to the original problem Works fine when subproblems are nonoverlapping othenvise overlapping subproblems must be solved more than once as with the Fibonacci sequence Sugih Jamin lamineecsumich edu Dynamic Programming a used when a problem can be divided into subproblems that overlap o solve each subproblem once and store the solution in a table 0 if run across the subproblem again simply look up its solution in the table 0 reconstruct the solution to the original problem from the solutions to the subproblems o the more overlap the better as this reduces the number of subproblems Origin of name Bellman 1957 programming planning decision making by a tabular method dynamic multistage timevarying process Sugih Jamin lamineecsumich edu Dynamic Programming and Optimization Problem DP used primarily to solve optimization problem eg find the shortest longest best way of doing something Requirement an optimal solution to the problem must be a composition of optimal solutions to all subproblems In other words there must not be an optimal solution that contains suboptimal solution to a subproblem Sugih Jamin lamineecsumich edu AllPairs Shortest Path Allpairs shortest path problem Given a weighted connected directed graph G V E for all pairs of vertices in V find the shortest smallest weighted path length between the two vertices First solution run Dijkstra s SPF algorithm V times Runtime complexity of Dijkstra s SPF OE log V OV2 log V Solution s runtime OV3 log V Sugih Jamin Gamineecsumich edu Floyd s Algorithm A dynamic programming method for solving the allpairs shortest path problem on a dense graph Floyd s algorithm 0 uses an adjacency matrix 0 initially 0 all nodes have distance 0 to itself D002 1 O 0 distance between directly connected nodes is the weight of the connecting edge D0u v Cu v 0 all other distances are set to 00 D00 w 2 oo 0 add nodes to V one at a time for each node 2 added compare all distances with and without using this node Dz law MNDz 1vw7 Di 1vavz Di 1vz a 10 Runtime OV3 Sugih Jamin lamineecsumich edu Floyd s Algorithm floydG D INFINITY DV V forall vw in E DVw CVW C for i O i lt n i for v O v lt n v for w 0 w lt n w DVW MINDVW DViDiW Sugih Jamin Gamineecsumich edu Floyd s Example init V 0 mm10 1m0m 5031 01mm ab Cd sumich edu Sugih Jamin Gamineec Floyd s Example V U a Va CL 00 051 CL 00000 00030 00030 d 00 00 0 1 1000 d 00 sumich edu Sugih Jamin Gamineec Floyd s Example V U b 0a b V Va 5 10 5 10 0200 00030 c430 1000 d 00 sumich edu Sugih Jamin Gamineec Floyd s Example V U c a b c V 0a b V 023 1 1 430 1 430 sumich edu Sugih Jamin Gamineec 8 1 DATA STRUCTURES AND ALGORITHMS Discussion slides for week of September 25 2007 The University of Michigan Outline Recurrence relations Recursive functions HW2 Unit testing Refactoring Pair Programming HIEIll Ff yf llllllilly 6amp2 DATA STRUCTURES AND ALGORITHMS Recurrence relations Recurrence relations are those that are defined in terms of themselves SO0 SnnSn1 What does the above function do How about a closed form 6612 1 DATA STRUCTURES 3 39 AND ALGORITHMS Recurrence relations closed forms Sometimes recurrence relations can be expressed in closed form SO0 SnnSn1 S n nn 12 6612 8 DATA STKU CTUKES 39 AND ALGORITHMS Recurrence relations Fibonacci How would we express the Fibonacci sequence as a recurrence relation What about the tribonacci sequence 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Recursive Functions Implement recurrence relations Need a base case Or else Why recursion Better Faster 6amp2 8 l DATA STRUCTURES AND ALGORITHMS HW2 Questions 1 and 2 deal with recurrence relations and recursive functions Any questions 662398 1 DATA STRUCTURES 39 AND ALGORITHMS Unit testing Unit testing is a methodology by which you test the software of your program Tests are unit tests because they usually operate on a small specific part of the system They are organized in classes We can implement an extremely simple version of unit testing with assert 612 1 DATA STRUCTURES 39 AND ALGORITHMS Unit testing PowerRalserh class PowerRaiser public PowerRaiser unsigned int base unsigned int getBase const unsigned int raise unsigned int power const private unsigned int basei c1122 DATA STRUCTURES AND ALGORITHMS Unit testing PowerRalsercpp PowerRaiserPowerRaiser unsigned int base basei base unsigned int PowerRaisergetBase const return basei unsigned int PowerRaiserraise unsigned int power const if O number return i else return basei raise power 7 l 1121 DATA STKU CTUKES 10 39 AND ALGORITHMS Unit testing PowerRaiserTesth class PowerRaiserTest public void runAllTests void testGetBase void testGetPower gcqg DATASTKUCTUKES 11 AND ALGORITHMS Unit testing PowerRaiserTestcpp void PowerRaiserTesttestGetBase PowerRaiser p 10 assert 10 pgetBase void PowerRaiserTesttestGetPower PowerRaiser p 3 assert l praise 0 assert 3 praise l assert 9 praise 2 assert 4782969 praise l2 assert 15625 PowerRaiser 5 assert 1000000 PowerRaiser void PowerRaiserTesttestAll testGetBase testGetPower raise 6 10 raise 1122 DATA STRUCTURES AND ALGORITHMS Unit testing Each time we make a change to the code base we run all unit tests to make sure that all of the functionality is still there If an error occurs it signals a bug We can figure out where it is with our tests identify it immediately and correct it Or if the bug cannot be resolved we can revert our code using SVN or CVS for example to the prior state Thus code repositories play a big part in unit testing Tutorial for CVS httpwwwcshmceduqrefcvshtml saga DATASTKUCTUKES 13 AND ALGORITHMS Unit testing This assertbased unit testing is all right but it would be nice to have a unit testing facility that didn39t break on the first error that occured There are lots of unit testing frameworks for C for example CxxTest CPPUnit Unit Boost Test Libraries QuickTest Lots for other programming languages as well It39s a good practice It will save you from a lot of headaches lower the cost of software development and if you know and do it it will make you a lot more appealing to potential employers 6612 8 l DATA STRUCTURES 14 39 AND ALGORITHMS Unit testing CxxTest example MyTestSuiteh include ltcxxtestTest8uitehgt class MyTestSuite public CxxTestzzTestSuite public void testAddition void TSiASSERT l l gt 1 TSiASSERTiEQUALS l l 2 c1122 DATASTKUCTUKES 15 AND ALGORITHMS Refactoring Refactoring is a practice of making your code less risky No functionality of your code will change after a refactoring but it will be a lot cleaner and safer The idea is to make sure all unit tests pass before refactoring then refactor then make sure all unit tests pass afterwards Otherwise revert the refactoring Refactoring is driven by code smells If something smells bad in the code then we need to refactor 6 E8 l DATA STRUCTURES 16 39 AND ALGORITHMS Code duplication prerefactor bad void myFunction int ary unsigned int arysize for unsigned int i O i lt arysize i ary i 1 do some stuff here for unsigned int i O i lt arysize i ary i 2 do some stuff here if myBool for unsigned int i O i lt arysize 2 i ary i 5 else for unsigned int i arysize 2 i lt arysize i ary i 5 6amp2 8 l DATA STKU CTUKES 17 39 AND ALGORITHMS Code duplication postrefactor good void doIterativeOp int ary unsigned int startIdx unsigned int endIdx int operand for unsigned int i startIdx i lt endIdx i ary i operand void myFunction int ary unsigned int arysize doIterativeOp ary O arysize l doIterativeOp ary O arysize 2 if myBool doIterativeOp ary O arysize 2 5 else doIterativeOp ary O arysize 2 5 11281 DATASTKUCTUKES 18 AND ALGORITHMS Code duplication postrefactor 2 void doIterativeOp int ary unsigned int startIdx unsigned int endIdx int operand for unsigned int i startIdx i lt endIdx i ary i operand void myFunction int ary unsigned int arysize doIterativeOp ary O arysize l doIterativeOp ary O arysize 2 doIterativeOp ary O arysize 2 myBool 5 5 DATA STRUCTURES AND ALGORITHMS 6612 8 l Pair Programming A discipline in which people program in pairs Two sets of eyes on the code instead of one reduces costly software errors according to one study by up to 30 Good way to transfer knowledge Also a good way to take the edge off of the sometimes boring task of programming an 8 1 DATA STKU cw RES 20 39 AND ALGORITHMS Outline Midterm next Thu 1025 68 pm in DOW 1013 Review session Tue 1023 68 pm in COOLG 906 Last time 0 Finished up LZW Today 0 AVL Tree 0 Amortized analysis 0 Multiway Search Tree 0 23 tree 0 Btree Sugih Jamin Gamineecsumich edu Unbalanced Tree Insert 1 2 3 4 5 6 7 into a BST Insert 4 2 6 1 3 5 7 into a BST From the second BST remove 2 insert 9 insert 8 insert 11 remove 8 remove 3 Moral even a balanced tree can become unbalanced after a number of insertions and removals Why is a balanced tree desirable When is a tree said to be balanced Sugih Jamin lamineecsumich edu Balanced Tree A perfect binary tree of height h has exactly 2h1 1 internal nodes Only trees with n 1 3 7 15 31 63 internal nodes can be balanced Need another definition of balance condition Want the definition to satisfy the following criteria 1 height of tree of n nodes Oog n 2 balance condition can be maintained efficiently 01 to rebalance a tree We will look at AVLtree 23 tree and Btree Sugih Jamin lamineecsumich edu AVL Tree Adel sonVel skii Landis AVL tree s balance condition A nonempty binary tree is AVL balanced if both Tl and Tr are AVL balanced and 71 th S 17 where hl is height of Tl and hr height of TV Br 1 Sugih Jamin lamineecsumich edu AVL Tree Balance Condition Does the AVL balance condition satisfy the desired criteria 1 height of tree of n nodes Oog n 2 balance condition can be maintained efficiently 01 to rebalance a tree The AVL balance condition satisfies criterion 1 Preiss Th 102 p 316 The height of an AVL tree with n internal nodes is Iog n Can the AVL balance condition be maintained in 01 Sugih Jamin lamineecsumich edu AVL Tree Insertion Removal and Balance Factor Insert and remove operations exactly the same as for BST but must rebalance the tree if AVL balance condition is violated after insertionremoval Define balance factor B of a node 139 as hli hm o if the tree rooted at node 7 is AVL balanced B g 1 o if Tn is deeper B lt 1 o if T12 is deeper B gt 1 Sugih Jamin lamineecsumich edu Unbalanced AVL Tree If Bgt gt 1 there are four cases to consider depending on the direction of the imbalance from the unbal anced node LL LR RL and RR For example LL a new node is added to ul BuO gt1gtB7a2 which violates the AVL balance condition and the tree rooted at r is now unbalanced Sugih Jamin lamineecsumich edu Rebalancing AVL Tree Want rebalancing operation to be 01 time complexity Preiss Th 103 p 322 When an AVL tree becomes unbalanced exactly one single or double rotation is required to balance the tree AVL rotations from the node T that has become unbalanced B9a gt 1 do LL RR LR or RL rotation depending on the direction of the imbalance from r Rotate counter to the direction of traversal For double RL or LR rotations reverse the effect of the last traversal first Must retain BST property at all times Sugih Jamin lamineecsumich edu Single LL Rotation o L rotate right make r the right child of u o L rotate right make ur the left child of r h2 Note Errata for Preiss Fig 108 Sugih Jamin Gamineecsumich edu Single RR Rotation o R rotate left make r the left child of v o R rotate left make I the right child of r Sugih Jamin Gamineecsumich edu Double LR Rotation o R rotate left do an RR rotation on u o L rotate right do an LL rotation on r After After RR Rotation LL Rotation h2 gt gt Sugih Jamin Gamineecsumich edu Double RL Rotation o L rotate right do an LL rotation on v o R rotate left do an RR rotation on r Br 2 After After LL Rotation RR Rotation gt gt h2 h2 h Sugih Jamin Gamineecsumich edu Exactly One Rotation Needed Pf of Preiss Th 103 a when adding a node only the height of nodes in the access path between the root and the new node can be changed a if adding a node doesn t change the height of node 139 in the access path no rotation is needed at z or its ancestors o if height of 139 changes it can either 0 remains balanced no rotation needed at i but may be necessary at its parent node see LL figure 0 becomes unbalanced after one rotation the height of subtree previously rooted at z is the same as before insertion so none of its ancestors needs to be rebalanced Sugih Jamin lamineecsumich edu Deletion in AVL Tree First delete node as with BST Then update ancestors balance factors and rebalance as needed Sugih Jamin Gamineecsumich edu Simple Amortized Analysis o algorithm must save up some credits before it can spend them 0 credits are paid out periodically in lump sum instead of peruse Analogy 0 Mom send some money or earn it 0 buy groceries once a week pay lump sum 0 eat groceries every day amortizespreadout paid sum Sugih Jamin lamineecsumich edu 8 1 DATA STKU CTUKES AND ALCOKITHMS Discussion slides November 14 2007 The University of Michigan Agenda Questions on anything Review Graph terminology Shannon Switching Game Graph problems Branch amp Bound 1122 DATA STRUCTURES AND ALGORITHMS Questions Any questions in general gagg DATA smuches AND ALGORITHMS Graph Terminology What is a graph formally What kinds of graphs can we have 66quot8 DATA STRUCTURES AND ALGORITHMS Graph Terminology Order of a graph Order of a node Complete graph How many edges does Kn have contrary to yesterday39s lecture Examples Spanning subgraph Induced subgraph Example of spanning subgraph which is not induced Example of induced subgraph which is not spanning Example of spanning induced subgraph 6612 8 1 DATA STRUCTURES AND ALGORITHMS Graph Terminology Parity of a node Isomorphism examples Are A and B isomorphic Degree sequence If two graphs have the same degree sequence are they isomorphic If two graphs are isomorphic do they have the same degree sequence Examples 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Graph Terminology Closed and open walks X0X1X2Xm Eulerian trail Konigsberg bridge problem Let39s try it out Can we find any generalizations When does such a trail exist 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Graph Terminology Eularian trail continued Closed all even degree vertices Open exactly two even degree vertices Hamilton cycle with nodes this time 6amp2 8 1 DATA STRUCTURES AND ALGORITHMS Graph Terminology Bipartite graph When is a graph bipartite Why 66quot8 DATA STRUCTURES AND ALGORITHMS Shannon Switching Game Yeah he39s that guy outside the building the statue Rules Two players positive and negative Let GV E U and V have been distinguished Negative player goal Destroy paths Positive player goal Connect U and V Never ends in draw Why Let39s try playing for a bit 11281 DATASTKUCTUKES 10 AND ALGORITHMS Shannon Switching Game Postive game neutral game negative game What39s an easy way to convert a neutral game to a positive game When does G produce a positive game 6amp2 8 l DATA STRUCTURES AND ALGORITHMS Shannon Switching Game Postive game neutral game negative game What39s an easy way to convert a neutral game to a positive game When does G produce a positive game Solution Iff there is a subset U containing u and v of the vertex set V such that the induced multisubgraph X has two spanning trees T1 and T2 with no common edges What does this mean 6amp2 81 DATASTKUCTUKES 12 AND ALGORITHMS Graph Questions Draw a connected graph whose degree sequence equals 5 4 3 3 3 3 3 2 Does a graph of order n with at least n1n22 1 edges have to be connected Is it possible to have a disconnected graph of order n with one fewer edge Why 6amp2 81 DATASTKUCTUKES 13 AND ALGORITHMS Branch and Bound How is BFS better than DFS How is DFS better than BFS Sometimes called Bestfirst search 6612 I DATA STRUCTURES 39 AND ALGORITHMS Branch and Bound Basic idea Start off with infinite cost and null solution Try to search most promising configurations first Compare with current best Update if necessary Backtrack if necessary A search How does that work What39s the purpose of Branch and Bound and A Implications for PA3 gags 1 DATA STRUCTURES AND ALGORITHMS Outline Announcements 0 PA1 will be online latertoday Due Thu 104 100 pm online submission Last time o ComputeRank in N log N time o BubbleSort 0 Review of asymptotic analysis 0 Empirical performance evaluation Today 0 Review of asymptotic analysis 0 Review of foundational data structures Sugih Jamin Gamineecsumich edu Asymptotic Algorithm Analysis An algorithm with complexity fn eg fn n fn 2 n2 etc is said to be not slower than another algorithm with complexity gn if fn is bounded by gn for large n See Fig 41 Commonly written as fn Ogn read fn is bigOh gn aka the asymptotic or bigOh notation Definitions fn Ogn iffEI c gt 0710 2 O Vnn Z n0fn g 6 901 O9nfn13 c gt 0710 2 0 Wm 2 7100 fn c 900 and c is a constant ie it doesn t change with n So more accurately fn E Ogn But NOT fn S O9n Sugih Jamin Gamineecsumich edu BigOh Fallacies Part I Fallacy 0 If fn Ogn gt fn NOT Fallacy 1 Let f1n hn2 and f2n hm gt f1n f2n Therefore if f1n 0012 and f2n 0012 gt f1n f2n NOT See Fig 41 Fallacy 2 fn Ogn gt 901 01fn NOT There s no such thing as a 0 10 function Landmark Functions Wellknown functions you compare the complexity of your algorithm to in order of increasing complexity see Figs 42 43 constant 01 logarithmic Oog n square root O linear or Ohn 0n nlogn On log n quadratic 0n2 cubic 0n3 in general polynomial 001k 1 2 1 exponential 0aquot a gt 1 usually a 2 factorial 0n Sugih Jamin Gamineecsumich edu Tight Bound While fn O9n 75 K71 901 you nevertheless want to pick a gn as close to fn as possible then it is called a tight bound Sugih Jamin Gamineecsumich edu BigOh Rules RUe1f1n O91n7f2n 09201 the f1n f2n 0ma91n792n Example f1n n3 E 0013 f2n 712 E 0012 19 f1n f2n 0013 Rule 2 f1n 091n f2n Og2n then f1n f2n 091n 9201 If your code calls a function within a loop the complexity of your code is the product of your code s complexity times the complexity of the function you call Rule 3 n O9ngn 00101 the n 00101 Sugih Jamin Gamineecsumich edu BigOh Fallacies Part II Fallacy 3 Let 9101 gtlt 9201 If H S C 9101 and C 9201 is H O91n Fallacy 4 Let f1n 0910 0 f2n 092n DOGS 9101 lt 9201 gt f1n lt f2n Siblings of BigOh bigOmega 2 lower bound For fn gt O v n Z Ofn Qgn if Elna gt Ocgt OVn Z n0fn Z cgn h1n 001201 ltgt h2n 9072101 bigTheta fn gn iff fn O9n and fn Q9n fn grows as fast as gn Cousins of BigOh littleoho fn 0gn if fn Ogn but fn 7E gn That is if gn is not an asymptotically tight upper bound of fn In long hand fn 0gn if v c gt O Vn n 2 no gt fn g cgn In contrast to O o is forall c gt 0 whereas 0 only requires there exists c gt 0 Example 2712 0012 is asymptotically tight but 2n 0012 is not So 0 is sloppier than 00 which is why we use it more often littleomegaw fn wgn iffgn 0fn Sugih Jamin Gamineecsumich edu Relatives of BigOh cont D068 n gn gt 901 9001 D068 n gn gt fn 90 Does fn gn gt fn is in the same order as gn In the Limit FYI 00 fn 0gn cgt fn c1 mm and Iimnm c1 90 fn mm cgt fn 2 c2 mm and Iimnm c2 90 fn gn ltgt both Iimnaoo g c1 and v Iimnaoo g CQ 00 fn ogn lt Iimnaoo o v wo fn wagon lt Iimnmgg g 00 v Sugih Jamin Gamineecsumich edu Summary asymptotic analysis deals only with LARGE n 0 provides a shorthand to express upper bound it is not an exact notation be careful how big c is be careful how big no must be 0 run experiments to check the common case behavior 0 tradeoff space vs time o be scalable but optimize only when and where necessary don t lose the forest 0 think outside the box Done with Preiss Chs 2 and 3 with the exceptions per Syllabus Sugih Jamin Gamineecsumich edu Evaluation Questions 1 Let fn n 1n2c 712 can we say that fn 001 If so why If not what is the bigOh of fn Why 2 Is log n On log n 0012 Which is a tigher bound 3 Given 4 consecutive statements 31 SQ S3 84 Let 31 Oog n 32 Oog n3 S3 001 S4 03n What is the bigOh time complexity of the four consecutive statements together Prove it mathematically using the definition of bigOh 4 You ran two programs to completion Both have running time of 10 ms Can you say that the two programs have the same bigOh time complexity Why or why not Sugih Jamin Gamineecsumich edu Foundational Data Structures Array and linked list Why are they called foundational data structures What are abstract data types For example Sugih Jamin Gamineecs umichedu Array What is an array Name 2 advantages of using an array What are the disadvantages of using array Sugih Jamin Gamineecsumich edu Linked List What is a linked list Why use linked list Name three disadvantages of linked ist Sugih Jamin Gamineecsumich edu Array vs Linked List Timing complexity of findJength prepend append deleteJast accessith searchkey replaceith insertafterith deleteith Sugih Jamin Gamineecsumich edu Linked List Optimization o tail pointer Section 42 advantage a circular list with or without sentinel Section 42 advantage 0 freelist with elements allocated in batches Section 132 Sugih Jamin Gamineecsumich edu Common Bugs Array 0 index variable not initialized o bounds not checked 0 pointers in array not deallocated memory leak 0 when moved or realloced pointers to array elements not moved Linkedlist a head and tail not initialized to NULL 0 extraction or insertion dangles pointer delete last preV gtnext not set to NULL 0 deallocation doesn t walk down list memory leak o accessing deallocated element Sugih Jamin Gamineecsumich edu 3 38 Discussion slides for week of October 1 2007 The University of Michiran Agenda Inheritance Polymorphism review Access public private protected friends Circular dependencies and forward declarations Tree Binary Trees Binary Search Trees 112 8 1 DATA STRUCTURES 39 AND ALGORITHMS Inheritance and Polymorphism Allow us to specify relationships between types Abstraction generalization specification The is a relationship Examples Why is this useful in programming Allows for code reuse More intuitiveexpressive code Code Reuse General functionality can be written once and applied to any derived class Derived classes can have selective specialization oy adding members or implementing virtual functions Interfaces and Implementation There are two things that can be Inherited A particular interface A particular implementation Interfaces and Implementation There are two things that can be Innerlteo A particular interface A particular implementation The virtual keyword means that the interface is inherited but not necessarily the implementation Pure Virtual and Abstract Adding 0 maKes a function pure Virtual No implementation is given Must be implemented by concrete derived classes Any class with at least 1 pure virtual member is abstract and cannot be instantiated Often used to define pure interface classes with no functionality Example Virtual VD HunVirtual Class Base Class Derived i ublie Base 7 public public void printO void printO cout ltlt Base cout ltlt Derived What is the output Derived 0 Base pB ampd Derived gtXltpl 6262 pBgtprint pDgtprint Virtual vs Nonvirtual Virtual functions are interfaces All derived types will have the functionality so the function can be called on any instance of the base Implementations are only suggestions Pure virtual functions are only interfaces Nonvirtual functions should be considered mandatory implementations for derived classes Don t override these or you get unintuitive behavior Base Class class TodoList public virtual int isempty return 1 virtual void add T elt cerr ltlt quotTodoListadd not definedn virtual Tnext cerr ltlt quotTodoListnext not de nedn Virtual Int typeu cerr ltlt quot I oaoListtype not de nednquot j 11 Derived Classes class TodoQueue public TodoListlt Tgt private int mytype Queue q public TodoQueue mytypeTODOQ 0 Virtual Int Isempryu returnlqSIze j virtual void add T elt q enqueueelt virtual Tnext if qsize returnqdequeue return 0 virtual int type return mytype When to use InheritancePolymorphism Do you have typesclasses that share functionality Are you tempted to implement identicalsimilar functionality in several places Can you define a natural hierarchy using isa relationships Using templates vs virtual functions Does the type T change the behavior of the class If no use a template If yes use virtual functions Multiple Inheritance Use spanneg and cautiously The deadly diamond Multiple Inheritance Usnng quotVirtualquot Inheritance reselves amblgwty Ensures that a single instance of the base class is inherited Class B public virtual A Class C public virtual A Class D L ublz39c A L ublz39c B K Access Control Private Access only within the class Public Access from anywhere Protected Access within this class and any derived classes Why do we make anything private Friends Functions and Classes Allows access to private members on a case bycase basis class myCass int m yPriva te Var friend void myFriendFunctionO friend class myFriendCass L Classes and Structs What are the differences between Classes and structs in c Classes and Structs What are the differences between Classes and structs in c Classes default to private access Structs default to public access What about default type of inheritance Private ConstructorsDestructors can you have private constructors and destructors Private ConstructorsDestructors can you have prIvate constructors and destructors Yes but why Gives you very explicit control over creationdeletion Use friends to declare explicit ownership quotsingletonquot deSIgn pattern Reference counted objects Public vs Private Inheritance Public inheritance public members stay public in the derived class is a relationship Class Derived public Base Private inheritance public members become private in the derived class implemented as a Class Derived private Base Circular Dependencies 0 HOW do you reselve CIFCUIar dependenCIeS39 First try to redesign without the dependency Forward Declarations Let the compiler Know that the definition is coming later Must use pointers size isn t known yet Binary Trees Root Height Size R Full Binary Tree Complete Binary Tree Outline PA1 Past Due Last time o Binary Search Tree BST Binary Space Partition BSP Tree o MinHeap and MaxHeap a Priority queue Today a Trie o RLE and Huffman encoding Sugih Jamin Gamineecsumich edu E trie from retrieval rhymes with try to differentiate from tree A trie a tree that uses parts of the key as opposed to the whole key to perform search Whereas a tree associates keys with nodes a trie associates keys with edges Example for set of strings on owe owl tip to Note the handout s external node is our leaf node not our external null node Sugih Jamin lamineecsumich edu Trie Definition Let S be a set of strings from alphabet X such that no string in S is a prefix of another A trie for S is an ordered tree T such that a each edge of T is labeled with symbols from Z a the ordering of edges of a node follows the canonical ordering of Z a labels of edges on the path from the root to any node in T forms a prefix of a string in S An nary trie is usually implemented with labels for the edges stored either at the children nodes or at the parent node Sugih Jamin lamineecsumich edu Text Compression Lossless vs lossy compression 0 runlength encoding o statistical modelling Huffman coding arithmetic coding not covered a dictionarybased encoding L277 L288 L278 LZW Readings a Dictionary method L277 L288 L278 0 LZW httpwwwdogmanetmarknarticlesIzwIzwhtm LZW animation httpwwwdatacompressioncomIempelzivhtm Sugih Jamin Gamineecsumich edu Run Length Encoding A very simple encoding method for each repeated element output the element and the count Example 11111111000001111000011110000011111110000 Output 8150414041507140 Sugih Jamin Gamineecsumich edu Huffman Encoding Example string If a woodchuck could chuck wood ASCII encoding 8 bitschar gt requires 256 bits to encode store in binary the string Observe there are only 13 distinct symbols in example string so 4 bitschar is sufficient to encode the string gt requires 128 bits Huffman encoding s main ideas 0 variablelength encoding use different number of bits code length to represent different symbol 0 entropy encoding assign smaller code to more frequently occuring symbol Goal Zlc fc minimized where c is each unique symbol in string c length of its code and fc its frequency Sugih Jamin Gamineecsumich edu Huffman Encoding contd If a woodchuck could chuck wood Can be encoded using the following codes for example swnbolc eqfc codec 1 11111 f 1 11110 a 1 11101 1 11100 l 1 1101 W 2 1100 d 3 101 u 3 100 h 2 0111 k 2 0110 o 5 010 c 5 001 5 000 Takes only 111 bits to encode the string Sugih Jamin Gamineecsumich edu Huffman Encoding Code table construction 0 want more frequently occurring symbols to have shorter codes 0 given that it is a variable length code so that there ll be no confusion when decoding no code can be a prefix of another hence known as a prefixfree code or simply prefix code Huffman code decoding 0 need the code table cost amortized over message 0 variable length code but no code is a prefix of another Sugih Jamin lamineecsumich edu Huffman Trie Better known as Huffman tree Huffman tree construction algorithm count the frequency of occurrence of each symbol make each symbol a leaf node with its frequency as its weight repeatedly combine trees with smallest weight first break ties arbitrarily weight of combine tree is the sum of its two children s encode each symbol as the path from the root with a left represented as 0 right 1 Example Sugih Jamin Gamineecsumich edu Huffman Tree Construction contd Characteristics of Huffman trees 0 higher frequency symbols at shallower depth 0 since all symbols are leaf nodes no code is a prefix of another Construct Huffman tree from the Z elements Z alphabet size 0 implement as a MinHeap where the key is the frequency of occurrence of each element of Z a take the two smallest elements off MinHeap O 0 make a tree of them with the key of the new root node being the sum ofthe keys of the two children 0 a put new tree back into MinHeap O Total construction time 0 Sugih Jamin Gamineecsumich edu Encoding Time Complexity Running times n string length Z alphabet size 0 frequency count 001 o Huffman tree construction OZ log Z c Total time 001 Z log Z For binary data treat each byte as a character Sugih Jamin lamineecsumich edu Compressing the Huffman Code Table The Huffman code for any particular text is not unique For example the following are all acceptable symbolc freq fc code Cc C c C39 c 5 000 001 000 c 5 001 010 001 d 3 101 100 010 o 5 010 000 011 u 3 100 101 100 l 2 1101 0110 1010 h 2 0111 1101 1011 k 2 0110 1110 1100 W 1 1100 0111 1101 a 1 11101 11100 11100 f 1 11110 11101 11101 1 11100 11111 11110 1 11111 11110 11111 The last column can be compressed into 339 39cdou4 lhkaaflI Sugih Jamin lamineecsumich edu Compressing the Huffman Code Table contd The last column was not created from a Huffman tree directly the Huffman tree is used only to determine the code length of each symbol then order symbols by code length set the first of the shortest code length to all Os add 1 to the code of each subsequent symbol hOONA when transiting from code length is to code length is 1 add 1 to the last lengthk code and use that as the prefix for the first length is 1 code 5 set the k 1st bit to O and repeat from Step 3 Code tables that follow these rules will have only one encoding for each symbol and still be prefixfree Sugih Jamin lamineecsumich edu 3 38 Discussion slides for week of October 17 2007 The University of Michiran Agenda Constructors and Destructors PA2 questions discussion Bits and binary operations Binary IO Files iosbinary Gprof subversioncvs brief 981 DATA STKU CTU RES AND ALGORITHMS Constructors What are constructors used for Custom initialization When a variable comes into scope First space is allocated for it The variable IS initialized Constructor ca led Can a constructor have parameters Example What about new statements 112 8 1 DATA STRUCTURES 39 AND ALGORITHMS Destructors What are destructors used for Custom dealocation Can destructors have parameters When a variable goes out of scope o Destructor caled Space is dealocated Local variables are destroyed in the reverse order of creation Useful for dealocated all virtual memory used in the class 112 8 i DATA STRUCTURES 39 AND ALGORITHMS Example class Dog pub l i C 1 for the termmatmg 39 039 char name int age Dogchar name Rex int age0 I thisgtnamenew charstrlenname 1 constructor strcpythlsgtnamename thisgtageage 30th ltlt name ltlt quotis born ltlt endl Doggt destlucml cout ltlt name ltlt quotis gone ltlt endl delete name leleases clynnnncn y allocated 111611101339 back to the heap a DATA STRUCTURES 281 AND ALGORITHMS Example contd unphclt COllStl llCtOl39 call main Dog dogA Fido 3 dogB cout ltlt quotHere boy ltlt endl if true Dog dogC quotSpotquot dhudoum cout ltlt quotGood girl ltlt endl 2ck nwhwcdh Output Fido is born Rex is born Here boyl Spot is born Spot is gone Good girll Rex is gone Fido is gone 66 DFJASTRMCTURiS 39 AND ALGORITHMS Bits and Binary Why do we care a A deep understanding of computers requires an understanding of what is happening at the bit level a Common interview questions Application examples 7 Device drivers parts of operating systems a Heavily optimized code 7 Embedded devices a Lowlevel file manipulation eg compression 7 Storing arbitrary data formats efficienty 112 8 i DATA STRUCTURES 39 AND ALGORITHMS Data Representation Standard variable types hold sets of bits 7 Char 1 byte 1001 0001 7 int 24 bytes 1 001 0001 1 001 0001 7 long 4 bytes 1 001 0001 1 001 0001 1 001 0001 1 001 0001 Unsigned flag is often useful to prevent any of the bits being Interpreted as a negation ag We often inter ret chars as ASCII codes but this is only a partcu ar Interpretation of the bits Assigning 00010000 to a char 7 charx 16 decimal 7 charx 020 octal 7 charx 0x10 hexl 112 8 i DATA STRUCTURES 39 AND ALGORITHMS Bit Manipulations amp AND 1010 amp 1001 OR 1010 1001 A XOR 1010 quot1001 complement 1010 ltlt left shift 1010 ltlt 2 gtgt right shift 1010 gtgt 1 11281 DATASTKUCTUKES AND ALGORITHMS Bit Manipulations Masking 7 How to extract the middle 4 bits from 01011100 7 How to clear the middle 4 bits from 01011100 Packing 7 How do we combine 0X04 and 0X03 into a single byte 1 DATA STRUCTURES 10 I AND ALGORITHMS Bit Fields Bit fields are sometimes useful for storing bits in a compressed structure In memory Not directly useful for 10 Bit shiftsoperations can accomplish the same thing struct DISKREGISTER unsigned ready1 unsigned erroroccured1 unsigned diskspinning1 l DATA STRUCTURES 11 I AND ALGORITHMS Files are streams of bits 010010010101010101010101010100110 000100101011111101001001010100101 010100110110010010101001010101010 100000000011110100100011110010100 112 8 i DATA STRUCTURES 39 AND ALGORITHMS Symbols are just chunks of bits 010010010101010101010101010100110 000100101011111101001001010100101 010100110110010010101001010101010 100000000011110100100011110010100 11358 1 DATA STRUCTURES 39 AND ALGORITHMS IO for binary files Upen mes usmg binary mode Prevents interpretations of newlines spaces etc from causing trouble ifstream myFile quotdatainbinquot ios3939in iosbinaly l I DATA STRUCTURES 14 I AND ALGORITHMS Reading and Writing Bits Can only readwrite on byte boundaries Don t use gtgt and ltlt insertion extraction 7 These are for formatting Use readwrite functions for blocks of chars 7 No nu character appended no interpretation Just the raw bits 7 May not want to read ever thin at one time for very large files use blocks eg 1KB at a time 112 8 l DATA STRUCTURES 39 AND ALGORITHMS Reading and Writing Bits Read from a le char buffer100 ifstream myFile quotdatainbinquot ios3939in ios3939binaly myFileread buffer 100 do stuff With the data Write to a le char buffer1 00 ll the buffer With the data you want ofstream myFile quotdataoutbiuquot ios3939out ios3939binaly myFile write buffer 100 l DATA STRUCTURES 16 I AND ALGORITHMS iosbinary Used while opening les mylnStreamopen eNametxt iosin iosbinaly Does it convert the data in le to binary Trick Question The les are already in binary Duh What iosbinary does is prevent interpretation of end ofline characters etc 1 DATA STRUCTURES 17 I AND ALGORITHMS Viewing binary files A text editor interprets les as ASCII which won t work well for generic binary les 7 xxd hex ASCII 7 xxd b binary ASCII 7 hexdump C hex ASCII 1 DATA STRUCTURES 18 I AND ALGORITHMS xxd output 0006530 0000 0016 0000 0a26 000b 4601 000d c605 0006540 0001 d002 0037 2400 0f84 d002 5e84 d002 0006550 0e84 0000 5d84 0000 1184 98fe 6084 98fe 0006560 0f00 0037 2400 0f84 3804 5e84 3804 0e84 78quot8 0006570 0000 5d84 0000 1184 0000 6084 0000 0016 0006580 0000 0a26 010b 4601 000d c605 0001 a005 ampF 0006590 0037 2400 0f84 a005 5e84 a005 0e84 0000 7quot 0006cf6 00000000 00000000 01 101010 00000000 00000000 00000000 0006ch 00000000 00000000 00000000 00000000 00000001 00000000 0006d02 01001111 00000000 01101100 00000000 01100101 00000000 Oe l DATA STRUCTURES 19 I AND ALGORITHMS Profiling A profiler allows you to identify which pieces of code are taking the most time 7 Allows you to focus optimization effort where it is necessary Io maKe your program 5 faster 7 Optimizing a function that takes 1 of the total time 7 Optimizing a function that takes 10 of the total time 7 Optimizing a function that takes 50 of the total time 112 8 i DATA STRUCTURES 39 AND ALGORITHMS gprof Compile with the pg flag Run the program as usual 7 Will run somewhat slower You will have a gmonout le in your directory 7 This is overwritten each time the program is run Run gprof to examine the data gprof options executable le profiledata es gt outfie l DATA STRUCTURES 21 I AND ALGORITHMS gprof flat profile output Flat profile Each sample counts as 001 seconds cumulative self total Elm seconds calls 115quotCall LIBcall 3 50 5 43000 312 312 2 6 9 2 1 12 166667 14156 67 6 0 S 039 040 040 12 000 000 090 12 000 000 040 l 000 000 090 l 000 000 0 40 l 000 17000000 1111 Dve law IOfileoverflow IOpuc Life update volt stdicbnf39 39cverflcw inc stdicbu syswrite char const int 0519am paratnrltlt char internalimcaunt Life print void toconc void Life initialize void instructions void main K rn DATA STRUCTU RES AND ALGORITHMS m2 81 gprof call graph index time self called name 002 1212 main 2 11 425 002 015 12 updaevoid 1 015 0 00 4200043000 39 hborcuurgtttint int 4 000 017 11 3tarc 3 L2 425 000 017 1 002 015 1212 000 000 12f12 000 000 1212 000 000 11 000 000 11 L31 42 5 0 0 0 17 00 0 17 111 15 000 4500043000 Lifeupdatevoid 1 4E 375 015 000 43000 Lifeneighborcnuncint inc 4 868 l DATA STRUCTURES 23 I AND ALGORITHMS Version Control Subversion svn cvs proprietary tools Useful for individual projects crucial for multiple developers Functionality central repository for code version history rollbacks revision merging conflict detection resolution assistance cnange lists logging branch management tagging 3amp2 839 DATA STKU CTU RES 24 AND ALGORITHMS Typical Use Create a repository Import some les into it Check out a local working directory Make changes Update to latest version Fix any con icts test changes in latest version Check in your changes 1 DATA STRUCTURES 25 I AND ALGORITHMS Outline Last time a Review of PNP and 2 approximate solution to TSP 0 Dynamic Programming 0 01 Knapsack o Travelling Salesperson Today 0 finish up Bitonic merge o Longest Common Subsequence problem Sugih Jamin lamineecsumich edu Longest Common Subsequence LCS Examples of LCS applications 0 approximate matching eg spell checker a file comparison used in diff cvs cheater finder o searching for proteins with similar DNA sequences etc For string GDVEGTA GVEA is an example subsequence of the string Contrast o substring elements in the substring must be consecutive in the string 0 subsequence not Example given 2 strings GOVEGTA and GVCEKST the longest common subsequence is GVET note GET is also a common subsequence Sugih Jamin lamineecsumich edu LCS Definition More formally 0 given 2 sequences a1 n and b1 m a find subsequences az391 ik and bj1 jk where 139 lt n1 and j lt jl1 0 such that az 1 2 bj1ai2 bj2 aik 2 bij o for k as large as possible Brute force approach Running time Sugih Jamin lamineecsumich edu Brute Force LCS Given strings a and b of length n and m respectively a each element of a is either in or not in a subsequence a there are 2 different subsequences a each subsequence must be matched against b s m elements a Running time 0m2n Sugih Jamin lamineecsumich edu Dynamic Programming for LCS Define Lz j k to be the length of the LCS of a1z39andb1jwhere1 gignandlgjgm Question Can we find overlapping subproblems such that the optimal solution to the LCS problem consists of optimal solutions to the subproblems Would the optimal solution for Li j consists of optimal solutions for Lzquotj where 1quot g z and j g j Use similar approach to solving for the 01 Knapsack problem consider adding or not adding the last item to the solution of the subproblem Sugih Jamin lamineecsumich edu DP for LCS Example Compute the LCS of GTTCCTAATA and CGATAATTGAGA by filling in the table L AQ01233345666 GH01233345556 Am01233345556 G901233344455 T801233344455 T701233333455 A601222223444 A501222223334 T401222222233 A301111112222 G201111111111 0100001111111 000000000000 bW0123456789m a GTTCCTAATA Running time sumich edu Sugih Jamin Gamineec DP for LCS Let Li O LOj O k the length ofthe LCS Case 1 if ai 7E bj one ofthem cannot be in the LCS of Lz j so either Lz j Lz j 1 or Lz j Li 1j Since we want the longest common subsequence Lz j MAXLz j 1Lz 1j Case 2 if ai 2 bLj then Li 1j 1 must be is 1 Thus Lz j is optimal and consists of optimal solutions to the subproblems Proof by contradiction assume Li 1j 1 2 k then adding az39 to a1 z 1 and bj to b1 j 1 S makes Lz j 2 k 1 gt contradiction o O ifz O orj O Lz j Lz 1j 11 ifz jgt0andaz bj 1 MAXLz 13 Lz j 1 otherwise Sugih Jamin Gamineecsumich edu LCS Extraction Use stack to accummulate LCS in reversed order Start from Ln m o if Li 2 LLj found a match push Lz move to Li 1j 1 0 otherwise move to the larger of Li 1j and Lz j 1 if Li 1 j 2 Lz j 1 always move horizontal or vertical Alternative algorithm again start from Ln m o ifLz39j 2 Li 1j move to Li 13 0 else ifLz39j 2 Lz j 1 move to Lz j 1 o the above two steps can be swapped 0 else push Lz move to Li 1j 1 Pop stack if there are multiple common subsequences with the same length each of the above extracts only one Sugih Jamin Gamineecsumich edu

### BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.

### You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

## Why people love StudySoup

#### "There's no way I would have passed my Organic Chemistry class this semester without the notes and study guides I got from StudySoup."

#### "I made $350 in just two days after posting my first study guide."

#### "There's no way I would have passed my Organic Chemistry class this semester without the notes and study guides I got from StudySoup."

#### "It's a great way for students to improve their educational experience and it seemed like a product that everybody wants, so all the people participating are winning."

### Refund Policy

#### STUDYSOUP CANCELLATION POLICY

All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email support@studysoup.com

#### STUDYSOUP REFUND POLICY

StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here: support@studysoup.com

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to support@studysoup.com

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.