# ADV DATA STRUCTURES CSC 3102

LSU


These 248-page class notes were uploaded by Miss Emery VonRueden on Tuesday, October 13, 2015. The class notes belong to CSC 3102 at Louisiana State University, taught by B. Karki in Fall. Since its upload, the document has received 29 views. For similar materials see /class/222855/csc-3102-louisiana-state-university in Computer Science at Louisiana State University.


Date Created: 10/13/15

## Notion of Algorithm

CSC 3102 — B. B. Karki, LSU

Program = Algorithms + Data Structures

> N. Wirth, *Algorithms + Data Structures = Programs*, Prentice-Hall, Englewood Cliffs, NJ, 1976.

### What Is an Algorithm?

- An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.
- Algorithms are procedural solutions to problems.
- The computer must be capable of understanding and following the instructions given.
- An input specifies an instance of the problem the algorithm solves.

### Motivation for Studying Algorithms

- Theoretical importance: the study of algorithms represents the core of computer science.
- Practical importance: a practitioner's toolkit of known algorithms, and a framework for designing and analyzing algorithms for new problems.

### GCD Algorithms

- Problem: find the greatest common divisor of two nonnegative, not-both-zero integers m and n, denoted gcd(m, n).
- Several algorithms exist for solving this problem:
  - Euclid's algorithm: apply repeatedly the equality gcd(m, n) = gcd(n, m mod n) until m mod n is 0; the last value of m is the gcd.
  - Definition-based algorithm: start by checking whether t = min(m, n) divides both m and n. If it does, t is the answer; if not, decrease t by 1 and try again. The last value of t that divides both integers is the gcd.
  - Middle-school algorithm: find the prime factors of both m and n, and identify all the common factors, whose product is the gcd.

### Sequential Search Pseudocode

    Algorithm SequentialSearch(A[0..n-1], K)
    // Searches for a given value in a given array by sequential search
    // Input: an array A[0..n-1] and a search key K
    // Output: the index of the first element of A that matches K,
    //         or -1 if there are no matching elements
    i ← 0
    while i < n and A[i] ≠ K do
        i ← i + 1
    if i < n return i
    else return -1
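A minimal Python rendering of the sequential-search pseudocode above (0-based indexing; the function name is mine):

```python
def sequential_search(a, key):
    """Return the index of the first element equal to key, or -1 if absent."""
    i = 0
    # Mirror the pseudocode: scan until a match or the end of the array
    while i < len(a) and a[i] != key:
        i += 1
    return i if i < len(a) else -1
```

The loop makes at most n key comparisons, which is the basic operation counted when this algorithm's efficiency is analyzed later.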
### Binary Search Pseudocode

    Algorithm BinarySearch(A[0..n-1], K)
    // Implements a nonrecursive binary search
    // Input: an array A[0..n-1] sorted in ascending order and a search key K
    // Output: an index of the array's element that is equal to K,
    //         or -1 if there is no such element
    l ← 0; r ← n - 1
    while l ≤ r do
        m ← ⌊(l + r) / 2⌋
        if K = A[m] return m
        else if K < A[m] r ← m - 1
        else l ← m + 1
    return -1

### Characteristics of an Algorithm

- Finiteness: terminates after a finite number of steps.
- Definiteness: rigorously and unambiguously specified.
- Input: valid inputs are clearly specified.
- Output: can be proved to produce the correct output given a valid input.
- Effectiveness: steps are sufficiently simple and basic.

## Recursive Algorithms

### Definition and Examples

- A recursive algorithm invokes (makes reference to) itself repeatedly until a certain condition is met.
- Examples: computing the factorial function, the Tower of Hanoi puzzle, the number of digits in a binary representation.
- A nonrecursive algorithm is executed only once to solve the problem.

### Analyzing the Efficiency of Recursive Algorithms

Steps in the mathematical analysis of recursive algorithms:

- Decide on a parameter n indicating input size.
- Identify the algorithm's basic operation.
- Determine the worst, average, and best cases for input of size n, if the basic-operation count also depends on other conditions.
- Set up a recurrence relation and initial conditions for C(n), the number of times the basic operation will be executed for an input of size n (alternatively, count recursive calls).
- Solve the recurrence to obtain a closed form, or estimate the order of magnitude of the solution (see Appendix B).

### Example 1: Recursive Evaluation of n!

- Definition of the factorial function: F(n) = n! = 1 · 2 · … · (n-1) · n, with 0! = 1.

    Algorithm F(n)
    // Computes n! recursively
    // Input: a nonnegative integer n
    // Output: the value of n!
    if n = 0 return 1
    else return F(n - 1) * n

- Recurrence relation for the number of multiplications:
  M(n) = M(n-1) + 1 for n > 0, with M(0) = 0 (initial condition).
- Solve the recurrence using the method of backward substitution:
  M(n) = M(n-1) + 1 = M(n-2) + 2 = M(n-3) + 3 = … = M(n-i) + i = … = M(n-n) + n, so M(n) = n.
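The factorial recursion and its multiplication count M(n) = n can be checked with a small sketch (function names are mine):

```python
def factorial(n):
    """Recursive evaluation of n! following F(n) = F(n-1) * n, F(0) = 1."""
    if n == 0:
        return 1
    return factorial(n - 1) * n

def mult_count(n):
    """Solve the recurrence M(n) = M(n-1) + 1 with M(0) = 0 directly."""
    return 0 if n == 0 else mult_count(n - 1) + 1
```

Running `mult_count` for any n reproduces the closed form M(n) = n obtained by backward substitution.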
### Example 2: Tower of Hanoi Puzzle

- Given n disks of different sizes and three pegs. Initially all disks are on the first peg in order of size, the largest on the bottom.
- Problem: move all the disks to the third peg, using the second one as an auxiliary. One can move only one disk at a time, and it is forbidden to place a larger disk on top of a smaller one.
- Recursive solution, in three steps:
  - First, move recursively n-1 disks from peg 1 to peg 2, with peg 3 as auxiliary.
  - Move the largest disk directly from peg 1 to peg 3.
  - Move recursively n-1 disks from peg 2 to peg 3, using peg 1 as auxiliary.
- See http://www.mazeworks.com/hanoi for an interactive version.

### Tower of Hanoi: Recurrence Relation

- Recurrence for the number of moves:
  M(n) = M(n-1) + 1 + M(n-1) = 2M(n-1) + 1 for n > 1, with M(1) = 1 (initial condition).
- Solve the recurrence (tree of recursive calls) using the method of backward substitution:
  M(n) = 2M(n-1) + 1 = 2²M(n-2) + 2 + 1 = 2³M(n-3) + 2² + 2 + 1 = … = 2ⁱM(n-i) + 2ⁱ - 1 = … = 2ⁿ⁻¹M(1) + 2ⁿ⁻¹ - 1, so M(n) = 2ⁿ - 1.

### Example 3: Binary Recursive Algorithm

- Problem: investigate a recursive algorithm that finds the number of binary digits in the binary representation of a positive decimal integer n.

    Algorithm BinRec(n)
    // Input: a positive decimal integer n
    // Output: the number of binary digits in n's binary representation
    if n = 1 return 1
    else return BinRec(⌊n/2⌋) + 1

- Recurrence relation for the number of additions made by the algorithm:
  A(n) = A(⌊n/2⌋) + 1 for n > 1, with A(1) = 0 (initial condition).
  For n = 1 no addition is made because no recursive call is executed.

### Binary Recursive: Solution

- Use the backward substitution method to solve the recurrence, but only for n = 2ᵏ:
  A(2ᵏ) = A(2ᵏ⁻¹) + 1 = A(2ᵏ⁻²) + 2 = A(2ᵏ⁻³) + 3 = … = A(2ᵏ⁻ⁱ) + i = … = A(2⁰) + k = k, since A(2⁰) = A(1) = 0.
- Since k = log₂ n, the solution is A(n) ∈ Θ(log n); the exact solution is A(n) = ⌊log₂ n⌋.
- The smoothness rule implies that the observed order of growth is valid for all values of n.
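The three-step Hanoi recursion above, sketched in Python with a move list so the count M(n) = 2ⁿ - 1 can be checked (peg numbering and names are mine):

```python
def hanoi(n, source, target, auxiliary, moves):
    """Move n disks from source to target, appending each (from, to) move."""
    if n == 0:
        return
    hanoi(n - 1, source, auxiliary, target, moves)  # step 1: n-1 disks aside
    moves.append((source, target))                  # step 2: largest disk
    hanoi(n - 1, auxiliary, target, source, moves)  # step 3: n-1 disks on top

moves = []
hanoi(3, 1, 3, 2, moves)
```

For n = 3 the list holds 2³ - 1 = 7 moves, matching the closed-form solution of the recurrence.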
## Algorithm Design Techniques

- Brute force
- Divide and conquer
- Decrease and conquer
- Transform and conquer
- Space and time tradeoffs
- Dynamic programming
- Greedy techniques

## Transform-and-Conquer

### Basic Idea

- A transform-and-conquer algorithm works as a two-stage procedure:
  - Transformation: modify the problem to be more amenable to solution.
  - Conquer: solve the problem.
- Three major variations:
  - Instance simplification: Gaussian elimination, AVL trees.
  - Representation change: 2-3 trees, heaps.
  - Problem reduction: counting paths in a graph, linear programming.

### Gaussian Elimination

- Problem: solve a system of n linear equations in n unknowns,

      a11·x1 + a12·x2 + … + a1n·xn = b1
      a21·x1 + a22·x2 + … + a2n·xn = b2
      …
      an1·x1 + an2·x2 + … + ann·xn = bn

  or, in matrix notation, Ax = b.
- The substitution method is not suitable when n is large.
- Gaussian elimination transforms a given system of n equations into an equivalent system A'x = b' with an upper-triangular coefficient matrix A'.

### Obtaining the Upper-Triangular Matrix

- A series of elementary operations:
  - Exchanging two equations of the system.
  - Replacing an equation with its nonzero multiple.
  - Replacing an equation with a sum or difference of this equation and some multiple of another equation.
- An elementary operation does not change the solution of the system.
- Normal pivoting: use a_ii as a pivot to make all x_i coefficients zero in the equations below the ith one.
- Partial pivoting: choose the row with the largest absolute value of the coefficient in the ith column.

### Gaussian Elimination Algorithm

Implements the elimination stage with partial pivoting:

    Algorithm BetterGaussElimination(A[1..n, 1..n], b[1..n])
    for i ← 1 to n do A[i, n+1] ← b[i]
    for i ← 1 to n-1 do
        pivotrow ← i
        for j ← i+1 to n do
            if |A[j, i]| > |A[pivotrow, i]|  pivotrow ← j
        for k ← i to n+1 do
            swap(A[i, k], A[pivotrow, k])
        for j ← i+1 to n do
            temp ← A[j, i] / A[i, i]
            for k ← i to n+1 do
                A[j, k] ← A[j, k] - A[i, k] * temp

- Time efficiency of the elimination stage:
  C(n) = Σ_{i=1..n-1} Σ_{j=i+1..n} Σ_{k=i..n+1} 1 = n(n-1)(2n+5)/6 ≈ n³/3 ∈ Θ(n³).
- The second stage (back substitution) is in Θ(n²), so the total efficiency is cubic.

### Heaps

- A heap is a binary tree satisfying two conditions:
  - Shape requirement: the tree is complete, so all its levels are full except that only some rightmost leaves may be missing. The height of the tree is ⌊log₂ n⌋.
  - Parental dominance: the key at each node is greater than or equal to the keys at its children. The root always contains the largest element; keys are ordered top down.
- A node of a heap considered with all its descendants is also a heap.
- Implementation as an array H[1..n] with H[i] ≥ max(H[2i], H[2i+1]) for i = 1, …, ⌊n/2⌋:
  - Parental node keys occupy the first ⌊n/2⌋ positions; leaf keys occupy the rest.
  - Example: the heap with keys 10, 5, 7, 4, 2, 1 stores the parents 10, 5, 7 followed by the leaves 4, 2, 1.

### Constructing a Heap: Bottom-Up Algorithm

- Two steps of heap construction:
  - Initialize a complete binary tree with n nodes by placing the keys in the order given.
  - Heapify the tree: establish parental dominance starting with the last parental node.
- See the text (page 226) for the pseudocode. Example: the list 2, 9, 7, 6, 5, 8 heapifies to 9, 6, 8, 2, 5, 7.
- Time efficiency in the worst case: each key on level i of the tree may travel to the leaf level h during heapification, so
  C_worst(n) = 2(n - log₂(n+1)) ∈ O(n).

### Constructing a Heap: Top-Down Algorithm

- Successive insertions of a new key into a previously constructed heap.
- The new key is sifted up via swaps with its parent until it is not larger than its parent or it reaches the root.
- Time efficiency of an insertion is in O(log n).

### Key Deletion from a Heap

Deleting the root's key (maximum-key deletion) from a heap:

- Exchange the root's key with the last key K of the heap.
- Decrease the heap's size by 1.
- Heapify the smaller tree by establishing the parental dominance for K.
- Time efficiency is in O(log n).
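The bottom-up construction and root deletion above can be sketched as follows (1-based array with H[0] unused, matching the slides; helper names are mine):

```python
def heapify(h, i, n):
    """Sift the key at index i down until parental dominance holds."""
    # h[0] is unused so that the children of node i sit at 2i and 2i+1
    k, v = i, h[i]
    while 2 * k <= n:
        j = 2 * k
        if j < n and h[j + 1] > h[j]:   # pick the larger child
            j += 1
        if v >= h[j]:
            break
        h[k] = h[j]
        k = j
    h[k] = v

def build_heap(h, n):
    """Bottom-up construction: heapify parents from the last one up."""
    for i in range(n // 2, 0, -1):
        heapify(h, i, n)

def delete_max(h, n):
    """Swap root with the last key, shrink the heap, re-heapify; return max."""
    h[1], h[n] = h[n], h[1]
    heapify(h, 1, n - 1)
    return h[n]

h = [None, 2, 9, 7, 6, 5, 8]   # the slide's example list 2, 9, 7, 6, 5, 8
build_heap(h, 6)
```

After construction the array reads 9, 6, 8, 2, 5, 7, exactly the heap shown for this list.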
Deleting an arbitrary key v proceeds similarly:

- Search for v by sequential search in H[1..n]; a match is found in some position i, so v = H[i].
- Delete the element H[i] by the same three-step process as before.
- Time efficiency is in O(n) + O(log n).

### Heapsort

- A two-stage sorting algorithm:
  - Stage 1 (heap construction): construct a heap for a given array.
  - Stage 2 (maximum deletions): apply the root-deletion operation n-1 times to the remaining heap.
- The array elements are eliminated in decreasing order and can be placed in a new array in ascending order.
- Example: sorting the array 2, 9, 7, 6, 5, 8 by heapsort first heapifies it to 9, 6, 8, 2, 5, 7, then deletes the maxima 9, 8, 7, 6, 5 in turn, leaving 2.

### Heapsort: Time Efficiency

- First stage (heap construction): C1(n) ∈ O(n).
- Second stage (maximum deletions):
  C2(n) ≤ 2⌊log₂(n-1)⌋ + 2⌊log₂(n-2)⌋ + … + 2⌊log₂ 1⌋ ≤ 2 Σ_{i=1..n-1} log₂ i ≤ 2(n-1) log₂(n-1) ≤ 2n log₂ n ∈ O(n log n).
- The total efficiency is in O(n log n), or, more accurately, Θ(n log n).

### Problem Reduction

- Problem-reduction strategy: reduce the original problem to one solvable by a known algorithm A; solving that problem solves the original.
- Reducing geometric problems to algebraic ones: decide the relative position of three arbitrary points in the plane by finding the sign of a determinant. The sign determines whether the point P3(x3, y3) lies to the left or right of the directed line through points P1(x1, y1) and P2(x2, y2):

      | x1  y1  1 |
      | x2  y2  1 | = x1·y2 + x3·y1 + x2·y3 - x3·y2 - x2·y1 - x1·y3
      | x3  y3  1 |

### Computing lcm

- Compute the least common multiple lcm of two integers m and n by reducing the problem to gcd (greatest common divisor):
  lcm(m, n) = m·n / gcd(m, n).
- An efficient algorithm, Euclid's algorithm, exists for finding the gcd: apply repeatedly the equality gcd(m, n) = gcd(n, m mod n) until m mod n is 0; the last value of m is the gcd.
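The reduction lcm(m, n) = m·n / gcd(m, n) together with Euclid's algorithm, as a short sketch:

```python
def gcd(m, n):
    """Euclid's algorithm: gcd(m, n) = gcd(n, m mod n) until the remainder is 0."""
    while n != 0:
        m, n = n, m % n
    return m

def lcm(m, n):
    """Problem reduction: compute lcm via the known gcd algorithm."""
    return m * n // gcd(m, n)
```

The reduction inherits Euclid's efficiency, so lcm costs no more than a gcd computation plus one multiplication and division.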
### Counting Paths in a Graph

- Count the number of paths between two vertices in a graph by computing an appropriate power of its adjacency matrix A.
- The number of different paths of length k > 0 from the ith vertex to the jth vertex equals the (i, j)th element of Aᵏ.
- In particular, A and A² give the number of paths of length 1 and 2, respectively, between the corresponding vertices of the graph.

### Reduction of Optimization Problems

- A minimization problem can be reduced to a maximization problem, and vice versa.
- To minimize a function, we can maximize its negative instead and change the sign of the answer: min f(x) = -max[-f(x)].
- The opposite is also true: max f(x) = -min[-f(x)].
- This generalizes to the case of constrained function optimization.

### Linear Programming

- A problem of optimizing a linear function of several variables subject to constraints in the form of linear equations and linear inequalities.
- The knapsack problem reduces to an instance of a linear programming problem:

      maximize    Σ_{j=1..n} v_j·x_j
      subject to  Σ_{j=1..n} w_j·x_j ≤ W
                  x_j ∈ {0, 1} for j = 1, …, n

## Space-and-Time Tradeoffs

### Basics

- Trading space for time is much more prevalent than trading time for space.
- Two variations:
  - Input enhancement: preprocess the problem's input and store the additional information obtained in order to accelerate solving the problem afterward. Examples: sorting by counting, string matching.
  - Prestructuring: use extra space to facilitate faster and/or more flexible access to the data. Examples: hashing, B-trees.

### String Matching

- String matching is the problem of finding an occurrence of a given pattern in a given text.
  - Pattern: a string of m characters. Text: a longer string of n characters.
- Better algorithms than the brute-force approach are based on the input-enhancement idea: preprocess the pattern
and store the related information, then use it during the search process.
- Two families of such algorithms:
  - The Knuth-Morris-Pratt algorithm compares characters of the pattern from left to right.
  - Horspool's and Boyer-Moore algorithms compare characters of the pattern from right to left.

### Horspool's Algorithm

- Horspool's algorithm is a simplified version of the Boyer-Moore algorithm.
- Both algorithms start by aligning the pattern against the beginning characters of the text. If no matching string is found (the first trial fails), the pattern is shifted to the right. They differ in deciding the shift size.
- Example: searching for the pattern BARBER in some text. The shift size is based on how a text character c, currently aligned against the rightmost character of the pattern, is related to the pattern.

### Comparing the Pattern with the Text

While searching for the pattern BARBER with Horspool's algorithm, four cases may arise:

- If there are no c's in the pattern, shift the pattern by its entire length m.
- If c occurs in the pattern but not as its last character, shift the pattern so that the rightmost c in the pattern aligns with c in the text.
- If c is the last character in the pattern and does not occur among its other characters, shift the pattern by its entire length m.
- If c is the last character and also occurs elsewhere in the pattern, shift the pattern so that the rightmost occurrence of c among the first m-1 characters aligns with c in the text.

### Shift Table

- Precompute the shift sizes and store them in a table. For every character c, the shift value is determined as follows:
  - t(c) = m, the pattern's length, if c is not among the first m-1 characters of the pattern;
  - t(c) = the distance from the rightmost c among the first m-1 characters of the pattern to its last character, otherwise.
- See the textbook (page 257) for the pseudocode of the ShiftTable algorithm.
- For the pattern BARBER, all table entries will be
equal to 6, except the entries for E, B, R, and A, which will be 1, 2, 3, and 4, respectively.

### HorspoolMatching

- See the textbook (page 258) for the algorithm's pseudocode. The algorithm works as follows:
  - Step 1: construct the shift table.
  - Step 2: align the pattern against the beginning of the text.
  - Step 3: scan the pattern from right to left, comparing its characters with the corresponding characters in the text. If a mismatch occurs, shift the pattern to the right by t(c), where c is the text character currently aligned against the last character of the pattern, and start the right-to-left comparison for the newly aligned position. Continue until a match is found or the text is exhausted.
- Example: search for the pattern BARBER in the text JIM SAW ME IN A BARBERSHOP.

### Horspool's Efficiency

- Horspool's efficiency classes are the same as those of the brute-force algorithm: O(nm) in the worst case and Θ(n) in the average case. But Horspool's algorithm makes bigger shifts and hence is faster than brute force on average.
- Examples:
  - Worst-case input: searching for the pattern 10…0 (1 followed by m-1 zeros) in a text of n zeros.
  - Best-case input: searching for the pattern 0…0 (m zeros) in a text of n zeros.
  - Can Horspool's algorithm make more character comparisons than the brute-force algorithm? Yes — for example, for the pattern 10…0 in a text of n zeros.

### Boyer-Moore Algorithm

- The Boyer-Moore and Horspool algorithms act differently after some positive number k (0 < k < m) of the pattern's characters are matched successfully before a mismatch is encountered.
- The Boyer-Moore algorithm calculates the shift size by considering two quantities, based on text characters as well as pattern characters.
- Two types of shift:
  - Bad-symbol shift
  - Good-suffix shift

### Bad-Symbol Shift

- The bad-symbol shift is guided by the text character c that
caused the mismatch:
  d1 = max(t1(c) - k, 1),
  where t1(c) is the entry in the precomputed table and k is the number of matched characters.
- If t1(c) - k ≤ 0, shift the pattern by one position to the right.

### Good-Suffix Shift

- The good-suffix shift d2 is guided by a successful match of the last k > 0 characters of the pattern. Let suff(k) be the ending portion of the pattern of size k.
- d2 is the distance between the second rightmost occurrence of suff(k) and its rightmost occurrence. The second rightmost occurrence should not be preceded by the same character as the last occurrence.
- If there is no other occurrence of suff(k), find the longest prefix of size l < k that matches a suffix of the same size; d2 is then the distance between that prefix and the suffix. For k = 3 and the pattern ABCBAB, the distance is 4; in this case AB is the prefix, with l = 2.

### Algorithmic Steps

- Step 1: construct the bad-symbol shift table t1(c).
- Step 2: construct the good-suffix shift table d2.
- Step 3: align the pattern against the beginning of the text.
- Step 4: repeat the following until either a matching substring is found or the text is exhausted. Start the comparison with the last character in the pattern and continue until either all m characters match or a mismatching pair is encountered after k ≥ 0 characters have matched. In the latter case, retrieve t1(c) for the text's mismatched character, compute d1, retrieve d2, and shift the pattern to the right by

      d = d1            if k = 0
      d = max(d1, d2)   if k > 0

### Boyer-Moore Example

- The worst-case efficiency is linear when searching for the first occurrence of the pattern.
- See the textbook (pages 261-263) for the example of searching for BAOBAB in a text made of English characters and spaces: BESS KNEW ABOUT BAOBABS.
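A runnable sketch of Horspool's shift table and right-to-left scan described above (0-based indexing; function names are mine, not the textbook's):

```python
def shift_table(pattern):
    """t(c) defaults to m; for c among the first m-1 characters it is the
    distance from c's rightmost such occurrence to the last character."""
    m = len(pattern)
    table = {}
    for i in range(m - 1):          # later occurrences overwrite earlier ones
        table[pattern[i]] = m - 1 - i
    return table

def horspool(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    table = shift_table(pattern)
    i = m - 1                       # text index aligned with the pattern's
    while i < n:                    # last character
        k = 0
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1                  # scan right to left
        if k == m:
            return i - m + 1
        i += table.get(text[i], m)  # shift by t(c) for the aligned character
    return -1
```

On the slide's example, the pattern BARBER is found in JIM SAW ME IN A BARBERSHOP after a handful of large shifts rather than n - m + 1 alignments.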
- Number of character comparisons made when searching for the patterns 00001 and 10000 in a binary text of one thousand zeros:

  | Pattern | Horspool | Boyer-Moore |
  |---------|----------|-------------|
  | 00001   | 996      | 996         |
  | 10000   | 4980     | 1000        |

## Graphs

### Graph Basics

- Notion of a graph.
- Main varieties: undirected, directed, and weighted graphs; complete, dense, and sparse graphs.
- Two principal representations: adjacency matrix and adjacency linked lists.
- Important properties: connectivity, acyclicity, shortest paths, spanning trees.

### Notion of a Graph

- A graph G = (V, E) is a pair of two sets: a finite set V of items called vertices (or nodes), and a set E of pairs of these items called edges (or arcs).
  - Undirected graph: the pair of vertices (u, v) is the same as the pair (v, u).
  - Directed graph (digraph): the edge (u, v) is directed from vertex u to vertex v.
- The number of edges |E| possible in an undirected graph with |V| vertices and no loops satisfies the inequality 0 ≤ |E| ≤ |V|(|V| - 1)/2.
- A graph with every pair of its vertices connected by an edge is called complete. A graph with few possible edges missing is dense; a graph with only few edges existing is sparse.
- Vertices are often labeled with letters or integers.

### Graph Representations

- Adjacency matrix: an n-by-n boolean matrix with one row and one column for each of the graph's n vertices, in which the element in the ith row and jth column is 1 if there is an edge from the ith vertex to the jth vertex, and 0 otherwise. We use a vector to store the vertices and a matrix to store the edges.
- Adjacency linked lists: a collection of linked lists, one for each vertex, containing all the vertices adjacent to the list's vertex (i.e., all vertices connected to it by an edge).
- If the graph is sparse, the adjacency linked lists are more space-efficient than the adjacency matrix.
- (The slide shows a six-vertex example graph a–f with its adjacency matrix and adjacency linked lists.)
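Both representations can be built for the slide's six-vertex example (edges a–c, a–d, b–c, b–f, c–e, d–e, e–f, as read off its adjacency matrix):

```python
vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "f"),
         ("c", "e"), ("d", "e"), ("e", "f")]

n = len(vertices)
index = {v: i for i, v in enumerate(vertices)}

# Adjacency matrix: n-by-n boolean matrix, symmetric for an undirected graph
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1

# Adjacency lists: one neighbour list per vertex
adj = {v: [] for v in vertices}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)
```

The matrix always uses n² cells, while the lists use space proportional to |V| + 2|E|, which is why the lists win for sparse graphs.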
### Weighted Graphs

- A weighted graph is a graph with numbers assigned to its edges; these numbers are called weights or costs.
- Motivation: finding the shortest path between two points in a transportation or communication network.
- In the adjacency matrix representation, the element A[i, j] contains the weight of the edge from the ith to the jth vertex if such an edge exists, and a special symbol (e.g., ∞) otherwise.
- Adjacency linked lists include in their nodes not only the name of the adjacent vertex but also the weight of the corresponding edge.

### Paths and Cycles

- Two important properties of graphs are connectivity and acyclicity.
- A path from vertex u to vertex v of a graph G is a sequence of adjacent (connected by an edge) vertices that starts with u and ends with v.
  - Simple path: all edges of the path are distinct.
  - Path length: the total number of vertices in the sequence minus one, i.e., the number of edges.
  - A directed path is a sequence of vertices in which every consecutive pair is connected by an edge directed from the vertex listed first to the vertex listed next.
- In a connected graph, for every pair of its vertices u and v there is a path from u to v.
- A cycle is a simple path of positive length that starts and ends at the same vertex; for example, the sequence f, h, i, g, f is a cycle. A graph with no cycles is said to be acyclic.
- A graph that is not connected consists of two or more connected components.

### Graph ADT

- Exploits simple linked-list concepts, with the stack and queue ADTs incorporated for graph traversal.
- A graph data structure implementation requires three types of structures:
  - Head structure (graphHead): contains metadata about the graph.
  - Vertex node (graphVertex): pointers (reference variables), data, indegree, outdegree, and a processed flag.
  - Arc node (graphArc): destination, weight,
and a next-arc pointer.

    graphHead
        count       <integer>
        first       <vertex pointer>
        compare     <pointer>
    end graphHead

    graphVertex
        nextVertex  <vertex pointer>
        data        <dataType>
        inDegree    <integer>
        outDegree   <integer>
        processed   <0, 1, 2>
        arc         <arc pointer>
    end graphVertex

    graphArc
        destination <vertex pointer>
        weight      <integer>
        nextArc     <arc pointer>
    end graphArc

### Graph ADT Functions

- Functions to build and maintain a graph.
- Basic algorithms: create graph, insert vertex, insert arc, delete vertex, delete arc.
- Graph data processing: retrieve, DFS and BFS traversals.
- Graph utility functions: first arc, graph empty, graph full, graph count, destroy graph.

### Insert and Delete

- Insert vertex: adds a disjoint (unconnected) vertex to the graph; allocates memory for a new vertex and copies its data.
- Insert arc: adds an arc between two vertices. Once a vertex is inserted, connect it to other vertices: determine the two points to be connected (source vertex and destination vertex) and insert a new arc in the adjacency list for this pair of vertices.
- Delete vertex: deletes an existing vertex only if its degree is zero. Locate the vertex to be deleted and make sure it is disjoint; if another vertex points to it, or if it points to another vertex, it cannot be deleted.
- Delete arc: deletes an arc between two vertices. Search the vertex list for the source vertex and its adjacency list for the destination vertex; remove the arc with a simple linked-list delete; update the indegree and outdegree of the vertices; recycle the memory.
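A simplified Python sketch of the insert/delete operations above, using a dictionary of adjacency lists rather than the C-style linked structures (class and method names are mine):

```python
class Graph:
    """Minimal adjacency-list graph ADT: insert/delete vertices and arcs."""

    def __init__(self):
        self.adj = {}                     # vertex -> list of (dest, weight)

    def insert_vertex(self, v):
        """Add a disjoint vertex with an empty adjacency list."""
        self.adj.setdefault(v, [])

    def insert_arc(self, src, dst, weight=1):
        """Add a directed, weighted arc from src to dst."""
        self.adj[src].append((dst, weight))

    def delete_arc(self, src, dst):
        """Remove any arc from src to dst."""
        self.adj[src] = [(d, w) for d, w in self.adj[src] if d != dst]

    def degree(self, v):
        """Indegree plus outdegree of v."""
        out_deg = len(self.adj[v])
        in_deg = sum(1 for arcs in self.adj.values()
                     for d, _ in arcs if d == v)
        return in_deg + out_deg

    def delete_vertex(self, v):
        """Delete v only if it is disjoint (degree zero), as the notes require."""
        if self.degree(v) == 0:
            del self.adj[v]

g = Graph()
for v in ("a", "b"):
    g.insert_vertex(v)
g.insert_arc("a", "b", weight=5)
```

The degree-zero guard in `delete_vertex` mirrors the rule that a vertex pointing to, or pointed at by, another vertex cannot be removed.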
### Graph Traversals: DFS and BFS

- Many graph algorithms require processing the vertices or edges of a graph in a systematic fashion.
- Two principal traversal algorithms:
  - DFS (depth-first search): all of a vertex's descendants are processed before moving to an adjacent vertex.
  - BFS (breadth-first search): all adjacent vertices are processed before the descendants of a vertex.

### DFS Traversal

- DFS starts visiting vertices at an arbitrary vertex, marking it as visited. On each iteration the algorithm proceeds to an unvisited vertex adjacent to the current one. This continues until a dead end is reached; the algorithm then backs up one edge to the vertex it came from and tries to visit unvisited vertices from there. The algorithm halts after backing up to the starting vertex, with the latter being a dead end. By then one connected component has been visited; if there are more components, DFS must process each in the same way.
- It is convenient to use a stack to trace the operation of DFS. It yields two orderings of the vertices:
  - the order in which the vertices are reached for the first time (pushed onto the stack);
  - the order in which the vertices become dead ends (popped off the stack).
- A DFS forest is obtained as a by-product of a DFS traversal. The traversal's starting vertex serves as the root of the first tree in the forest. Tree edges are used by DFS to reach unvisited vertices; back edges connect vertices to previously visited vertices other than their immediate predecessors.

### DFS Pseudocode

    Algorithm DFS(G)
    // Implements a DFS traversal of a given graph G = (V, E)
    count ← 0
    for each vertex v in V do
        if v is marked with 0
            dfs(v)

    dfs(v)
    // Visits recursively all the unvisited vertices connected to vertex v
    // and assigns them the numbers in the order they are encountered
    count ← count + 1; mark v with count
    for each vertex w in V adjacent to v do
        if w is marked with 0
            dfs(w)

- DFS takes time proportional to the size of the data structure used for representing the graph: Θ(|V|²) for the adjacency matrix representation and Θ(|V| + |E|) for the adjacency linked-list representation, where |V| and |E| are the numbers of the graph's vertices and edges, respectively.
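The DFS pseudocode above can be sketched in Python, recording the two orderings (push and pop) explicitly:

```python
def dfs(graph):
    """DFS over every component of graph (a dict: vertex -> neighbour list).

    Returns (pushed, popped): the first-reached order and dead-end order."""
    visited = set()
    pushed, popped = [], []

    def visit(v):
        visited.add(v)
        pushed.append(v)            # reached for the first time ("pushed")
        for w in graph[v]:
            if w not in visited:
                visit(w)
        popped.append(v)            # became a dead end ("popped")

    for v in graph:                 # restart in every unvisited component
        if v not in visited:
            visit(v)
    return pushed, popped

# A hypothetical two-component graph for illustration
g = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
pushed, popped = dfs(g)
```

With adjacency lists this runs in Θ(|V| + |E|), matching the efficiency noted above.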
### DFS Example

- Traversal's stack: the first subscript number indicates the order in which a vertex was visited (pushed onto the stack); the second indicates the order in which it became a dead end (popped off the stack).
- DFS forest: solid lines are tree edges; dashed lines are back edges.

### DFS Applications

- Checking connectivity of a graph: start a DFS traversal at an arbitrary vertex and check, after the algorithm halts, whether all the graph's vertices have been visited. If they have, the graph is connected; otherwise it is not.
- Checking acyclicity of a graph: if the DFS forest has no back edges, the graph is acyclic. A back edge from some vertex to its ancestor means the graph has a cycle.
- Finding articulation points: a vertex of a connected graph is an articulation point if its removal, with all edges incident to it, breaks the graph into disjoint pieces. In the slide's example, vertex f is an articulation point.

### BFS Traversal

- BFS proceeds in a concentric manner, visiting first all the vertices adjacent to a starting vertex, then all unvisited vertices two edges apart from it, and so on. The process continues until all the vertices in the same connected component as the starting vertex are visited; BFS then processes another connected component, if there is one.
- It is convenient to use a queue to trace the operation of BFS. It yields a single ordering of the vertices: the order in which the vertices are added to the queue is the same as the order in which they are removed from it. The queue is initialized with the traversal's starting vertex. On each iteration the algorithm identifies all unvisited vertices adjacent to the front vertex, marks them as visited, and adds them to the queue; after that, the front vertex is removed from the queue.
- A BFS forest is obtained as a by-product of a BFS traversal. The traversal's starting vertex serves as the root of the first tree in the forest.
Tree edges are used by BFS to reach unvisited vertices; cross edges connect vertices to previously visited vertices other than their immediate predecessors.

### BFS Pseudocode

    Algorithm BFS(G)
    // Implements a BFS traversal of a given graph G = (V, E)
    count ← 0
    for each vertex v in V do
        if v is marked with 0
            bfs(v)

    bfs(v)
    // Visits all the unvisited vertices connected to vertex v and
    // assigns them the numbers in the order they are encountered
    count ← count + 1; mark v with count and initialize a queue with v
    while the queue is not empty do
        for each vertex w in V adjacent to the front vertex do
            if w is marked with 0
                count ← count + 1; mark w with count
                add w to the queue
        remove the front vertex from the queue

- BFS has the same efficiency as DFS: Θ(|V|²) for the adjacency matrix representation and Θ(|V| + |E|) for the adjacency linked-list representation.

### BFS Example

- Traversal's queue (a1, c2, d3, e4, f5, b6, g7, h8, j9, i10): the number indicates the order in which a vertex was visited, i.e., added to and removed from the queue.
- BFS forest: solid lines are tree edges; dotted lines are cross edges.

### BFS Applications

- Checking connectivity and acyclicity of a graph, in the same manner as DFS does.
- Finding a minimum-edge path: a path with the fewest number of edges between two given vertices. Start with one of the two given vertices and stop as soon as the other vertex is reached. In the slide's example, the path a-b-c-g has the fewest edges among all paths between vertices a and g; the part of the BFS tree rooted at a identifies this minimum-edge path.

## Analysis of Algorithmic Efficiency

### Analysis Framework

- Two kinds of efficiency:
  - Time efficiency: how fast an algorithm runs.
  - Space efficiency: deals with the extra memory space an algorithm requires.
- We usually deal with time efficiency, considering the input's size, time units, and order of growth.
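As an illustration of the framework, the basic operation (key comparisons in sequential search) can be tallied as a function of input size n — a hypothetical empirical check, with names of my choosing:

```python
def count_comparisons(a, key):
    """Count the key comparisons sequential search makes on a list of size n."""
    count = 0
    for item in a:
        count += 1          # one key comparison per examined element
        if item == key:
            break
    return count

# Worst case: the key is absent, so all n elements are compared
worst = count_comparisons(list(range(1000)), -1)
# Best case: the key sits in the first position, so one comparison suffices
best = count_comparisons(list(range(1000)), 0)
```

The counts 1000 and 1 for n = 1000 preview the worst-case and best-case efficiencies defined next.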
Efficiency is measured as a function of some parameter n indicating the algorithm's input size:
- the size of the list for problems such as sorting or searching;
- the degree of the polynomial for the problem of evaluating a polynomial p(x) = a_n x^n + ... + a_0.

The choice of the input-size parameter does matter in some situations, and the operations of the algorithm can affect the choice. The size of inputs for algorithms involving properties of numbers is often expressed by the number b of bits in n's binary representation: b = floor(log2 n) + 1.

Time Efficiency: Units and Analyses
- Standard units of time measurement (a second, a millisecond, and so on) measure the running time of a program implementing the algorithm.
- The basic operation is the operation that contributes most toward the running time of the algorithm; we count the number of repetitions of the basic operation.
- Mathematical (theoretical) analysis of an algorithm's efficiency is independent of specific inputs but has limited applicability.
- Empirical (experimental) analysis is applicable to any algorithm, but its results depend on the particular sample of instances and the computer used.

Mathematical Analysis

Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of the input size n:

    T(n) ≈ c_op * C(n)

where T(n) is the running time, c_op is the execution time of the basic operation on a particular computer, and C(n) is the number of times the basic operation is executed.

Examples: Input Size and Basic Operation

Problem                                       | Input size measure              | Basic operation
Search for a key in a list of n items         | Number of items in the list, n  | Key comparison
Multiply two matrices of floating-point numbers | Dimensions of the matrices    | Floating-point multiplication
Compute a^n                                   | n                               | Floating-point multiplication
Graph problem                                 | Number of vertices and/or edges | Visiting a vertex or traversing an edge

Best Case, Average Case, Worst Case

For some algorithms, efficiency depends on the type of input.
- Worst case: W(n), the maximum over inputs of size n.
- Best case: B(n), the minimum over inputs of size n.
- Average case: A(n), an average over inputs of size n, i.e., the number of times the basic operation will be executed on typical or random input, based on some assumption about the probability distribution of all possible inputs of size n.
- Amortized efficiency: amortize the high cost of some worst-case occurrence of a single operation over the entire sequence of n such operations.

Example: Sequential Search

Problem: given a list of n elements and a search key K, find an element equal to K, if any. Algorithm: scan the list and compare its successive elements with K until either a matching element is found (successful search) or the list is exhausted (unsuccessful search).

Algorithm SequentialSearch(A[0..n-1], K)
i <- 0
while i < n and A[i] != K do
    i <- i + 1
if i < n return i
else return -1

Efficiency of Sequential Search
- Worst case: C_worst(n) = n.
- Best case: C_best(n) = 1.
- Average case, with p the probability of a successful search (0 <= p <= 1):

    C_avg(n) = p(n + 1)/2 + n(1 - p)

  For p = 1 (successful search), C_avg(n) = (n + 1)/2; for p = 0 (unsuccessful search), C_avg(n) = n.

Types of Formulas for Basic Operation Count
- Exact formula, e.g., C(n) = n(n - 1)/2.
- Formula indicating order of growth with a specific multiplicative constant, e.g., C(n) ≈ 0.5 n^2.
- Formula indicating order of growth with an unknown multiplicative constant, e.g., C(n) ≈ c * n^2.

Orders of Growth

What matters most is the order of growth of the algorithm's efficiency within a constant multiple as n approaches infinity (see Table 2.1). Examples:
- How much faster will the algorithm run on a computer that is ten times as fast? Ten times.
- How much longer does it take to solve a problem of double the input size?
  - log2(2n) = log2 2 + log2 n = 1 + log2 n: the logarithmic function increases in value by just 1.
  - The linear function increases twofold: 2n.
  - The cubic function increases eightfold: (2n)^3 = 8n^3.
  - The value of the 2^n function is squared: 2^(2n) = (2^n)^2.
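These doubling effects can be checked numerically; a small sketch (the helper name `growth_on_doubling` is mine):

```python
import math

def growth_on_doubling(f, n):
    """Ratio f(2n)/f(n): how an operation count grows when the input size doubles."""
    return f(2 * n) / f(n)

n = 1024  # a power of two, so the log identity below holds exactly in floats
# log2(2n) = 1 + log2(n): the logarithm grows by an additive 1
assert math.log2(2 * n) == 1 + math.log2(n)
# a linear count doubles, a cubic count grows eightfold
assert growth_on_doubling(lambda m: 2 * m, n) == 2
assert growth_on_doubling(lambda m: m ** 3, n) == 8
# 2^(2n) = (2^n)^2: the exponential count is squared
assert 2 ** (2 * 10) == (2 ** 10) ** 2
```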
Table 2.1: Values (some approximate) of several functions important for the analysis of algorithms.

n     | log2 n | n    | n log2 n | n^2   | n^3   | 2^n       | n!
10    | 3.3    | 10^1 | 3.3*10^1 | 10^2  | 10^3  | 10^3      | 3.6*10^6
10^2  | 6.6    | 10^2 | 6.6*10^2 | 10^4  | 10^6  | 1.3*10^30 | 9.3*10^157
10^3  | 10     | 10^3 | 1.0*10^4 | 10^6  | 10^9  |           |
10^4  | 13     | 10^4 | 1.3*10^5 | 10^8  | 10^12 |           |
10^5  | 17     | 10^5 | 1.7*10^6 | 10^10 | 10^15 |           |
10^6  | 20     | 10^6 | 2.0*10^7 | 10^12 | 10^18 |           |

Three Asymptotic Notations

The principal indicator of efficiency is the order of growth of the basic operation count: a way of comparing functions that ignores constant factors and small input sizes.
- O(g(n)): the set of all functions t(n) with a smaller or the same order of growth as g(n), to within a constant multiple, as n approaches infinity.
- Ω(g(n)): the set of all functions t(n) with a larger or the same order of growth as g(n), to within a constant multiple, as n approaches infinity.
- Θ(g(n)): the set of all functions t(n) with the same order of growth as g(n), to within a constant multiple, as n approaches infinity.

Big-Oh Notation

t(n) ∈ O(g(n)): a function t(n) is said to be in O(g(n)) if it is bounded above by some constant multiple of g(n) for all large n; that is, there exist a positive constant c and a non-negative integer n0 such that t(n) <= c * g(n) for every n >= n0. Example: 100n + 5 is O(n^2). (Figure 2.1: big-oh notation, t(n) ∈ O(g(n)).)

Big-Omega Notation

t(n) ∈ Ω(g(n)): a function t(n) is said to be in Ω(g(n)) if it is bounded below by some constant multiple of g(n) for all large n; that is, there exist a positive constant c and a non-negative integer n0 such that t(n) >= c * g(n) for every n >= n0. Example: n^3 is Ω(n^2). (Figure 2.2: big-omega notation, t(n) ∈ Ω(g(n)).)

Big-Theta Notation

t(n) ∈ Θ(g(n)): a function t(n) is said to be in Θ(g(n)) if it is bounded both above and below by constant multiples of g(n) for all large n; that is, there exist positive constants c1 and c2 and a non-negative integer n0 such that c2 * g(n) >= t(n) >= c1 * g(n) for every n >= n0. Example: (1/2)n(n - 1) is Θ(n^2). (Figure 2.3: big-theta notation, t(n) ∈ Θ(g(n)).)

Comparing Growth Rates Using Limits
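As a quick numerical sanity check of these definitions, concrete witnesses c and n0 can be exhibited for the examples above (a sketch, not a proof; the function name and the tested range are mine):

```python
def bounded_above(t, g, c, n0, n_max=10_000):
    """Spot-check t(n) <= c*g(n) for n0 <= n < n_max, as required for t(n)
    to be in O(g(n)). A finite check, not a proof."""
    return all(t(n) <= c * g(n) for n in range(n0, n_max))

# 100n + 5 in O(n^2): witnesses c = 105, n0 = 1, since 100n + 5 <= 105n <= 105n^2
assert bounded_above(lambda n: 100 * n + 5, lambda n: n * n, c=105, n0=1)

# (1/2)n(n-1) in Theta(n^2): witnesses c1 = 1/4, c2 = 1/2, n0 = 2
assert all(0.25 * n * n <= 0.5 * n * (n - 1) <= 0.5 * n * n
           for n in range(2, 10_000))
```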
Compute the limit of the ratio of the two functions under consideration; this limit-based approach is often more convenient than the one based on the definitions:

    lim (n -> inf) t(n)/g(n) = 0:     order of growth of t(n) < order of growth of g(n)
    lim (n -> inf) t(n)/g(n) = c > 0: order of growth of t(n) = order of growth of g(n)
    lim (n -> inf) t(n)/g(n) = inf:   order of growth of t(n) > order of growth of g(n)

Calculus techniques such as L'Hopital's rule and Stirling's formula help in computing the limits. Examples: (1/2)n(n - 1) vs. n^2; log2 n vs. sqrt(n); n! vs. 2^n.

Basic Asymptotic Efficiency Classes

    1        constant
    log n    logarithmic
    n        linear
    n log n  n-log-n
    n^2      quadratic
    n^3      cubic
    2^n      exponential
    n!       factorial

Caution: in defining asymptotic efficiency classes, the values of multiplicative constants are usually left unspecified.

Empirical Analysis of Algorithms

A complementary approach to mathematical analysis is empirical analysis of an algorithm's efficiency. A general plan for empirical analysis (experimentation) involves the following steps:
- Understand the purpose of the analysis.
- Decide on the efficiency metric to be measured and the measurement unit.
- Decide on characteristics of the input sample.
- Generate a sample of inputs.
- Implement the algorithm and run the computer experiment (simulation).
- Execute the program to generate outputs.
- Analyze the output data.

Analyzing Output Data
- Collect and analyze the empirical data for basic-operation counts or timings.
- Present the data in tabular or graphical form.
- Compute the ratios M(n)/g(n), where g(n) is a candidate to represent the efficiency of the algorithm in question.
- Compute the ratios M(2n)/M(n) to see how the running time reacts to doubling the input size.
- Examine the shape of the plot: a concave shape for a logarithmic algorithm, a straight line for a linear algorithm, and convex shapes for quadratic and cubic algorithms. An exponential algorithm requires a logarithmic scale for the vertical axis.
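For instance, the doubling ratio M(2n)/M(n) of a quadratic algorithm's comparison count should approach 4. A sketch, using a basic-operation count rather than wall-clock time so the result is reproducible (all names are mine):

```python
def quadratic_comparison_count(n):
    """Count key comparisons of a simple quadratic sort (a selection-sort-style
    double loop); C(n) = n(n-1)/2, independent of the input ordering."""
    a = list(range(n, 0, -1))
    count = 0
    for i in range(n - 1):
        m = i
        for j in range(i + 1, n):
            count += 1          # the basic operation being counted
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]
    return count

# M(2n)/M(n) approaching 4 signals quadratic growth
for n in (100, 200, 400):
    ratio = quadratic_comparison_count(2 * n) / quadratic_comparison_count(n)
    assert 3.9 < ratio < 4.1
```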
Algorithm Design Techniques
- Brute force
- Divide-and-conquer
- Decrease-and-conquer
- Transform-and-conquer
- Space-and-time tradeoffs
- Dynamic programming
- Greedy techniques

Decrease-and-Conquer

Basics

A decrease-and-conquer algorithm works as follows: establish the relationship between a solution to a given instance of a problem and a solution to a smaller instance of the same problem, then exploit this relationship either top down (recursively) or bottom up (without recursion). Three variations:
- Decrease by a constant: the size of an instance is reduced by the same constant (usually one) at each iteration of the algorithm.
- Decrease by a constant factor: the size of a problem instance is reduced by the same constant factor (usually two) on each iteration.
- Variable-size decrease: the size-reduction pattern varies from one iteration to another.

Decrease-by-One

The decrease-by-one-and-conquer technique is common. For the exponentiation problem of computing a^n, exploit the relation a^n = a^(n-1) * a:

    f(n) = f(n - 1) * a  for n > 1
    f(n) = a             for n = 1

The top-down solution uses this recursive definition; the bottom-up solution multiplies a by itself n - 1 times. Either way, O(n) multiplications are required.

Decrease-by-Half

The decrease-by-half-and-conquer technique reduces the problem size by a factor of 2. For computing a^n, exploit the relation a^n = (a^(n/2))^2 for even n:

    f(n) = (f(n/2))^2            for even n > 1
    f(n) = (f((n - 1)/2))^2 * a  for odd n > 1
    f(n) = a                     for n = 1

This requires only O(log n) multiplications. Note that divide-and-conquer would instead solve two instances of the problem of size n/2.
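The two exponentiation schemes above can be sketched as follows (function names are mine):

```python
def power_decrease_by_one(a, n):
    """a^n via a^n = a^(n-1) * a, computed bottom up: O(n) multiplications."""
    result = a
    for _ in range(n - 1):
        result *= a
    return result

def power_decrease_by_half(a, n):
    """a^n via repeated squaring: O(log n) multiplications."""
    if n == 1:
        return a
    half = power_decrease_by_half(a, n // 2)   # n // 2 is (n-1)/2 for odd n
    return half * half if n % 2 == 0 else half * half * a

assert power_decrease_by_one(2, 10) == 1024
assert power_decrease_by_half(2, 10) == 1024
assert power_decrease_by_half(3, 7) == 3 ** 7
```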
Variable-Size Decrease: BST Search

Binary search tree (BST) search is an excellent example of a variable-size-decrease algorithm. Searching for an element of a given value v in a BST starts by comparing v with the tree's root K(r):
- If they match, the desired element is found.
- If they do not match, continue the search in the left subtree of the root if v < K(r), or in the right subtree if v > K(r).

On each iteration, the problem of searching in a BST is reduced to searching in a smaller BST. Worst-case efficiency: Θ(n); average-case efficiency: Θ(log n).

Insertion Sort

Insertion sort is based on the decrease-by-one-and-conquer approach: provided that the smaller array A[0..n-2] is already sorted, sort the original array A[0..n-1] by finding an appropriate position for the element A[n-1] among the sorted n - 1 elements and inserting it there. Right-to-left scan: scan the sorted subarray from right to left until the first element smaller than or equal to A[n-1] is encountered, then insert A[n-1] right after that element.

Iterative Approach

Starting with A[1] and ending with A[n-1], A[i] is inserted in its appropriate place among the first i elements of the array, which have already been sorted:

Algorithm InsertionSort(A[0..n-1])
for i <- 1 to n - 1 do
    v <- A[i]
    j <- i - 1
    while j >= 0 and A[j] > v do
        A[j + 1] <- A[j]
        j <- j - 1
    A[j + 1] <- v

Example: 89 45 68 90 29 34 17 is sorted into 17 29 34 45 68 89 90.

Efficiency

The basic operation is the key comparison A[j] > v.
- The worst case occurs when the input is an array of strictly decreasing values:

    C_worst(n) = sum (i = 1 to n-1) sum (j = 0 to i-1) 1 = n(n - 1)/2 ∈ Θ(n^2)

- The best case occurs when the input is an array of strictly increasing values:

    C_best(n) = sum (i = 1 to n-1) 1 = n - 1 ∈ Θ(n)

- The average case, for randomly ordered arrays:

    C_avg(n) ≈ n^2/4 ∈ Θ(n^2)
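The iterative pseudocode translates directly (a sketch, using the slides' example input):

```python
def insertion_sort(a):
    """Decrease-by-one: with a[0..i-1] already sorted, insert a[i] into place
    by a right-to-left scan that shifts larger elements up."""
    for i in range(1, len(a)):
        v = a[i]
        j = i - 1
        while j >= 0 and a[j] > v:   # the basic operation: key comparison a[j] > v
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v
    return a

assert insertion_sort([89, 45, 68, 90, 29, 34, 17]) == [17, 29, 34, 45, 68, 89, 90]
```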
Permutations: Minimal-Change Approach

Generate all n! permutations of the set of integers 1, 2, ..., n. Decrease-by-one approach: exploit the relation n! = n * (n - 1)!; insert n into each of the n possible positions among the elements of every permutation of n - 1 elements, alternating right-to-left and left-to-right. The resulting bottom-up algorithm is minimal-change: each permutation is obtained from its immediate predecessor by exchanging just two elements.

    Start:     1
    Insert 2:  12  21
    Insert 3:  123 132 312 321 231 213

Permutations: Johnson-Trotter Algorithm

The Johnson-Trotter algorithm generates the n! permutations without explicitly generating permutations for smaller values of n. Associate a direction with each component k in a permutation, represented by an arrow above the component. The component k is said to be mobile if its arrow points to a smaller value adjacent to it; for example, in the permutation 3 2 4 1 (with suitable arrows), the components 3 and 4 are mobile while 2 and 1 are not.

Algorithm JohnsonTrotter(n)
initialize the first permutation with all arrows pointing left: <1 <2 ... <n
while there exists a mobile integer do
    find the largest mobile integer k
    swap k and the adjacent integer its arrow points to
    reverse the direction of all integers that are larger than k

Example for n = 3: 123, 132, 312, 321, 231, 213.

Fake-Coin Problem

Among n identically looking coins, one is fake (assume it is lighter); the problem is to detect the fake coin with a balance scale.
- Divide the n coins into two piles of floor(n/2) coins each, leaving one extra coin apart if n is odd. Put the two piles on the scale: if they weigh the same, the coin set aside must be fake; otherwise, continue with the lighter pile, which contains the fake coin. The worst-case number of weighings satisfies W(n) = W(floor(n/2)) + 1 for n > 1, with W(1) = 0; for n = 2^k, W(n) = k = log2 n.
- Divide the n coins into three piles of sizes ceil(n/3), ceil(n/3), and n - 2*ceil(n/3). Put the first two piles on the scale: the lighter of the two contains the fake coin; if both weigh the same, the third pile contains it. After one weighing we are left with a single subproblem of one third the original size, so this is a decrease-by-a-factor-of-3 algorithm. The recurrence for the number of weighings in the worst case is W(n) = W(ceil(n/3)) + 1 for n > 1, with W(1) = 0; for n = 3^k, W(n) = k = log3 n.
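A sketch of the Johnson-Trotter algorithm (the representation is mine: directions stored as -1/+1 alongside the permutation):

```python
def johnson_trotter(n):
    """Generate all n! permutations of 1..n; each differs from its
    predecessor by one swap of adjacent elements (minimal change)."""
    perm = list(range(1, n + 1))
    dirs = [-1] * n                       # -1: arrow points left, +1: right
    result = [perm[:]]

    def largest_mobile():
        """Index of the largest mobile element, or -1 if none exists."""
        idx = -1
        for i in range(n):
            j = i + dirs[i]
            if 0 <= j < n and perm[j] < perm[i]:          # perm[i] is mobile
                if idx == -1 or perm[i] > perm[idx]:
                    idx = i
        return idx

    i = largest_mobile()
    while i != -1:
        k = perm[i]
        j = i + dirs[i]
        perm[i], perm[j] = perm[j], perm[i]               # swap k with its target
        dirs[i], dirs[j] = dirs[j], dirs[i]
        for m in range(n):                                # reverse arrows of all > k
            if perm[m] > k:
                dirs[m] = -dirs[m]
        result.append(perm[:])
        i = largest_mobile()
    return result

# Reproduces the n = 3 sequence from the slides
assert johnson_trotter(3) == [[1, 2, 3], [1, 3, 2], [3, 1, 2],
                              [3, 2, 1], [2, 3, 1], [2, 1, 3]]
```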
Ternary Search

Search in a sorted array A[0..n-1] recursively by comparing the search key K with A[floor(n/3)] and, if K is larger, comparing it with A[floor(2n/3)] to determine which third of the array to continue the search in. The recurrence for the number of key comparisons in the worst case is

    C(n) = C(n/3) + 2  for n = 3^k, k > 0, with C(1) = 1

whose solution is C(n) = 2 log3 n + 1.

Computing a Median

A median is an element that is greater than one half of the list's elements and smaller than the other half. This is a special case of the selection problem: find the k-th smallest element in a list of n numbers (k = ceil(n/2) gives the median, k = 1 the smallest element, k = n the largest element). An algorithm can be based on partitioning the array's elements into two subsets around a pivot p: elements less than or equal to p, and elements greater than or equal to p.

Median by Partition

Let s be the partition's split position.
- If s = k, the pivot p itself solves the problem.
- If s > k, the k-th smallest element of the entire list is the k-th smallest element of the left part of the partitioned array.
- If s < k, proceed by searching for the (k - s)-th smallest element in the right part, which can be solved by the same approach iteratively.

The array size thus reduces in an unpredictable fashion from one iteration to the next (some reductions are smaller than half, some larger), making this a variable-size-decrease algorithm. See the example 4 1 10 9 7 12 8 2 15 given on page 186 of the textbook.
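This variable-size-decrease selection can be sketched as follows (a Lomuto-style partition is used here as a stand-in for the textbook's partitioning scheme; all names are mine):

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around pivot a[hi]; return the split position s:
    elements <= pivot end up left of s, elements >= pivot right of s."""
    p = a[hi]
    s = lo
    for j in range(lo, hi):
        if a[j] <= p:
            a[s], a[j] = a[j], a[s]
            s += 1
    a[s], a[hi] = a[hi], a[s]
    return s

def kth_smallest(a, k):
    """Variable-size decrease: after each partition, recurse only into the
    side that contains the k-th smallest element."""
    lo, hi = 0, len(a) - 1
    while True:
        s = partition(a, lo, hi)
        if s == k - 1:            # split position equals k: pivot is the answer
            return a[s]
        elif s > k - 1:
            hi = s - 1            # answer lies in the left part
        else:
            lo = s + 1            # answer is the (k-s)-th smallest on the right

data = [4, 1, 10, 9, 7, 12, 8, 2, 15]          # the textbook's example array
assert kth_smallest(data[:], 5) == 8           # median of 9 items: 5th smallest
```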
course prerequisites are met: C1 and C2 have no prerequisites; C3 requires C1 and C2; C4 requires C3; C5 requires C3 and C4. The situation can be modeled by a digraph in which vertices represent courses and directed edges indicate prerequisite requirements.

Topological sorting problem: list the vertices in such an order that, for every edge (u, v) in the graph, the vertex u where the edge starts is listed before the vertex v where the edge ends; that is, find an ordering of the digraph's vertices from left to right. A solution is possible only if the digraph is a dag; this is a necessary and sufficient condition for topological sorting to be possible.

Solving the Topological Sorting Problem

Solution: verify whether a given digraph is a dag and, if it is, produce an ordering of its vertices. Two algorithms solve the problem, and they may give different, alternative solutions:
- DFS-based algorithm: perform a DFS traversal and note the order in which vertices become dead ends, that is, are popped off the traversal stack. Reversing this order yields the desired solution, provided that no back edge was encountered during the traversal.
- Source-removal algorithm: identify a source, a vertex with no incoming edges, and delete it along with all edges outgoing from it (there must be at least one source for the problem to be solvable). Repeat this process in the remaining digraph; the order in which the vertices are deleted yields the desired solution.

DFS-Based Topological Sorting

For the course digraph, the popping-off order is C5(1), C4(2), C3(3), C1(4), C2(5); the topologically sorted list is the reverse: C2 -> C1 -> C3 -> C4 -> C5. All edges in the sorted list point from left to right. The time efficiency is in O(|V|^2) for the adjacency matrix representation and O(|V| + |E|) for the adjacency linked-list representation, since the reversing requires only Θ(|V|), and the traversal can stop before processing the entire digraph if a back edge is encountered.
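The DFS-based algorithm can be sketched as follows (the adjacency-dict representation and all names are mine):

```python
def topological_sort(graph):
    """DFS-based topological sort: the reverse of the dead-end (popping-off)
    order; raises if a back edge, i.e., a directed cycle, is found.
    `graph` maps each vertex to the list of its successors."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on the stack / dead end
    state = {v: WHITE for v in graph}
    order = []

    def dfs(v):
        state[v] = GRAY
        for w in graph[v]:
            if state[w] == GRAY:          # back edge: the digraph is not a dag
                raise ValueError("not a dag: no topological order exists")
            if state[w] == WHITE:
                dfs(w)
        state[v] = BLACK
        order.append(v)                   # v becomes a dead end (is popped off)

    for v in graph:
        if state[v] == WHITE:
            dfs(v)
    return order[::-1]                    # reverse the popping-off order

courses = {"C1": ["C3"], "C2": ["C3"], "C3": ["C4", "C5"], "C4": ["C5"], "C5": []}
order = topological_sort(courses)
# every prerequisite appears before the course that requires it
assert order.index("C1") < order.index("C3") > order.index("C2") - 1
assert order.index("C3") < order.index("C4") < order.index("C5")
```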
Source-Removal Topological Sorting

Delete C1, then C2, C3, C4, and finally C5 (if there is more than one source, as with C1 and C2 here, break the tie arbitrarily). On each iteration, a vertex with no incoming edges is deleted from the digraph. The solution obtained is C1, C2, C3, C4, C5.

Algorithm Design Techniques
- Brute force
- Divide-and-conquer
- Decrease-and-conquer
- Transform-and-conquer
- Space-and-time tradeoffs
- Dynamic programming
- Greedy techniques

Greedy Techniques

Basics

A greedy technique constructs a solution to an optimization problem through a sequence of steps, each expanding the partially constructed solution obtained so far, until a complete solution to the problem is reached. On each step, the choice made must be feasible, locally optimal, and irrevocable. Examples:
- Constructing a minimum spanning tree (MST) of a weighted connected graph: grow an MST through a greedy inclusion of the nearest vertex or shortest edge into the tree under construction (Prim's and Kruskal's algorithms).
- Solving the single-source shortest-paths problem: find the shortest paths from a given vertex (the source) to all other vertices (Dijkstra's algorithm).
- Huffman tree and code: a binary tree that minimizes the weighted path length from the root to the leaves containing a set of predefined weights; an optimal prefix-free variable-length encoding scheme.

Minimum Spanning Tree

MST Problem Statement

Given n points, connect them in the cheapest possible way so that there will be a path between every pair of points; in the graph representation (e.g., a network), this is the minimum spanning tree problem. Even a complete graph on four vertices admits many spanning trees. A spanning tree of a connected graph is its connected acyclic subgraph (a tree) that contains all its vertices.
A minimum spanning tree of a weighted connected graph is its spanning tree of the smallest weight, where the weight of a tree is the sum of the weights of all its edges (the sum of the lengths of all edges). If the edge weights are unique, there is only one minimum spanning tree; otherwise, more than one MST may exist. (Example: a minimum spanning tree with w = 6.)

Constructing an MST
- Exhaustive-search approach: list all spanning trees and find the one with the minimum weight from the list. The number of spanning trees grows exponentially with the graph size.
- Efficient algorithms for finding an MST of a connected weighted graph:
  - Prim's algorithm (R. C. Prim, 1957): constructs an MST one vertex at a time, by including the nearest vertex to the vertices already in the tree.
  - Kruskal's algorithm (J. B. Kruskal, 1956): constructs an MST one edge at a time, by selecting edges in increasing order of their weights, provided that the inclusion does not create a cycle.

Prim's Algorithm

Prim's algorithm constructs an MST through a sequence of expanding subtrees. The initial subtree in such a sequence consists of a single vertex selected arbitrarily from the set V of the n vertices of the graph. On each iteration, the current tree V_T is expanded by simply attaching to it the nearest vertex not in the tree, found via the smallest edge connecting V_T to V - V_T. The algorithm stops after all the graph's vertices have been included in the tree being constructed; the total number of iterations is n - 1, since exactly one vertex is added to the tree on each iteration. The MST is then defined by the set of edges used for the tree expansions.

Pseudocode

Each vertex u not in the current tree V_T carries information about the shortest edge connecting it to a tree vertex: the name of the nearest tree vertex and the weight of the corresponding edge. For vertices not adjacent to any tree vertex, the name label is null and the weight is infinity. Non-tree vertices split into two sets: fringe vertices, adjacent to at least one tree vertex, and unseen vertices, not yet affected by the algorithm.
Algorithm Prim(G)
// Input: a weighted connected graph G = (V, E)
// Output: E_T, the set of edges composing a minimum spanning tree of G
V_T <- {v0}
E_T <- empty set
for i <- 1 to |V| - 1 do
    find a minimum-weight edge e* = (v*, u*) among all edges (v, u)
        such that v is in V_T and u is in V - V_T
    V_T <- V_T union {u*}
    E_T <- E_T union {e*}
return E_T

Two operations follow the identification of the vertex u* to be added to the tree:
- Move u* from the set V - V_T to the minimum spanning tree V_T.
- For each remaining vertex u in V - V_T that is connected to u* by a shorter edge than u's current distance label, update its labels by u* and the weight of the edge between u* and u.

Example

Tree vertices and remaining vertices on each iteration (the selected vertex is listed first; labels indicate the nearest tree vertex and edge weight):

    a(-,-):  b(a,3)  c(-,inf)  d(-,inf)  e(a,6)  f(a,5)
    b(a,3):  c(b,1)  d(-,inf)  e(a,6)  f(b,4)
    c(b,1):  d(c,6)  e(a,6)  f(b,4)
    f(b,4):  d(f,5)  e(f,2)
    e(f,2):  d(f,5)
    d(f,5)

The resulting MST has the minimum total weight of 15.

Correctness

Correctness can be proved by induction. T0, consisting of a single vertex, must be a part of any MST. For the i-th inductive step, assume that T(i-1) is part of some MST T, and then prove that T(i), generated from T(i-1), is also part of an MST. Proof by contradiction: let e = (v, u) be the smallest edge connecting a vertex in T(i-1) to G - T(i-1), used to expand T(i-1) to T(i), and suppose the MST T does not contain e. Because T is a spanning tree, it contains a unique path from v to u, which together with edge e forms a cycle in G; this path has to include another edge f = (v', u') connecting T(i-1) to G - T(i-1). Then T' = T + e - f is another spanning tree with a smaller weight than T, as e has smaller weight than f. So T was not minimum, which is what we wanted to prove.
Efficiency

Efficiency depends on the data structures chosen for the graph itself and for the priority queue of the set V - V_T, whose vertex priorities are the distances (edge weights) to the nearest tree vertices.
- For a graph represented by its weight (adjacency) matrix and a priority queue implemented as an unordered array, the running time is Θ(|V|^2).
- A priority queue can instead be implemented with a min-heap: a complete binary tree in which every element is less than or equal to its children, so the root contains the smallest element. Deletion of the smallest element, insertion of a new element, and changing an element's priority are all O(log n) operations in a min-heap of size n.
- For a graph represented by its adjacency linked lists and a priority queue implemented as a min-heap, the running time is O(|E| log |V|).

Pseudocode with Min-Heap

Use a min-heap to remember, for each vertex, the smallest edge connecting V_T with that vertex. The algorithm performs |V| - 1 steps in which it removes the smallest element from the heap, and |E| steps in which it examines an edge:

Algorithm PrimWithHeaps(G)
V_T <- {v0}
E_T <- empty set
make a heap of values (vertex, edge, weight(edge))
for i <- 1 to |V| - 1 do
    let (u, e, wt(e)) have the smallest weight in the heap
    remove (u, e, wt(e)) from the heap
    add u and e to V_T and E_T
    for each edge e' = (u, u') do
        if u' is not already in V_T
            find the value (u', f, wt(f)) in the heap
            if wt(e') < wt(f)
                replace (u', f, wt(f)) with (u', e', wt(e'))
return E_T
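A sketch of Prim's algorithm with a binary heap. Python's `heapq` has no replace/decrease-key operation, so this variant uses lazy deletion of stale heap entries instead; the graph encoding and names are mine, with edge weights taken from the slides' six-vertex example:

```python
import heapq

def prim_mst(graph, start):
    """Prim's algorithm: grow the tree by repeatedly attaching the nearest
    non-tree vertex. `graph` maps vertex -> list of (neighbor, weight)."""
    in_tree = {start}
    mst_edges, total = [], 0
    heap = [(w, start, u) for u, w in graph[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, v, u = heapq.heappop(heap)
        if u in in_tree:
            continue                      # stale entry: u already joined the tree
        in_tree.add(u)
        mst_edges.append((v, u, w))
        total += w
        for x, wx in graph[u]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, u, x))
    return mst_edges, total

# The six-vertex example graph from the slides
g = {
    "a": [("b", 3), ("e", 6), ("f", 5)],
    "b": [("a", 3), ("c", 1), ("f", 4)],
    "c": [("b", 1), ("d", 6), ("f", 4)],
    "d": [("c", 6), ("e", 8), ("f", 5)],
    "e": [("a", 6), ("d", 8), ("f", 2)],
    "f": [("a", 5), ("b", 4), ("c", 4), ("d", 5), ("e", 2)],
}
edges, total = prim_mst(g, "a")
assert total == 15 and len(edges) == 5    # matches the example's total weight
```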
Kruskal's Algorithm

Kruskal's algorithm is a greedy algorithm for constructing a minimum spanning tree of a weighted connected graph: it finds an acyclic subgraph with |V| - 1 edges for which the sum of the edge weights is the smallest. It constructs an MST as an expanding sequence of subgraphs, which are always acyclic but are not necessarily connected until the final stage. The algorithm begins by sorting the graph's edges in nondecreasing order of their weights and then scans the list, adding the next edge on the list to the current subgraph provided that the inclusion does not create a cycle:

Algorithm Kruskal(G)
sort E in nondecreasing order of the edge weights
E_T <- empty set; ecounter <- 0
k <- 0
while ecounter < |V| - 1 do
    k <- k + 1
    if E_T union {e_k} is acyclic
        E_T <- E_T union {e_k}
        ecounter <- ecounter + 1
return E_T

Example

Sorted list of edges (the selected edges are shown in bold in the slides):

    bc  ef  ab  bf  cf  af  df  ae  cd  de
    1   2   3   4   4   5   5   6   6   8

Selected edges: bc, ef, ab, bf, df. Picking up any of the remaining edges (cf, af, ae, cd, de) would create a cycle. For a graph of 6 vertices, only five edges need to be picked up; the total weight is 15.

Kruskal's Algorithm: A Different View

The algorithm is a progression through a series of forests containing all the vertices of the given graph and some of its edges. The initial forest consists of |V| trivial trees, each comprising a single vertex of the graph; the final forest consists of a single tree, the MST. On each iteration, the algorithm takes the next edge (u, v) from the ordered list of the graph's edges, finds the trees containing the vertices u and v, and, if these trees are not the same, unites them in a larger tree by adding the edge; this avoids creating a cycle. Checking whether two vertices belong to two different trees requires an application of the so-called union-find algorithm. The time efficiency of Kruskal's algorithm is in O(|E| log |E|).
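A sketch of Kruskal's algorithm with a quick-find-style union-find, in which each element stores its subset's representative (all names are mine; edges are the slides' sorted list):

```python
def kruskal_mst(vertices, edges):
    """Kruskal's algorithm: scan edges in nondecreasing weight order and add
    an edge iff its endpoints lie in different trees (no cycle is created)."""
    rep = {v: v for v in vertices}          # makeset(v): each v is its own subset

    def find(v):
        return rep[v]                       # quick find: representative lookup

    def union(a, b):
        ra, rb = rep[a], rep[b]
        for v in rep:                       # relabel everything in b's subset
            if rep[v] == rb:
                rep[v] = ra

    mst, total = [], 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if find(u) != find(v):              # different trees: safe to add
            union(u, v)
            mst.append((u, v, w))
            total += w
    return mst, total

edges = [("b", "c", 1), ("e", "f", 2), ("a", "b", 3), ("b", "f", 4),
         ("c", "f", 4), ("a", "f", 5), ("d", "f", 5), ("a", "e", 6),
         ("c", "d", 6), ("d", "e", 8)]
mst, total = kruskal_mst("abcdef", edges)
assert total == 15 and len(mst) == 5        # same MST weight as Prim's finds
```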
Union-Find Algorithm

Kruskal's algorithm requires a dynamic partition of some n-element set S into a collection of disjoint subsets S1, S2, ..., Sk. After initialization, in which each disjoint subset is a one-element subset containing a different element of S, union and find operations act on the collection to produce larger subsets. The abstract data type for the finite set:
- makeset(x): creates the one-element set {x};
- find(x): returns the subset containing x;
- union(x, y): constructs the union of the disjoint subsets containing x and y.

Use one element from each of the disjoint subsets in the collection as the subset's representative. Two principal implementations:
- Quick find uses an array indexed by the elements of the set; the array's values indicate the representatives of the subsets containing those elements. Each subset is implemented as a linked list.
- Quick union represents each subset by a rooted tree, with one element per node and the root's element as the subset's representative.

Single-Source Shortest-Paths Problem

Problem Statement

For a given vertex, called the source, in a weighted connected graph, find the shortest paths to all its other vertices: a family of paths, each leading from the source to a different vertex in the graph. The resulting tree is a spanning tree. A variety of applications exist, e.g., finding the shortest route between two cities. Dijkstra's algorithm finds the shortest paths to the graph's vertices in order of their distance from a given source, and it works for graphs with nonnegative edge weights. (Example: a tree representing possible shortest paths to the four vertices b, d, c, and e from the source a, of path lengths 3, 5, 7, and 9, respectively; if the source is different, a different tree results.) Different versions of the problem include the single-pair shortest-path problem, the single-destination shortest-paths problem, the all-pairs shortest-paths problem, and the traveling salesman problem.

Dijkstra's Algorithm

Dijkstra's algorithm works in the same way as Prim's algorithm does: both construct an expanding subtree of vertices by selecting the next vertex from the priority queue of the remaining vertices, using similar labeling. However, the priorities are computed differently: Dijkstra's algorithm compares path lengths, obtained by adding edge weights, while Prim's algorithm compares the edge weights as given. The algorithm works by first finding the shortest path from the source to the vertex nearest to it, then to a second-nearest, and so on. In general, before its i-th iteration commences, the algorithm has already identified the shortest paths to i - 1 other vertices nearest to the source.
These vertices, the source, and the edges of the shortest paths leading to them from the source form a subtree T(i) of the given graph. The next vertex nearest to the source can be found among the vertices adjacent to the vertices of T(i); these adjacent vertices, referred to as fringe vertices, are the candidates from which the algorithm selects the next vertex nearest to the source.

Labeling

For every fringe vertex u, the algorithm computes the sum of the length d(v) of the shortest path from the source to the nearest tree vertex v and the weight of the edge (v, u), and it selects the vertex with the smallest such sum. Each vertex carries two labels:
- The numeric label d indicates the length of the shortest path from the source to this vertex found by the algorithm so far; when a vertex is added to the tree, d indicates the length of the shortest path from the source to that vertex.
- The other label indicates the name of the next-to-last vertex on such a path: the parent of the vertex in the tree being constructed.

With such labeling, finding the next nearest vertex u* becomes the simple task of finding a fringe vertex with the smallest d value. After the vertex u* to be added to the tree is identified, perform two operations:
- Move u* from the fringe to the set of tree vertices.
- For each remaining fringe vertex u that is connected to u* by an edge of weight w(u*, u) such that d(u*) + w(u*, u) < d(u), update the labels of u by u* and d(u*) + w(u*, u), respectively.

Pseudocode

The pseudocode shows explicit operations on two sets of labeled vertices: the set V_T of vertices for which a shortest path has already been found, and the priority queue Q of the fringe vertices. The queue operations are: Initialize (initialize the vertex priority queue to empty), Insert (initialize a vertex's priority in the priority queue), Decrease (update the priority of s with d(s)), and DeleteMin (delete the minimum-priority element).
Algorithm Dijkstra(G, s)
// Input: a weighted connected graph G = (V, E) and its vertex s
// Output: the length d(v) of a shortest path from s to v and its
// penultimate vertex p(v) for every vertex v in V
Initialize(Q)
for every vertex v in V do
    d(v) <- infinity; p(v) <- null
    Insert(Q, v, d(v))
d(s) <- 0; Decrease(Q, s, d(s))
V_T <- empty set
for i <- 0 to |V| - 1 do
    u* <- DeleteMin(Q)
    V_T <- V_T union {u*}
    for every vertex u in V - V_T that is adjacent to u* do
        if d(u*) + w(u*, u) < d(u)
            d(u) <- d(u*) + w(u*, u); p(u) <- u*
            Decrease(Q, u, d(u))

Example

Tree vertices and remaining vertices on each iteration (the selected vertex is listed first; labels indicate the nearest tree vertex and path length):

    a(-,0):  b(a,3)  c(-,inf)  d(a,7)  e(-,inf)
    b(a,3):  c(b,3+4)  d(b,3+2)  e(-,inf)
    d(b,5):  c(b,7)  e(d,5+4)
    c(b,7):  e(d,9)
    e(d,9)

The shortest paths from the source a (with their lengths): to b, a-b of length 3; to d, a-b-d of length 5; to c, a-b-c of length 7; to e, a-b-d-e of length 9.

Correctness

Correctness can be proved by induction. For i = 1 the assertion is true: the trivial path from the source to itself. For the general step, assume that it is true for the algorithm's tree T(i) with i vertices, and let v(i+1) be the vertex to be added next by the algorithm. All vertices on a shortest path from s to v(i+1) that are closer to s than v(i+1) must be in T(i), because they are closer to s. Hence the (i+1)-st closest vertex can be selected, as the algorithm does, by minimizing the sum of d(v) and the length of the edge from v to an adjacent vertex not in the tree, where d(v), the length of the shortest path from s to v, is contained in T(i) by the induction assumption.

Efficiency

The time efficiency of Dijkstra's algorithm depends on the data structures chosen for the graph itself and for the priority queue. For a graph represented by its weight matrix and a priority queue implemented as an unordered array, the running time is Θ(|V|^2); for a graph represented by its adjacency linked lists and a priority queue implemented as a min-heap, the running time is O(|E| log |V|).
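A sketch of Dijkstra's algorithm with a binary heap; lazy deletion of stale entries replaces the pseudocode's Decrease operation, since `heapq` has no decrease-key. The example graph's edge weights are reconstructed from the traced labels above, so treat them as an assumption:

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's algorithm for nonnegative edge weights. Returns (d, p):
    shortest-path lengths and penultimate (parent) vertices."""
    d = {v: float("inf") for v in graph}
    p = {v: None for v in graph}
    d[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale entry: u was already finalized
        done.add(u)
        for v, w in graph[u]:
            if du + w < d[v]:             # a shorter path through u was found
                d[v] = du + w
                p[v] = u
                heapq.heappush(heap, (d[v], v))
    return d, p

# Example graph (weights reconstructed from the iteration trace)
g = {
    "a": [("b", 3), ("d", 7)],
    "b": [("a", 3), ("c", 4), ("d", 2)],
    "c": [("b", 4), ("d", 5), ("e", 6)],
    "d": [("a", 7), ("b", 2), ("c", 5), ("e", 4)],
    "e": [("c", 6), ("d", 4)],
}
d, p = dijkstra(g, "a")
assert d == {"a": 0, "b": 3, "c": 7, "d": 5, "e": 9}   # lengths 3, 5, 7, 9
```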
Huffman Tree and Code
- Huffman trees allow us to encode a text composed of characters from some n-character alphabet.
- A Huffman code is an optimal prefix-free, variable-length encoding scheme that assigns bit strings to characters based on their frequencies in a given text.
  - It uses a greedy construction of a binary tree whose leaves represent the alphabet's characters and whose left and right edges are labeled with 0's and 1's.
  - It assigns shorter bit strings to high-frequency characters and longer ones to low-frequency characters.

Huffman's Algorithm
- Initialize n one-node trees labeled with the characters of the alphabet, with the frequency of each character recorded as the weight in its tree's root.
- Repeatedly find the two trees with the smallest weights and make them the left and right subtrees of a new tree, recording the sum of their weights in the root of the new tree as its weight.
  - The resulting binary tree is called a Huffman tree.
- Obtain the codeword of a character by recording the labels (0 or 1) on the simple path from the root to the character's leaf.
  - This is the Huffman code; it provides an optimal encoding.
- Dynamic Huffman encoding: the coding tree is updated each time a new character is read from the source text.

Constructing a Huffman Coding Tree
- See the textbook for the following five-character alphabet {A, B, C, D, _} example:

  character    A     B     C     D     _
  probability  0.35  0.1   0.2   0.2   0.15
  codeword     11    100   00    01    101

- The expected number of bits per character is
  sum_{i=1}^{5} l_i * p_i = 2(0.35) + 3(0.1) + 2(0.2) + 2(0.2) + 3(0.15) = 2.25.
- The variance is Var = sum_{i=1}^{5} (l_i - 2.25)^2 * p_i ≈ 0.19.
- In a fixed-length scheme, each codeword would contain three bits, so the Huffman code results in compression by (3 - 2.25)/3 = 0.75/3 = 0.25, which is 25%.
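The greedy Huffman construction described above can be sketched with a priority queue of partial trees. This is a minimal sketch, assuming the notes' five-character example; the function name and tree representation (nested tuples) are my own, and the exact codewords depend on how ties between equal weights are broken, though the codeword lengths and the 2.25 expected bits do not.

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Greedy Huffman construction: repeatedly merge the two lightest
    trees; codewords are the 0/1 edge labels from root to leaf.
    freqs: dict mapping character -> frequency (or probability)."""
    tiebreak = count()  # unique ints keep heap comparisons off tree nodes
    heap = [(w, next(tiebreak), ch) for ch, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two smallest weights
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node
            walk(node[0], prefix + "0")      # left edge labeled 0
            walk(node[1], prefix + "1")      # right edge labeled 1
        else:
            codes[node] = prefix or "0"      # leaf: record the codeword
    walk(heap[0][2], "")
    return codes

freqs = {"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}
codes = huffman_code(freqs)
avg_bits = sum(len(codes[ch]) * p for ch, p in freqs.items())
```

For these probabilities the construction yields two-bit codewords for A, C, D and three-bit codewords for B and _, so `avg_bits` is 2.25, matching the hand computation above.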
Algorithm Design Techniques
- Various design techniques exist; they classify algorithms based on design ideas or commonality and serve as general problem-solving strategies:
  - Brute force
  - Divide-and-conquer
  - Decrease-and-conquer
  - Transform-and-conquer
  - Space-and-time tradeoffs
  - Dynamic programming
  - Greedy techniques

Algorithm Design Techniques: Examples
- Brute force: selection sort; brute-force string matching; convex-hull problem; exhaustive search (traveling salesman, knapsack, and assignment problems)
- Divide-and-conquer: Master Theorem; mergesort; quicksort; quickhull
- Decrease-and-conquer: insertion sort; permutations (minimal-change approach, Johnson-Trotter algorithm); fake-coin problem; ternary search; computing a median; topological sorting
- Transform-and-conquer: Gaussian elimination; heaps and heapsort; problem reduction
- Space-and-time tradeoffs: string matching (Horspool's algorithm, Boyer-Moore algorithm)
- Dynamic programming: Warshall's algorithm for transitive closure; Floyd's algorithm for all-pairs shortest paths
- Greedy techniques: MST problem (Prim's algorithm, Kruskal's algorithm); sets and set operations; Dijkstra's algorithm for the single-source shortest-path problem; Huffman tree and code
- More on algorithms

Brute Force: The Simplest Approach
- Brute force is the simplest of the design strategies.
  - It is a straightforward approach to solving a problem, usually based directly on the problem's statement and on the definitions of the concepts involved.
  - "Just do it": the brute-force strategy is the easiest to apply, and it results in an algorithm that can often be improved with a modest amount of effort.
- Brute force is important because of its wide applicability and simplicity.
- Its weakness is the subpar efficiency of most brute-force algorithms.
- Important examples: selection sort, string matching, the convex-hull problem, and exhaustive search.

Selection Sort
- Scan the list repeatedly, locating the elements one at a time in nondecreasing order.
  - On the ith pass through the list, search for the smallest item among the last n - i elements and swap it with A[i]; after n - 1 passes the list is sorted.

Algorithm SelectionSort(A[0..n-1])
//Sorts a given array
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in ascending order
for i <- 0 to n-2 do
    min <- i
    for j <- i+1 to n-1 do
        if A[j] < A[min]
            min <- j
    swap A[i] and A[min]

- The number of key comparisons is
  C(n) = sum_{i=0}^{n-2} sum_{j=i+1}^{n-1} 1 = sum_{i=0}^{n-2} (n-1-i) = n(n-1)/2,
  so selection sort is Θ(n^2).

Brute-Force String Matching
- Align the pattern against the first m characters of the text and start matching the corresponding pairs of characters from left to right until either all m pairs match or a mismatching pair is found; in the latter case, shift the pattern one position to the right and resume the character comparisons.
- The goal is to find i, the index of the leftmost character of the first matching substring in the text, such that T[i] = P[0], ..., T[i+j] = P[j], ..., T[i+m-1] = P[m-1].

Algorithm BruteForceStringMatching(T[0..n-1], P[0..m-1])
//Implements brute-force string matching
//Input: An array T[0..n-1] of n characters representing a text and
//       an array P[0..m-1] of m characters representing a pattern
//Output: The position of the first character in the text that starts the
//        first matching substring if the search is successful, and -1 otherwise
for i <- 0 to n-m do
    j <- 0
    while j < m and P[j] = T[i+j] do
        j <- j+1
    if j = m return i
return -1

- In the worst case the algorithm is Θ(mn), but the average-case efficiency for searching in random texts is shown to be linear, i.e., Θ(m + n) = Θ(n).

Convex-Hull Problem
- The convex hull of any set S of n points is the convex polygon with its vertices at some of the points (called the extreme points of S) such that the polygon contains all the points either inside or on its boundary.
  - A set of points in the plane is called convex if, for any two points P and Q in the set, the entire line segment with endpoints P and Q belongs to the set.
  - Example: the convex hull of a set of eight points is the convex polygon with its vertices at P1, P5, P6, P7, and P3.
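The brute-force string matcher above translates almost line for line into Python. This is a minimal sketch of the pseudocode, using the classic "NOBODY_NOTICED_HIM" / "NOT" instance as a usage example; the function name is my own.

```python
def brute_force_match(text, pattern):
    """Slide the pattern one position at a time, comparing characters
    left to right; returns the index of the first match, or -1.
    Worst case Theta(mn) character comparisons."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        j = 0
        while j < m and pattern[j] == text[i + j]:
            j += 1
        if j == m:          # all m pairs matched
            return i
    return -1

brute_force_match("NOBODY_NOTICED_HIM", "NOT")  # -> 7
```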
Brute-Force Approach for the Convex-Hull Problem
- A brute-force approach is based on the following fact: a line segment connecting two points Pi and Pj of the set is a part of the convex hull's boundary if and only if all the other points of the set lie on the same side of the straight line through these two points.
  - Repeating this test for every pair of points yields the list of line segments that make up the convex hull's boundary.
- The equation of the straight line through two points (x1, y1) and (x2, y2) is
  ax + by = c, where a = y2 - y1, b = x1 - x2, and c = x1*y2 - y1*x2.
- This line divides the plane into two half-planes:
  - for all the points in one half-plane, ax + by > c, while for all the points in the other half-plane, ax + by < c;
  - for all points on the line itself, ax + by = c.
- So, to check whether certain points lie on the same side of the line, check whether the expression ax + by - c has the same sign at each of these points; if it does, the line forms one side of the polygon.
- Efficiency of the algorithm: O(n^3).
  - For each of the n(n-1)/2 pairs of distinct points, one needs to find the sign of ax + by - c for each of the other n - 2 points.

Exhaustive Search
- Exhaustive search is a brute-force approach to combinatorial problems.
  - It suggests generating each and every combinatorial object of the problem, selecting those that satisfy the problem's constraints, and then finding a desired object.
  - It is impractical for all but very small instances of problems.
- Examples:
  - Traveling salesman problem: finding the shortest tour through a given set of n cities that visits each city exactly once before returning to the city where it started.
  - Knapsack problem: finding the most valuable subset out of n items that fits into the knapsack.
  - Assignment problem: finding an assignment of n people to execute n jobs with the smallest total cost.
- These problems are examples of so-called NP-hard problems.

Traveling Salesman Problem
- Asks for the shortest tour through a given set of n cities that visits each city exactly once before returning to the city where it started.
- Equivalently, find the shortest Hamiltonian circuit of the graph:
  - a cycle that passes through all the vertices of the graph exactly once,
  - i.e., a sequence of n + 1 adjacent vertices v0, v1, v2, ..., v(n-1), v0.
- Get all tours by generating all the permutations of the n - 1 intermediate cities, compute the tour lengths, and find the shortest among them.
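The generate-all-tours strategy just described can be sketched with `itertools.permutations`. This is a minimal sketch, not a definitive implementation; the four-city distance instance is reconstructed from the tour lengths in the notes' table, so treat the weights as assumptions.

```python
from itertools import permutations

def tsp_brute_force(dist, cities):
    """Exhaustive search: try all (n-1)! tours that start and end at the
    first city and return a shortest one.  dist maps frozenset({x, y})
    to the distance between cities x and y (symmetric)."""
    start, rest = cities[0], cities[1:]
    def length(tour):
        edges = zip(tour, tour[1:] + (tour[0],))   # close the cycle
        return sum(dist[frozenset(e)] for e in edges)
    best = min(((start,) + p for p in permutations(rest)), key=length)
    return best, length(best)

# Four-city instance assumed from the notes' tour table:
d = {frozenset(e): w for e, w in {("a", "b"): 2, ("b", "c"): 8,
                                  ("c", "d"): 1, ("a", "d"): 7,
                                  ("a", "c"): 5, ("b", "d"): 3}.items()}
tour, cost = tsp_brute_force(d, ("a", "b", "c", "d"))
```

As the table below notes, two tours tie at the optimal length 11 (the same cycle traversed in opposite directions); `min` simply returns the first one generated.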
- Efficiency: the total number of permutations is (n-1)!, so exhaustive search is impractical for all but very small values of n. For the four-city example:

  Tour                     Length
  a -> b -> c -> d -> a    2 + 8 + 1 + 7 = 18
  a -> b -> d -> c -> a    2 + 3 + 1 + 5 = 11
  a -> c -> b -> d -> a    5 + 8 + 3 + 7 = 23
  a -> c -> d -> b -> a    5 + 1 + 3 + 2 = 11
  a -> d -> b -> c -> a    7 + 3 + 8 + 5 = 23
  a -> d -> c -> b -> a    7 + 1 + 8 + 2 = 18

  There is more than one optimal solution: the two tours of length 11 traverse the same cycle in opposite directions.

Knapsack Problem
- Given n items of known weights w1, ..., wn and values v1, ..., vn and a knapsack of capacity W, find the most valuable subset of the items that fits into the knapsack.
- Exhaustive search considers all the subsets of the set of n items:
  - compute the total weight of each subset to identify the feasible subsets (those whose total weight does not exceed the knapsack's capacity), and
  - find the subset of the largest value among them.
- The number of subsets of an n-element set is 2^n, so exhaustive search leads to an Ω(2^n) algorithm.
- In the textbook's instance, w = (7, 3, 4, 5), v = ($42, $12, $40, $25), W = 10:

  Subset        Total weight   Total value
  {}                 0              $0
  {1}                7             $42
  {2}                3             $12
  {3}                4             $40
  {4}                5             $25
  {1,2}             10             $54
  {1,3}             11          not feasible
  {1,4}             12          not feasible
  {2,3}              7             $52
  {2,4}              8             $37
  {3,4}              9             $65
  {1,2,3}           14          not feasible
  {1,2,4}           15          not feasible
  {1,3,4}           16          not feasible
  {2,3,4}           12          not feasible
  {1,2,3,4}         19          not feasible

  The optimal subset is {3, 4}, with value $65.

Assignment Problem
- Given n people to be assigned to execute n jobs, one person per job, find the assignment with the smallest total cost, where C[i, j] is the cost of assigning the ith person to the jth job.
- An instance of the problem is completely specified by its cost matrix C. In the textbook's instance:

               Job 1  Job 2  Job 3  Job 4
  Person 1       9      2      7      8
  Person 2       6      4      3      7
  Person 3       5      8      1      8
  Person 4       7      6      9      4

- Select one element in each row so that all the selected elements are in different columns and their sum is the smallest possible.
- Exhaustive search generates all the permutations of the integers 1, 2, ..., n, computes the total cost of each assignment by summing the corresponding elements of the cost matrix, and selects the one with the smallest sum. For example:

  Permutation      Cost
  <1, 2, 3, 4>     9 + 4 + 1 + 4 = 18
  <1, 2, 4, 3>     9 + 4 + 8 + 9 = 30
  <1, 3, 2, 4>     9 + 3 + 8 + 4 = 24
  ...
- The optimal solution is <2, 1, 3, 4>, with the minimum cost of 13: person 1 to job 2, person 2 to job 1, person 3 to job 3, and person 4 to job 4.
- Complexity: n! operations.
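The subset-generation step of the exhaustive knapsack search can be sketched with `itertools.combinations`. This is a minimal sketch; the weights, values, and capacity are the instance assumed in the table above, and the function name is my own.

```python
from itertools import combinations

def knapsack_brute_force(weights, values, capacity):
    """Exhaustive search: examine all 2^n subsets, keep the feasible
    ones (total weight <= capacity), and return the most valuable."""
    n = len(weights)
    best_value, best_subset = 0, ()
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            weight = sum(weights[i] for i in subset)
            value = sum(values[i] for i in subset)
            if weight <= capacity and value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset

# Instance assumed from the notes' table: w=(7,3,4,5), v=(42,12,40,25), W=10
value, subset = knapsack_brute_force((7, 3, 4, 5), (42, 12, 40, 25), 10)
```

Here `subset` uses 0-based item indices, so the optimal `(2, 3)` corresponds to items {3, 4} in the table.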
Brute Force Summary
- Simplest approach: "just do it."
- Selection sort: Θ(n^2)
- String matching: O(mn)
- Convex-hull problem: O(n^3)
- Exhaustive search:
  - Traveling salesman problem: O((n-1)!)
  - Knapsack problem: Ω(2^n)
  - Assignment problem: O(n!)

Notion of Algorithm

Program = Algorithms + Data Structures
N. Wirth, Algorithms + Data Structures = Programs, Prentice-Hall, Englewood Cliffs, NJ, 1976.

What is an Algorithm?
- An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a desired output for any legitimate input in a finite amount of time.
  - Algorithms are procedural solutions to problems.
- The computer is assumed to be capable of understanding and following the instructions given.
- An input specifies an instance of the problem the algorithm solves.

Motivation for Studying Algorithms
- Theoretical importance:
  - The study of algorithms represents the core of computer science.
  - Proving the existence of a solution to a problem and investigating the algorithm's properties.
- Practical importance:
  - A practitioner's toolkit of known algorithms.
  - A framework for designing and analyzing algorithms for new problems.

Example: gcd Algorithms
- Problem: finding the greatest common divisor (gcd) of two nonnegative, not-both-zero integers m and n, denoted gcd(m, n).
- Solution: several algorithms exist for solving this problem.
  - Euclid's algorithm: apply repeatedly the equality gcd(m, n) = gcd(n, m mod n) until m mod n is 0; the last value of m is the gcd.
  - Definition-based algorithm: start by checking whether t = min(m, n) divides both m and n. If it does, t is the answer; if not, decrease t by 1 and try again. The last value of t that divides both integers is the gcd.
  - Middle-school algorithm: find the prime factors of both m and n and identify all the common factors, whose product is the gcd.

Example: Sequential Search
Algorithm SequentialSearch(A[0..n-1], K)
//Searches for a given value in a given array by sequential search
//Input: An array A[0..n-1] and a search key K
//Output: Returns the index of the first element of A that matches K,
//        or -1 if there are no matching elements
i <- 0
while i < n and A[i] != K do
    i <- i+1
if i < n return i
else return -1

Example: Binary Search

Algorithm BinarySearch(A[0..n-1], K)
//Implements a nonrecursive binary search
//Input: An array A[0..n-1] sorted in ascending order and a search key K
//Output: An index of the array's element that is equal to K,
//        or -1 if there is no such element
l <- 0; r <- n-1
while l <= r do
    m <- floor((l + r)/2)
    if K = A[m] return m
    else if K < A[m] r <- m-1
    else l <- m+1
return -1

Characteristics of an Algorithm
- Finiteness: terminates after a finite number of steps.
- Definiteness: rigorously and unambiguously specified.
- Input: valid inputs are clearly specified.
- Output: can be proved to produce the correct output given a valid input.
- Effectiveness: steps are sufficiently simple and basic.

Algorithm Design Techniques
- Brute force
- Divide-and-conquer
- Decrease-and-conquer
- Transform-and-conquer
- Space-and-time tradeoffs
- Dynamic programming
- Greedy techniques

Divide-and-Conquer: Basic Idea
- A divide-and-conquer algorithm works as follows:
  - A problem's instance is divided into several smaller instances of the same problem, ideally of about the same size.
  - The smaller instances are solved, typically recursively; sometimes a different algorithm is applied when instances become small enough.
  - The solutions obtained for the smaller instances are combined to get a solution to the original problem (in some cases no combining is necessary).
- Divide-and-conquer is ideally suited to parallel computers, in which each subproblem can be solved simultaneously by its own processor.
- Common case: dividing a problem into two smaller problems.
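The BinarySearch pseudocode above can be checked with a short sketch. This is a minimal, assumed translation (the function name is my own); it mirrors the nonrecursive three-way comparison loop.

```python
def binary_search(a, key):
    """Nonrecursive binary search in a sorted array; returns an index of
    an element equal to key, or -1 if there is no such element."""
    l, r = 0, len(a) - 1
    while l <= r:
        m = (l + r) // 2           # middle index
        if a[m] == key:
            return m
        elif key < a[m]:
            r = m - 1              # continue in the left half
        else:
            l = m + 1              # continue in the right half
    return -1

binary_search([3, 14, 27, 31, 39, 42, 55, 70, 74, 81, 85, 93, 98], 70)
```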
Master Theorem
- A problem's instance of size n is often divided into two instances of size n/2. In general, an instance of size n can be divided into several instances of size n/b, with a of them needing to be solved (a >= 1, b >= 2).
- If n is a power of b, say n = b^k, the recurrence relation for the running time is
  T(n) = aT(n/b) + f(n),
  called the general divide-and-conquer recurrence, where f(n) is a function that accounts for the time spent on dividing the problem into smaller ones and on combining their solutions.
- Master Theorem: if f(n) ∈ Θ(n^d), where d >= 0, then
  T(n) ∈ Θ(n^d)           if a < b^d
  T(n) ∈ Θ(n^d log n)     if a = b^d
  T(n) ∈ Θ(n^(log_b a))   if a > b^d
  Analogous results hold for the big-oh and big-omega notations.
- The theorem can only establish the solution's order of growth to within an unknown multiplicative constant, but solving a recurrence equation with a specific initial condition can actually yield an exact answer.

Mergesort
- Mergesort sorts a given array by dividing it into two halves, sorting each of them recursively, and then merging the two smaller sorted arrays into a single sorted one.
- See Figure 2 of the textbook for the operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4:
  - first divide the list recursively into two halves until each subarray contains one element,
  - then merge the subarrays two at a time to create a sorted array.

Algorithm Mergesort(A[0..n-1])
//Sorts a given array
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in nondecreasing order
if n > 1
    copy A[0..floor(n/2)-1] to B[0..floor(n/2)-1]
    copy A[floor(n/2)..n-1] to C[0..ceil(n/2)-1]
    Mergesort(B[0..floor(n/2)-1])
    Mergesort(C[0..ceil(n/2)-1])
    Merge(B, C, A)

Merging
- Merge two sorted arrays using two pointers (array indices i and j), comparing the elements they currently point to; the smaller of the two is added to the new array being constructed.

Algorithm Merge(B[0..p-1], C[0..q-1], A[0..p+q-1])
//Merges two sorted arrays into one sorted array
//Input: Arrays B[0..p-1] and C[0..q-1], both sorted
//Output: Sorted array A[0..p+q-1] of the elements of B and C
i <- 0; j <- 0; k <- 0
while i < p and j < q do
    if B[i] <= C[j]
        A[k] <- B[i]; i <- i+1
    else A[k] <- C[j]; j <- j+1
    k <- k+1
if i = p
    copy C[j..q-1] to A[k..p+q-1]
else copy B[i..p-1] to A[k..p+q-1]

Mergesort's Efficiency
- For simplicity, assume that n is a power of 2, so the recurrence relation for the number of key comparisons is
  C(n) = 2C(n/2) + C_merge(n) for n > 1, C(1) = 0,
  where each C(n/2) corresponds to one of the two subarrays of size n/2 obtained by dividing the array of size n.
- Worst case: the number of key comparisons performed during the merging stage is C_merge(n) = n - 1, so the worst-case recurrence becomes
  C_worst(n) = 2C_worst(n/2) + n - 1 for n > 1, C_worst(1) = 0.
- Using the Master Theorem: a = 2, b = 2, and d = 1, so a = b^d and hence C_worst(n) ∈ Θ(n log n).
- For n = 2^k, one can actually find the exact solution: C_worst(n) = n log2 n - n + 1.
- The time efficiency is Θ(n log n) in all cases, with the number of key comparisons being very close to the theoretical minimum. Mergesort's principal drawback is a significant extra storage requirement.
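The Mergesort and Merge pseudocode above can be combined in one short sketch. This is a minimal, assumed translation (it returns a new list rather than sorting in place, which makes the extra-storage cost explicit).

```python
def mergesort(a):
    """Divide-and-conquer: split in half, sort each half recursively,
    then merge the two sorted halves; Theta(n log n) comparisons."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b = mergesort(a[:mid])
    c = mergesort(a[mid:])
    # Merge with two pointers, always taking the smaller front element.
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            merged.append(b[i]); i += 1
        else:
            merged.append(c[j]); j += 1
    return merged + b[i:] + c[j:]    # copy the remaining tail

mergesort([8, 3, 2, 9, 7, 1, 5, 4])
```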
Quicksort
- Quicksort sorts a given array by partitioning its elements according to their values relative to some preselected element (the pivot).
- It rearranges elements to achieve a partition: a situation where all the elements before some position s are smaller than or equal to A[s], and all the elements after position s are greater than or equal to A[s].
- The two subarrays of elements preceding and following A[s] are then sorted independently by the same method.

Algorithm Quicksort(A[l..r])
//Sorts a subarray
//Input: A subarray A[l..r] of A[0..n-1], defined by its left and right indices l and r
//Output: The subarray A[l..r] sorted in nondecreasing order
if l < r
    s <- Partition(A[l..r])    //s is a split position
    Quicksort(A[l..s-1])
    Quicksort(A[s+1..r])

Achieving a Partition
- Select a pivot: simply choose the subarray's first element, p = A[l], or use median-of-three partitioning.
- Achieve a partition by left-to-right and right-to-left scans:
  - The left-to-right scan starts with the second element, skips over elements that are smaller than the pivot, and stops on encountering the first element greater than or equal to the pivot. The elements smaller than the pivot end up in the first part of the subarray.
  - The right-to-left scan starts with the last element of the subarray, skips over elements that are larger than the pivot, and stops on encountering the first element smaller than or equal to the pivot. The elements larger than the pivot end up in the second part of the subarray.
- Three scan situations:
  - If the scanning indices i and j have not crossed (i < j), exchange A[i] and A[j] and resume the scans, incrementing i and decrementing j, respectively.
  - If the scanning indices have crossed over (i > j), the array is partitioned after exchanging the pivot with A[j].
  - If the scanning indices stop at the same element (i = j), the value they point to must equal p, and the array is partitioned.
- Example: quicksort of an array of size n = 8: 5, 3, 1, 9, 8, 2, 4, 7, with the pivot A[0] = 5 (see Figure 4.3 in the textbook).

Algorithm Partition(A[l..r])
//Partitions a subarray using its first element as a pivot
//Input: A subarray A[l..r] of A[0..n-1], defined by its left and right indices l and r
//Output: A partition of A[l..r], with the split position returned as this function's value
p <- A[l]
i <- l; j <- r+1
repeat
    repeat i <- i+1 until A[i] >= p
    repeat j <- j-1 until A[j] <= p
    swap(A[i], A[j])
until i >= j
swap(A[i], A[j])    //undo the last swap when i >= j
swap(A[l], A[j])
return j

Quicksort's Efficiency: Best Case
- The number of key comparisons made before a partition is achieved is n + 1 if the scanning indices cross over, and n if they coincide.
- The best case occurs when all the splits happen in the middle of the corresponding subarrays. The best-case recurrence is
  C_best(n) = 2C_best(n/2) + n for n > 1, C_best(1) = 0.
- Using the Master Theorem: a = 2, b = 2, and d = 1, so a = b^d and hence C_best(n) ∈ Θ(n log2 n); for n = 2^k, the exact solution is C_best(n) = n log2 n.
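The two-scan partitioning scheme above can be sketched as follows. This is a minimal sketch in the spirit of the notes' Hoare-style pseudocode (first element as pivot, opposite-direction scans), not a verbatim translation: the scan-boundary checks are rearranged to stay within Python list bounds, so no "undo last swap" step is needed.

```python
def quicksort(a, l=0, r=None):
    """In-place quicksort with the subarray's first element as pivot."""
    if r is None:
        r = len(a) - 1
    if l < r:
        s = partition(a, l, r)      # split position
        quicksort(a, l, s - 1)
        quicksort(a, s + 1, r)
    return a

def partition(a, l, r):
    p = a[l]                        # pivot: first element
    i, j = l, r + 1
    while True:
        i += 1
        while i < r and a[i] < p:   # left-to-right scan
            i += 1
        j -= 1
        while a[j] > p:             # right-to-left scan
            j -= 1
        if i >= j:
            break                   # indices crossed (or met)
        a[i], a[j] = a[j], a[i]     # indices have not crossed: exchange
    a[l], a[j] = a[j], a[l]         # place the pivot at the split position
    return j

quicksort([5, 3, 1, 9, 8, 2, 4, 7])
```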
Quicksort's Efficiency: Worst Case
- The worst case occurs if all the splits are skewed to the extreme: one of the subarrays is empty, while the size of the other is just one less than the size of the subarray being partitioned.
- This is true, for example, for a strictly increasing array with A[0] used as the pivot; the first partition requires n + 1 comparisons. The recurrence relation is
  C_worst(n) = C_worst(n-1) + (n+1) for n > 1, C_worst(1) = 0, C_worst(0) = 0.
- Solving the relation gives C_worst(n) ∈ Θ(n^2).

Quicksort's Efficiency: Average Case
- With the split position s equally likely to be any of 0, 1, ..., n-1, the average-case recurrence is
  C_avg(n) = (1/n) * sum_{s=0}^{n-1} [(n+1) + C_avg(s) + C_avg(n-1-s)] for n > 1,
  C_avg(0) = 0, C_avg(1) = 0,
  where the terms C_avg(s) and C_avg(n-1-s) account for the left and right subarrays.
- Solving the relation: convert the equation to the form
  A(n) = A(n-1) + 2/(n+1), where A(n) = C_avg(n)/(n+1),
  to get C_avg(n) ≈ 2n ln n ≈ 1.38 n log2 n.

QuickHull Algorithm
- Quickhull computes the convex hull, the smallest convex polygon that contains n given points in the plane, in a quicksort-like manner.
  - The points are sorted by x-coordinate as the primary key and y-coordinate as the secondary key.
- Identify the extreme points P1 and Pn; the line P1Pn divides the whole set S of points into two subsets, S1 and S2.
  - The convex hull of S is composed of the upper hull and the lower hull, which can be constructed independently.
- Compute the upper hull:
  - find the point Pmax that is farthest away from the line P1Pn,
  - compute the hull of the points to the left of the line P1Pmax,
  - compute the hull of the points to the left of the line PmaxPn.
- Compute the lower hull in a similar manner.

QH Efficiency
- Finding the point farthest away from the line P1Pn can be done in linear time, by maximizing the area of the triangle with that line as its base.
- Determine the sign of the determinant formed by the three points P1 = (x1, y1), P2 = (x2, y2), and P3 = (x3, y3),
  | x1  y1  1 |
  | x2  y2  1 |
  | x3  y3  1 |
  to decide whether or not the third point P3 is to the left of the line P1P2.
- This gives quickhull the same efficiency classes as quicksort:
  - Best case Θ(n): when all points lie on the same line.
  - Worst case Θ(n^2): when the points Pi, i = 2, ..., n-1, of the set are obtained by placing them successively in the
middle of the circumference's upper arch between P(i-1) and Pn.
  - Average case Θ(n log n) or better, since points may also lie in the interior of the polygon.

Decrease-and-Conquer: Basics
- A decrease-and-conquer algorithm works as follows:
  - Establish the relationship between a solution to a given instance of a problem and a solution to a smaller instance of the same problem.
  - Exploit this relationship either top-down (recursively) or bottom-up (without recursion).
- Three variations:
  - Decrease by a constant: the size of an instance is reduced by the same constant (usually one) at each iteration of the algorithm.
  - Decrease by a constant factor: the size of an instance is reduced by the same constant factor (usually two) on each iteration of the algorithm.
  - Variable-size decrease: the size-reduction pattern varies from one iteration to another.

Decrease-by-One: Exponentiation
- The decrease-by-one-and-conquer technique is common; consider the problem of computing a^n.
- Exploit the relation a^n = a^(n-1) * a.
  - The top-down solution uses the recursive definition
    f(n) = f(n-1) * a for n > 1, f(1) = a.
  - The bottom-up solution multiplies a by itself n - 1 times.
- Requires Θ(n) operations.

Decrease-by-Half: Exponentiation
- The decrease-by-half-and-conquer technique decreases the problem size by a factor of 2 on each iteration.
- For computing a^n, exploit the relation a^n = (a^(n/2))^2 for even n:
  f(n) = (f(n/2))^2           for even n > 1
  f(n) = (f((n-1)/2))^2 * a   for odd n > 1
  f(1) = a
- Requires Θ(log n) operations.
- Note that divide-and-conquer would instead solve two instances of the problem of size n/2.
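The decrease-by-half recurrence for a^n above can be sketched directly. This is a minimal sketch (the function name is my own); the key point is that there is one recursive call on an instance of half the size, not two.

```python
def power(a, n):
    """Decrease-by-half exponentiation: computes a^n for a positive
    integer n in Theta(log n) multiplications."""
    if n == 1:
        return a
    half = power(a, n // 2)      # one recursive call on size n/2
    if n % 2 == 0:
        return half * half       # a^n = (a^(n/2))^2 for even n
    return half * half * a       # odd n: one extra factor of a

power(2, 10)  # -> 1024
```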
Variable-Size Decrease: BST Search
- Binary search tree (BST) search is an excellent example of a variable-size-decrease algorithm.
- Searching for an element of a given value v in a BST starts by comparing v with the tree's root key K(r):
  - if they match, the desired element is found;
  - if they do not match, continue the search in the left subtree of the root if v < K(r), or in the right subtree if v > K(r).
- On each iteration of the algorithm, the problem of searching in a BST is reduced to searching in a smaller BST.
  - Worst-case efficiency: Θ(n); average-case efficiency: Θ(log n).

Insertion Sort
- Insertion sort is based on the decrease-by-one-and-conquer approach:
  - provided that the smaller array A[0..n-2] is already sorted, sort the original array A[0..n-1] by finding an appropriate position for the element A[n-1] among the sorted n - 1 elements and inserting it there.
- Right-to-left scan: scan the sorted subarray from right to left until the first element smaller than or equal to A[n-1] is encountered, and then insert A[n-1] right after that element.

Iterative Approach
- Starting with A[1] and ending with A[n-1], A[i] is inserted in its appropriate place among the first i elements of the array, which have already been sorted.

Algorithm InsertionSort(A[0..n-1])
for i <- 1 to n-1 do
    v <- A[i]
    j <- i-1
    while j >= 0 and A[j] > v do
        A[j+1] <- A[j]
        j <- j-1
    A[j+1] <- v

- Example: 89 45 68 90 29 34 17 -> 45 89 68 90 29 34 17 -> ... -> 17 29 34 45 68 89 90.

Insertion Sort's Efficiency
- The basic operation is the key comparison A[j] > v.
- The worst case occurs when the input is an array of strictly decreasing values:
  C_worst(n) = sum_{i=1}^{n-1} sum_{j=0}^{i-1} 1 = n(n-1)/2 ∈ Θ(n^2).
- The best case occurs when the input is an array of strictly increasing values:
  C_best(n) = sum_{i=1}^{n-1} 1 = n - 1 ∈ Θ(n).
- For randomly ordered arrays, the average case is C_avg(n) ≈ n^2/4 ∈ Θ(n^2).

Permutations: Minimal-Change Approach
- Generate all n! permutations of the set of integers {1, 2, ..., n}.
- Decrease-by-one approach: exploit the relation n! = n * (n-1)!
  - Insert n into each of the n possible positions among the elements of every permutation of the n - 1 elements, alternating right-to-left and left-to-right.
- This bottom-up algorithm is minimal-change: each permutation is obtained from its immediate predecessor by exchanging just two elements.
  Start:    1
  Insert 2: 12  21
  Insert 3: 123 132 312 321 231 213
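The bottom-up insertion scheme for permutations above can be sketched as follows. This is a minimal sketch (the function name is my own); alternating the insertion direction per permutation is what makes consecutive outputs differ by a single exchange.

```python
def permutations_by_insertion(n):
    """Decrease-by-one generation of all n! permutations: insert k into
    every position of each permutation of k-1 elements, alternating the
    insertion direction so consecutive permutations differ by one swap."""
    perms = [[1]]
    for k in range(2, n + 1):
        new_perms = []
        right_to_left = True
        for p in perms:
            positions = (range(len(p), -1, -1) if right_to_left
                         else range(len(p) + 1))
            for pos in positions:
                new_perms.append(p[:pos] + [k] + p[pos:])
            right_to_left = not right_to_left   # alternate per permutation
        perms = new_perms
    return perms

permutations_by_insertion(3)
```

For n = 3 this reproduces the order shown above: 123, 132, 312, 321, 231, 213.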
Permutations: Johnson-Trotter Algorithm
- The Johnson-Trotter algorithm generates the n! permutations directly, without explicitly generating permutations for smaller values of n.
- Associate a direction with each component k in a permutation, represented by an arrow above the component. The component k is said to be mobile if its arrow points to a smaller value adjacent to it.
  - For example, in 3 2 4 1 with all arrows pointing left, the components 3 and 4 are mobile, while 2 and 1 are not.

Algorithm JohnsonTrotter(n)
initialize the first permutation with all arrows pointing left: <1 <2 ... <n
while there exists a mobile integer k do
    find the largest mobile integer k
    swap k and the adjacent integer its arrow points to
    reverse the direction of all integers that are larger than k

- Example (n = 3): 123, 132, 312, 321, 231, 213.

Fake-Coin Problem
- Among n identically looking coins one is fake (assume that it is lighter); the problem is to detect the fake coin with a balance scale.
- Dividing the n coins into two piles of floor(n/2) coins each, leaving one extra coin apart if n is odd:
  - Put the two piles on the scale. If the piles weigh the same, the coin put aside must be fake; otherwise, continue with the lighter pile, which contains the fake coin.
  - The recurrence for the number of weighings in the worst case is
    W(n) = W(floor(n/2)) + 1 for n > 1, W(1) = 0; for n = 2^k, W(n) = W(2^k) = k = log2 n.
- Dividing the n coins into three piles of about n/3 coins each:
  - Put the first two piles on the scale. The lighter of the two contains the fake coin; if both piles weigh the same, the third pile contains the fake coin.
  - After weighing two of the piles, we are left with a single subproblem of one-third the original size, so this is decrease by a factor of 3.
  - The recurrence for the number of weighings in the worst case is
    W(n) = W(ceil(n/3)) + 1 for n > 1, W(1) = 0; for n = 3^k, W(n) = W(3^k) = k = log3 n.
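The Johnson-Trotter pseudocode above can be sketched with an explicit direction map. This is a minimal, assumed implementation (names are my own); each output permutation differs from its predecessor by one swap of adjacent elements, as the minimal-change property requires.

```python
LEFT, RIGHT = -1, 1

def johnson_trotter(n):
    """Generate all n! permutations of 1..n in minimal-change order.
    A value is mobile if its arrow points at a smaller adjacent value."""
    perm = list(range(1, n + 1))
    direction = {k: LEFT for k in perm}     # all arrows point left
    result = [perm[:]]
    while True:
        # Find the largest mobile integer.
        mobile_idx = -1
        for i, k in enumerate(perm):
            j = i + direction[k]
            if 0 <= j < n and perm[j] < k:
                if mobile_idx == -1 or k > perm[mobile_idx]:
                    mobile_idx = i
        if mobile_idx == -1:
            return result                   # no mobile integer: done
        k = perm[mobile_idx]
        j = mobile_idx + direction[k]
        perm[mobile_idx], perm[j] = perm[j], perm[mobile_idx]  # swap
        for m in perm:                      # reverse arrows of larger values
            if m > k:
                direction[m] = -direction[m]
        result.append(perm[:])

johnson_trotter(3)
```

For n = 3 this yields 123, 132, 312, 321, 231, 213, the sequence listed above.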
Ternary Search
- Search in a sorted array A[0..n-1]: compare the search key K with A[floor(n/3)] and, if K is larger, compare it with A[floor(2n/3)], to determine which third of the array to continue the search in recursively.
- The recurrence relation for the number of key comparisons in the worst case is
  C(n) = C(n/3) + 2 for n = 3^k, k > 0, C(1) = 1.
- The solution is C(n) = 2 log3 n + 1.

Computing a Median
- A median is an element that is greater than one half of the list's elements and smaller than the other half.
- Selection problem: find the kth smallest element in a list of n numbers.
  - k = ceil(n/2): the median; k = 1: the smallest element; k = n: the largest element.
- An algorithm for the selection problem can be based on partitioning the array's elements into two subsets with respect to a pivot p:
  - Set 1: elements that are less than or equal to p.
  - Set 2: elements that are greater than or equal to p.

Median by Partition
- Let s be the partition's split position:
  - If s = k, the pivot p obviously solves the problem.
  - If s > k, the kth smallest element in the entire list is the kth smallest element in the left part of the partitioned array.
  - If s < k, proceed by searching for the (k - s)th smallest element in its right part, which can be solved by the same approach iteratively.
- Thus the array size reduces in an unpredictable fashion from one iteration to another (variable-size decrease); some reductions are less than half, and some are larger.
- See the example 4, 1, 10, 9, 7, 12, 8, 2, 15 given on page 186 of the textbook.

Algorithm Design Techniques
- Brute force
- Divide-and-conquer
- Decrease-and-conquer
- Transform-and-conquer
- Space-and-time tradeoffs
- Dynamic programming
- Greedy techniques

Dynamic Programming: Basics
- Dynamic programming is a general algorithm design technique, invented by Richard Bellman in the 1950s to solve optimization problems ("programming" here means planning).
- Main idea:
  - Solve several smaller, overlapping subproblems.
  - Record the solutions in a table so that each subproblem is solved only once.
  - The final state of the table will be, or will contain, the solution.
- Examples:
  - Computing a binomial coefficient
  - Warshall's algorithm for transitive closure
  - Floyd's algorithm for all-pairs shortest paths
  - Constructing an optimal binary search tree

Binomial Coefficient
- Binomial formula: (a + b)^n = C(n,0)a^n + ... + C(n,k)a^(n-k)b^k + ... + C(n,n)b^n.
- Two properties:
  C(n, k) = C(n-1, k-1) + C(n-1, k) for n > k > 0
  C(n, 0) = C(n, n) = 1
- The dynamic programming method solves this recurrence by recording the values of the binomial coefficients in an (n+1)-by-(k+1) table; the entry in row n and column k is obtained by adding the entry above and to the left, C(n-1, k-1), and the entry directly above, C(n-1, k).

Pseudocode

Algorithm Binomial(n, k)
//Computes C(n, k) by the dynamic programming algorithm
//Input: A pair of nonnegative integers n >= k >= 0
//Output: The value of C(n, k)
for i <- 0 to n do
    for j <- 0 to min(i, k) do
        if j = 0 or j = i
            C[i, j] <- 1
        else C[i, j] <- C[i-1, j-1] + C[i-1, j]
return C[n, k]

- Time efficiency: Θ(nk); space efficiency: Θ(nk).
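The table-filling Binomial pseudocode above can be sketched directly. This is a minimal, assumed translation: each entry is computed once from the two entries of the previous row, which is the "record solutions in a table" idea of dynamic programming.

```python
def binomial(n, k):
    """Bottom-up DP: fill a Pascal's-triangle table row by row using
    C(i, j) = C(i-1, j-1) + C(i-1, j); Theta(nk) time and space."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                table[i][j] = 1               # border values of the table
            else:
                table[i][j] = table[i - 1][j - 1] + table[i - 1][j]
    return table[n][k]

binomial(5, 2)  # -> 10
```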
as intermediate for its paths than its predecessor I The matrix Rquot reflects paths that can use all n vertices of the digraph as intermediate This is the digraph s transitive closure 080 3102 08 BB Karkl39 LSU R00 and Rltk 1gt I We can compute all the elements of each matrix RU from its immediate predecessor RU 39 1 in the series I For R0 r516 1 means vi a list of intermediate vertices each numbered not higher than k V I Two situations regarding this path are possible I First The list of its intermediate vertices does not contain the kth vertex in Then this path has intermediate vertices numbered not higher than k 1 k l my 1 I Second The path does contain the kth vertex vk among its intermediate vertices so the vk occurs only once in that list vi vertices numbered 3 k I vk vertices numbered 3 k 1 VJ k l e First part 1 and second part 1 1 080 3102 09 BB Karle LSU Rule for Changing Zeros I Formula for generating the elements of matrix Ra from the elements of matrix Ra 39 1 k k l k l k l 13 rij rik and rkj or I Following rule nOW follows from this formula 4 If an element rij is 1 in Rk391 it remains 1 in R0 4 If an element ri is 0 in Rk391 it has to be changed to 1 in Ra if and only if the element in its rOW i and column k and the element in its column j and rOW k are both 1 s in R0 1 i L j k Hk391k I1 I gt 300 k 1 T l 0 gt1 i 1 1 080 3102 010 BB Karkl39 LSU Pseudocode Algorithm WarshaIIA0n 1n Implements Warshall s algorithm Input Adjacency matrix A of a digraph of n vertices Output Transitive closure of the digraph H0 lt A for k lt 1 to n do for i lt 1 to n do for j lt 1 to n do FWD 1 lt R kquoti 1 0r R kquoti k and R kquotk I return Hquot Time efficiency n3 Improve by restructuring the innermost loop Space efficiency Separate matrices needed for recording intermediate results Avoid using extra memory for storing elements of the algorithm s intermediate matrices 080 3102 011 BB Karkl39 LSU Digraph ab cd a010 H2b0001 CI0000 d111 Paths With intermediate vertices 
numbered no higher than 2 (that is, just vertices a and b): two new paths; R(3) gives paths with intermediate vertices numbered no higher than 3 (a, b, and c): no new paths; finally, R(4) gives paths with intermediate vertices numbered no higher than 4 (a, b, c, and d): five new paths are added, and R(4) is the transitive closure.

Floyd's Algorithm

- Solves the all-pairs shortest-paths problem: find the distances (the lengths of the shortest paths) from each vertex to all other vertices.
- Record the lengths of the shortest paths in an n-by-n matrix D = {dij}, in which the element dij in the ith row and the jth column indicates the length of the shortest path from the ith vertex to the jth vertex.
- Floyd's algorithm generates the distance matrix. For the example weighted digraph over a, b, c, d:

      a b c d            a  b c d
  a   0 ∞ 3 ∞        a   0 10 3 4
W b   2 0 ∞ ∞    D = b   2  0 5 6
  c   ∞ 7 0 1        c   7  7 0 1
  d   6 ∞ ∞ 0        d   6 16 9 0

(W is the weight matrix, D the distance matrix.)

Distance Matrices

- Floyd's algorithm computes the distance matrix of a weighted graph with n vertices through a series of n-by-n matrices D(0), ..., D(k-1), D(k), ..., D(n).
- The dij element of matrix D(k) (k = 0, 1, ..., n) is equal to the length of the shortest path among all paths from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k.
  - D(0) is simply the weight matrix: it does not allow any intermediate vertices in its paths.
  - D(1) contains the lengths of the shortest paths that can use the first vertex as intermediate.
  - In general, each subsequent matrix in the series has one more vertex to use as intermediate for its shortest paths than its predecessors: D(k) is obtained from D(k-1) with one more vertex (the kth vertex) used as intermediate.
- The matrix D(n) contains the lengths of the shortest paths
among all paths that can use all n vertices of the digraph as intermediate. This is the distance matrix being sought.

D(k) and D(k-1)

- We can compute all the elements of each matrix D(k) from its immediate predecessor D(k-1) in the series.
- For D(k), d(k)ij means the length of the shortest path vi, (a list of intermediate vertices each numbered not higher than k), vj.
- Partition all such paths into two disjoint subsets:
  - The list of intermediate vertices does not contain the kth vertex. Then this path has intermediate vertices numbered not higher than k-1, so its shortest length is d(k-1)ij.
  - The path does contain the kth vertex vk among its intermediate vertices, and vk occurs only once in that list: vi, (vertices numbered <= k-1), vk, (vertices numbered <= k-1), vj. Each such path is made up of a part of length d(k-1)ik and a part of length d(k-1)kj.

Recurrence for the Shortest Paths

- Recurrence for D(k) from the elements of matrix D(k-1):
  d(k)ij = min{ d(k-1)ij, d(k-1)ik + d(k-1)kj } for k >= 1; d(0)ij = wij
- The following rule now follows from this formula: replace d(k-1)ij by the sum d(k-1)ik + d(k-1)kj if and only if the latter is smaller than its current value.

Pseudocode

Algorithm Floyd(W[1..n, 1..n])
// Implements Floyd's algorithm
// Input: Weight matrix W of a graph of n vertices
// Output: The distance matrix of the shortest paths' lengths
D(0) <- W
for k <- 1 to n do
    for i <- 1 to n do
        for j <- 1 to n do
            D(k)[i,j] <- min(D(k-1)[i,j], D(k-1)[i,k] + D(k-1)[k,j])
return D(n)

Time efficiency: Θ(n^3). Space efficiency: separate matrices for recording intermediate results; both time and space efficiencies can be improved, as in the case of Warshall's algorithm.

Example (same weighted digraph as above): D(0) is the weight matrix (lengths of the shortest paths with no intermediate vertices); D(1) gives lengths of the shortest paths with intermediate vertices numbered no higher than 1 (just vertex a): two paths, b-c and d-c, improve; D(2) gives lengths with intermediate vertices numbered no higher than 2 (a and b): one path, c-a, improves; D(3) gives lengths of the shortest paths with
intermediate vertices numbered no higher than 3 (a, b, and c): four paths (a-b, a-d, b-d, and d-b) improve; finally, D(4) gives lengths with intermediate vertices numbered no higher than 4 (a, b, c, and d): a new shortest path from a to c, and D(4) is the distance matrix.

Algorithm Design Techniques

- Brute force
- Divide-and-conquer
- Decrease-and-conquer
- Transform-and-conquer
- Space and time tradeoffs
- Dynamic programming
- Greedy techniques

Transform-and-Conquer: Basic Idea

- A transform-and-conquer algorithm works as a two-stage procedure:
  - Transformation: modify the problem to be more amenable to solution.
  - Conquer: solve the problem.
- Three major variations:
  - Instance simplification (Gaussian elimination, AVL trees)
  - Representation change (2-3 trees, heaps)
  - Problem reduction (counting paths in a graph, linear programming)

Gaussian Elimination

- Solving a system of n linear equations in n unknowns:
  a11x1 + a12x2 + ... + a1nxn = b1
  a21x1 + a22x2 + ... + a2nxn = b2
  ...
  an1x1 + an2x2 + ... + annxn = bn
  In matrix notation: Ax = b.
- The substitution method is not suitable when n is large.
- Gaussian elimination: transform a given system of n equations to an equivalent system A'x = b' with an upper-triangular coefficient matrix A'.

Upper-Triangular Coefficient Matrix

- A series of elementary operations:
  - Exchanging two equations of the system
  - Replacing an equation with its nonzero multiple
  - Replacing an equation with a sum or difference of this equation and some multiple of another equation
- An elementary operation does not change the solution to a system.
- Normal pivoting: use aii as a pivot to make all xi coefficients zeros in the equations below the ith one.
- Partial pivoting: choose the row with the
largest absolute value of the coefficient in the ith column.

Gauss Elimination Algorithm

Implements the elimination stage with partial pivoting:

Algorithm BetterGaussElimination(A[1..n, 1..n], b[1..n])
for i <- 1 to n do A[i, n+1] <- b[i]
for i <- 1 to n-1 do
    pivotrow <- i
    for j <- i+1 to n do
        if |A[j, i]| > |A[pivotrow, i]| pivotrow <- j
    for k <- i to n+1 do
        swap(A[i, k], A[pivotrow, k])
    for j <- i+1 to n do
        temp <- A[j, i] / A[i, i]
        for k <- i to n+1 do
            A[j, k] <- A[j, k] - A[i, k] * temp

Time efficiency: the elimination stage performs
  C(n) = sum_{i=1..n-1} sum_{j=i+1..n} sum_{k=i..n+1} 1 = n(n-1)(2n+5)/6 ≈ n^3/3
basic operations; the second stage (back substitution) is in Θ(n^2), so the total efficiency is cubic.

Heaps

- A heap is a binary tree satisfying two conditions:
  - Shape requirement: the tree is complete, so that all its levels are full except that only some rightmost leaves may be missing. The height of the tree is h = floor(log2 n).
  - Parental dominance: the key at each node is greater than or equal to the keys at its children, so the root always contains the largest element.
- Keys are ordered top-down; for example, the array with values 10, 5, 7, 4, 2, 1 in positions 1 through 6 is a heap.
- A node of a heap considered with all its descendants is also a heap.
- Implementation as an array H[1..n] with H[i] >= max(H[2i], H[2i+1]) for i = 1, ..., floor(n/2):
  - Parental node keys occupy the first floor(n/2) positions.
  - Leaf keys occupy the last ceil(n/2) positions.

Constructing a Heap: Bottom-Up Algorithm

- Two-step heap construction (e.g., a heap for the list 2, 9, 7, 6, 5, 8):
  - Initialize a complete binary tree with n nodes by placing keys in the order given.
  - Heapify the tree: establish parental dominance starting with the last parental node. (See the text for the pseudocode, page 226.)
- Time efficiency in the worst case: each key on level i of the tree may travel to the leaf level h during heapification, giving
  Cworst(n) = sum_{i=0..h-1} (number of keys on level i) * 2(h - i) = 2(n - log2(n+1)) ∈ O(n)

Constructing a Heap: Top-Down Algorithm

- Successive insertions of a new key into a previously constructed heap.
- The new key is sifted up via a swap with its parent
until it is not larger than its parent or it reaches the root.
- Time efficiency of insertion is in O(log n).

Key Deletion from a Heap

- Deleting the root's key (maximum key deletion) from a heap:
  - Exchange the root's key with the last key K of the heap.
  - Decrease the heap's size by 1.
  - Heapify the smaller tree by establishing the parental dominance for K.
  - Time efficiency is in O(log n).
- Deleting an arbitrary key v:
  - Search for v by sequential search in H[1..n]; a match is found in position i, so v = H[i].
  - Delete the element H[i] by using the three-step process as before.
  - Time efficiency is in O(n) + O(log n).

Heapsort

- A two-stage sorting algorithm:
  - Stage 1 (heap construction): construct a heap for a given array.
  - Stage 2 (maximum deletions): apply the root-deletion operation n-1 times to the remaining heap.
- The array elements are eliminated in decreasing order and are placed in a new array in ascending order.
- Example: sorting the array 2, 9, 7, 6, 5, 8 by heapsort, stage 1 produces the heap 9, 6, 8, 2, 5, 7, and stage 2 then deletes the maximum key n-1 times, yielding 2, 5, 6, 7, 8, 9.

Heapsort Time Efficiency

- First (heap construction) stage: C1(n) ∈ O(n).
- Second stage:
  C2(n) <= 2 floor(log2(n-1)) + 2 floor(log2(n-2)) + ... + 2 floor(log2 1)
        <= 2 sum_{i=1..n-1} log2 i <= 2(n-1) log2(n-1) <= 2n log2 n ∈ O(n log n)
- Total efficiency is in O(n log n), or more accurately in Θ(n log n).

Problem Reduction

- Problem reduction strategy: a problem to be solved is reduced to a different problem that is solvable by a known algorithm A; solving that problem solves the original one.
- Reducing geometric problems to algebraic ones: settle the question about the relative positions of three arbitrary points in the plane by finding the sign of a determinant. The sign determines whether the point P3(x3, y3) lies to the left or right of the directed line through points P1(x1, y1) and P2(x2, y2):
  | x1 y1 1 |
  | x2 y2 1 | = x1y2 + x3y1 + x2y3 - x3y2 - x2y1 - x1y3
  | x3 y3 1 |

Computing lcm

- Compute the least common multiple lcm of two integers m and
n by solving the problem of the gcd (greatest common divisor) of m and n:
  lcm(m, n) = m * n / gcd(m, n)
- An efficient algorithm, Euclid's algorithm, exists for finding the gcd: apply repeatedly the equality gcd(m, n) = gcd(n, m mod n) until m mod n is 0; the last value of m is the gcd.

Counting Paths in a Graph

- Count the number of paths between two vertices in a graph by computing an appropriate power of its adjacency matrix A.
- The number of different paths of length k > 0 from the ith vertex to the jth vertex equals the (i, j)th element of A^k.
- A and A^2 give the number of paths of length 1 and 2, respectively, between the corresponding vertices of the graph:

      a b c d             a b c d
  a   0 1 1 1        a    3 0 1 1
A b   1 0 0 0   A^2 b     0 1 1 1
  c   1 0 0 1        c    1 1 2 1
  d   1 0 1 0        d    1 1 1 2

Reduction of Optimization Problems

- Reduce a minimization problem to a maximization problem: to minimize a function, we can maximize its negative instead and change the sign of the answer:
  min f(x) = -max[-f(x)]
- The opposite is also true: max f(x) = -min[-f(x)].
- This can be generalized to the case of function optimization.

Linear Programming

- A problem of optimizing a linear function of several variables subject to constraints in the form of linear equations and linear inequalities.
- Solve the knapsack problem by reducing it to an instance of a linear programming problem:
  maximize sum_{j=1..n} vj xj
  subject to sum_{j=1..n} wj xj <= W, with xj in {0, 1} for j = 1, ..., n

B-Trees

Multiway Trees

- For any binary tree (e.g., a binary search tree or AVL tree), and even for a 2-3 tree, the outdegree is restricted to two or three.
- As such trees grow in size, their height can become significant: a binary tree with 1,000 entries has a height of at least 9, while a tree with 100,000 entries has a height of at least 16 (assuming that h = 0 for a single entry). In the case of unbalanced trees, the height can be significantly larger.
- For multiway trees, the outdegree is not restricted to two or three while retaining the general properties of binary search trees. An m-way tree
is a search tree in which each node can have from zero to m subtrees, where m is defined as the order of the tree.
- An m-way tree is not a balanced tree; a balanced m-way search tree is a B-tree.

B-Trees

- A B-tree provides an efficient index organization for data sets of structured records.
  - Introduced by R. Bayer and E. McCreight in 1972.
  - Extends the idea of the 2-3 tree by permitting more than a single key in the same node of a search tree.
- In a B-tree, all data records (or record keys) are stored at the leaves in increasing order of the keys.
  - The parental nodes are used for indexing; this variant is usually called a B+-tree.
  - When used for storing a large data file on a disk, the nodes of a B-tree usually correspond to the disk pages (blocks).

Parental Node of a B-Tree

- Each B-tree node contains n-1 ordered keys K1 < K2 < ... < Kn-1.
- The keys are interposed with n pointers (references) to the node's children, so that all the keys in subtree T0 are smaller than K1; all the keys in subtree T1 are greater than or equal to K1 and smaller than K2, with K1 being equal to the smallest key in T1; and so on.
- The keys of the last subtree Tn-1 are greater than or equal to Kn-1, with Kn-1 being equal to the smallest key in Tn-1.
- Schematically, an n-node looks like: P0 | K1 | P1 | K2 | ... | Pn-2 | Kn-1 | Pn-1.

Properties of B-Trees

- The root is either a leaf or has between 2 and m children; a leaf has between 1 and m-1 entries (keys).
- Each node, except the root and the leaves, has between ceil(m/2) and m children, and hence between ceil(m/2)-1 and m-1 keys.
- The tree is perfectly balanced: all its leaves are at the same level.
- Example (B-tree of order 4): a node can have 2, 3, or 4 children, which means that a node can have 1, 2, or 3 keys (data entries). The example tree's leaves hold the key groups 4,7,10 | 11,14 | 15,16,19 | 20,24 | 25,28 | 34,38 | 40,43,46 | 51,55 | 60,68,80.

Example of B-Tree Insertion

- Inserting 65 into the B-tree under the restriction that the tree's leaves cannot contain more than three items. Can also be
done by moving 60, the smallest key of the full leaf, to its sibling with keys 51 and 55, lending the entry 60 to the left and replacing the key value of their parent by 65, the new smallest value in the second child. With the split-based insertion shown in the figure, the leaves end up holding 4,7,10 | 11,14 | 15,16,19 | 20,24 | 25,28 | 34,38 | 40,43,46 | 51,55 | 60,65 | 68,80.

Efficiency of B-Tree

- Time efficiency depends on the height h of the tree: the number of nodes we need to access during a search for a record with a given key value is equal to the height of the tree plus one.
- Find the smallest number of keys a B-tree of order m and height h can have:
  - The root of the tree will contain at least one key.
  - Level 1 will have at least two nodes with at least ceil(m/2) - 1 keys in each of them, for a total minimum number of keys 2(ceil(m/2) - 1).
  - Level 2 will have at least 2 ceil(m/2) nodes (the children of the nodes at level 1) with at least ceil(m/2) - 1 keys in each of them, for a total minimum number of keys 2 ceil(m/2) (ceil(m/2) - 1).
  - In general, the nodes at level i will contain at least 2 ceil(m/2)^(i-1) (ceil(m/2) - 1) keys.
  - Finally, level h, the leaf level, will have at least 2 ceil(m/2)^(h-1) keys.
- Thus we have the inequality
  n >= 1 + sum_{i=1..h-1} 2 ceil(m/2)^(i-1) (ceil(m/2) - 1) + 2 ceil(m/2)^(h-1) = 4 ceil(m/2)^(h-1) - 1
  so h <= floor(log_{ceil(m/2)} ((n+1)/4)) + 1, and searching in a B-tree is an O(log n) operation.
- The upper bound on h is 6, 5, and 4, respectively, for orders m of 50, 100, and 250, for a file of 100 million entries.

B-Tree Abstract Data Type

- The B-tree implementation requires three types of structures: a head structure (BTREE), a data node (NODE), and an entry structure (ENTRY).
- B-tree functions build and maintain the tree:
  - Basic algorithms: BTreeCreate, BTreeInsert, BTreeDelete
  - Tree data processing: BTreeRetrieval, BTreeTraversal
  - Tree utility functions: BTreeSearch, BTreeEmpty, BTreeFull, BTreeCount, BTreeDestroy

  BTREE
    count   <integer>
    root    <node pointer>
    compare <pointer>
  end BTREE

  NODE
    firstPtr     <node pointer>
    numEntries   <integer>
    entries[m-1] <ENTRY>
  end NODE

  ENTRY
    key      <keyType>
    data     <dataType>
    rightPtr <node pointer>
  end ENTRY

Searching

- Similar to searching in the binary search tree, and even more so in the 2-3 tree.
- Starting with the root, we follow a chain of pointers to the leaf that may contain the search key.
- Search for the search key among the keys of that leaf.
- Since keys are stored in sorted order at both parental nodes and leaves, one can use binary search if the number of keys in a node is very large.

Insertion

- Apply the search procedure to the new record's key K to find the appropriate leaf for the new record.
- If there is room for the record in that leaf, place it there.
- If there is no room for the record, the leaf is split in half by sending the second half of the records to a new node. The smallest key K' in the new node and the pointer to it have to be inserted into the old leaf's parent.
- This recursive procedure may percolate up to the tree's root. If the root is already full too, a new root is created, with the two halves of the old root's keys split between two children of the new root.
- One can avoid the possibility of recursive node splits by:
  - splitting full nodes as we are searching for an appropriate leaf for the new record, or
  - moving a key to the node's sibling to avoid some node splits.
- Insert logic requires several algorithms: the ADT interface function plus the internal functions Search node, Insert node, Insert entry, and Split node.

B-Tree Deletion

- The process involves several steps:
  - Search for the data to be deleted.
  - Once the data entry is deleted, determine whether the deletion has caused an underflow. (A deletion that results in a node with fewer than the minimum number of entries is an underflow.)
  - Correct the underflow (structural deficiency) as the node delete backs out of the recursion.
- Delete logic requires two kinds of algorithms:
  - The ADT delete interface: this user interface function calls the recursive delete function and checks for root underflow.
  - Internal functions: Delete entry, Delete middle, Reflow
B-Tree Delete Internal Functions

- Delete node: searches for the key to be deleted. If the key is not in the current node, it calls itself recursively with a new subtree. If the key is found, it deletes the key by calling either delete entry or delete middle; then it checks for underflow and, if necessary, repairs the underflow (reflow).
- Delete entry: removes the entry from a node and compresses it, i.e., moves the entries to the right of the deleted key to the left.
- Delete middle: when the data to be deleted are not in a leaf node, the immediate predecessor (i.e., the largest entry in the left subtree of the entry to be deleted) is used as the substitute data. The function recursively follows the left subtree's right pointer in the last entry until it comes to a leaf node; then it replaces the deleted data with the data in the leaf's last entry and physically deletes the predecessor from the leaf node.
- Reflow: brings the underflowed node up to a minimum state by adding at least one entry to it. Balance shifts an entry from one sibling to another through the parent:
  - Borrow left: checks whether the left subtree is above the minimum and, if it is, borrows one entry from the left subtree for the right subtree, which has gone underflowed.
  - Borrow right: borrows one entry from the right subtree, if it is above the minimum, for the underflowed left subtree.
- Combine: joins the data from an underflowed entry, a minimal sibling, and a parent in one node. It results in one node with the maximum number of entries; the empty node is recycled.

Balance: Borrow from Left

- Steps (the right subtree borrows from the left):
  - First, shift data right.
  - Rotate the parent node's key down.
  - Rotate the left sibling's data up.
- Example: deleting 91 in the right subtree causes an underflow. For order m = 5, the minimum number of entries is ceil(m/2) - 1 = 2 and the maximum is m - 1 = 4. To repair the underflowed node: right-shift 85, rotate the parent key 78 down, and rotate the left sibling's largest key 74 up.
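The borrow-from-left rotation described above can be sketched on plain key lists. This is a simplified model for illustration only, not the full NODE/ENTRY structure of the ADT; the function name and list-based representation are assumptions:

```python
def borrow_from_left(parent_keys, sep, left_keys, right_keys):
    """Repair an underflowed right sibling by rotating through the parent:
    the parent separator key moves down into the right node, and the left
    sibling's largest key moves up to replace the separator."""
    right_keys.insert(0, parent_keys[sep])  # parent key rotates down
    parent_keys[sep] = left_keys.pop()      # left sibling's last key rotates up

# Slide example (order 5): deleting 91 leaves the right leaf [85] underflowed.
parent, left, right = [78], [45, 63, 74], [85]
borrow_from_left(parent, 0, left, right)
print(parent, left, right)  # [74] [45, 63] [78, 85]
```

After the rotation every node is back above the order-5 minimum of two entries, matching the 74/78/85 figure in the notes.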
Balancing by the borrow-left algorithm.

Combine

- Combines the underflowed right node (holding 78), the left sibling with minimal entries (45 and 63), and their parent key (74) into one node. Steps:
  - Move the parent key to the left subtree.
  - Move the entries in the underflowed node to the left subtree.
  - Shift the root.
- For order m = 5: the minimum number of entries is ceil(m/2) - 1 = 2 and the maximum is m - 1 = 4.
- In the example: move the parent key 74 to the left node, move the right entry 78 over, then shift the root.

Nonrecursive Algorithms

Definition and Examples

- A nonrecursive algorithm is executed only once to solve the problem.
- Examples:
  - Largest element in a list of numbers
  - Element uniqueness problem
  - Matrix operations (addition, multiplication, and transpose)
  - Digits in binary representation
- A recursive algorithm, in contrast, invokes (makes reference to) itself repeatedly until a certain condition matches.

Analyzing Efficiency of Nonrecursive Algorithms

Steps in the mathematical analysis of nonrecursive algorithms:
- Decide on a parameter n indicating input size.
- Identify the algorithm's basic operation.
- Determine the worst, average, and best cases for input of size n, if the basic operation count depends not only on n.
- Set up a summation for C(n) reflecting the algorithm's loop structure, expressing the number of times the algorithm's basic operation is executed.
- Simplify the summation using standard formulas and rules of sum manipulation (see Appendix A); find a closed-form formula for the count and/or establish its order of growth.

Example 1: Maximum Element

Algorithm MaxElement(A[0..n-1])
// Determines the value of the largest element in a given array
// Input: An array A[0..n-1] of real numbers
// Output: The value of the largest element in A
maxval <- A[0]
for i <- 1 to n-1 do
    if A[i] > maxval
        maxval <- A[i]
return maxval

- Input size:
n, the number of elements in the array.
- The algorithm's basic operation is comparison: it is executed on each repetition of the loop.
- Formula for the basic operation count: the sum is simply 1 repeated n-1 times:
  C(n) = sum_{i=1..n-1} 1 = n - 1 ∈ Θ(n)

Example 2: Unique Elements

Algorithm UniqueElements(A[0..n-1])
// Checks whether all the elements in a given array are distinct
// Input: An array A[0..n-1]
// Output: Returns true if all the elements in A are distinct and false otherwise
for i <- 0 to n-2 do
    for j <- i+1 to n-1 do
        if A[i] = A[j] return false
return true

- Input size: n, the number of elements in the array.
- Basic operation: element comparison in the innermost loop.
- The count depends also on whether there are equal elements in the array and, if there are, which positions they occupy. In the worst case the algorithm needs to compare all n(n-1)/2 distinct pairs:
  Cworst(n) = sum_{i=0..n-2} sum_{j=i+1..n-1} 1 = sum_{i=0..n-2} (n-1-i) = n(n-1)/2 ∈ Θ(n^2)

Example 3: Matrix Operations

- A 2D matrix is the standard representation.
- Classic matrix multiplication is O(n^3): given two n-by-n matrices A and B, compute their product C = AB using the definition-based algorithm. C is an n-by-n matrix whose n^2 elements are computed as the scalar (dot) products of the rows of A and the columns of B:
  C[i,j] = sum_{k=0..n-1} A[i,k] * B[k,j]
- Strassen's algorithm, based on a divide-and-conquer approach, is O(n^2.81).

Algorithm MatrixMultiplication(A[0..n-1, 0..n-1], B[0..n-1, 0..n-1])
// Multiplies two square matrices of order n by the definition-based algorithm
// Input: Two n-by-n matrices A and B
// Output: Matrix C = AB
for i <- 0 to n-1 do
    for j <- 0 to n-1 do
        C[i,j] <- 0.0
        for k <- 0 to n-1 do
            C[i,j] <- C[i,j] + A[i,k] * B[k,j]
return C

Matrix Multiplication C(n)

- Input size: n, the matrix order.
- The two operations in the innermost loop are addition and multiplication; both are executed exactly once on each repetition of the innermost loop, and the basic operation is multiplication.
- Formula for the basic operation count: one
multiplication is executed on each repetition of the innermost loop (over k), and we need to find the total number of multiplications made for all pairs of i and j. The algorithm computes n^2 elements of the product matrix; each element is computed as the scalar (dot) product of an n-element row of A and an n-element column of B, which takes n multiplications:
  C(n) = sum_{i=0..n-1} sum_{j=0..n-1} sum_{k=0..n-1} 1 = n^3 ∈ Θ(n^3)
The running time is T(n) ≈ (ca + cm) n^3, where ca and cm are the times of one addition and one multiplication for a given machine.

Sparse Matrix (SM) Transpose

- A sparse matrix of order n has only a small number t of nonzero elements. A 2D representation is not efficient, as storing the zero elements wastes space. (Example below: n = 6, t = 8.)
- Sparse matrix representation: a set of ordered triplets <row, col, value>.
  - row and col are integers defining each element in the matrix; value is the actual element (item).
  - Constraint: the entries must be sorted in ascending order, using the row and col indices as the primary and secondary keys, respectively.
  - Entry M[0] holds the matrix order and the number of nonzero terms: M[0] = (6, 6, 8).
- The transpose of M is M', with M'[i].row = M[i].col and M'[i].col = M[i].row (entries re-sorted):

        Matrix M             Transpose M'
        row col value        row col value
  M[0]   6   6   8     M'[0]  6   6   8
  1      0   0  15     1      0   0  15
  2      0   3  22     2      0   4  91
  3      0   5  15     3      1   1  11
  4      1   1  11     4      2   1   3
  5      1   2   3     5      2   5  28
  6      2   3   6     6      3   0  22
  7      4   0  91     7      3   2   6
  8      5   2  28     8      5   0  15

Algorithms for SM Transpose

- Naive approach: swap the row and column indices of all terms, then sort the terms according to the new row index as the primary key.
- Fast approach: calculate the position in the list where each term would go after the transpose, then swap the row and column indices of all terms by placing them in their appropriate positions.

Naive Sparse Matrix Transpose Code

currentMT = 1
for (i = 0; i < M[0].col; i++)
    for (j = 1; j <= M[0].value; j++)
        if (M[j].col == i) {
            MT[currentMT].row   = M[j].col
            MT[currentMT].col   = M[j].row
            MT[currentMT].value = M[j].value
            currentMT++
        }

Variables: M[0].row and M[0].col
define the order of matrix M, and M[0].value gives the number of nonzero elements in the matrix.

Naive SM Transpose Analysis

- Given a matrix M of order n, find its transpose matrix MT:
  - Inner loop: over the number of nonzero elements t. For each element, exchange the row and col indices, i.e., create a transpose.
  - Outer loop: over the col index of the original matrix (n values), using it as the primary key per row i in MT. For each index i, process the inner loop, i.e., process all elements and pick up those whose col numbers match i.
- Basic operation: check whether a new row of MT contains nonzero elements (M[j].col == i).
- The runtime complexity of the algorithm is
  C(n) = sum_{i=0..n-1} sum_{j=1..t} 1 = nt ∈ O(nt)
- For a dense matrix, t -> n^2, so the complexity is of the order O(n^3).

Fast Sparse Matrix Transpose Code

// Initialize and count the number of terms per row in MT
for (i = 0; i < M[0].col; i++) rowterms[i] = 0
for (i = 1; i <= M[0].value; i++) rowterms[M[i].col]++
// Determine the starting position of each new row in MT
spos[0] = 1
for (i = 1; i < M[0].col; i++) spos[i] = spos[i-1] + rowterms[i-1]
// Create the transpose
for (i = 1; i <= M[0].value; i++) {
    j = spos[M[i].col]++
    MT[j].row   = M[i].col
    MT[j].col   = M[i].row
    MT[j].value = M[i].value
}

Fast SM Transpose Analysis

- Given a matrix M, find its transpose matrix MT. The process involves four isolated loops:
  - Initialize the number of terms per row in MT.
  - Count the number of terms per row in MT: scan all the nonzero elements and put them into multiple bins (rows).
  - Determine the starting position of each new row in MT: scan all terms and decide the position of each bin (row).
  - Create the transpose matrix MT: for each term, find its correct new position and swap the row and col indices.
- The runtime complexity of the algorithm is O(n + t): the four loops listed above are O(n), O(t), O(n), and O(t), respectively. Putting these together and ignoring the constant factor, we have O(n + t).
- For a dense matrix, t -> n^2, so the complexity is of the order O(n^2).
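The four-loop fast transpose can be sketched in Python over <row, col, value> triplets. This is a simplified model for illustration (0-based lists, no M[0] header entry; the function name is an assumption):

```python
def fast_transpose(n, triples):
    """Transpose a sparse matrix of order n given as (row, col, value)
    triples sorted by (row, col); returns the triples of the transpose."""
    rowterms = [0] * n              # count the number of terms per row of M'
    for _, col, _ in triples:
        rowterms[col] += 1
    spos = [0] * n                  # starting position of each row of M'
    for i in range(1, n):
        spos[i] = spos[i - 1] + rowterms[i - 1]
    result = [None] * len(triples)  # place each term directly in position
    for row, col, value in triples:
        result[spos[col]] = (col, row, value)
        spos[col] += 1
    return result

# The 6x6 example matrix with t = 8 nonzero terms from the notes:
M = [(0, 0, 15), (0, 3, 22), (0, 5, 15), (1, 1, 11),
     (1, 2, 3), (2, 3, 6), (4, 0, 91), (5, 2, 28)]
print(fast_transpose(6, M))  # reproduces the M' triplet table above
```

Each of the four loops touches either n counters or t terms once, so the whole sketch runs in O(n + t), in contrast to the O(nt) naive version.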
Example 4: Binary

Algorithm Binary(n)
// Finds the number of binary digits in the binary representation of a positive decimal integer
// Input: A positive decimal integer n
// Output: The number of binary digits
count <- 1
while n > 1 do
    count <- count + 1
    n <- floor(n/2)
return count

- The basic operation is the comparison n > 1, which decides whether the while loop will be executed.
- The value of n is about halved on each repetition of the loop, so the complexity is floor(log2 n) + 1: the number of times the comparison is executed equals the number of repetitions of the loop's body plus 1.

Algorithm Basics (Chapters 1 and 2)

- Notion of algorithm
- Fundamentals of algorithmic problem solving
- Important problem types
- Analysis of algorithmic efficiency
- Nonrecursive algorithms
- Recursive algorithms

Fundamental Data Structures (Chapter 1, Section 4, and other parts of the textbook; reference books in C and Java)

- Linear lists
- Trees
- Graphs
- Sets
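As a quick check of Example 4, the binary digit-count algorithm translates directly to Python (a minimal sketch; the function name is illustrative):

```python
def binary_digit_count(n):
    """Number of digits in the binary representation of a positive integer n."""
    count = 1
    while n > 1:       # basic operation: the comparison n > 1
        count += 1
        n //= 2        # n is about halved on each repetition
    return count

# Agrees with floor(log2 n) + 1; e.g. 12 = 1100 in binary has 4 digits.
print(binary_digit_count(12))  # 4
```

The loop body runs floor(log2 n) times, so the count returned is floor(log2 n) + 1, matching the efficiency analysis above.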
