### Create a StudySoup account

#### Be part of our community, it's free to join!

Already have a StudySoup account? Login here

# DISCRETE OPTIMIZATN MATH 409

UW

GPA 3.76

### View Full Document

## 6

## 0

## Popular in Course

## Popular in Mathematics (M)

This 81 page Class Notes was uploaded by Addison Beer on Wednesday September 9, 2015. The Class Notes belongs to MATH 409 at University of Washington taught by Paul Tseng in Fall. Since its upload, it has received 6 views. For similar materials see /class/192072/math-409-university-of-washington in Mathematics (M) at University of Washington.

## Reviews for DISCRETE OPTIMIZATN

### What is Karma?

#### Karma is the currency of StudySoup.

#### You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 09/09/15

MATH 409 DISCRETE OPTIMIZATION Course Notes Version March 2008 by Paul Tseng Department of Mathematics University of Washington FOREWORD As the chief engineer for the fabled kingdom Kopan you are charged to build roads to link its ve principal cities Al Bo Ci Da Be Your engineers estimate that building a road between Al amp Bo would take 15000 manhrs between Al amp Ci would take 11000 manhrs etc Which roads should be built to link all cities in fewest man hours There are clearly more than one way to link the cities AliBo BwCi CiiDa DarEe or AliBo BwCi CiiEe EePDa or Instead of building roads we can similarly consider laying down power lines or communication cables to link citiestowns This is called the minimum spanning tree MST problem and it is an example of a discrete optimization also called combinatorial optimization problem namely nd from among a nite number of choices one an optimal solution that minimizesmaximizes a certain quantity For another example suppose that the ve cities are already joined by roads and the travel time on each road is known What is the fastest way to go from say Al to Re If there is a road between Al and Be then that s certainly one path But it could be that there is also a road between Al and Ci between Ci and Da and between Da and Re which would be another path Among all such paths we wish to nd one whose travel time is minimum There is a nite number of such paths but the number can be large This is called the shortest path SP problem A variant of this problem is to nd a shortest path that visits each city exactly once called the Traveling Salesman problem TSP Another example called integer program 1 is a linear program ie the problem of minimizingmaximizing a linear function subject to the variables satisfying some linear equations and inequalities with the variables further required to be integers This is common since quantities often must be integers eg we can buy 1 or 2 or 3 tickets but not ticket Discrete optimization traces its roots to a problem studied by the famous 18th century mathematician Leonhard Euler 17071783 In the Hanseatic town of Konigsberg now named Kalinigrad there were 7 bridges going across the river Pregel to join four land masses It was asked whether a trip can be made to cross each bridge exactly once and return to the start Euler showed using the mathematics of what we now call Eulerian path in graph theory that this is impossible This problem is closely related to a popular childhood game of tracing a certain line drawing without lifting the pen pencil as well as to street sweeping Over the years discrete optimization has developed to play major roles in realworld decision making from scheduling to vehicle routing to VLSI layout etc This is helped by advances in computer softwares for solving large I eg LINDO CPLEX also recent issues of the bulletin ORMS Today For discrete optimization problems such as the ones described above MST SP TSP 1 the number of choices is often largeitoo large to explicitly search through them all even on a fast computer For example in the above 5 city example the number of possible paths visiting each city exactly once can be as much as 4 24 For n cities this number can be as much as 11 which is huge even for n 100 Thus clever algorithms are needed to exploit the problem structure to nd an optimal solution faster We will study these structures and algorithms Our study will involve graph theory networks linear programming and some combinatorics matrices and linear algebra We will that MST and SP are much easier problems than TSP and IP in the sense of worst case x f x concerns how I x easily a problem can be solved by a digital computer or more formally by a Turing machine A famous open question concerns Whether TSP and IP are also easy socalled PNP question For a good discussion of the TSP and discrete optimization the book LLRS90 For a history of IP and other optimization topics the collected personal reminiscences LKS91 For computational complexity the classic GJ79 is still worth a lookiat least for the cartoons on pages 23 References GJ79 MR Garey and BS Johnson Computers and Intraetability A Guide to the Theory of NPCompleteness WH Freeman and Co San Francisco 1979 LLRS90 EL Lawler J Lenstra AHG Rinnooy Kan and DB Shmoys The Traveling Salesman Problem a Guided Tour of F 39 39 39 39 f 39 39 39 W y 39 Chichester 1985 and 1990 LRS91 J Lenstra AHG Rinnooy Kan and A Schrijver Editors History of Mathematical Programming Elsevier NorthHolland Amsterdam 1991 Foreword Chapter 1 Chapter 2 Chapter 3 Chapter 4 Chapter 5 Chapter 6 Chapter 7 Index Table of Contents What is a Discrete Optimization Graphs and Digraphs 21 Sets 22 Digraphs 23 Graphs 24 Vertex Degrees and Data Structures for Graphs Digraphs 25 Subgraphs Trees and Spanning Trees 26 Eulerian Paths 27 Additional Topics Minimum Spanning Tree in a Weighted Graph 31 De nitions and Properties 32 Prim and Kruskal Algorithms 33 Optimality Condition and Sensitivity Analysis 34 Additional Topics Shortest Path in a Weighted Digraph 41 De nitions and Properties 42 Dijkstra and BellmanFord Algorithms 43 Additional Topics Flows Cuts and Matchings 51 Maximum Flow De nitions and Properties 52 Maximum Flow Flow Augmentation Algorithm 53 Matching and Covering in Bipartite Graphs 54 Minimum Cost Flow Additional Topics Integer Program 61 De nitions and Properties 62 Branch and Bound Algorithm 63 Cutting Plane Algorithm 64 Additional Topics Computational Complexity 71 Problem Size and Method Ef ciency 72 R NR and NPHard Problems 73 Approximate Solutions Traveling Salesman Problem IWWWH 1 What is Discrete Optimization In the children s tale of Ali Baba and the Forty Thieves Ali Baba carried away treasures from a cave in a bag Suppose his bag can carry at most 10 kg in weight and there are ve treasures weighing 2 3 4 5 6 kg and worth 346710 Dinars respectively Which treasures should he carry to maximize their total worth He can carry three treasures weighing 2 3 5 kg with total worth of 14 Dinars Or he can carry two treasures weighing 4 6 kg with total worth of 16 Dinars Can he do better After checking all possible choices we can convince ourselves that 16 Dinars is the maximum But if instead of 5 treasures there are 100 or even 1000 treasures checking all possible choices would take too long Is there a faster way to nd an answer The above problem known as the knapsack problem bag problem just doesn t sound the same is an example of a discrete optimization problem ie among a nite number of choices choose one whose worthcost is maximumminimum Typically the number of choices though nite is too large to check exhaustively so we must seek a faster way to nd an answer For a second example of a discrete optimization problem suppose there are 10 cities numbered from 1 to 10 and the distance between each pair of cities is known Starting at city 1 we wish to visit all other cities exactly once and to return to city 1 What order should we visit the other cities so that the total distance traveled is minimum We can rst visit city 2 then 3 then 4 Or we can rst visit city 3 then 2 then 4 Or we can rst visit city 10 then 9 This is called the Traveling Salesman Problem TSP A counting argument shows that there are 9 8 7 2 1 51840 possible choices In general for n cities the number of choices is n 1 n 2 2 1 n 1 which grows very rapidly with 71 Both the knapsack problem and the TSP belong to a class of notoriously dif cult problems called NP hard problems This means that no fast way of nding an answer exactly is known for these problems and likely none exists In practice people settle for nding an inexact answer fast But not all problems are so dif cult Suppose in our 10 city example we wish to travel from city 1 to city 10 in the shortest total distance but we can visit any number of cities in between Thus we can go 1 gt 10 directly or 1 gt 2 gt 10 or 1 gt 3 gt 10 or or 1 gt 2 gt 3 gt 10 or This is called the Shortest Path SP problem A counting argument shows that there are 1817161 1 1 818 17 111 2 8L2 possible choices In general for n cities the number of choices is n 21n 2 1n 3 11 1 2 n 2 2 which also grows very rapidly with 7 Yet this problem turns out to be much easier in that fast ways of nding an answer are known see Chapter 4 A linear program LP which is the problem of maximimizing minimizing a linear function over the feasible solutions to a system of linear equations inequalities may be viewed as a discrete optimization problem since at least one optimal solution occurs at a corner point of the feasible solution set which is a polyhedron and the number of such points is nite This number can be huge but fast ways of nding an answer are known On the other hand if we further restrict the variables to be 1 integervalued then the problem called integer program 1 is NPhard see Chapter 7 Thus the knapsack problem TSP and I are hard while SP and LP are easy In general the difficulty of a discrete optimization problem can be subtle If the problem is changed somewhere then it can go from difficult to easy or vice versa Much experience and knowledge may be needed to detect which is which like an art and the boundary between hard and easy problems is one of the mysteries in discrete optimization Knowing the difference can also save fruitless effort in trying to solve an impossibly difficult problem These days discrete optimization problems arise all around us When we fly on a major airline there is a good chance that the airfare we pay the seat availability the crew on the plane and the plane itself are determined by solving very large millions of variables discrete optimization problems When we send packages by UPS the routes taken by the packages may well be determined by solving large shortest path type problems Computer models of protein use discrete optimization Schedules of college sports teams may be determined by discrete optimization Routing of Meals onWheels vans has been done by solving inexactly a TSP using of all things a space lling curve Unlike LP there is no one or two dominant methods for solving discrete optimization problems Rather there are many different methods each with strengths and weaknesses A good understanding of these methods will enable us to choose a suitable method from our tookit and adapt it to the application we face In real life a problem rarely is exactly like ones in a book so we must be able to adapt accordingly We will study these methods in the next chapters One can hardly wait 2 Graphs and Digraphs As we saw from the examples of Chapter 1 discrete optimization may involve optimally traveling through cities or connecting cities in some way Mathematically cities and the connections between them can be represented by graphs and digraphs depending on whether each connection can be traveled in both directions or only in one direction To describe them we rst review some basic set theory 2 SETS We will use capital letters like 5 X Y V to denote a set of elements and use braces and to delimit the elements of the set The emptyset is always denoted by D For example let U 235711 denote the set of prime numbers and let V a I c 2 denote the set of English alphabets We can take intersection and union U of sets If W 024 then U0 W 2 U u W 02345711 and U D ID 9 U U10 U etc Also U D W E U W Q means is a subset of while U W E U U W We use 5 to denote the cardinality number of elements of a set S and use a E 5 respectively a g S to mean a is respectively is not an element of 5 Thus U 00 V26 UDW 1 ID 0 3 E U 1 Q U etc Is 4 a set Most sets we consider will have nitely many elements More than Oprah mathematics is all about relationships We will be interested in pairwise relationships No menage a troisf A nonempty set V and a possibly empty set A of pairs of elements of V form a graph written as V A and typically denoted by the captial letter G Each 11 E V is called a vertex or node Each pair um E A is called an are or edge We will consider only nite graphs ie V lt 00 There is some study of in nite graphs but relatively few In the next two sections we study two kinds of graphs directed and undirected 22 DIGRAPHS A graph is directed called digraph if the pairs in A are ordered ie uv E A is distinct from v u E A This is represented pictorially by a dot for each vertex 1 and a line joining two dots for each arc um with an arrow pointing from u to v Think of each arc as a oneway street Example 22 Let V UW W5 U 0U 05 U denote the northwest men s basketball teams in the PAC 10 and an arc um means team 11 has at least one win over team v in the 20052006 season Then A 3 UW 0U UW OSUWSUUW OUWSU OUOSUOSUWSU OSUOU Here V 4 A 7 The digraph V A has the pictorial respresentation UW V WSU 6 Example 222 Let V S 7 07 P7 58 08 P8 denote six street intersections in downtown Seattle cerca OU OSU 2000 and an arc uv means a car is allowed to travel in the direction from intersection u to intersection v Then A 5707 O7P7O708P7P8 P8P7P808 0858 SSS7 Here V 6 A 8 The digraph V A has the pictorial respresentation SS 08 P8 s7 07 39 P7 For convenience we will assume that digraphs have no selfloops and no parallel arcs in the some 01 will be excluded This avoids ambiguity in its algebraic representation V o b A o b o b I 2 direction Thus a digraph like This ambiguity can be resolved by introducing additional notation to distinguish between parallel arcs but it s not worth the extra complexity In fact there is no loss of generality by making this assumption since each self loop and parallel arc can be broken up by introducing a dummy vertex ac Here V obcd A ob od 20 01 db The topology of the graph is unchanged by this transformation The only drawback is the increase in the number of vertices and arcs What does the digraph with vertex set V 1mm 1m n and arc set A ij i 1 m j m 1 n look like What is jA In a digraph G V A we often like to know if we can reach a vertex from another vertex A path in G which we typically denote by the letter P is a sequence of vertices and arcs in G P 714713 u1 where p 2 0 u E V fori 1p 1 and mull1 E A fori 1p So u is the ith vertex in the path and p is the number of arcs in the path For simplicity we do not explicitly write out the arcs 4 Thus in a digraph we can traverse a path only in the direction of the arcs in the path For the digraph in Example 222 one path is 57070858 CAUTION Do not confuse sequence with set They are very different Example 223 For the digraph shown below 3 5 we have V 1 2345 A 12 13 24 25 32 45 54 53 Which of the following sequences is a path P1 1 P2 1253 P3 12532 P4 454 P5 453254 P6 123 P7 1243 Answerr All except P6 P7 are paths For P1 12 0 and 111 1 For P2 12 3 and 111 1113 2114 3114 5 and so on A path is simple if no vertex repeats in it ie m 74117 whenever i lt j In Example 223 P1 and P2 are simple A path is closed if it starts and ends at the same vertex ie 711 71744 In Example 223 P1 P4 and P5 are closed A closed path is a cycle if i no vertex repeats in it except at the start and end and ii it has at least 2 arcs ie 11 75117 whenever i lt j except i 1j p1 and p 2 2 In Example 223 only P4 is a cycle Notice that a simple path can never be a cycle but can be a closed path ie when it is a single vertex Fact 221 In a digraph G 12A if there is a path from vertex 1 E V to vertex v E V then there is a simple path from u to v Proof This is left as an exercise A digraph G is strongly connected if every vertex can reach other vertex by a path in G It can be checked that the digraphs in Examples 221 and 222 are strongly connected but not the one in Example 223 no vertex can reach 1 Now verifying a digraph is strongly connected using the de nition is a lot of work since for each pair of vertices we must check there is a path from one to the other The following equivalent characterization provides an easier way to show that a digraph is strongly connected This characterization does not seem to make it easier to show that a graph is not strongly connected however Fact 222 A digraph G V A is strongly connected if and only if G has a closed path P going through all pertices 139 e each 11ertez in V appears at least once in P Proof If Assume G has a closed path P going through all vertices ie P 111112111 111 for some 12 2 0 For any 11 v E V since P goes through all vertices 11 and v must appear in the P somewhere possibly more than once ie 11 11i and v 117 for some 1 g 1 j g p Ifi j then a path from 11 to v is 11 Iii lt j then a path from 11 to v is 11111111 111 11111211j Iii gt j then a path from 11 to v is 11111111 111 11111211j Only if Assume G is strongly connected Let n W and label the vertices from 1 to n Since G is strongly connected there is a path P1 from vertex 1 to 2 a path P2 from 2 to 3 a path Pn1 from n 1 to n and a path P from n to 1 ie P1 1 11i11 37111 2 I71 Z 0 P2 2 11 11 11 21 I72 Z 0 RH n 1 111 113 111ZI311 n pquot 1 2 0 P n 111111g11g1 1 12quot 2 0 Connecting these paths in sequence yields a closed path P through all vertices 1 1 1 2 2 2 3 nil n n n P 1111121111 1111121121 3 11111p 11 11 11111211p1 1 6 For Example 221 a closed path through all vertices is UW 0U 05 U W5 U UW For Example 222 one such path is S707P7P8 08 58 57 Can we nd other such paths Exercises 221 Let P 111112117 be a path in some digraph V A Suppose that 11 111M for some 1 g 1 g 12 3 50 12 2 4 a Write down a path from 111 to 111 with 4 fewer arcs than P b Write down a path with 8 more arcs than P c Can the digraph have a path with 1 fewer arc than P If give an example If no explain why not d Can V be less than 4 If give an example If no explain why not 222 Prove Fact 221 Hintr Start with the de nition of a path 11 111112 111 v with p 2 0 and 11 E V 1111111 E A 1 1 12 Then write down what it means mathematically if this path is not simple etc 223 Bo owns a vacation home on Vachon Island that she wishes to rent for the period June 1 to Aug 31 She has solicited a number of bids each having the following form the day the rental starts a rental day starts at 3 pm the day the rental ends checkout time is noon and the total amount of the bid in dollars Bo wants to choose a selection of bids that would maximize her total revenue Formulate this as a longest path problem ie nding a path in a digraph from a given starting vertex to a given ending vertex whose total weight sum of weights on each arc is maximum Specify clearly the digraph the weight of each arc etc Is this digraph acyclic ie contains no cycle 23 GRAPHS A graph is undirected if the pairs in A are unordered ie 11v E A is the same as v11 E A This is represented pictorially like a directed graph but without the arrows Since each pair is unordered an arc can be traversed in both directions so think of each arc as a twoway street Example 23 Let V Mdamdomeon1nan denote six 2letter English words and an arc 11v means word 11 has a letter in common with v Here each pair is unordered since 11v means the same as v11 Then A adamadanaddo ammeaman doonon1 n 1 nan Here V 6 A 8 The undirected graph V A has the pictorial respresentation 7 ad on am in We will follow a common convention and use graph to always mean an undirected graph We will use digraph to mean a directed graph We can think of a graph as a digraph whereby each undirected arc is replaced by two directed arcs in opposite direction We will assume that graphs have no selfloops and no aQO will be excluded Like digraphs there is no loss of generality by making this assumption since selfloops and parallel arcs Thus a graph like parallel arcs can be broken up by introducing dummy vertices A graph is called complete if pair of vertices is an arc As a counting exercise how many arcs does a complete graph with n vertices have 11 1 n l 2 3 Answer or equivalently Each vertex has n 1 arcs joined to it and there are n vertices for a total of nn 1 arcs but we double count so we divide this by 2 For a graph a path P is de ned exactly as for a digraph ie P 714713 u1 where p 2 0 u E V for i 1 p 1 and maul1 E A for i 1 p A path in an undirected graph is sometimes called a chain However since the air is unordered so that 111311 1 u 1u each arc in the path can be traversed in either direction Example 232 For the graph shown below b d g a Q C e f we have V alcde A ab a c I c I d I 6 06 16 f 9 Some examples of undi 8 rected paths are P1 a P 2 T f 9 P3 abca P4 aldebca P5 acba P6 aba Notice that P3 and P5 are the same path A path in a graph can be traversed in either direction For P3 12 3 and 111 au2 12114 1714 a For a graph simple path and closed path are de ned exactly as for a digraph Cycle is de ned as for a digraph except that it must have at least 3 arcs This is to exclude closed paths with 2 arcs like P6 in Example 222 which has no repeating vertex except at the start and end In Example 222 P1 P2 are simple paths P1 P3 P5 P4 P6 are closed paths and only P3 is a cycle Fact 231 In a graph G V A if there is a path between vertex 1 E V and vertex v E V then there is a simple path between 11 and v Proof Similar to the proof of Fact 221 A graph is connected if vertex is joined to every other vertex by a path It can be checked that the graph in Example 231 is connected as is the complete graph but not the one in Example 232 Similar to Fact 222 for digraph the following equivalent characterization provides an easier way to show that a graph is connected Fact 232 A graph G V A is connected if and only ifG has a closed path P going through all pertices Proof This is similar to the proof of Fact 222 Exercises 231 a Give an example of a digraph and a path in that digraph which is not a simple path but has no repeated arcs b Give an example of a graph in which the shortest circuit has 5 arcs and the longest cycle has 8 arcs c For a digraph 1721 that is strongly connected and W n what is the least number of arcs What is the most 232 a In a connected graph with n n 2 1 vertices what is the maximum possible number of arcs What is the minimum possible number of arcs b In a connected graph with 15 arcs what is the the maximum possible number of vertices What is the minimum possible number of vertices Hintr Use your answer to a 233 A given set of pins on a circuit board numbered from 1 to n has to be connected by wires Think of the pins as points on the Euclidean plane In view of possible future changes or corrections each pin can have at most two wires attached to it Let 617 denote the length of wire needed to connect pin 2 to pin j The total wire length is to be minimized Formulate this as a shortest hamiltonian cycle problem ie nd a cycle in a graph that visits each vertex exactly once and whose total weight is minimum Specify clearly the graph the weight of each arc etc Hintr Add a dummy pin 0 with Co cm 0 for i 1 n 24 VERTEX DEGREES AND DATA STRUCTURES FOR GRAPHSDIGRAPHS The degree of a vertex 1 E V in a graph digraph G V A is the number of arcs joined to u denoted by degu Thus the degree counts the number of neighbors of a vertex For the digraph in Example 221 with 6 arcs we have degUW 3 degWSU 3 degOU 4 degOSU 4 whose sum 3 3 4 4 12 equals 2 6 For the graph in Example 232 with 8 arcs we have dega 2 degb 4 degc 3 degd 2 dege 3 degf 1 degg 1 whose sum 2 4 3 2 3 1 1 16 equals 2 8 Hmm the sum of the degrees seems to always equal twice the number of arcs Fact 241 For any graph or digraph G V A Z degu 23A 116V Proof Each arc is joined to exactly 2 vertices so summing the degree over all vertices double counts each arc A consequence of the above fact is the following basic property of graphs digraphs whose proof we leave as an exercise Fact 242 For any graph or digraph the number of vertices of odd degree is even For the digraphs in Examples 221 222 223 and the graphs in Examples 231 232 the number of vertices of odd degree are respectively 2 4 2 and 2 4 The outdegree degomu of a vertex 1 E V is the number of arcs pointing out of u ie of the form am for some v E V The indegree deg3nu is defined analogously Notice that degm u deg3nu degu for all u E V For the digraph in Example 222 degout07 2 deg3n07 1 degm P7 1 deg3nP7 2 etc For the digraph in Example 223 degom1 2 deg3n1 0 degm 2 2 deg3n2 2 etc While small graphs digraphs say with fewer than 10 vertices can be studied by their pictorial rep resentations large graphsdigraphs with thousands or millions of vertices can be studied only by using digital computers How should graphs digraphs be stored on digital computers We consider two common data structures below 1 Adjacency matrix For a graph or digraph G V A label the vertices from 1 to n W and form an n x n matrix Adj whose i jth entry is 1 if i j E A A r mu 0 else Adj is called the vertex vertex adjacency matrix associated with G For the digraph in Example 223 the adjacency matrix is l 2345 1 ll 2 l l Adj3 1 4 1 5 11 Empty entry means a 0 For the graph in Example 231 the adjacency matrix is ad am do an on in ad 1 1 1 am 1 Adj do 1 1 an 1 1 1 on 1 1 1 HI 1 1 Notice that the adjacency matrix for a graph is always symmetric and has 2A nonzero entries In contrast the adjacency matrix for a digraph may not be symmetric and has jA nonzero entries Adjacency matrices can say a great deal about the topology of graphs digraphs One such property is the following Fact 243 For any graph or digraph G V A with adjacency matrix Adj ijth entry of Adj 4 paths with k arcs fromi to j 1 g ij g W Proof This is left as an exercise For example for the adjacency matrix associated with the digraph in Example 223 raising it to the 8th power 1234 10571010 204699 Adjls303466 402465 504578 By Fact 243 there are 10 paths using exactly 8 arcs from vertex 1 to 4 and there are 8 paths using exactly 8 arcs from vertex 5 to itself Notice that Adj c can be computed ef ciently in 0log2 113 matrix matrix multiplications For example we can compute Adj13 by rst computing Adj2 Adj4 Adj22 Adj8 AdjV2 and then 1461 Adjlsmdjllmdjl This requires only 5 matrixmatrix multiplications 2 Arc lists Here we order the arcs from 1 to M and use two arrays lists to store the start and end vertex for the jth arc in their jth location For the digraph in Example 223 one such arc lists are Arc 1 2 3 4 5 6 7 8 start vtx 1 1 2 2 3 4 5 5 end vtx 2 3 4 5 2 5 3 4 For the graph in Example 231 one such arc lists are Arc 1 2 3 4 5 6 7 8 start vtx ad ad ad am an an do on end vtx am an do an on in on in An advantage of arc lists is the low storage when the graphdigraph is sparse say M g 0W with a gt 0 a constant though additional data structure may be needed to do certain graph operations ef ciently In contrast the adjacency matrix would need to store WV entries or use more complicated data structure to store only the nonzero entries Exercises 241 Prove that in any graph 12A the number of vertices of odd degree is even Hintr Use Fact 241 and break Eng degu into two sums one over those a E V of odd degree and the other over those a E V of even degree Alternatively do an induction argument on the number of arcs 12 242 Show that in a graph V A there is a vertex of degree at least 2AV Hintr Argue by contradic tion 25 SUBGRAPHS TREES AND SPANNING TREES In this section we study special kinds of graphs that are useful later For a graph G V A a subgraph is a graph G V A satisfying In particular V A and V ID are subgraphs of V A Example 25 For the graph V abc A ab I 1 shown below a C 7 its subgraphs are m edges subgraphs 1 T a b c a a c c 2 0 b b a c 2 1 b b a c a c a c 3 1 b b a c 3 2 b V Notice that the number of sugraphs can be large even for a small graph A connected component of a graph G is a connected subgraph that is not a subgraph of any other connected subgraph of G In Example 251 the subgraphs in rows 1 3 and 6 are connected Of these only the subgraph in row 6 namely the graph itself is a connected component The graph in Example 232 has two connected componentsione comprising the two vertices f g and the arc joining them and the other comprising the rest of the graph The following property of connected components which is intuitive and not hard to prove will be useful By the way can vertices in different connected components be joined by a path 13 Fact 251 If G1 V1A1Ge V A t 2 1 are the connected components of a graph G V A then lli39 ll llVl lA1l lAellAl A graph that is connected and has no cycle is called a tree The graphs in Examples 231 and 232 are not trees In Example 251 the subgraphs in rows 1 3 and 6 are trees while the other subgraphs are not A spanning tree of a graph G is a subgraph T that i has all vertices of G and ii is a tree More precisely T is a spanning tree of G V A if i T V B with B Q A and ii T is connected and has no cycle The graph in Example 251 has one spanning tree namely itself The graph in Example 231 has many spanning trees We show four of them below ad do on ad do on am an in am an in ad do on ad do OH 5111 an in am an in The graph in Example 232 has no spanning tree Since a spanning tree of G is connected and contains all vertices of G G being connected is necessary for it to have a spanning tree When we write have a XXX we mean have at least one XXX The following fact shows that this is also suf cient Fact 252 A graph G V A is connected if and only ifG has a spanning tree Proof If Assume G V A has a spanning tree T V B with B Q A For any vertices uv E V since T is connected there is a path in T joining u and v ie 1 714713 u1 v where p 2 0 71471494 6 B for i 1 2 p Since B Q A this path is also in G This shows G to be connected Only if Assume G is connected We will construct a spanning tree of G using the following TREE algorithm which searches along arcs of G to color new vertices until all vertices are colored TREE Input A graph G V A Output A spanning tree of G if G is connected 0 Choose any 1 E V Initialize W r B ID 1 If W V then output T W B stop Else choose any 11 v E A with u E W U g W If no such arc exists G is not connected stop Update W W U U B lt B U uv Return to Step 1 Clearly T W B always remains a subgraph of G since only vertices in V and arcs in A are added to W and B respectively We now prove by induction that T W B always remains a tree Initially T 7 ID which is a tree Suppose T W B is a tree on entering Step 1 We show below that Tnew W U U B U u v is also a tree where uv is the chosen arc in Step 1 This would complete the induction Since T is connected by Fact 232 it has a closed path going through all vertices of W which we can without loss of generality assume to start and end at u u 114412 u1 11 Then vu1u2 u1v is a closed path in Tnew that goes through all vertices of W U U By Fact 232 Tnew is connected Since T has no cycle if Tnew has a cycle the cycle must use um But v has degree 1 in TM so v cannot be in a cycle of Tnew since each vertex in a cycle has degree at least 2 Thus Tnew has no cycle Thus when we stop in Step 1 with W V T WB is a spanning tree of G since T is a tree and its vertices spans V I Notice that initially W 1 B 0 and both are increased by 1 in Step 1 of TREE So when W V we must have W 5V EB 5V 1 Thus TREE has the following additional property Fact 253 If a graph G V A is connected then TREE outputs a spanning tree with V 1 ares If G is not connected then TREE would nd a spanning tree for the connected component containing the vertex 1 chosen in Step 0 If a graph has more than one spanning tree then which spanning tree will be found by TREE depends on the choice of r in Step 0 and of the arc uv in Step 1 Depthfirst search 15 DPS corresponds to choosing am such that u was most recently added to W Breadthfirst search BPS corresponds choosing am such that u was least recently added to W DPS tends to nd long and skinny spanning trees while BPS tends to nd short and fat spanning trees Example 252 Let us apply DPS and BPS to the graph in Example 232 starting at a a BPS b DPS W L W i a 9 a 9 a b a 2 a b a 2 a I c a b a c a 126 a b I c a I c d a b a c I 1 a I c e a b I c c 6 a I c d e a b a c I d Ie a I c ed a b I c c e e 1 b b d g d g c e f c e f The trees found by DPS and BPS which span the connected component containing a are shown above During the rst iteration instead of adding I to W and a b to B we could have added 6 to W and a c to B and so on So other spanning trees could have been found There are many equivalent characterizations of a tree each of which may be easier to use depending on the situation Below we give four wellknown characterizations including the de nition The proof makes use of the earlier facts Theorem 251 For any graph G V A the following four properties are equivalent a G is connected and has no eyele ie a tree b G is connected and M W 1 c G has no cycle and M W 1 d For any uv E V there is exactly one simple path in G joining a and v Proof ab Assume G is a tree Apply TREE to G Sine G is connected TREE nds a spanning tree T V B with B Q A and B M 1 We claim that B A If not there would exist an arc am 6 AB Sine T is connected there is a simple path in T joining a and U see Pact 231 11 114412 u1 v where p 2 0 algal1 E B for i 12 p and 111u2u1 are distinct Then 111u2u1u is a cycle in G contradicting G being a tree bc Assume G is connected and M W 1 Apply TREE to G Since G is connected TREE finds nds a spanning tree T V B of G with U3 W 1 see Fact 253 Since jA W 1 this means jB jA Since B Q A then B A So T G Since T is a tree it has no cycle so neither does G ca Assume G has no cycle and M W 1 Let 71111 179219 13 2 1 be the connected components of G These subgraphs are connected and have no cycle since G has no cycle so ab implies that by M 1 Me 1 Summing this yields A1A jliV t which by Fact 251 is equivalent to M W t Since jA W 1 by assumption then If 1 so G is connected ad Assume G is a tree For any 11 35 v E V since G is connected there is a simple path P from 11 to U see Fact 232 Suppose there is another simple path P in G from 11 to v Then P and P must diverge at some vertex 112 and then meet up again at some vertex 2 later ie P 11111 19111 13117 111 P 11111 191121 21191 v11 P1 forsomep22p 22and1 1ltj p11 1 ltj p1 Since P is simple 19111 12117 have no repeating vertex Since P is simple 19112 211 have no repeating vertex Also no vertex in 1911 12117 repeats in 19112 211 except at 112 and z or else P and P would have met up earlier before 2 Then I I I 112 111311 2 117 1171 11i1111i 111 is a cycle in G contradicting the assumption of G being a tree da This is left as an exercise I Theorem 251 is very useful For example we can use it to prove that if G 1721 is a tree with W 2 2 at least 2 vertices then G has at least 2 vertices of degree 1 In particular since G is connected each vertex has degree at least 1 Suppose G has fewer than 2 vertices of degree 1 Then the number of vertices of degree 2 or more is at least W 1 so 2 deg11 2 21m 1 1 116V 17 Since Zuev degu 2A by Fact 241 and A V 1 by Theorem 251 this implies that 2W 1gt 2 2W 1gt1 which is a clear contradiction Exercises 251 Consider the graph shown 4 1 5 a Apply the DFS Depth First Search version of TREE to nd a spanning tree for this graph starting at vertex 1 Write down the vertex set IV and arc set B at each iteration b Apply the BFS Breadth First Search version of TREE to nd a spanning tree for this graph starting at vertex 1 Write down the vertex set IV and arc set B at each iteration 28 EULERIAN PATHS In the foreword we mentioned the famous bridges of Konigsberg problem solved by Leonhard Euler in the 18th century This is the question of whether each of 7 bridges can be traversed exactly once in either direction and return to the start at Q top bottom Figure 261 On the left are the bridges as they were seen in Euler s time The corresponding graph is shown on the right with each bridge corresponding to an arc and each land mass corresponding to a vertex Two dummy vertices have been added to break up parallel arcs Euler proved that the answer is no How did he do it In general proving that something does not exist is dif cult A popular childhood game is to trace a line gure without lifting the pen like the two gures below whose vertices we have numbered Figure 262 This is of the same nature as the bridge problem Let s make our problem more precise We say that a path P in a graph digraph G V A is eulerian if P traverses every arc in A exactly once Then the bridge problem reduces to asking whether the graph in Figure 261 has a closed eulerian path Similarly the game of tracing a line gure reduces to asking whether the gure viewed as a graph has an eulerian path It is not dif cult to that both graphs in Figure 262 have eulerian paths a 421352345 and b 1243253 1 But if the graph has hundreds or thousands of arcs checking by hand would be impossible Is there an easy way to check whether a graph has a closed eulerian path Is there a systematic way ie a programmable algorithm to nd such a path The following theorem attributed to Euler gives an answer to these questions Theorem 261 A graph G V A has a closed eulerian path if and only if i G has no vertex of odd degree and ii G is connected excluding isolated verticesl Proof Only if Assume G V A has a closed eulerian path P 111ugu1 u1 where p A and lli7li1 E A for i 1 2 12 Start at 111 and traverse along P Each time we leave and return to 111 the number of traversed arcs joined to 111 increases by 2 so this number always remains even For each vertex 1 35 111 each time we enter and leave a the number of traversed arcs joined to 11 increases by 2 so this number always remains even When we nish traversing P so we return to 111 the number of traversed arcs joined to each vertex is even Since each arc is traversed exactly once this number is the degree of the vertex The vertices in P belong to one connected component of G see Fact 232 The remaining vertices must have degree 0 since all arcs are joined to vertices in P so they are isolated If Assume G V A has no vertex of odd degree and is connected excluding isolated vertices Then G has one connected component with all the arcs plus some isolated vertices If we discard these isolated vertices then G still has no vertex of odd degree and is connected We will construct a closed eulerian path of G by applying the following EPATH algorithm to G with these isolated vertices discarded The algorithm starts at any vertex and extends the path by searching along previously untraversed arcs until it closes on itself and then restarts the search from a vertex in this closed path that has an untraversed arc joined to it etc until all arcs are traversed 1 An isolated vertex is a vertex of degree 0 EPATH Input A connected graph G V A with no vertex of odd degree Output A closed eulerian path of G O Choose any 1 E V that has an arc in A joined to it Plt r Blt lD ult r Choose any um E AB P Pv B lt B U uv If v ga r then H u lt v Return to Step 1 Else v r so P closed on itself If B A then output P stop Else choose any vertex 1 in P with an arc in AB joined to it Resequence P to start and end at r ie P r r u lt 1 Return to Step 1 Hi aQaQQu P can be easily updated by storing the predecessor and successor of each vertex in P To that EPATH works correctly rst note that P always remains a path and B is the set of arcs in P Since we only add to B an arc in AB no arc in B repeats Thus P always remains an eulerian path of the subgraph V B and is closed whenever v r in Step 1 So when U r and B A P is a closed eulerian path of G and remains so after isolated vertices are added back to G It only remains to show that each time we return to Step 1 we can nd an untraversed arc u v E AB joined to 11 Here we use the assumption that each vertex of G has an even number of arcs joined to it so each time the path P enters and leaves a vertex 1 345 r the number of untraversed arcs joined to 11 decreases by 2 and hence remains an even number This means that when we extend P by an arc to enter a vertex 1 35 1 there must be another untraversed arc joined to u which allows P to leave 11 I If we apply EPATH to the graph in Figure 262b with r 1 in Step 0 and with v in Step 1 chosen to have the least label then r and P are updated as follows root 1 path P 1 1 12 123 1 2 3 1 2 231 11 Corollary 26 A graph G V A has an eulerian path that is not closed if and only if i G has exactly 2 pertices of odd degree and ii G is connected excluding isolated pertices Proof See Exercise 262 I By Theorem 261 the graph in Figure 262b has a closed eulerian path By Corollary 261 the graphs in Figures 261 and 262a have eulerian paths that are not closed which must start and end at the two vertices of odd degree What about the graphs in Examples 231 and 232 For digraphs analogous results can be shown by taking into account the direction of the arcs The proofs are similar to those for graphs and are omitted Theorem 262 A digraph G V A has a closed eulerian path if and only if i all pertices in V have indegree equal outdegree and ii ignoring directions G is connected excluding isolated pertices Corollary 262 A digraph G V A has an eulerian path that is not closed if and only if i all pertices in V have indegree equal outdegree except for two pertices one of which has indegree equal outdegree plus 1 and the other has indegree equal outdegree minus 1 and ii ignoring directions G is connected excluding isolated pertices By Theorem 262 the digraph in Example 221 has an eulerian path that is not closed starting at UW and ending at WSU By Corollary 262 the digraphs in Figures 222 and 223 have no eulerian path Can we easily nd the largest subgraph that has a closed eulerian path Exercises 261 Suppose that after removing an arc from some graph G the remaining graph has a closed eulerian path Explain your answers below a Must G have a closed eulerian path b Must G have an eulerian path c Does your answer to a and b change if G is connected 21 262 Prove Corollary 261 You can use Theorem 261 Hintr Add a vertex 2 and join 2 by two arcs to two suitably chosen vertices in G Treat the if and the only if directions separately 1 27 ADDITIONAL TOPICS A path P in a graphdigraph G V A is hamiltonian named after Sir William Rowan Hamilton who described such path in 1857 if each vertex in V appears in P exactly once except at the start and end Thus a hamiltonian path is either a simple path or a cycle depending on whether its start and end vertices are the same Although the de nition of a hamiltonian path looks very similar to that of an eulerian path determining whether a graphdigraph has a hamiltonian path is much harder In fact this problem is among the class of NP complete problems see Chapter 7 Some partial results are known For example 0 Ore showed in 1960 that a suf cient condition for a graph G V A with V 2 3 to have a hamiltonian cycle is that degu degv 2 V for all u 35 v um g A Roughly speaking any two non neighboring vertices must together have many neighbors However this condition is not necessary for existence as can be seen from examples Similarly DR Woodall showed in 1972 that a a suf cient condition for a digraph G V A with V 2 3 to have a hamiltonian cycle is that degm u deg3nv 2 V for all u 345 v um Q A A GhouilaHouri proved in 1960 a different result that a digraph G V A has a hamiltonian cycle if G is strongly connected and 998W 2 V for all u E V This suf cient condition is somewhat easier to check Exercises 271 Prove that a digraph G V A is strongly connected if degm u 2 and deg3nu 2 for all u E V 2 2 3 Minimum Spanning Tree in a Weighted Graph In Chapter 2 we studied graphs digraphs paths subgraphs and spanning trees for graphs In this and the next two chapters we will use these knowledges to study three kinds of discrete optimization problems In problems where we wish to link up cities or towns or computers at minimum cost or maximum reliability see the foreword for such an example it often involves nding a spanning tree on a graph whose weight is minimum or maximum In this chapter we study this kind of discrete optimization problem and in particular properties of such spanning trees and algorithms to nd them 3 DEFINITIONS AND PROPERTIES Recall from Fact 252 that a graph has a spanning tree if and only if it is connected Let G V A be a connected graph with a weightlength wm 6 5R for each arc um E A w E 81 means 212 is a real number The weight of a spanning tree T V B of G is de ned to be the sum of the weight of all arcs in T ie MT 2 am under Instead of sum other operands such as max can also be used Exercise 334 In applications the arc weight wm could be the physical distance between two points 11 and v or the cost of linking u and v or the time to travel between 11 and v etc Example 31 Consider the graph with arc weights shown below 3 b 1 d a 0 2C 5 6 So wag 3 wag 2 who 0 etc Two spanning trees shown darkened with their weights are d T1 3 wT130104 3 e b d T2 3 wT2250025 3 6 Below we give some properties of MT based on its de nition as a sum of arc weights in T For example if we add 7 to all arc weights in Example 311 then the weight of the spanning trees T1 and T2 both change 23 by 28 to respectively 32 and 53 Fact 311 Let G V A be a graph with weights aim for uv E A a If we add a 6 5R to aim for all uv E A then wT changes by an additive factor of le 1a for all spanning trees T of G b If we multiply a 6 5R to aim for all uv E A then wT changes by a multiplicative factor ofa for all spanning trees T of G Proof a By Theorem 251 each spanning tree T V B of G has B V 1 arcs so if we change aim to aim a for all uv E A then w T Z wm a Z B a J dm m 1a quotJOE S quotJOE S b If we change 1111 to am a for all uv E A then WWW Z wmy 039 Z wm a weldT a quot4qu mes We wish to nd a spanning tree whose weight is minimum which we call a minimumweight spanning tree MST The problem of nding a spanning tree of maximum weight can be reduced to this problem by negating all arc weights This is a discrete optimization problem since there are nitely many spanning trees to choose from But explicitly search through them all would be too slow so a more ef cient algorithm is needed Notice that simply picking the cheapest ie least weight arcs would not work since they might form cycles In Example 311 the three cheapest arcs form the cycle 261 I Fact 311a implies that we can change all arc weights by equal amount without changing the MSTs Thus we can add a large positive constant to all arc weights to make them all positive Similarly Fact 311b implies that we can multiply all arc weights by a positive constant without changing the MSTs We next have a key property of a spanning tree T namely any arc u v outside of T can be swapped with any arc in the cycle created by adding uv to T to yield another spanning tree Fact 312 Let T V B be any spanning tree of a graph G V A For any uv E AB let P1111 be the simple path in T joining u and v For any arc 2122 in PM the subgraph T V B39 with B39 B U uv wz is also a spanning tree of G Proof Since B Q A and u v E AB we have B Q A so T is a subgraph of G Since we replaced an arc in B by an arc not in B to obtain B jB B Also since T is a spanning tree Theorem 251 implies jB 1 Together this means jB 1 We now prove that T is connected which by Theorem 251 would imply that T is a tree and hence a spanning tree of G Since 212 2 is in PM we have PM 21 21121 212212 2211 v for some 1 g 2 g p where p 2 1 Since T is connected for any two vertices 8t E V there is a simple path P in T joining them If P does not use 2122 then P is still a path in T If P uses 2122 then P has the form 9 2122 t with 2122 appearing only once in P since P is simple Hence P 82139 212211 21211 v21391 2t is a path in T joining 9 t Thus for any two vertices in V there is a path in T joining them so T is connected I CAUTION Do not write T Q G to indicate that T is a subgraph of G E means a subset of not a subgraph of A graph is not a set Exercises 311 Can the following problems be transformed into an MST problem Explain your answer a G V A is a connected graph with arc weights wmwej Find a spanning tree T V B of G 12 that minimizes the Euclidean weight 2 921112 quotmes b G V A is a connected graph with each arc 21 v having a probability pm gt 0 of not failing Assuming the arcs fail independently the probability that no arc in a spanning tree T V B fails is H pm mums Find a spanning tree of G that maximizes the probability of no arc failing Hintr Use log c G V A is a connected graph with every arc colored either red or blue Find a spanning tree with the maximum number of red arcs 32 PRIM AND KRUSKAL ALGORITHMS A simple and popular algorithm for nding an MST is to specialize the TREE algorithm of Section 25 so that in Step 1 we choose an arc whose weight is minimum among all arcs between W and VW This 25 is usually called Prim s algorithm since it was publicized in a 1957 paper of RC Prim though it had been described by V Jarnik in 1930 and rediscovered by EW Dijkstra in 1959 MSTPrim Input A graph G 12A with weights 9J1 for um E A Output An MST if G is connected 0 Choose any 1 E V Initialize W r B lt 9 If W V then output T WB stop H Else choose any 11 v E A with u E W U Q W and whose weight wm is minimum among all such arcs If no such arc exists G is not connected stop Update W W U U B lt B U uv Return to Step 1 How ef cient is MST Prim It searches at most M arcs per Step 1 and repeats Step 1 at most W 1 times see Fact 253 Thus the overall effort is about UV 1A operations which is OUVHAD By operations we mean gt lt Also 03 means a quantity that is less than or equal to a 3 for s some a gt 0 independent of the graph and the arc weights With a more sophisticated data structure called Fibonacci heaps this can be improved to 00A W log2 jVD operations If we apply MSTPrim to the weighted graph in Example 311 starting at vertex a then arcs may be added in the order A ID a 6 a c C 6 WC 66 611 a c C e e b In 1 In this example the MST is not unique and MST Prim can alternatively add arc 601 instead of 61 W B a 9 W C a 0 032626 0326 C2 archerd arcrcrerdr a c ed b a c c e d e I 1 which results in the MST shown on the right Which MST is found depends on the mechanism used to break tie when the minimum in Step 1 is achieved by more than one arc as a computer code Such a tiebreaking 26 mechanism is inherent when writing MSTPrim as a computer code We show below that MSTPrim indeed works correctly This is not obvious The proof uses Fact 313 If G is not connected then MST Prim will nd an MST for the connected component that contains the starting vertex r Fact 321 MSTan outputs an MST whenever the graph G V A is connected Proof We will prove by induction that each WB in Step 1 of MST Prim is a subgraph of some MST Then since MSTPrim is a special case of the TREE algorithm when W V the output T W B is a spanning tree so it must be equal to this MST since spanning trees have the same number of arcs Initially W B r1D which is a subgraph of any MST Suppose on entering Step 1 W B is a subgraph of some MST T V B so B Q B We now show that W 39 B 39 with Wm WUv BM BUuv where uv is the chosen arc in Step 1 is also a subgraph of some MST possibly different from T which would complete the induction If 71v E T then B Q B as well as W E V so W BMW is a subgraph of T Else u v Q T and let Pm be the simple path in T joining u and v Since 11 E W and v Q W there exists some arc 2122 in PM with w E W and 2 g W Then our choice of uv in Step 1 implies wm S wwz By Fact 312 T 023 with B39 3 u mm mm is also a spanning tree of G Moreover w w on s w Since T is an MST this shows that T is also an MST Since B Q B and 2122 Q B due to 2 Q W we that B Q B and hence W BMW is a subgraph of T I A second popular algorithm for nding an MST attributed to J B Kruskal in 1956 uses the simple idea of adding arcs in order of increasing weight skipping any that forms a cycle However some care is needed to easily detect the formation of a cycle This is achieved by giving the same label to all vertices in the same connected component so two vertices in the same connected component are easily detected MSTKruskal Input A graph G V A with weights 9J1 for um E A Output An MST if G is connected 0 Sort the arcs in order of increasing weight Initialize B lt 10 11 lt u for all u E V H Choose the next arc uv in the ordering If 11 v then return to Step 1 Else u 35 v Update B lt B U uv 5a u for all 212 E V with 212 v Return to Step 1 How ef cient is MSTKruskal It is wellknown that the sorting jA numbers can be done in 00A log2 jAj operations Since M g VV 12 so that log2 M g Qlog2 W this means OAlog2V operations The rst case in Step 1 uses 01 operations and can repeat at most M times for a total of OA operations The second case in Step 1 uses OV operations and can repeat at most W 1 times for a total of OV2 operations So the total effort is OA log2 W W32 operations This can be improved to OA log2 V by adopting the convention that the connected component of V B containing v is smaller in number of vertices than the one containing 11 This necessitates keeping track of the number of vertices in each connected component of V B which can be done ef ciently Then each time when a vertex s label is updated the connected component of V B to which it belongs at least doubles in size so its label can change at most log2 W times Summing over all vertices yields at most W log2 W label changes in total If we apply MSTKruskal to the weighted graph in Example 311 starting at vertex a then arcs after by weights 1 0 0 5 1 2 3 may be added in the order 6 a f 2 wt 9 9 ab 0 16 1 1 a I c I e bd Ie ab 012 skip arc 16 1 d I e c 6 ab 212 skip arc cd 1 d I e c e a c I 2121212 and the resulting MST is shown below on the left with each new connected component circled by dotted line In this example the MST is not unique and MSTKruskal can alternatively add arc 16 instead of 26 which results in the MST shown on the right We can similarly show that MSTKruskal works correctly Fact 322 MSTKmshal outputs an MST whenever the graph G V A is connected 28 Proof Similar to the proof of Fact 321 this involves showing by induction that each B in Step 1 of MSTKruskal belongs to some MST We leave this as an exercise I 33 OPTIMALITY CONDITION AND SENSITIVITY ANALYSIS Suppose after we have found an MST some of the arc weights change as can happen in practice Can we update the MST in some simple way instead of nding it from scratch using one of the two algorithms of the last section We will in this section that the answer is yes by appealing to an optimality condition for MST Imagine that a stranger approaches you with a spanning tree and claims it to be an MST How would you check his her claim Let s revisit one of the MST we found for the weighted graph in Example 311 for ideas 3 b 1 d a 1 0 2 5 39 6 Figure 331 Notice that each arc not in the MST has the property that it s heavier ie higher weight than all arcs in the cycle formed by adding this arc to the MST In particular arc a 2 forms a cycle with arcs I e e c c a whose weights 0 5 2 are all below wag 3 arc cd forms a cycle with arcs 11 I e e 6 whose weights 10 5 are all below wed 1 arc 16 forms a cycle with arcs 61 I 1 whose weights 10 are all below We 0 In other words wait 2 maxlwac 2 wee 2 win wed 2 maxlwcmwte WM 331 wde gt maxwt43wbe Remarkably the above condition is necessary and suf cient for the above spanning tree shown darkened to be an MST We state and prove the general result below Fact 331 Optimality Condition for MST Let G V A be a graph with weights aim for am 6 A A spanning tree T V B of G is an MST if and only if for every am 6 AB am 2 aim for all ares 2122 in the simple path ofT joining uv 332 Proof If This is left as an exercise Only if Let T V B be a spanning tree that does not satisfy Eq 332 Then there is some um E AB such that wm lt wm for some arc 2122 in the simple path of T joining u v By Fact 312 T V B with B39 B U uv wz is also a spanning tree of G Moreover MT M am lt m Thus T is not an MST I Fact 331 is useful for sensitivity analysis of MST For example what is the range of wed for which the spanning tree shown in Figure 331 in dark remains an MST Since adding cd forms a cycle with arcs 16 26 120 in the spanning tree Fact 331 implies we need wed 2 max50 1 5 also Eq 331 If wed lt 5 then cd would replace 26 in an MST b 1 d 3 b 1 d What is the range of was for which the spanning tree shown in Figure 331 remains an MST Since ce belongs to cycles formed by adding arc ab or cd Fact 331 implies we need 3 2 wc91 2 was or equivalently was g min3 1 1 also Eq 331 If was gt 1 then 26 would be replaced by c d in an MST Fact 331 also suggests an arc swapping algorithm for nding an MST starting at any spanning tree MSTarcswap Input A connected graph G V A with weights wm for u v E A and a spanning tree T V B Output An MST 1 If there is some 11 v E AB and some arc 2122 in the simple path of T joining u U such that wm lt wwz then B lt B U uv wz return to Step 1 Else output T V B stop For example if we start with the spanning tree shown below on the left then after swapping 120 with a b and then I 6 with c d we nd an MST Here we choose the heaviest arc to swap out 30 What is the ef ciency of MSTarcswap This seems not well studied There is yet a fourth algorithm for nding an MST Starting with B A successively remove the next heaviest arc from B if it does not disconnect V B until no more arc can be removed However there is no fast way to check whether removing an arc will disconnect a graph so this algorithm is little used Exercises 331 This is an exercise on logic Recall that the negation of the quali ers for all is for some at least one and vice versa The negation of and is or and vice versa For example the negation of every person is happy every day is some person is unhappy some day Give the negation of each of the following statements a Every cycle has at least one arc with weight zero b At least one cycle has an odd number of arcs c At least one spanning tree has two vertices of degree 1 and no vertex of even degree 332 Consider a graph G V A with arc weights wm for um E A Prove that a spanning tree T V B is an MST if for every uv E A B we have wm 2 wm for all arcs 2122 in the simple path of T joining uv Hintr Let T V B be any MST Show that if B 35 B then we can replace some arc in T with some arc in T to get another MST G V A is a connected graph with arc weights aim for uv E A Consider the problem of nding a spanning tree T V B whose maz wez ght wmu T magi wm is minimum a Prove that every MST T solves this problem ie wmaxT is minimum The argument is similar to Exercise 332 b Is the converse true In other words if a spanning tree T has minimum maxweight is T an MST Explain 34 ADDITIONAL TOPICS There are many variants of the MST problem some of which are much more dif cult For example there may be an additional requirement that the spanning tree has at most k arcs joined to each vertex so the tree is not too bushy where k 2 2 is a xed integer However in general even nding such a spanning tree is a dif cult problem belonging to the socalled NPhard problems 31 Another famous variant of the MST problem is the socalled Steiner tree problem named after a mathematician Steiner who had studied this kind of problem In this problem a subset of the vertices is designated as Steiner vertices and the tree is required to connect only the nonSteiner vertices possibly using some of the Steiner vertices rather than all vertices Such a tree is called a Steiner tree The problem is to nd a Steiner tree Whose weight sum of arc weights in the tree is minimum In the special case Where all vertices are nonSteiner this reduces to the MST problem In general the Steiner tree problem is much more dif cult belonging to the NPhard problems 4 Shortest Path in a Weighted Digraph Whether it s driving to your grandma s house or routing traf c on the internet nding a shortest path is one of the most often solved discrete optimization problems in practice In this chapter we study its structures as well as popular algorithms for nding shortest paths 4 DEFINITIONS AND PROPERTIES Let G V A be a digraph with a weightlength 9J1 E 5R for each arc um E A The weight of a path P 714713 u1 in G is de ned to be the sum of the weight of all arcs in P ie 7 MP 29mm wum wum wuqu 11 Instead of sum other operands such as max or can also be used Exercise In applications the arc weight 9J1 could be the cost time distance to travel from u to v Example 41 Consider the digraph with arc weights shown below 3 1 5 3 1 39 1 5 3 0 4 80 W12 5 OJ 0 23 5 etc Two paths from vertex 1 to 5 shown darkened with their weights are 1 3 Pl 5 wPl033 2 4 1 3 I 5 w1 50335 2 4 Other paths and their weights can be similarly computed Below we give some properties of MP based on its de nition as a sum of arc weights in P For example if we add 7 to all arc weights in Example 411 then the weight of the paths P1 and P2 change by 14 and 21 to respectively 17 and 245 Thus unlike MST the weight of the paths do not change by equal amount Fact 411 Let G V A be a digraph with weights wm for um E A 33 a If we add a 6 5R to wm for all am 6 A then wP changes by an additive factor of arcs in P a for all paths P in G b If we multiply a 6 5R to wm for all am 6 A then wP changes by a multiplicative factor ofa for all paths P in G Proof The proof is very similar to that of Fact 311 and is omitted In general there is more than one path from one vertex to another We wish to nd a shortest path SP ie a path of minimum weight from a given vertex 9 E V to all vertices v E V reachable from This is sometimes called the oneto all SP problem The problem of nding paths of maximum weight can be reduced to this problem by negating all arc weights For the weighted digraph in Example 411 it is not hard to convince ourselves that a SP from 1 to 5 is 1 4 3 5 with weight w a wag 0 1 1 2 This SP is not unique however as the path 1235 also has weight 2 Fact 411b implies that we can multiply all arc weights by a positive constant without changing the SPs This is not true if we add a positive constant to all arc weights Unlike MST negative arc weight can pose dif culties for SP For example the following digraph with 1ltgt 2 1 has a path 12 with weight 0 But 12 12 is also a path from vertex 1 to vertex 2 with weight 1 And arc weights shown 121212 is another path from 1 to 2 with weight 2 In general by going k times around the cycle 121 the path weight would be k which tends to 00 as k gt 00 In other words the above weighted graph has no SP from 1 to 2 due to the presence of a cycle of negative weight If there is no such cycle then not only is there a SP maybe more than one from any vertex to any other vertex reachable from it but there is at least one such SP that is simple compare with Fact 221 Fact 412 Let G V A be a digraph with arc weights 9J1 for am 6 A and s E V If there is no cycle of negative weight then for any v E V reachable from s by a path there exists a SP from s to v that is simple Proof The proof is similar to that of Fact 221 and is left as an exercise Thus under the assumption that there is no cycle of negative weight equivalently every cycle has positive or zero weight our SP problem is a discrete optimization problem since there are nitely many simple paths to choose from But explicitly search through them all would be slow since there can be many such paths possibly exponential in V Exercise 312 so a more ef cient algorithm is needed We will study two such algorithms in the next section These algorithm are based on a simple but very useful observation about SP namely if the shortest path from s to v goes through a predecessor vertex u then the 34 portion of this path from s to u must be shortest and it minimizes the weight of this path plus the weight 11 weight of SP from s to v for all v E V 411 of the arc to U More precisely suppose with 11 00 if v is not reachable from 9 If 9 yv is a SP from s to v then s u is a SP from s to u and d1 11 wm Moreover d3 0 and d1 min Adu way for all v E Vs 412 114151va In Example 411 starting at s 1 we have 11 0 and it can be checked that 12 5 d3 1 d4 0 d5 2 Thus 12 mind1 942 13 mind2 w23d4 9J43 d4 mind1 M4 15 mind3 35344 W45 This provides a very useful characterization of SP weights which we state below without proof Fact 413 Optimality Condition for SP Let G V A be a digraph with weights aim for am 6 A and 8 E V Any nonnegative 11 possibly 11 00 u E V satisfy 411 if and only if they satisfy 412 Thus if a stranger claims to have found the SP weights we can easily check his her claim by checking that the weights satisfy the equations 412 Exercises 411 a If all arcs of a digraph have different nonnegative weights is the SP from a vertex 9 to another vertex t necessarily unique Explain b It is not ef cient to consider all possible paths from s to t in search of a SP from s to t For the digraph shown below nd the number of distinct paths from s to t n1 n2 n3 2n 412 We wish to nd the shortest route from point A to point B in the 2dimensional plane but there are obstacles in the shape of polygons in our way Formulate this as a SP problem on a suitable weighted digraph Hintr The corner points of the polygons correspond to the vertices of the digraph 42 DIJKSTRA AND BELLMANFORD ALGORITHMS 35 A popular algorithm for the onetoall SP problem was suggested by EW Dijkstra in a 1959 paper It builds the paths one arc at a time like Prim s algorithm for MST Its main restriction is that all arc weights should be nonnegative SPDijkstra Input A digraph G V A with nonnegat39i39ve weights aim for um E A and 9 E V Output For every v E V reachable from 9 by a path a SP from 9 to v and its weight 11 0 d3 0 Initialize W 8 B ID H Choose any um E A with u E W U g W and whose 1quot Pam is minimum among all such arcs If no such arc exists the vertices in VW are not reachable from stop Update 11 du 1 way W W U U B lt B U 1579 Return to Step 1 How ef cient is SPDijkstra It searches at most M arcs per Step 1 and repeats Step 1 at most W 1 times Thus the overall effort is about V 1A operations which is OVA By further storing and updating the quantity 13 du 1111 for all v g W this can in fact be improved to OV2 min mungm operations If we apply SPDijkstra to the weighted digraph in Example 411 starting at vertex 1 then arcs may be added in the order weight of SPs from 1 W B 11 lt 0 1 d4 lt min0005 0 14 14 12 min0501035 142 1412 d3 min0103 551 1423 141243 d5 min03112 14235 14124335 and the resulting B is shown below on the left with each W set circled by dotted line The SPs from 1 to In this example the SPs from vertex 1 to 3 is not unique and SPDijkstra can alternatively use arc 23 instead of 4 3 resulting in the SPs shown on the right Which SPs are found depends on the mechanism used to break tie when the minimum in Step 1 is achieved by more than one arc Notice that the arcs of the SPs form a spanning tree when arc directions are ignored Also SPs are found in order of their weights with the shortest paths found rst If we want only an SP from 9 to a particular vertex t then we can stop SPDijkstra early whenever t is We show below that SPDijkstra indeed works correctly The proof uses Fact 313 If G is not connected then MSTPrim will nd an MST for the connected component that contains the starting vertex 1 36 Fact 421 SPDz39jkstm outputs 11 as the weight of SP from 9 to every vertex v E V reachable from 9 by a path Proof We will prove by induction that in Step 1 we always have 1quot weight of SP from 9 to u for all u E W 421 Initially d3 0 W 9 so 421 is true Suppose on entering Step 1 421 is true We now show that 421 is true when U is added to W ie d1 weight of SP from 9 to v 422 where um is the chosen arc in Step 1 Since 11 E W 421 implies d is the weight of a SP from 9 to u 811116 11 11 where 12 2 0 112311 1 6 A for i 12 Then 11 dquot 1 way is the weight of the path from 9 to v P 111113 u 1v Consider any path P from 9 to v P s1t1ugu1v where p 2 0 ulna1 E A for i 1 12 Since 9 E W and v Q W the path P must contain an arc from W to VW ie there exists 1 g i g 12 such that u E W 111 g W 423 Let P1 111u P2 1ti1u1 Then wP wP1 wuiui wP2 Since P1 is just a path from 9 to m its weight is at least that of a SP from 9 to m which by 421 is equal to dug since u E W Thus wP1 2 1quot Also since all arc weights are nonnegative wP2 2 0 Thus we conclude that 9JP 2 dug 9J1mn1 Since 423 holds our choice of the arc um in Step 1 implies the righthand side is not less than 1quot 1 way which we showed earlier to equal wP Thus wP 2 wP 37 Since this is true for any path P from 9 to v and P is itself a path from 9 to v P must be a SP from 9 to v Since 11 MP this proves 422 If there are arcs with negative weight even if no cycle of negative weight SPDijkstra can fail to nd the correct SP Take the digraph with arc weights shown 2 1 2 1 3 If we apply SPDijkstra to this example with 9 1 it would output 13 1 which is not the correct SP weight to vertex 3 The problem here is that Dijkstra s algorithm is in some sense myopic and cannot look far ahead enough to take into account negative arc weights It is sometimes called label xing as it xes the path weights in increasing order We will next look at a second popular algorithm attributed to BE Bellman 1958 and LR Ford 1956 that uses labelcorrection via Fact 413b to avoid this pitfall SPBellman Ford Input A digraph G 12A with weights 9J1 for u v E A having no cycle of negative weight and 9 E V Output For every v E V reachable from 9 by a path the weight 11 of SP from 9 to v P 13 lt 0 d1 00 for all v E V9 For each v E V9 chosen in any order update H 11 min duwm uu11El If no distance label changes stop Else return to Step 1 If we apply SPBellmanFord to the weighted digraph in Example 411 the following distance labels may be generated 0 d1 0 d2 ood3 ood5 00 1 d2 mind1 min5 5 d3 mind2 5 14 1 min1 00 1 d4 mind1 0d2 0 min0 0 d5 mind3 1d4 3 min23 2 d2 mind1 min5 5 d3 mind2 5 14 1 min1 1 1 d4 mind1 0d2 0 min0 0 d5 mind3 1d4 3 min23 2 H 1 No further change in distance labels stop The shortest paths can be found among the arcs that achieve the minimum In the above example the minimum in computing 12 is achieved by the arc 12 so this arc is on the SP from 1 to 2 The minimum in computing 1 is achieved by the arc 2 3 or 4 3 so either of these arcs is on a SP from 1 to 3 and so on Since SPBellmanFord stops only when no distance label changes the distance labels must satisfy 412 on stopping and hence by Fact 413 are SP weights Thus it suf ces to show that SPBellmanFord eventually stops which we do below Fact 422 SPBellmanFord stops after at most V 1 iterations Proof idea The key step in the proof is to show by induction that for all vertices v that has a SP from 9 with at most k arcs their distance labels would stop changing after k iterations and possibly earlier In the above example vertices 2 and 4 have SP from 1 with one arc so their distance labels would stop changing after the 1st iteration Vertex 3 has SP from 1 with two arcs so its distance label would stop changing after the 2nd iteration and so on I Since each iteration of SPBellmanFord involves OA operations Fact 422 shows that the overall effort for SPBellmanFord is OVA operations In fact we can run SPBellmanFord without knowing a priori whether the digraph has a cycle of negative weight We simply run the algorithm for V iterations If the distance labels stop changing after V 1 iterations then these are the SP weights Otherwise Fact 422 shows that the digraph has a cycle of negative weight With some re nement the algorithm can even nd one such cycle The ability to detect and possibly nd negativeweight cycle is a key advantage of the BellmanFord algorithm And in contrast to the Dijkstra algorithm Bellman Ford does not require all arc weights to be nonnegative For example the home rental problem of Exercise 223 can be formulated as a SP problem by negating the arc weights The digraph has no cycle so it has no cycle of negative weight even though the arc weights are all negative Thus we can apply SPBellmanFord to solve this SP problem Similar to Section 33 we can use the optimaly conditions of Fact 413 for sensitivity analysis For example the arc 24 in Example 411 does not belong to any SP starting from vertex 1 What if we decrease am From the optimality condition 14 mind1 w14d2 am min0 5 am we that 24 remains outside of any SP as long as 0 lt 5 am or equivalently am gt 5 Similarly the arc 14 belongs to an SP What if we increase 9 This will affect the weight of all SPs that use the arc 14 From the optimality conditions for these SP weights d4 mind1 w14d2 w24 minw14 5 d3 mind2 w23d4 w43 min 14 1 d5 mind3 w35d4 w45 mind3 1414 3 39 we that 14 would not be on any SP from 1 to 3 or 5 whenever 9114 gt 0 since 13 remains at 1 It also would not be on any SP from 1 to 4 whenever 944 gt Exercises 421 Consider a digraph G V A with nonnegative arc weights wm for um E A and 9 E V For any path from 9 to v E V de ne its maz wez ght to be the maximum of the weight of all arcs in the path We wish to nd for each v E V reachable from 9 a path from 9 to v whose maxweight is minimum a Show by example that a SP need not have minimum max weight nor vice versa b How would you modify SPDijkstra to nd a path of minimum maxweight 43 ADDITIONAL TOPICS Both SPDijkstra and SPBellmanFord nd SP from one vertex 9 to all other vertices In some situ ations we may wish to nd a SP from every vertex to every other vertex Of course we can simply run SPDijkstra or SPBellmanFord V times each time with a different start vertex However there is an elegant alternative algorithm attributed to RW Floyd and S Warshall in 1962 which we describe now Let us label the vertices from 1 to n ie V 1 n Denote 197 weight of SP from i to j using only 1 k as intermediate vertices for i j E V Then 19 7 0 if z j 00 else Moreover any SP from i to j using only 1 k 1 as intermediate vertices either i does not use k 1 or 947 if ij E A ii is made up of a SP from i to k 1 and a SP from k 1 to j This means that 81 39 k k k 127 1111417241 31 13 71 The FloydWarshall algorithm simply applies the above recursive formula for k 01 n 1 to compute for ij E V Let D 6 be the n x n matrix with 197 as its ijth entry Then for the weighted digraph of Example 411 we have that 0 5 so 0 so 0 5 so 0 so 0 5 1 0 2 so 0 5 0 so so 0 5 0 so 8 0 5 0 15 D 3 00 0 00 1 D1 3 8 0 3 1 D5 3 8 0 3 1 00 00 1 0 3 00 00 1 0 3 13 18 1 0 2 00 00 00 00 0 00 00 00 00 0 00 00 00 00 0 When programming this on a computer 00 would be replaced by a very large constant The overall effort for the FloydWarshall algorithm is OV3 operations With some re nement this algorithm can also detect cycles with negative weight 40 We saw in Section 41 that if there is a cycle of negative weight then there may not exist a path of minimum weight What if we instead wish to nd a simple path of minimum weight Since there is only a nite number of simple paths between two vertices such a shortest simple path always However it turns out that this problem is much more dif cult to solve In fact like the traveling salesman problem it belongs to the class of NPhard problems In general there is often a ne line between easy problems such as MST and SP and more dif cult problems 41 5 Flows Cuts and Matchings In this chapter we study optimization problems that can be interpreted as optimizing ows on a capacitated digraph For example we may wish to send as much ow as possible from a source to a sink or send ow at minimum cost Such problems have applications to routing of data or packages or vehicles on networks as well as to matching people to tasks for examples 5 MAXIMUM FLOW DEFINITIONS AND PROPERTIES Suppose we wish to send goods over a road network or a rail network from a source to a sink Or we wish to transmit data over a data network Each link in the network has a capacity limit on the rate of transmission amount of goodsdata that can be sent per time unit What is the maximum rate of transmission This can be modeled in simplest form as a maximum ow problem which we describe below Let G V A be a digraph with a nonnegative number cm capacity for each arc 11v E A possibly cm 00 Let 9 and t be two distinct vertices in V An st ow on G is a vector of real numbers MWMEA arc ows satisfying 0 S 33m S C1111 for all 11 v E A capacity 19 if 11 9 2 331w 2 33m 19 if 11 t for all 11 E V ow conservation 1vu1vEl 1v11uEl O for some 19 E 83 We call 19 the value of this st ow The capacity constraints require the ow on each arc to be nonnegative and below the arc capacity The ow conservation constraints require the total ow out of each vertex to equal the total ow into that vertex with an exogenous in ow of 19 at the source 9 and an exogenous out ow of 19 at the sink t We wish to nd an st ow whose value is maximum This is the maximum ow problem which was rst studied by LR Ford and DR Fulkerson Example 51 Consider the digraph with arc capacities shown boxed with 9 1 and t 4 2 So 012 3 013 1 C23 2 etc The arc capacities need not be integer in general The capacity and ow conservation constraints are 0S12 330S13 130 23 230 24 23 0S31 130 34 23 3312 3313 3331 19 3312 3323 3324 0 3313 3323 3331 3334 0 3324 3334 9 Some examples of 14 ows and their values are listed below a 19 3339 3339 33 0 0 2 v 2 21 3 215 311202 4 Is 4 the maximum value of 14 ows How can we check this systematically Notice in Example 511 that the capacity constraints are linear inequalities and the the ow conservation constraints are linear equations in the variables 32123213 a3419 Also 19 is a linear function of these variables Thus the maximum ow problem is a special case of a linear program LP whereby we wish to maximize a linear function subject to the variables satisfying a nite system of linear inequalities and equations Thus the maximum ow problem is a discrete optimization problem in the sense that if an LP has an optimal solution then it has an optimal solution that is a corner point of the feasible region and the number of such corner points is nite We might recall from duality theory for LP that LP has a dual which has the same optimal cost objective value Moreover solving the LP also solves its dual The dual of a maximum ow problem has very interesting graphical properties which we will now study and then use to solve the maximum ow problem For any 5 E V and T E V we denote 5T uv E A u E S v E T ie 5 T comprises those arcs in A directed from vertices in S to vertices in T For any 5 E V that contains 9 but not If ie 9 E S t 5 we denote 451 Z mamavi ie c5 is the sum of capacities of arcs directed from S to its complement VS SVS is sometimes called an st cut since deleting these arcs would cut off all connections between 9 and t and CS is called the capacity of this cut which depends on S We wish to nd an st cut whose capacity is minimum This is the minimum cut problem Notice that this is a discrete optimization problem since the number of st cuts is nite For Example 511 the 14 cuts and their capacities are listed below 0 S S 43 1 1321033 314 12 11311231129 1225 13 121134 325 123 241134 224 Is 4 the maximum value of 14 cuts How can we check this systematically Notice that the value of each 14 ow is below the capacity of each 14 cut in Example 511 Is this a coincidence Recall weak duality for LP and its dual Observe that the ow conservation constraints have exactly two nonzero coef cients a 1 and a 1 in each lefthand column Linear equations of this type have special properties In particular for any 5 E V if we sum the ow conservation constraints over all 11 E S then each term of the form arm with 1111 6 S will appear twice in the sum once with a plus sign and once with a minus sign and hence they cancel This leaves only terms of the form 321 with 11 E Sm g S with a plus sign and arm with 11 E Sm g S with a minus sign For example if we sum the ow conservation constraints in Example 511 over vertices in S 1 2 we obtain 1313 1323 1324 1331 Since any 14 ow satis es 3213 S 613 3223 S C23 9324 S C24 and 331 2 3 the lefghand Side is below C13 C23 C24 0 which exactly equals c12 Thus c1 2 2 19 If we instead sum over vertices in 5 1323 we obtain 1324 1334 By a similar reasoning the lefthand side is below C24 634 which exactly equals c123 Thus c 1 2 3 2 19 This can be generalized to the following key fact Fact 511 Weak Duality For any st aw arm we have 45 2 19 with value 19 and any 5 E V with 9 6 51 Q S uvyEA Proof Since am My satis es ow conservation at each vertex 11 E V summing over all 11 E 5 yields mm 2 new 19 1168 1vu11El m1vueA We get 19 on the righthand side because 5 contains 9 but not t so the righthand side is the sum of 19 and a bunch of zeros Thus we 2 2 1iES1Hlt1l1vEl ueSvvuei 2 am 2 new 1111ESV wengs Z 331w 2 any Z xm 2 saw uu6ss 1L1YESVS panama 111LEVSS Z xuv 2 saw 1111ESVS 1vu61SS 44 Since 321 g cm for all 1111 6 5 V5 and 33m 2 0 for all 1111 6 V5 5 the righthand side is below cm 0 c5 Thus 19 g c5 I 1L1v681 8 Fact 511 implies that if an st ow has a value equal to the capacity of an st cut then this ow value must be maximum since no ow value can exceed the cut capacity and this cut capacity must be minimum since no cut capacity can be less than the ow value In the next section we will describe an algorithm for nding such ow and cut Exercises 511 a Give an example of a digraph G V A with integer arc capacities and two vertices 8t such that V g 4 and the 8 15 ow of maximum value is not unique b For your answer to a Find an st ow of maximum value whose arc ows are not all integer c Give an example of a digraph G V A with arc capacities of 1 and two vertices 8t such that V 3 and the 815 cut of minimum capacity is not unique 512 Prove that if armhwge A is an st ow of value 19 and 5 V5 is an st cut of capacity c then v c if and only if 321 cm for every 1111 6 5 V5 and arm 0 for every 1111 6 V5 5 You can use the fact that 19 2613063ng 33m Zmuw v arm for any st ow armhwgel of value 19 513 Consider a digraph G V A with arc capacities cm 1111 6 A and 9 35 t E V Suppose that 5173 and 52T2 are two st cuts of minimum capacity where T1 V51 T2 V52 Prove that 51 D 52 T1 U T2 and 51 U 52 T1 0 T2 are also st cuts of minimum capacity Hintr First show that they are st cuts Then show that their capacities equal that of 5173 and 52T2 by considering arcs out of 51 05251S2S2511 52 MAXIMUM FLOW FLOW AUGMENTATION ALGORITHM An intuitively reasonable way to solve the max ow problem is to start with any st ow eg zero ow on all arcs nd a path along which more How can be sent from 9 to t to obtain a new st ow and so on until no more How can be sent However care is needed in nding such a path To illustrate consider Example 511 again An 14 ow for its capacitated digraph is 3212 3 3223 3234 2 3224 1 3213 3231 0 with value 3 The only way we can send more ow from 1 to 4 and still satisfy ow conservation at all vertices is by increasing ow on the arcs 1324 and decreasing ow on the arc 2 3 This is shown in the gure below where A is the increase in ow value 45 To satisfy the capacity constraints on these arcs we need 0 g 0 A g 1 0 g 2 A g 2 0 g 1 A g 2 so the largest A is A min12 1 1 After the flow change the new 14 flow is 3212 3 3213 3223 1 3224 3234 2 3231 0 with value 4 The above example shows that in order to send more flow we need to consider path from 9 to t that has not only forward arcs along which we can increase flow but also backward arcs along which we can decrease flow Formally we say that a sequence of vertices 8111112 a 13 p 2 1 is an 31 augmenting path relative to an st flow 321 WMEA if either ulna1 E A scum lt emu or UTHm E A scumquot gt 0 i 12 p Such an augmenting path corresponds to a path in the usual sense in the residual digraph relative to xMLuMEA de ned as r A uw r 7w 6 A am lt Cm G O AI U A mm A uv vu E A arm gt 0 In words Gma is obtained from G by reversing the direction of any arc whose flow is at capacity and adding an opposite direction arc for any arc whose flow is positive and not at capacity In the above example the 14 augmenting path is 1 3 2 4 which corresponds to a path from 1 to 4 in the residual digraph Gm below in fact the only such path 3 Thus we can nd an st augmenting path on G by nding a path from 9 to t in Greg Such a path can be found using say a SP algorithm Section 42 with the arc weights of Gma set to zero or one We formally describe the algorithm below attributed to Ford and Fulkerson in 1956 MAXFLOW Input A digraph G V A with nonnegative capacities cm for um E A and distinct 815 6 V Output An st flow of maximum value and an s t cut of minimum capacity 0 Initialize 32m 0 for um E A 19 lt 0 and augment flow H Find an st augmenting path P relative to 321 WMEA am lt xm A for um e A in P arm 32 A for um E A in P 9 9 A 46 min C1111 331w 3 min 11116A in P mumL in If no such augmenting path can be found then If is not reachable from 9 in Gma 1221 U A7 stop where A min arm return to Step 1 P 321 WMH is an st flow of maximum value Let S v E V v is reachable from 9 in Gma 5 VS is an st cut of minimum capacity Applying MAXFLOW to Example 511 yields the following sequence of 14 flows and 14 augmenting paths 32123213x23a24x31 334 19 14 augm path A 000000 0 1234 min3 02 02 02 202002 2 1 24 min3 22 0 1 302102 3 1324 min1 O22 11 3 1 1 2 0 2 4 none so stop The residual digraph for each 14 flow is shown below together with the augmenting path used Gm 2 2 2 2 3 3 3 S 3 1234 124 1324 The nal 14 flow has value of 4 Only 1 is reachable from 1 in Greg so 5 1 and the 14 cut found by MAXFLOW is 1 234 12 13 with capacity of 1 3 4 By weak duality Fact 511 4 is the maximum flow value and the minimum cut capacity Notice that 123 4 24 34 is also an 14 cut of minimum capacity though MAXFLOW would not nd it If the arc capacities cm are all integers as in the case of Example 511 then it is not dif cult to that the arc flows 32m and value 19 would remain integer at all iterations of MAXFLOW Each pass through Step 1 is an iteration Then A would be a positive integer at all iterations so the flow value 19 increases by at least 1 at each iteration Thus MAXFLOW must stop after a nite number of iterations whenever a flow of maximum value exists ie the maximum value is nite The number of iterations in general would depend on the size of the arc capacities MAXFLOW would still terminate nitely if the arc capacities are rational numbers This is because we can scale the capacities by a common denominator to make them integer which does not affect nite termination However if the arc capacities are irrational then there exist examples on which MAXFLOW may perform an in nite number of flow updates But if we choose the augmenting paths more carefully and in particular always choose an augmenting path with fewest number of arcs which can be found by nding a SF in the residual digraph with arc weights set to 1 then it can be shown that the number of iterations is at most VHA regardless of whether arc capacities are rational When MAXFLOW stops we have that t g S v E V v is reachable from 9 in Gma Moreover there is no arc in Gma from S to VS This can happen only if 321 Cm for 11v E 5 VS arm 0 for 1311 E VSS 47 and it is not dif cult to that this implies 19 CS see Exercise 512 so that by weak duality Fact 511 49 is maximum st ow value and CS is minimum st cut capacity Exercises 521 522 Consider a digraph G 1721 with nonnegative arc capacities CM um E A and a vertex 9 E V For each path P in G de ne the capacity of P to be the minimum of the capacity of all the arcs in P Thus if P 111112 u1 then the capacity of P is mincuw2cupup1 Indicate how you would modify SP Dijkstra so to nd a path from 9 to each vertex v reachable from 9 that is of mrwimum capacity Hintr Replace minimum and in SP Dijkstra by suitable arithmetic operations 53 MATCHING AND COVERING IN BIPARTITE GRAPHS How best to pair up boys with girls or students with tutors or people with jobs or if not all pairings are compatible This is a classical problem of matching on bipartite graphs which as we shall is closely related to the maximum ow problem More precisely a graph G 1721 is bipartite if we can partition V into two nonempty subsets X Y such that all arcs in A go across X and Y only ie XUYV XDYID 11vEA 2 116XvEYoruEYvEX X Y Notice that a tree is bipartite More generally it can be shown that a graph is bipartite if and only if all its cycles have even number of arcs 1 2 1 2 1 2 1 2 3 4 3 4 6 7 5 3 D E 5 5 5 4 5 6 5 4 7 48 A matching of G is any subset of arcs M Q A whose end vertices are all distinct ie no two arcs in M share an end vertex We wish to nd a matching of maximum cardinality most number of arcs If X represent girls and Y represent boys and an arc u v means 11 and v are compatible then we wish to match up as many compatible pairs of boys and girls as possible Example 52 For the bipartite graph 1 4 2 GVA 3 6 7 some examples of matchings and their cardinalities are listed below M M ID 0 12 4 1 22 7 1 Is 2 the maximum cardinality of matchings How can we check this A covering of G is any subset of vertices W E V such that every arc in A is joined to at least one vertex in W We wish to nd a covering of minimum cardinality fewest number of vertices Example 52 continued Some examples of coverings and their cardinalities for the bipartite graph of Example 521 are listed below W W ID 0 1234567 7 123 3 4567 4 24 2 Is 2 the minimum cardinality of matchings How can we check this The above maximum matching and minimum covering problems are in some sense dual to each other In particular weak duality holds in that M j g W for any matching M and covering W Moreover equality holds if and only if M has maximum cardinality and W has minimum cardinality This can be proven from scratch but it is more easily shown by transforming the maximum matching problem into a maximum ow problem and then applying the results from previous sections This we do below using the bipartite graph of Example 521 for illustration First we direct the arcs in A from X to Y and assign capacity of 00 to them Then we add to two vertices 9 and t with arcs from 9 to X and from Y to t each with capacity of 1 This results in a capacitated digraph which we denote by G V A Thus V V U 8 t For the bipartite graph of Example 49 521 the associated capacitated digraph G is shown below 1 G V A Each arc aw in a matching M of G corresponds to one unit of flow sent on the arcs 811 11vvt in G ie am 6 M 4 33 32m 321 1 531 For example the matching M 3 4 2 7 corresponds to the 813 flow 3233 3234 3249 1 3232 3227 G also has st flows that do not correspond to matchings of G This occurs when some arc flow is fractional An example is 3231 3233 3214 3234 5 3249 1 with all other arc flows zero Each vertex 1 E X respectively v E Y in a covering W of G corresponds to arc 8 respectively v t in an st cut in G In particular the 815 cut in G corresponding to W is 5 V S where S 9 U XW U Y D W or equivalently W XS U Y D 5 532 For example the covering W 24 corresponds to the 815 cut 8 134 25 67 t 82 4 t Its capacity of 2 equals the cardinality of W V 4 E0 G also has st cuts that do not correspond to coverings of G This occurs when some cut arc has 00 capacity An example is 1 2 34 5 6 7 t 8 3 1 2 1 7 2 4 2 2 6 2 Since each matching M of G corresponds to an st flow in G with value equal to M and each covering W of G corresponds to an st cut in G with capacity equal to jW the following weak duality result follows from Fact 511 Fact 531 Weak Duality For any matching M and any covering W of a bipartite graph we have 1W 2 1M Since the capacities of arcs in G are either 1 or 00 and its maximum flow value is at most minUX jY it cannot exceed the total capacity of arcs out of 9 which is jX or the total capacity of arcs into t which 50 is Y the MAXFLOW algorithm applied to G will stop after a nite number of iterations and output an integer st ow and an st cut whose respective value and capacity equal Since arcs from 9 to X and from Y to t have capacity 1 the ow on each arc must be either 0 or 1 and this st ow corresponds to a matching M of G via 531 Also the st cut has nite capacity so it cannot contain any arc with 00 capacity Therefore it corresponds to a covering W of G via 532 Moreover equal ow value and cut capacity means M W so Fact 531 implies M is maximum and W is minimum A consequence of this is the following classical result of P Hall in 1935 characterizing when all vertices on one side can be matched Fact 532 Philip Hall 5 Theorem A bipartite graph X U YA has a matching M with M X if and only if U g NAU for all U Q X2 Proof If Assume U g NAU for all U Q X We know there exist matching M Q A and covering W E V such that M lW Since W is a covering each arc in A that has one end not in W must have the other end in W Thus NMXW E Y D W and hence NAXW g Y D W Also by assumption XW g N1XW Thus XW g Y OW This yields X XW X 0 W g Y 0 W X 0 W W M Since M is a matching so that M g X also we must have M lX Only if Assume X U YA has a matching M with M X For any U Q X since M g A we have NMU Q NMU and hence NMU g NAU Since M is a matching NMU lU Thus W S INMWl Fact 532 equivalently says that there does not exist a matching M with M j X if and only if there exists a U Q X with U gt NAU This allows for easy veri cation of the nonexistence of such a matching For the graph in Example 511 we have for U 13 that NMU 4 so U 2 gt 1 NAU Thus by Hall s theorem vertices in X cannot all be matched If the graph is not bipartite then nding a matching of maximum cardinality becomes more involved though still not so dif cult and must take into account the presence of cycles with odd number of arcs Exercises 531 For each positive integers k and I let GM denote the bipartite graph with k vertices on one side 1 vertices on the other side and an edge joining every vertex on one side to every vertex on the other side Give all values of k and l for which GM has a closed eulerian path Explain your answer 2 NAU Uueuh E Y am 6 A denotes the set of Y vertices joined by arcs in A to U 51 54 MINIMUM COST FLOW Both the shortest path problem and the maximum flow problem are themselves special cases of a discrete optimization problem called the minimum cost ow MCF problem This problem has many realworld applications For example when I visited the National University of Colombia in Medellin in the late 1980 s they had a computer lab of PCs As each PC can be used by only one student each student had to ll out a form indicating his her preferred times for using the lab per week and the students were assigned to lab times based on their preferences The assignment was done by hand and hence was slow and inef cient We formulated this problem as an MCF which we then solved using a computer algorithm that I codeveloped with my former thesis advisor Let us motivate the problem with an example Example 54 Consider a situation whereby 5 units of a certain product say wheat is to be shipped from city 1 to city 5 via cities 2 3 4 The cities are joined by 1way roads as shown and on each road there is a capacity limit on the number of units that can be shipped and a per unit cost for shipping the product We wish to nd a way to ship the 5 units at minimum cost 15 2 1 55 1 13 314 If we introduce for each road from city 11 to city v the variable 32m number of units shipped from u to v then this problem may be written as the following linear program LP with 8 variables 5 equations and 16 inequalities min 153312 1313 11324 1325 13332 1334 53345 subject to 3212 3213 5 3312 3324 3325 3332 0 3313 3332 3334 0 3324 3334 3345 0 3325 3345 5 US312 430 13 330 24 230 25 130 32 330 34 SLOSJM 5 st is an abbreviation of subject to In the general MCF problem we are given a digraph G V A and for every arc um E A two nonnegative numbers cm capacity and wm cost per unit of flow We are also given two distinct 52 vertices 8t E V and a nonnegative number 2 supply at 9 or equivalently demand at t u An st flow armhwge A on G is feasible if its value equals 2 and we de ne its cost to be Zmi e A wave An st flow is de ned as in Section 51 We wish to nd a feasible st flow whose cost is minimum that is less than or equal to the cost of all other feasible st flows This problem may be written as min 263106 A wave St Z111L1VEA 331w Z1v111LEl 3W I if u 9 0 else 0 g arm g CM for all um E A which is a special kind of LP with A variables V equations and 2A inequalities Recall that an LP is the problem of minimizing maximizing a linear function subject to the variables satisfying linear equations and linear inequalities As we mentioned earlier both the shortest path problem and the maximum flow problem are special cases of the MCF problem which we show below Shortest Path Consider the shortest path problem see Chapter 4 of nding the minimum weight path in a digraph V A with arc weights wm uv E A from a vertex 9 to another vertex t This problem is equivalent to the MCF problem on V A with capacities cm 00 um E A weights um um E A and supply I 1 at 9 and demand I 1 at t Intuitively the cheapest way to send 1 unit of flow from 9 to t is via the minimum weight path from 9 to L M u s t s K t v v SP problem equivalent MCF problem Maximum Flow Consider the maximum flow problem of nding an st flow in a digraph V A with nonnegative capacities CM 11 v E A whose value is maximum We can transform this as a MCF problem as follows Let 9J1 0 for all um E A Let A A U 15 8 cw 00 and w 1 Then this problem is equivalent to the MCF problem on V A with capacities CM um E A arc weights um um E A and supply I 0 at 9 and demand I 0 at t Intuitively since wags 1 and all other arc weights are zero minimizing the cost of a feasible st flow on V A is equivalent to maximizing the flow on 58 which in turn is equivalent to maximizing the external flow from 9 to t in V A 53 v Max ow problem equivalent MCF problem Maximum Wei 39ht Matchin 39 Consider a bipartite graph G X U YA with X Y and nonnegative arc weights wm uv E A We wish to nd a matching M of cardinality X whose weight wM Z wm is maximum In the special case where am 1 for all um E A we have wM M and the reduces to that of nding a matching of maximum cardinality If X represents girls and Y represents boys then 9J1 can be interpreted as the level of compatibility between girl 11 and boy v and the problem is to maximize the average compatibility of the matched pairs Unlike mzmimumcardinality matching it cannot be transformed into a maximum flow problem However it can be transformed into a MCF problem as follows We construct the associated capacitated digraph G V A as in Section 52 and we assign a weight of 9J1 for each arc uv from X to Y and assign zero weight to all arcs from 9 to X and from Y to t Max weight matching Equivalent MCF problem It can be shown that a matching of G of maximum weight corresponds to an integer feasible st flow on G of minimum cost Since the MCF problem is an LP we can solve it using any algorithm for solving LP such as the simplex method or dual simplex method Many algorithms have been proposed One such algorithm extends the MAXFLOW algorithm in Section 52 which ties in nicely with what we have seen so far We sketch its idea We start with the zero st flow ie 32m 0 for all u v E A Then as in the MAXFLOW algorithm we successively augment flow from 9 to t along st augmenting paths until its value equals I The only difference is that each st augmenting path is chosen specially namely it corresponds to not just any path from 9 to t in the residual digraph Gm V A U A from but a shortest path where the weight of each arc uv 6 At is wm and the weight of each arc u v E A is wm Notice that arc weights can be negative so Gma can have a cycle of negative weight in which case we instead augment flow around this cycle and repeat this until no cycle in the residual digraph has negative weight It can be shown that augmenting flow along a cycle of negative weight corresponds to a simplex pivot The SP paths can be found using either BellmanFord algorithm Section 42 or FloydWarshall algorithm Section 43 These algorithms can also 54 nd the negativeweight cycles when suitably modi ed The MCF problem has the nice property that if it has a minimum cost flow and the supply demand I and the arc capacities cm are all integer then it has an integer minimum cost flow 55 ADDITIONAL TOPICS A great deal of research have been devoted to understanding the structures of and designing fast al gorithms for solving shortest path maximum flow maximum weight matching also known as the optimal assigment problem and MCF problems These problems can also arise in unexpected places In some applications a fraction of the flow on an arc may get lost For example in Example 541 suppose that a fraction 1 of the flow on arc 1 2 is lost Then the flow conservation equations at vertices 1 and 2 would change to 3312 3313 5 93312 1324 1325 1332 0 A digraph whose arc flows are attenuated in this manner is called a gain network where the fraction 9 is called the gain of arc 12 The resulting MCF problem has a more complicated structure though it is still a special case of an LP In the MCF problem the incoming flows into a vertex are mixed and sent out on outgoing arcs Thus the flows are in a sense of the same kind so they can be mixed In many applications flows of different kinds may share the same arcs but they cannot be mixed For example apples sent from Washington to Georgia might share some routes with potatoes sent from Idaho to Florida but they cannot be mixed they are not interchangeable to consumers Such problem is sometimes called a minimum cost multicommodity flow MCMF problem Different commodities may have different costs but they must satisfy the same flow conservation constraints and they share the capacity constraints As an illustration consider the 2 commodity flow problem shown below with the commodity unit costs and the capacity shown for each arc Here 3 units of commodity 1 is to be sent from vertex 1 to 4 and 1 unit of commodity 2 is to be sent from vertex 2 to 4 19 3 071 Letting 32 number of units of commodity k shipped from u to v the problem can be written in the form min 632i2 32i3 232 3 32 4 322 9323 2333 2334 132 4 st 323 1 1 3 3212 3223 3224 0 32i3 32 3 1 324 0 3324 3334 3 32 2 2 0 3212 3223 3224 1 2 2 3313 3723 3734 0 3334 x 4 1 0 32i232f2 3 0 xigxf2 10 32332 3 20 32432 4 20 32 432 4 2 The MCMF problem is still a special case of LP However due to its large size real world problems can have thousands of vertices and tens of commodities specialized algorithms are needed to solve it fast If the ows are further required to be integers then the MCMF problem becomes much harder belonging to the class of NPhard problems 6 Integer Program Oftentimes in practice we encounter a linear program LP whose variables are further required to be integervalued For example when a manufacturing company decides on how many factories to build or when an airline decides on how many planes of each type to purchase the answer must be integers Such a problem is called an integer program 11 for short More precisely a IP is the problem of a linear function the cost function subject to the variables being integervalued and satisfying a system of linear equations inequalities In general an IP is much harder to solve than an LP but as we shall we can in some sense solve an IP by solving a sequence of LP s 6 DEFINITIONS AND PROPERTIES Let s some examples to get a sense of how I can arise in practice Example 61 Suppose that we have three objects 1 2 3 with weight of 3 lb 2 lb 4 lb and value of 1 2 3 respectively We also have a boat that can carry at most 5 lbs of cargo small boat What combination of these objects should we put on the boat so to maximize the total value of the cargo 2 3 max weight of 5 lbs 1 By introducing the variables N xi 1 if we take object 2 i 13233 else 0 we can formulate this problem as the max 1 max 31 232 333 51 3131 2132 4133 lt 5 0 g 3 g 1 integer 2 123 Notice that we have expressed the constraint 339 6 01 as 0 g 339 g 1 and 339 is integer Suppose that in addition we are told that objects 1 and 2 cannot go together How can we model this Answer We add to the above I the linear inequality 31 32 g 1 So if 31 1 then 32 must be 0 and if 32 1 then 31 must be 0 Suppose that instead we are told that exactly one of objects 1 2 and 3 must be on the boat How can we model this Answer We add to the above I the linear equation 331332333L 57 So either 31 132 33 0 or 32 131 33 0 or 33 131 32 0 Thus we that an IP is very good for modeling logical constraints eg if event A occurs then event B cannot occur Example 612 Yankee airline is deciding how many Airbus A3000 and how many Boeing 777 to buy Each Airbus A3000 costs 5 million and has a service life of 5 years each Boeing 777 costs 9 million and has a service life of 8 years not the most realistic gures Yankee estimates they need to buy only 6 jets in total and their purchasing budget is 45 million What should Yankee do so to maximize the total service life of its new fleet of jets k1 By introducing the variables 31 number of Airbus A3000 bought 32 number of Boeing 777 bought we can formulate this problem as the max 1 max 531 832 st 31 32 g 6 531 932 g 45 31 32 2 0 integer For an 1 we will call any integer value of the variables that satisfy the system of linear equa tionsinequalities in the I a feasible solution of the 1 For the IP in Example 611 31 32 33 0 is a feasible solution as are 31 32 133 0 and 31 32 033 1 but 31 32 33 1 is not For the IP in Example 612 31 32 0 is a feasible solution as is 31 132 4 but 31 132 5 is not We call the set of all feasible solutions the feasible region of the 1 Thus a max respectively min I is the problem of nding a feasible solution whose cost is greater respectively less than or equal to the cost of all other feasible solutions Such a feasible solution if it exists will be called an optimal solution of the IP and its cost will be called the optimal cost of the 1 Finding an optimal solution will be our main goal Example 612 continued For the IP in Example 612 the feasible region comprise the integer points indicated by the black dots inside the shaded region And as we shall shortly this IP has an optimal solution at 31 032 5 and the optimal cost is 531 832 40 So Yankee should buy 5 Boeing 777s and no Airbus A3000f 62 BRANCH AND BOUND ALGORITHM How do we solve an IP A naive way is to check all possible integervalues for the variables discarding those that do not satisfy the equationsinequalities and amongst those remaining pick one that yields the maximum minimum cost if we are maximizing minimizing the cost function For Example 612 say we would rst x 31 0 and try 32 012 etc then we would x 31 1 and try 32 012 etc then we would x 31 2 and try 32 012 etc and so on This of course is very slow It turns out that IP belongs to a famous class of hard problems called NPHard problems for which no fast solution method is known In contrast an LP is considered to be fairly easy Nonetheless there are methods for solving IP s that at least on small to mediumsized problems say fewer than 300 variables and fewer than 500 equations inequalities work reasonably well The best known of them is branch and bound invented by Ralph Gomory around 1960 and used by many computer packages such as LINDO which we describe below For any IP if we ignore the requirement that the variables be integervalued we obtain an LP which we will call the LP relaxation of the IP The feasible region for this LP clearly contains that for IP The idea of branch and bound is to start by solving the LP relaxation of IP and whenever one of the variables is not integer in the optimal solution we add either an 2 inequality or an g inequality to the LP so to rule out this noninteger value for the variable Thus we will solve a sequence of LP s For simplicity we will describe the procedure for solving a max IP only A min IP can be reduced to this case by negating its cost function Branch amp Bound BampB Input A max IP with variables 31 3 and whose LP relaxation has an optimal solution Output An optimal solution of the max IP if there is one P Let L0 be the LP relaxation of max IP Initialize m 0 Go to Step 1 H If L0 L1 Lm have all being solved stop Claimr If one or more of these LPs has an integervalued optimal solution then the integervalued optimal solution having the maximum cost is an optimal solution of the max IP Else the max IP has no optimal solution Else pick some Lc k 6 01 m that has not yet been solved We solve Lc using any suitable method such as the simplex method or dual simplex method and either i Lc has no optimal solution or ii Lk has an optimal solution 313 n and 313n are all integers or iii Lc has an optimal solution 313 n and at least one of 31 3 is not an integer In cases i and ii go to Step 1 in case iii go to Step 2 3 7 Branching on a variable Choose any index j E 1 n such that 37 is not integer Let Lm1 be L with the inequality 37 2 37 added and let Lm2 be Lc with the inequality 37 g 37 added Increment 59 m by 2 and return to Step 1 Example 612 continued Let us apply the branch and bound procedure to solve the max I of Example 612 Step 0 L0 is max 531 832 st 31 322 g 5131 9132 S 45 331 33220 Initialize m 0 Go to Step 1 Step 1 Upon solving Lg we obtain the optimal solution 31 2 32 neither of which is integer Thus we go to Step 2 Step 2 We can choose j 1 or 2 say j 1 Either choice works though the work involved may differ We let L1 be Lg with the inequality 31 2 3 added and let L2 be Lg with the inequality 31 g 2 added Increment mby2tom2andgoto Step 1 Step 1 We can pick either L1 or L2 to solve say L We obtain the optimal solution 31 3 32 3 both of which are integer so we go to Step 1 Step 1 Only L2 has not been solved Upon solving L2 we obtain the optimal solution 31 2 322 3 the second of which is not integer Thus we go to Step 2 Step 2 j 2 and we let L3 be L2 with the inequality 32 2 4 added and let L4 be L2 with the inequality 322 g 3 added Increment m by 2 to m 4 and go to Step 1 We continue in this manner until we reach m 8 at which time we stop The calculations are shown in the last two pages The branching on variables can be summarized graphically in the form of an 8amp8 tree shown below The boxes are called the nodes of the tree Upon comparing all the integervalued optimal solutions we found we that 31 0 322 5 has the maximum cost of 40 so this is an optimal solution of the IP Pheww add my NM x1lt1 L5 X7 Z 58 L 4 154 5 no feasible solution 6 k 9 Z 409 X11 X1 add XI 9 Ndd XZlt 4 X2 0 X2 4 58 05 Z 558 14 237 L7 L8 X11 X1 X11 X1 It is not hard to show that the 8amp8 procedure works and that the number of repetitions of Steps 1 and 2 is nite A precise proof is a bit involved so we will not give it here Notice that the amount of effort in the 8amp8 procedure depends on m the number of LPs we need to solve which can be 2quot in the worst case 61 We can reduce this number in a number of ways see Section 64 for further discussions 1 Bounding When we are in case iii of Step 1 if the optimal cost of L9 is less than the cost of some feasible solution of the max IP then go to Step 1 rather than Step 2 All feasible solutions of Le are worse than this feasible solution of the IP so none of them can be an optimal solution of the IP and hence Lc can be discarded 2 In Step 1 pick L9 to be an LP whose optimal cost is the highest possible Since the optimal solution of the IP is more likely to be in the feasible region of such an LP We can further reduce the effort by using the dual simplex method to resolve the LPs created in Step 2 Each new LP is created by adding an inequality to an LP we just solved The dual simplex algorithm is very ef cient at resolving an LP after an equation inequality is added If an LP has more than one optimal solution then we should seek a corner point optimal solution since such an optimal solution tends to have fewer fractional components Integer programming is a fascinating subject with many unexpected results For example can we approximate the optimal solution of an IP by rounding off the optimal solution of its LP relaxation The answer is alas no Consider the IP max 321 200322 st 321 100322 lt 10 321 322 2 0 integer The LP relaxation has the optimal solution 321 0322 01 which rounds off to 321 322 0 The IP has the optimal solution of 321 10322 0 xl 100x 2 10 0 1 1 100 T For another example if IP has an optimal solution must its LP relaxation have an optimal solution The answer again is no Consider the IP max 331 st 321 322 0 321 322 2 0 integer Because 2 is irrational there can be only one integer point on the line 321 322 0 namely 321 322 0 Thus IP has this as its optimal solution However its LP relaxation clearly has no optimal solution On the other hand if all coef cients are all integer or more generally rational then this cannot happen In that case if IP has an optimal solution then so does its LP relaxation Or put it in another way if its LP relaxation has no optimal solution then neither does IP which is obviously very useful to know Exercises 621 Consider the following IP in 2 variables max 31 532 st 331 232 g 18 4a1 332 g 6 331 2 0 332 2 0 31 32 integer Apply the branch amp bound algorithm to this IP to determine if it has an optimal solution and if so to nd an optimal solution 622 Consider the following IP in 2 variables max 331 332 st a1 232 331 V Hid O 4552 3 31 32 integer Apply the branch amp bound algorithm to this IP to determine if it has an optimal solution and if so to nd an optimal solution 623 a Give an example of an IP that has exactly two feasible solutions Give an example of an IP that has an in nite number of optimal solutions b Consider a max IP whose cost function is of the form C1331 charquot where C1Cn are integers Suppose we solve the LP relaxation of this IP and obtain an optimal solution 321321 with non integer cost ie C133 tonic is not an integer Write down a linear inequality that when added to the LP relaxation will i exclude 32 af from its feasible region and ii not exclude any optimal solution of the IP 63 CUTTING PLANE ALGORITHM The branch amp bound algorithm creates two new LPs from each LP whose optimal solution is noninteger An alternative algorithm proposed by Ralph Gomory and others in the late 1950s and 1960s creates only one new LP Thus it has the advantage that we only need to keep track of a single LP at all times This algorithm like branch amp bound repeatedly adds new linear constraints to the LP to cut away the current noninteger optimal solution A valid cut of an LP is a linear inequality that is satis ed by all integer feasible solutions of the LP To simplify the discussion we assume that the IP has the standard form min 1 33 st A3 I a 2 0 integer 631 63 where A E 27 I 6 27 c E 2quot and we assume that A has linearly independent rows Its LP relaxation is I min 6 3 st A3 12 6 32 3 2 0 Cutting Plane Input An IP in standard form 631 Output An optimal solution of the IP if there is one 0 Solve the LP relaxation If LP is infeasible stop IP is infeasible If LP is unbounded stop IP has no optimal solution see Exercise 631 Else LP has an optimal BPS 32 go to Step 1 1 If 32 is integer stop 32 is an optimal solution of IP since feasible solutions of IP are feasible for LP Else 32 is noninteger Add to LP a valid cut that is violated by 32 return to Step 1 How can we construct the desired valid cut Let 32 be any noninteger optimal BPS of 632 Then 32 Aglb 0 for some B Q 1 n and N 1 nB We describe two wellknown choices of valid cuts 1 Any feasible solution 3 of 631 satis es A5325 Z A7327 1 jeN where A7 denotes the jth column of A If 3w 0 then 35 A 32 would be fractional So it must be that 32N 35 0 Since 37 is integer and nonnegative for all j E N this implies Z 37 2 1 633 jeN so 633 is a valid cut Moreover 32 violates 633 2 Gomory cut Any feasible solution 3 of 631 satis es A5325 Z A7327 1 jeN 4 35 Z Ag1A7327 Aglb jeN 4 3352 2 517337 3955 i 1m jeN 64 Where 617 AglAjL and EU is the ith element of B Since 32 is noninteger 325 is fractional for some 2 Since 37 2 0 for allj E N We 215wa S We 2 WW Eltigt jEN jEN Since 3255 and arrv are integer this implies 9951 2 WWW S HistM 634 jEN so 634 is a valid cut Moreover for any i E 1 such that saga is fractional we have Largmj lt 3236 implying that as violates 634 Example 63 We illustrate the Gomory cuts Suppose the original I is min 31 232 st 4a1 6322 9 131 132 4 31 2 0 32 2 0 integer Transform this 1 into standard form min 331 232 st 4a1 632 333 9 331 332 324 4 31 2 0 34 2 0 integer Its LP relaxation has optimal BFS as 152500 corresponding to basis B 1 2 Then 46 110 16 115 Ala 1 il ABA 0 1 1 ABI 25 The corresponding optimal tableau is Both 15 and 25 are fractional so we can use either the 1st or 2nd equation say the 2nd The 2nd equation is 32 1323 4324 25 so the Gomory cut is x2lt2 or equivalently xvi 3352 33520 which we add to LP The new LP is in standard form The new optimal BPS is as 75201250 The 1st equation in the new optimal tableau is 31 25323 153 75 The Gomory cut is 131 133 135 E 0 or equivalently 31 x3x5x6 0 36 20 which we add to LP The new optimal BPS is as 121100 Since 32 is integer stop as is optimal for the transformed IP 1 2 is an optimal solution of the original IP 73x1 5x2 7 1525 752 1 2 xl xZ 4 7 Gomory cuts 322 g 2 and 31 x3x5 g 0 4 31 94x1 6x2 HQ 32 g 0 4 3a1 5322 g 7 When a new Gomory cut is added we can resolve LP using the dual simplex method starting with the optimal tableau of the previous LP It can be shown that the cutting plane algorithm terminates after a nite number of Gomory cuts with an optimal solution of the IP However its practical performance is not good Exercises 631 Consider the IP 631 with integer coef cients A 26 Suppose that its LP relaxation is unbounded Prove that the IP has no optimal solution Hint Show that there exists an 2 E Zquot satisfying C lt 0 A 0 z 2 0 64 ADDITIONAL TOPICS In practice the best algorithm for solving IP is obtained by combining branch amp bound with cutting plane namely add valid cuts to each LP generated by the branch amp bound algorithm to cut off more fractional solutions This yields the branch amp cut algorithm 66 There are many ways to generate valid cuts not necessarily always from an LP optimal solution For example the IP max 331 332 333 st 31 32 g 1 331 333 g 1 32 323 lt 1 31 2 0 32 2 0 33 2 0 integer has three optimal solutions 100 010 001 with optimal cost 1 Its LP relaxation has optimal solution 5 5 with optimal cost 15 If we sum the three inequalities and divide by 2 we obtain 131 332 333 E 15 If 321322323 are integer then the lefthand side is an integer below 15 so it must be below 15 1 Thus a valid cut for the IP is x1x2x3 1 If we add this constraint to the LP relaxation its optimal cost becomes 1 the same as that of IP In general we can generate valid cuts by taking positive weighted sums of inequality constraints and suitably rounding up or down the lefthand and righthand coef cients There are other ways to generate valid cuts by exploiting the problem structure For example when a TSP is formulated as an IP we can generate valid cuts based on properties of the traveling salesman tours 7 Computational Complexity Problems are not created equal Some are harder than others maybe much harder For example it is fairly easy to check whether a graph has an eulerian path see Section 26 but no one has found an easy way to check whether a graph has a hamiltonian path see Section 27 Imagine that you have been slaving away for days at some discrete optimization problem that your boss assigned you and have made no progress in solving it Perhaps it is not you but the problem is just too dif cult But how can you persuade your boss that the problem is dif cult The Problem was too hard And he has a SW7 00 PhD C 39 I I need i 3 proof We need a way to quantify the dif culty of a problem If the problem is dif cult then instead of trying to solve it exactly we might settle for nding approximate solutions We study this topic in this chapter 7 Problem Size and Method Ef ciency For the following three LPs which looks to be the hardest min 331 2322 st 31 a2 L131 31 2 0 32 2 0 min 3131 4332 3100 st 2331 5x2 33100 5 L132 a1 4322 6x100 2 331 20 332 20 3310020 min 3215331 14451332 7134x100 st 124331 345x2 4376433100 523 LPg 997x1 4363 158x100 213 331 20 33220 33100 20 If you think that LPg looks hardest it is probably because LP3 has more variables and constraints than LP1 and its coef cients have a wider range than LP2 Let s quantify this intuition below For an LP in standard form min 1 33 st A3 I as 2 0 711 68 with A 120 having integer coef cients the size also called length and denoted by L is the number of binary ie 01 bits digits needed to represent the data A I c The size estimates the length of the data le when stored on a digital computer If A I c have rational coef cients we can always multiply them by a common denominator to make them integer We do not consider irrational coef cients since an in nite number of binary digits is required to represent them For L131 data are 321 14 3 120 121 11b3 2020 4 121 1 01b3n 1 120 11 1 120 13 41120 1121 122 001b3n So size of L131 is L3322414 In general we need a l10g2 W 2 binary bits to store an integer a Then the size of LP 711 is m n m n L Z 2 20 2ltCjgt 21 71 21 71 It is readily seen that LP3 has larger size than L131 and L132 Intuitively larger the problem size L more dif cult is the problem But size alone does not determine the dif culty of a problem What really matters is the total time taken by a method algorithm to solve the problem on a digital computer Turing machine and in particular how fast this time grows with L Does it grow linearly with L Or polynomially Or exponentially Examples of each type of growth is shown in the table below L 10 100 1000 1000L linear 104 105 106 10L3 poly 104 107 1010 2quot expon z 103 z 1030 z 10300 Linear and polynomial growth are OK Exponential growth is clearly bad it would take forever to solve a problem even if its size is only 1000 72 JV and NPHard Problems A problem is easy if we know of a method algorithm that can nd a solution or determine no solution exists in time that is polynomial in L The class of such polynomialtime solvable problems is denoted P Assuming that each operation 1 gt lt takes time that is polynomial in L which is true for the algorithms studied in earlier chapters and is typically true in general the total time would be polynomial in L whenever the total number of operations is polynomial in L Below are some examples Consider the MST problem of Chapter 3 We are given a graph with vertices V 1n arcs A C V x V and integer weights wm um E A Rational weights can be scaled up to be integer We wish to nd an MST The data are am 11 v for u v E A The problem size is L Z 11 v M1 quotmay We saw that the MST can be solved by say Prim s algorithm in about n 1A g L2 operations Since L2 is a polynomial function of L MST is in P Consider the SP problem of Chapter 4 We are given a digraph with vertices V 1n arcs A C V X V nonnegative integer weights um um E A and 9 E V Rational weights can be scaled up to be integer We wish to nd a SP from 9 to all v E V reachable by a path The data are MW 11 v for um E A The problem size is L Z 7 v wmd 8 quotmay We saw that the SP problem can be solved by say Dijkstra s algorithm in about n 1A g L2 operations Since L2 is a polynomial function of L SP is in P a For the LP in standard form 711 with integer coef cients the simplex method can in the worst case take time exponential in L to solve LP However there exist other methods called ellipsoid method and interior point method that can solve LP in time polynomial in L Thus LP is in P The maximum flow problem and minimum cost flow problems of Chapter 5 are special cases of LP so these problems are also in P On the other hand no one has found polynomialtime algorithms for solving IP or TSP or determining the existence of a hamiltonian cycle So these problems seem hard Moreover these problems are often related in that if one of them is in P then so is the other For example the problem of determing whether a graph G V A has a hamiltonian cycle is equivalent to the TSP on complete graph with arc weight wm 0 if um E A and aim 1 if um g A G has a hamiltonian cycle if and only if there is a TSP with zero total weight Notice that the size L of this TSP is quadratic in the size L of the hamiltonian 70 cycle problem ie L g L23 Thus if the TSP is in so that there exists an algorithm to solve it in at most 12L time where p is some increasing polynomial function then by applying this algorithm to the aforementioned TSP we obtain an algorithm that solves the hamiltonian cycle problem in at most 12L2 time D g L2 and p is increasing so 12L g pL2 which is a polynomial in L Intuitively the hamiltonian cycle problem is a special case of TSP so it is no harder than the TSP Thus if TSP is easy to solve then so is the hamiltonian cycle problem Equivalently if the hamiltonian cycle problem is hard then so is the TSP In general if a problem can be reduced in polynomial time to another problem with at most a polynomial increase in the size then we say that the rst problem is polynomialtime reducible to the second problem The hamiltonian cycle problem has an important feature namely if the problem has a yes answer ie the graph has a hamiltonian cycle then there is a short certi cate to con rm this In particular the cycle itself in the form of a sequence of vertices and arcs would be the certi cate Given such a sequence we can check in polynomial time that indeed every vertex appears in it exactly once except at the start and end and that every arc is in the graph The size of the sequence is also polynomial in the size of the problem The class of problems for which a positive answer can be veri ed in polynomial time using a polynomialsized certi cate is denoted N short for nondeterministic polynomial since such a problem can be solved in polynomial time by something called a non deterministic Turing machine The class of problems for which a negative answer can be veri ed in polynomial time using a polynomialsized certi cate is denoted coN Thus the hamiltonian cycle problem belongs to N but is not known to belong to coN since we know of no short certi cate for nonexistence of hamiltonian cycle LP with integer coef cients belongs to N5 as well as co N In general any problem in belongs to N Another wellknown problem belonging to N is the 3 SAT problem Given a logical expression of n variables with 3 variables per clause 31 ngVx7 ing3Vx5 does there exist 31 2 E TF to make this expression T In 1970 Steve Cook proved a remarkable result that every problem in N is polynomialtime reducible to the 3SAT problem Thus 3SAT is in some sense the hardest among all problems in N Moreover people notice that many other problems are at least as hard as 3 SAT in the sense that 3 SAT is polynomial time reducible to these problems Among these are IP TSP hamiltonian cycle problem shortest simple chde problem is the graph G V A so its size is L Z 11 v quotmay We need A 2 V in order for G to have a hamiltonian cycle so we can assume without loss of generality that A 2 V and hence L 2 A 2 V The data of the TSP are the WV arc weights each of which requires 1 binary bit to represent so its size is L WV Thus L WV g L2 5 By strong duality any optimal solution of the LP has associated with it a feasible solution of the dual LP with equal cost This feasible solution of the dual LP is the certi cate With a bit of work using Cramer s rule we can show that the size of this certi cate is polynomial in the size of the LP 71 path problem allowing negative arc weights knapsack problem These problems are called NPhard So a problem is NPhard if the 3 SAT or any NPhard problem is polynomialtime reducible to it And a problem is NPcomplete if it is NPhard and in N 3SAT and hamiltonican cycle problem are NP complete I and TSP are NPhard but not known to be in NP If a stranger claims to have found an optimal solution of an 1 how can we verify this claim without solving the 113 On the other hand closely related yesno questions eg does the I have a feasible solution with cost below 19 are NPcomplete A famous open questions is whether NV Many false proofs have been claimed The bet is on a negative answer Another is whether N D co NV complete Thus if you can show that the problem your boss assigned you belongs to the class of NPhard problems by showing that a known NPhard problem is polynomialtime reducible to it then he she will know that no one else can solve the problem faster either roblem was N39Pncomplete No one else can solve it either We need to nd approximate solns What a genious 73 Approximate Solutions Traveling Salesman Problem If a problem is NPhard then we are unlikely to design a fast algorithm that is guaranteed to nd an exact solution always But if the operation of your company depends on this problem getting solved then we must settle for designing a fast algorithm that can nd exact solution most of the time or nd inexactsuboptimal solutions This topic has been extensively studied In this section we look at some of these strategies For concreteness we illustrate them on the Traveling Salesman Problem TSP 72 Consider the following complete graph with arc weights shown 1 a b We wish to nd a hamiltonian cycle tour of minimum weight Our aim is to nd a suboptimal tour quickly rather than nd an optimal tour since the problem is NPhard One way is the socalled greedy algorithm Starting with the cheapest arc we continue to add the next cheapest arc to our path as long as it maintains a simple path until all vertices are in the path This is in the spirit of Kruskal s algorithm for MST For the above weighted graph we would add say the arcs M Inc M C d 136 This creates the initial tour a I 6 16 a of weight 17 We next try to improve on this tour through a local exchange algorithm In a 2exchange algorithm we try to swap 2 arcs in the current tour with two arcs outside the tour to if this would decrease the weight For example if we swap 26 16 with I 6 66 the resulting tour has weight 15 If instead we swap a b Cd with a6 Id the resulting tour has weight 12 a 1 b a b 3 1 3 1 lt7 4 C C C C 6 6 d d b This 2arc swapping technique can be repeatedly applied to generate a sequence of improving tours In general one can use kexchange algorithm in which Is arcs are swapped k 2 3 4 Practical experience suggests that k 2 3 work best There are various re nements of this arcswapping idea using lookahead to predict which swaps are most promising However the tour found by these algorithms can in the worst case be far from optimal TriangleInequality Case If the arc weights satisfy the triangleinequality wm g wuk 9J1 for all u v k then it is possible to quickly nd a suboptimal tour that is never too far from optimal To illustrate suppose that the vertices are points in the plane and aim is the Euclidean distance between points 11 and v Euclidean distances satis es the trizmgleinequality We construct a tour as follows 73 TSPMST Find an MST Construct a closed path that starts at some vertex visits all vertices using only arcs in the MST with NH each arc used twice and returns to the starting vertex Convert the path to a tour by shortcutting intermediate vertices already visited 3399 39e illustrate this on an example of 7 vertices with the weight of each arc being the Euclidean distance between its two end vertices below MST pa l tour 1 2 3 4 5 6 a a 7 Let WMST W W denote the weight of the MST the path and the tour constructed above Then path 2 mu I39V 2 I V39MST pm h and by triangleinequality each time we shortcut an intermediate vertex the weight of the path is reduced Thus W lt W mu pmh Also removing any arc from an optimal tour yields a spanning tree Since the removed arc has positive weight the spanning tree has lesser weight than the optimal tour whose weight we denote by WTSP Being a spanning tree it has greater or equal weight than the MST This implies I V39TST 2 I vi vlST 39 Putting the above three relations together yields W lt Wm mu Thus the tour we constructed is guaranteed to be within a factor of 2 in weight from the optimal tour Since an MST can be found quickly Chapter 3 and the shortcutting can also be done quickly this is a pretty fast algorithm We can further improve on the tour using say an exchange algorithm N Christo des showed in 1970s that by also using minimum weight matching optimal assignment the factor 2 can be further improved to His algorithm constructs a tour as follows TSPChristofides 1 Find an MST N Let Had vertices of odd degree in MST There is an even number of these vertices Find a matching M E de x de ie each vertex in Ed has exactly 1 arc of M joined to it of 5 minimum weight 239 Adding M to MST gives each vertex even degree Construct a closed path 5quot Convert the path to a tour by shortcutting intermediate vertices already visited We illustrate this on the earlier example of 7 vertices below ST MST M M path tour 4 i g 5 R K Let WMST WM WW1 me denote the weight of the MST the matching M the path and the tour constructed above Then W WMST WM path and by triangleinequality we have W lt W tau path Also as we argued earlier I39V39rsr 2 I V39MST Finally by triangleinequality an optimal tour on all vertices has greater weight than an optimal tour on Kidd Moreover the arcs of an optimal tour on Had form two matchings Ml and Mg on Kidd As we move along this tour the 1st 3rd 5th arcs are in Ml and the 2nd 4th 6th arcs are in Mg etc Thus Wm gt W WM2 2 2W TSP an vmid Putting the above four relations together yields Ecur W lt git3973 Clever no For a long time was the best approximation factor achievable in polynomial time for the Euclidean TSP The drawback with Christo des algorithm is that it requires solving an additional matching problem Index Adjacency matrix Arc list Augmenting path Breadth rst search Branch and bound Branch and cut Cardinality coNP Connected component Covering Cutting plane Cycle Degree Depth rst search Digraph residual strongly connected Dual simplex method EPATH Exchange algorithm Flow feasible value Graph bipartite complete Greedy algorithm Indegree Integer program feasible solution feasible region 11 optimal solution 3 optimal cost 12 Linear program relaxation 46 Matching 15 maximum weight 59 MAXFLOW 66 Maximum flow problem 3 Minimum cost flow 71 13 Minimum cost multicommodity flow Minimum cut problem 49 MST 64 MSTarcswap o and 9 MSTKruskal 10 MSTPrim lo NP 3 46 NPcomplete 6 NPhard 62 Outdegree 20 P 73 Path 42 augmenting 53 closed 42 eulerian 3 hamiltonian 48 shortest 8 simple 73 Problem size 10 SP 4 and 9 46 5 and 9 21 33 5 and 9 69 34 SPDijkstra SPBellmanFord Subgraph Traveling salesman problem Triangle inequality TREE Tree spanning ertex Weight of a path of a spanning tree 33 23

### BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.

### You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

## Why people love StudySoup

#### "Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."

#### "I bought an awesome study guide, which helped me get an A in my Math 34B class this quarter!"

#### "Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."

#### "It's a great way for students to improve their educational experience and it seemed like a product that everybody wants, so all the people participating are winning."

### Refund Policy

#### STUDYSOUP CANCELLATION POLICY

All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email support@studysoup.com

#### STUDYSOUP REFUND POLICY

StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here: support@studysoup.com

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to support@studysoup.com

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.