Class Note for CMPSCI 611 at UMass(2)
Class Note for CMPSCI 611 at UMass(2)
Popular in Course
Popular in Department
This 22 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at University of Massachusetts taught by a professor in Fall. Since its upload, it has received 16 views.
Reviews for Class Note for CMPSCI 611 at UMass(2)
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 02/06/15
CMPSCI611 NP Completeness Lecture 17 Essential facts about NPcompleteness c Any NPcomplete problem can be solved by a simple but exponentially slow algorithm 0 We don t have polynomialtime solutions to any NP complete problem 0 We can prove that either all NPcomplete solutions have polynomialtime solutions or none of them do 0 It is generally believed that none of them do But proving this would require solving the P versus NP problem one of the best known unsolved problems in mathematics much less theoretical computer sci ence NPcompleteness is a property of decision problems It can be used to prove that other types of problems are NP hard which means that unless P NP they have no polynomialtime solutions These include search opti mization and approximation problems De nition The class P is the set of decision problems for which there exists an algorithm solving them in 0nk time for some constant 1 De nition A formal language A is in NP if there eXists another language B in P such that for any string 1 1 is in A iff there exists a string 3 with lyl m0 such that 1y is in B An equivalent more algorithmic de nition of NP is as follows An NPprocedure consists of a guess phase and a veri cation phase Given an input 1 the guess phase chooses an arbitrary string 3 such that lyl MO The veri cation phase is an algorithm that takes both 1 and y as input and returns a bit We say that 1 is in the language of the NPprocedure iff it is possible for the procedure to guess a 3 making the veri cation phase out put true There is an obvious deterministic decision procedure for any NP language simply cycle through all possible strings y and see whether the veri cation procedure accepts 1 y The problem is that there are 27pm possible y s of course Our question is whether there is a better way to decide membership in the NP language one that might run in polynomial time Proving a language to be in NP is generally simple We need to de ne a string to be guessed and de ne a poly time veri cation procedure that accepts the input and a guess iff the guess proves that the input is in the language We need to determine relationships among the languages in NP using the notion of reduction We want to show that if problem B is in P then so is problem A One way to do this is to describe an algorithm for A would run in polynomial time if it were allowed to make calls to a hypothetical polytime algorithm to decide membership in B This is called a Cook reduction For de ning NP completeness though we will need a slightly different notion of reduction Let A and B be two formal languages possibly over dif ferent alphabets A Karp reduction from A to B is a polytime computable function f such that for any string 51 1 E A if and only if f E B If such a reduction ex ists we say that A is polytime reducible to B and write A 3 B We also sometimes read this as A is no harder than B Two languages A and B are said to be pequivalent if both A 3 B and B 3p A The relation 3 is a partial order on the equivalence classes We are interested in the maximal equivalence class in NP De nition A language B is NPcomplete if 1 it is in NP and 2 for any language A in NP A 3 B Thus the NPcomplete languages if they exist are the hardest languages in NP It should be easy to see that they form an equivalence class and that if any NPcomplete language is also in P it follows that P NP If a language is NPcomplete then we have strong evi dence that it is not in P We can use the NPcompleteness of a language to talk about nondecision problems even though by de nition these cannot be NPcomplete De nition A problem X with boolean output or other wise is said to be NPhard if there is a Cook reduction from X to some NPcomplete problem B That is there is a polytime algorithm with access to a hypothetical polytime algorithm for X that decides B It should be clear that if X is NPhard and there actually is a polytime algorithm for X then P NP Our general methodology will be to develop a library of NPcomplete problems Once we have one problem there is a straightforward way to get more Lemma Let B be an NPcomplete language and A be a language If we prove OAENP ongl then we may conclude that A is NPcomplete But how do we start this process In CMPSCI 601 we prove the CookLevin Theorem that the language SAT is NPcomplete Last time we gave an unconditional proof that a particular generic NP language is NPcomplete We ll still need the CookLevin Theorem because SAT is a much more convenient starting place to build up a library of NPcomplete languages CMPSCI611 SAT and 3SAT Lecture 17 Remember that a boolean formula is de ned to be satis able if there is at least one setting of its input variables that makes it true The language SAT is the set of boolean formulas that are satis able Recall that a boolean formula is in conjunctive normal form CNF if it is the AND of zero or more clauses each of which is the OR of zero or more literals A for mula is in 3CNF if it is in CNF and has at most three literals in any clause The language CNFSAT is the set of CNF formulas that are satis able and the language 3SAT is the set of 3CNF formulas that are satis able Note that 3SAT is a subset of CNFSAT which is a sub set of SAT In general if A g B we can t be certain that A 3p B Although the identity function maps elements of A to elements of B we can t be sure that it doesn t map a nonelement of A to an element of B But here we know more we can easily test a formula to see whether it is in CNF or 3CNF To reduce 3SAT to SAT for ex ample we map a formula ltp to itself if it is in 3CNF and to 0 if it is not A similar reduction works to show A 3 B whenever A is such an identi able special case ofB thatiswhenA B CandC E P 6 The CookLevin Theorem tells us that SAT is NPcomplete essentially by mapping an instance of the generic NP problem to a formula that says that a particular string is a witness for a particular NPprocedure on a particular input Recall that if A is an NP language de ned so that 1 E A iff 3y 1 y E B we call 3 variously a witness proof or certi cate of x s membership in A We d like to show that CNFSAT and 3SAT are also NP complete It s clear that they are in NP but the easy spe cial case reduction does not su ice to show them NP complete We can reduce 3SAT to SAT but what we need is to reduce the known NPcomplete language SAT to the language we want to show to be NPcomplete 3 SAT On HW4 I ll have you work through the general re duction from SAT to 3SAT Here I ll present the eas ier reduction from CNFSAT to 3SAT The proof of the CookLevin Theorem given in CMPSCI 601 actually shows directly that CNFSAT is NPcomplete Let s now see how to reduce CNFSAT to 3SAT We need a function f that takes a CNF formula ltp in CNF and produces a new formula f m such that f m is in 3 CNF and the two formulas are either both satis able or both unsatis able If we could make ltp and f m equiv alent this would do but there is no reason to think that an arbitrary CNF formula will even have a 3CNF equiv alent form Every formula can be translated into CNF but not necessarily into 3CNF Instead we will make f m have a different meaning from ltp and even a different set of variables We will add vari ables to f m in such a way that a satisfying setting of both old and new variables of f m will exist if and only if there is a satisfying setting of the old variables alone in ltp In fact the old variables will be set the same way in each formula Because ltp is in CNF we know that it is the AND of clauses which we may name 11 V V Km 21 V V Em Elm V V Emkm where the E s are each literals For each of these clauses in ltp we will make one or more 3CNF clauses in f m possibly including new variables so that the one clause in ltp will be satis ed iff all the corresponding clauses in f m are satis ed So let s consider a single clause 1V V k in ltp Ifk g 3 we can simply copy the clause over to f m because it is already suitable for a CNF formula What if k 4 We can add one extra variable and make two clauses 1 V 2 V 11 and xl V 3 V 4 It s not too hard to see that both of these clauses are satis ed iff at least one of the E s is true If 1 or 2 is true we can afford to make 1 false and if E3 or E4 is true we can make 1 true The general construction for k gt 4 is similar We have 19 2 clauses and k 3 new variables The clauses are 1 V 2 V 1 171 V 3 V 2 172 V 4 V 3 and so on until we reach xkn V k2 V M4 and nally Ix k3 V k1 V If we satisfy the original clause with some 1 this satis es one of the new clauses and we can satisfy the others by making all the MS before it true and all those after it false Conversely if we satisfy all the new clauses we cannot have done it only with xi s because there are more clauses than xi s and each 511 only appears at most once as true and at most once as false and so can satisfy at most one clause Since this reduction is easily computable in polynomial time it shows that CNFSAT gp 3SAT and thus with the quoted result that CNFSAT is NPcomplete that 3 SAT is NPcomplete 3SAT is often the most convenient problem to reduce to something else but other variants of SAT are also some times useful One we ll use later is notallequalSAT or NAESAT Here the input is a formula in 3CNF but the formula is satis ed only if there is both a true literal and a false literal in each clause Let s prove that NABSAT is NPcomplete Is it in NP Yes if we guess a satisfying assignment it is easy in lin ear time to check the input formula and verify that there is a true literal and a false literal in each clause So we need to reduce a known NPcomplete problem to NAE SAT we ll choose 3SAT itself Again we ll transform each old clause into the AND of some new clauses in this case three of them 10 Given the clause 1 V 2 V 3 we introduce two new vari ables 51 and 3 that appear only in the new clauses for this clause and a single new variable oz that appears several times The three new clauses are 1V 2VL IL V 3VyL VyVOz We must show that the three new clauses are jointly NAE satis able iff the original clause is satis able in the ordi nary way First we assume that the old clause is satis ed and show that we can choose values for 51 y and oz to NABsatisfy the new clauses We make oz true for all the clauses in the formula and consider the seven possibili ties for the values of 1 2 and 3 In each of the seven cases we can set 1 and 3 not both true to NABsatisfy the three new clauses we ll check this on the board 11 Now assume that the three new clauses are NABsatis ed we will show that at least one of the Ei s is true First assume that 0 is true because if the new formula is NAE satis ed with 0 false we can just negate every variable in the formula and get a setting that NABsatis es all the clauses but has 0 true If 0 is true then either 1 or 3 must be false If 1 is false then either 1 or 2 must be true If y is false and 1 is true then 3 must be true So one of the three E s must be true and the original clause is satis ed in the ordinary way 12 CMPSCI611 CLIQUE and VERTEXCOVER Lecture 17 So far we ve seen that several problems in logic are NP complete In fact there are NPcomplete problems in a huge array of domains we ll next look at some prob lems in graph theory similar to some problems we ve already solved in polynomial time Let G be an undirected graph A clique in G is a set A of vertices such that all possible edges between elements of A exist in G Any vertex forms a clique of size 1 the endpoints of any edge form a clique of size 2 and any triangle is a clique of size 3 The language CLIQUE is the set of pairs G k such that G is an undirected graph that contains some clique of size k It should be clear that CLIQUE is in the class NP Our NPprocedure guesses an arbitrary set A of vertices by guessing a bitvector of length n Then the veri cation phase checks that A has size exactly 19 and that there is an edge between every pair of vertices in A 13 We ll prove CLIQUE to be NPcomplete by reducing 3 SAT to it Recall that this means de ning a function from 3CNF formulas to graphinteger pairs such that satis able formulas are mapped to pairs G k such that G has a lcclique and unsatis able formulas are mapped to pairs where G does not have a lcclique The essential element of any reduction is a correspon dence between the witnesses of the two NPproblems The 3CNF formula is satis ed or not satis ed by a bitvec tor of length n and the possible cliques are also denoted by bitvectors We want to arrange the graph so that a sat isfying instance corresponds to a lcclique and Vice versa and we get to pick 19 for our convenience Here s the construction We have a node for each appear ance of a literal m the formula So if there are m clauses each with 1 literals the number of vertices in the graph is the sum from 1 to m of kl Now we need edges We place an edge between nodes 1 and y if they refer to lit erals that occur in di erem clauses and are not in con ict aren t negations of one another We set Is to be m the number of clauses l4 Nodes Appearances of literals in clauses Edges Pairs of nodes that are in different clauses and not in con ict We claim that the 3CNF formula is satis able iff there is an mclique in the graph First assume that there is a sat isfying assignment which means that there is at least one literal in each clause that is set true Fix a set containing exactly one true literal in each clause The m nodes cor responding to these literals must form a clique No two of them are in the same clause and no two of them can be in con ict so all possible edges between then exist Conversely suppose that we have an mclique in the graph The m nodes must occur in m different clauses since edges only connect nodes in different clauses Because the m nodes also contain no con icts we can construct a setting of the variables consistent with all those literals We may have to arbitrarily set variables that don t occur in the set either as true or as false This setting makes at least one literal in each clause true so it satis es the formula 15 With some easy reductions we can use CLIQUE to prove some similar problems to be NPcomplete Again let G be an undirected graph A set of vertices A is an independent set if there are no edges in G between vertices in A The language INDSET is the set of all pairs G k such that there exists an independent set of size k in G Clearly INDSET is in NP because we can guess the set A verify its size and verify that it contains no edges We prove INDSET to be NPcomplete by reducing CLIQUE to it now that we know CLIQUE to be NPcomplete This reduction will be a function that takes pairs G k to pairs H K such that G has a clique of size k iff H has an independent set of size E But this is easy The problems are very similar so much so that we can give H the same set of vertices as G and arrange that a set A is a clique in G iff it is an independent set in H How do we do this We want to map sets with all the edges to sets with none of the edges so we just make H the complement of G the graph that has an edge 1 y exactly when that edge is not an edge of G Then the function taking G k to H k is the desired reduction 16 Another similar problem is VERTEXCOVER A set A of nodes of an undirected graph G is a vertex cover if every edge of G has at least one endpoint in A The lan guage VERTEXCOVER is the set of pairs G k such that G has a vertex cover of size k As before it is clear that VERTEXCOVER is in NP The Adler notes give a direct reduction from 3SAT to VERTEXCOVER which is very similar to the reduction from 3SAT to CLIQUE But we don t need to use this reduction because it s very easy to reduce CLIQUE or INDSET to VERTEXCOVER and thus use our previ ous work to prove VERTEXCOVER to be NPcomplete IfA g V is a vertex cover in G look at the set V A of nodes not in A There are no edges between these nodes since every edge has at least one endpoint in A So V A is an independent set in fact it is an independent set i A is a vertex cover So G has a vertex cover of size 9 iff it has an indepen dent set of size n k Thus the function taking G k to G n k is a reduction from INDSET to VERTEX COVER proving that the latter problem is NPcomplete l7 We don t want to get carried away with this sort of ar gument though Consider the language NONCLIQUE de ned to be the set of pairs G k such that G does not have a clique of size k Is this problem NPcomplete In all likelihood it is not A Karp reduction must take yesinstances of one problem to yesinstances of the other so the identity map is not a Karp reduction In fact it s not at all clear that NONCLIQUE is in NP because there is nothing that we can guess to prove that a clique does not eX1st 18 We de ne a class coNP to be the set of languages whose complements are in NP Note that this is quite different from the complement operation taking us from CLIQUE to INDSET We can de ne coNPcompleteness anal ogously to NPcompleteness and see that a language is coNPcomplete iff its complement is NPcomplete Could a language be both It follows easily from the de nitions that if there is NP and coNP are the same class This is considered unlikely though not quite as unlikely as P and NP being the same A Cook reduction is allowed to take the answer of a query and negate it so there is a simple Cook reduction from CLIQUE to NONCLIQUE To decide whether G k is in CLIQUE determine whether the same pair is in NONCLIQUE and reverse the answer So the coNP complete problems are all NPhard though probably not NPcomplete A nal note if we insist that k or n k be a constant CLIQUE and these other problems become solvable in P because we now have time to guess all possible cliques or independent sets or vertex covers Sometimes an identi able special case of an NPcomplete problem is eas1er l9 CMPSCI611 The SUBSET SUM Problem Lecture 17 For our nal problem today we revisit the SUB SETSUM problem the input is a set of numbers ah an and a target number t and we ask whether there is a sub set of the numbers that add exactly to t Using dynamic programming we showed that we could decide this lan guage in time that is polynomial in n and s the sum of all the 01 Now we allow the numbers to get larger so that they now might be n bits long The problem is still in NP because we can guess a subset by guessing a bitvector add the numbers in the set and verify that we get t But it s no longer clear that we are in P and in fact we will now see that the general problem is NPcomplete We reduce 3SAT to SUBSET SUM with large num bers We rst assume that every clause in our input for mula has exactly three literals we can just repeat literals in the same clause to make this true Our numbers will be represented in decimal notation with a column for each of the 1 variables and a column for each clause in the formula 20 We ll create an item 01139 for each of the 21 literals This item will have a l in the column for its variable a l in the column of each clause where the literal appears and zeroes everywhere else We also have two items for each clause each with a l in the column for that clause and zeroes everywhere else The target number has a l for each variable column and a 3 for each clause column We now have to prove that there is a subset summing to the target iff the formula is satis able If there is a sat isfying assignment we choose the item for each literal in that assignment This has one 1 in each variable col umn and somewhere from one to three 1 s in each clause column Using extra items as needed we can reach the target Conversely if we reach the target we must have chosen one item with a l in each variable column so we have picked 1 variables forming an assignment Since we have three 1 s in each clause column and at most two came from the extra items we must have at least one 1 in each clause column from our assignment making it a satisfy ing assignment 21 Given a problem with numerical parameters we say that it is pseudopolynomial if it becomes polynomial when those parameters are given in unary If it is NPcomplete with parameters given in unary we say that it is strongly NPcomplete The SUBSET SUM problem is pseudopoly nomial but all our graph problems are strongly NPcomplete Recall that the KNAPSACK is similar to SUB SETSUM but has a value for each item as well as its weight We are asked to nd whether a set of at least a given value exists with at most a given weight Since SUBSET SUM is an identi able special case of KNAPSACK where weight and value are both equal we know that SUB SETSUM 3p KNAPSACK Since KNAPSACK as a decision prob lem is in NP it is NPcomplete The associated opti mization problem is thus NPhard 22
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'