ENGR OPTIMIZATION ISEN 629
Popular in Course
Popular in Industrial Engineering
This 76 page Class Notes was uploaded by Ms. Keira Gleichner on Wednesday October 21, 2015. The Class Notes belongs to ISEN 629 at Texas A&M University taught by Sergiy Butenko in Fall. Since its upload, it has received 28 views. For similar materials see /class/226197/isen-629-texas-a-m-university in Industrial Engineering at Texas A&M University.
Reviews for ENGR OPTIMIZATION
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/21/15
ISEN 629 Engineering Optimization Lecture 20 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 115 mporta nt inequalities gt Let t39X be a selfconcordant function with the constant M 2 we call such selfconcordant functions standard 5 If M 7 2 we can consider W M ax mm M 2 gt We assume that don f contains no straight line so that t39 X is nondegenerate for any X E don f gt Denote by llullx lqu Xul127 iivii itxirlvlZ W llf xll lf xf x 1f xl125 Then lVTUl S illll llull gt llullx is called the local norm of direction u with respect to X gt AX is called the local norm of the gradient tquotX or the Newton decrement of f at X 215 Important ineq uaities For fixed X E dom f and u E Rquot u 0 consider the function 1 4 with the domain domq t E R X tu E dom f Lemma 413 For all feasible t We have l tl g 1 Proof 7 7 f X tuu7 u7 u lt1 W 7 2uTrX mm since f is selfconcordant El 315 mporta nt inequalities Corollary Domain of function t contains the interval 07 450 Proof gt From the lemma it follows that 0 150 3f 0 2 150 lfl for some t between 0 and t Thus if t E 7 0 0 then t gt 0 gt It remains to show that domq t t gt 0 gt flt tu gt 00 as X tu approaches the boundary of dom f gt Therefore the function quX tuu cannot be bounded B 115 Important ineq uaIities Consider the ellipsoid W X r WX r 7 y E Rquot I IIy7XIIx lt r 01W X r E y 6 IRquot IIy 7 XIIx S r This ellipsoid is called the Dikin ellipsoid of f at X Theorem 415 1 For anyX E dom f We have W0X 1 Q dom f 2 For alXy E don f IIy7XIIx Important inequalities Proof 1 We have 74507450 Q dom 45 t 6 IR x tu 6 dom f Since 450 1IIuIIX this implies that y Xtu 71IIuIIX lt t lt 1IIuIIX y xtu tQIIuIIi lt 1 Q dom f 2 Choosing u y 7X we obtain 1 1 1 0 Iy XIV 21in 7xlIx 1 7 IIy7XIIy 7 IIX7XIIX 3 Hy 7 XHX lt1 the and since I45tl S 1 we have 451 S 450 1 H 3 If My 7ltIIX lt 1 then 450 gt 1 and since I45tl S 1 we have 7X X IIy7xlIy y 5 2 451 2450 1 17 My 7XIIx 616 El n15 Important inequalities Important inequalities Theorem 416 Theorem 417 LetX E dom f Then for anyy E W0X 1 We have For any X y E dom f we have 2 1 I S I T S gt IIY XIIX 17 My 7xiix2f x 5 f y 5 WWW f y f X y X 1HySXHX 3 Thus I39 2 I39gtltIquotgtltTygtltwllyXIIX 4 gt At any point X E don f there is an ellipsoid Where w t T ln1 039 W0 1 Rquot Tf 1 C d f Theorem 418 XI TXE 39yTX XyTXlt I7 0m LetXyEdomfandllinIIxlt1 Then gt Inside the ellipsoid WX r with r 6 01 function f is almost 7 XHZ quadratic the Hessian is almost the same when r is small f y 7 f XTy 7 X S liHy ig 5 1 1 WIX 5 W 5 1 0241 W 6 WI 1 My fix f XTy 7 x 4in 7 XIX 6 ms Where wt 7t 7In17 t xiS Important ineq ualities Theorem 419 Each of the inequalities 16 is an equivalent characterization necessary and sufficient characteristic of a standar selfconcordant function Proof The proof consists of the following steps 1 Definition of selfconcordant function gt 1 gt 3gt 4gt definition of selfconcordant function 2 Definition of selfconcordant function gt 2 gt 5gt 6gt definition of selfconcordant function El 915 Importa nt inequalities Theorem 4110 For any Xy E dom f We have M 2 fgtlt f XTy X wllf y f Mllj 7 lfllf y 7 mm lt 1 then M S fX f XTy X Mllf y f Xll 5 8 1015 Important ineq ualities In the above theorems we used the functions Lut t 7 ln1 t and ugh 7T 7ln17 T Since w r 7 1720 Wt mm w r 1 720 ugh vgt0 Lut and ugh are convex functions Lemma 414 For any t 2 0 and 739 6 01 We have 41047 T7 44010 f7 WU ogggll f 7 MOL MT gnggl r 7w 7 wfW5T 2 7f WAT WXT WWAT WU MW MAW0 1115 Minimizing the self concordant function Consider the problem minfgtlt X E dom f As before we assume that f is a standard selfconcordant function and don f contains no straight line Recall that AX Theorem 4111 Let AX lt 1 for some X E dom f Then the solution X of our problem eXists and is unique Proof For any y E don f we have f0 2 fgtlt f XTy X wlly Xllx 2 fgtlt llf Xll39 lly Xllx wlly TXllx fgtlt MX39 lly Xllx wlly Xllx 1215 Minimizing the self concordant function Minimizing the self concordant function Therefore for any y E ffX y E Rquot fy S fX we The following example shows that the assumption AfX lt1 have cannot be relaxed w y 7 X X g 1 X lt 1 My 7 XHX f Example t For a fixed 6 gt 0 consider a univariate function Since the function gt 7 7 ln1 t is strictly increasing in t we have My 7 Xllx g E where E is a unique positive root of the f5X 5X 7 lnXX gt 0 equation 17 MOO ln1 0 gt This function is selfconcordant Thus ffX is bounded and X exists The uniqueness follows gt Since X e 7 f5 X we have MAX l17 eXl from the inequality gt Thus for e 0 we have AfOX 1 for any X gt 0 my 2 RX MHy 7 XNX gt f0 is not bounded from below gt lfegt0thean for all y Edom f B 1315 1115 Damped Newton method 0 Choose X0 6 don f 1 Xk1 Xk WWIXMTVXk k 2 0 Theorem 4112 For any k 2 0 fgtltk1 S fXk Wfgtltk Proof Denote by 1 AfXk Then lle1 7 Xkllx 14 w and fgtltk1 S fgtltk f XkTXk1 Xk wllgtltk1 Xkllx fgtltk 112A ww rm 7 Mm wwm M 7M 1515 ISEN 629 Engineering Optimization Lecture 22 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 120 Path following scheme gt Consider the standard minimization problem T X Xmelgc 7 where Q is a closed convex set with nonempty interior We also assume that we know a uself concordant barrier FX such that Dom F Q gt We will solve as by following the central path X t 39 t39 X Maggy F 7 where t39tX tcTX FX and t 2 0 gt Recall that any point of the central path satisfies the equation tc F Xt 0 gt Since Q is compact its analytic center XE argmig7 FX XE exists and 0 Xp 220 Path following scheme To follow the central path we will update the points so that the approximate centering condition is satisfied Ann x E llf fgtltll llfc F Xll S y where the centering parameter is sufficiently small lt 3 VEV Theorem 427 Let c denote the optimal value of Then for any t gt 0 We have cTXt 7 c g It39a point X satisfies the approximate centering condition then cTX7c ltVWgt 320 Path following scheme Assume that X E don F and consider the following iterate r HIElg x XilF X 1rcF Xi Theorem 428 LetX satisfy lltc F Xll S With lt X Then for any39y such that E 1 E We have lltc FlXll S 39 lvl 320 Path fol lowing scheme Lemma 421 lfa point X satisfies the approximate centering condition then 1 llcllx em m Proof We use the following inequalities 1 llf tXll lltc F Xll g approximate centering 2 F XTF X 1F X g u uself concordance We have fllcll llf fgtlt T F Xll S llf fgtltll T llF Xll S W1 El 520 Path following scheme t C x e F X 1rc W be M Let us fix some reasonable parameter values in our scheme 9 lei the 7 1 E 36 Since g if we use 7 536 we have 1 and if y 7536 we obtain Y 7 gttlt1gtt 436 llcll gt tt 120 The general path following scheme 0 Set to 0 Choose an accuracy 6 gt 0 and X0 6 don F such that llF Xollo S 1 kth iteration k 2 0 Set t 7 t 1 k1 k l x Xk1 Xk T FlXkl 1fk1c FXkt 2 Stop if etlt 2 u 720 The general path following scheme Theorem 429 The patherollowmg seheme terminates m at most N steps Where gt WitthXN 78 g e vllcllx N g 0 Win 5 F Proof 1 From Theorem 4 1 13 MAX lt 1 then llx ex ix g when therefore m E Hm emits when g Tit lt 1 2 Recall Theorem 4 1 6 lfx e dom f for a sel concordant function f then foranyy e W0x1 y 6 1Rquot lly Xllx lt 1 We have 2 1 1 T llY Xllx 1 X i f y j Wquot X Thus for fx FXquoty W x we have merwm Mme s 7 T 71 12 1 c 4e c F x c lt uh Tim gt Himk P so 1 hh 1 113 s liwwh c 3 lt 1 7 roll llXF T 120 The general path following scheme 3 sinoo tk to 5 X1 2 1 tka we have k4 k7 7 71 i213 lt 7 gt Awaw Zoesmell 13w 39 The stopping criterion is satisiiod if gt s it lt 1 1 l2 ftg pt lt 1ygt kill 1 7 lquotlt WHW sinooin1xgt4fot 1gtlt stopping Criterion is satisfied pi ip lsxiiiciigi 7 vllclli k712fnlt72ltv gt so lni 7 i any x gt 71 the following guarantes that the 71 23 920 The general path following scheme N g 0 ln i 6 Recall the inequality llX 7 Xillx u 2 holding for any X E Dom F Thus llCTX XFle S llCllxgllX Xillx S llCllxV 2W Hence the value ullcllf estimates the variation of the linear T 5 function c X over Dom F and the ratio m can be thought of as a relative accuracy of the solution 1020 Finding the analytic center 0 Set to 0 Choose an accuracy 5 gt 0 and X0 6 dom F such that llFX0ll0 S 1 1 keth Iteration k 2 0 Set m1 tk 7 Xk1 XkF Xk 1tk1cFXKD 2 Stoplf tk2VW We still need to figure out how to determine the analytic center of Dom F in step 0 ie we need to find an approximate solution gt39lt E dom F of the problem Shin FFX that satisfies the x om inequality g i We will consider two methods damped Newton and path following m Finding the analytic center damped Newton 0 Choose yo 6 dom F 1 kth iteration k 2 0 Set W yk iFquotyki1F yk 1 llF ykll p 2 Stop the process if llF ykllk g i 1220 Finding the analytic center damped Newton Theorem 4210 The above damped Newton scheme terminates in at most JIESF0 FXF iterations Proof We have Fk1 S Fk MUFOk S Fyk WW for any k before termination Thus RX S FJk1 S Fyo WW 1320 Finding the analytic center path following For a fixed yo 6 dom F define the auxiliary central path as Hf Mg min fF yoTy Fyl7 yEdom F where t 2 0 Then the equation Flo0 tFloo is satisfied Note that for t 1 we have N1 arg min FliF yoTyFyl yo yEdom and No erg min Fy x yEdom F Thus we can follow the trajectory yt with decreasing t 1320 Finding the analytic center path following We obtain the following auxiliary pathfollowing scheme 0 Choose yo 6 Dom F Set to 1 1 kth iteration k 2 0 Set tk1 Mimi n1 yk FlYkl71fk1FYOFloU 2 Stop if UFOkm S Set gt39lt yk FlYUlTlFJk 1520 Finding the analytic center path following Recall the following theorem Theorem 4114 LetX E don f and AfX lt 1 Then the point X X lf Xl 1f X belongs to don f and Afx S Since the stopping criterion of the auxiliary pathfollcming scheme is given y xE A F lt k H Millw this guarantees that I Ak 2 4 llFXll i 1n20 Finding the analytic center path following Theorem 4211 The auxiliary pathefallawmg scheme terminates m at most 5 min 2 llFxoll Iterations Proof 1 Recall that 3 19 7 l VB 73 536 to 1 2 we have k1 7 7 7 7 77k 1 WE lt1 gtlk lt1 gt t explt gt39 3 For any 2 0 we have llFytll e lltFy0ll e S v 2xI7lllFyollligt thus llFykllk HOWXe Fykll thXolllh S l3 tkllFXohlh S 13 My 2xI7lllFXoll 1720 Finding the analytic center path following Thus the process will terminate if the following inequality holds tkV2 llF Xoll 7 r3 7 ie in at most ln u2 llF Xoll iterations O lnu lnllFX0ll D 1x20 Complexity of the general path following scheme 0 Set to 0 Choose an accuracy 6 gt 0 and X0 6 don F such that llFX0llo S 1 kth iteration k gt 0 Set fk1 fk W7 Xk1 Xk lFquotXkl 1fk1C F XkA 2 Stop if etk 2 u 1920 Complexity of the general path following scheme gt Step 0 terminates in O lnu ln llFX0ll steps using the auxiliary pathfollowing scheme l llCll gt Step 1 terminates in 0 Win XFgt steps 5 The total run time is 0 y in In iiFXoyi2 In iieii2 In 2020 ISEN 629 Engineering Optimization Lecture 8 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 122 Stroneg convex functions gt We are looking for a restriction of the functional class fL ORquot for which we can guarantee a reasonable rate of convergence to a unique solution of the problem Xmeilgquot fx f e flakquot gt Let us introduce the global nondegeneracy assumption assume that there exists a constant u gt 0 such that for any 2 with f gt39lt 0 and for any X E Rquot we have 1 fgtlt 2 fgtlt EullX XllZA gt We can use the same reasonings as in the optimization definitionH of smooth convex functions to define the class of strongly convex functions 222 Stroneg convex functions Definition A continuously differentiable function fx is called strongly convex on Rquot denoted by f E 5Rquot is there exists a constant u gt 0 such that for any xy E Rquot we have I T 1 2 f0 2 fgtlt f X y X MllinllA Constant u is called the convexity parameter of function f We will also consider the classes with the same meaning of the indices k7 7 L as for the class Cf lQ 322 Stroneg convex functions Theorem 218 ff 6 6101quot and f x 0 then x 2 W uiix e viz for alx E Rquot Proof Since f x 0 and f is strongly convex we have fgtlt 2 fXf XTX XHllX Xll2 D fgtlt MllX Xll2 222 Stroneg convex functions Stroneg convex functions Theorem 219 Let f be continuously differentiable Then the inclusion f E SHRquot is equivalent to each of the following two conditions Lemma 214 holding for any Xy 6 Rquot and a 6 01 We 5Wquot 2 E 5Wquot andav 2 0 then 1 W17 f yTX 7y 2 Mllxiyllz quot 2 fx1if 2f X17 17 Exi 2 f a6 Jr fz E Simwmm A a 04 1 a 001 a 002M Jll Theorem 2110 1 n 1 n Note that SOUR 7 f R thus adding a convex function to a IN E SEARquot the for anyX andy from Rquot we have strongly convex function yields a strongly convex function with the same convexity parameter I T 1 I I 2 W S fX f X y XH Ziif X s f yii f gtlt f yTX y S iiWX f yii2 522 n22 Stroneg convex functions Stroneg convex functions Theorem 2111 Two times continuously differentiable function f belongs to the cassSSURquot ifand only iffor anyx E Rquot fX fyTX 7y 2 MHX yH27 Note that the class 5101 is described by W t W Wx 7 f yll Lllxiyll The value Q Lu 21 is called the condition number of function f The first inequality can be strengthened Theorem 2112 Example fx Win 6 5311quot since wX Iquot 1 2 For a symmetric matrix A satisfying the condition 11 quot n Mquot j A i Lquot we have lff E SMLUR then for anyXy E R We have L 1 1 00 1quot 1 1quot 7 T7gtll 727 2 W EXTM hr c 6 5M R C Swag r x r m x y t W Lllx yll W Liv x r yll since f X A 722 222 Lower complexity bounds for SirRRquot We consider the following problem class note that 5531012quot c swim Model Sign ax r 6 531012quot n gt 0 Oracle Firstorder local black box Approximate solution gt39lt to i fquot S 61 Xll2 S 6 Assumption An iterative method M generates a sequence of test points Xk such that Xk E X0Liintquotgtlt0111t gtltlt17 k 211 922 Lower complexity bounds for Sfi1Rquot gt The lower complexity bound will be given in terms of the condition number f E gt We will consider infinitedimensional problems n gt Consider 1R 2 the space of all sequences X Xl39l i2 1 with finite norm llxllz i Mg lt 00 11 1022 Lower complexity bounds for S RRquot Theorem 2113 For any X0 6 R and any constants u gt 07 Of gt 1 there eXists a function f E Sigh Rm such that for any firstorder method M 3V6 2k 7 1 2 gt VQf l 7 1 2 lle X ll 7 Q f1 llXo X ll 1 71 2k rm 7 n 2 g llXo evils Where X is the minimum off and W t39X 1122 Gradient method on TLl laRquot gt Next we will look at performance of the constantstep gradient method on the class fi IURquot gt Recall the scheme of the gradient method 0 Choose X0 6 1Rquot 1 keth Iteration k 2 0 fXk and hk Xk a Compute fgtltk b Xk1 Xk k gt We will analyze the simplest variant where the step size is constant hk h gt 0 gt Similar results hold for other stepsize rules as well gt For the problem Xminquot t39X let Xquot denote an optimal solution and W t39X 1222 Gradient method on ILLIGRquot Theorem 2114 Let f E fi IURquot and 0 lt h lt Then the gradient method generates a sequence Xk such that 2W0 WilXo X llz 7 lt Wk f 2HX0 7 vii kh2 7 Lhr x0 7 h To choose the step size that minimizes the above upper bound we need to maximize h h2 7 Lh with respect to h We have h 1 and the corresponding bound is 2Lfxo 7 fiiXo 7 it 7 lt l X f 2Lllxo 7 viz kfxo 7 m 1322 Gradient method on TLl l Rquot Recall a property of the class fi IURquot o My 7 fix 7 r XTy7x iix7yii Using this inequality with X X and y X0 we have Xo t S fquotXTXo X llXo X llz llXoi lle Since our upper bound in mime 7 fiiXo 7 Wig 2L a X f S ixo7xii2krxo7r is increasing with respect to t39Xo 7 f we have M ft lt 2L iixo 7mm 7xii2 2Liixo 7m 2Liixo7xii2k iixo7xii2 k4 1322 Gradient method on TE IGRquot Recall the lower bounds for fi IURquot 3Lllx07xll2 fXk t39 gt 32k 12t Compare to the upper bound 2fgtlto llgtlt0gtltll2 c ltlt Wk f 2HX0 7 vii kh2 7 Lhr x0 7 W 7 2 Xk V S The gap is quite significant so the gradient method is not optimal for fi IURquot 1522 Gradient method on S Rquot Theorem 2115 It39t39 E 5101 and 0 lt h lt then the gradient method generates a sequence Xk such that k lle 7W s lt17 iiXo 7 u It39h then 071 lerXii ltQf1 ilxrx li L inl 2k 2 J J M r 29 MM xii Where Qf Lu 1n22 Gradient method on S Rquot Recall the lower complexity bounds for 5101 2k in 7 viz 2 V lgt HXo evils Qf 1 Ly Zkllx err NH quot Compare to the upper bounds Q 71 k iixrvii lt ilxoi ii rm 4 2 Qi1 L 0712k 2 7 2 7 fXk f 2ltQf1gt llXO xii 11 Again the gradient method Is not optimal for 5LRquot 1722 Optimal methods gt The fact that upper bounds differ from the lower bounds by an order of magnitude in general does not imply that the method is not optimal since the lower bounds may be too optimistic gt However we will show by constructing a method with the efficiency corresponding to the lower bound that in our case the lower bounds are exact up to a constant factor b The schemes and efficiency bounds for optimal methods are based on the notion of estimate sequence 1222 Optimal methods Definition A pair of sequences kX k 2 0 and Ak k 2 07 Ak 2 0 is called an estimate sequence of function fX if Ak gt 07 k gt 00 and for any X E Rquot and all k 2 0 we have kX S 1 MWX M45004 1922 Optimal methods Lemma 221 If for some sequence Xk We have lt E 39 fgtltk 7 4 gig MX then fX 7 V g Ak 0x 7 W gt o k gt 00 Proof WM S 1 21quot kX Xmigi1iAkrxAkltoxi S 1 MWX t AWN gt We can derive the rate of convergence for RXJ directly from the rate of convergence for Ak gt How to form an estimate sequence that would satisfy the lemma s condition 2022 Optimal methods Optimal methods Lemma 222 Lemma 223 Assume that Let 0X 153 7 VoHZ Then the recursive process in the 1 f E 51Rquot above lemma preserves the canonical form of functions 2 0X is an arbitrary function on Rquot 45 X V 7 V H2 3 ylt k 2 0 is an arbitrary sequence in Rquot k 7 k 2 k 7 4 oak k 2 0 C 01 is such that if 04k 00 Where the sequences 7k vk and are defined as follows k0 5 0 1 7k1 1 s 040 akM7 Then the pair ofsequences kX k 2 07 Ak k 2 0 Vk 7 0407ka 7 akMk 7 elm07 recursively defined by s 7 7 s 7 0 7 I 2 W 1 W m 7 lt1 Wm WM new M k1X 1 ak kgtlt akfyk f ykTX Jk HX MHZ m iiyk 7 kaZ f ykTvk 7 yk is an estimate sequence for fX 2122 2222 ISEN 629 Engineering Optimization Lecture 15 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 120 Lower complexity bounds Consider the problem M Model 1 Unconstrained minimization 2 f is convex on Rquot and Lipschitz continuous on a bounded set Oracle Firstorder black box at each point 9 we can compute t39gt g0 E 8H9 Solution Find gt7 6 Rquot fgt39lt 7 fquot g 5 Methods Generate a sequence Xk Xk 6 X0 LingX07 2 2 2 750091 220 Lower complexity bounds In addition we assume that f belongs to the class 77X0 R M defined as follows gt There exists a solution Xquot of our problem and Xquot E BRX0 gt f is Lipschitz continuous on BRX0Wltl1 constant M gt 0 Theorem 321 For any class 77X0 R M and any k0 g k g n 71 there exists a function f E 77X0 R M such that gt i T 21 k 1 for any scheme generating a sequence Xk 6 X0 LingX07 2 2 2 750091 fXltf 320 Nonsmooth optimization methods Consider the problem minfx X E Q where Q Q Rquot is a closed convex set and f 1Rquot gt R is a convex function Can we use a subgradient gX in place of the gradient gt The objective function may not decrease in the direction gt We cannot expect that gX gt 0 X gt X However we can use the inequality T s 500 X i X 2 0 that holds for any X E Q 220 Nonsmooth optimization methods For an interior point X of Q consider y X 7 a X E Q where a is a sufficiently small positive scalar Since gXTX 7 X 2 0 for any X E Q we have 7 X i 04500 XTX 04500 i X X X llz 20450079 ag X i X S llX X llZA 2 z llyix Thus The distance between X and Xquot is decreasing in the direction 7g X 520 Nonsmooth optimization methods If we fix X X E Q and consider the hyperplane EMTO y 07 then for any optimal solution y X we have gem em 2 or Thus Only one of the two halfspaces defined by the above hyperplane can contain an optimal solution Xquot as its in terior poin t nzo Nonsmooth optimization methods Next we develop a technique for estimating the quality of an approximate solution to our optimization problem We define 1 T 7 Mix W500 X X 500 7 01 0 5X 0 Then clearly VfXX S Geometrically VfXX is a distance from X to hyperplane i XTX Y 0 Indeed for y gt39lt Vfgt39lt7XgXllggtltll ave lly gt39ltll vrgt39lt7gtlt 5XTX y 5XTX gt7 Vfgt7XllgXll 0 Nonsmooth optimization methods The following function measures variation of f with respect to X Hxiill g t if r 2 0 iftlt0 r WW This function has the following properties gt WfX t is a nondecreasing function of t t E R1 RX W S Wfgt39lt llXiilll xzo 720 Nonsmooth optimization methods Lemma 321 For anyX E Rquot We have fX 7 f0 S WfX VfXX 1 If t39X is Lipschitz continuous on BRO with constant M then RX W S MVfgtltgtlt 2 for aIX E Rquot with VfXX g R 920 Nonsmooth optimization methods Consider the problem mint X X E Q Definition Let X i2 0 be a sequence in Q Define 5k xe QgXTX7X 20i0mk We call 5k a localization set of our problem generated by the sequence X i2 0 Let Xquot be a solution to the problem Then for all k 2 0 we have Xquot E 5k Denote by vi min v M VfX7X7 099 Thus v maxr gXTX 7 X 2 01quot 0 kVX E BrX 1020 Nonsmooth optimization methods Lemma 322 Let W Ogigk t39X Then f 7 V g WfX vk Proof Using the previous lemma wrgtlt 0 gk WfX v 2 Ogi t39X 7 ta 7 ff n gk 1120 Subgradient method We consider the problem mint X X E Q where f is convex on Rquot and Q is a simple closed convex set gt By simple we mean that we can easily solve some minimization problems over Q gt In particular we should be able to easily find a projection of any point onto We assume the firstorder oracle which for any X provides t39X and one of the subgradients gX 1220 Subgradient method 0 Choose X0 6 Q and a sequence I71lt k 2 0 hkgt0 thO thoo k0 1 kth iteration k 2 0 Compute t39gtltk7 gXk and set in 7 50 X Q X Hamid 1320 Subgradient method Theorem 322 Let f be Lipschitz continuous on BRX with constant M and X0 6 BRX Then 1320 Subgradient method Denote by k WZ 10 Ak k 2 E h 10 Then AkHOithoo 10 To choose hk in a way that would be optimal for a fixed number of iterations gt assume that we perform N steps of the subgradient method gt Then Ak is a convex function of hkN gt Minimizing Ak as a function of mm0 we find that the optimal strategy is given by h 7 i0N xN1 gt Then AN WOT 1 and ft39 g 41 1520 Subgradient method Note that this bound is proportional to the lower bound for firstorder methods W f r gt M 2uvTID W lt fk f N1 1120 Problems with functional constraints Consider the problem mint X x e Q 5x 0j1mm M where t are convex and Q is a simple bounded closed convex set llX ll S R WW 6 Q Denoting by ax 131mm we can write our problem as minfXXE Q7X 0 1720 Problems with functional constraints Let Xquot be a solution to the problem We have gt 7Xquot 0 and gt vxx 2 0 for all X E Rquot gt Thus 7X g Wgtltquot7 lffXXt gt If t are Lipschitz continuous on Q with constant M then for any X E Rquot we have 7X S MVXXt 1x20 Subgradient method 0 Choose X0 6 Q and sequence hlt k 2 0 h 7 L k ik 0 5 1 kth iteration k 2 0 Compute t39gtltk7 50 t39gtltk7 Xk and set pk 50 if KW lt ll gtltkllhk7 7 if t39Xk 2 ll XkllhkA Pk X1 QltX hkiipkiil 1920 Subgradient method Theorem 323 Let t39 be Lipschitz continuous on BRX with constant M1 and M2 max llgll i g E 3600M 6 BRX lggm Then for any k 2 3 there exists a number 170 g Iquot g k such that xEMIR MZR 7 lt lt X39 f M W39Lm 2020 ISEN 629 Engineering Optimization Lecture 7 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 113 Lower complexity bounds for Tfo l R We consider the following problem class note that 31013quot c fi 1Rquot Model mil fx r e le IURquot XE quot Oracle Firstorder local black box Approximate solution gt7 6 1Rquot fgt39lt 7 fquot g ea Assumption An iterative method M generates a sequence of test points xk such that xk 6 x0 Liinrquotxo7 i i i r xk17 k 213 213 Lower complexity bounds for ffo l R We will construct a function that appears to be difficult for all schemes satisfying the above assumption For some constant L gt 0 consider the family of quadratic functions W g My E Xv 7 XltI1gt2 M 7 w 1 for k 13Hn Note that tux xT j xx 7 450 so for any X E Rquot we have T u L 1 2 kil 01 2 k 2 xr kxx1ltxgtltx 7x gtltxgt 20 1 I ie fkx is convex 313 Lower complexity bounds for Tfo l R In addition lt71 XTWXV lx gtgt2zxlt39gtexlt39 gt2x gtgt2l 11 lt71 er 712Xlt39gt2 W152 wal Lien Thus 0 5 f x 5 LIquot so rm 6 ff 1Rquot 1g k g n Next we will compute the minimum of fk m3 Lower complexity bounds for ffo l R Note that f x ELAk with 7 Ak 0kn7k Ak 7 7 0PM 0n7kn7k where 0M denotes an i gtltj zero matrix Ak is a k X k matrix given by 2 71 0 71 2 71 0 0 71 2 Elk 2 71 0 0 71 2 71 0 71 2 513 Lower complexity bounds for Tfo l R The equation fx Akx 7 e1 0 has the following solution which is unique with respect to the first k components 2 17 11Hik7 k 07 k1 i ni The optimal value of fk is given by L 1 L L 1 7 7T 7T 7 fk 7fkXk74 Lacika 61 Xkl 891Xk 8lt 1k1gti n13 Lower complexity bounds for ffo l R llikllz i 392i17 2 k 1311 l1k13 S k ti ldletzk lels k1 Here we used the inequality 2 kk 162k 1 S M I 1 713 Lower complexity bounds for Tfo l R Denote by Rk quotXElRquot2X39 Oyktlgignl Note that fkx XTAkX 7 4LelTX with Ak 0k 7k gt A quot k lt 0PM 0n7kn7k 7 therefore for any X E Rkiquot we have 313 Lower complexity bounds for TL 1Rquot Lemma 213 Let p 1 g p g n be fixed and et X0 0 Then for any sequence Xk 0 such that Xk E 1k LiniCXXOL x x 7 5004 We have Lk Q Rk quot Proof Note that tux 4LApX 7 4Le1 We use induction 1 The induction basis Since X0 0 we have txo 7e1 E R1 and 1 R1 2 The inductive step Let Lk Q Rk quot for some k lt p Since A is threediagonal for any X E Rhquot we have fp x ELApX 7 e1 6 RMquot so AW g RM D Corollary 211 For any sequence Xk 0 such that X0 0 and Xk E Lk We have fpxk 2 f since Xk E Lk Q Rk quot so xk fkxk 2 913 Lower complexity bounds for Tfo l R Theorem 217 For any k 1g k g n 71 and any X0 6 1Rquot there exists a function f E ff IURquot such that for any firstorder method M We ave 3LllX0 Xll2 7 gt Wk f 32k12 1 lle X llz 2 gllXo lea Where Xquot is a minimum of t39X and W t39X Proof Recall that the method M generates a sequence of points such that Xk 6 X0 Linf xo f xk1 k 21 Without loss of generality we can assume that X0 0 iic it is not we can consider the function t39X X Xo 1013 Lower complexity bounds for TL 1Rquot To prove the first inequality let us fix k and apply M to minimizing t39X t39gk1x Then Xquot 22k and W W 2k1 From corollary 211 we have Xk t392k1gtltk MM 2 fl L 1 lt 1tk 1gt 1 llellZ gm 1 Hence since X0 0 we have Recall that and MM ew i pi 1 llxoix lz 2k2 8 4k12 1113 1 Lower complexrty bounds for Tfo GRquot Next we derive the second inequality Since Xk E Rk quot we have 2k1 2 39 2 2k1 I 2 lerXll 2 2 ng 2 17m k1 1k1 1 2k 1 1 2k1 2 k17 214 7 E k1li 1 4k1 39k1 1 3k2 k1 2k1 7k6 1 2 k17k1 2 l 24k1 Q 2k17k617 5 Q 2k77k6 7 2k77k6 l 24 k1 2 24 k1 724Ilt12 32k2 1 2 2 Wiixoexmiiz2iixoexmii2 lerX ll a 2k1 1 z 2 2k12k24k37kk12k1 k1 7 k 12k 17k 6 k1 1213 Lower complexity bounds for ffo l R p Y The above theorem assumes that the number of steps of the numerical method is smaller than the spaces dimension 1 k g 5n 7 It describes the potential performance of numerical methods in the initial stage of the minimization process The lower bound on the objective function value is not so bad since the denominator is 002 3LllXo Xll2 f X 7 fquot gt i l k 32k 12 However the convergence to the optimal point can be extremely slow lle X llz 2 1 s 2 gllXo X ll Can we define easier classes of problems um ISEN 629 Engineering Optimization Lecture 17 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 11n Methods with polyhedral localization sets Several methods work with localization sets in form of a polytope EkXElRquotaijgbjj1mk gt Inscribed Ellipsoid Method yk is chosen as the center of the maximal ellipsoid inscribed in Ek gt Analytic Center Method yk is chosen as the minimum of the mk analytic barrier FkX 7 Z lnbj 7 ajTX 11 gt Volumetric Center Method yk is chosen as the minimum of the volumetric barrier VkX lndetFX where FkX is the analytic barrier for the set Ek All these methods are polynomial with complexity bound nln i with p 1 or 2 However the complexity of each iteration is n3 to n4 arithmetic operations with yk being computed using interior point methods 2 Model of nonsmooth function gt Subgradient and ellipsoid methods are known to perform according to their theoretical bounds even on simple problems and we cannot expect a much better performance on specific problems gt Some more flexible schemes are based on the notion of a model of nonsmooth function Definition Let X Xk k 2 0 be a sequence in Q Then the function MX X oggkl x gXTX 7 X where gX E 0fX is called the model of the convex function f corresponding to the sequence X 31n Model of nonsmooth function MXW Orgal x gXTX 7 Xl gt fkXX is a piecewise linear function of X and x 2 max for all X e Rquot gt For all test points X7 0 g i g k we have fgtlt fkXX7 5Xr E 3fkXXA gt The next model is always better than the previous one fk1xx2kxx for all X e Rquot gt Model fkXX represents the complete information on function f accumulated after k calls of the oracle ma Kelley method 0 Choose X0 6 Q 1 kth iteration k 2 0 Find Xk1 E erg mig7 fkXX XE V The auxiliary problem min kXX can be easily solved using linear programming XEQ gt However the method is still impractical due to its instability V The solution ofthe auxiliary problem may not be unique V the set algimn fkXlt can be unstable With respect to small variations of data flt7glt V This feature can be used to construct problem instances where the Kelly method has a very poor lower complexity bound 51n Kelley method Example For the problem min fyX with yxeQ fOiX maxllJli llelZL y E RX 6 1Rquot Q Z in iyz llel2 S 1 the following lower bound can be established for the Kelley method 1 W321 1 wk 7 W 2 4 Thus to get an esolution we need at least calls of the oracle nin Level method Denote by g min kXX 7 the minimal value of the model7 XE min fX 7 the record value of the model ogigk We have t g f Kg For some a 6 01 denote by ka17 at at and consider the level set ma X e Q Wm Ika Then ka is a closed convex set ma Level method gt Note that inside ka there are no test points of the current mo el gt Thus by selecting Xk1 in 401 we obtain a new record value of the mode and mm c Ma V Unlike argmig7 kXX the set ka is stable with respect XE to small variations in the data Xin Level method 0 Choose a point X0 6 Q accuracy 6 gt 0 and the level coefficient a 6 01 1 kth iteration k 2 0 3 Compute and fkquot b If f e f g 5 then STOP C Set Xk1 nm w 91n Level method The most expensive steps in the level are computing and n kaxk lf Q is a polytope then they can be computed by solving the following LP and convex QP respectively a min t st fXgXTX7X ti 0k7 X E Q n kaxk min llgtlt7gtltkll2 st fXgXTX7X g ltoz7 i 0 7k7 X E Q Both problems can be solved in polynomial time 101n Level method gt The optimal values of the model increase and the record values decrease f5 g f g r g r gt Denote by Ak g 5 and 6k f e f Then Ak1 Q Ak and 5k1 S 6K We call 6k the gap of the model kXX 111n Level method Lemma 331 Assume that for some p 2 k We have 6 2 17 a6k Then for all ik g p We have a 2 Proof A Recall that 6k f 7 f and 6k1 g 6k For any ik g g p we have 6 2 17006lt 2 17 a6 Therefore a r 717a6 2 r 717a6 6p717a6 2 a El 121n Level method Let M maxlllgll 16 6 WWW E 0 Lemma 332 For the sequence Xk generated by the level set method We have 1 005k 7 gt lle1 Xkll 7 Mi Proof A Since Xk1 E ka gt fkXXk1 g ka we have WM 1 a5k 13 1 i 04W M04 kXXk1 2 fgtltk gXkTXk1 Xk fgtltk Mfllgtltk1 XkllA El 131n Level method Lemma 333 Let diamQ g D lffor some p 2 k We have 6 2 17oz6k then 2 2 Pt 1 k S Proof A Ta 7 Let X 6 arg meig fkXlt Then for all 17k S I S p we have 094 g pXx f g Mai Applying the Inequality llx 7 7r5lt0ll2 S llX Xollz ll7rslt0 Xollz W39th d L X0ltXXpan kaweo ain llxi17xi ll2 llX7X ll27llX17Xll2 H L 2 2 17 a iix7xii271 i Sllxi7lelLl i73rt Summing up these Inequalities for I k7 7p we get 1700262 P 1 i 0sz S lle lel2 S Di win Level method Theorem 331 Let diamQ D Then the level method needs at most 2 2 N 1 iterations to guarantee f 7 V g 6 Proof A Assume that fkquot 7 fquot 2 5 Then 6k fkquot 7 fkquot 2 fkquot 7 fquot 2 5 Let N70 I0UI1UUlm where J0 IO PJ77kllyPJZkl p0 N7PJ1kj17k 5kg S E 5130 lt 5mm 3 6PU139 Then for 2 0 we have 6p01jz 15 97 2 14193 2 Thus MQD2 MQD2 W E P0 1 k0 S s f 1 7002 1 7 026230 52 1 7 or2 m M202 m M202 and N Zeno g Th H 201 7 002 g w 1 HM 151n Level method Mgoz 620417 122 7 a We can obtain the optimal value of the level parameter a by solving the following problem N 1 17227 17217172 17 arglg l a a gm a a Wrenlg l 7 yielding yquot 17 or 2 12 ie ozquot 17 Hence with a 17 i we have 4 2 2 Nge ZMfD Notethatfi7f e ff7 ff7f e gt For most reallife optimization problems the accurac e E 10751074 is achieved after N E 3n4n iterations 1n1n ISEN 629 Engineering Optimization Lecture 16 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 115 Cutting plane methods Consider the problem minr X X E Q where f is convex on Rquot and Q is a compact convex set such that intQ7 Ql diamQDltoo We assume that we are given a separating oracle which for any test point X 6 1Rquot returns a vector g whic is gt a subgradient of f at X if X E Q gt a separator of from Q if X g Q Example For the problem with functional constraints mig7 fX where XE Q X 6 1Rquot g 0 for X g Q the oracle provides us with any subgradient E E 87X which separates X from Q since it is supporting to the level set fX 215 Cutting plane methods Consider a sequence X X i2 0 C Q Then the localization sets generated by this sequence are given by 5000 Q 5k1X XESkOQ35XkTXk X20 Then for all k 2 0 we have X E 5k and as before we use the following notations v VXX v min v ogigk We will denote by V01quot5 the ndimensional volume of set 5 C Rquot 315 Cutting plane methods Theorem 326 For any k 2 0 We have Proof 1 Denote by 01 vZD 1 2 Since Q Q BDX we have 1 7 01lt 010 Q 1 7 01lt 0180Xquot EV 3 Since Q Is convex we have 1 7 01lt on E 1 7 01lt aQ O Q Q BVX O Q Q SAX 4 Therefore V01kX 2 Vol1 7 01lt on 01quotVolQ 115 General cutting plane scheme 0 Choose a bounded set E0 2 Q 1 kth iteration k 2 0 a Choose yk 6 Ek b If yk 6 Q then compute fyk7gyk If We 0 then compute yk separates yk from Q c Set gk 007 ny 6 07 gyk7 1f yr 0 quot0 Choose Ek1 2 X 6 Ek ggm X 2 0 Note that we use supersets of 0 since 0 may have a complicated structure 515 General cutting plane scheme To analyze the performance of the above scheme we introduce the following notations Yykk207 XY Q7 iklyj0 jltk Ql Lemma 324 For any k 2 0 We have 5 Q Ek Corollary 1 For any k such that ik gt 0 We have vgk x D 1quot lt DV01quotEk1quot VOlnQ MW 2 lfvolquotEk lt V01quotQ then ik gt 0 Proof Observe that Q 50 5 Q Ek for all k such that ik 0 El n15 General cutting plane scheme vjkX lt DV01quotEk1n we gt To have the convergence it suffices to ensure that V01quotEk gt 0 gt The rate of decrease of the volume defines the methods rate of convergence gt We want to decrease the volume as fast as possible ms Method of centers of gravity Historically first nonsmooth minimization method based on cuttin 3 planes is the method of centers of gravity lts idea is based on the following geometrical fact Lemma 325 Let 5 be a bounded convex set in Rquot with nonempty interior Define the center ofgravity of this set as 1 7 VO1quot5 de 5 For a direction g E Rquot define 5 X e s gTcg5 7x 2 0 Then 1 5 1 V0quot lt 7 V01quot5 1 e xiS Method of centers of gravity 0 Set 50 Q 1 kth iteration k 2 0 3 Choose Xk cgk and compute fltk7gltk 7 Set k1lX 6 5k gXkTXk X 2 0 Theorem 327 It39t39 is Lipschitz continuous on BDX with a constant M then for any k 2 0 We have 1m f7f MDlt17gt where f 02ni2k 4 915 Ellipsoid method gt Another method using approximation of localization sets is the ellipsoid method gt It is based on the geometrical observation that a hahc of an ellipsoid belongs to another ellipsoid with a strictly smaller vo ume gt Let H be a positive definite symmetric n X n matrix Consider the ellipsoid EHgt lt x e Rquot X7TH71X7 g 1 gt Choose a direction g E Rquot and consider a hahc of EHgt39lt defined by the corresponding hyperplane E x e EHgt lt gTgt39lt7X 2 0 1015 Ellipsoid method Lemma 326 Denote by 7 1 H X X m39m v 7 2 TH H Tunil H m39 gggHg gt Then 5 C EH7gtlt and volawn 17 1115 Ellipsoid method o Choose yo 6 Rquot and R gt 0 such that BRO0 Q Q Set 2 0 7Rquot 1 kth iteration k 2 0 gk 7 gok ifyk607 EM Ifyi ZQ 1 Yk1 7 Yk m39gmv H 7 n7 H 7 2 39HkgngHk k1 1 k n1 W Note that this isjust a special implementation of the general cutting plane scheme where Ek X E Rquot X7 TH 1X7gt39lt g 1 and yk is the center of this ellipsoid 1215 Ellipsoid method As before we use the following notations Yykk20 XY Q iklyy0 jltk Ql 7 fk 7 Oglgk Theorem 328 Let f be Lipschitz continuous on BRX with some constant M Then for k gt 0 We have 1315 Ellipsoid method To apply the above theorem we need to ensure that k gt 0 ie X 0 Assume that there exists p gt 0 and 7 E Q such that mmga this can be guaranteed by assuming that all constraints are Lipschitz continuous and there is a feasible point at which all functional constraints are strictly negative Slater condition 2323 Him L If e WW7 R g 1 then V01quotEk g V01quotQ gt k gt 0 for all k gt 2n12ln Moreover if k gt 0 then r 7 h g lMR2 W 39k 1115 Ellipsoid method b Each iteration of the ellipsoid method takes 0n2 arithmetic operations gt In order to generate an esolution this method needs 2 MR O n2 In 5 calls of the oracle pe gt Thus we have a polynomial dependence on ln and on logarithms of the class parameters M Rp gt For problem classes whose oracle has a polynomial complexity such algorithms are called weakly polynomial 2n12ln 1515 ISEN 629 Engineering Optimization Lecture 19 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 1 Introduction We consider the following problem min mix W 011mm XERquot where all functions involved are convex gt All numerical methods we considered so far are based on the blackbox concept gt We only used some information on the functional components of the problem at some test points gt Even though the developed numerical methods assume convexity the only time we pay attention to the structure of the function ie convexity is before we apply a numerical met 0 gt Can we utilize the structural information in numerical schemes 2m Introduction A common problemsolving process 1 Find a class of problems that can be solved efficiently using a particular met 0 2 Describe the transformation rules for reducing a given problem to a problem from the above class 3 Describe the class of problems for which these transformation rules are applicable With respect to the considered optimization problem our goals are gt To derive a family of special convex functions the selfconcordant functions and selfconcordant barriers which can be efficiently minimized using the Newton method gt To develop a technique for transforming our original problem into its barrier model 317 Newton method revisited Newton method Xk1 Xk flanillrXk How does the Newton method behave when we apply an affine transformation For a nondegenerate n X nmatrix A consider the function 450 WW Lemma 411 Affine invariance of the Newton method In addition to the sequence X1lt k 2 0 generated by the Newton method for f consider the sequence ylt k 2 0 generated by the Newton39s method for 39 yk1 yk l quotykl71 lyk7 k 2 07 With yo A IXO Then yk A lxk for a k 2 0 m Newton method revisited Newton method revisited Theorem 125 Local convergence of Newton method con de the P Obem all RX and assume that gt If We choose a new basis in Rquot then all objects in our 22 quot description Will change including the region of the quadratic 1f CMR I I I I I I I convergence Which is defined by the standard Inner product 2 There eXists a local minimum Xquot off With positive definite H gt HoWever the Newton method is affine invariant With respect Hessian f X llquot7 gt 0 I I I II I 2 to affine transformation of variables Thus its real region of 339 our Staf ng pomtXO 5 Close enough fox 39 iiXO T X lt W39 quadratic convergence does not depend on a particular inner Then HXlt 7 X H lt for all k and the Newton method converges product it depends only on the local topological properties quadraticay of t39X gt Can We modify our assumptions to correct this inconsistency MiiXk X X lt H k H s 27 MiiXk mi nn 517 Newton method revisited Newton method revisited Consider the Lipschitz continuity of the Hessian If v u then iit39quotgtlt tquot Wii S MiiX Jiii Wm 6 RT VTfquotXHUlV S Miiuii 39 iiViiZ becomes Assuming that f E C3Rquot denote by quNXuu S MHUHSI 1 llMild Cline girdX a flXH gt The lefthand side is affineinvariant with respect to X Will show later the 3rd directional derivative of f along the direction u this is an gt The righthand side does not depend on XI n X nmatrix Then We have gt We Will replace the standard norm With an affineinvariant norm defined by the Hessian t39 X iit Xuii S Miiuiii iiuiif x uTt Xu12A therefore for any X E Rquot gt This leads to the definition of selfconcordant functions th Xuv S Miiuii M W my Defining self concordant functions Consider a closed convex function fX E C3dom f with open domain For a point X E dom f and a direction u E Rquot consider the function X t fX tu as a function of t E don X Q R Denote by DfXu X 0 K Wu DZfXlU7 ul X 0 TVgt011 llullfiXy DslquotXu7 u7 u X 0 qu Xuu recall that f Xu lim0f X au 7 f X Definition Selfconcordant function We call function f selfconcordant if there exists a constant Mf 2 0 such that the inequality DsfXuuu Millullx e osiixiiuwi maximums2 holds for any X E dom f and u E Rquot gm Defining self concordant functions An equivalent definition of selfconcordant functions useful for establishing properties of selfconcordant functions is given by the following lemma Lemma 412 A function f is selfconcordant if and only if for any X E dom f and any L117 ug7 u3 E Rquot We have 3 lD3fXlU17 U27 Usll S MfH llullpX 1 1017 Affine invariance of self concordance Theorem 412 Let AX AX b Rquot gt R be a linear operator Assume that fy is sel coricordarit With constant M Then X fAX is also sel coricordarit With M4 affl esm variari t property Proof I e sel coricordarice is an 1 X is closed and convex 2 FIX X 6 dom lt75 X Alt 6 dom f and u 6 R and denote by y AX7 v Au Then fAXTAU fYTV7 AUTfAXAU flXV AUTf AXlAUlAU Dafylvy V7 Vl lD3fYlV7 V7 Vll S lVr TTWYV32 MfD2 Xlu7ul32 D lt u 02 Xluy ul HwyX u 03 Xluy U7 ul UTWX ulu and lDa Xluyuyull XTu I 1117 Examples of self concordant functions 1 Linear function fX aTX b7 dom f Rquot Then f X a7 f X 07 f X 07 so fX is selfconcordant with Mf 0 Convex quadratic function N fX XTAXJr bTX c7 dom f Rquot where A AT 2 0 Then f X AX b7 f X A7 f X 07 so fX is selfconcordant with Mf 0 12m Examples of self concordant functions 3 Logarithmic barrier for a ray fX 7lngtlt7 domf XE 1RXgt 0 Then f X 3 mx i mx 73 X7 X27 X3 Hence fX is selfconcordant with Mf 2 13 Examples of self concordant functions 4 Logarithmic barrier for a region defined by a quadratic Let AT 5 0 Consider the concave function 45X 7XTAX bTx c Let flt 7lnlt75lt7 dom f X 6 1Rquot 45X gt 0 Then DfXu D2fltu7 u D3fltu7 u7 u 71jbTu7ltT4u7 bTu 7 XTAu2 gt1XuTAu7 7WbTu 7ltTAu3 7 gqag bTu 7 XTAuuTAu If we denote by M1 Dfltu and my quAu 2 0 then szXlu7ul Ml w2 2 0 Slnce DQleuyul llullfwwl lDafltu7 u7 ul lZLdf 3w1w2l Assume that M1 0 the only nontrivial case Denote by w then lD3fltu7uul lt Zinnia 3lw1lw2 21 a lt 2 02mm um w mm 1 003 s 39 Thus flt Is self7concordant With M 2 mm Properties of self concordant functions Theorem 411 Let functions f1 and f2 be selfconcordant With constants M1 and M2 respectively and let m gt 0 Then the function fx af1X fgX is selfconcordant With constant 1 Mfmaxa 1 M M 1w 2 and domf don f1 dom f2 1517 Properties of self concordant functions Corollary 411 lff is selfconcordant With some constant Mf and A AT 2 0 then the function X fX XTAXJr bTX c is also selfconcordant With the constant Mg Mf Corollary 412 Let function f be selfconcordant With some constant Mf and a gt 0 Then the function X afX is also selfconcordant With the constant Mg Mf 1n17 Properties of self concordant functions Theorem 413 Let f be selfconcordant If dom f contains no straight line then the Hessian t39 X is nondegenerate at any X from dom f Theorem 414 Let f be selfconcordant Then for any 7 in the boundary of dom f and any sequence XkCdomt39Xkgtgt lt We have t39Xk gt 00 Thus t39X is a barrier function for cdom f 1717 ISEN 629 Engineering Optimization Lecture 23 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 Problems with functional constraints We consider the following problem mnma siti fgtlt 0j1ium7 1 XEQ7 where Q is a simple compact convex set with nonempty interior and all functions 6139 twm are convex We also assume that Slater condition is satisfied There exists 26th quch that lt0forallj1ium Problems with functional constraints Assume that we know an upper bound T such that X lt T for all X E Q Introducing two additional variables T and K our problem can be equivalently written in the standard form min T7 siti fox g T7 MMSKJL m XE QT TI 0i In order to apply the interiorpoint methods to this problem we need to be able to construct a selfconcordant barrier for the feasible set consisting of gt A selfconcordant barrier FQX for the set Q gt A selfconcordant barrier F0X7 T for epi f0 gt Selfconcordant barriers FAX7 K for epi 6139 17 i i i m Problems with functional constraints Then the resulting barrier is given by HXVT K FQX Fogtlt7 T Z F109 K T nT T T T nTK7 11 with the parameter m 9mu0uj2 J where uQ ij 17 i i i m are the parameters of the corresponding barriers We need to find a way to determine a starting point from don Problems with functional constraints We change the notation to obtain the problem min cTz st 2 E 5 3 de g 0 where z XTI cTz E T de E K and 5 X6 Q f0gtlt gn x mj1mr ie the feasible set of our problem without the constraint K g 0 We know a selfconcordant barrier Fz for the set 5 assuming that we know barriers for all constraints and we can easily find a point 20 6 int 5 by selecting 7390 and no large enough to guarantee fogtlt0lt70 lti 500 ltN07j17m7m In addition the set 5az 5dTZ a is bounded and has a nonempty interior if a is large enough Problems with functional constraints To solve our problem we will use a threestage process 1 Find an approximate analytic center of the set 5a 2 Eind an approximate analytic center 2 of the set 5 zE 5amp1de g 0 Apply the main pathfollowing scheme starting with E to solve the problem no mincTz z E 3 which is equivalent to problem 3 min cTz min cTz st 265 ltgt st 265a de g 0 de g 0 since 5a z E 5 de g a Problems with functional constraints 1 Choose a starting point 20 6 int 5 A gt 0 and set a deo A If a g 0 then we have a feasible point and we can use the twostage pathfollowing method discussed before Otherwise we find an approximate analytic center of the set 5a generated by the barrier Fz Fz E lna E de This can be done using the auxiliary path following scheme The approximate analytic center 2 will satisfy the condition 12 S M2 2 PG T new Fe Problems with functional constraints 2 To find an approximate analytic center 2 of the set 5 z E 5amp1de g 0 E z E 5 de g 0 generated by the barrier Fz E lnEde we solve the minimization problem minde z E 5a by following the central path zt defined by the equation rd F zt 0 r 2 0 Due to the Slater condition the optimal value will be negative Since the analytic center 2 satisfies the equation F z E yd 0 it is a point of the central path zt with t 71de gt 0 As a result we find 2 such that E quotl d T El quotl d 12 Adz FZ T2 F Z FZ E S xg Problems with functional constraints 3 Finally starting with Z we apply the main pathfollowing scheme to solve the problem mincTz z E 3 which is equivalent to the original problem To summarize we have developed efficient interior point methods for problems for which we know selfconcordant barriers for Q and epigraphs of the objective and the functional constraints Later we will describe classes of problems for which such barriers can be provided ISEN 629 Engineering Optimization Lecture 13 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 11 Separation theorems Theorem 3111 Let Q be a closed conveX set and X0 Q Then there eXists a hyperplane Hg y Which strictly separates X0 from Q Proof Take g Xo 7TQX0 7f 07 7 X0 7TQX0T7TQX0 Then since for any X E 3 X0TX 7TQX0 2 07 for any X E Q we have X0 7TQX0TX S X0 7TQX0T7TQX0 X0 7TQX0TX0 llXo 7TQX0 lt X0 7TQX0TX07 ll2 ie for any X E Q gTX g y lt gTX0 D 2 Separation theorems Corollary Let Q1 and Q2 be two closed conveX sets and wqg be the support function ofset Q wqg supgTX X E 1 lffor allg E don 111 We have 1113 g g 111 then Q1 Q 02 2 lfdom 1113 don 111 and for allg E don 1113 We have Weds 711mg then Q1 02 Proof 1 Assume that there exists X0 6 Q1 Q2 Then there exists a direction g such that for any X 6 Q2 gTXo gt 7 gt gTX Thus g E don 111 and 11Q1g gt 11Q2g a contradiction 2 Due to the first statement Q1 Q Q2 and Q2 Q Q1 31 Separation theorems Theorem 3112 Let Q be a closed conveX set and X0 belong to the boundary ofset Q Then there eXists a supporting hyperplane Hg39y for Q passing through X0 Proof Consider a sequence yk such that yk g Q and yk gt X07 k gt oo enote y yk 7TQk T gk Yk gr abk llyk 7TQklll Then for all X E Q we have ng 7kltEIIYIv Note that llgkll 1 and the sequence 7k is bounded lel lng7TQk X0ngX0l S llmW Xoll llXoll S llJk Xoll llXoll oi Separation theorems Since a bounded sequence always has a limit point without loss of generality we can assume t at g lim gk and 7 lim yk 700 700 Taking the limit in T T gkgtlt 7kltgkyk Subgradients Definition For a convex function f a vector g is called a subgradient of f at point X0 6 don f if for any X E don f we have RX 2 f39Xo ETX 7 X0 The set of subgradients of f at X0 is denoted by 0fX0 and is called the subdifferential of f at X0 bt Example weo am ngX S 7 S sz0 VX E Q Let fX leX E R X0 0 Then a subgradlent of f at X0 Is anyg satisfying so 7 gTX0 D M 2 gX for all X E R This inequality is satisfied for all X E R if and only if g E 7171 sn nn Subgradients Subgradients gt Note that the set 8fX0 is defined by the set of linear constraints 8fX0 g E Rquot fX 2 f39XogTX7X07 X E dom f Therefore the subdifferential is a closed convex set gt Next we will show that subdifferentiability implies convexity 71 Lemma 316 fquotfX is nonempty for any X E dom K then f is conveX Proof Consider z ay 17 aX where Xy E don f a 6 01 Let g E 0H2 Then 2 2 gTy 7 z fz17 agTy 7 X RX 2 f39ZETXZ f39Z045Tygtlt Multiplying the inequalities by a and 17 a respectively we taln afy17 afX 2 2 Xtt Subgradients Theorem 3113 Let f be closed and conveX and X0 6 intdom f Then 8fX0 is a nonempty bounded set Proof X07 fX0 belongs to the boundary of epif Hence there exists a supporting hyperplane for epif at X07 fX0 dTX W S dTXo 7 afgtlto WW 6 WW 1 where d and a are selected so that lldll2 a2 1 Since for all X0y E epif we have y 2 fX0 using X X0 in 1 we conclude that a 2 0 The rest of the proof consists in showing that Subgradients 1 Since a convex function is bounded in the interior of its domain there exist 6 gt 07 M gt 0 such that 85X0 Q don f and fX 7 fX0 g MHX iXoll VX E 85X0 Therefore using 1 for any X E 85X0 Q don f we have dTix 7x0 WM 7 we aMHX 7on Choosing X Xo ed we have lldll2 g Malldll and since lldll2 17042 we obtain a 2 11 2 gt 0 2 Thus with g da 1 is equivalent to X 2 X0ETXX0 i g E 0fXo gt fX is convex and closed m0 Q Note that X0 0 intd0m f so the condition of the theorem is violated 1111 1 a gt 0 3 If g E 8fX0 then for X X0 egllgll E 85X0 we have 2 d 0 39 g E X ellgll gTXXo x 7 axe Mllxixoll Mel B 3 for any g E 8fX0 we have g M 911 1011 Subgradients Example Consider the function fX 7X with the domain X 6 R X 2 0 ISEN 629 Engineering Optimization Lecture 26 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 117 Duality in nonlinear optimization Let f g7 F be real valued functions defined on X Q R Y Q Rquot and X X Y Q R X Rquot respectively Assume that the global minima and maxima which we address below do exist in all cases Suppose that fX g gy for all Xy E X X Y Then lt I i I Tea fX Q gy weak duality Under certain conditions this inequality can be satisfied as an equality max fX min gyi strong duality XEX er 2m Duality in nonlinear optimization It is easy to show that the following inequality holds max min F X lt min max F X i V xex 7y xex yEV 7y Under certain conditions we can prove that gnea x FXy FXyi mlnlmax 317 Duality in nonlinear optimization The point Xy E X X Y is a saddlepoint of F with respect to maximizing in Y and minimizing in X if FX7y S FX7y S FX7y for every Xiy E X X Y Theorem The point Xy E X X Y is a saddlepoint ofF ff FX 39FX 39 in 7y gngxmgg 7y 3331625 7y Proof Let Xy be a saddlepoint of F s s s FX7y Egg w lt 5mg Egg xiy lt 39 F lt F Eng 5mg Xiy 5mg X m Foam D om Duality in nonlinear optimization Consider the primal optimization problem min fX st g 07 i 0 P X 6 X7 whereglRquotgtlR quot7 hRquotgtlR andXQlRquot The Lagrangian of P is Lgtlt7 A7 u fX ATgX uThX7 where AERCLuElR 517 Duality in nonlinear optimization Note that for the Lagrangian Lgtlt7 A7 y fX ATgX yThX7 we have Xl if 500 S 0 and hx 0 AskljonlLXlAly T 00 otherwise and the PrOblem P can be restated in the form min max LXAu XEX AZOJ nn Duality in nonlinear optimization For A 2 0 and u define the function dAV min LXAV XEX which is concave independently of the convexity of f Then the dual problem of P is defined by the following optimization problem gag dAV gag LXAu D The objective function dAV of D is called the dual function my Duality in nonlinear optimization Theorem Weak Duality Theorem Let Xquot be a global minimum point of the primal problem Then for every A 2 07 u 107 S dWyV S V Where A 7 W is a global optimal solution of the dual Proof Since A 2 0 gX g 0 and hX 0 it follows that ATgX uThw g 0 On the other hand for any A 2 0m and Xquot E X dA7 u g t39X ATgX VThX and hence dAu g 93x dAu dmm lt My xn Duality in nonlinear optimization If d5u lt t39X then the difference t39gt5 7 d5V is called the duality gap The following result is an immediate consequence of last theorem Theorem A point X 5 u E X X Rfquot X R is a saddlepoint of the Lagrangian LX u it39FX is a global minimum point of the primal problem 5 W is a global maximum of the dual and the optimal values t39X of P and d5 u of D coincide Theorem A point X 5 u E X X Rfquot X R is a saddlepoint of the Lagrangian LX u i r the following conditions hold LX5u minLX5u X E X gx g 0 hX 0 5TgX 0 gm Duality in nonlinear optimization Example Consider the quadratic programming problem min fX CTX XTQX st AX g b where the n X n matrix Q is symmetric positive definite c E Rquot A E Rm b E Rquot The corresponding Lagrangian function is LX A CTX XTQX TAX 7 b c ATTX XTQX 7 NA The minimum of LX with respect to X occurs at the point 5 where L X 0 Ie gt5 7Q 1c AU 1017 Duality in nonlinear optimization Substituting in L we obtain the dual function d AQ lAU 7 ATb AQ lc 7 JQ lc Hence the dual problem is given by 1 dA7 T dTA T233 2 H v where M AQ lAT and d 7b AQ lc lf 5 is the solution of the dual problem then gt5 7Qilc AT5 is the solution of the original primal problem and Kw aw nn Primal dual interior point methods Consider the problem in the form min tux st t39X 0i1m AX b where f Rquot 7gt R i 1 m are convex twice continuously differentiable functions and A E Rpm with rank p lt n We assume that the problem is strictly feasible Solving this problem is equivalent to solving the KKT system fo xzAfxATy 0 11 MAX 07i1 7m AX b W S 07i1 mm A 2 0 12m Primal dual interior point methods Denote by X 7 i ln7fX the standard logarithmic barrier and consider the probiem min tfogtlt gtlt7 siti AX bi Then the central path is given by Xt such that AXt b7 fXt lt 01391Him7 and for some 9 E R 0 tfo gtltf Xf A79 th Xt I W xt ATM 13m Primal dual interior point methods Denotingby 1 77 39 At7 t Xtl 1iumut ut7 we obtain 0Xt Em YUM 0 ATVW 0 1 Thus Xt minimizes the Lagrangian LX A u om in Ar x VTAX 7 b 1 for A At and u ut Then Atut is dual feasible and gtut f0Xt Aft xt utTAXt 7 b f0Xt 7 mti Hence f0Xt 7 f g mti um Primal dual interior point methods Thus the central path conditions can be interpreted as deformed KKT conditionsH faxrimamxwowWo o warm 1ri1mm AXt b mm 0171 m m 2 o where the complementary slackness condition in KKT is replaced wit A f gtltf 1fA The search directions of a primaldual interior point method are obtained from Newton method applied to the modified KKT eq uations 1517 Primal dual interior point methods We are dealing with both primal and dual variables We express the modified KKT conditions as rtXu 0 where x DfXT ATu tX77V 7diagAfX 7 1te 7 AX7 b t0e1u1T ER quot and 6X f1 XT x 7Dr39gtlt m MW 1n17 Primal dual interior point methods Denote the current point and Newton step by y Mm Ay AXAAui Then the Newton step is obtained from the linear system ty AY N tJ DrtyAy 07 which is given by f0 x i AMYX DrXT AT AX rd 7diagXDfX 7diagfx 0 AA C A 0 0 Au rp where 7diagf39X 109 AX f0 gtlt DfXT AU 7 b 1717 ISEN 629 Engineering Optimization Lecture 6 Sergiy Buten k0 Industrial and Systems Englneerlng Texas Aamp M University Fall 2007 11x Classes of differentiable functions Lipschitz condition Let Q Q Rquot Denote by Cf pQ the class of functions with the following properties gt any f E Cf pQ is k times continuously differentiable on Q gt lts pth derivative is Lipschitz continuous on Q with the constant WW 7 fl yll Lllx e yll for all Xy E Q Properties gt We always have p g k gt If q 2 k then cmo g cMQ gt If h e cffm 2 e Cf o and mg 6 R then for L3 iaiL1 W2 we have at ms 6 cmo 21x Lipschitz condition Lemma Lemma 122 in the text Function fX belongs to CE IURquot C CLI IURquot ifand only if llf Xli L w e Rquot Example 1 Linear function fX aTX b E C01 1lRquot since f X a7 f X 0 2 For the quadratic function fX XTQXd CTX we have f X QX c7 f X Q Thus X e CLI IURquot with L Hall 3 For the function fX 1 X2X E R we have I 4 l 4 1 W e W K x e MW2 1 50 X e C11 1R atx Lipschitz condition Lemma 123 Let f E CLI IURquot Then for any Xy E Rquot We have L WY fX f XlTO Xl S Elly Xllzt Geometrically this means that for a point X0 6 Rquot the graph of function f is located between the graphs of the following two quadratic functions L 1X fgtlt0 f gtlt0Tgtlt X0 EllX Xollz I T L 2 2M fgtlt0 f X0 X X0 EllX Xoll A otx Lipschitz condition Consider a function f E C ZURquot Then we have llfquotgtlt W ll S MllXiylly Vny E R7 Lemma 124 Let f E C fURquot Then for any Xy E Rquot We have N M 2 llfJ HX f XJXll S gllinlly I T 1 T A4 3 llfJfgtltfgtlt inFEOiX f XJXll S gllinll A Corollary Let f E CEJ ZURquot and llyixll r Then f x 7 Mrquot 5 t39 y 5 f X Mrn Where Iquot is the identity matrix in 1Rquot 513 Smooth Convex Optimization Chapter 2 of the text nix Which properties make a problem tractable gt Consider the unconstrained problem min f X R 7 where t39X is a sufficiently smooth function gt We want to introduce some reasonable assumptions on t39X to make our problem more tractable than it is in general gt Next we will define a class f of differentiable functions that posses some desirable with respect to minimization properties V Since the methods we discussed converge to a stationar point we want a stationary point to be a global minimizer for any f E V We want class f to be sufficiently wide and include functions that are obtained from other functions in f by applying some basic arithmetic operations such as multiplication by a nonnegative scalar and addition my Basic assumptions We formalize the desired properties of the hypothetical class f in the following assumptions 1 For any f E f the firstorder optimality condition is sufficient for a point to be a global minimizer of our problem 2lft391t392 E f and m 2 0 then at t z E f 3 Any linear function t39X aTX b belongs to f These properties appear to be sufficient to define the class of smooth convex functions Xix Smooth convex functions For f E f fix some X0 6 Rquot and consider the function X fgtlt f XoTXA Then 15 E f due to the assumptions 2 and 3 We have X0 flX0 flX0 07 thus according to assumption 1 X0 is a global minimum of 15 so for any X E Rquot we have 00 2 X0 l39Xo l39gtlt0Tgtlt07 which is the same as fx 2 fx0 f x0TX 7 X0 913 Smooth convex functions fX 2 fx0 f x0TX 7 x0 Recall that this inequality is exactly the firstorder characterization of a convex function which can be viewed as an alternative definition of a smooth convex function Definition 211 A continuously differentiable function fX is called convex on Rquot denoted by f E f1lRquot if for any Xy E Rquot we have l1y fgtlt f XTy X If 7fx is convex we call fX concave mm Smooth convex functions gt We can define classes of convex functions ff pQ similarly to how we defined classes Cf pQ Let Q Q Rquot Denote by ff pQ the class of convex functions with the following properties gt any f 6 ff PQ Is k times continuously differentiable on 0 V Its pth derivative Is Lipschltz continuous on 0 With the constant L lifWX WMH S LiiX Yii forallxy 6 Q 111x Smooth convex functions Next we check that our three assumptions regarding f are satisfied by the class f1lRquot and thus are the properties of this functional class Theorem 211 lff E flakquot and f X 0 then X is a global minimum of fX on Rquot Lemma 211 IN and f2 belong to flakquot and m 2 0 then function f a fg also belongs to f1lRquot The third assumption is also trivially satisfied Thus for differentiable functions our hypothetical class f is exactly the class of convex functions f1lRquot 121x Properties of smooth convex functions Recall the original definition of a convex function that we gave Theorem 212 Continuously differentiable f belongs to the class flakquot if and only if for any Xy E Rquot and a 6 01 We have faX17 ay g afX17 afy Here is another equivalent firstorder characterization Theorem 213 Continuously differentiable f belongs to the class flakquot if and only if for any Xy E Rquot We have f gtlt f yTgtlt y 2 0 131x Properties of smooth convex functions Proof The necessity follows from adding the inequalities fX 2 f0 f JTX Y f0 2 fgtlt f XTy X To show the sufficiency assume that the inequality f X 7 f yTX s y 2 0 holds for all Xy E Rquot Denote by X7 XTy7x T en My ax m w 7 XTy 7 Xdr iv 2 X 1o1x Properties of smooth convex functions Recall the secondorder characterization of a convex function Theorem 214 Two times continuously differentiable function f belongs to f2lRquot if and only iffor any X 6 Rquot We have fquotx 0 151x Properties of smooth convex functions Lemma 212 lff E f1Rmb E R andA Rquot 7gt R then X fAX b e flakquot Proof For Xy E Rquot denote by gt39lt Axby Ay b We have 450 f7 2 f39 f gt39ltTy X 00 f gt39ltTAy X X Tf 370 X X ATf AX 17 00 XTy X 1n1X Properties of smooth convex functions Theorem 215 Each conditions below holding for all Xy E Rquot and a 6 01 are equivalent to inclusion f E fi IURquot L 0ltHY RX f XTin iiXJii27 1 RX f XTy X 21 wa f yii2 S W 2 Wm 7 WW W 7 mm 7y 3 W 7 mm 7 y LHX 7 yiiz 4 arx17ary 2 rax17aiyi iimxi7myiii 5 04fgtlt104t39y S fagtlt1aya1a iigtltyii27 6 171x Properties of smooth convex functions The following theorem characterizes the class fE IURquot Theorem 216 Two times continuously differentiable function f belongs to fE IURquot if and only if for any X 6 Rquot We have 0 5 t39 gtlt 5 LIquot 1x1x ISEN 629 Engineering Optimization Lecture 14 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 11x Subgradients Theorem 3114 Let f be a closed convex function Then for any X0 6 intdom f and p E Rquot We have f gtlto P maxlgTP g 6 W90 Proof The proof consists of the following steps 1 Prove that f X0 p 2 maxng g E 8fX0 Indeed for an arbitrary g E 8fX0 we have 1 f gtlto P alga EWXO 0P T fgtltol 2 HP 2 Find a vector g E 8fX0 such that f X0p g gp p 21x Subgradients fy 2 fx f xy 7 X To prove the second part we will consider f X0 p as a function of p and show the following 3 3quotXo apf gtlto 0 b For any g E 0fX0 3 We have g E 8f Xo 0 E 8fXo and ngP 2 f gtlto P e e e a Since f X0 0 0 the inequality f X0 p 2 ng is equivalent to fXo p 7 fXo 0 2 ng for any g E 8fXo Thus 8fX0 Q Bpf Oq 0 On the other hand f X0 p is convex in p and for any y E dom f we have f0 2 fgtlt0 f X0TX0 2 1fgtlt0gTJTX0 for any g E Bpf Oq 0 Thus Bpf Oq 0 Q 8fX0 so 8fxo 8f xo 0 Here we used the definition of subdlfferentlal for SpitX0 o for any g e marw we have my exo 2 men gTy 7x0 W Subgradients fy 2 fx f x y 7X b Denote by p fXo p and let g E 8f Xo p E 8 p For any v E RquotT gt 0 using as with y TVX p we have Tf Xo v f gtlto TV 3 WV 2 P P Tv 7 p 2 P ngTV T P E f gtlto P ngTV P Tf X0 v f X0 TV 2 f X0 p ngTV 7 p 1 Thus 1 1 f gtlto v 2 lf gtlto PngTV Pl gTVlf Xo PngPli and taking T 7gt 00 we obtain f X0 v 2 glv g E 8f Xo 0 E 8fXo Finally taking T 7gt 0 in we obtain f gtlto P i ngP S 0 D oix Optimality conditions Theorem 3115 For a closed conveX function f We have 7 X 7 argxer igpn ffX ltgt 0 E 8fX Proof fX 2 fXfor all X E domf ll fX 2 fX0TX7X for all X E domf ll 0 E 8fX0 Stx Subgradients and optimization Theorem 3116 For any X0 6 don f all vectors g E 8fX0 are supporting to the level set ffX0 gTX07X 2 0 for allX e maxi E X e dom r X g we Proof If fX g fX0 and g E 8fX0 then fX0 2 fX 2 fXo gTX 7 X0 Corollary Let Q Q don f be a closed conveX set X0 6 Q and Xquot arg miE2 fX Then for any g E 8fX0 We have XE gTX0 X 2 0 am Computing subgradients Lemma 317 Let f be closed conveX and differentiable on the interior of its domain Then 8fX f X for anyX E intdom f Proof Consider an arbitrary X E intdom f Then for any direction p 6 Rquot and any g E 8fX we have f XTP f X P maxlgTP g 6 WM 2 ng Using 7p instead of p we have f XTP S gTP Thus f XTp ng for all g E 8fX Using p ek k 17 n we conclude that each component of g is equal to the corresponding component of f X therefore 5 f gtlt D m Computing subgradients Lemma 318 Let fy be closed and conveX With don f Q R For a linear operator AX AX 1Rquot gt Rm consider the function X fAX Then X is a closed conveX function With domq X AX E don f and for anyX E intdom 15 We have 045m Mammy Proof For X E intdom let y AX Then for all p 6 Rquot we have lsX p raw raw fy AP X p maxgTP g 6 045M f y AP maxgTAP g 6 WM maxiE TP i E E ATOM1 Therefore the support functions W5 Xp and WATBfyp are equal for any p E Rquot yielding 8 X E AT8fAX El w Computing subgradients Computing subgradients Lemma 319 Since f1 and f2 are closed we have Let f1x7 f2x be closed convex functions and a1 a2 2 0 Then fx a1f1x a2f2x is a closed convex function and for any x E intdom f intdom f1 intdom f2 likm inf f1xk 2 52 in inff2xk 2 52 and t klim tk 2 in inf f1xklikm inf f2Xk 2 f0 gt Sgt E epif a a a ace goo Hoe RX a1 1100 a2 lax To prove the relation for the subdifferentials consider x E intdom f1 intd0m f2 Then for any p E Rquot Proof For flt f1lt consider a sequence xk7 tk C epif f x p a1f1 x p a2f2 x p t 7 max5ia1Prgl E 05M maxig azp 52 E WAX tk 2 X0 flak 1 X07 2394an X7 kins tk t39 Xk g dam f39 maXa1g1 azgzTP 1 51 E 3500752 E WAX 7 T Note that for a closed function lt75 for any sequence xk Q 110111075 such 7 max1g p 39 g 6 11811001 12812001 that xk gt 2 we have ILnlLrlf xk 2 452 Indeed if we assume the on the other hand HOG p maxg7p g E WOO Since opposite then there exists a subsequence yk of xk such that af x and af x are bounded and for any p E Rn klquot 4500f 452 if g 513075 SWCe M74500 6 513075 W5fxp Wa15f1xa25f2xp we conclude that and kliammyilt x this contradicts to the assumption that lt75 is closed gm 8fX a18f1x i 042an x D mm Computing subgradients Computing subgradients Lemma 3110 m Let functions fx7 i 17 m be closed and convex Then conSlder X 6 ID mtwom 039 Assume that X 117 quotk l39e39 fx 12naltx fx is also closed and convex and for any fx f1x fkx Then for any p E Rquot 7 quot39 max xap4x x E 1ntdom f 7 3991 1ntdom f We have HOG p m fgxailifgx1 m 199 aa0 aa0 T a 7 i 7 I 7 12 X P 122 quotWig P g E 8fx k ac k Z A39 maxigTP g e 850 i1 8fx Convquotfx iE x7 Where x i fx fx Proof 39 k We will use the following fact in the proof for any set of values max AgTp g E MOO7 H C Ak a1alt we ave 11 k k maxigTP g E Mmg E WAX N C Ak A A A T 397 1rglagxka max 2 C k maxg p g E Convquotfxi E k Hence 8fx Conv8fx iE x D where Ak A 2 0 Z A 1 1113 um Computing subgradients Lemma 3111 Let A be a set such that for any fixed y E A the function yx is closed and convex in x Then fx sup yx y E A is a closed convex function with the domain domf X 6 dom yi3 y yx S W E A YEA and for any x E don f We have 0M 2 Common y e X where X y mm fX Proof For any xox E don fy E xog E 8X yxo fX 2 yX 2 yXo gTX 7 X0 We 5 Thus g E 8fx0 TX X0 D 131x Computing subgradients Example 1 Consider fx Z iaTx 7 bi 11 gt Denoting by fx iaTx 7 bi we have fx i fx 1 gt If my M then fx mm where Ax aTx 7 b Since 1 y gt 0 5450 1 y lt 0 171 y we have a7 alTx 7 b gt 0 8fx 7a7 aTx 7 b lt 0 T 723 2 x7 b o Computing subgradients gt Denote by 7X I39aTx7blt0 X iaTx7bgt07 0X iaTx7b0 Then 2 ie X WX 72 ie X 7aa i6 0X1 gt Since 8fx i fx we obtain 1 Z 27 Z a E 723 iEIgtlt 6Lx iEIogtlt 8fx 151X Computing subgradients Exa m pie 2 For llnorm fxHxH1 2quot ix39i we have 1 fgtlt Z ian bri 11 where a e and b 07 i 1111n Therefore 0W 2 e 7 2 e 2 76m iEIgtlt 6Lx iEIogtlt where x i x39 lt 07 i X39 gt 0 ix39 0 i V3 ii 1n1X Computing subgradients Example 3 Consider the function fX max Xl39 19gquot gt Denote by fX Xl39 eTX gt Then 8 X 6 gt Thus 8fX Conv8fX E X Conve E X where X i Xl39 fX Note that this implies that 8H0 X E Rquot lleloo g 1 171x Computing subgradients Example 4 For the Euclidean norm fX we have gt f is difFerentiable in all points except for X 0 and 0M0 g e Rquot llel 2 ng VX e Rquot gt Since gTX g we have g E Rquot g 1 Q 8H0 gt On the other hand if gt 1 then selecting X g we obtain gTX llel2 gt since gt 1 Thus Mo g e Rquot llgll 1 gt Since f X for X 0 we get X X X 0 a x X e ilklllpll g 1 X 0i 1x1x ISEN 629 Engineering Optimization Lecture 18 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 115 Constrained minimization Consider the problem min RX st 0j1111m X 6 Q7 where Q is a compact convex set and RX are Lipschitz continuous on Q Denoting by RX maltx we obtain the equivalent problem 197m min RX 51 Fx 0 X 6 Q7 Then RX are convex and Lipschitz continuous on Q 215 Constrained minimization Let us introduce the parametric function RtX maxRX 7 t and denote by t39tminRtX1 1 er Note that t39t is nonincreasing in t Lemma Let Xquot be the optimal solution ot39probem as With t RX being the optimal objective value Then ft g Ot39oraltZt ft gt 0foratltt1 Thus t is the smallest root of function t39t and it corresponds to the optimal value of as 315 Constrained minimization To solve we will use models of both RX and which will be used to introduce a model function for RtX For a sequence X Xlt k 2 0 denote by W gagrm mom 7 xiii ax Mm Og kl7X XTX7Xl M where ggt9 E 8Rgt9 and gt9 E 8709 115 Constrained minimization The models of fX and can be used to introduce the model for f39t X maXkXX f7 7kX7X S ux X tX t minfkXtX ft er WX Then X t is nonincreasing in t and its smallest root tX does not exceed t Lemma 334 tX minkxx max g 0X e Q Denote by f Xt min f XtX k 099 k 7 the record value of our parametric model 515 Constrained eve method 0 Choose X0 6 Q7 to lt gs 6 012 and e gt 0 1 kth iteration k 2 0 a Keep generating sequence X X J 2 0 by the level method applied to function ftklt If the Internal termination criterion A mx rt 2 1 7 mmx w holds then stop the Internal process and set jk j Global stop tk g 5 0 set M1 Mx lsame as 6X tk 7 8X tk g 6 with 6 K X tk ntS Constrained eve method Let the global stop condition be satisfied 6X tk g 6 Then there exists j such that TOM WX fk S 6 Therefore we have WWW maxlf39O W 709 S 6 Since tk g t we conclude that M n6 Kw ms Constrained eve method complexity To derive the analytical complexity bound for the constrained level method we need to estimate 1 the complexity of the master process ie the maximum number of iterations indexed by k 2 the complexity of step 1a xtS Constrained level method master process complexity Lemma 337 For a k 2 0 We have to 7 tquot 17 K mix th s 5 J Thus the algorithm terminates in at most 1 t0 7 tquot N l n217 h 1 ehe full iterations of the master process full iterations are iterations terminated by the internal termination criterion 915 Constrained level method internal process complexity Denote by M maxllgll g 6 W00 U 87XX E Q We need to analyze two cases 1 Termination by internal criterion full step and 2 Termination by global criterion last step 1 Full step The internal process is terminated by the rule 5X th 7 8X fk WM tr e where t X tk 2 e Reeall Theorem 33 1 Let diamQ D Then the level method needs 2 Wt at most lv W J 1 lteratlons to guarantee f e P g e Thus MED2 1k 71k T 1 S K262041 122 7 a for any full iteration of the master process 015 Constrained level method internal process complexity 2 Last step The internal process was terminated by the global stop rule 5X tk g 6 Since the internal termination criterion was not satisfied we ave 51X fk e 81x th 2 1Xrk 2 he Thus the number of iterations of the level method at the last step does not excee Mgoz K262041 122 7 a 1115 Constrained level method total complexity Thus the total complexity of the constrained level method is M7D7 h e a 17a M707 1 t 4 K7e7ail7a37i27ot 1 ln217K lnz17kje M7D7 In Mil h7e7a1ee72ealn21eh It can be shown that the following values of parameters are N61 27oz reasonable 1 a K 1 7 xE The total complexity of the constrained level method is 215 Constrained level method 0 Choose X0 6 Q7 to lt ti E 012 and e gt 0 1 kth iteration k 2 0 a Keep generating sequence X X j 2 method applied to function ftklt lft criterion 0 by the level 6 09 tk 2 1 7 af x tk holds then stop the internal process and set jk j Global stop tk g 5 h set M1 571100 he internal termination Remaining issues to discuss gt Computing 8X tk reduces to LP if Q is a polytope previous lecture gt Finding the root 510X 1315 Constrained level method finding 5kX Recall Lemma 334 x miniltXgtlt 71Xx g 0x g Q Thus finding fjkX is equivalent to solving the problem minkxx ow g 0X e Q which is equivalent to min t s t fxig Vow ryj0mk gt9 fxi 57xexi0j0mk XEQ gt If Q is a polytope this is an LP gt If Q is more complicated we can use interior point methods 1115 Constrained level method Remark Since 7X 12naltx fX we can use Wm Mix gx7x e xiii Fix where ggt9 E instead of l kXX 032be gt9TX gt9l S RX In practice this complete model accelerates the convergence of the process at expense of increased periteration complexity 1515 ISEN 629 Engineering Optimization Lecture 2 Sergiy Buten k0 Industrial and Systems Engineering Texas amp M University Fall 2007 1 Local Methods in Unconstrained Optimization gt We consider an unconstrained optimization problem in the orm min f X 1 w where f Rquot gt R is a smooth function gt We will consider several methods attempting to solve this problem by generating a descending sequence fXk k 2 0 fXk1 S WM R 0717 gt Note that if fX is bounded from below on Rquot then the sequence fXk k 2 0 converges gt In any case we improve the initial value of the objective 2 Preliminaries p X1 By vector X E Rquot we will mean a columnvector X Xn Then the corresponding rowvector is XT X17 7Xn Unless otherwise stated for a vector X E Rquot by we Will H mean the second norm that is 1 We call X0 6 X an interior point of X if there exists 5 gt 0 such that 85X0 C X If X0 6 X is not an interior point then we call it a boundary point of X A set is open if all its points are interior A set is closed if it contains all its boundary points A set X is bounded if there exists C gt 0 such that for any XEX g C A compact set is a closed bounded set 3 Preliminaries gt For X0 6 Rquot let BSX02 X 6 1Rquot llX 7 Xoll lt 5 denote the open ltSball in X0 and 85X0 X 6 1Rquot llX 7 Xoll g a the closed 5ball in X0 gt A vector d E Rquot is called a feasible direction for problem 1 at X0 6 X if there exists 6 gt 0 such that X0 ad E X for any a g 6 Note that any direction is feasible at an interior point gt For f 1Rquot gt R and X07 d E Rquot the directional derivative of fX at X0 in the direction d is defined as 0W0 8d hm W 9H0 a If 1 then 2 is also called the rate of increase of f at X0 in the direction d om Preliminaries For two functions fg 1Rquot gt R and X0 6 Rquot we say that gt fX OgX if is bounded near X0 ie there exist numbers 5 gt 0 and C gt 0 such that lfXl g ClgXl for all X 6 8500 fXogX if Iim 0 xgtgtlto g X 517 Preliminaries Assume that f is differentiable Denoting by a fXo 04d7 we have 8fxo 7 m fXo ad 7 fXo 8d 7 gt0 a m ae o 0H0 a 0 On the other hand using the chain rule a Wm adTd 0 VfX0Tdi So we obtain nn First order necessary conditions Theorem FONC for a set constrained problem lff 1Rquot gt R is a continuously differentiable function and X0 is a point of local minimum for problem then for any feasible direction d at X0 the rate of increase off at X0 in the direction d is nonnegative Vfx0Td 2 0 Corollary lf X0 is a local minimizer and an interior point ofX then VfX0 0 Theorem FONC for an unconstrained problem le0 is a point of local minimum of the unconstrained problem mil fX Where f E C1Rquot then XE quot VfX0 0 1m FONC and existence of a global minimum The following example shows that FONC is not sufficient for a local minimizer We consider fX X3 and apply the FONC We have f X3X20 ltgt X0i But obviously X 0 is not a local minimizer of fX In fact for any given point X0 and any small 5 gt 0 there always exist XX 6 8500 such that fX lt fX0 lt fX so fX does not have any local or global minimum or maximum xn Existence of a global minimum gt An important question is when does a minimum exist This question is in general very difficult to answer gt Even a bounded function may not have any minimizers For example the function fX ex is bounded fX gt 0 for any X but it does not have any local or global minimizers gt However there are some cases where the existence of a global maximum is guaranteed One of such cases is given by the Weierstrass theorem Theorem Weierstrass For a continuous function f Rquot 7gt R and a compact set X C Rquot there always exist thquot E X such that fx fx and fx Tea fx gm Existence of a global minimum Another special case where a global minimum is guaranteed to exist is represented by coercive functions A function f Rquot 7gt R is called coercive if lim fX 00 llelH00 Theorem Any coercive continuous function f Rquot 7gt R has at least one point ofglobal minimum in 1017 Convex sets and functions gt Convex problems represent one of the most important categories of optimization problems Their very specia properties put them among the few tractable optimization problem classes gt A set X is said to be convex if for any Xy E X and any a 6 01 ax17 ay E X In other words all points located on the line segment connecting X and y are in X gt Given a convex set X a function f X 7gt R is convex if for any X1X2 E X and any a 6 01 we have fax1 17 axz g afX117 afxz gt If the last inequality is strict ie fax1 17 aX2 lt afX117 afX2 for any X1X2 E Xa 6 01 then fX is called strictly convex nn Convex problems A problem man fX is called a convex minimization problem or XE simply a convex problem if fX is a convex function and X is a convex set Theorem Any local minimum ofa convex problem is global 12m First order characterization of a convex function Theorem If fX is differentiable on a conveX set X then it is conveX if and only iffor any Xy E X W 2 fx WXTy 7 X 1317 FONC are sufficient for convex problems Consider a convex problem man fX For any Xy E X a XE differentiable convex function satisfies fy 2 fX VfXTy X so if for X X VfX 0 we obtain l y 2 W for any y E X Thus X is a point of global minimum for this problem This implies that the FONC in the case of an unconstrained convex problem become sufficient conditions In other words for an unconstrained problem 39 f X lt 7 where fX is differentiable and convex Xquot is a global minimizer if and only if VfX 0 mm Quadratic forms and functions gt A quadratic form qX is given by qm 2 11 11 where ay ij 1 n are the coefficients of the quadratic form X X1 XquotT is the vector of variables V Let A ahhm be the matrix of coefficients of qX Then qX XTAX XTQX where Q A AT2 is a symmetric matrix Therefore any quadratic form can be associated with the symmetric matrix of its coefficients gt A function f Rquot gt R in the form fX XTQX CTX where Q is a real n X n symmetric matrix and c E Rquot is the vector of linear coefficients is called a quadratic function 1517 Positive definite forms and matrices gt A symmetric matrix Q is called positive definite semidefinite and associated with Q quadratic form qX XTQX is positive definite semidefinite if qX XTQX gt o 2 0 for all nonzero X E Rquot V It is known that a symmetric matrix A is positive definite semidefinite if and only if all the eigenvalues of A are positive nonnegative V Another equivalent characterization of a positive definite matrix is positivity of all its leading principal minors A kth leading principal minor of a square matrix A is the determinant of a matrix that consists of the intersection of first k rows and first k columns of A 1n17 Second order characterization of a convex function Theorem A twice continuously differentiable function t39X is convex on an open convex set X if and only if V2fX is positive semidefinite for any X E X 1717 ISEN 629 Engineering Optimization Lecture 12 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 1 Operations with convex functions Examples I 39 1 fx lrglagx x Is closed and convex 2 Let A All quot and A be a set in RC Consider the function fx Al39r39x 22 where f are closed and convex Then f is closed and convex w Let Q be a convex set Consider the function WAX sngTX gEQ Then wqx is closed and convex Function wqx is called the support function of the set Q It is homogeneous of degree one wQtX thxx E dom Q r 2 0 2m A closed convex function that is not continuous 4 Let Q Q Rquot Consider the function 11g y sup yg39y yEQ where Y 1507577 gTy 2 llyllzx gt g y Is closed and convex In g y gt If 0 Is bounded then dom 3940 1R 1 gt If 0 R then dom 3940 contains only points With 7 2 0 Indeed If y lt 0 then for any g 7 0 we can build a sequence yk kg such that ykg y gt 00 k gt 00 gt If 7 0 then g 0 otherwise y g 0 IS unbounded gt IM gt 0 then yg7 g IS the maximizer of yg39y wuth raped to y and g39y 1 Thus 07 ifg07707 Mew 7 i gt07 With dom 3940 ORquot x y gt 0 U 00 This Is a convex set neither open nor closed 3940 Is a closed convex function which Is not continuous In 00 since 7g y llgll2 y 0 am Convexity and continuity Lemma 312 Let f be convex and X0 E intdom f Then f is locally upper bounded at x0 Proof Choose 5 gt 0 such that X0 i ee 6 intdom f i 17 n and denote by A Convx0ieei1n 1 We can show that A D Bgxo with E 2 Thus M A xergga o fX S Enean fx lt max fx0 i ee D 19gquot 3 Convexity and continuity Theorem 318 Let f be convex and x0 6 intdom f Then f is locally Lipschitz continuous at x0 Proof Fix 6 such that 85x0 Q dom f and M such that supfx x E 85x0 g M Let x0 y E 85x0 Denote by 1 1 a iiy7Xoii7 zXo y7Xo 6 0t Then a g 1 and y az17 ax0 Due to convexity and since lZ Xo llY Xoll 61 0 S 04quotZ1 04fgtlt0 S XoaM Xo 7 on Wily 7xoii To finish the proof we need to show that 7 M7 axe e y 2 fgtlto lly XollA 5m Convexity and continuity Choose u x0 x0 7 y Then llu 7x0ll e and y 7 x0 agtlto 7 u Recall a property of convex functions f is convex it39and only it39t39or any xy E dom f and 2 0 such thaty y 7x E don f We have fy My X 2 t39 UM 00 Using this inequality with y xox u a we obtain an 2 rixo awe 7 fu 2 rim 7 aM 7 we fixoi7Wiiy7xoii D am Convexity and differentiability Definition We call f difFerentiable in a direction p at point x E don f if the following limit exists f X P 8W 1 8p 7 gig 3m up 7 Mi The value tquotx p is called the directional derivative of f at x in the direction p Theorem 319 Convex function f is differentiable in any direction at any interior point ot39dom Proof Let x E intdom f and denote by 1 ma 7 3mm 7 Mi a gt or To prove the statement it suffices to show that a is decreasing and bounded from below 7 Convexity and differentiability 1 Let 6 gt 0 be small enough to have x 6p 6 don f Then for a E 0e and 6 01 town 7 f17 x xap 17 fx fxap mam 7 i fxa p7fxl fxap7fxl 7 way This means that a decreases as a 7gt 0 N Let us choose 7 gt 0 small enough to have x 7 ryp E don f Then using the inequality t397 7gt39lt 2 Wdr UUF t39gt39lt with y xxx739yp we obtain Odom 2 fgtltlfgtltfgtlt7pl 5 450 2 glfX fX YPl xw Convexity and differentiability Lemma 313 Let f be a conveX function andX E intdom f Then 1 f X p is a homogeneous function ofp of degree 1 2 f X p is a conveX function of p 3 for anyy E don f We have W 2 fXf XyX7 Proof 1 For p 6 Rquot and 739 gt 0 we have m fXWP ale T lirgf p 7 fx TfXoi p fXTap7 fX gm Convexity and differentiability 2 For any phpg 6 Rquot and 6 01 we have f X P1 1 MP2 ilfXa P11 P2 fXl 7 lim 333 g max apl 7 fX1 one 0P2 7 M1 f OC P1 1 f gtlt P2 Let a E 01y E dom f and ya Xay X Then using the inequality W W 7 2 2 W My 7 to with 7 yo X 17 a we obtain lA w My 7 fya17aya7x 2 m7lt17awltm7axi Taking the limit in a gt 0 we obtain an 2 ax my 7 x a 101 Separation theorems Definition Supporting hyperplane separating hyperplane Let Q be a convex set and Mew X E 1Rquot 57X 77g 0 is a hyperplane gt We say that the hyperplane Hg y is a supporting hyperplane of Q if Hg y O Q 0 and Q Q Hg39y or C Q H7g77 gt We say that the hyperplane Hg y separates a point X0 from Q if gTX S 7 S ngo orgTXo S 7 S 57X for all X E Q If one of the inequalities is strict we call the separation strict um Separation theorems Definition Let Q be a closed set and X0 6 Rquot Then the projection 7TQX001c point X0 onto the set Q is given by 300 erg minllgtlt 7Xoll X E Q Theorem 3110 If Q is a conveX set then there exists a unique projection 7TQX0 Proof 7TQX0 Mg min X X E Q where X 7 Xoll2 belongs to D 12 Separation theorems Lemma 314 Let Q be a closed convex set and X0 g Q Then for anyX E Q mgtlto 7 XoTX 7 7T9 2 0 Proof Since 7rqxo erg min x X E Q7 where X 7 XOHZ we have 7TQX0TX 7 7TQX0 2 0 VX E Q Noting that X X7X0 and thus 7rqxo 300 7X0 we obtain the inequality of the lemma D 13 Separation theorems Lemma 315 For any X E Q We have iiX 77TQgtlt0ii2 iimgtlto 7 Xoiiz S iiX 7Xoii2 Proof iiX 7 7TQgtltoii2 7 iiX 7 Xoii2 7 72XT7TQX0 ii7TQgtltoii2 2XTX0 7 MM2 X0 7 7TQX0T2X 7 7TQX0 7 X0 7 X0 7 7TQX0T2X 7 27TQX0 7TQgtlto 7 X0 7 727TQX0 7 XoTX 7 7TQX0 7 ii7TQX0 7 Xoii2 7iigtlto 7TQX0ii2 D iA ww ISEN 629 Engineering Optimization Lecture 10 Sergiy Buten k0 Industrial and Systems Engineering Texas AampM University Fall 2007 11x Convex sets gt Next we will look at constrained minimization problem in the form mint x XEQ 7 where Q C Rquot gt Given the vectors x17 x in Euclidean space Rquot and real numbers A 2 0 with i A 1 the vector sum Alxl Amxm is lcalled a convex combination of these points gt For example the convex combination of two points is the line segment between these two points and the convex combination of three non colinear points is a triangle 21x Convex sets gt A subset C of Rquot is said to be convex if for every x17 x2 6 C andAElR 0 A 1wehavex11ix26 C gt The geometric interpretation for any two points of C the line segment joining them lies entirely in C Theorem A subset of Rquot is convex iff it contains all the convex combinations of its elements 31x Convex sets Theorem 224 Let Q1 Q Rquot and Q2 Q R be convex sets and Ax be a linear operator Ax Ax b 1Rquot gt R Then all sets below are convex 1 Intersection m n Q1 7 Q2 x E Rquot x E th 6 Q2 2 Sum mnQlQzZXXE Q1JE Q2 3 Direct sum Cartesian product 1 X 02 X E quotHquot XE Q1y 6 Q2 Conic hull 7CQ1 z E Rquot z xx E Qh 2 0 Convex hull ConvQ17 Qz Z 6 Rquot F gt zozx1704yy Xe QME 027046 071 ON Affine image AQ1 y E R y Axx 6 Q1 Inverse affine image A 1Q2 x E Rquot Ax 6 Q2 T4 oix Convex sets Examples gt A hyperplane H in Rquot is a set of the form HXERquotcTXb7 where c E Rquot 0 and b E R Similarly we define the closed halfspaces HXERquotICTX2b H XERquotCTX b gt It is easy to see that H7 H H are all convex sets since linear function is convex Six Convex sets Examples gt The intersection of finitely many halfspaces is called a polyhedron V Using matrix notation we can define a polyhedron to be the set of points P X E Rquot AX g b where A is an m X n matrix and b E R V A bounded polyhedron is called a polytope V Polyhedron polytope is convex as an intersection of convex sets gt Ellipsoid X E Rquot XTAX g r2 where A AT 2 0 is convex since the function XTAX is convex nix Convex sets gt A nonempty subset V of Rquot is called a linear subspace if the following conditions hold I leny Vthenlty6 V II IfXE VandreRthennlt6 V gt We say that a set of vectors 5 v17 Vm C V spans V if every vector v E V is a linear combination of vectors from 5 m v Z cv where each c E R 1 71x Convex sets gt The vectors in 5 are are linearly independent if m 2cm 0gtc 0i1m 1 gt If 5 spans V and its elements V1 vm are linearly independent we call 5 a basis of V V The dimension of the subspace V dimV is the number of vectors in some basis 5 of V V Alternatively a linear subspace V can be defined as solution set of the homogeneous system of linear equations CX 0 where C is a m X n matrix Here the dimension of V is n 7 rankC Xix Convex sets Convex sets and functions Lemma 225 gt An affine subspace A of Rquot is a linear subspace V translated If lrX i5 3 COMex mcm the for Q U E R is lower EVE 59f bysomevector u ieAx lRquotxuvvE V We quot have dimAdimV Ad X E R MIX S gt An affine set is a set that contains the line between any two is either convex or empty distinct points in the set Note that the convexity of does not imply that f is convex V Equivalently we can define A as the solution set of the nonhomogeneous linear system Cx Lemma 23926 gt Note that a hyperplane in Rquot is an affine subspace of XX 5 a convex funcnon WWW ep g aph d39 39 7 1 sf m e W ax T is a convex set gm mm Constrained convex problems Constrained convex problems Theorem 226 Let f E SHRquot and Q be a closed convex set Then there exists a unique solution x ot39problem migf x XE Theorem 225 Proof For x0 6 Q consider the set Q x E Q fx g fx0 Let f E flakquot and Q be a closed convex set A point xquot is a solution of the problem t39x if and only if fX0 2 fX 2 fX0 f X0TX 7X0 X0 7 f XTX7X 2 0 so iixixoiiz if X0TX X0 iiif xoiiiixixoiiy and thus Q is bounded Note that our problem is equivalent to the problem for alX E 039 min t39x7 therefore it has a solution x Assume there is another XE solution 7 Then f f0 2 f X f XTgt39lt X Z lli X llz 2 fllgt7xll2 gt xx D tt 12m Gradient mapping Definition For a fixed 7 gt 0 denote by XQO KY erg aw tquotgt39ltTX i llgtlt Tillzlv ga w K2 Xa y We call gqgt39lt39y the gradient mapping of f on Q Note that for Q Rquot we have 1 2097 X yWX 612097 WOO so 17 can be viewed as the step size of the step from 7 to XQO KY 131x Gradient mapping Theorem 227 Let f E SiiURUgy 2 L and E Rquot Then for anyX E Q We have M 2 MWmewvinxezi iimmviii2 iix22r mm s to 2 gimwii 2 M 1 m 2 Zilgaiwll2 giw 72H gt X Xquot and thus fXqgt39lt39y EQgtlt7 YTgtlt compare to the unconstrained properties of the gradient axfr m Wk llf wlz ower 2 Hf xll2 um Minimization over simple sets gt We consider the problem 22 RX where f E 5101 and Q is a closed convex set gt We assume that Q is simple enough for computing the gradient mapping explicitly gt Then the gradient method scheme for the constrained problem is 0 Choosex 6 Q 1 keth Iteration k 2 0 Xk1 Xk thWy L 151x Minimization over simple sets Theorem 228 Let f E If in the above scheme h then k in wit 12 9 iiXo 2 vii Proof Denote by k lle Xll7 5Q EQXIlt7 0 Using the inequality 1 EQX7 YTX X 2 legdxy llz gll Xllzi we obtain 5 lle i X hgall2 IVE 27513 i X hzllgallz 1 MOVE t h h I lngll2 17 L r5 ii i H D 1n1x Optimal schemes gt Optimal schemes are derived similarly to the unconstrained case gt Again we define an estimate sequence etc gt The same convergence properties as in the unconstrained case 1mg Optimal schemes 0 Choose X0 6 Rquot and a0 6 01 Set yo X0 and q uL 1 kth iteration k 2 0 3 Compute fyk and fyk Set Xk1 XQOky L b Compute ark 6 071 from equation Olin 1 ak10 l2lt 404k set w 7 k akaki 39 Yk1 Xk1 kXk1 M 1x1x
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'