Class notes (6 pages) for MATH693A at San Diego State University, taught by P. Blomgren; uploaded October 20, 2015.
Numerical Optimization: Lecture Notes #6
Line Search Methods: Step Length Selection

Peter Blomgren <blomgren.peter@gmail.com>
Department of Mathematics and Statistics
Dynamical Systems Group / Computational Science Research Center
San Diego State University, San Diego, CA 92182-7720
http://terminus.sdsu.edu
Fall 2009

Outline
1. Line Search Methods: Recap
2. Step Length Selection
   - Interpolation
   - The Initial Step
   - Line Search for the Strong Wolfe Conditions

Quick Recap: Last Time

Rates of convergence for our different optimization strategies:

- We showed that for the simple quadratic model $f(\bar{x}) = \frac{1}{2}\bar{x}^T Q \bar{x} - \bar{b}^T \bar{x}$, the steepest descent method is indeed linearly convergent. The result generalizes to general nonlinear objective functions for which $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive definite.

- Coordinate descent methods are slower than steepest descent; they are useful if the coordinates are decoupled and/or computation of the gradient is not possible or too expensive.

- We stated the result for Newton's method, which says that it is locally quadratically convergent.

- Quasi-Newton methods, where the search direction is $p_k = -B_k^{-1}\nabla f(x_k)$, exhibit superlinear convergence as long as the matrix sequence $B_k$ converges to the Hessian $\nabla^2 f(x^*)$ in the search direction:

      \lim_{k \to \infty} \frac{\| (B_k - \nabla^2 f(x^*))\, p_k \|}{\| p_k \|} = 0.

Lookahead: This Time, A Closer Look at Step Length

Within the line search universe (global optimization problem, local strategies, line search algorithms, search direction and step length, sufficient descent conditions, convergence), we look at techniques for:

- (best case) finding a minimizer of the one-dimensional function

      \phi(\alpha) = f(x_k + \alpha p_k);

- finding a step length $\alpha_k$ which satisfies a "sufficient decrease condition," such as the Wolfe conditions.

We already have one such algorithm:

Algorithm (Backtracking Line Search):
1. Set $\bar\alpha > 0$, $\rho \in (0,1)$, $c_1 \in (0,1)$; set $\alpha \leftarrow \bar\alpha$
2. While $f(x_k + \alpha p_k) > f(x_k) + c_1 \alpha \nabla f(x_k)^T p_k$
3.   $\alpha \leftarrow \rho\alpha$
4. End While
5. Set $\alpha_k \leftarrow \alpha$

Step Length Selection: Assumptions

We must assume that $p_k$ is a descent direction, i.e. that $\phi'(0) < 0$; thus all our steps will be in the positive direction. When the objective $f$ is quadratic, $f(\bar{x}) = \frac{1}{2}\bar{x}^T Q \bar{x} - \bar{b}^T \bar{x} + c$, the optimal step can be found explicitly:

    \alpha_k = -\frac{\nabla f(x_k)^T p_k}{p_k^T Q p_k}.

For general nonlinear $f$ we must use an iterative scheme to find the step length $\alpha_k$. How the line search is performed impacts the robustness and efficiency of the overall optimization method.

Step Length Selection: Classification

It is natural to classify line search methods based on how many derivatives they need:

- 0: Methods based on function values only tend to be inefficient, since they need to narrow the minimizer down to a small interval.
- 1: Gradient information makes it easier to determine whether a certain step is good, i.e. whether it satisfies a sufficient reduction condition.
- More than 1: Methods requiring more than one derivative are quite rare; in order to compute the second derivative $\phi''$, the full Hessian $\nabla^2 f$ is usually needed, which is too costly.

Step Length Selection: Our Focus

The best "bang for the buck" line search algorithms use gradient information; hence those will be the focus of our discussion. A line search algorithm roughly breaks down into the following components:

1. The initial step length $\alpha_0$ is selected.
2. An interval $[\alpha_{\min}, \alpha_{\max}]$ containing acceptable step lengths is identified (the bracketing phase).
3. The final step length is selected from the acceptable set (the selection phase).

Step Length Selection: Interpolation (1 of 7)

First we note that the Armijo condition can be written in terms of $\phi$ as

    \phi(\alpha_k) \le \phi(0) + c_1 \alpha_k \phi'(0),

where $c_1 = 10^{-4}$ in practice. This is stronger, but not much stronger, than requiring descent. Our new algorithms will be efficient in the sense that the gradient $\nabla f(x_k)$ is computed as few times as possible.
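The backtracking scheme and the closed-form quadratic step above can be sketched in a few lines. This is a minimal illustration; the quadratic $Q$, $b$, and the starting point are made up for the example and are not taken from the notes.

```python
import numpy as np

def backtracking(f, grad_f, x, p, alpha_bar=1.0, rho=0.5, c1=1e-4):
    """Backtracking line search: shrink alpha until the Armijo
    (sufficient decrease) condition holds."""
    alpha = alpha_bar
    fx = f(x)
    slope = grad_f(x) @ p          # phi'(0); must be < 0 for a descent direction
    while f(x + alpha * p) > fx + c1 * alpha * slope:
        alpha *= rho               # contract the step: alpha <- rho * alpha
    return alpha

# Illustrative quadratic objective f(x) = 0.5 x^T Q x - b^T x
Q = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ Q @ x - b @ x
g = lambda x: Q @ x - b

x = np.array([2.0, 2.0])
p = -g(x)                          # steepest-descent direction
alpha = backtracking(f, g, x, p)

# For quadratics the optimal step is also available in closed form:
alpha_exact = -(g(x) @ p) / (p @ Q @ p)
```

Note that backtracking only guarantees sufficient decrease; on this example it returns $\alpha = 0.5$ while the exact minimizing step is $26/76 \approx 0.342$.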
If the initial step length $\alpha_0$ satisfies the Armijo condition, then we accept $\alpha_0$ as the step length and terminate the search. As we get close to the solution, this will happen more and more often for Newton and quasi-Newton methods with $\alpha_0 = 1$. Otherwise, we search for an acceptable step length in $(0, \alpha_0)$.

Step Length Selection: Interpolation (2 of 7)

At this stage we have computed three pieces of information, $\phi(0)$, $\phi'(0)$, and $\phi(\alpha_0)$; we use this information to build a quadratic model

    \phi_q(\alpha) = \frac{\phi(\alpha_0) - \phi(0) - \alpha_0\phi'(0)}{\alpha_0^2}\,\alpha^2 + \phi'(0)\,\alpha + \phi(0).

Note that $\phi_q(0) = \phi(0)$, $\phi_q'(0) = \phi'(0)$, and $\phi_q(\alpha_0) = \phi(\alpha_0)$. We set $\phi_q'(\alpha) = 0$ to find the minimum of the model, our next $\alpha$ to try:

    \alpha_1 = -\frac{\phi'(0)\,\alpha_0^2}{2\left[\phi(\alpha_0) - \phi(0) - \phi'(0)\,\alpha_0\right]}.

Step Length Selection: Interpolation (3 of 7)

We now check the Armijo condition $\phi(\alpha_1) \le \phi(0) + c_1\alpha_1\phi'(0)$. If it fails, then we create a cubic function

    \phi_c(\alpha) = a\alpha^3 + b\alpha^2 + \alpha\phi'(0) + \phi(0),

which interpolates $\phi(0)$, $\phi'(0)$, $\phi(\alpha_0)$, and $\phi(\alpha_1)$; the coefficients are given by

    \begin{bmatrix} a \\ b \end{bmatrix}
    = \frac{1}{\alpha_0^2\,\alpha_1^2\,(\alpha_1 - \alpha_0)}
      \begin{bmatrix} \alpha_0^2 & -\alpha_1^2 \\ -\alpha_0^3 & \alpha_1^3 \end{bmatrix}
      \begin{bmatrix} \phi(\alpha_1) - \phi(0) - \phi'(0)\,\alpha_1 \\ \phi(\alpha_0) - \phi(0) - \phi'(0)\,\alpha_0 \end{bmatrix}.

Step Length Selection: Interpolation (4 of 7)

The next iterate $\alpha_2$ is the minimizer of $\phi_c(\alpha)$ which lies in $(0, \alpha_1)$; it is given as one of the roots of the quadratic equation $3a\alpha^2 + 2b\alpha + \phi'(0) = 0$:

    \alpha_2 = \frac{-b + \sqrt{b^2 - 3a\,\phi'(0)}}{3a}.

In the extremely rare cases that $\alpha_2$ does not satisfy the Armijo condition $\phi(\alpha_2) \le \phi(0) + c_1\alpha_2\phi'(0)$, we create a new cubic model interpolating $\phi(0)$, $\phi'(0)$, and the two most recent $\alpha$'s.

Step Length Selection: Interpolation (5 of 7)

At this point we must introduce the following safeguards to guarantee that we make sufficient progress: if $|\alpha_{k+1} - \alpha_k| < \epsilon_1$, or $|\alpha_{k+1}| < \epsilon_2$, then we set $\alpha_{k+1} = \alpha_k/2$.
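The quadratic-model step $\alpha_1$ can be computed directly from $\phi(0)$, $\phi'(0)$, and $\phi(\alpha_0)$. A sketch; the test function $\phi(\alpha) = (\alpha - 0.3)^2$ is made up, chosen because the quadratic model is then exact and recovers the true minimizer.

```python
def quad_interp_step(phi0, dphi0, alpha0, phi_a0):
    """Minimizer of the quadratic model interpolating
    phi(0), phi'(0), and phi(alpha0)."""
    return -dphi0 * alpha0 ** 2 / (2.0 * (phi_a0 - phi0 - dphi0 * alpha0))

# phi(a) = (a - 0.3)^2: phi(0) = 0.09, phi'(0) = -0.6, phi(1) = 0.49,
# so the model is exact and the trial step lands on the minimizer a = 0.3.
a1 = quad_interp_step(phi0=0.09, dphi0=-0.6, alpha0=1.0, phi_a0=0.49)
```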
The algorithm described assumes that computing the derivative is significantly more expensive than computing function values. However, it is often (but not always) possible to compute the directional derivative, or a good estimate thereof, with minimal extra cost. In those cases we build the cubic interpolant so that it interpolates $\phi(\alpha_{k-1})$, $\phi'(\alpha_{k-1})$, $\phi(\alpha_k)$, and $\phi'(\alpha_k)$; this is a Hermite polynomial of degree 3 (see Math 541).

Step Length Selection: Interpolation (6 of 7)

The cubic Hermite polynomial satisfying

    H_3(\alpha_{k-1}) = \phi(\alpha_{k-1}), \quad H_3'(\alpha_{k-1}) = \phi'(\alpha_{k-1}),
    H_3(\alpha_k) = \phi(\alpha_k), \quad H_3'(\alpha_k) = \phi'(\alpha_k),

can be written explicitly (straight from Math 541); in terms of $t = (\alpha - \alpha_{k-1})/(\alpha_k - \alpha_{k-1})$,

    H_3(\alpha) = (1 + 2t)(1 - t)^2\,\phi(\alpha_{k-1}) + t^2(3 - 2t)\,\phi(\alpha_k)
                + t(1 - t)^2(\alpha_k - \alpha_{k-1})\,\phi'(\alpha_{k-1})
                + t^2(t - 1)(\alpha_k - \alpha_{k-1})\,\phi'(\alpha_k).

Step Length Selection: Interpolation (7 of 7)

The minimizer of $H_3(\alpha)$ in $[\alpha_{k-1}, \alpha_k]$ is either at one of the endpoints, or else in the interior, given by setting $H_3'(\alpha) = 0$. The interior point is given by

    \alpha_{k+1} = \alpha_k - (\alpha_k - \alpha_{k-1})\,
                   \frac{\phi'(\alpha_k) + d_2 - d_1}{\phi'(\alpha_k) - \phi'(\alpha_{k-1}) + 2d_2},

where

    d_1 = \phi'(\alpha_{k-1}) + \phi'(\alpha_k) - 3\,\frac{\phi(\alpha_{k-1}) - \phi(\alpha_k)}{\alpha_{k-1} - \alpha_k},
    d_2 = \mathrm{sign}(\alpha_k - \alpha_{k-1})\,\sqrt{d_1^2 - \phi'(\alpha_{k-1})\,\phi'(\alpha_k)}.

Either $\alpha_{k+1}$ is accepted as the step length, or the search process continues. Cubic interpolation gives quadratic convergence in the step length selection algorithm.
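The interior-minimizer formula for the cubic Hermite interpolant translates directly into code. A sketch; since the formula is exact for cubics, we can check it on the made-up test function $\phi(\alpha) = \alpha^3 - 3\alpha$, which has a local minimizer at $\alpha = 1$.

```python
import math

def cubic_hermite_min(a_prev, a, phi_prev, phi_a, dphi_prev, dphi_a):
    """Interior minimizer of the cubic Hermite interpolant matching
    phi and phi' at a_prev and a (the d1/d2 formula from the notes)."""
    d1 = dphi_prev + dphi_a - 3.0 * (phi_prev - phi_a) / (a_prev - a)
    d2 = math.copysign(1.0, a - a_prev) * math.sqrt(d1 * d1 - dphi_prev * dphi_a)
    return a - (a - a_prev) * (dphi_a + d2 - d1) / (dphi_a - dphi_prev + 2.0 * d2)

# phi(a) = a^3 - 3a on [0, 2]: phi(0) = 0, phi(2) = 2, phi'(0) = -3, phi'(2) = 9
a_next = cubic_hermite_min(0.0, 2.0, 0.0, 2.0, -3.0, 9.0)   # exact minimizer: 1.0
```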
Step Length Selection: The Initial Step (1 of 2)

For Newton and quasi-Newton methods, the search vector $p_k$ contains an intrinsic sense of scale, being formed from the local descent and curvature information; hence the initial trial step length should always be $\alpha_0 = 1$, otherwise we break the quadratic (respectively superlinear) convergence properties.

For other search directions, such as steepest descent and conjugate gradient (to be described later), which do not have a sense of scale, other methods must be used to select a good first trial step.

Strategy #1: Assume that the rate of change in the current iteration will be the same as in the previous iteration, and select $\alpha_0$ so that $\alpha_0\,\nabla f(x_k)^T p_k = \alpha_{k-1}\,\nabla f(x_{k-1})^T p_{k-1}$, i.e.

    \alpha_0 = \alpha_{k-1}\,\frac{\nabla f(x_{k-1})^T p_{k-1}}{\nabla f(x_k)^T p_k}.

Step Length Selection: The Initial Step (2 of 2)

Strategy #2: Use the minimizer of the quadratic interpolant to $f(x_{k-1})$, $f(x_k)$, and $\phi'(0) = \nabla f(x_{k-1})^T p_{k-1}$ as the initial $\alpha$:

    \alpha_0 = \frac{2\left[f(x_k) - f(x_{k-1})\right]}{\nabla f(x_{k-1})^T p_{k-1}}.

If this strategy is used with a quadratically or superlinearly convergent algorithm, the choice of $\alpha_0$ must be modified slightly to preserve the convergence properties:

    \alpha_0 = \min(1,\ 1.01\,\alpha_0^{\mathrm{new}});

this ensures that the step length $\alpha_0 = 1$ will eventually always be tried.
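Both initial-step strategies reduce to one-line formulas. A sketch; the gradients, directions, and function values below are illustrative, not taken from the notes.

```python
import numpy as np

def initial_step_rate_match(a_prev, g_prev, p_prev, g, p):
    """Strategy #1: choose alpha0 so that
    alpha0 * g^T p = a_prev * g_prev^T p_prev."""
    return a_prev * (g_prev @ p_prev) / (g @ p)

def initial_step_quad_interp(f_prev, f_curr, g_prev, p_prev):
    """Strategy #2: minimizer of the quadratic interpolant, safeguarded
    with min(1, 1.01*alpha0) so that alpha0 = 1 is eventually tried."""
    a0 = 2.0 * (f_curr - f_prev) / (g_prev @ p_prev)
    return min(1.0, 1.01 * a0)

# Illustrative data: previous slope -4, current slope -2, previous step 0.5
g_prev, p_prev = np.array([2.0, 0.0]), np.array([-2.0, 0.0])
g, p = np.array([1.0, 1.0]), np.array([-1.0, -1.0])
a0_1 = initial_step_rate_match(0.5, g_prev, p_prev, g, p)
a0_2 = initial_step_quad_interp(4.0, 2.0, g_prev, p_prev)
```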
Line Search for the Strong Wolfe Conditions (1 of 6)

Algorithm (LS: Strong Wolfe Conditions):
01 Set $\alpha_0 = 0$; choose $\alpha_1 > 0$ and $\alpha_{\max}$; $i \leftarrow 1$
02 while TRUE
03   Compute $\phi(\alpha_i)$
04   if $\phi(\alpha_i) > \phi(0) + c_1\alpha_i\phi'(0)$, or [$\phi(\alpha_i) \ge \phi(\alpha_{i-1})$ and $i > 1$]
05     $\alpha_* = \mathrm{zoom}(\alpha_{i-1}, \alpha_i)$ and terminate search
06   Compute $\phi'(\alpha_i)$
07   if $|\phi'(\alpha_i)| \le -c_2\phi'(0)$
08     $\alpha_* = \alpha_i$ and terminate search
09   if $\phi'(\alpha_i) \ge 0$
10     $\alpha_* = \mathrm{zoom}(\alpha_i, \alpha_{i-1})$ and terminate search
11   Choose $\alpha_{i+1} \in (\alpha_i, \alpha_{\max})$
12   $i \leftarrow i + 1$
13 end

Line Search for the Strong Wolfe Conditions (2 of 6)

In the first stage of the algorithm, either an acceptable step length, or a range $(\alpha_{i-1}, \alpha_i)$ containing an acceptable step length, is identified; if none of the conditions in lines 04, 07, 09 is satisfied, the step length is increased (line 11). If in the first stage we identified a range, the second stage invokes a function zoom which will identify an acceptable step from the interval.

Notes: Line 04 establishes that $\alpha_i$ is too long a step; thus $\alpha_*$ must be in the range $(\alpha_{i-1}, \alpha_i)$. If line 07 holds, then both the strong Wolfe conditions hold, since line 04 did not. Finally, if line 09 holds, then the step is too large, since we are going uphill at this point.

Line Search for the Strong Wolfe Conditions (3 of 6)

The zoom function takes two arguments, $\mathrm{zoom}(\alpha_{\mathrm{low}}, \alpha_{\mathrm{high}})$, satisfying the following:

1. The interval bounded by $\alpha_{\mathrm{low}}$ and $\alpha_{\mathrm{high}}$ contains step lengths which satisfy the strong Wolfe conditions.
2. $\alpha_{\mathrm{low}}$ is the $\alpha$ corresponding to the lower function value, i.e. $\phi(\alpha_{\mathrm{low}}) < \phi(\alpha_{\mathrm{high}})$.
3. $\alpha_{\mathrm{low}}$ and $\alpha_{\mathrm{high}}$ satisfy $\phi'(\alpha_{\mathrm{low}})(\alpha_{\mathrm{high}} - \alpha_{\mathrm{low}}) < 0$.

See the zoom figure below.

Line Search for the Strong Wolfe Conditions (4 of 6)

Algorithm (zoom):
01 while TRUE
02   Interpolate to find $\alpha_j$ between $\alpha_{\mathrm{low}}$ and $\alpha_{\mathrm{high}}$
03   Compute $\phi(\alpha_j)$
04   if $\phi(\alpha_j) > \phi(0) + c_1\alpha_j\phi'(0)$, or $\phi(\alpha_j) \ge \phi(\alpha_{\mathrm{low}})$
05     $\alpha_{\mathrm{high}} \leftarrow \alpha_j$
06   else
07     Compute $\phi'(\alpha_j)$
08     if $|\phi'(\alpha_j)| \le -c_2\phi'(0)$
09       $\alpha_* \leftarrow \alpha_j$ and return $\alpha_*$
10     if $\phi'(\alpha_j)(\alpha_{\mathrm{high}} - \alpha_{\mathrm{low}}) \ge 0$
11       $\alpha_{\mathrm{high}} \leftarrow \alpha_{\mathrm{low}}$
12     $\alpha_{\mathrm{low}} \leftarrow \alpha_j$
13 end

Line Search for the Strong Wolfe Conditions (5 of 6)

In practical applications, $c_1 = 10^{-4}$ and $c_2 = 0.9$; enforcing the strong Wolfe conditions requires a similar amount of work compared with the (plain) Wolfe conditions. The advantage of the strong conditions is that by decreasing $c_2$ we can force the accepted step lengths to be closer to the local minima of $\phi$; this is particularly helpful in applications of steepest descent or conjugate gradient methods.

[Figure: a possible scenario for zoom. Since $\phi(\alpha_{\mathrm{low}}) < \phi(\alpha_{\mathrm{high}})$, we must have negative slope at $\alpha_{\mathrm{low}}$.]

Line Search for the Strong Wolfe Conditions (6 of 6)

[Figure: illustrating the line-04 condition. Since we know we can push the objective down to $\phi(\alpha_{\mathrm{low}})$, we reject $\alpha_i$ even though it satisfies the Armijo condition.]
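The two-stage algorithm (bracketing, then zoom) can be sketched as follows. This is a simplified version, not the notes' exact implementation: the "interpolate" step inside zoom is replaced by bisection, the bracket is extended by doubling, and the 1-D test problem is made up for illustration.

```python
def wolfe_search(phi, dphi, a1=1.0, a_max=10.0, c1=1e-4, c2=0.9, max_iter=50):
    """Line search for the strong Wolfe conditions: a bracketing stage
    followed by a zoom stage (bisection stands in for interpolation)."""

    def zoom(a_lo, a_hi):
        for _ in range(max_iter):
            a_j = 0.5 * (a_lo + a_hi)      # interpolation step (here: bisection)
            if phi(a_j) > phi(0.0) + c1 * a_j * dphi(0.0) or phi(a_j) >= phi(a_lo):
                a_hi = a_j                 # sufficient decrease fails: shrink from above
            else:
                if abs(dphi(a_j)) <= -c2 * dphi(0.0):
                    return a_j             # strong Wolfe conditions hold
                if dphi(a_j) * (a_hi - a_lo) >= 0.0:
                    a_hi = a_lo
                a_lo = a_j
        return a_lo

    a_prev, a = 0.0, a1
    for i in range(1, max_iter + 1):
        if phi(a) > phi(0.0) + c1 * a * dphi(0.0) or (i > 1 and phi(a) >= phi(a_prev)):
            return zoom(a_prev, a)         # bracket found: (a_prev, a)
        if abs(dphi(a)) <= -c2 * dphi(0.0):
            return a                       # a is already acceptable
        if dphi(a) >= 0.0:
            return zoom(a, a_prev)         # went uphill: bracket in reverse order
        a_prev, a = a, min(2.0 * a, a_max) # extend the bracket
    return a

# 1-D trial problem: phi(a) = (a - 2)^2, with minimizer at a = 2
phi = lambda a: (a - 2.0) ** 2
dphi = lambda a: 2.0 * (a - 2.0)
alpha = wolfe_search(phi, dphi)
```

For production use, SciPy's `scipy.optimize.line_search` provides a strong-Wolfe line search along the same lines, with interpolation-based bracketing.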