Popular in Course
Popular in Mathematics (M)
This 6 page Class Notes was uploaded by Braeden Lind on Thursday October 15, 2015. The Class Notes belongs to MA 796S at North Carolina State University taught by K. Sivaramakr in Fall. Since its upload, it has received 16 views. For similar materials see /class/223703/ma-796s-north-carolina-state-university in Mathematics (M) at North Carolina State University.
Reviews for ST
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 10/15/15
OR 791K Convex Optimization and Interior Point Methods Prepared for DriKartik Sivaramakrishnan December 2007 A Summary Fastest Mixing Markov Chain on a Graph Prepared by Susan Osborne 1 Introduction This paper summarizes the ideas presented in Fastest Mixing Markov Chain on 1 Graph Written by Stephen Boyd Persi Diaconis and Lin Xiaoi The author of this paper makes no claim of originality The intent is to review some of the ideas set forth in the aforementioned article and to expound upon its basic content This is accomplished through describing details of the intricacies of the problem and by illustrating an example for Which data is given in the article 2 Fastest Mixing Markov Chain on a Graph 21 Markov Chain on a Graph Let G V E denote a graph With vertex set V 12 M n and edge set E Where edge ij E E Whenever there exists an edge that joins vertex vi to vertex vji A graph G is an undirected graph Whenever edge j E E if and only if ij E El A vertex vi has a selfloop Whenever the edge E El A Markov Chain is a stochastic process Xnn 0 12 l i such that if Xn i the process is said to be in state i and there is a xed probability Pij that the process Will be next in state j A markov chain that takes on only a nite of countable number of states is called a discretetime markov chaini Consider a connected undirected graph on n vertices With selfloops at each vertex Then de ne a discretetime markov chain on the vertices of the graph by letting Xt E V for t 01Hi be the actual state of the markov chain at time t An edge of the graph is assigned a transition probability With Which Xt can move to a neighboring vertex or possibly stay in the current position due to the fact that each vertex has a selfloopi Note that the transition probabilities are nonnegative and the sum of the probabilities assigned to the edges connected to any particular vertex must be equal to 1 The markov chain is represented by a transition probability matrix P Where P is an n x n matrix de ned by 1 Where PM 0 if ij E ie the probability of moving to a nonadjacent vertex is 0 Additionally the matrix must be doubly stochastic ie the rows and columns both sum to l and symmetric These conditions imply that the matrix satis es P 2 0 ie Pij 2 0 for all i j 1 Hi n e e Where PM ProbXt1 let i ij 1Hi 7n7 e is a vector of all ones since P is doubly stochastic and P PT due to the symmetry of Pi De ne the probability distribution of the state at time t as 7rt E R such that Trit Pr0bXt The state distribution satis es the recursive equation 7rt 1 PT7rt which implies that the distribution at time t is given by 7rt 1 PT7r0 where PT is the t step transition probability matrix which is obtained by raising the matrix P to the t power A markov chain is irreducible if all states can be accessed from any given state and aperiodic if starting in state i the process can return to state i in one step The markov chain de ned as above is both irreducible and aperiodici Since the above process satis es these two properties then as t increases the probability distribution of the state time 7rt converges to the uniform distribution ei 22 The Second Largest Eigenvalue Modulus The eigenvalue structure of the matrix P determines the rate of convergence of 7rt to the uniform distribution ei Since P is a stochastic symmetric matrix its spectral radius MP 1 and all of its eigenvalues are real According to Perron Frobenius theory the eigenvalues must also be less than or equal to 1 in magnitude Thus the eigenvalues of P can be arranged in a nondecreasing order as follows 1 A1 P 2 A2 P 2 H i 2 AnP 2 71 Then rate of convergence is given by the second largest eigenvalue modulus SLEM which is de ned by P max2PrnPA 2 Thus MP lt 1 for the irreducible aperiodic stochastic matrix Pi Some typi cal measures for fast mixing of a markov chain are the mixing rate logi the mixing time r and the spectral gap 1 7 Mi The focus here is on the SLEM M the smaller the SLEM the faster the markov chain reaches its equilibrium dis tribution 23 Fastest Mixing Markov Chain Problem Consider the problem of determining the edge transition probabilities that pro vide the fastest mixing markov chain This can be achieved by minimizing the SLEM as follows min MP siti P 2 0 PTe e 3 P PT Pm 07 23139 3 Note the optimization variable is P and the graph is the data The optimal SLEM is Mquot inf MPlP 2 0PTe eP PTPj 0 ij The ex istence of one optimal transition matrix is guaranteed by the continuity on M and the fact the set of all transition matrices is compact see 3 Heuristic Methods The MaximumDegree Chain and the MetropolisHastings Chain are two heuris tic methods that attempt to obtain the transition probabilities for the fastest mixing markov chainl Although neither is guaranteed to produce the transition probabilities that give the fastest mixing markov chain these simple heuristics tend to provide probabilities for a fast mixing markov chainl Both methods are illustrated on the simple graph given in gure 1 shown without selfloops in the sections below 3 4 Figure 1 Simple graph 31 The MaximumDegree Chain Vertices v and u are adjacent vertices if the edge m is in the edge set E of the graph G The degree of vertex v is the number of vertices adjacent with vertex 1 not counting self loopsl Let the degree of vertex 239 be d and let dmm be the maximum degree of the graph Now assign to each edge of the graph excluding selfloops the probability dial For the graph in gure 1 notice that node 2 has 3 adjacent vertices and no other node has a degree greater than 3 Thus dmm 3 and the probability for each edge that is not a self loop is equal to g The probability of each selfloop is then assigned to ensure that the sum of the probabilities at particular vertex is equal to ll Shown in Figure 2 are two graphs the rst showing the assignment of the probabilities without the selfloops and second including the selfloops and their probabilitiesl Once the Figure 2 Application of MaximumDegree Chain Heuristic probabilities have been assigned to the graph they can be transformed into a matrix called the maximumdegree transition probability matrix Pg This matrix is given by W 277 5 E 7i j PW viii 239j lt4 0 23139 E For the graph in gure 1 2313 0 0 13 0 13 13 5 0 13 13 13 0 13 13 13 md 7 PW Since the second largest eigenvalue of Pm i is g the SLEM for the matrix is MP 32 The MetropolisHastings Chain A random walk on a graph can be visualized by assuming an individual is located at some vertex vi on the graph at some point in time Suppose that the individual must move to another vertex but has no preference as to which to move to next Thus any vertex that is adjacent to vi is equally likely to be chosen For example consider the simple graph in Figure 1 If the person is at vertex 4 then there are two adjacent vertices and thus the probability is that the individual will move to vertex 2 Likewise the probability that the individual moves to vertex 3 from vertex 4 is also Clearly the transition probabilities of a simple random walk on a graph is given by W 3 memy j P l3 WM 6 The symmetry of the probability matrix is necessary to ensure that the eigen values are real Note that the random walk probability matrix is not sym metric However the transition probabilities can be modi ed by applying the Metropolis Hastings algorithm to the random walk probability matrix see 4 for details on the algorithm If the equilibrium distribution is the uniform dis tribution then the resulting transition probability matrix is symmetric This matrix is given in a simpli ed form by min i WEE7amp1 mh Pij 20306147 max 0 di 7 i z j 7 0 i7j E Figure 3 shows the results of applying this function to the graph in Figure 1 1 1 23 13 13 2 z 7 0 13 13 13 quot 13 3 4 1 3 g qu 16 12 Figure 3 Application of Metropolis Hastings Chain Heuristic The resulting transition probability matrix is 2 3 1 3 0 0 1 3 0 1 3 13 mh 7 PW 0 13 16 12 8 0 1 3 12 1 6 The eigenvalues of this matrix are 1 and ill Thus the SLEM for the matrix MPMh is It is appropriate to note that in this example both heuristics gave the same SLEM yet this is not always the case see 1 for an example for which the results differ on 4 Convexity and Convex Optimization 41 Convexity of the SLEM The convexity of M is established through the use of the spectral norm of P restricted to the subspace ei u E RnleTu 0 This requires the de termination of the orthogonal projection on at Normalizing the vector 6 gives Then the elementary orthogonal projection for ei is given by the matrix I 7 eeTsee 3 for more on orthogonal projectorsi Now let 1 T 1 T 1 T MP H17ee PIf ee HP17ee Th1s 1s the spec tral norm of an af ne function of P an thus a convex function 42 Convex Problem Formulation Consider the constraint set for the Fastest Mixing Markov Chain problem along with the objective of minimizing the norm MPi Formulate the problem as follows min MP HP17eeTH2 siti P 2 0 PTe e 9 P PT PM 07 iyj EA Since this formulation has a convex objective function with all linear constraints the problem of minimizing the spectral norm over the set of symmetric stochastic matrices is a convex optimization problemi 43 Semide nite Programming Formulation Introducing a scalar s as an upper bound on the norm the convex optimization problem 9 can be rewritten as a semide nite programming SDP problemi The SDP is as follows min 8 st 781 j P17eeT j 81 P 2 0 PTe e 10 P PT PM 07i7j EA The SDP 10 can be rewritten and a linear matrix inequality LMl1 Let Diagsvec denote the diagonal matrix consisting of the n 21 upper triangle positions of the symmetric matrix R Then P7 eeTsI 317PeeT and DiagsvecPare positive semide nite matrices A matrix with block diagonal form in which each block is positive semide nite is also positive semide nitei Thus rewrite the SDP 10 as follows min 8 P7 eeT 31 0 0 11 0 s 7 P eeT 0 i 07 0 0 DiagsvecP For graphs with up to approximately one thousand edges this SDP can be solved to eoptimality by a standard SDP solveri Larger problems can be solved by tailoring a solver to take advantage ofthe sparsity of the matrix that is particular to this application The simple graph in Figure 1 was solved exactly by an SDP solver to obtain the transition probability matrix 611 511 0 0 7 511 0 311 311 PW 0 311 411 411 12 0 311 411 411 The SLEM of this matrix is MUD which is smaller than the SLEM for the heuristic methods for this graph It should be noted that the matrix 12 is not unique 5 Conclusion Several heuristic methods are available to determine a fast mixing for a discrete time markov chain de ned on a undirected connected graph The Fastest Mixing Markov Chain Problem can be formulated as convex optimization problem more speci cally as a SDR The typical measures of fast mixing are based on the SLEM1 Using the SLEM as the measure the SDP formulation of the problem out performs both heuristics methodsi References 1 Si BOYD P DIACONIS AND L XIAo Fastest Mixing Markov Chain on a Graph SlAM Review 462004 pp 6676891 2 G1 CHARTRAND AND L LESNIAK Graphs and Digraphs 3rd ed Chapman and Hall Suffolk Great Britain 1996 3 C1 MEYER Matrix Analysis and Applied Linear Algebra SlAM Philadel phia 20041 4 Si ROSS Introduction to Probability Models 8th ed Academic Press San Diego CA 20031 5 K1 SIVARAMAKRISHNAN Class Notes North Carolina State Univeristy Raleigh NC 20071