Machine Learning CSE 847
This 55 page Class Notes was uploaded by Donnell Kertzmann on Saturday September 19, 2015. The Class Notes belongs to CSE 847 at Michigan State University taught by Rong Jin in Fall. Since its upload, it has received 86 views. For similar materials see /class/207401/cse-847-michigan-state-university in Computer Science and Engineering at Michigan State University.
Date Created: 09/19/15
The Matrix Cookbook
Kaare Brandt Petersen, Michael Syskind Pedersen
Version: February 16, 2006

What is this? These pages are a collection of facts (identities, approximations, inequalities, relations, ...) about matrices and matters relating to them. It is collected in this form for the convenience of anyone who wants a quick desktop reference.

Disclaimer: The identities, approximations and relations presented here were obviously not invented but collected, borrowed and copied from a large amount of sources. These sources include similar but shorter notes found on the internet and appendices in books (see the references for a full list).

Errors: Very likely there are errors, typos, and mistakes, for which we apologize and would be grateful to receive corrections at cookbook@2302.dk.

It's ongoing: The project of keeping a large repository of relations involving matrices is naturally ongoing, and the version will be apparent from the date in the header.

Suggestions: Your suggestion for additional content or elaboration of some topics is most welcome at cookbook@2302.dk.

Keywords: Matrix algebra, matrix relations, matrix identities, derivative of determinant, derivative of inverse matrix, differentiate a matrix.

Acknowledgements: We would like to thank the following for contributions and suggestions: Christian Rishoj, Douglas L. Theobald, Esben Hoegh-Rasmussen, Lars Christiansen, and Vasile Sima. We would also like to thank The Oticon Foundation for funding our PhD studies.

Contents
1 Basics
  1.1 Trace and Determinants
  1.2 The Special Case 2x2
2 Derivatives
  2.1 Derivatives of a Determinant
  2.2 Derivatives of an Inverse
  2.3 Derivatives of Matrices, Vectors and Scalar Forms
  2.4 Derivatives of Traces
  2.5 Derivatives of Structured Matrices
3 Inverses
  3.1 Basic
  3.2 Exact Relations
  3.3 Implication on Inverses
  3.4 Approximations
  3.5 Generalized Inverse
  3.6 Pseudo Inverse
4 Complex Matrices
  4.1 Complex Derivatives
5 Decompositions
  5.1 Eigenvalues and Eigenvectors
  5.2 Singular Value Decomposition
  5.3 Triangular Decomposition
6 Statistics and Probability
  6.1 Definition of Moments
  6.2 Expectation of Linear Combinations
  6.3 Weighted Scalar Variable
7 Gaussians
  7.1 Basics
  7.2 Moments
  7.3 Miscellaneous
  7.4 Mixture of Gaussians
8 Special Matrices
  8.1 Units, Permutation and Shift
  8.2 The Singleentry Matrix
  8.3 Symmetric and Antisymmetric
  8.4 Vandermonde Matrices
  8.5 Toeplitz Matrices
  8.6 The DFT Matrix
Petersen & Pedersen, The Matrix Cookbook (Version: February 16, 2006)
  8.7 Positive Definite and Semidefinite Matrices
  8.8 Block matrices
9 Functions and Operators
  9.1 Functions and Series
  9.2 Kronecker and Vec Operator
  9.3 Solutions to Systems of Equations
  9.4 Matrix Norms
  9.5 Rank
  9.6 Integral Involving Dirac Delta Functions
  9.7 Miscellaneous
A One-dimensional Results
  A.1 Gaussian
  A.2 One Dimensional Mixture of Gaussians
B Proofs and Details
  B.1 Misc Proofs

Notation and Nomenclature
$\mathbf{A}$    Matrix
$\mathbf{A}_{ij}$    Matrix indexed for some purpose
$\mathbf{A}_i$    Matrix indexed for some purpose
$\mathbf{A}^{ij}$    Matrix indexed for some purpose
$\mathbf{A}^n$    Matrix indexed for some purpose, or the n.th power of a square matrix
$\mathbf{A}^{-1}$    The inverse matrix of the matrix $\mathbf{A}$
$\mathbf{A}^+$    The pseudo inverse matrix of the matrix $\mathbf{A}$ (see Sec. 3.6)
$\mathbf{A}^{1/2}$    The square root of a matrix (if unique), not elementwise
$(\mathbf{A})_{ij}$    The (i,j).th entry of the matrix $\mathbf{A}$
$A_{ij}$    The (i,j).th entry of the matrix $\mathbf{A}$
$[\mathbf{A}]_{ij}$    The ij-submatrix, i.e. $\mathbf{A}$ with i.th row and j.th column deleted
$\mathbf{a}$    Vector
$\mathbf{a}_i$    Vector indexed for some purpose
$a_i$    The i.th element of the vector $\mathbf{a}$
$a$    Scalar
$\Re z$    Real part of a scalar
$\Re \mathbf{z}$    Real part of a vector
$\Re \mathbf{Z}$    Real part of a matrix
$\Im z$    Imaginary part of a scalar
$\Im \mathbf{z}$    Imaginary part of a vector
$\Im \mathbf{Z}$    Imaginary part of a matrix
$\det(\mathbf{A})$    Determinant of $\mathbf{A}$
$\mathrm{Tr}(\mathbf{A})$    Trace of the matrix $\mathbf{A}$
$\mathrm{diag}(\mathbf{A})$    Diagonal matrix of the matrix $\mathbf{A}$, i.e. $(\mathrm{diag}(\mathbf{A}))_{ij} = \delta_{ij} A_{ij}$
$\mathrm{vec}(\mathbf{A})$    The vector-version
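of the matrix $\mathbf{A}$ (see Sec. 9.2.2)
$\|\mathbf{A}\|$    Matrix norm (subscript, if any, denotes what norm)
$\mathbf{A}^T$    Transposed matrix
$\mathbf{A}^*$    Complex conjugated matrix
$\mathbf{A}^H$    Transposed and complex conjugated matrix (Hermitian)
$\mathbf{A} \circ \mathbf{B}$    Hadamard (elementwise) product
$\mathbf{A} \otimes \mathbf{B}$    Kronecker product
$\mathbf{0}$    The null matrix. Zero in all entries.
$\mathbf{I}$    The identity matrix
$\mathbf{J}^{ij}$    The single-entry matrix, 1 at (i,j) and zero elsewhere
$\mathbf{\Sigma}$    A positive definite matrix
$\mathbf{\Lambda}$    A diagonal matrix

1 Basics

$(AB)^{-1} = B^{-1}A^{-1}$
$(ABC\ldots)^{-1} = \ldots C^{-1}B^{-1}A^{-1}$
$(A^T)^{-1} = (A^{-1})^T$
$(A + B)^T = A^T + B^T$
$(AB)^T = B^T A^T$
$(ABC\ldots)^T = \ldots C^T B^T A^T$
$(A^H)^{-1} = (A^{-1})^H$
$(A + B)^H = A^H + B^H$
$(AB)^H = B^H A^H$
$(ABC\ldots)^H = \ldots C^H B^H A^H$

1.1 Trace and Determinants

$\mathrm{Tr}(A) = \sum_i A_{ii}$
$\mathrm{Tr}(A) = \sum_i \lambda_i, \quad \lambda_i = \mathrm{eig}(A)$
$\mathrm{Tr}(A) = \mathrm{Tr}(A^T)$
$\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$
$\mathrm{Tr}(A + B) = \mathrm{Tr}(A) + \mathrm{Tr}(B)$
$\mathrm{Tr}(ABC) = \mathrm{Tr}(BCA) = \mathrm{Tr}(CAB)$
$\det(A) = \prod_i \lambda_i, \quad \lambda_i = \mathrm{eig}(A)$
$\det(AB) = \det(A)\det(B)$
$\det(A^{-1}) = 1/\det(A)$
$\det(I + uv^T) = 1 + u^T v$

1.2 The Special Case 2x2

Consider the matrix A:
$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$

Determinant and trace:
$\det(A) = A_{11}A_{22} - A_{12}A_{21}$
$\mathrm{Tr}(A) = A_{11} + A_{22}$

Eigenvalues:
$\lambda^2 - \lambda \cdot \mathrm{Tr}(A) + \det(A) = 0$
$\lambda_1 = \frac{\mathrm{Tr}(A) + \sqrt{\mathrm{Tr}(A)^2 - 4\det(A)}}{2}, \quad \lambda_2 = \frac{\mathrm{Tr}(A) - \sqrt{\mathrm{Tr}(A)^2 - 4\det(A)}}{2}$
$\lambda_1 + \lambda_2 = \mathrm{Tr}(A), \quad \lambda_1 \lambda_2 = \det(A)$

Eigenvectors:
$v_1 \propto \begin{bmatrix} A_{12} \\ \lambda_1 - A_{11} \end{bmatrix}, \quad v_2 \propto \begin{bmatrix} A_{12} \\ \lambda_2 - A_{11} \end{bmatrix}$

Inverse:
$A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix}$

2 Derivatives

This section covers differentiation of a number of expressions with respect to a matrix X. Note that it is always assumed that X has no special structure, i.e. that the elements of X are independent (e.g. not symmetric, Toeplitz, positive definite). See Section 2.5 for differentiation of structured matrices. The basic assumptions can be written in a formula as
$\frac{\partial X_{kl}}{\partial X_{ij}} = \delta_{ik}\delta_{lj}$
that is, for e.g. vector forms,
$\left[\frac{\partial x}{\partial y}\right]_i = \frac{\partial x_i}{\partial y}, \quad \left[\frac{\partial x}{\partial y}\right]_i = \frac{\partial x}{\partial y_i}, \quad \left[\frac{\partial x}{\partial y}\right]_{ij} = \frac{\partial x_i}{\partial y_j}$

The following rules are general and very useful when deriving the differential of an expression ([13]):
$\partial A = 0$ (A is a constant)    (1)
$\partial(\alpha X) = \alpha\,\partial X$    (2)
$\partial(X + Y) = \partial X + \partial Y$    (3)
$\partial(\mathrm{Tr}(X)) = \mathrm{Tr}(\partial X)$    (4)
$\partial(XY) = (\partial X)Y + X(\partial Y)$    (5)
$\partial(X \circ Y) = (\partial X) \circ Y + X \circ (\partial Y)$    (6)
$\partial(X \otimes Y) = (\partial X) \otimes Y + X \otimes (\partial Y)$    (7)
$\partial(X^{-1}) = -X^{-1}(\partial X)X^{-1}$    (8)
$\partial(\det(X)) = \det(X)\,\mathrm{Tr}(X^{-1}\partial X)$    (9)
$\partial(\ln(\det(X))) = \mathrm{Tr}(X^{-1}\partial X)$    (10)
$\partial X^T = (\partial X)^T$    (11)
$\partial X^H = (\partial X)^H$    (12)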
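The differential rules for the inverse, $\partial(X^{-1}) = -X^{-1}(\partial X)X^{-1}$, and for the determinant, $\partial(\det X) = \det(X)\,\mathrm{Tr}(X^{-1}\partial X)$, can be sanity-checked against central finite differences. The following is only a minimal sketch under stated assumptions: it restricts itself to a real 2x2 matrix so that the closed-form determinant and inverse suffice, and the helper names (`det2`, `inv2`, `matmul`, `perturb`) as well as the test matrices `X` and `dX` are made up for this illustration.

```python
# Numerical check of two differential rules on a 2x2 example.
# All helper names and test values are illustrative assumptions.

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix via the closed-form 2x2 expression."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def perturb(x, dx, eps):
    """Return x + eps * dx."""
    return [[x[i][j] + eps * dx[i][j] for j in range(2)] for i in range(2)]

X = [[2.0, 1.0], [0.5, 3.0]]     # arbitrary invertible test matrix
dX = [[0.3, -0.2], [0.1, 0.4]]   # arbitrary perturbation direction
eps = 1e-6

Xp, Xm = perturb(X, dX, eps), perturb(X, dX, -eps)
Xinv = inv2(X)

# Determinant rule: d det(X) = det(X) * Tr(X^{-1} dX)
P = matmul(Xinv, dX)
det_rule = det2(X) * (P[0][0] + P[1][1])
det_fd = (det2(Xp) - det2(Xm)) / (2 * eps)

# Inverse rule: d(X^{-1}) = -X^{-1} (dX) X^{-1}
Q = matmul(P, Xinv)
inv_rule = [[-Q[i][j] for j in range(2)] for i in range(2)]
Ip, Im = inv2(Xp), inv2(Xm)
inv_fd = [[(Ip[i][j] - Im[i][j]) / (2 * eps) for j in range(2)]
          for i in range(2)]

det_err = abs(det_rule - det_fd)
inv_err = max(abs(inv_rule[i][j] - inv_fd[i][j])
              for i in range(2) for j in range(2))
print(det_err, inv_err)  # both differences should be close to zero
```

The same pattern (perturb in a direction `dX`, compare the rule with a difference quotient) extends to the other rules, although larger matrices would need a general determinant and inverse routine.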
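2.1 Derivatives of a Determinant

2.1.1 General form
$\frac{\partial \det(Y)}{\partial x} = \det(Y)\,\mathrm{Tr}\!\left[Y^{-1}\frac{\partial Y}{\partial x}\right]$

2.1.2 Linear forms
$\frac{\partial \det(X)}{\partial X} = \det(X)(X^{-1})^T$
$\frac{\partial \det(AXB)}{\partial X} = \det(AXB)(X^{-1})^T = \det(AXB)(X^T)^{-1}$

2.1.3 Square forms
If X is square and invertible, then
$\frac{\partial \det(X^T A X)}{\partial X} = 2\det(X^T A X)\,X^{-T}$
If X is not square but A is symmetric, then
$\frac{\partial \det(X^T A X)}{\partial X} = 2\det(X^T A X)\,AX(X^T A X)^{-1}$
If X is not square and A is not symmetric, then
$\frac{\partial \det(X^T A X)}{\partial X} = \det(X^T A X)\left(AX(X^T A X)^{-1} + A^T X(X^T A^T X)^{-1}\right)$    (13)

2.1.4 Other nonlinear forms
Some special cases are (see [8]):
$\frac{\partial \ln\det(X^T X)}{\partial X} = 2(X^+)^T$
$\frac{\partial \ln\det(X^T X)}{\partial X^+} = -2X^T$
$\frac{\partial \ln|\det(X)|}{\partial X} = (X^{-1})^T = (X^T)^{-1}$
$\frac{\partial \det(X^k)}{\partial X} = k\det(X^k)\,X^{-T}$

2.2 Derivatives of an Inverse

From [19] we have the basic identity
$\frac{\partial Y^{-1}}{\partial x} = -Y^{-1}\frac{\partial Y}{\partial x}Y^{-1}$
from which it follows that
$\frac{\partial (X^{-1})_{kl}}{\partial X_{ij}} = -(X^{-1})_{ki}(X^{-1})_{jl}$
$\frac{\partial a^T X^{-1} b}{\partial X} = -X^{-T} a b^T X^{-T}$
$\frac{\partial \det(X^{-1})}{\partial X} = -\det(X^{-1})(X^{-1})^T$
$\frac{\partial \mathrm{Tr}(AX^{-1}B)}{\partial X} = -(X^{-1}BAX^{-1})^T$

2.3 Derivatives of Matrices, Vectors and Scalar Forms

2.3.1 First Order
$\frac{\partial x^T b}{\partial x} = \frac{\partial b^T x}{\partial x} = b$
$\frac{\partial a^T X b}{\partial X} = ab^T$
$\frac{\partial a^T X^T b}{\partial X} = ba^T$
$\frac{\partial a^T X a}{\partial X} = \frac{\partial a^T X^T a}{\partial X} = aa^T$
$\frac{\partial X}{\partial X_{ij}} = J^{ij}$
$\frac{\partial (XA)_{ij}}{\partial X_{mn}} = \delta_{im}(A)_{nj} = (J^{mn}A)_{ij}$
$\frac{\partial (X^T A)_{ij}}{\partial X_{mn}} = \delta_{in}(A)_{mj} = (J^{nm}A)_{ij}$

2.3.2 Second Order
$\frac{\partial}{\partial X_{ij}}\sum_{klmn} X_{kl}X_{mn} = 2\sum_{kl} X_{kl}$
$\frac{\partial b^T X^T X c}{\partial X} = X(bc^T + cb^T)$
$\frac{\partial (Bx + b)^T C (Dx + d)}{\partial x} = B^T C(Dx + d) + D^T C^T(Bx + b)$
$\frac{\partial (X^T B X)_{kl}}{\partial X_{ij}} = \delta_{lj}(X^T B)_{ki} + \delta_{kj}(BX)_{il}$
$\frac{\partial (X^T B X)}{\partial X_{ij}} = X^T B J^{ij} + J^{ji} B X, \quad (J^{ij})_{kl} = \delta_{ik}\delta_{jl}$
See Sec. 8.2 for useful properties of the single-entry matrix $J^{ij}$.
$\frac{\partial x^T B x}{\partial x} = (B + B^T)x$
$\frac{\partial b^T X^T D X c}{\partial X} = D^T X b c^T + D X c b^T$
$\frac{\partial}{\partial X}(Xb + c)^T D (Xb + c) = (D + D^T)(Xb + c)b^T$

Assume W is symmetric, then
$\frac{\partial}{\partial s}(x - As)^T W (x - As) = -2A^T W(x - As)$
$\frac{\partial}{\partial s}(x - s)^T W (x - s) = -2W(x - s)$
$\frac{\partial}{\partial x}(x - As)^T W (x - As) = 2W(x - As)$
$\frac{\partial}{\partial A}(x - As)^T W (x - As) = -2W(x - As)s^T$

2.3.3 Higher order and nonlinear
$\frac{\partial}{\partial X} a^T X^n b = \sum_{r=0}^{n-1} (X^r)^T a b^T (X^{n-1-r})^T$    (14)
$\frac{\partial}{\partial X} a^T (X^n)^T X^n b = \sum_{r=0}^{n-1} \left[ (X^r)^T X^n b a^T (X^{n-1-r})^T + (X^r)^T X^n a b^T (X^{n-1-r})^T \right]$    (15)
For $n = 1$, (15) reduces to $X(ab^T + ba^T)$, in agreement with the second-order result for $b^T X^T X c$ above. See Sec. B.1 for a proof.

Assume s and r are functions of x, i.e. $s = s(x)$, $r = r(x)$, and that A is a constant, then
$\frac{\partial}{\partial x} s^T A r = \left[\frac{\partial s}{\partial x}\right]^T A r + \left[\frac{\partial r}{\partial x}\right]^T A^T s$

2.3.4 Gradient and Hessian
Using the above we have for the gradient and the Hessian
$f = x^T A x + b^T x$
$\nabla_x f = \frac{\partial f}{\partial x} = (A + A^T)x + b$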
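$\frac{\partial^2 f}{\partial x \partial x^T} = A + A^T$

2.4 Derivatives of Traces

2.4.1 First Order
$\frac{\partial}{\partial X}\mathrm{Tr}(X) = I$
$\frac{\partial}{\partial X}\mathrm{Tr}(XA) = A^T$    (16)
$\frac{\partial}{\partial X}\mathrm{Tr}(AXB) = A^T B^T$
$\frac{\partial}{\partial X}\mathrm{Tr}(AX^T B) = BA$
$\frac{\partial}{\partial X}\mathrm{Tr}(X^T A) = A$
$\frac{\partial}{\partial X}\mathrm{Tr}(AX^T) = A$

2.4.2 Second Order
$\frac{\partial}{\partial X}\mathrm{Tr}(X^2) = 2X^T$
$\frac{\partial}{\partial X}\mathrm{Tr}(X^2 B) = (XB + BX)^T$
$\frac{\partial}{\partial X}\mathrm{Tr}(X^T B X) = BX + B^T X$
$\frac{\partial}{\partial X}\mathrm{Tr}(X B X^T) = XB^T + XB$
$\frac{\partial}{\partial X}\mathrm{Tr}(AXBX) = A^T X^T B^T + B^T X^T A^T$
$\frac{\partial}{\partial X}\mathrm{Tr}(X^T X) = 2X$
$\frac{\partial}{\partial X}\mathrm{Tr}(B X X^T) = (B + B^T)X$
$\frac{\partial}{\partial X}\mathrm{Tr}(B^T X^T C X B) = C^T X B B^T + C X B B^T$
$\frac{\partial}{\partial X}\mathrm{Tr}[X^T B X C] = BXC + B^T X C^T$
$\frac{\partial}{\partial X}\mathrm{Tr}(AXBX^T C) = A^T C^T X B^T + CAXB$
$\frac{\partial}{\partial X}\mathrm{Tr}[(AXb + c)(AXb + c)^T] = 2A^T(AXb + c)b^T$
See [7].

2.4.3 Higher Order
$\frac{\partial}{\partial X}\mathrm{Tr}(X^k) = k(X^{k-1})^T$
$\frac{\partial}{\partial X}\mathrm{Tr}(AX^k) = \sum_{r=0}^{k-1}(X^r A X^{k-r-1})^T$
$\frac{\partial}{\partial X}\mathrm{Tr}[B^T X^T C X X^T C X B] = CXX^TCXBB^T + C^TXBB^TX^TC^TX + CXBB^TX^TCX + C^TXX^TC^TXBB^T$

2.4.4 Other
$\frac{\partial}{\partial X}\mathrm{Tr}(AX^{-1}B) = -(X^{-1}BAX^{-1})^T = -X^{-T}A^T B^T X^{-T}$
Assume B and C to be symmetric, then
$\frac{\partial}{\partial X}\mathrm{Tr}[(X^T C X)^{-1}A] = -(CX(X^T C X)^{-1})(A + A^T)(X^T C X)^{-1}$
$\frac{\partial}{\partial X}\mathrm{Tr}[(X^T C X)^{-1}(X^T B X)] = -2CX(X^T C X)^{-1}X^T B X(X^T C X)^{-1} + 2BX(X^T C X)^{-1}$
See [7].

2.5 Derivatives of Structured Matrices

Assume that the matrix A has some structure, i.e. symmetric, Toeplitz, etc. In that case the derivatives of the previous section do not apply in general. Instead, consider the following general rule for differentiating a scalar function f(A):
$\frac{df}{dA_{ij}} = \sum_{kl}\frac{\partial f}{\partial A_{kl}}\frac{\partial A_{kl}}{\partial A_{ij}} = \mathrm{Tr}\!\left[\left[\frac{\partial f}{\partial A}\right]^T \frac{\partial A}{\partial A_{ij}}\right]$
The matrix differentiated with respect to itself is in this document referred to as the structure matrix of A and is defined simply by
$\frac{\partial A}{\partial A_{ij}} = S^{ij}$
If A has no special structure we have simply $S^{ij} = J^{ij}$, that is, the structure matrix is simply the single-entry matrix. Many structures have a representation in single-entry matrices; see Sec. 8.2.6 for more examples of structure matrices.

2.5.1 The Chain Rule
Sometimes the objective is to find the derivative of a matrix which is a function of another matrix. Let $U = f(X)$; the goal is to find the derivative of the function g(U) with respect to X:
$\frac{\partial g(U)}{\partial X} = \frac{\partial g(f(X))}{\partial X}$    (17)
Then the chain rule can be written the following way:
$\frac{\partial g(U)}{\partial X_{ij}} = \sum_{k=1}^{M}\sum_{l=1}^{N}\frac{\partial g(U)}{\partial u_{kl}}\frac{\partial u_{kl}}{\partial x_{ij}}$    (18)
Using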
matrix notation, this can be written as
$\frac{\partial g(U)}{\partial X_{ij}} = \mathrm{Tr}\!\left[\left(\frac{\partial g(U)}{\partial U}\right)^T \frac{\partial U}{\partial X_{ij}}\right]$    (19)

2.5.2 Symmetric
If A is symmetric, then $S^{ij} = J^{ij} + J^{ji} - J^{ij}J^{ij}$ and therefore
$\frac{df}{dA} = \left[\frac{\partial f}{\partial A}\right] + \left[\frac{\partial f}{\partial A}\right]^T - \mathrm{diag}\!\left[\frac{\partial f}{\partial A}\right]$    (20)
That is, e.g. ([5], [20]):
$\frac{\partial \mathrm{Tr}(AX)}{\partial X} = A + A^T - (A \circ I)$, see (25)    (21)
$\frac{\partial \det(X)}{\partial X} = \det(X)\left(2X^{-1} - (X^{-1} \circ I)\right)$    (22)
$\frac{\partial \ln\det(X)}{\partial X} = 2X^{-1} - (X^{-1} \circ I)$    (23)

2.5.3 Diagonal
If X is diagonal, then ([13]):
$\frac{\partial \mathrm{Tr}(AX)}{\partial X} = A \circ I$

2.5.4 Toeplitz
Like symmetric matrices and diagonal matrices, Toeplitz matrices also have a special structure which should be taken into account when the derivative is taken with respect to a matrix with Toeplitz structure.
$\frac{\partial \mathrm{Tr}(AT)}{\partial T} = \frac{\partial \mathrm{Tr}(TA)}{\partial T}$    (24)
$= \begin{bmatrix} \mathrm{Tr}(A) & \mathrm{Tr}([A^T]_{n1}) & \mathrm{Tr}([[A^T]_{1n}]_{n-1,2}) & \cdots & A_{n1} \\ \mathrm{Tr}([A^T]_{1n}) & \mathrm{Tr}(A) & \ddots & \ddots & \vdots \\ \mathrm{Tr}([[A^T]_{1n}]_{2,n-1}) & \ddots & \ddots & \ddots & \mathrm{Tr}([[A^T]_{1n}]_{n-1,2}) \\ \vdots & \ddots & \ddots & \ddots & \mathrm{Tr}([A^T]_{n1}) \\ A_{1n} & \cdots & \mathrm{Tr}([[A^T]_{1n}]_{2,n-1}) & \mathrm{Tr}([A^T]_{1n}) & \mathrm{Tr}(A) \end{bmatrix} \equiv \alpha(A)$
As can be seen, the derivative $\alpha(A)$ also has a Toeplitz structure. Each value in the diagonal is the sum of all the diagonal values in A; the values in the diagonals next to the main diagonal equal the sum of the diagonal next to the main diagonal in $A^T$. This result is only valid for the unconstrained Toeplitz matrix. If the Toeplitz matrix is also symmetric, the same derivative yields
$\frac{\partial \mathrm{Tr}(AT)}{\partial T} = \frac{\partial \mathrm{Tr}(TA)}{\partial T} = \alpha(A) + \alpha(A)^T - \alpha(A) \circ I$    (25)

3 Inverses

3.1 Basic

3.1.1 Definition
The inverse $A^{-1}$ of a matrix $A \in \mathbb{C}^{n \times n}$ is defined such that
$AA^{-1} = A^{-1}A = I$    (26)
where I is the n × n identity matrix. If $A^{-1}$ exists, A is said to be nonsingular. Otherwise, A is said to be singular (see e.g. [9]).

3.1.2 Cofactors and Adjoint
The submatrix of a matrix A, denoted by $[A]_{ij}$, is an (n − 1) × (n − 1) matrix obtained by deleting the i-th row and the j-th column of A. The (i,j) cofactor of a matrix is defined as
$\mathrm{cof}(A, i, j) = (-1)^{i+j}\det([A]_{ij})$    (27)
The matrix of cofactors can be created from the cofactors:
$\mathrm{cof}(A) = \begin{bmatrix} \mathrm{cof}(A,1,1) & \cdots & \mathrm{cof}(A,1,n) \\ \vdots & \mathrm{cof}(A,i,j) & \vdots \\ \mathrm{cof}(A,n,1) & \cdots & \mathrm{cof}(A,n,n) \end{bmatrix}$    (28)
The adjoint matrix is the transpose of the cofactor matrix:
$\mathrm{adj}(A) = (\mathrm{cof}(A))^T$    (29)

3.1.3 Determinant
The determinant of a matrix $A \in \mathbb{C}^{n \times n}$ is defined as (see [9])
$\det(A) = \sum_{j=1}^{n}(-1)^{j+1}A_{1j}\det([A]_{1j}) = \sum_{j=1}^{n}A_{1j}\,\mathrm{cof}(A,1,j)$    (30)

3.1.4 Construction
The inverse matrix can be constructed, using the adjoint matrix, by
$A^{-1} = \frac{1}{\det(A)} \cdot \mathrm{adj}(A)$    (31)

3.1.5 Condition number
The condition number of a matrix, c(A), is the ratio between the largest and the smallest singular value of the matrix (see Section 5.2 on singular values),
$c(A) = \frac{d_+}{d_-}$
The condition number can be used to measure how singular a matrix is. If the condition number is large, it indicates that the matrix is nearly singular. The condition number can also be estimated from the matrix norms. Here
$c(A) = \|A\| \cdot \|A^{-1}\|$    (32)
where $\|\cdot\|$ is a norm such as e.g. the 1-norm, the 2-norm, the ∞-norm or the Frobenius norm (see Sec. 9.4 for more on matrix norms).

3.2 Exact Relations

3.2.1 The Woodbury identity
$(A + CBC^T)^{-1} = A^{-1} - A^{-1}C(B^{-1} + C^T A^{-1} C)^{-1}C^T A^{-1}$
If P, R are positive definite, then (see [22])
$(P^{-1} + B^T R^{-1} B)^{-1}B^T R^{-1} = PB^T(BPB^T + R)^{-1}$

3.2.2 The Kailath Variant
$(A + BC)^{-1} = A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}$
See [4], page 153.

3.2.3 The Searle Set of Identities
The following set of identities can be found in [17], page 151:
$(I + A^{-1})^{-1} = A(A + I)^{-1}$
$(A + BB^T)^{-1}B = A^{-1}B(I + B^T A^{-1} B)^{-1}$
$(A^{-1} + B^{-1})^{-1} = A(A + B)^{-1}B = B(A + B)^{-1}A$
$A - A(A + B)^{-1}A = B - B(A + B)^{-1}B$
$A^{-1} + B^{-1} = A^{-1}(A + B)B^{-1}$
$(I + AB)^{-1} = I - A(I + BA)^{-1}B$
$(I + AB)^{-1}A = A(I + BA)^{-1}$

3.3 Implication on Inverses
$(A + B)^{-1} = A^{-1} + B^{-1} \;\Rightarrow\; AB^{-1}A = BA^{-1}B$
See [17].

3.3.1 A PosDef identity
Assume P, R to be positive definite and invertible, then
$(P^{-1} + B^T R^{-1} B)^{-1}B^T R^{-1} = PB^T(BPB^T + R)^{-1}$
See [22].

3.4 Approximations
$(I + A)^{-1} = I - A + A^2 - A^3 + \ldots$
$A - A(I + A)^{-1}A \cong I - A^{-1}$ if A is large and symmetric.
If $\sigma^2$ is small, then
$(Q + \sigma^2 M)^{-1} \cong Q^{-1} - \sigma^2 Q^{-1} M Q^{-1}$

3.5 Generalized Inverse

3.5.1 Definition
A generalized inverse matrix of the matrix A is any matrix $A^-$ such that (see [18])
$AA^-A = A$
The matrix $A^-$ is not unique.

3.6 Pseudo Inverse

3.6.1 Definition
The pseudo inverse (or Moore-Penrose inverse) of a matrix A is the matrix $A^+$ that fulfils
I    $AA^+A = A$
II    $A^+AA^+ = A^+$
III    $AA^+$ symmetric
IV    $A^+A$ symmetric
The matrix $A^+$ is unique and always exists.

3.6.2 Properties
Assume $A^+$ to be the pseudo-inverse of A, then
$(A^+)^+ = A$
$(A^T)^+ = (A^+)^T$
$(cA)^+ = (1/c)A^+$
$(A^T A)^+ = A^+(A^T)^+$
$(AA^T)^+ = (A^T)^+A^+$
Assume A to have full rank, then
$(AA^+)(AA^+) = AA^+$
$(A^+A)(A^+A) = A^+A$
$\mathrm{Tr}(AA^+) = \mathrm{rank}(AA^+)$ (see [18])
$\mathrm{Tr}(A^+A) = \mathrm{rank}(A^+A)$ (see [18])

3.6.3 Construction
Assume that A has full rank, then
A n × n    Square    rank(A) = n    ⇒    $A^+ = A^{-1}$
A n × m    Broad    rank(A) = n    ⇒    $A^+ = A^T(AA^T)^{-1}$
A n × m    Tall    rank(A) = m    ⇒    $A^+ = (A^TA)^{-1}A^T$
Assume A does not have full rank, i.e. A is n × m and $\mathrm{rank}(A) = r < \min(n, m)$. The pseudo inverse $A^+$ can be constructed from the singular value decomposition $A = UDV^T$ by
$A^+ = VD^+U^T$
A different way is this: there always exist two matrices C (n × r) and D (r × m) of rank r such that $A = CD$. Using these matrices it holds that
$A^+ = D^T(DD^T)^{-1}(C^TC)^{-1}C^T$
See [3].

4 Complex Matrices

4.1 Complex Derivatives

In order to differentiate an expression f(z) with respect to a complex z, the Cauchy-Riemann equations have to be satisfied:
$\frac{df(z)}{dz} = \frac{\partial \Re(f(z))}{\partial \Re z} + i\frac{\partial \Im(f(z))}{\partial \Re z}$    (33)
and
$\frac{df(z)}{dz} = -i\frac{\partial \Re(f(z))}{\partial \Im z} + \frac{\partial \Im(f(z))}{\partial \Im z}$    (34)
or in a more compact form:
$\frac{\partial f(z)}{\partial \Im z} = i\frac{\partial f(z)}{\partial \Re z}$    (35)
A complex function that satisfies the Cauchy-Riemann equations for points in a region R is said to be analytic in this region R. In general, expressions involving complex conjugate or conjugate transpose do not satisfy the Cauchy-Riemann equations. In order to avoid this problem, a more generalized definition of complex derivative is used ([16], [6]):

• Generalized Complex Derivative:
$\frac{df(z)}{dz} = \frac{1}{2}\left(\frac{\partial f(z)}{\partial \Re z} - i\frac{\partial f(z)}{\partial \Im z}\right)$    (36)
• Conjugate Complex Derivative:
$\frac{df(z)}{dz^*} = \frac{1}{2}\left(\frac{\partial f(z)}{\partial \Re z} + i\frac{\partial f(z)}{\partial \Im z}\right)$    (37)

The Generalized Complex Derivative equals the normal derivative when f is an analytic function. For a non-analytic function such as $f(z) = z^*$, the derivative equals zero. The Conjugate Complex Derivative equals zero when f is an analytic function. The Conjugate Complex Derivative has e.g. been used by [14] when deriving a
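complex gradient. Notice:
$\frac{df(z)}{dz} \neq \frac{\partial f(z)}{\partial \Re z} + i\frac{\partial f(z)}{\partial \Im z}$

• Complex Gradient Vector: If f is a real function of a complex vector z, then the complex gradient vector is given by ([11, p. 798])
$\nabla f(z) = 2\frac{df(z)}{dz^*}$    (38)
$= \frac{\partial f(z)}{\partial \Re z} + i\frac{\partial f(z)}{\partial \Im z}$    (39)
• Complex Gradient Matrix: If f is a real function of a complex matrix Z, then the complex gradient matrix is given by
$\nabla f(Z) = 2\frac{df(Z)}{dZ^*} = \frac{\partial f(Z)}{\partial \Re Z} + i\frac{\partial f(Z)}{\partial \Im Z}$    (40)

These expressions can be used for gradient descent algorithms.

4.1.1 The Chain Rule for complex numbers
The chain rule is a little more complicated when the function of a complex $u = f(x)$ is non-analytic. For a non-analytic function, the following chain rule can be applied:
$\frac{\partial g(u)}{\partial x} = \frac{\partial g}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial g}{\partial u^*}\frac{\partial u^*}{\partial x}$    (41)
$= \frac{\partial g}{\partial u}\frac{\partial u}{\partial x} + \left(\frac{\partial g^*}{\partial u}\right)^{\!*}\frac{\partial u^*}{\partial x}$
Notice that if the function is analytic, the second term reduces to zero, and the expression reduces to the normal, well-known chain rule. For the matrix derivative of a scalar function g(U), the chain rule can be written the following way:
$\frac{\partial g(U)}{\partial X} = \frac{\mathrm{Tr}\!\left(\left(\frac{\partial g(U)}{\partial U}\right)^T \partial U\right)}{\partial X} + \frac{\mathrm{Tr}\!\left(\left(\frac{\partial g(U)}{\partial U^*}\right)^T \partial U^*\right)}{\partial X}$    (42)

4.1.2 Complex Derivatives of Traces
If the derivatives involve complex numbers, the conjugate transpose is often involved. The most useful way to show a complex derivative is to show the derivative with respect to the real and the imaginary part separately. An easy example is:
$\frac{\partial \mathrm{Tr}(X^*)}{\partial \Re X} = \frac{\partial \mathrm{Tr}(X^H)}{\partial \Re X} = I$    (43)
$i\frac{\partial \mathrm{Tr}(X^*)}{\partial \Im X} = i\frac{\partial \mathrm{Tr}(X^H)}{\partial \Im X} = I$    (44)
Since the two results have the same sign, the conjugate complex derivative (37) should be used.
$\frac{\partial \mathrm{Tr}(X)}{\partial \Re X} = \frac{\partial \mathrm{Tr}(X^T)}{\partial \Re X} = I$    (45)
$i\frac{\partial \mathrm{Tr}(X)}{\partial \Im X} = i\frac{\partial \mathrm{Tr}(X^T)}{\partial \Im X} = -I$    (46)
Here the two results have different signs, so the generalized complex derivative (36) should be used. Hereby it can be seen that (16) holds even if X is a complex number.
$\frac{\partial \mathrm{Tr}(AX^H)}{\partial \Re X} = A$    (47)
$i\frac{\partial \mathrm{Tr}(AX^H)}{\partial \Im X} = A$    (48)
$\frac{\partial \mathrm{Tr}(AX^*)}{\partial \Re X} = A^T$    (49)
$i\frac{\partial \mathrm{Tr}(AX^*)}{\partial \Im X} = A^T$    (50)
$\frac{\partial \mathrm{Tr}(XX^H)}{\partial \Re X} = \frac{\partial \mathrm{Tr}(X^HX)}{\partial \Re X} = 2\Re X$    (51)
$i\frac{\partial \mathrm{Tr}(XX^H)}{\partial \Im X} = i\frac{\partial \mathrm{Tr}(X^HX)}{\partial \Im X} = i2\Im X$    (52)
By inserting (51) and (52) in (36) and (37), it can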
be seen that
$\frac{\partial \mathrm{Tr}(XX^H)}{\partial X} = X^*$    (53)
$\frac{\partial \mathrm{Tr}(XX^H)}{\partial X^*} = X$    (54)
Since the function $\mathrm{Tr}(XX^H)$ is a real function of the complex matrix X, the complex gradient matrix (40) is given by
$\nabla \mathrm{Tr}(XX^H) = 2\frac{\partial \mathrm{Tr}(XX^H)}{\partial X^*} = 2X$    (55)

4.1.3 Complex Derivative Involving Determinants
Here, a calculation example is provided. The objective is to find the derivative of $\det(X^HAX)$ with respect to $X \in \mathbb{C}^{m \times n}$. The derivative is found with respect to the real part and the imaginary part of X. By use of (9) and (5), the derivative of $\det(X^HAX)$ can be calculated as (see Sec. B.1 for details)
$\frac{\partial \det(X^HAX)}{\partial X} = \frac{1}{2}\left(\frac{\partial \det(X^HAX)}{\partial \Re X} - i\frac{\partial \det(X^HAX)}{\partial \Im X}\right) = \det(X^HAX)\left((X^HAX)^{-1}X^HA\right)^T$    (56)
and the complex conjugate derivative yields
$\frac{\partial \det(X^HAX)}{\partial X^*} = \frac{1}{2}\left(\frac{\partial \det(X^HAX)}{\partial \Re X} + i\frac{\partial \det(X^HAX)}{\partial \Im X}\right) = \det(X^HAX)\,AX(X^HAX)^{-1}$    (57)

5 Decompositions

5.1 Eigenvalues and Eigenvectors

5.1.1 Definition
The eigenvectors $v_i$ and eigenvalues $\lambda_i$ are the ones satisfying
$Av_i = \lambda_i v_i$
$AV = VD, \quad (D)_{ij} = \delta_{ij}\lambda_i$
where the columns of V are the vectors $v_i$.

5.1.2 General Properties
$\mathrm{eig}(AB) = \mathrm{eig}(BA)$
A is n × m    ⇒    at most min(n, m) distinct $\lambda_i$
rank(A) = r    ⇒    at most r non-zero $\lambda_i$

5.1.3 Symmetric
Assume A is symmetric, then
$VV^T = I$ (i.e. V is orthogonal)
$\lambda_i \in \mathbb{R}$ (i.e. $\lambda_i$ is real)
$\mathrm{Tr}(A^p) = \sum_i \lambda_i^p$
$\mathrm{eig}(I + cA) = 1 + c\lambda_i$
$\mathrm{eig}(A - cI) = \lambda_i - c$
$\mathrm{eig}(A^{-1}) = \lambda_i^{-1}$
For a symmetric, positive matrix A,
$\mathrm{eig}(A^TA) = \mathrm{eig}(AA^T) = \mathrm{eig}(A) \circ \mathrm{eig}(A)$    (58)

5.2 Singular Value Decomposition

Any n × m matrix A can be written as
$A = UDV^T$
where
U = eigenvectors of $AA^T$    (n × n)
D = $\sqrt{\mathrm{diag}(\mathrm{eig}(AA^T))}$    (n × m)
V = eigenvectors of $A^TA$    (m × m)

5.2.1 Symmetric Square decomposed into squares
Assume A to be n × n and symmetric. Then
$A = VDV^T$
where D is diagonal with the eigenvalues of A, and V is orthogonal and holds the eigenvectors of A.

5.2.2 Square decomposed into squares
Assume $A \in \mathbb{R}^{n \times n}$. Then
$A = VDU^T$
where D is diagonal with the square root of the eigenvalues of $AA^T$, V is the eigenvectors of $AA^T$ and $U^T$ is the eigenvectors of $A^TA$.

5.2.3 Square decomposed into rectangular
Assume
$V_*D_*U_*^T = 0$, then we can expand the SVD of A into
$A = \begin{bmatrix} V & V_* \end{bmatrix}\begin{bmatrix} D & 0 \\ 0 & D_* \end{bmatrix}\begin{bmatrix} U^T \\ U_*^T \end{bmatrix}$
where the SVD of A is $A = VDU^T$.

5.2.4 Rectangular decomposition I
Assume A is n × m, then
$A = VDU^T$
where D is diagonal with the square root of the eigenvalues of $AA^T$, V is the eigenvectors of $AA^T$ and $U^T$ is the eigenvectors of $A^TA$.

5.2.5 Rectangular decomposition II
Assume A is n × m, then
$A = VDU^T$

5.2.6 Rectangular decomposition III
Assume A is n × m, then
$A = VDU^T$
where D is diagonal with the square root of the eigenvalues of $AA^T$, V is the eigenvectors of $AA^T$ and $U^T$ is the eigenvectors of $A^TA$.

5.3 Triangular Decomposition

5.3.1 Cholesky decomposition
Assume A is positive definite, then
$A = B^TB$
where B is a unique upper triangular matrix.

6 Statistics and Probability

6.1 Definition of Moments
Assume $x \in \mathbb{R}^{n \times 1}$ is a random variable.

6.1.1 Mean
The vector of means, m, is defined by
$(m)_i = \langle x_i \rangle$

6.1.2 Covariance
The matrix of covariance M is defined by
$(M)_{ij} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle$
or alternatively as
$M = \langle (x - m)(x - m)^T \rangle$

6.1.3 Third moments
The matrix of third centralized moments (in some contexts referred to as coskewness) is defined using the notation
$m^{(3)}_{ijk} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle)(x_k - \langle x_k \rangle) \rangle$
as
$M_3 = \left[ m^{(3)}_{::1} \; m^{(3)}_{::2} \; \ldots \; m^{(3)}_{::n} \right]$
where ':' denotes all elements within the given index. $M_3$ can alternatively be expressed as
$M_3 = \langle (x - m)(x - m)^T \otimes (x - m)^T \rangle$

6.1.4 Fourth moments
The matrix of fourth centralized moments (in some contexts referred to as cokurtosis) is defined using the notation
$m^{(4)}_{ijkl} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle)(x_k - \langle x_k \rangle)(x_l - \langle x_l \rangle) \rangle$
as
$M_4 = \left[ m^{(4)}_{::11} \; m^{(4)}_{::21} \ldots m^{(4)}_{::n1} \,\big|\, m^{(4)}_{::12} \; m^{(4)}_{::22} \ldots m^{(4)}_{::n2} \,\big|\, \ldots \,\big|\, m^{(4)}_{::1n} \; m^{(4)}_{::2n} \ldots m^{(4)}_{::nn} \right]$
or alternatively as
$M_4 = \langle (x - m)(x - m)^T \otimes (x - m)^T \otimes (x - m)^T \rangle$

6.2 Expectation of Linear Combinations

6.2.1 Linear Forms
Assume X and x to be a matrix and a vector of random
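variables. Then (see [18])
$E[AXB + C] = AE[X]B + C$
$\mathrm{Var}[Ax] = A\,\mathrm{Var}[x]A^T$
$\mathrm{Cov}[Ax, By] = A\,\mathrm{Cov}[x, y]B^T$
Assume x to be a stochastic vector with mean m, then
$E[Ax + b] = Am + b$
$E[Ax] = Am$
$E[x + b] = m + b$

6.2.2 Quadratic Forms
Assume A is symmetric, $c = E[x]$ and $\Sigma = \mathrm{Var}[x]$. Assume also that all coordinates $x_i$ are independent, have the same central moments $\mu_1, \mu_2, \mu_3, \mu_4$, and denote $a = \mathrm{diag}(A)$. Then (see [18])
$E[x^TAx] = \mathrm{Tr}(A\Sigma) + c^TAc$
$\mathrm{Var}[x^TAx] = 2\mu_2^2\mathrm{Tr}(A^2) + 4\mu_2 c^TA^2c + 4\mu_3 c^TAa + (\mu_4 - 3\mu_2^2)a^Ta$
Also, assume x to be a stochastic vector with mean m and covariance M. Then (see [7])
$E[(Ax + a)(Bx + b)^T] = AMB^T + (Am + a)(Bm + b)^T$
$E[xx^T] = M + mm^T$
$E[xa^Tx] = (M + mm^T)a$
$E[x^Tax^T] = a^T(M + mm^T)$
$E[(Ax)(Ax)^T] = A(M + mm^T)A^T$
$E[(x + a)(x + a)^T] = M + (m + a)(m + a)^T$
$E[(Ax + a)^T(Bx + b)] = \mathrm{Tr}(AMB^T) + (Am + a)^T(Bm + b)$
$E[x^Tx] = \mathrm{Tr}(M) + m^Tm$
$E[x^TAx] = \mathrm{Tr}(AM) + m^TAm$
$E[(Ax)^T(Ax)] = \mathrm{Tr}(AMA^T) + (Am)^T(Am)$
$E[(x + a)^T(x + a)] = \mathrm{Tr}(M) + (m + a)^T(m + a)$

6.2.3 Cubic Forms
Assume x to be a stochastic vector with independent coordinates, mean m, covariance M and central moments $v_3 = E[(x - m)^3]$. Then
$E[(Ax + a)(Bx + b)^T(Cx + c)] = A\,\mathrm{diag}(B^TC)v_3 + \mathrm{Tr}(BMC^T)(Am + a) + AMC^T(Bm + b) + (AMB^T + (Am + a)(Bm + b)^T)(Cm + c)$
$E[xx^Tx] = v_3 + 2Mm + (\mathrm{Tr}(M) + m^Tm)m$
$E[(Ax + a)(Ax + a)^T(Ax + a)] = A\,\mathrm{diag}(A^TA)v_3 + [2AMA^T + (Am + a)(Am + a)^T](Am + a) + \mathrm{Tr}(AMA^T)(Am + a)$
$E[(Ax + a)b^T(Cx + c)(Dx + d)^T] = (Am + a)b^T(CMD^T + (Cm + c)(Dm + d)^T) + (AMC^T + (Am + a)(Cm + c)^T)b(Dm + d)^T + b^T(Cm + c)(AMD^T - (Am + a)(Dm + d)^T)$

6.3 Weighted Scalar Variable
Assume $x \in \mathbb{R}^{n \times 1}$ is a random variable, $w \in \mathbb{R}^{n \times 1}$ is a vector of constants and y is the linear combination $y = w^Tx$. Assume further that $m, M_2, M_3, M_4$ denote the mean, covariance, and central third and fourth moment matrix of the variable x. Then it holds that
$\langle y \rangle = w^Tm$
$\langle (y - \langle y \rangle)^2 \rangle = w^TM_2w$
$\langle (y - \langle y \rangle)^3 \rangle = w^TM_3\,w \otimes w$
$\langle (y - \langle y \rangle)^4 \rangle = w^TM_4\,w \otimes w \otimes w$

7 Gaussians

7.1 Basics

7.1.1 Density and normalization
The density of $x \sim \mathcal{N}(m, \Sigma)$ is
$p(x) = \frac{1}{\sqrt{\det(2\pi\Sigma)}}\exp\left[-\frac{1}{2}(x - m)^T\Sigma^{-1}(x - m)\right]$
Note that if x is d-dimensional, then $\det(2\pi\Sigma) = (2\pi)^d\det(\Sigma)$.
Integration and normalization:
$\int \exp\left[-\frac{1}{2}(x - m)^T\Sigma^{-1}(x - m)\right]dx = \sqrt{\det(2\pi\Sigma)}$
$\int \exp\left[-\frac{1}{2}x^TAx + b^Tx\right]dx = \sqrt{\det(2\pi A^{-1})}\exp\left[\frac{1}{2}b^TA^{-1}b\right]$
$\int \exp\left[-\frac{1}{2}\mathrm{Tr}(S^TAS) + \mathrm{Tr}(B^TS)\right]dS = \sqrt{\det(2\pi A^{-1})}\exp\left[\frac{1}{2}\mathrm{Tr}(B^TA^{-1}B)\right]$
The derivatives of the density are
$\frac{\partial p(x)}{\partial x} = -p(x)\Sigma^{-1}(x - m)$
$\frac{\partial^2 p}{\partial x \partial x^T} = p(x)\left(\Sigma^{-1}(x - m)(x - m)^T\Sigma^{-1} - \Sigma^{-1}\right)$

7.1.2 Marginal Distribution
Assume $x \sim \mathcal{N}_x(\mu, \Sigma)$ where
$x = \begin{bmatrix} x_a \\ x_b \end{bmatrix}, \quad \mu = \begin{bmatrix} \mu_a \\ \mu_b \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \Sigma_a & \Sigma_c \\ \Sigma_c^T & \Sigma_b \end{bmatrix}$
then
$p(x_a) = \mathcal{N}_{x_a}(\mu_a, \Sigma_a)$
$p(x_b) = \mathcal{N}_{x_b}(\mu_b, \Sigma_b)$

7.1.3 Conditional Distribution
Assume $x \sim \mathcal{N}_x(\mu, \Sigma)$ with x, μ and Σ partitioned as in Sec. 7.1.2, then
$p(x_a | x_b) = \mathcal{N}_{x_a}(\hat{\mu}_a, \hat{\Sigma}_a), \quad \hat{\mu}_a = \mu_a + \Sigma_c\Sigma_b^{-1}(x_b - \mu_b), \quad \hat{\Sigma}_a = \Sigma_a - \Sigma_c\Sigma_b^{-1}\Sigma_c^T$
$p(x_b | x_a) = \mathcal{N}_{x_b}(\hat{\mu}_b, \hat{\Sigma}_b), \quad \hat{\mu}_b = \mu_b + \Sigma_c^T\Sigma_a^{-1}(x_a - \mu_a), \quad \hat{\Sigma}_b = \Sigma_b - \Sigma_c^T\Sigma_a^{-1}\Sigma_c$

7.1.4 Linear combination
Assume $x \sim \mathcal{N}(m_x, \Sigma_x)$ and $y \sim \mathcal{N}(m_y, \Sigma_y)$, then
$Ax + By + c \sim \mathcal{N}(Am_x + Bm_y + c, \; A\Sigma_xA^T + B\Sigma_yB^T)$

7.1.5 Rearranging Means
$\mathcal{N}_{Ax}[m, \Sigma] = \frac{\sqrt{\det(2\pi(A^T\Sigma^{-1}A)^{-1})}}{\sqrt{\det(2\pi\Sigma)}}\,\mathcal{N}_x[A^{-1}m, (A^T\Sigma^{-1}A)^{-1}]$

7.1.6 Rearranging into squared form
If A is symmetric, then
$-\frac{1}{2}x^TAx + b^Tx = -\frac{1}{2}(x - A^{-1}b)^TA(x - A^{-1}b) + \frac{1}{2}b^TA^{-1}b$
$-\frac{1}{2}\mathrm{Tr}(X^TAX) + \mathrm{Tr}(B^TX) = -\frac{1}{2}\mathrm{Tr}[(X - A^{-1}B)^TA(X - A^{-1}B)] + \frac{1}{2}\mathrm{Tr}(B^TA^{-1}B)$

7.1.7 Sum of two squared forms
In vector formulation (assuming $\Sigma_1, \Sigma_2$ are symmetric):
$-\frac{1}{2}(x - m_1)^T\Sigma_1^{-1}(x - m_1) - \frac{1}{2}(x - m_2)^T\Sigma_2^{-1}(x - m_2) = -\frac{1}{2}(x - m_c)^T\Sigma_c^{-1}(x - m_c) + C$
$\Sigma_c^{-1} = \Sigma_1^{-1} + \Sigma_2^{-1}$
$m_c = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}(\Sigma_1^{-1}m_1 + \Sigma_2^{-1}m_2)$
$C = \frac{1}{2}(m_1^T\Sigma_1^{-1} + m_2^T\Sigma_2^{-1})(\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}(\Sigma_1^{-1}m_1 + \Sigma_2^{-1}m_2) - \frac{1}{2}\left(m_1^T\Sigma_1^{-1}m_1 + m_2^T\Sigma_2^{-1}m_2\right)$
In a trace formulation (assuming $\Sigma_1, \Sigma_2$ are symmetric):
$-\frac{1}{2}\mathrm{Tr}((X - M_1)^T\Sigma_1^{-1}(X - M_1)) - \frac{1}{2}\mathrm{Tr}((X - M_2)^T\Sigma_2^{-1}(X - M_2)) = -\frac{1}{2}\mathrm{Tr}[(X - M_c)^T\Sigma_c^{-1}(X - M_c)] + C$
$\Sigma_c^{-1} = \Sigma_1^{-1} + \Sigma_2^{-1}$
$M_c = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}(\Sigma_1^{-1}M_1 + \Sigma_2^{-1}M_2)$
$C = \frac{1}{2}\mathrm{Tr}\left[(\Sigma_1^{-1}M_1 + \Sigma_2^{-1}M_2)^T(\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}(\Sigma_1^{-1}M_1 + \Sigma_2^{-1}M_2)\right] - \frac{1}{2}\mathrm{Tr}(M_1^T\Sigma_1^{-1}M_1 + M_2^T\Sigma_2^{-1}M_2)$

7.1.8 Product of gaussian densities
Let $\mathcal{N}_x(m, \Sigma)$ denote a density of x, then
$\mathcal{N}_x(m_1, \Sigma_1) \cdot \mathcal{N}_x(m_2, \Sigma_2) = c_c\,\mathcal{N}_x(m_c, \Sigma_c)$
$c_c = \mathcal{N}_{m_1}(m_2, (\Sigma_1 + \Sigma_2)) = \frac{1}{\sqrt{\det(2\pi(\Sigma_1 + \Sigma_2))}}\exp\left[-\frac{1}{2}(m_1 - m_2)^T(\Sigma_1 + \Sigma_2)^{-1}(m_1 - m_2)\right]$
$m_c = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}(\Sigma_1^{-1}m_1 + \Sigma_2^{-1}m_2)$
$\Sigma_c = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1}$
but note that the product is not normalized as a density of x.

7.2 Moments

7.2.1 Mean and covariance of linear forms
First and second moments. Assume $x \sim \mathcal{N}(m, \Sigma)$:
$E(x) = m$
$\mathrm{Cov}(x, x) = \mathrm{Var}(x) = \Sigma = E(xx^T) - E(x)E(x^T) = E(xx^T) - mm^T$
As for any other distribution, it holds for gaussians that
$E[Ax] = AE[x]$
$\mathrm{Var}[Ax] = A\,\mathrm{Var}[x]A^T$
$\mathrm{Cov}[Ax, By] = A\,\mathrm{Cov}[x, y]B^T$

7.2.2 Mean and variance of square forms
Assume $x \sim \mathcal{N}(m, \Sigma)$:
$E(xx^T) = \Sigma + mm^T$
$E[x^TAx] = \mathrm{Tr}(A\Sigma) + m^TAm$
$\mathrm{Var}(x^TAx) = 2\sigma^4\mathrm{Tr}(A^2) + 4\sigma^2m^TA^2m$
$E[(x - m')^TA(x - m')] = (m - m')^TA(m - m') + \mathrm{Tr}(A\Sigma)$
Assume $x \sim \mathcal{N}(0, \sigma^2I)$ and A and B to be symmetric, then
$\mathrm{Cov}(x^TAx, x^TBx) = 2\sigma^4\mathrm{Tr}(AB)$

7.2.3 Cubic forms
$E[xb^Txx^T] = mb^T(M + mm^T) + (M + mm^T)bm^T + b^Tm(M - mm^T)$

7.2.4 Mean of Quartic Forms
$E[xx^Txx^T] = 2(\Sigma + mm^T)^2 + m^Tm(\Sigma - mm^T) + \mathrm{Tr}(\Sigma)(\Sigma + mm^T)$
$E[xx^TAxx^T] = (\Sigma + mm^T)(A + A^T)(\Sigma + mm^T) + m^TAm(\Sigma - mm^T) + \mathrm{Tr}[A\Sigma](\Sigma + mm^T)$
$E[x^Txx^Tx] = 2\mathrm{Tr}(\Sigma^2) + 4m^T\Sigma m + (\mathrm{Tr}(\Sigma) + m^Tm)^2$
$E[x^TAxx^TBx] = \mathrm{Tr}[A\Sigma(B + B^T)\Sigma] + m^T(A + A^T)\Sigma(B + B^T)m + (\mathrm{Tr}(A\Sigma) + m^TAm)(\mathrm{Tr}(B\Sigma) + m^TBm)$
$E[a^Txb^Txc^Txd^Tx] = (a^T(\Sigma + mm^T)b)(c^T(\Sigma + mm^T)d) + (a^T(\Sigma + mm^T)c)(b^T(\Sigma + mm^T)d) + (a^T(\Sigma + mm^T)d)(b^T(\Sigma + mm^T)c) - 2a^Tm\,b^Tm\,c^Tm\,d^Tm$
$E[(Ax + a)(Bx + b)^T(Cx + c)(Dx + d)^T] = [A\Sigma B^T + (Am + a)(Bm + b)^T][C\Sigma D^T + (Cm + c)(Dm + d)^T] + [A\Sigma C^T + (Am + a)(Cm + c)^T][B\Sigma D^T + (Bm + b)(Dm + d)^T] + (Bm + b)^T(Cm + c)[A\Sigma D^T - (Am + a)(Dm + d)^T] + \mathrm{Tr}(B\Sigma C^T)[A\Sigma D^T + (Am + a)(Dm + d)^T]$
$E[(Ax + a)^T(Bx + b)(Cx + c)^T(Dx + d)] = \mathrm{Tr}[A\Sigma(C^TD + D^TC)\Sigma B^T] + [(Am + a)^TB + (Bm + b)^TA]\Sigma[C^T(Dm + d) + D^T(Cm + c)] + [\mathrm{Tr}(A\Sigma B^T) + (Am + a)^T(Bm + b)][\mathrm{Tr}(C\Sigma D^T) + (Cm + c)^T(Dm + d)]$

7.2.5 Moments
$E[x] = \sum_k \rho_k m_k$
$\mathrm{Cov}(x) = \sum_k\sum_{k'} \rho_k\rho_{k'}\left(\Sigma_k + m_km_k^T - m_km_{k'}^T\right)$

7.3 Miscellaneous

7.3.1 Whitening
Assume $x \sim \mathcal{N}(m, \Sigma)$, then
$z = \Sigma^{-1/2}(x - m) \sim \mathcal{N}(0, I)$
Conversely, having $z \sim \mathcal{N}(0, I)$ one can generate data $x \sim \mathcal{N}(m, \Sigma)$ by setting
$x = \Sigma^{1/2}z + m \sim \mathcal{N}(m, \Sigma)$
Note that $\Sigma^{1/2}$ means the matrix which fulfils $\Sigma^{1/2}\Sigma^{1/2} = \Sigma$, and that it exists and is unique since Σ is positive definite.

7.3.2 The Chi-Square connection
Assume $x \sim \mathcal{N}(m, \Sigma)$ and x to be n-dimensional, then
$z = (x - m)^T\Sigma^{-1}(x - m) \sim \chi^2_n$
where $\chi^2_n$ denotes the Chi-square distribution with n degrees of freedom.

7.3.3 Entropy
Entropy of a D-dimensional gaussian:
$H(x) = -\int \mathcal{N}(m, \Sigma)\ln\mathcal{N}(m, \Sigma)\,dx = \ln\sqrt{\det(2\pi\Sigma)} + \frac{D}{2}$

7.4 Mixture of Gaussians

7.4.1 Density
The variable x is distributed as a mixture of gaussians if it has the density
$p(x) = \sum_{k=1}^{K}\rho_k\frac{1}{\sqrt{\det(2\pi\Sigma_k)}}\exp\left[-\frac{1}{2}(x - m_k)^T\Sigma_k^{-1}(x - m_k)\right]$
where the $\rho_k$ sum to 1 and the $\Sigma_k$ are all positive definite.

7.4.2 Derivatives
Defining $p(s) = \sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)$ one gets
$\frac{\partial \ln p(s)}{\partial \rho_j} = \frac{\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)}{\sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)}\frac{\partial}{\partial \rho_j}\ln[\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)] = \frac{\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)}{\sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)}\frac{1}{\rho_j}$
$\frac{\partial \ln p(s)}{\partial \mu_j} = \frac{\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)}{\sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)}\frac{\partial}{\partial \mu_j}\ln[\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)] = \frac{\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)}{\sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)}\left[\Sigma_j^{-1}(s - \mu_j)\right]$
$\frac{\partial \ln p(s)}{\partial \Sigma_j} = \frac{\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)}{\sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)}\frac{\partial}{\partial \Sigma_j}\ln[\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)] = \frac{\rho_j\mathcal{N}_s(\mu_j, \Sigma_j)}{\sum_k \rho_k\mathcal{N}_s(\mu_k, \Sigma_k)}\frac{1}{2}\left[-\Sigma_j^{-1} + \Sigma_j^{-1}(s - \mu_j)(s - \mu_j)^T\Sigma_j^{-1}\right]$
But $\rho_k$ and $\Sigma_k$ need to be constrained.

8 Special Matrices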
Units Permutation and Shift 811 Unit vector Let ei E Rn be the ith unit vector7 ie the vector Which is zerO in all entries except the ith at Which it is 1 812 Rows and Columns Uh rOW Of A efA jih column Of A Aej 8 13 Permutations Let P be some permutation matrix7 eg 0 1 0 e P 100 e2 e1 e3 9 0 0 1 eg For permutation matrices it holds that PPT I and that egA AP Aeg Ael Aeg PA efA egA That is7 the rst is a matrix Which has columns Of A but in permuted sequence and the second is a matrix Which has the rows Of A but in the permuted se quence 814 Translation7 Shift or Lag Operators Let L denote the lag or 7translation7 or 7shift Operator de ned on a 4 X 4 examp e 0000 1000 L0100 0010 ie a matrix Of zeros With one on the subdiagonal7 LLJV 6M 1 With some signal I for t 17N7 the nth power Of the lag Operator shifts the indices7 ie 0 for t 17 H n L x 117 for tnlN PETERSEN amp PEDERSEN THE MATRIX COOKBOOK VERSION FEBRUARY 16 2006 Page 34 82 The Singleentry Matrix 8 SPECIAL MATRICES A related but slightly different matrix is the 7recurrent shifted7 operator de ned on a 4x4 examp e to H 0 0 1 0 HOOD 0 1 1 0 0 0 0 0 ie a matrix de ned by 613141 6i16jdimL On a signal x it has the effect tnx M t t 7 n mod N 1 That is7 I is like the shift operator L except that it 7wraps7 the signal as if it was periodic and shifted substituting the zeros with the rear end of the signal Note that I is invertible and orthogonal7 ie L l LT 82 The Singleentry Matrix 821 De nition The singleentry matrix J17 E Rn is de ned as the matrix which is zero everywhere except in the entry Lj in which it is 1 In a 4 X 4 example one might ave 0000 J2370010 0000 0000 The singleentry matrix is very useful when working with derivatives of expres sions involving matrices 822 Swap and Zeros Assume A to be n X m and J17 to bem X p AJiJ390 0 A 0 ie an n X p matrix of zeros with the ith column of A in place of the jth column Assume A to be n X m and J17 to be p X n 0 WA ego PETERSEN amp PEDERSEN THE MATRIX COOKBOOK 
i.e. a $p \times m$ matrix of zeros with the $j$-th row of $A$ in the place of the $i$-th row.

8.2.3 Rewriting product of elements

$$A_{ki} B_{jl} = (A e_i e_j^T B)_{kl} = (A J^{ij} B)_{kl}$$

$$A_{ik} B_{lj} = (A^T e_i e_j^T B^T)_{kl} = (A^T J^{ij} B^T)_{kl}$$

$$A_{ik} B_{jl} = (A^T e_i e_j^T B)_{kl} = (A^T J^{ij} B)_{kl}$$

$$A_{ki} B_{lj} = (A e_i e_j^T B^T)_{kl} = (A J^{ij} B^T)_{kl}$$

8.2.4 Properties of the Single-entry Matrix

If $i = j$:

$$J^{ij} J^{ij} = J^{ij}, \qquad (J^{ij})^T (J^{ij})^T = J^{ij}, \qquad J^{ij} (J^{ij})^T = J^{ij}, \qquad (J^{ij})^T J^{ij} = J^{ij}$$

If $i \neq j$:

$$J^{ij} J^{ij} = 0, \qquad (J^{ij})^T (J^{ij})^T = 0, \qquad J^{ij} (J^{ij})^T = J^{ii}, \qquad (J^{ij})^T J^{ij} = J^{jj}$$

8.2.5 The Single-entry Matrix in Scalar Expressions

Assume $A$ is $n \times m$ and $J$ is $m \times n$, then

$$\mathrm{Tr}(A J^{ij}) = \mathrm{Tr}(J^{ij} A) = (A^T)_{ij}$$

Assume $A$ is $n \times n$, $J$ is $n \times m$ and $B$ is $m \times n$, then

$$\mathrm{Tr}(A J^{ij} B) = (A^T B^T)_{ij}$$

$$\mathrm{Tr}(A J^{ji} B) = (BA)_{ij}$$

$$\mathrm{Tr}(A J^{ij} J^{ij} B) = \mathrm{diag}(A^T B^T)_{ij}$$

Assume $A$ is $n \times n$, $J^{ij}$ is $n \times m$ and $B$ is $m \times n$, then

$$x^T A J^{ij} B x = (A^T x x^T B^T)_{ij}$$

$$x^T A J^{ij} J^{ij} B x = \mathrm{diag}(A^T x x^T B^T)_{ij}$$

8.2.6 Structure Matrices

The structure matrix is defined by

$$\frac{\partial A}{\partial A_{ij}} = S^{ij}$$

If $A$ has no special structure then

$$S^{ij} = J^{ij}$$

If $A$ is symmetric then

$$S^{ij} = J^{ij} + J^{ji} - J^{ij} J^{ij}$$

8.3 Symmetric and Antisymmetric

8.3.1 Symmetric

The matrix $A$ is said to be symmetric if

$$A = A^T$$

Symmetric matrices have many important properties, e.g. that their eigenvalues are real and their eigenvectors can be chosen orthogonal.

8.3.2 Antisymmetric

The antisymmetric matrix is also known as the skew-symmetric matrix. It is defined by the property

$$A = -A^T$$

From this it can be seen that antisymmetric matrices always have a zero diagonal. The $n \times n$ antisymmetric matrices also have the following properties:

$$\det(A^T) = \det(-A) = (-1)^n \det(A)$$

$$-\det(A) = \det(-A) = 0, \quad \text{if } n \text{ is odd}$$

8.4 Vandermonde Matrices

A Vandermonde matrix has the form

$$V = \begin{bmatrix} 1 & v_1 & v_1^2 & \cdots & v_1^{n-1} \\ 1 & v_2 & v_2^2 & \cdots & v_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & v_n & v_n^2 & \cdots & v_n^{n-1} \end{bmatrix} \qquad (59)$$

The transpose of $V$ is also said to be a Vandermonde matrix. The determinant is given by

$$\det V = \prod_{i > j} (v_i - v_j) \qquad (60)$$

8.5 Toeplitz Matrices

A Toeplitz matrix $T$ is a matrix where the elements of each diagonal are the same. In the $n \times n$
square case, it has the following structure:

$$T = \begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & t_{12} \\ t_{n1} & \cdots & t_{21} & t_{11} \end{bmatrix} = \begin{bmatrix} t_0 & t_1 & \cdots & t_{n-1} \\ t_{-1} & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & t_1 \\ t_{-(n-1)} & \cdots & t_{-1} & t_0 \end{bmatrix} \qquad (61)$$

A Toeplitz matrix is persymmetric. If a matrix is persymmetric (or orthosymmetric), it means that the matrix is symmetric about its northeast-southwest diagonal (anti-diagonal). Persymmetric matrices are a larger class of matrices, since a persymmetric matrix does not necessarily have a Toeplitz structure. There are some special cases of Toeplitz matrices. The symmetric Toeplitz matrix is given by

$$T = \begin{bmatrix} t_0 & t_1 & \cdots & t_{n-1} \\ t_1 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & t_1 \\ t_{n-1} & \cdots & t_1 & t_0 \end{bmatrix} \qquad (62)$$

The circular Toeplitz matrix:

$$T_C = \begin{bmatrix} t_0 & t_1 & \cdots & t_{n-1} \\ t_{n-1} & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & t_1 \\ t_1 & \cdots & t_{n-1} & t_0 \end{bmatrix} \qquad (63)$$

The upper triangular Toeplitz matrix:

$$T_U = \begin{bmatrix} t_0 & t_1 & \cdots & t_{n-1} \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & t_1 \\ 0 & \cdots & 0 & t_0 \end{bmatrix} \qquad (64)$$

and the lower triangular Toeplitz matrix:

$$T_L = \begin{bmatrix} t_0 & 0 & \cdots & 0 \\ t_{-1} & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ t_{-(n-1)} & \cdots & t_{-1} & t_0 \end{bmatrix} \qquad (65)$$

8.5.1 Properties of Toeplitz Matrices

The Toeplitz matrix has some computational advantages. The addition of two Toeplitz matrices can be done with $O(n)$ flops; multiplication of two Toeplitz matrices can be done in $O(n \ln n)$ flops. Toeplitz equation systems can be solved in $O(n^2)$ flops. The inverse of a positive definite Toeplitz matrix can be found in $O(n^2)$ flops too. The inverse of a Toeplitz matrix is persymmetric. The product of two lower triangular Toeplitz matrices is a Toeplitz matrix. More information on Toeplitz matrices and circulant matrices can be found in [10, 7].

8.6 The DFT Matrix

The DFT matrix is an $N \times N$ symmetric matrix $W_N$, where the $(k, n)$-th element is given by

$$W_N^{kn} = e^{-j 2\pi k n / N} \qquad (66)$$

Thus the discrete Fourier transform (DFT) can be expressed as

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn} \qquad (67)$$

Likewise the inverse discrete Fourier transform (IDFT) can be expressed as

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-kn} \qquad (68)$$

The DFT of the vector $x = [x(0), x(1), \dots, x(N-1)]^T$ can be written in matrix form as

$$X = W_N x \qquad (69)$$

where $X = [X(0), X(1), \dots, X(N-1)]^T$. The IDFT is similarly given as

$$x = W_N^{-1} X \qquad (70)$$

Some properties of $W_N$ exist:

$$W_N^{-1} = \frac{1}{N} W_N^* \qquad (71)$$

$$W_N W_N^* = N I \qquad (72)$$
$$W_N^* = W_N^H \qquad (73)$$

If $W_N = e^{-j 2\pi / N}$, then

$$W_N^{m + N/2} = -W_N^m \qquad (74)$$

Notice, the DFT matrix is a Vandermonde matrix. The following important relation between the circulant matrix and the discrete Fourier transform (DFT) exists:

$$T_C = W_N\, \mathrm{diag}(W_N t)\, W_N^{-1} \qquad (75)$$

where $t = [t_0, t_1, \dots, t_{n-1}]^T$ is the first row of $T_C$ and $\mathrm{diag}(\cdot)$ places a vector on the diagonal of an otherwise zero matrix.

8.7 Positive Definite and Semi-definite Matrices

8.7.1 Definitions

A matrix $A$ is positive definite if and only if

$$x^T A x > 0, \quad \forall x \neq 0$$

A matrix $A$ is positive semi-definite if and only if

$$x^T A x \geq 0, \quad \forall x$$

Note that if $A$ is positive definite, then $A$ is also positive semi-definite.

8.7.2 Eigenvalues

The following holds with respect to the eigenvalues (for symmetric $A$):

$$A \text{ pos. def.} \Leftrightarrow \mathrm{eig}(A) > 0$$

$$A \text{ pos. semi-def.} \Leftrightarrow \mathrm{eig}(A) \geq 0$$

8.7.3 Trace

The following holds with respect to the trace:

$$A \text{ pos. def.} \Rightarrow \mathrm{Tr}(A) > 0$$

$$A \text{ pos. semi-def.} \Rightarrow \mathrm{Tr}(A) \geq 0$$

8.7.4 Inverse

If $A$ is positive definite, then $A$ is invertible and $A^{-1}$ is also positive definite.

8.7.5 Diagonal

If $A$ is positive definite, then $A_{ii} > 0$, $\forall i$.

8.7.6 Decomposition I

The matrix $A$ is positive semi-definite of rank $r$ $\Leftrightarrow$ there exists a matrix $B$ of rank $r$ such that $A = B B^T$.

The matrix $A$ is positive definite $\Leftrightarrow$ there exists an invertible matrix $B$ such that $A = B B^T$.

8.7.7 Decomposition II

Assume $A$ is an $n \times n$ positive semi-definite matrix, then there exists an $n \times r$ matrix $B$ of rank $r$ such that $B^T A B = I$.

8.7.8 Equation with zeros

Assume $A$ is positive semi-definite, then $X^T A X = 0 \Rightarrow A X = 0$.

8.7.9 Rank of product

Assume $A$ is positive definite, then $\mathrm{rank}(B A B^T) = \mathrm{rank}(B)$.

8.7.10 Positive definite property

If $A$ is $n \times n$ positive definite and $B$ is $r \times n$ of rank $r$, then $B A B^T$ is positive definite.

8.7.11 Outer Product

If $X$ is $n \times r$, where $n \leq r$ and $\mathrm{rank}(X) = n$, then $X X^T$ is positive definite.

8.7.12 Small perturbations

If $A$ is positive definite and $B$ is symmetric, then $A - tB$ is positive definite for sufficiently small $t$.

8.8 Block matrices

Let $A_{ij}$ denote the $ij$-th block of $A$.
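As a side note on the positive definite case of Sec. 8.7.6, one concrete choice of the invertible factor $B$ in $A = B B^T$ is the lower-triangular Cholesky factor. The sketch below is only an illustration in plain Python; the function names `cholesky`, `bbt` and `is_positive_definite` and the example matrix are assumptions made here, not part of the original text. The factorization succeeds exactly when the symmetric input is positive definite, so it also doubles as a simple test.

```python
import math


def cholesky(a):
    """Return lower-triangular B with a = B B^T.

    Assumes `a` is a symmetric square matrix given as a list of rows.
    Raises ValueError if `a` is not positive definite.
    """
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the already computed columns.
            s = sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                b[i][j] = math.sqrt(d)
            else:
                b[i][j] = (a[i][j] - s) / b[j][j]
    return b


def bbt(b):
    """Return the product B B^T for a square matrix B."""
    n = len(b)
    return [[sum(b[i][k] * b[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_positive_definite(a):
    """Check positive definiteness of a symmetric matrix via Cholesky."""
    try:
        cholesky(a)
        return True
    except ValueError:
        return False


# Hypothetical example matrix (symmetric, positive definite).
A = [[4.0, 2.0, 0.0],
     [2.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
B = cholesky(A)
```

Because the diagonal of the computed $B$ is strictly positive, $B$ is invertible, in line with the second statement of Sec. 8.7.6; a symmetric matrix with a non-positive eigenvalue, such as `[[1.0, 2.0], [2.0, 1.0]]`, makes the factorization fail.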
8.8.1 Multiplication

Assuming the dimensions of the blocks match, we have

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}$$

8.8.2 The Determinant

The determinant can be expressed by the use of

$$C_1 = A_{11} - A_{12} A_{22}^{-1} A_{21}$$

$$C_2 = A_{22} - A_{21} A_{11}^{-1} A_{12}$$

as

$$\det \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \det(A_{22}) \cdot \det(C_1) = \det(A_{11}) \cdot \det(C_2)$$

8.8.3 The Inverse

The inverse can be expressed by the use of

$$C_1 = A_{11} - A_{12} A_{22}^{-1} A_{21}$$

$$C_2 = A_{22} - A_{21} A_{11}^{-1} A_{12}$$

as

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} C_1^{-1} & -A_{11}^{-1} A_{12} C_2^{-1} \\ -C_2^{-1} A_{21} A_{11}^{-1} & C_2^{-1} \end{bmatrix} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1} A_{12} C_2^{-1} A_{21} A_{11}^{-1} & -C_1^{-1} A_{12} A_{22}^{-1} \\ -A_{22}^{-1} A_{21} C_1^{-1} & A_{22}^{-1} + A_{22}^{-1} A_{21} C_1^{-1} A_{12} A_{22}^{-1} \end{bmatrix}$$

8.8.4 Block diagonal

For block diagonal matrices we have

$$\begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} (A_{11})^{-1} & 0 \\ 0 & (A_{22})^{-1} \end{bmatrix}$$

$$\det \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix} = \det(A_{11}) \cdot \det(A_{22})$$

9 Functions and Operators

9.1 Functions and Series

9.1.1 Finite Series

$$(X^n - I)(X - I)^{-1} = I + X + X^2 + \dots + X^{n-1}$$

9.1.2 Taylor Expansion of Scalar Function

Consider some scalar function $f(x)$ which takes the vector $x$ as an argument. This we can Taylor expand around $x_0$:

$$f(x) \cong f(x_0) + g(x_0)^T (x - x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0)$$

where

$$g(x_0) = \left.\frac{\partial f(x)}{\partial x}\right|_{x_0}, \qquad H(x_0) = \left.\frac{\partial^2 f(x)}{\partial x\, \partial x^T}\right|_{x_0}$$

9.1.3 Matrix Functions by Infinite Series

As for analytical functions in one dimension, one can define a matrix function for square matrices $X$ by an infinite series

$$f(X) = \sum_{n=0}^{\infty} c_n X^n$$

assuming the limit exists and is finite. If the coefficients $c_n$ fulfil $\sum_n c_n x^n < \infty$, then one can prove that the above series exists and is finite, see [1]. Thus for any analytical function $f(x)$ there exists a corresponding matrix function $f(X)$ constructed by the Taylor expansion. Using this one can prove the following results:

1) A matrix $A$ is a zero of its own characteristic polynomial [1]:

$$p(\lambda) = \det(I\lambda - A) = \sum_n c_n \lambda^n \quad \Rightarrow \quad p(A) = 0$$

2) If $A$ is square it holds that [1]

$$A = U B U^{-1} \quad \Rightarrow \quad f(A) = U f(B) U^{-1}$$

3) A useful fact when using power series is that

$$A^n \to 0 \text{ for } n \to \infty \quad \text{if } |A| < 1$$
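9.1.4 Exponential Matrix Function

In analogy to the ordinary scalar exponential function, one can define exponential and logarithmic matrix functions:

$$e^{A} \equiv \sum_{n=0}^{\infty} \frac{1}{n!} A^n = I + A + \frac{1}{2} A^2 + \dots$$

$$e^{-A} \equiv \sum_{n=0}^{\infty} \frac{1}{n!} (-1)^n A^n = I - A + \frac{1}{2} A^2 - \dots$$

$$e^{tA} \equiv \sum_{n=0}^{\infty} \frac{1}{n!} (tA)^n = I + tA + \frac{1}{2} t^2 A^2 + \dots$$

$$\ln(I + A) \equiv \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} A^n = A - \frac{1}{2} A^2 + \frac{1}{3} A^3 - \dots$$

Some of the properties of the exponential function are [1]:

$$e^{A} e^{B} = e^{A + B} \quad \text{if } AB = BA$$

$$(e^{A})^{-1} = e^{-A}$$

$$\frac{d}{dt} e^{tA} = A e^{tA} = e^{tA} A, \quad t \in \mathbb{R}$$

$$\frac{d}{dt} \mathrm{Tr}(e^{tA}) = \mathrm{Tr}(A e^{tA})$$

$$\det(e^{A}) = e^{\mathrm{Tr}(A)}$$

9.1.5 Trigonometric Functions

$$\sin(A) \equiv \sum_{n=0}^{\infty} \frac{(-1)^n A^{2n+1}}{(2n+1)!} = A - \frac{1}{3!} A^3 + \frac{1}{5!} A^5 - \dots$$

$$\cos(A) \equiv \sum_{n=0}^{\infty} \frac{(-1)^n A^{2n}}{(2n)!} = I - \frac{1}{2!} A^2 + \frac{1}{4!} A^4 - \dots$$

9.2 Kronecker and Vec Operator

9.2.1 The Kronecker Product

The Kronecker product of an $m \times n$ matrix $A$ and an $r \times q$ matrix $B$ is an $mr \times nq$ matrix, $A \otimes B$, defined as

$$A \otimes B = \begin{bmatrix} A_{11} B & A_{12} B & \cdots & A_{1n} B \\ A_{21} B & A_{22} B & \cdots & A_{2n} B \\ \vdots & & & \vdots \\ A_{m1} B & A_{m2} B & \cdots & A_{mn} B \end{bmatrix}$$

The Kronecker product has the following properties (see [13]):

$$A \otimes (B + C) = A \otimes B + A \otimes C$$

$$A \otimes B \neq B \otimes A$$

$$A \otimes (B \otimes C) = (A \otimes B) \otimes C$$

$$(\alpha_A A \otimes \alpha_B B) = \alpha_A \alpha_B (A \otimes B)$$

$$(A \otimes B)^T = A^T \otimes B^T$$

$$(A \otimes B)(C \otimes D) = AC \otimes BD$$

$$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$$

$$\mathrm{rank}(A \otimes B) = \mathrm{rank}(A)\,\mathrm{rank}(B)$$

$$\mathrm{Tr}(A \otimes B) = \mathrm{Tr}(A)\,\mathrm{Tr}(B)$$

$$\det(A \otimes B) = \det(A)^{\mathrm{rank}(B)} \det(B)^{\mathrm{rank}(A)}$$

9.2.2 The Vec Operator

The vec-operator applied on a matrix $A$ stacks the columns into a vector, i.e. for a $2 \times 2$ matrix

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad \mathrm{vec}(A) = \begin{bmatrix} A_{11} \\ A_{21} \\ A_{12} \\ A_{22} \end{bmatrix}$$

Properties of the vec-operator include (see [13]):

$$\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$$

$$\mathrm{Tr}(A^T B) = \mathrm{vec}(A)^T \mathrm{vec}(B)$$

$$\mathrm{vec}(A + B) = \mathrm{vec}(A) + \mathrm{vec}(B)$$

$$\mathrm{vec}(\alpha A) = \alpha \cdot \mathrm{vec}(A)$$

9.3 Solutions to Systems of Equations

9.3.1 Existence in Linear Systems

Assume $A$ is $n \times m$ and consider the linear system

$$Ax = b$$

Construct the augmented matrix $B = [A \ b]$, then

Condition                      Solution
rank(A) = rank(B) = m          Unique solution x
rank(A) = rank(B) < m          Many solutions x
rank(A) < rank(B)              No solutions x

9.3.2 Standard Square

Assume $A$ is square and invertible, then

$$Ax = b \quad \Rightarrow \quad x = A^{-1} b$$

9.3.3 Degenerated Square

9.3.4 Overdetermined Rectangular

Assume $A$ to be $n \times m$, $n > m$ (tall) and $\mathrm{rank}(A) = m$, then

$$Ax = b \quad \Rightarrow \quad x = (A^T A)^{-1} A^T b = A^{+} b$$

that is, if there exists a solution $x$ at all! If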
there is no solution, the following can be useful:

$$Ax = b \quad \Rightarrow \quad x_{min} = A^{+} b$$

Now $x_{min}$ is the vector $x$ which minimizes $\|Ax - b\|^2$, i.e. the vector which is "least wrong". The matrix $A^{+}$ is the pseudo-inverse of $A$.

9.3.5 Underdetermined Rectangular

Assume $A$ is $n \times m$ and $n < m$ ("broad") with $\mathrm{rank}(A) = n$:

$$Ax = b \quad \Rightarrow \quad x_{min} = A^T (A A^T)^{-1} b$$

The equation has many solutions $x$. But $x_{min}$ is the solution which minimizes $\|Ax - b\|^2$ and also the solution with the smallest norm $\|x\|^2$. The same holds for a matrix version: assume $A$ is $n \times m$, $X$ is $m \times n$ and $B$ is $n \times n$, then

$$AX = B \quad \Rightarrow \quad X_{min} = A^{+} B$$

The equation has many solutions $X$. But $X_{min}$ is the solution which minimizes $\|AX - B\|^2$ and also the solution with the smallest norm $\|X\|^2$.

Similar but different: assume $A$ is square $n \times n$ and the matrices $B_0$, $B_1$ are $n \times N$, where $N > n$; then, if $B_0$ has maximal rank,

$$A B_0 = B_1 \quad \Rightarrow \quad A_{min} = B_1 B_0^T (B_0 B_0^T)^{-1}$$

where $A_{min}$ denotes the matrix which is optimal in a least-squares sense. An interpretation is that $A$ is the linear approximation which maps the column vectors of $B_0$ into the column vectors of $B_1$.

9.3.6 Linear form and zeros

$$Ax = 0, \ \forall x \quad \Rightarrow \quad A = 0$$

9.3.7 Square form and zeros

If $A$ is symmetric, then

$$x^T A x = 0, \ \forall x \quad \Rightarrow \quad A = 0$$

9.3.8 The Lyapunov Equation

$$AX + XB = C$$

$$\mathrm{vec}(X) = (I \otimes A + B^T \otimes I)^{-1}\, \mathrm{vec}(C)$$

See Sec. 9.2.1 and 9.2.2 for details on the Kronecker product and the vec operator.

9.3.9 Encapsulating Sum

$$\sum_n A_n X B_n = C$$

$$\mathrm{vec}(X) = \left(\sum_n B_n^T \otimes A_n\right)^{-1} \mathrm{vec}(C)$$

See Sec. 9.2.1 and 9.2.2 for details on the Kronecker product and the vec operator.

9.4 Matrix Norms

9.4.1 Definitions

A matrix norm is a mapping which fulfils

$$\|A\| \geq 0$$

$$\|A\| = 0 \Leftrightarrow A = 0$$

$$\|cA\| = |c|\,\|A\|, \quad c \in \mathbb{R}$$

$$\|A + B\| \leq \|A\| + \|B\|$$

9.4.2 Examples

$$\|A\|_1 = \max_j \sum_i |A_{ij}|$$

$$\|A\|_2 = \sqrt{\max \mathrm{eig}(A^T A)}$$

$$\|A\|_p = \max_{\|x\|_p = 1} \|Ax\|_p$$

$$\|A\|_\infty = \max_i \sum_j |A_{ij}|$$

$$\|A\|_F = \sqrt{\sum_{ij} |A_{ij}|^2} = \sqrt{\mathrm{Tr}(A A^H)} \quad \text{(Frobenius)}$$

$$\|A\|_{max} = \max_{ij} |A_{ij}|$$

$$\|A\|_{KF} = \|\mathrm{sing}(A)\|_1 \quad \text{(Ky Fan)}$$

where $\mathrm{sing}(A)$ is the vector of singular values of the matrix $A$.

9.4.3 Inequalities

E. H. Rasmussen has in yet unpublished material
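derived and collected the following inequalities. They are collected in the table below, assuming $A$ is $m \times n$ and $d = \min\{m, n\}$:

$$\begin{array}{l|cccccc}
 & \|A\|_{max} & \|A\|_1 & \|A\|_\infty & \|A\|_2 & \|A\|_F & \|A\|_{KF} \\ \hline
\|A\|_{max} & & 1 & 1 & 1 & 1 & 1 \\
\|A\|_1 & m & & m & \sqrt{m} & \sqrt{m} & \sqrt{m} \\
\|A\|_\infty & n & n & & \sqrt{n} & \sqrt{n} & \sqrt{n} \\
\|A\|_2 & \sqrt{mn} & \sqrt{n} & \sqrt{m} & & 1 & 1 \\
\|A\|_F & \sqrt{mn} & \sqrt{n} & \sqrt{m} & \sqrt{d} & & 1 \\
\|A\|_{KF} & \sqrt{mnd} & \sqrt{nd} & \sqrt{md} & d & \sqrt{d} &
\end{array}$$

which are to be read as, e.g.

$$\|A\|_2 \leq \sqrt{m} \cdot \|A\|_\infty$$

9.4.4 Condition Number

The 2-norm of $A$ equals $\sqrt{\max(\mathrm{eig}(A^T A))}$ [9, p. 57]. For a symmetric positive definite matrix, this reduces to $\max(\mathrm{eig}(A))$. The condition number based on the 2-norm thus reduces to

$$\|A\|_2 \|A^{-1}\|_2 = \max(\mathrm{eig}(A)) \max(\mathrm{eig}(A^{-1})) = \frac{\max(\mathrm{eig}(A))}{\min(\mathrm{eig}(A))} \qquad (76)$$

9.5 Rank

9.5.1 Sylvester's Inequality

If $A$ is $m \times n$ and $B$ is $n \times r$, then

$$\mathrm{rank}(A) + \mathrm{rank}(B) - n \leq \mathrm{rank}(AB) \leq \min\{\mathrm{rank}(A), \mathrm{rank}(B)\}$$

9.6 Integral Involving Dirac Delta Functions

Assuming $A$ to be square, then

$$\int p(s)\, \delta(x - As)\, ds = \frac{1}{|\det(A)|}\, p(A^{-1} x)$$

Assuming $A$ to be overdetermined, i.e. "tall", then

$$\int p(s)\, \delta(x - As)\, ds = \begin{cases} \dfrac{1}{\sqrt{\det(A^T A)}}\, p(A^{+} x) & \text{if } x = A A^{+} x \\ 0 & \text{elsewhere} \end{cases}$$

See [8].

9.7 Miscellaneous

For any $A$ it holds that

$$\mathrm{rank}(A) = \mathrm{rank}(A^T) = \mathrm{rank}(A A^T) = \mathrm{rank}(A^T A)$$

It holds that

$$A \text{ is positive definite} \Leftrightarrow \exists B \text{ invertible, such that } A = B B^T$$

9.7.1 Orthogonal matrix

If $A$ is orthogonal, then $\det(A) = \pm 1$.

A One-dimensional Results

A.1 Gaussian

A.1.1 Density

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

A.1.2 Normalization

$$\int e^{-\frac{(s - \mu)^2}{2\sigma^2}}\, ds = \sqrt{2\pi\sigma^2}$$

$$\int e^{-(a x^2 + b x + c)}\, dx = \sqrt{\frac{\pi}{a}} \exp\left[\frac{b^2 - 4ac}{4a}\right]$$

$$\int e^{c_2 x^2 + c_1 x + c_0}\, dx = \sqrt{\frac{\pi}{-c_2}} \exp\left[\frac{c_1^2 - 4 c_2 c_0}{-4 c_2}\right]$$

A.1.3 Derivatives

$$\frac{\partial p(x)}{\partial \mu} = p(x) \frac{(x - \mu)}{\sigma^2}$$

$$\frac{\partial \ln p(x)}{\partial \mu} = \frac{(x - \mu)}{\sigma^2}$$

$$\frac{\partial p(x)}{\partial \sigma} = p(x) \frac{1}{\sigma}\left[\frac{(x - \mu)^2}{\sigma^2} - 1\right]$$

$$\frac{\partial \ln p(x)}{\partial \sigma} = \frac{1}{\sigma}\left[\frac{(x - \mu)^2}{\sigma^2} - 1\right]$$

A.1.4 Completing the Squares

$$c_2 x^2 + c_1 x + c_0 = -a(x - b)^2 + w$$

$$-a = c_2, \qquad b = -\frac{1}{2}\frac{c_1}{c_2}, \qquad w = -\frac{1}{4}\frac{c_1^2}{c_2} + c_0$$

or

$$c_2 x^2 + c_1 x + c_0 = -\frac{1}{2\sigma^2}(x - \mu)^2 + d$$

$$\mu = \frac{-c_1}{2 c_2}, \qquad \sigma^2 = \frac{-1}{2 c_2}, \qquad d = c_0 - \frac{c_1^2}{4 c_2}$$

A.1.5 Moments

If the density is expressed by

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] \quad \text{or} \quad p(x) = C \exp(c_2 x^2 + c_1 x)$$

then the first few basic moments are

$$\langle x \rangle = \mu = \frac{-c_1}{2 c_2}$$

$$\langle x^2 \rangle = \sigma^2 + \mu^2 = \frac{-1}{2 c_2} + \left(\frac{-c_1}{2 c_2}\right)^2$$

$$\langle x^3 \rangle = 3\sigma^2\mu + \mu^3 = \frac{c_1}{(2 c_2)^2}\left[3 - \frac{c_1^2}{2 c_2}\right]$$

$$\langle x^4 \rangle = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4 = \left(\frac{c_1}{2 c_2}\right)^4 + 6\left(\frac{c_1}{2 c_2}\right)^2\left(\frac{-1}{2 c_2}\right) + 3\left(\frac{1}{2 c_2}\right)^2$$

and the central moments are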
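$$\langle (x - \mu) \rangle = 0$$

$$\langle (x - \mu)^2 \rangle = \sigma^2 = \frac{-1}{2 c_2}$$

$$\langle (x - \mu)^3 \rangle = 0$$

$$\langle (x - \mu)^4 \rangle = 3\sigma^4 = 3\left[\frac{1}{2 c_2}\right]^2$$

A kind of pseudo-moments (un-normalized integrals) can easily be derived as

$$\int \exp(c_2 x^2 + c_1 x)\, x^n\, dx = Z \langle x^n \rangle = \sqrt{\frac{\pi}{-c_2}} \exp\left[\frac{c_1^2}{-4 c_2}\right] \langle x^n \rangle$$

From the un-centralized moments one can derive other entities like

$$\langle x^2 \rangle - \langle x \rangle^2 = \sigma^2 = \frac{-1}{2 c_2}$$

$$\langle x^3 \rangle - \langle x^2 \rangle \langle x \rangle = 2\sigma^2\mu = \frac{2 c_1}{(2 c_2)^2}$$

$$\langle x^4 \rangle - \langle x^2 \rangle^2 = 2\sigma^4 + 4\mu^2\sigma^2 = \frac{2}{(2 c_2)^2}\left[1 - \frac{c_1^2}{c_2}\right]$$

A.2 One Dimensional Mixture of Gaussians

A.2.1 Density and Normalization

$$p(s) = \sum_{k}^{K} \frac{\rho_k}{\sqrt{2\pi\sigma_k^2}} \exp\left[-\frac{1}{2}\frac{(s - \mu_k)^2}{\sigma_k^2}\right]$$

A.2.2 Moments

A useful fact of a mixture of Gaussians (MoG) is that

$$\langle x^n \rangle = \sum_k \rho_k \langle x^n \rangle_k$$

where $\langle \cdot \rangle_k$ denotes the average with respect to the $k$-th component. We can calculate the first four moments from the densities

$$p(x) = \sum_k \rho_k \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left[-\frac{1}{2}\frac{(x - \mu_k)^2}{\sigma_k^2}\right]$$

$$p(x) = \sum_k \rho_k C_k \exp\left[c_{k2} x^2 + c_{k1} x\right]$$

as

$$\langle x \rangle = \sum_k \rho_k \mu_k = \sum_k \rho_k \left[\frac{-c_{k1}}{2 c_{k2}}\right]$$

$$\langle x^2 \rangle = \sum_k \rho_k (\sigma_k^2 + \mu_k^2) = \sum_k \rho_k \left[\frac{-1}{2 c_{k2}} + \left(\frac{-c_{k1}}{2 c_{k2}}\right)^2\right]$$

$$\langle x^3 \rangle = \sum_k \rho_k (3\sigma_k^2\mu_k + \mu_k^3) = \sum_k \rho_k \left[\frac{c_{k1}}{(2 c_{k2})^2}\left(3 - \frac{c_{k1}^2}{2 c_{k2}}\right)\right]$$

$$\langle x^4 \rangle = \sum_k \rho_k (\mu_k^4 + 6\mu_k^2\sigma_k^2 + 3\sigma_k^4) = \sum_k \rho_k \left[\left(\frac{1}{2 c_{k2}}\right)^2 \left(\left(\frac{c_{k1}}{2 c_{k2}}\right)^2 c_{k1}^2 - 6\frac{c_{k1}^2}{2 c_{k2}} + 3\right)\right]$$

If all the Gaussians are centered, i.e. $\mu_k = 0$ for all $k$, then

$$\langle x \rangle = 0$$

$$\langle x^2 \rangle = \sum_k \rho_k \sigma_k^2 = \sum_k \rho_k \left[\frac{-1}{2 c_{k2}}\right]$$

$$\langle x^3 \rangle = 0$$

$$\langle x^4 \rangle = \sum_k \rho_k 3\sigma_k^4 = \sum_k \rho_k\, 3\left[\frac{-1}{2 c_{k2}}\right]^2$$

From the un-centralized moments one can derive other entities like

$$\langle x^2 \rangle - \langle x \rangle^2 = \sum_{k,k'} \rho_k \rho_{k'} \left[\mu_k^2 + \sigma_k^2 - \mu_k \mu_{k'}\right]$$

$$\langle x^3 \rangle - \langle x^2 \rangle \langle x \rangle = \sum_{k,k'} \rho_k \rho_{k'} \left[3\sigma_k^2\mu_k + \mu_k^3 - (\sigma_k^2 + \mu_k^2)\mu_{k'}\right]$$

$$\langle x^4 \rangle - \langle x^2 \rangle^2 = \sum_{k,k'} \rho_k \rho_{k'} \left[\mu_k^4 + 6\mu_k^2\sigma_k^2 + 3\sigma_k^4 - (\sigma_k^2 + \mu_k^2)(\sigma_{k'}^2 + \mu_{k'}^2)\right]$$

A.2.3 Derivatives

Defining $p(s) = \sum_k \rho_k \mathcal{N}_s(\mu_k, \sigma_k^2)$, we get for a parameter $\theta_j$ of the $j$-th component

$$\frac{\partial \ln p(s)}{\partial \theta_j} = \frac{\rho_j \mathcal{N}_s(\mu_j, \sigma_j^2)}{\sum_k \rho_k \mathcal{N}_s(\mu_k, \sigma_k^2)} \frac{\partial \ln(\rho_j \mathcal{N}_s(\mu_j, \sigma_j^2))}{\partial \theta_j}$$

that is,

$$\frac{\partial \ln p(s)}{\partial \rho_j} = \frac{\rho_j \mathcal{N}_s(\mu_j, \sigma_j^2)}{\sum_k \rho_k \mathcal{N}_s(\mu_k, \sigma_k^2)} \frac{1}{\rho_j}$$

$$\frac{\partial \ln p(s)}{\partial \mu_j} = \frac{\rho_j \mathcal{N}_s(\mu_j, \sigma_j^2)}{\sum_k \rho_k \mathcal{N}_s(\mu_k, \sigma_k^2)} \frac{(s - \mu_j)}{\sigma_j^2}$$

$$\frac{\partial \ln p(s)}{\partial \sigma_j} = \frac{\rho_j \mathcal{N}_s(\mu_j, \sigma_j^2)}{\sum_k \rho_k \mathcal{N}_s(\mu_k, \sigma_k^2)} \frac{1}{\sigma_j}\left[\frac{(s - \mu_j)^2}{\sigma_j^2} - 1\right]$$

Note that $\rho_k$ must be constrained to be proper ratios. Defining the ratios by $\rho_j = e^{r_j} / \sum_k e^{r_k}$, we obtain

$$\frac{\partial \ln p(s)}{\partial r_j} = \sum_l \frac{\partial \ln p(s)}{\partial \rho_l} \frac{\partial \rho_l}{\partial r_j} \quad \text{where} \quad \frac{\partial \rho_l}{\partial r_j} = \rho_l(\delta_{lj} - \rho_j)$$

B Proofs and Details

B.1 Misc Proofs

B.1.1 Proof of Equation 14

Essentially we need to calculate

$$\frac{\partial (X^n)_{kl}}{\partial X_{ij}} = \frac{\partial}{\partial X_{ij}} \sum_{u_1, \dots, u_{n-1}} X_{k,u_1} X_{u_1,u_2} \cdots X_{u_{n-1},l}$$

$$= \delta_{k,i}\delta_{u_1,j} X_{u_1,u_2} \cdots X_{u_{n-1},l} + X_{k,u_1} \delta_{u_1,i}\delta_{u_2,j} \cdots X_{u_{n-1},l} + \dots + X_{k,u_1} X_{u_1,u_2} \cdots \delta_{u_{n-1},i}\delta_{l,j}$$

$$= \sum_{r=0}^{n-1} (X^r)_{ki} (X^{n-1-r})_{jl}$$

$$= \sum_{r=0}^{n-1} (X^r J^{ij} X^{n-1-r})_{kl}$$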
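Using the properties of the single-entry matrix found in Sec. 8.2.4, the result follows easily.

B.1.2 Details on Eq. 78

$$\partial \det(X^H A X) = \det(X^H A X)\, \mathrm{Tr}[(X^H A X)^{-1} \partial(X^H A X)]$$

$$= \det(X^H A X)\, \mathrm{Tr}[(X^H A X)^{-1} (\partial(X^H) A X + X^H \partial(A X))]$$

$$= \det(X^H A X) \left(\mathrm{Tr}[(X^H A X)^{-1} \partial(X^H) A X] + \mathrm{Tr}[(X^H A X)^{-1} X^H \partial(A X)]\right)$$

$$= \det(X^H A X) \left(\mathrm{Tr}[A X (X^H A X)^{-1} \partial(X^H)] + \mathrm{Tr}[(X^H A X)^{-1} X^H A\, \partial(X)]\right)$$

First, the derivative is found with respect to the real part of $X$:

$$\frac{\partial \det(X^H A X)}{\partial \Re X} = \det(X^H A X) \left(\frac{\mathrm{Tr}[A X (X^H A X)^{-1} \partial(X^H)]}{\partial \Re X} + \frac{\mathrm{Tr}[(X^H A X)^{-1} X^H A\, \partial(X)]}{\partial \Re X}\right)$$

$$= \det(X^H A X) \left(A X (X^H A X)^{-1} + ((X^H A X)^{-1} X^H A)^T\right)$$

Through the calculations, (16) and (47) were used. In addition, by use of (48), the derivative is found with respect to the imaginary part of $X$:

$$i\, \frac{\partial \det(X^H A X)}{\partial \Im X} = i \det(X^H A X) \left(\frac{\mathrm{Tr}[A X (X^H A X)^{-1} \partial(X^H)]}{\partial \Im X} + \frac{\mathrm{Tr}[(X^H A X)^{-1} X^H A\, \partial(X)]}{\partial \Im X}\right)$$

$$= \det(X^H A X) \left(A X (X^H A X)^{-1} - ((X^H A X)^{-1} X^H A)^T\right)$$

Hence, the derivative yields

$$\frac{\partial \det(X^H A X)}{\partial X} = \frac{1}{2}\left(\frac{\partial \det(X^H A X)}{\partial \Re X} - i\, \frac{\partial \det(X^H A X)}{\partial \Im X}\right) = \det(X^H A X) \left((X^H A X)^{-1} X^H A\right)^T$$

and the complex conjugate derivative yields

$$\frac{\partial \det(X^H A X)}{\partial X^*} = \frac{1}{2}\left(\frac{\partial \det(X^H A X)}{\partial \Re X} + i\, \frac{\partial \det(X^H A X)}{\partial \Im X}\right) = \det(X^H A X)\, A X (X^H A X)^{-1}$$

Notice that, for real $X$, $A$, the sum of (56) and (57) is reduced to (13).

Similar calculations yield

$$\frac{\partial \det(X A X^H)}{\partial X} = \frac{1}{2}\left(\frac{\partial \det(X A X^H)}{\partial \Re X} - i\, \frac{\partial \det(X A X^H)}{\partial \Im X}\right) = \det(X A X^H) \left(A X^H (X A X^H)^{-1}\right)^T \qquad (77)$$

and

$$\frac{\partial \det(X A X^H)}{\partial X^*} = \frac{1}{2}\left(\frac{\partial \det(X A X^H)}{\partial \Re X} + i\, \frac{\partial \det(X A X^H)}{\partial \Im X}\right) = \det(X A X^H)\, (X A X^H)^{-1} X A \qquad (78)$$

References

[1] Karl Gustav Andersson and Lars-Christer Boiers. Ordinaera differentialekvationer. Studenterlitteratur, 1992.

[2] Jörn Anemüller, Terrence J. Sejnowski, and Scott Makeig. Complex independent component analysis of frequency-domain electroencephalographic data. Neural Networks, 16(9):1311-1323, November 2003.

[3] S. Barnet. Matrices. Methods and Applications. Oxford Applied Mathematics and Computing Science Series. Clarendon Press, 1990.

[4] Christoffer Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.

[5] Robert J. Boik. Lecture notes: Statistics 550. Online, April 22, 2002. Notes.

[6] D. H. Brandwood. A complex gradient operator and its application in adaptive array theory. IEE Proceedings, 130(1):11-16, February 1983. PTS. F and H.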
[7] M. Brookes. Matrix Reference Manual, 2004. Website, May 20, 2004.

[8] Mads Dyrholm. Some matrix results, 2004. Website, August 23, 2004.

[9] Gene H. Golub and Charles F. van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 3rd edition, 1996.

[10] Robert M. Gray. Toeplitz and circulant matrices: A review. Technical report, Information Systems Laboratory, Department of Electrical Engineering, Stanford University, Stanford, California 94305, August 2002.

[11] Simon Haykin. Adaptive Filter Theory. Prentice Hall, Upper Saddle River, NJ, 4th edition, 2002.

[12] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 1985.

[13] Thomas P. Minka. Old and new matrix algebra useful for statistics, December 2000. Notes.

[14] L. Parra and C. Spence. Convolutive blind separation of non-stationary sources. In IEEE Transactions Speech and Audio Processing, pages 320-327, May 2000.

[15] John G. Proakis and Dimitris G. Manolakis. Digital Signal Processing. Prentice-Hall, 1996.

[16] Laurent Schwartz. Cours d'Analyse, volume II. Hermann, Paris, 1967. As referenced in [11].