Review Sheet for CMPSCI 689 at UMass(5)
These 42 pages of class notes were uploaded on Friday, February 6, 2015. The notes belong to a course at the University of Massachusetts taught in the Fall. Since upload, they have received 14 views.
Date Created: 02/06/15
CMPSCI 689: Linear Algebra, Least-Squares Regression, and the Multivariate Gaussian
Sridhar Mahadevan (mahadeva@cs.umass.edu), University of Massachusetts

Topics
- Review of linear algebra (Linear Algebra, Strang)
- Matrices and vector spaces
- Projections
- Orthonormal projections and least-squares regression
- Eigenvalues and eigenvectors
- Multivariate Gaussian densities
- Discriminant functions

Applications of Linear Algebra
- Face recognition using eigenfaces (Moghaddam and Pentland): solves the problem of face recognition using linear algebra and multivariate Gaussian density estimation; eigenvectors produce a reduced representation of faces.
- Information retrieval (Google): construct a singular value decomposition of the adjacency matrix underlying the web by doing a random walk, and look for dominant eigenvectors.
- Orthonormal basis expansions: wavelets, Fourier series, etc., used to encode signals. Early layers in the visual cortex compute reduced representations of high-dimensional visual data (Theoretical Neuroscience, Dayan et al., MIT Press).

Face Recognition Using Eigenvectors
M. Turk and A. Pentland, "Eigenfaces for Recognition," Journal of Cognitive Neuroscience 3(1), 1991.

Feature Detection Using Eigenvectors
B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation." In S. K. Nayar and T. Poggio, editors, Early Visual Learning, pages 99-130, Oxford Univ. Press, 1996.
[Figure: face images compared by distance in eigenspace and distance from eigenspace]

Vectors
- A vector x is an n-tuple, e.g. (x_1, ..., x_n).
- A text document d can be viewed as a very long vector in which each component d_i represents the number of occurrences of a particular word. An image can be viewed as a vector of pixels.
- A matrix is a list of vectors, one per column, e.g.
    [a_11  a_12]
    [a_21  a_22]
- For example, the rows of a matrix A could be documents and the columns could represent words; the (i, j) entry a_ij = 1 if word j occurs in document i.
- The transpose of a matrix, A^T, is obtained by interchanging the rows and columns of A. What is the matrix A^T A if A is the word-document frequency matrix?
Vector Addition and Subtraction
[Figure: parallelogram rule for adding and subtracting two vectors]

Vector Operations
- Addition and subtraction are done component-wise.
- Multiplication by a scalar: c x^T = (c x_1, ..., c x_n).
- Inner product between two vectors: x · y = x^T y = Σ_i x_i y_i.
- Length of a vector: ‖x‖ = √(x · x).
- Outer product between two vectors: x y^T results in an n × n matrix, e.g.
    [x_1 y_1  x_1 y_2]
    [x_2 y_1  x_2 y_2]

More Vector Operations
- Unit vectors have length 1, e.g. i = (1, 0), j = (0, 1), u = (cos α, sin α).
- The unit vector in the direction of x is u = x / ‖x‖.
- A coordinate rotation by an angle α can be represented by two unit vectors u_1 = (cos α, sin α) and u_2 = (−sin α, cos α).
- Two vectors are orthogonal if x · y = 0.

Cauchy-Schwarz Inequality
- The angle α between two vectors x and y is given by (show this):
    cos α = (x · y) / (‖x‖ ‖y‖)
- Since |cos α| ≤ 1, this immediately gives us the celebrated inequality (why?):
    |x · y| ≤ ‖x‖ ‖y‖

Matrix Inverses and Determinants
- The inverse of a square invertible matrix A is A^{-1}. In practice, inverses can be found by Gauss-Jordan elimination.
- The determinant of a matrix A is a single scalar that represents the volume of a box whose edges are defined by the rows of the matrix.
- Given a matrix A, if we form a new matrix A' by interchanging two rows, then |A'| = −|A|. If we multiply a single row of A by c, keeping all other rows constant, then |A'| = c|A|. Also |AB| = |A||B|, and for
    A = [a  b]
        [c  d]
  we have |A| = ad − bc.
- The cofactor C_ij = (−1)^{i+j} |M_ij|, where the matrix M_ij is formed by eliminating row i and column j. The (i, j) element of the inverse is (A^{-1})_ij = C_ji / |A|. For the 2 × 2 case,
    A^{-1} = 1/(ad − bc) [ d  −b]
                         [−c   a]

Vector Spaces
- A vector space is a set of vectors closed under addition and multiplication by a scalar; it has a 0 element and additive inverses, a scalar identity 1, and satisfies the standard associativity and commutativity laws.
- Examples: R^n is the vector space of n-dimensional real vectors; M is the vector space of all 2 × 2 real matrices; etc.
- A subspace of a vector space is a set of vectors that contains the 0 element and is closed under addition and scalar multiplication; e.g., all vectors on a line passing through (0, 0) constitute a subspace of R^2.
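These identities are easy to sanity-check numerically. A minimal pure-Python sketch (the example vectors are arbitrary choices, not from the slides):

```python
import math

def dot(x, y):
    """Inner product x . y = sum_i x_i * y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Length ||x|| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

x = [1.0, 2.0, 2.0]
y = [3.0, 0.0, 4.0]

# Angle between x and y: cos(alpha) = (x . y) / (||x|| ||y||)
cos_alpha = dot(x, y) / (norm(x) * norm(y))

# Cauchy-Schwarz: |x . y| <= ||x|| ||y||, i.e. |cos(alpha)| <= 1
assert abs(dot(x, y)) <= norm(x) * norm(y)
assert abs(cos_alpha) <= 1.0
```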
- Is the quadrant made up of all x such that x_1 > 0, x_2 > 0 a vector space? How about the set of all 2 × 2 symmetric matrices?

Vector Spaces Associated with a Matrix
- The column space C(A) associated with an m × n matrix A is the set of all linear combinations of the columns of A: Ax = A_1 x_1 + A_2 x_2 + ..., where A_1 is the first column of A, etc.
- The row space C(A^T) associated with an m × n matrix A is the column space of the transpose A^T.
- The system of equations Ax = b is solvable if and only if b is in the column space of A.
- The null space N(A) associated with an m × n matrix A is the set of all vectors x such that Ax = 0. The left null space N(A^T) is the set of all vectors x such that A^T x = 0.
- What is the null space of
    [1  2]
    [3  6] ?
  We solve the linear system
    [1  2] [x_1]   [0]
    [3  6] [x_2] = [0]
  The null space N(A) consists of all scalar multiples of (−2, 1).

Example of Matrix Vector Spaces
- Consider the matrix A = [1 2 3]. What is the column space C(A)? What is the row space C(A^T)? What is the null space N(A)? What is the left null space N(A^T)? Repeat the above for the second matrix on the slide.

Independence
- The columns of a matrix A are linearly independent if the only solution to Ax = 0 is x = 0. E.g., (1, 0) and (0, 1) are independent.
- Any set of n vectors in R^m is dependent if n > m.
- A set of vectors spans a space if their linear combinations fill the space.
- A basis for a vector space is a set of linearly independent vectors that span the space.

Dimensions
- The dimension of a vector space is the number of vectors in every basis; e.g., the dimension of R^n is n.
- The dimension of the row space = r = the number of independent rows. The dimension of the column space = r = the number of independent columns.
- If the dimension of the row/column space equals the number of rows and columns, the matrix A is full rank and has an inverse A^{-1}.
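The null-space example above can be verified mechanically. A small pure-Python sketch using the slide's matrix:

```python
def matvec(A, x):
    """Multiply a matrix A (list of rows) by a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2],
     [3, 6]]      # rank 1: row 2 is 3 times row 1

v = [-2, 1]       # basis vector for the null space N(A)

# Every scalar multiple c*v satisfies A(cv) = 0
for c in [1, -3, 2.5]:
    assert matvec(A, [c * vi for vi in v]) == [0, 0]
```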
- The dimension of the null space = n − r. The dimension of the left null space = m − r.
- Give the dimensions of the vector spaces associated with the matrix A = [1 2 3]. Repeat the above for the second matrix on the slide.

The Four Subspaces of a Matrix
[Figure: for an m × n matrix A, the row space C(A^T) (dimension r) and the null space N(A) (dimension n − r) partition R^n; the column space C(A) (dimension r) and the left null space N(A^T) (dimension m − r) partition R^m; Ax = b maps the row space onto the column space.]

Orthogonal Spaces
- Two subspaces W and V are orthogonal if every w ∈ W is perpendicular to every v ∈ V.
- A subspace W forms an orthogonal complement to another subspace V if for each vector v ∈ V there exists a unique perpendicular vector v⊥ ∈ W.
- The null space N(A) forms an orthogonal complement to the row space C(A^T). The left null space N(A^T) forms an orthogonal complement to the column space C(A).

Projection of a Vector onto a Line
- We want to find the point p on a line through a vector a that is closest to another vector b.
- The point p = x̂ a, since it must be in the direction of a. The error b − p = b − x̂ a must be orthogonal to a:
    a · (b − x̂ a) = 0  ⟹  x̂ = (a · b)/(a · a) = (a^T b)/(a^T a)
- The projected vector is p = a (a^T b)/(a^T a). The projection matrix is P = (a a^T)/(a^T a).
- Exercise: find the projection of b^T = (1, 1, 1) onto a^T = (1, 2, 2), and find the projection matrix of a.

Projection onto a Subspace
- Given a set of linearly independent vectors a_1, ..., a_n in R^m, find the combination x̂_1 a_1 + ... + x̂_n a_n that is closest to a given vector b:
    a_1^T (b − A x̂) = 0, ..., a_n^T (b − A x̂) = 0
- These n equations can be summarized as A^T (b − A x̂) = 0, where A is an m × n rectangular matrix.
- The projection is given by p = A x̂ = A (A^T A)^{-1} A^T b. The projection matrix is P = A (A^T A)^{-1} A^T.

Least-Squares Regression
- We have just solved the learning problem of finding the best line that fits a set of points while minimizing the squared-error loss function. This is the problem of linear regression.
- Let the equation of the line (in 2 dimensions) be y = B + Cx. We are given a set of points (x_1, y_1), (x_2, y_2), ..., (x_m, y_m), which yields the system B + C x_1 = y_1, ..., B + C x_m = y_m:
    [1  x_1] [B]   [y_1]
    [1  x_2] [C] = [y_2]
    [  ...  ]      [...]
    [1  x_m]       [y_m]
- Since m > 2 (often much larger), we have a system of equations that cannot be solved exactly; i.e., the 2 columns span only a small portion of the m-dimensional space.
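The projection exercise above can be checked numerically. A pure-Python sketch using the slide's vectors a = (1, 2, 2) and b = (1, 1, 1):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

a = [1, 2, 2]   # direction of the line
b = [1, 1, 1]   # vector to project (the slide's exercise)

# x_hat = (a^T b) / (a^T a); the projection is p = x_hat * a
x_hat = dot(a, b) / dot(a, a)
p = [x_hat * ai for ai in a]

# The error e = b - p must be orthogonal to a
e = [bi - pi for bi, pi in zip(b, p)]
assert abs(dot(e, a)) < 1e-12

# Projection matrix P = a a^T / (a^T a); applying it to b gives p again
P = [[ai * aj / dot(a, a) for aj in a] for ai in a]
Pb = [sum(P[i][j] * b[j] for j in range(3)) for i in range(3)]
assert all(abs(u - w) < 1e-12 for u, w in zip(Pb, p))
```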
Least-Squares Regression (continued)
- Since we can't find an exact solution to Ax = b, we find an approximation x̂ with A x̂ ≈ b, where x̂ minimizes the distance ‖Ax − b‖:
    A^T A = [1    1   ...  1  ] [1  x_1]   [m      Σ x_i  ]
            [x_1  x_2 ... x_m ] [1  x_2] = [Σ x_i  Σ x_i^2]
                                [  ... ]
                                [1  x_m]
- Solve the system
    [m      Σ x_i  ] [B]   [Σ y_i    ]
    [Σ x_i  Σ x_i^2] [C] = [Σ x_i y_i]

Least-Squares Projection: Vector Space View
[Figure: b is not in the column space C(A); least squares projects b onto C(A), and the error b − A x̂ lies in the left null space N(A^T). The null space N(A) is empty.]

Orthogonal Least-Squares Regression
- We can make the computation even simpler by converting the A^T A matrix into a diagonal matrix. The trick is to define a new set of centered variables x̃_i = x_i − x̄, where x̄ = (Σ x_i)/m.
- We know from before that Σ (x_i − x̄) = 0, so Σ x̃_i = 0. The new system becomes
    [m  0       ] [B]   [Σ y_i    ]
    [0  Σ x̃_i^2] [C] = [Σ x̃_i y_i]
- The new system is much easier to solve. This is an example of Gram-Schmidt orthogonalization.

Orthonormal Vectors and Matrices
- A set of vectors is orthonormal if q_i^T q_j = δ_ij, which is 1 exactly when i = j and 0 otherwise. So orthonormal vectors are both perpendicular to each other and unit vectors.
- Show that a matrix Q whose columns are orthonormal vectors has the property Q^T Q = I. If Q is square, show that this implies Q^T = Q^{-1}.
- Example of an orthonormal matrix:
    [cos α  −sin α]
    [sin α   cos α]
  This matrix corresponds to a coordinate rotation by α.
- Show that the permutation matrix
    [0 1 0]
    [1 0 0]
    [0 0 1]
  is orthonormal.

Gram-Schmidt Orthogonalization
- How do we make a set of vectors into an orthonormal set? Let the first vector be given as A = a.
- The second vector B must be made perpendicular to the first. We can do that by subtracting from b its projection along A:
    B = b − (A^T b)/(A^T A) A
- The third vector C can now be made perpendicular to the first two:
    C = c − (A^T c)/(A^T A) A − (B^T c)/(B^T B) B
- Repeat until done. Finally, normalize the length of every vector: q_1 = A/‖A‖, q_2 = B/‖B‖, etc.
- Example: orthogonalize the vectors a = (1, 1, 0), b = (2, 0, 2), c = (3, 3, 3).
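The Gram-Schmidt procedure just described fits in a few lines. A pure-Python sketch, run on the slide's example vectors:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def subtract_projection(v, u):
    """Subtract from v its projection along u: v - (u.v / u.u) u."""
    coeff = dot(u, v) / dot(u, u)
    return [vi - coeff * ui for vi, ui in zip(v, u)]

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        for u in basis:
            v = subtract_projection(v, u)   # remove components along earlier vectors
        n = math.sqrt(dot(v, v))
        basis.append([vi / n for vi in v])  # normalize to unit length
    return basis

# The slide's example vectors
q = gram_schmidt([[1, 1, 0], [2, 0, 2], [3, 3, 3]])

# The result is orthonormal: q_i . q_j = delta_ij
for i in range(3):
    for j in range(3):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(q[i], q[j]) - expected) < 1e-12
```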
Eigenvalues and Eigenvectors
- Generally, when we multiply a vector x by a matrix A, the resulting vector Ax is not in the same direction as x. Occasionally, however, this does happen, and x is then called an eigenvector. Formally, an eigenvector x of a matrix A satisfies Ax = λx, where λ is a scalar called an eigenvalue.
- Eigenvectors lie in the null space of A − λI, because they satisfy the equation (A − λI)x = 0.
- Algorithm: solve the equation |A − λI| = 0 to find the eigenvalues. For each resulting eigenvalue λ_i, solve the equation (A − λ_i I)x = 0 to find the corresponding eigenvector.
- Exercise: find the eigenvalues and eigenvectors of the 2 × 2 matrix given on the slide.

Properties of Eigenvalues and Eigenvectors
- The product of the n eigenvalues gives us the determinant of the matrix, so if the matrix is singular, 0 must be an eigenvalue.
- The sum of the eigenvalues equals the trace, i.e. the sum of the diagonal entries of the matrix.
- If a square n × n matrix A has n linearly independent eigenvectors, then S^{-1} A S = Λ, or equivalently A = S Λ S^{-1}, where
    Λ = [λ_1   0  ...   0 ]
        [ 0   λ_2 ...   0 ]
        [        ...      ]
        [ 0    0  ...  λ_n]
  The columns of S are made up of the eigenvectors. To prove this, first show that AS = SΛ; the result then follows directly.
- Suggest a fast way to compute A^k given the decomposition above.

Eigenvalues of Symmetric Matrices
- If a matrix A is symmetric, that is A^T = A, then all its eigenvalues are real.
- This leads to a fundamental spectral theorem: every symmetric matrix A has the factorization A = Q Λ Q^T, where the eigenvector matrix Q is comprised of an orthonormal set of vectors. Interesting property: if Q is orthonormal, then Q^T = Q^{-1}.
- Find the orthonormal set of eigenvectors for the previous matrix.
- Show that any symmetric matrix can be written as a sum of projection matrices:
    A = λ_1 x_1 x_1^T + ... + λ_n x_n x_n^T
  Each of these matrices is a projection along an eigenvector direction.

Positive Definite Matrices
- A symmetric matrix A is positive definite if x^T A x > 0 for every nonzero vector x; if we allow equality, we get the property of positive semidefiniteness.
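For a symmetric 2 × 2 matrix the eigenvalues come straight from the quadratic characteristic equation, which makes the trace and determinant properties easy to verify. A pure-Python sketch (the matrix [[2, 1], [1, 2]] is an arbitrary illustration, not the slide's exercise matrix):

```python
import math

def eig2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]],
    from the characteristic equation
    lambda^2 - (a + c) lambda + (ac - b^2) = 0."""
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(tr * tr - 4 * det)   # real, since the matrix is symmetric
    return (tr + disc) / 2, (tr - disc) / 2

# Example matrix A = [[2, 1], [1, 2]]
l1, l2 = eig2(2, 1, 2)

# Properties from the slide: sum of eigenvalues = trace,
# product of eigenvalues = determinant
assert abs((l1 + l2) - 4) < 1e-12   # trace = 2 + 2
assert abs(l1 * l2 - 3) < 1e-12    # det = 4 - 1
```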
- Symmetric matrices with positive eigenvalues are positive definite: if Ax = λx, then x^T A x = λ x^T x > 0 when λ > 0.
- For a 2 × 2 symmetric matrix
    A = [a  b]
        [b  c]
  we have x^T A x = a x^2 + 2 b x y + c y^2 > 0. Note that x^T A x = 1 is the equation of an ellipse; what we need to determine is the orientation and the parameters of the ellipse.
- Result: given a positive definite matrix A = Q Λ Q^T, the graph of x^T A x = 1 is an ellipse (or ellipsoid):
    x^T Q Λ Q^T x = λ_1 X^2 + λ_2 Y^2 = 1,  where (X, Y) = Q^T x
  The half-lengths of the major and minor axes are 1/√λ_1 and 1/√λ_2.

Multivariate Normal Distribution
[Figure: surface plot of a 2D normal density]

- The multivariate normal distribution in d dimensions is given by
    p(x) = 1/((2π)^{d/2} |Σ|^{1/2}) exp(−(1/2)(x − μ)^T Σ^{-1} (x − μ))
- μ = E[x] is a d-dimensional mean vector.
- Σ = E[(x − μ)(x − μ)^T] is a d × d covariance matrix; |Σ| is the determinant of Σ.

Covariance Matrix
- The covariance matrix in 2 dimensions is
    Σ = [E[(x_1 − μ_1)^2]           E[(x_1 − μ_1)(x_2 − μ_2)]]   [σ_1^2   σ_12 ]
        [E[(x_2 − μ_2)(x_1 − μ_1)]  E[(x_2 − μ_2)^2]         ] = [σ_12    σ_2^2]
- The covariance matrix is symmetric, which means that its eigenvalues are real. It also means that its eigenvectors form an orthonormal set (independent, unit length): if Σ u_i = λ_i u_i, then u_i^T u_j = δ_ij.
- Since Σ is symmetric, it can be written as Σ = Q Λ Q^{-1}, where Λ is the eigenvalue diagonal matrix and Q is the orthonormal eigenvector matrix. It immediately follows that the inverse covariance matrix can also be written in such a form (why?). This implies that the inverse covariance matrix Σ^{-1} can be expressed as a sum of projection matrices:
    Σ^{-1} = Σ_{i=1}^{d} (1/λ_i) u_i u_i^T

Mahalanobis Distance
- The exponent in the multivariate normal distribution,
    Δ^2 = (x − μ)^T Σ^{-1} (x − μ),
  is called the Mahalanobis distance. If the covariance matrix is diagonal, Σ = σ^2 I, it reduces to a (scaled) Euclidean distance.
- Consider a change of coordinates from the original x to y = U(x − μ), where the rows of U are the eigenvectors of Σ.
- Then, using the above expression for Σ^{-1}, it can be shown that Δ^2 = Σ_j y_j^2 / λ_j. Notice this shows that the inverse covariance matrix is positive semidefinite.
[Figure: contours of constant Mahalanobis distance in the (x_1, x_2) plane, with principal axes along the eigenvectors of Σ]
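In the eigenvector coordinates the Mahalanobis distance is just the variance-weighted form Δ² = Σ_j y_j²/λ_j, which is easy to compute directly when the covariance is diagonal. A pure-Python sketch (the eigenvalues 4 and 1 are arbitrary illustrative choices):

```python
def mahalanobis_sq_diag(x, mu, var):
    """Delta^2 = (x - mu)^T Sigma^{-1} (x - mu) for a diagonal
    covariance Sigma = diag(var): a variance-weighted Euclidean distance."""
    return sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))

mu  = [0.0, 0.0]
var = [4.0, 1.0]   # eigenvalues lambda_1, lambda_2 of Sigma

# Points at equal Mahalanobis distance lie on an ellipse with
# half-axes sqrt(lambda_i): both points below have Delta^2 = 1.
d1 = mahalanobis_sq_diag([2.0, 0.0], mu, var)   # 2 = sqrt(lambda_1)
d2 = mahalanobis_sq_diag([0.0, 1.0], mu, var)   # 1 = sqrt(lambda_2)
assert abs(d1 - 1.0) < 1e-12 and abs(d2 - 1.0) < 1e-12
```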
Transformed Multivariate Gaussian
- Note that in the transformed coordinates y = U(x − μ), the multivariate Gaussian can be written in the following simplified form:
    p(y) = Π_{j=1}^{d} 1/√(2π λ_j) exp(−y_j^2 / (2 λ_j))
  This is a product of d independent normal distributions, each with mean 0 and variance λ_j (the j-th eigenvalue). Thus the transformation y = U(x − μ) decorrelates the original components.

Singular Value Decomposition
- So far we have seen how to decompose symmetric matrices into a product of orthogonal matrices (the eigenvector matrix) and a diagonal matrix (the eigenvalue matrix). What about nonsymmetric matrices?
- The key insight behind SVD is to find two orthogonal bases U and V such that for any matrix A the following decomposition holds:
    A = U Σ V^T
  where U and V are orthogonal matrices, so U U^T = I and V V^T = I.
- Σ is a diagonal matrix whose entries are called singular values; hence the term SVD.

- We want to choose the bases so that not only are the v_i an orthonormal set (unit vectors), but the results of applying A to them, the vectors A v_i, are also orthogonal to each other.
- We can then construct the u_i set to give us the second orthonormal basis:
    σ_i = ‖A v_i‖,   u_i = A v_i / σ_i
- So we get the following relationships:
    A V = U Σ  ⟹  U^{-1} A V = Σ  ⟹  U^T A V = Σ
    A V = U Σ  ⟹  A = U Σ V^{-1} = U Σ V^T

Finding Orthonormal Bases for SVD
- How do we go about finding U and V? Here is the trick: we can eliminate U from the equation A V = U Σ by premultiplying A by its transpose:
    A^T A = (U Σ V^T)^T (U Σ V^T) = V Σ^T U^T U Σ V^T = V Σ^T Σ V^T
  Since A^T A is always symmetric (why?), the above expression is exactly the familiar spectral decomposition Q Λ Q^T we have seen before, except that now V represents the orthonormal eigenvector set of A^T A.
- In a similar fashion, we can eliminate V from the equation A V = U Σ by postmultiplying A by its transpose:
    A A^T = (U Σ V^T)(U Σ V^T)^T = U Σ V^T V Σ^T U^T = U Σ Σ^T U^T
  Here the diagonal matrix Σ Σ^T is the eigenvalue matrix of A A^T, and U is its orthonormal eigenvector set.
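The observation that V contains the eigenvectors of AᵀA suggests a simple way to approximate the top singular value: power iteration on AᵀA. A pure-Python sketch (the test matrix is an arbitrary diagonal example, where the singular values are just the diagonal entries):

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def top_singular(A, iters=200):
    """Power iteration on A^T A: repeated multiplication converges to the
    eigenvector of A^T A with the largest eigenvalue, i.e. the top right
    singular vector v_1 of A; then sigma_1 = ||A v_1||."""
    At = transpose(A)
    v = [1.0] * len(A[0])
    for _ in range(iters):
        w = matvec(At, matvec(A, v))              # w = A^T A v
        n = math.sqrt(sum(x * x for x in w))
        v = [x / n for x in w]                    # renormalize each step
    Av = matvec(A, v)
    sigma = math.sqrt(sum(x * x for x in Av))
    return sigma, v

# Diagonal example: singular values are |3| and |2|, so sigma_1 = 3
sigma, v = top_singular([[3.0, 0.0], [0.0, 2.0]])
assert abs(sigma - 3.0) < 1e-9
```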
Examples of SVD
- Find the SVD of the matrices given on the slide.

SVD Technology: How Google Works
- Google uses SVD to accelerate finding relevant web pages. Here is a brief explanation of the general idea.
- Define a web site as an authority if many sites link to it. Define a web site as a hub if it links to many sites.
- We want to compute a ranking x_1, ..., x_N of authorities and y_1, ..., y_M of hubs. As a first pass, we can compute the ranking scores as follows: x_i is the number of links pointing to i, and y_j is the number of links going out of j.
- But not all links should be weighted equally. For example, links from authorities or hubs should count more. So we can revise the rankings as follows:
    x_i^1 = Σ_{j links to i} y_j^0,   i.e.  x^1 = A^T y^0
    y_j^1 = Σ_{i : j links to i} x_i^0,   i.e.  y^1 = A x^0
  Here A is the adjacency matrix, where a_ij = 1 if i links to j.
- Clearly we can keep iterating, and compute
    x^k = (A^T A) x^{k−1}  and  y^k = (A A^T) y^{k−1}
- This is basically an iterative SVD computation; as k increases, the largest of the eigenvalues dominates.
- Google is doing an SVD over a matrix A of size 3 × 10^9 by 3 × 10^9.
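The hub/authority iteration above can be sketched on a toy graph. Pure Python, with a made-up 3-page adjacency matrix (not from the slides):

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def normalize(x):
    n = math.sqrt(sum(v * v for v in x))
    return [v / n for v in x]

# Toy web graph: A[i][j] = 1 if page i links to page j.
# Pages 0 and 1 both link to page 2, so page 2 should emerge
# as the dominant authority.
A = [[0, 0, 1],
     [0, 0, 1],
     [0, 0, 0]]

x = [1.0, 1.0, 1.0]   # authority scores
y = [1.0, 1.0, 1.0]   # hub scores
for _ in range(50):
    x = normalize(matvec(transpose(A), y))   # x^k from A^T y^{k-1}
    y = normalize(matvec(A, x))              # y^k from A x^k

assert x.index(max(x)) == 2   # page 2 is the top authority
```

Normalizing after each multiplication keeps the scores bounded; the iteration is exactly the power method on AᵀA (for authorities) and AAᵀ (for hubs), so it converges to the dominant eigenvectors, i.e. the top singular vectors of A.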