### Create a StudySoup account

#### Be part of our community, it's free to join!

Already have a StudySoup account? Login here

# SPEC TOPICS IN MATH MATH 498

UW

GPA 3.76

### View Full Document

## 36

## 0

## Popular in Course

## Popular in Mathematics (M)

This 32 page Class Notes was uploaded by Addison Beer on Wednesday September 9, 2015. The Class Notes belongs to MATH 498 at University of Washington taught by Anne Greenbaum in Fall. Since its upload, it has received 36 views. For similar materials see /class/192053/math-498-university-of-washington in Mathematics (M) at University of Washington.

## Reviews for SPEC TOPICS IN MATH

### What is Karma?

#### Karma is the currency of StudySoup.

#### You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 09/09/15

Ranking Web Pages Tim Chartier and Anne Greenbaum Department of Mathematics Winter 2008 Tim Chartier and Anne Greenbaum Ranking Web Pages 0 Have a question Looking for an old friend Need a reference for a paper A popular and often effective form of information acquisition is submitting queries to Googlecom o In fact in January 2003 just over 1 200 searches would have been conducted in the past second 0 Google is a play on the word googol the number 10100 reflecting the company s goal of organizing all information on the World Wide Web Google Tim Chartier and Anne Greenbaum Ranking Web Pages Googleing Math 0 Suppose that we submit the query mathematics to Google 0 Why is the page we see at the top ofthe list deemed the best page related to the query 0 The web page listed first by Google is deemed loosely speaking the best web page related to the query 0 How is this page given such distinction Tim Chartier and Anne Greenbaum Ranking Web Pages The PageRank algorithm 0 Developed by Google s founders Larry Page and Sergey Brin who were graduate students at Stanford University when the foundational ideas of Google developed 0 Google ranks webpages according to the percentage of time one would end up at each web on a random walk through the web g E i if Tim Chartier and Anne Greenbaum Ranking Web Pages Return 0 Mnte Carlo Let s return to Monte Carlo simulation to mathematically model such a random walk through a web network V Tim Chartier and Anne Greenbaum Ranking Web Pages Surf over mini web 0 Assume we start at web page 1 0 We will assume that at each stage the surfer will randomly follow one of the links on the page The surfer can choose any link with equal probability Tim Chartier and Anne Greenbaum Ranking Web Pages Adjacency matrix 0 We will represent the structure ofa network with a matrix 0 The adjacency matrix for the network below is COO 40 OO OOO O OOO AOOOOA OO OOO O OOOO Tim Chartier and Anne Greenbaum Ranking Web Pages Your Turn 0 From the course webpage found at http www math washington eduNgreenbau download googlesiml m 0 Find the PageRank ofthe system 0 Experiment with altering networks and viewing the results 0 How many iterates do you need to distinguish the rankings ofthe various webpages Tim Chartier and Anne Greenbaum Ranking Web Pages Whatout Google 7 ML 0 Google indexes billions of webpages o How is PageRankfound by Google U Tim Chartier and Anne Greenbaum Ranking Web Pages Gettin Stochastic 0 Form a stochastic matrix M from our adjacency matrix 0 That is element mj gives the probability of a surfer visit webpagej from webpage i which implies mij9ijZgij j 0 Therefore for 000100 000100 100000 100000 100000 100000 G 0110107M oggogo 001001 0000g 000100 000100 pi Tim Chartier and Anne Greenbaum Ranking Web Pages 00 Note that G is sparse Recall the size ofn The sparsity of G will be an asset in manipulating it on a computer In particular only the nonzero entries along with column and row information are stored for large sparse matrices Because the average outdegree of pages on the web is about seven Kleinberg et al 1999 this saves a factor on the order of halfa billion in storage space and since n is growing overtime while the average number of links on each page appearto remain about constant the savings will only increase overtime 1i Tim Chartier and Anne Greenbaum Ranking Web Pages 0 Now let s start our random walk at state 1 0 What is the probability that we land at web page i after one step 0 While trivial to compute we can also find this with our transition matrix 0 First we represent our initial state by the vector v 1 0 0 0 0 0 Simply compute VM 2 0 0 0 1 0 Tim Chartier and Anne Greenbaum Ranking Web Pages 0 Now let s take another step 0 Compute v2 v1M 0 033 033 0 033 0 0 Your Turn Find V3 Tim Chartier and Anne Greenbaum Ranking Web Pages 0 Now let s take another step 0 Compute v2 v1M 0 033 033 0 033 0 0 Your Turn FindV3 AnswerV32v2M067 0 017 0 0 017 U Tim Chartier and Anne Greenbaum Ranking Web Pages Lotsa seps 7 0 An important observation should be made about the matrixvector multiplication In particular V4 2 V3M V2MM Tim Chartier and Anne Greenbaum Ranking Web Pages Lotsa steps 0 An important observation should be made about the matrixvector multiplication In particular V4 2 V3M V2MM 0 Therefore we can easily find say V100 Compute mm 0263 0105 0158 0316 0105 0053 Tim Chartier and Anne Greenbaum Ranking Web Pages Marathn 0 steps 0 Your Turn Find V500 0 Your Turn Find V700 o What do you notice Tim Chartier and Anne Greenbaum Ranking Web Pages Marathn ofsteps 0 Your Turn Find V500 0 Your Turn Find V700 o What do you notice 0 To three decimal places VM7oo VM5002VM100 0263 0105 0158 0316 0105 0053 m Google s Eigenvectors o A nonnegative vectorthat satisfies VM 2 v is called a steadystate vector of the Markov process where v is normalized such that Zv 1 which results in a vector of probabilities o For us it is important to note that this is a lefteigenvector ofthe matrix M That is VM 2 v 0 Did you notice that we were just using the Power Method to find v 0 However notice that we didn t need at least in that example to normalize the vector at each step A s 7 Tim Chartier and Anne Greenbaum Ranking Web Pages PageRank is a vector 0 Google defines the PageRank of page i to be v 0 Therefore the largest element ofv corresponds to the page with the highest PageRank the second largest to the page with the second highest PageRank and so on o The limiting frequency that an infinitely dedicated random surfer visits any particular page is that page s PageRank Tim Chartier and Anne Greenbaum Ranking Web Pages MATLAB s search 0 The following will implement the Power Method to find the PageRank vector iterates 0 while maxabsvNew V gt 001 V VNew VNew VM iterates iterates 1 end 0 The full code can be found also on the course webpage as googlePower m Tim Chartier and Anne Greenbaum Ranking Web Pages Catching anther wave 0 Let s surf again 0 Adapt googlepowermforthe network below Tim Chartier and Anne Greenbaum Ranking Web Pages Dangling node 0 Note web page 6 is what is called a dangling node with no outlinks What web pages have this behavior 0 What problem did you see with our current model 0 Ideas to fix it Let s see if we can come up with the one of Erin and Page that lies deep within Google s algorithm Tim Chartier and Anne Greenbaum Ranking Web Pages Revised random surfing The rules to our Monte Carlo game are now 0 Again restrict ourselves only to indexed web pages 0 Assume that forp 085 or 85 ofthe time a surfer follows a link that is available on the current web page that the surfer is visiting The other 15 ofthe time the surfer randomly visits with equal probability any web page available in the network Wt Tim Chartier and Anne Greenbaum Ranking Web Pages It s elementry 0 Let r denote the row sum of row i 0 Therefore the transition matrix M has elements p amp pi r 0 my 2 r n r E7 0 Again p 085 0 This model creates a transition matrix that is often called the Google matrix Tim Chartier and Anne Greenbaum Ranking Web Pages Formin the Transition Matrix 0 Therefore for our problem the adjacency matrix is 000100 100000 100000 62 011010 001001 000000 0 The first row ofthe transition matrix M is 156 156 156 85156 156 156 4 E 7 Tim Chartier and Anne Greenbaum Ranking Web Pages Forming the Transition Matrix cont 0 Again 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 G 0 1 1 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 Therefore the Google matrix is 00250 00250 00250 08750 00250 00250 08750 00250 00250 00250 00250 00250 08750 00250 00250 00250 00250 00250 00250 03083 03083 00250 03083 00250 00250 00250 04500 00250 00250 04500 01667 01667 01667 01667 01667 0166477 Tim Chartier and Anne Greenbaum Ranking Web Pages o If you enter M into googlepowermthe algorithm will converge to 02680 01117 01594 02644 01117 00846 0 Therefore page 1 has the best ranking followed by page 4 Compare this to the network Tim Chartier and Anne Greenbaum Ranking Web Pages Existence and uniqueness o Ifthis vector is not unique which one would you choose Would you bid between companies for which one to choose 0 The following theorem Lax 1997 guarantees the uniqueness ofthe steadystate vector and that it will have positive entries Theorem Perron Every real square matrix P who entries are all positive has a unique eigenvector with all positive entries its corresponding eigenvalue has multiplicity one and it is the dominant eigenvalue in that every other eigenvalue has strictly smaller mag nitude A u l Si Pi Tim Chartier and Anne Greenbaum Ranking Web Pages Stochastic matrices 0 Recall that the rows ofM sum to 1 Therefore M1 1 where 1 is the column vector ofall ones That is 1 is a right eigenvector of M associated with the eigenvalue 1 most notably for our purposes having all positive entries 0 Perron s Theorem ensures that1 is the unique right eigenvectorwith all positive entries and hence its eigenvalue must be the dominant one o The right and left eigenvalues of a matrix are the same therefore 1 is the dominant left eigenvalue as well So there exists a unique steadystate vector v that satisfies VM 2 v Normalizing this eigenvector so that Zv 1 gives a steadystate vector A s 7 Tim Chartier and Anne Greenbaum Ranking Web Pages Your Turn 0 It is time to experiment and play Indeed we will become Google search engines simple ones ourselves 0 To search from a homepage you will type a statement like UG surfer httpwwwxxx 222 n o This starts at the given URL and tries to surfthe Web until it has visited n pages That is an n by n matrix is formed 0 Note surfing can cause problems and as such you may even have to terminate MATLAB Yet nonetheless we can create our own PageRank example Tim Chartier and Anne Greenbaum Ranking Web Pages Googletime 0 Download surfer m page rank m and page rankpow m from the course webpage Note this version of the Power Method only uses G and does not use the transition matrix M 0 These codes were written by Cleve Moler although pagerankm is an adapted version ofCIeve s codes 0 Let s begin with wwwwashingtonedu as our starting URL Let s only visit 20 pages Therefore type U G surfer http wwwwashington edu 20 0 U a cell array of n strings the URLs ofthe nodes 0 G an nbyn sparse matrix with Gij 1 if nodej is linked to node i 0 Type pagerank and the pagerankwill be computed 0 Note The program hangs sometimes and requires breaking from the program CTRLC or shutting down 1 MATLAB altogether LL Tim Chartier and Anne Greenbaum Ranking Web Pages

### BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.

### You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

## Why people love StudySoup

#### "There's no way I would have passed my Organic Chemistry class this semester without the notes and study guides I got from StudySoup."

#### "I made $350 in just two days after posting my first study guide."

#### "Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."

#### "Their 'Elite Notetakers' are making over $1,200/month in sales by creating high quality content that helps their classmates in a time of need."

### Refund Policy

#### STUDYSOUP CANCELLATION POLICY

All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email support@studysoup.com

#### STUDYSOUP REFUND POLICY

StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here: support@studysoup.com

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to support@studysoup.com

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.