Group Study MAE 298
Popular in Course
Popular in Engineering Mechanical & Aero
This 98 page Class Notes was uploaded by Consuelo Herman DDS on Tuesday September 8, 2015. The Class Notes belongs to MAE 298 at University of California - Davis taught by Staff in Fall. Since its upload, it has received 21 views. For similar materials see /class/187472/mae-298-university-of-california-davis in Engineering Mechanical & Aero at University of California - Davis.
Reviews for Group Study
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 09/08/15
MAE 298 Lecture 7 April 25 2006 Timescale 1 5314 to Timescale 2 1092 to Timescale 3 157 to 20 20 20 0 5 10 0 5 10 0 5 10 Partitioning Networks Spectral Methods Review class so far Random graphs basic graph properties diameter degree distribution phase transitions Power laws signatures of phase transitions size distribution of components at the critical point seen pervasively in degree distribution of realworld networks for 2 lt 7 lt 3 finite mean but infinite standard deviation o Preferential attachment PA Cumulative advantage o Robustness of power law random graphs o Optimization as an underlying mechanism for PA also gives rise to saturation of preferential attachment Last time Software tools a Especially important for connecting data with the structure of networks Partitioning networks Challenge Finding groups of vertices that have a high density of edges within them with a lower density of edges between groups Find partitions that divide the graph into smaller pieces while cutting the minimum number of edges possible Partitioning networks approaches From Graph Theory Spectral methods Identifying subclusters Mixing times for information Betweenness Incorporating data attributes Mixing of links amongst different data types Community structure Graph Theory Spectral Methods o Looks only at graph structure without taking into account node or edge attributes 0 Based on random walks on the graph Extremely useful for quantifying properties of information flow on networks and algorithms cover time mixing time etc Quantitative no subjectivity Why are other methods subjective eg Dendrograms wwawgzmw mm 23mzmguam7mmggmm Iterate from bottom up Stop when have desired number of clusters Spectral methods a Considers the eigenvalues and eigenvectors of the graph o Take adjacency graph and consider a random walk on that graph o For now impose constraint that adjacency matrix be the symmetric Need this to ensure real eigenvalues For now three important measures from random walks Cover time Mixing time Relaxation time Sample graph structure Random walk State Transition Matrix Columnnormalize the adjacency matrix 93 17 14 13 12 14 0 0 14 13 0 14 0 M 14 0 12 0 0 14 13 0 14 12 0 0 0 1412 M will have a basis set of eigenvectors ii and corresponding eigenvalues Ai Spectral values and graph isomorphisms o Consider two networks with the same exact topology but different labeling of edges ie two isomorphic networks a Two isomorphic graphs will have the same eigenspectra same eigenvalues and eigenvectors o The eigenvalues and eigenvectors can be used to determine is two networks are notisomorphic f eigenspectra different can guarantee graphs not isomorphic But if eigenspectra the same cannot guarantee graphs are isomorphic but it s a good clue especially for large graphs Eigenvalues a Let the vector 27 denote the probability of finding the random walk on any node ie vi is probability of finding walker on ith node a We can write 27 27 aj j sum over the basis vectors 0 Apply the state transition matrix to each eigenvector Maj Ajuj Applying it ttimes MMMoooMuj Maj Aj uj Eigenvalues what does it mean a Steadystate A1 1 The state transition matrix will have at leastone eigenvalue A1 1 M211 A1721 equivalently M75221 2 Alytjl 2 171 o In fact the number of eigenvalues with A 1 equals the number of components Each component relaxes to its unique independent steadystate solution Spectral gap 0 When only one component the largest eigenvalue A1 1 All other A s are smaller 0 The difference A1 A2 is the spectral gap o It tells us how effectively we can partition graphs into separate pieces The larger the spectral gap the worse the partitioning cuts too many edges 0 Identify bottle necks Mixing time related to A2 a lsoperimetric number the best cut also related to A2 Mixing timerelaxation time Mixing time Start a random walker at any arbitrary node How long before it forgets on average where it started from Consider a dumbbell graph Long memory of whether started on left or right hand side But very easy to partition in two wo cutting many edges A closely related measure is the relaxation time Time for an original amplitude to decay to 16 Mtg 2 272 Relaxation time Want to calculate the relaxation time of the eigenvectors M UZ39 EUZ39 Hi gt t1n 1n1e 1ne 1 t 11n Thus the largest relaxation time in the system 25W 2 11n2 1O 2O 0 5 17mm 109 to 7513 Example topologies 1O 2O 0 5 Tuned 604 to 461 1O 2O 0 5 Tmax 5314 to 5315 20 0510 Mixing times and partitioning into two components R 7513 R 5315 Tmin 109 to max 5314 to 1 71 39 v 1 A a k g a a gt gt O E m m 0 O I I I I O 5 10 20 X X X The larger the mixing time the better the out 10 20 0 5 Timescale r1 5314 to 500 5000 Higher order modes Timescale 12 1092 to Timescale r3 157 to O O l l gt gt O O LO LO 0 O X X Applications to sensor networks Standard state transition matrix column normalize o Fastest mixing 1 complete graph 2 star network 0 Smallest cover time star network Applying this to wireless networks next time 0 But in realworld often communication overhead only one conversation at a time o Weighting the random walks kink H 1kk71 1kk71 1kk71 Asymmetric directed graphs Can have complex real imaginary eigenvalues a Not fully known how to use spectral methods for such graphs an open problem Summary spectral methods eigenspectrum o If two distinct graphs have the same eigenspectrum they are likely isomorphic esp for large graphs o Eigenvalues degeneracy of A 1 tells us how many disconnected components in the graph MAE 298 June 6 2006 Review What are networks a Structural measures to characterize them Network models theory 0 Realworld networks guest lectures What are networks Nodes and edges 0 Geometric versus geometry free 0 dynamic topologies versus static 0 Social technological biological o Interplay of topology and function Structural characteristics a Degree distribution concentrated versus heavytailed Properties of power laws Clustering coefficient Betweenness centrality of geodesic paths through a given node Eigenvalue spectrum cover time of graph identify bottlenecks one algorithm for clustering o Structural equivalence if have same neighbors Network growth models 0 ErdosRenyi random graph Emergence of the giant component phase transition Preferential attachment 0 Optimization FKP CIPA etc Small worlds Network models applications a Robustness to node and edge deletion as a function of topology 0 Decentralized search and WWW search Sixdegrees of separation Millgram Page rank algorithm a Percolation and spreading as a function of topology SIR models pde s versus network view 0 Algorithms for selforganization of sensor networks 0 Simple models of the power grid 0 Simple views of interacting networks Realworld networks 0 Software networks Chris Bird o Protein interaction and gene expression networks Eivind Almaas a Internet and overlay networks Ram Keralapura o Biological networks and evolution Sergei Nuzhdin o Estuaries and policy networks Mark Lubell o Urban transportation Michael Zhang Interplay of topology and function Together with your own original research projects we have covered a lot of ground this quarter Hope you have found it provocative and interesting Further resources Will try to organize these on the web somewhere UCD networksresearch wiki Seminar series Other UCD classes Meetings and symposiums Summer schools Complex systems seminar series Science of Complex Systems Weds 4105pm 1147 Mathematical Sciences Building httpcseucdaviseducsseminar Other UCD classes 0 Your input needed here 0 STATS 250 Data visualization Graduate topics course Prof Duncan Temple Lang Winter 2007 STATS 250 Data visualization 0 Content will depend on student interest a Focus is on developing reusable and extensible scientific software and making it available to people in various disciplines a Data visualization where data and observations are the focus not shapes surfaces and physical models However the interplay of data with models is important a perception theory taxonomy of display types models for graphics in R Matlab etc graph algorithms and tools for data analysis GIS virtual reality Bayesian computation o intersystem interfaces models for merging different systems and languages metacomputing writing software that writes software Meeting and symposiums SOCIAL NETWORK THEORY WORKSHOP Saturday June 10 Wickson 2124 Sponsored by the Animal Behavior Graduate Group Network theory is emerging as a major tool in various fields for quantifying network structure and for analyzing how network structure might influence outcomes This approach has produced interesting new insights for community ecology food web networks mutualism networks epidemiology spread of disease genetics genetic networks and various issues in the social sciences Put another way applications of network theory are appearing regularly in recent issues of Science and Nature This workshop will examine how social network theory SNT can be used to generate new insights and new questions or issues for animal behavior Featured speakers from other universities include Darren Croft U Leeds Jessica Flack Santa Fe Institute and Dan Promislow U Georgia The workshop will also feature talks by students and faculty here at UCD and lots of discussion on future directions We expect this workshop to provide many of the ideas that will go into a proposed conceptual overview paper on SNT and animal behavior RSVP to Professor Andy Sih asihucdavisedu Summer schools a Summer School on Game Theory in Computer Science 263062006 Aarhus Denmark httpwwwbrics dkgame06 a Les Houches Summer School Complex Systems 32872006 Les Houches Chamonix France httpw3houchesujf grenoblefrsessionsete ete 85poster summer 85html o The Santa Fe Institute Complex Systems Summer School An annual event since about 1991 Applications due December for the next year s school This year two schools Santa Fe and Beijing httpwwwsantafeedueducationindeXCSSSphp What is a network Topology and Activity MAE 298 Lecture 1 March 28 2007 Raissa s Professional history ie How did I get here a 1999 PhD Physics Massachusetts Inst of Tech MIT Joint appointment Statistical Physics and Lab for Computer Science c 20002002 Postdoctoral Research Fellow Bell Laboratories Joint appointment Fundamental Mathematics and Theoretical Physics Research Groups c 20022005 Postdoctoral Research Fellow Microsoft Research Theory Group Interdisciplinary group in Physics and Theoretical Computer Science a 2005present Assistant Professor UC Davis Dept of Mechanical and Aeronautical Eng and Center for Computational Science and Eng o 2007present External Faculty Member Santa Fe Institute What is a Network Topology ie structure Activity ie function Phase transitions Example social networks Immunology viral marketing 1 M150 Mg 1quot Mic d 0 J d gfgw 3 q 0 M E J Newman The Internet Robustness to failure optimizing future growth testing protocols on sample topologies H Burch and B Che swick A typical web domain Web searchorganization and growth M EJ Newman The airline network Optimization dynamic external demands xx Routes Tram WWW a quotummm rmhmll mimmw mm mm Lawn Continental Airlines Yeast protein signaling network Control mechanisms in biology S Masloc and K Sneppen M E J Newman The power grid Mitigating failure Distributed sources M E J Newman Software call graph Uncovering design principlesrobustness to mutation Chris Myers Networks basic properties a Network made of nodes connected together by edges o Edges can be directed or undirected ie oneway or twoway connec ons Example oneway Web pages Example twoway Family tree relatives Example hybrid Road networks with some oneway and some two way links city of Boston prime example 0 Geometric versus geometry free eg Internet vs WWW o STRUCTURE topology and FUNCTION information flow dynamics on the network Transportation Networks UCDAVIS distribution collection networks Computer Vi networks Biological networks protein interaction Soc39al networks genetic regulation 39 Immun039ogy drug design Information Commerce 22 January 2007 CSE Advance Natn Acam SciencesNatn Research Council Study 2005 NETWORK SCIENCE all our modern critical infrastructure relies on networks too much emphasis on specific applicationsjargondisciplinary stovepipes need a crosscutting science of networks Research for the 21st century Why do networks exist Physical Biological Social Engineered o More efficient control esp through hierarchy o Robustness to noise and fluctuations 0 Can we learn function from structure a Can we apply these lessons to engineered systems gt Would a modern power grid look like the one we have Lessons from existing networks All networks all quantitatively different each optimizes something different What are the key parameters that distinguish them Which kinds work best for which application General ConsiderationsTradeoffs For what purpose are we building the network CONNECTIVITY Preserve at all costs Internet Or break at all costs Immunology 0 ROBUSTNESS to which failure modes Random failure biology Or targeted attacks technological 0 Fully decentralized or some centralized control How do we represent a network as a mathematical object Matrix representation of a network TOPOLOGY Connectivity matrix M 1 if edge exists between i and j if 0 0therw1se C H H H H C H C H H C C H C H H H C H H H H C C C i i i The degree of a node is how many links it has IOPOLOGY Common measures of fine structure 1 C H H H H C H C H H C C H C H H H C H H H H C C C i i i 0 Degree distribution fraction of nodes with degree k c Clustering coefficient 0 Diameter 0 Betweenness centrality o Assortativedissortative mixing Many different types of networks exhibit Power Law Degree Distributions ie scale free 1904 1901 i i 1e07 i 1e10 i i i 1 100 10000 Mk N 76 lnpk ylnkconst Consider a random walk Matrix representation of a network ACTIVITY Spread of disease routing of data gossip spreadmarketing matrix P 14 13 12 14 14 13 0 14 14 6 12 6 14 13 0 14 0 0 0 14 0 0 0 12 12 on the network The state transition The eigenvalues and eigenvectors convey much information Eigenvalues and eigenvectors of the state transition matrix The stationary distribution of a random walk on the graph cover time of a random walker mixing time occupancy probabilities o Partitioning graphs into subgraphscommunities Partitioning networks Community structure Political Books USA 2004 I I nmnu mmcmm I I magma PamwwuluPWEaa I I m m Dvuviluwu mumn umM I rmmmvm I new 55mm is Slim 3 Damn Nb cum I I quot 5 V quot 3 quot Wlmnimlue 511 Chunky I mmulnmm Incas m on mm am I I I mauswcmnpinn SWIMIMI wnthmIgIsh I mm meanmum I usmuulyinvu m I I mannqu m I h mvnanwcmmy I I any mpmmm menme I wm um an I Evaum mam I ndivu u mm Evil pmmm I Slim Me n Blak I In Enlwwmun I I I mznmnummnmwmn IIIuy sm I Lamvy mummean I mm I I ronwnnnurnnd Lnslm lnLdm nquot I I I mqu memmmquot m magmaI quot I I wmummqoummz I Wuumanw am I I I menosan fwmm um AnuquIW I vy I m Slug mum swim mu null quot139 I I ma ovum rum mewy mm mm In a nun mw I in AW can I All in Slmlh Mm I mammmnumzuu I quotWMm NunsWu name M E J Newman Dynamics on networks a Markov chains ie state transition matrix current approach a Tracks one field the random walker and has only one timescale the rate of the walker o Partial differential equations track multiple fields with multiple scales simultaneously a How do we develop techniques for PDEs on a network A Sample Application Web Search aw 1 Ii 39 V at A typical web domain M E J Newman Pieces of a WWW search engine 0 First thing needed is a web map Nodes Web page unique url semantic content inlinks out links Edges Connectivity between pages a Use a crawler also called spider to crawl the WWW o Over 109 web pages How long will this take Will it ever be complete a Selectively choose which outlinks to follow during the crawl avoid spamholes linkfarms etc What heuristics characterize quality pages How often should recrawl occur and in what order Page Rank Brings us back to random walks on a graph Markov chains 14 13 12 14 14 13 0 14 14 0 12 0 0 P 14 13 0 14 12 0 0 0 1412 0 0 The stationary probabilities are essentially the Page Rank The celebrated algorithm which was the foundation for Google Summary Terms introduced today 0 Graphnetwork also nodes and edges Connectivity matrix M a State transition matrix random walk on M Degree Degree distribution Graph diameter Networks a Network structures are pervasive physical biological social engineered a Networks are made of discrete nodes hard to envision continuum description o Nodes can live in geometric or geometry free space Internet vs WWW Need to consider Topology of network Activity on network a Can we learn lessons from existing networks MAE 298 Lecture 16 March 10 2008 1 ampVi I quot quotr Techniques for Model Selection Inferring network mechanisms The Drosophila melanogaster protein interaction network Middendorf Ziv and Wiggins PNAS 102 2005 a Study the Drosophila protein interaction network a Use machine learning techniques discriminative classification to compare with seven proposed models to determine which model best describes data Data Giot et al Science 302 1727 2003 R g I I M rlma rm I Hamw39llmmr rm um 3955 E a I l I a 391 a quota E i 39 a 5 u 1 4 i E 2 U I39 a H 41 1 I I39i I in 4 nu eggEggs i I l u I n I 1 2 E 1H 2 m Elli wtm a Immanmm o 3359 vertices and 2795 edges 7 candidate models a DMC duplicationcomplementationmutation Vasquez et al 0 DMR duplicationmutation with random mutations o RDG random growing graph Callaway et al 0 LPA Linear pref attachment BarabasiAlbert AGV Aging vertices o SMW Small world WattsStrogatz RDS random static ErdosRenyi The procedure a Generate 1000 random instances of a network with N3359 and E2795 for each of the seven models 7000 random instances in total Training data a Subgraph census classify each network by exhaustive search for all possible subgraphs up to a given size a Classify each of the 7 mechanisms by raw subgraph counts iLiL l i i i i J S l2 Ji 4 L774 39 v1 391 39 I n quotl quotI 391 J 1quot l r i T 17 V t l nit t 7 1r Al F U 11 l l r J l i my i l l 3v w l l l l I it y n u l i l l VHF N1quot I l I i l l Fawn thatquot L A 4 S40 S41 S42 543 844 S45 S46 S47 S48 549 550 851 i t 391 v3 lit l i n l I ti l J q pa Ex 1 fu El ll A L We t l to i l g l m tr JD y I 39 39 J I l H 39 39 2L iHL Example subgraphs Build classifier from the training data Learning Algorithm o Alternating Decision Tree ADT Freund and Schapire 1997 I SM lt I65 Which subgraphs best distinguish the models After classifier built use it to characterize individual network realizations Walk the Drosophila data through the ADT o A given network s subgraph counts determine paths in the ADT decision nodes are rectangles o The ADT outputs a realvalued prediction score which is the sum of all weights over all paths o The final weight for a model is related to probability that particular network realization was generated by that model a Model with the highest weight wins best describes that particular network realization o DMC wins for Giot Drosophila data Comments 0 A relative not absolute judgement ie which of these 7 models fits the data best oMany of these 7 models considered produce similar macroscopic features degree distribution clustering diameter etc o Delve into microscopic details and let the data distinguish between the 7 models 0 Model selection not validation Web search and decentralized search on smallworlds MAE 298 Lecture 8 Feb 4 2008 Search for information Assume some resource of interest is stored at the vertices of a network Web pages 0 Files in a filesharing network Would like to determine rapidly where in the network a particular item of interest can be found To warehouse data or search on demand a Centralized Catalogue data in one central place Makes most sense when high cost to search network in real time Requires resources for learning the data and storing it o Decentralized Data is spread out in a distributed data base Can be a very slow process to search But dependent on network topology may be able to devise quick algorithms Web search a Centralized warehousing of information need results as quickly as possible a Key Use information contained in the edges as well as the vertices Assumes edges contain information about relevance Process Query arrives select subset of pages which match order that subset by ranking based on link structure Typical web domain v i rt V 3 3 A M EJ Newman Ranking pages in a connected component 0 Each site starts with unit rank ie weight a Transfers fraction of this rank equally to each connected site 0 So the rank of vertex an X E AijTj j where A is the adjacency matrix a Using a random walk formulation the occupancy probabilities in steadystate ie the vector corresponding the A 1 77sz where M is state transition matrix Aside on Random Walks Sample graph structure Random walk State Transition Matrix Columnnormalize the adjacency matrix 93 17 14 13 12 14 0 0 14 13 0 14 0 M 14 0 12 0 0 14 13 0 14 12 0 0 0 1412 M will have a basis set of eigenvectors ii and corresponding eigenvalues Ai PerronFrobenius Theorem Applies to irreducible positive stochastic matrices o Irreducible means cannot be blockdiagonalized into disjoint pieces ie network is connected only one component 0 Positive means each entry MM gt 0 Stochastic means column normalized or row normalized PerronFrobenius Theorem Leading eigenvalue 0 One leading eigenvalue with A1 1 o The corresponding eigenvector 121 has strictly positive entries and the sum over all the entries 1 mi 1 o This is the stationary distribution of the random walk dynamics o For nonnegative matrices MM 2 0 similar results but can t guarantee eigenvectors are positive in practice normally still works see accompanying Rcode if interested PerronFrobenius Theorem Remaining eigenvalues Will return to this when discuss partitioning o All other eigenvalues are less than A1 ie all Ai lt 1 forz39 gt 1 o These represent decaying modes that will eventually converge to the steadystate solution described by 121 o How long do they take to decay Approximated by the relaxation time for that mode time for an original amplitude to decay to 1e Mtg gt 72 o So the longest relaxation time is Tmax 72 11n2 o The difference A1 A2 is the spectral gap Back to Ranking pages in a single component 0 Each site starts with unit rank ie weight a Transfers fraction of this rank equally to each connected site 0 So the rank of vertex an X E AijTj j where A is the adjacency matrix a Using a random walk formulation the occupancy probabilities in steadystate ie the vector corresponding the A 1 77sz where M is state transition matrix The Web as a whole Central core our 56 mm pages in million pages F k i N V 0amp0 r A V lM mE lm ids35quot 0amp3 mummmmmmmmwmm Ml mlrlua nsw v 7 mrlmymm This is an old picture circa 2000 currently Google 8 billion pages servedquot Yahoo claims 20 billion Is there really that much valid content out there We understand how to deal with each components How do we deal with getting a consistent rank across the whole web Disconnected components The Random Surfer model Brin and Page The anatomy of a largescale hypertextual Web search engine Computer Networks 30 1998 L Page 8 Brin R Motwani and T Winograd The Pagerank Citation Ranking Bringing Order to the web technical report Stanford University Stanford CA 1998 0 With probability 6 follow an outlink of current page a With probability 1 6 jump at random to some other web page Usually assume jump is random so land at any site with prob 1N The Random Surfer model The weight of a pagej is the sum over all the inlinks pointing to it including those gained by the random jump 7i 7361 1 isz Note this also makes the matrix positive recall PerronFrobenius Rules of thumb 6 sz 1 N 6 08 Realworld complications Page Rank o Newer pages have less rank even though they may be extremely relevant o Calculate eigenvalues of 109 by 109 matrix spam spam spam a Search engine optimizers reverse engineer search engines 0 Link farms both invisible and visible o Selling highly ranked domain names delisting web sites Realworld complications in general a Stop words typically removed and of the to be or So how to handle query to be or not to be a Dealing with complications of multiple languages Alternate approaches topology based Kleinberg and Lawrence The structure of the Web Science 2942001 Kleinberg Authoritative sources in a hyperlinked environment J ACM 46 1999 o Slightly more sophisticated Kleinberg proposes to use inlinks and outLinks a Google assumes a page is important if other important pages point to it o Kleinberg identifies two kinds of importance hubs and authorities Hubs and authorities 0 A page pointed to by highly ranked pages in an authority 0 A page that points to highly ranked pages is a hub May not contain the information but will tell you where to find it In use a Teoma search engine httpsearchaskcom o Citeseer literature search engine Alternate approaches Usageclick based 0 Negative edge weights Penalize spam linking to you a Genetic algorithms Multiple agents searching simultaneously The leastfit killedoff and the most fit duplicated Relies on assumption that pages on a particular topic clustered together Menczer and Belew Adaptive retrieval agents Internalizing local context and scaling up to the Web Machine Learning 39 2000 o Clustering results by subject eg httpcustycom Distributed search Some resource of interest is stored at the vertices of a network ie Files in a filesharing network Search on arbitrary networks 0N Depthfirst Breadthfirst Search on powerlaw random graphs o Breadthfirst passing always to highestdegree node possible Adamio R M Lukose A R Puniyani and B A Huberman Search in powerlaw networks Phys Rev E 64 2001 find between 0N23 and 0N12