# 632 Class Note for PHYS 597A with Professor Albert at PSU

These 24 pages of class notes were uploaded by an elite notetaker on Friday, February 6, 2015. They belong to a course at Pennsylvania State University taught in Fall.


## Exponential random graph models for network data

David R. Hunter, Department of Statistics, Penn State University (dhunter@stat.psu.edu)

## What is an ERG model?

Given a set of $n$ nodes, let $X$ denote a random graph on those nodes, and let $x$ denote a particular fixed graph on those nodes. Then

$$P(X = x) \propto \exp(\theta^t s(x)), \qquad \text{i.e.,} \qquad P(X = x) = \frac{\exp(\theta^t s(x))}{c(\theta)},$$

where

- $\theta$ is an unknown vector of parameters;
- $s(x)$ is a known vector of graph statistics on $x$.

## Whence the name "ERGM"?

Whenever the probability density function (or mass function) of a random variable may be written $f(x) \propto \exp(\theta^t s(x))$ for $\theta \in \Theta \subset \mathbb{R}^p$, the family of all such random variables for all $\theta \in \Theta$ is called an *exponential family*. Since the random graphs in our model form an exponential family, we call the model an exponential random graph model; see, e.g., Snijders (2002). This model has also been referred to as a $p^*$ model (Wasserman and Pattison, 1996), because it generalizes a simpler model called $p_1$ (Holland and Leinhardt, 1981).

## Markov random graphs (Frank and Strauss, 1986)

**The Erdős–Rényi model.** Each dyad (pair of nodes) is an edge with constant probability $p$, independent of all other dyads. An obvious generalization accords each dyad its own probability $p_{ij}$. However, independence of dyads is not a realistic assumption for most networks.

Frank and Strauss (1986) considered a specific form of dyadic dependence: Markov random graphs. A random graph is said to exhibit *Markov dependence* if dyads that do not share an individual are conditionally independent given the rest of the graph. For instance, the states of dyads $\{1,2\}$ and $\{3,4\}$ do not influence one another if all other dyads are held constant, but the presence or absence of an edge at $\{1,2\}$ might affect the probability of an edge at $\{1,3\}$.

Frank and Strauss (1986) showed that if isomorphic graphs have the same probability (the *homogeneity* condition), then Markov graphs are precisely those for which

$$P(X = x) \propto \exp\left( \sum_{i=1}^{n-1} \theta_i d_i(x) + \theta_\tau \tau(x) \right),$$

where

- $d_i(x)$ is the proportion of nodes having degree $i$;
- $\tau(x)$ is the number of triangles in the graph.
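Because the model is fully specified by $s(x)$ and $\theta$, the definition can be checked directly on a graph small enough to enumerate. The following sketch (my own illustration, not from the notes; the $\theta$ values are hypothetical) computes $c(\theta)$ and $P(X = x)$ by brute force for a two-statistic model, edge count and triangle count, on $n = 4$ nodes:

```python
from itertools import combinations, product
from math import exp

n = 4                                    # tiny graph so we can enumerate
dyads = list(combinations(range(n), 2))  # all unordered node pairs

def stats(edge_set):
    """s(x) = (number of edges, number of triangles)."""
    edges = len(edge_set)
    triangles = sum(
        1 for a, b, c in combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return edges, triangles

theta = (-1.0, 0.5)  # hypothetical parameters (edge term, triangle term)

def weight(edge_set):
    """Unnormalized probability exp(theta^t s(x))."""
    return exp(sum(t * si for t, si in zip(theta, stats(edge_set))))

# Normalizing constant c(theta): sum over all 2^6 = 64 graphs on 4 nodes.
c_theta = sum(
    weight({d for d, b in zip(dyads, bits) if b})
    for bits in product([0, 1], repeat=len(dyads))
)

# Probability of one particular graph: a single triangle on nodes 0, 1, 2.
x = {(0, 1), (0, 2), (1, 2)}
p_x = weight(x) / c_theta
print(f"c(theta) = {c_theta:.4f}, P(X = x) = {p_x:.6f}")
```

Even here $c(\theta)$ already sums $2^6 = 64$ terms, and the count doubles with every additional dyad.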
Note that this is an ERGM with $s(x) = (d_1(x), \ldots, d_{n-1}(x), \tau(x))$.

## Fitting the ERGM

"Fitting the model" refers to estimating the parameters $\theta$ based on an observed graph $x_{\mathrm{obs}}$. The method of estimation we'll consider is called maximum likelihood estimation; suffice it to say, it's a tried-and-true method. Recall that

$$P_\theta(X = x_{\mathrm{obs}}) = \frac{\exp(\theta^t s(x_{\mathrm{obs}}))}{c(\theta)}.$$

Viewed as a function of $x_{\mathrm{obs}}$, the above is the probability mass function; viewed as a function of $\theta$, it's the likelihood function $L(\theta)$.

## Maximum likelihood estimation

Goal: maximize

$$L(\theta) = \frac{\exp(\theta^t s(x_{\mathrm{obs}}))}{c(\theta)},$$

or, equivalently, maximize the log-likelihood

$$\ell(\theta) = \log L(\theta) = \theta^t s(x_{\mathrm{obs}}) - \log c(\theta).$$

Let $\hat\theta = \arg\max_{\theta \in \Theta} \ell(\theta)$ denote the maximizer, also known as the maximum likelihood estimate, or MLE.

There's just one small problem. With $P_\theta(X = x) = \exp(\theta^t s(x)) / c(\theta)$, we see that

$$c(\theta) = \sum_{\text{all graphs } y} \exp(\theta^t s(y)),$$

where the sum is taken over all possible distinct graphs on the $n$ given nodes. Thus even evaluation, let alone maximization, of $\ell(\theta)$ is computationally burdensome. How burdensome? Even for an undirected 34-node graph, computing $c(\theta)$ directly requires summation of $2^{\binom{34}{2}} \approx 7.548 \times 10^{168}$ terms.

## Pseudolikelihood

A possible alternative: following the ideas of Strauss and Ikeda (1990) and Wasserman and Pattison (1996), let $X_{ij}^c$ denote the status of all dyads in $X$ except $X_{ij}$, so that $X_{ij}^c$ is, in a sense, "everything else." If we hold $X_{ij}^c$ fixed at $x_{ij}^c$, that is, conditional on the event $X_{ij}^c = x_{ij}^c$, the random graph $X$ has only two possible states, depending on whether $X_{ij}$ is 0 or 1. Furthermore,

$$\frac{P(X_{ij} = 1 \mid X_{ij}^c = x_{ij}^c)}{P(X_{ij} = 0 \mid X_{ij}^c = x_{ij}^c)} = \frac{\exp(\theta^t s(x_{ij}^+)) / c(\theta)}{\exp(\theta^t s(x_{ij}^-)) / c(\theta)} = \exp\left(\theta^t [s(x_{ij}^+) - s(x_{ij}^-)]\right),$$

where $x_{ij}^+$ and $x_{ij}^-$ denote the graph $x$ with the dyad $(i,j)$ set to 1 and 0, respectively.

## Conditional odds of $X_{ij} = 1$

From the above, we get

$$\operatorname{logit} P(X_{ij} = 1 \mid X_{ij}^c = x_{ij}^c) = \theta^t \, \Delta s(x)_{ij},$$

where $\Delta s(x)_{ij} = s(x_{ij}^+) - s(x_{ij}^-)$ is the vector of *change statistics*. (Note: logit is the log-odds function, $\operatorname{logit}(p) = \log \frac{p}{1-p}$.)

## Maximum pseudolikelihood

What if the conditional probability $P(X_{ij} = 1 \mid X_{ij}^c = x_{ij}^c)$ were equal to the marginal probability $P(X_{ij} = 1)$? Then we would have $\operatorname{logit} P(X_{ij} = 1) = \theta^t \Delta s(x)_{ij}$ with the $X_{ij}$ independent, so we could estimate $\theta$ using logistic regression, which is very straightforward. The result is the maximum pseudolikelihood estimate, or MPLE (Strauss and Ikeda, 1990).
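For familiar statistics the change statistics have simple closed forms: toggling dyad $(i,j)$ always changes the edge count by 1 and creates one triangle per common neighbour of $i$ and $j$. A minimal sketch (representation and function names are my own, not from the notes):

```python
def change_stats(adj, i, j):
    """Delta s(x)_ij = s(x_ij^+) - s(x_ij^-) for the statistics
    (edge count, triangle count).

    adj maps each node to its set of neighbours; whether the edge
    (i, j) is currently present does not affect the result."""
    d_edges = 1                                    # adding ij adds exactly one edge
    d_triangles = len((adj[i] & adj[j]) - {i, j})  # one triangle per common neighbour
    return (d_edges, d_triangles)

# Small worked example: the path 0-1-2 plus an isolated node 3.
adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
print(change_stats(adj, 0, 2))  # toggling (0,2) closes one triangle -> (1, 1)
print(change_stats(adj, 0, 3))  # toggling (0,3) closes none -> (1, 0)
```

The MPLE then amounts to running off-the-shelf logistic regression with response $x_{ij}$ and covariate vector $\Delta s(x)_{ij}$, one row per dyad.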
Unfortunately, little is known about the quality of MPL estimates.

## Back to MLE

Remember, $c(\theta)$ is HARD to compute. However, fix $\theta_0$ and consider the quantity

$$E_{\theta_0}\left[ e^{(\theta - \theta_0)^t s(X)} \right] = \sum_{\text{all graphs } y} e^{(\theta - \theta_0)^t s(y)} \, P_{\theta_0}(X = y) = \sum_{\text{all graphs } y} e^{(\theta - \theta_0)^t s(y)} \, \frac{e^{\theta_0^t s(y)}}{c(\theta_0)} = \frac{1}{c(\theta_0)} \sum_{\text{all graphs } y} e^{\theta^t s(y)} = \frac{c(\theta)}{c(\theta_0)}.$$

Thus $c(\theta)/c(\theta_0)$ is an expectation. (The symbol $E_{\theta_0}$ denotes the expectation operator assuming $X$ is random from the ERGM with parameter $\theta_0$.)

## Law of Large Numbers to the rescue

Thus we can estimate the population mean (expectation)

$$\frac{c(\theta)}{c(\theta_0)} = E_{\theta_0}\left[ e^{(\theta - \theta_0)^t s(X)} \right]$$

by the sample mean

$$\frac{1}{M} \sum_{i=1}^{M} e^{(\theta - \theta_0)^t s(X_i)},$$

where $X_1, X_2, \ldots, X_M$ is a random sample of graphs from the distribution defined by the ERGM with parameter $\theta_0$.

## MCMC

To estimate $c(\theta)/c(\theta_0)$, we need to obtain a sample from the distribution defined by the ERGM with parameter $\theta_0$. In principle, this may be accomplished by running a discrete-time Markov chain whose stationary distribution is the distribution we wish to sample from. This is the MCMC (Markov chain Monte Carlo) idea (Geyer and Thompson, 1992; Snijders, 2002). There are two common ways to run such a Markov chain: Gibbs sampling and a Metropolis algorithm. Though each can be made more complicated, in its basic form each method works by deciding, at each time step, whether to toggle a randomly selected dyad.

## Gibbs sampling

First, select a dyad at random, say $(i,j)$. Gibbs sampling sets $X_{ij}$ at the next time step according to the conditional probabilities given $X_{ij}^c$. We saw earlier that $\operatorname{logit} P(X_{ij} = 1 \mid X_{ij}^c = x_{ij}^c)$ is given by $\theta^t \Delta s(x)_{ij}$, where $\Delta s(x)_{ij} = s(x_{ij}^+) - s(x_{ij}^-)$ is the vector of change statistics. Thus the Gibbs sampler sets $X_{ij} = 1$ with probability

$$\frac{\exp(\theta^t \Delta s(x)_{ij})}{1 + \exp(\theta^t \Delta s(x)_{ij})},$$

and it sets $X_{ij} = 0$ otherwise. Note: to run the MCMC, the values of $s(x_{ij}^+)$ and $s(x_{ij}^-)$ are not needed; only the difference $\Delta s(x)_{ij}$ matters.

## Metropolis–Hastings

Gibbs sampling is one form of a Metropolis–Hastings algorithm. Another is to calculate the ratio

$$\pi = \frac{P(X_{ij} \text{ changes} \mid X_{ij}^c = x_{ij}^c)}{P(X_{ij} \text{ does not change} \mid X_{ij}^c = x_{ij}^c)} = \exp\left(\pm \theta^t \Delta s(x)_{ij}\right)$$

(plus sign if the edge is currently absent, minus sign if it is present), and then accept the change of $X_{ij}$ with probability $\min(1, \pi)$. Strictly speaking, this is a Metropolis algorithm.
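The toggle-a-dyad Metropolis scheme can be sketched as follows (a minimal illustration under my own representation; the model supplies a change-statistics function like the hypothetical `change_stats` used here):

```python
import random
from math import exp

def metropolis_sample(n, theta, change_stats, steps, seed=0):
    """Dyad-toggling Metropolis chain for an ERGM.

    n: number of nodes; theta: parameter vector;
    change_stats(adj, i, j): returns Delta s(x)_ij as a tuple.
    Returns the adjacency sets after `steps` toggling proposals."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}   # start from the empty graph
    dyads = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for _ in range(steps):
        i, j = rng.choice(dyads)
        delta = sum(t * d for t, d in zip(theta, change_stats(adj, i, j)))
        # Ratio pi = P(toggle)/P(no toggle): exp(+theta^t Delta s) when
        # the edge is absent, exp(-theta^t Delta s) when it is present.
        log_pi = delta if j not in adj[i] else -delta
        if log_pi >= 0 or rng.random() < exp(log_pi):
            if j in adj[i]:                       # accept: toggle the dyad
                adj[i].discard(j); adj[j].discard(i)
            else:
                adj[i].add(j); adj[j].add(i)
    return adj

# Edge-count-only model: s(x) = (number of edges,), so Delta s = (1,).
edge_stat = lambda adj, i, j: (1,)
g = metropolis_sample(20, (-1.0,), edge_stat, steps=5000)
```

With a single edge statistic, $\theta^t \Delta s(x)_{ij} = \theta_1$ for every dyad, so this chain targets the Erdős–Rényi model with $p = e^{\theta_1}/(1 + e^{\theta_1})$.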
This scheme generally has better properties than Gibbs sampling. Note: again, the values of $s(x_{ij}^+)$ and $s(x_{ij}^-)$ are not needed; only the difference $\Delta s(x)_{ij}$ matters.

## Change statistics vs. absolute statistics

As we've seen, an ERGM fitting algorithm only needs to keep track of how $s(X)$ changes at each step of the Markov chain; the actual values of $s(X)$ are not important. If we start the chain at $x_{\mathrm{obs}}$, keeping a running total of all changes gives $s(X) - s(x_{\mathrm{obs}})$. But we wanted to estimate $c(\theta)/c(\theta_0)$ by

$$\frac{1}{M} \sum_{i=1}^{M} e^{(\theta - \theta_0)^t s(X_i)},$$

which appears to require $s(X_i)$. Fortunately, $s(X_i) - s(x_{\mathrm{obs}})$ is good enough.

## Why is $s(X_i) - s(x_{\mathrm{obs}})$ good enough?

Let's revisit the likelihood function:

$$P_\theta(X = x_{\mathrm{obs}}) = \frac{\exp(\theta^t s(x_{\mathrm{obs}}))}{\sum_{\text{all graphs } y} \exp(\theta^t s(y))} = \frac{1}{\sum_{\text{all graphs } y} \exp\left(\theta^t [s(y) - s(x_{\mathrm{obs}})]\right)}.$$

So the log-likelihood may be rewritten

$$\ell(\theta) = -\log \sum_{\text{all graphs } y} \exp\left(\theta^t [s(y) - s(x_{\mathrm{obs}})]\right).$$

## MCMC MLE

With $\ell(\theta)$ in the form above, we may estimate $\ell(\theta) - \ell(\theta_0)$ by

$$-\log \frac{1}{M} \sum_{i=1}^{M} \exp\left\{(\theta - \theta_0)^t [s(X_i) - s(x_{\mathrm{obs}})]\right\}. \tag{1}$$

Thus an ERGM fitting algorithm can compute and return a large number of realizations of $s(X_i) - s(x_{\mathrm{obs}})$ from the distribution determined by $\theta_0$. This sample is then used to maximize (1) as a function of $\theta$.

## How should $\theta_0$ be chosen?

Theoretically, the estimated value of $\ell(\theta) - \ell(\theta_0)$ converges to the true value as the size of the MCMC sample increases, regardless of the value of $\theta_0$. However, this convergence can be agonizingly slow if $\theta_0$ is not chosen close to the maximizer of the likelihood. A choice that sometimes works is the MPLE.

## A numerical example

Newton–Raphson iterations: 32. MCMC sample of size 10,000. Monte Carlo MLE results:

| Statistic    | theta0 | estimate | s.e.   | p-value |
|--------------|--------|----------|--------|---------|
| match.grade  | 1.0706 | 1.4118   | 0.4988 | 0.0054  |
| dmatch.sex.0 | 1.0383 | 1.4660   | 0.7482 | 0.0522  |
| dmatch.sex.1 | 0.9387 | 0.7195   | 0.6767 | 0.2897  |
| triangle     | 1.1864 | 1.0389   | 0.5750 | 0.0732  |
| kstar1       | 9.3754 | 8.2408   | 5.3026 | 0.1226  |
| kstar2       | 8.1424 | 7.5155   | 4.4219 | 0.0916  |
| kstar3       | 5.2464 | 5.0092   | 3.3080 | 0.1324  |
| kstar4       | 2.4226 | 2.4512   | 1.8068 | 0.1773  |

Log-likelihood of $\hat\theta$: $-785.2976$.
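The Monte Carlo estimate of $\ell(\theta) - \ell(\theta_0)$ reduces to a small function of the sampled statistic vectors. A sketch (the sample values below are made up for illustration; in practice $s(X_i)$ would come from the MCMC run at $\theta_0$):

```python
from math import exp, log

def loglik_ratio(theta, theta0, sampled_stats, s_obs):
    """Approximate l(theta) - l(theta0) by
    -log( (1/M) * sum_i exp((theta - theta0)^t [s(X_i) - s(x_obs)]) ),
    where sampled_stats holds s(X_i) drawn from the ERGM at theta0."""
    M = len(sampled_stats)
    mean = sum(
        exp(sum((t - t0) * (si - so)
                for t, t0, si, so in zip(theta, theta0, s, s_obs)))
        for s in sampled_stats
    ) / M
    return -log(mean)

# Made-up sample of s(X_i) for a two-statistic model (edges, triangles).
theta0 = (-1.0, 0.5)
s_obs = (10, 2)
sampled = [(9, 1), (11, 2), (10, 2), (12, 3)]
print(loglik_ratio((-0.9, 0.6), theta0, sampled, s_obs))
```

At $\theta = \theta_0$ every exponent vanishes and the estimate is exactly 0, a handy sanity check; maximizing this function over $\theta$, e.g. by Newton–Raphson, yields the MCMC MLE.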
## Some References

- Frank, O., and D. Strauss (1986). Markov graphs. *JASA*.
- Geyer, C. J., and E. Thompson (1992). Constrained Monte Carlo maximum likelihood for dependent data. *J. Roy. Stat. Soc. B*.
- Holland, P. W., and S. Leinhardt (1981). An exponential family of probability distributions for directed graphs. *JASA*.
- Snijders, T. A. B. (2002). Markov chain Monte Carlo estimation of exponential random graph models. *J. Soc. Struct.*
- Strauss, D., and M. Ikeda (1990). Pseudolikelihood estimation for social networks. *JASA*.
- Wasserman, S., and P. Pattison (1996). Logit models and logistic regression for social networks: I. An introduction to Markov graphs and $p^*$. *Psychometrika*.
