
# 593 Class Note for STAT 51200 with Professor Jennings at Purdue


This 16 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at Purdue University taught by a professor in Fall. Since its upload, it has received 12 views.


Date Created: 02/06/15

## Statistics 512: Applied Linear Models, Topic 9

### Topic Overview

This topic will cover

- Random vs. fixed effects
- Using EMS to obtain appropriate tests in a random or mixed effects model

## Chapter 25: One-way Random Effects Design

### Fixed Effects vs. Random Effects

- Up to this point we have been considering *fixed effects* models, in which the levels of each factor were fixed in advance of the experiment and we were interested in differences in response among those specific levels.
- Now we will consider *random effects* models, in which the factor levels are meant to be representative of a general population of possible levels. We are interested in whether that factor has a significant effect in explaining the response, but only in a general way. For example, we're not interested in a detailed comparison of level 2 vs. level 3, say.
- When we have both fixed and random effects, we call it a *mixed effects* model. The main SAS procedure we will use is called `proc mixed`, which allows for fixed and random effects, but we can also use `glm` with a `random` statement. We'll start first with a single random effect.
- In some situations it is clear from the experiment whether an effect is fixed or random. However, there are also situations in which calling an effect fixed or random depends on your point of view, and on your interpretation and understanding. So sometimes it is a personal choice. This should become clearer with some examples.

### Data for one-way design

- $Y$, the response variable
- Factor with levels $i = 1$ to $r$
- $Y_{ij}$ is the $j$th observation in cell $i$, $j = 1$ to $n_i$
- A balanced design has $n_i = n$

### KNNL Example

- KNNL page 1036 (`knnl1036.sas`)
- $Y$ is the rating of a job applicant.
- Factor A represents five different personnel interviewers (officers), $r = 5$ levels.
- $n = 4$ *different* applicants were randomly chosen and interviewed by each interviewer (i.e., 20 applicants). Applicant is not a factor, since no applicant was interviewed more than once.

The interviewers were selected at random from the pool of interviewers, and the applicants were randomly assigned to interviewers.

Here we are not so interested in the differences between the five interviewers that happened to be picked (i.e., does Joe give higher ratings than Fred, is there a difference between Ethel and Bob). Rather, we are interested in quantifying and accounting for the effect of "interviewer" in general. There are other interviewers in the population at the company, and we want to make inference about them too.

Another way to say this is that with fixed effects we were primarily interested in the *means* of the factor levels (and the differences between them). With random effects, we are primarily interested in their *variances*.

### Read and check the data

    data interview;
      infile 'h:\System\Desktop\CH24TA01.DAT';
      input rating officer;
    proc print data=interview;
    run;

    Obs    rating    officer
      1        76          1
      2        65          1
      3        85          1
      4        74          1
      5        59          2
      6        75          2
      7        81          2
      8        67          2
      9        49          3
     10        63          3
     11        61          3
     12        46          3
     13        74          4
     14        71          4
     15        85          4
     16        89          4
     17        66          5
     18        84          5
     19        80          5
     20        79          5

### Plot the data

    title1 'Plot of the data';
    symbol1 v=circle i=none c=black;
    proc gplot data=interview;
      plot rating*officer;
    run;

*[Figure: Plot of the data, rating vs. officer]*

### Find and plot the means

    proc means data=interview;
      output out=a2 mean=avrate;
      var rating;
      by officer;
    title1 'Plot of the means';
    symbol1 v=circle i=join c=black;
    proc gplot data=a2;
      plot avrate*officer;
    run;

*[Figure: Plot of the means, avrate vs. officer]*

### Random effects model (cell means)

This model is also called

- ANOVA Model II
- A variance components model

$$Y_{ij} = \mu_i + \epsilon_{ij}$$

- The $\mu_i$ are iid $N(\mu, \sigma_A^2)$. NOTE: THIS IS DIFFERENT!
- The $\epsilon_{ij}$ are iid $N(0, \sigma^2)$.
- $\mu_i$ and $\epsilon_{ij}$ are independent.
- $Y_{ij} \sim N(\mu, \sigma_A^2 + \sigma^2)$

Now the $\mu_i$ are random variables with a common mean. The question of "are they all the same" can now be addressed by considering whether the variance of their distribution, $\sigma_A^2$, is zero. Of course, the estimated means will likely be different from each other; the question is whether the difference can be explained by error ($\sigma^2$) alone.

The text uses the symbol $\sigma_\mu^2$ instead of $\sigma_A^2$; they are the same thing. I prefer the latter notation because it generalizes more easily to more than one factor, and also to the factor effects model.

### Two Sources of Variation

Observations with the same $i$ (e.g., the same interviewer) are dependent, and their covariance is $\sigma_A^2$.

The components of variance are $\sigma_A^2$ and $\sigma^2$. We want to get an idea of the relative magnitudes of these variance components.

### Random factor effects model

Same basic idea as before: $\mu_i = \mu + \alpha_i$. The model is

$$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$

$$\alpha_i \sim N(0, \sigma_A^2), \qquad \epsilon_{ij} \sim N(0, \sigma^2), \qquad Y_{ij} \sim N(\mu, \sigma_A^2 + \sigma^2)$$

The book uses $\sigma_\alpha^2$ instead of $\sigma_A^2$ here. Despite the different notations, $\sigma_\mu^2$ and $\sigma_\alpha^2$ are really the same thing, because $\mu_i$ and $\alpha_i$ differ only by an additive constant ($\mu$), so they have the same variance. That is why in these notes I'm using the same symbol $\sigma_A^2$ to refer to both. (With two factors we will have to distinguish between these.)

### Parameters

There are two important parameters in these models: $\sigma_A^2$ and $\sigma^2$ (also $\mu$). The cell means $\mu_i$ are random variables, not parameters.

We are sometimes interested in estimating

$$\frac{\sigma_A^2}{\sigma_A^2 + \sigma^2} = \frac{\sigma_A^2}{\sigma_Y^2}.$$

In some applications it is called the *intraclass correlation coefficient*. It is the correlation between two observations with the same $i$.

### ANOVA Table

- The terms and layout of the ANOVA table are the same as what we used for the fixed effects model.
- The expected mean squares (EMS) are different because of the additional random effects, so we will estimate parameters in a new way.
- Hypotheses being tested are also different.

### EMS and parameter estimates

$$E(MSE) = \sigma^2$$

as usual. We use $MSE$ to estimate $\sigma^2$.

$$E(MSA) = \sigma^2 + n\sigma_A^2$$

Note that this is different from before. From this you can see that we should use $(MSA - MSE)/n$ to estimate $\sigma_A^2$.

### Hypotheses

$$H_0: \sigma_A^2 = 0 \qquad H_1: \sigma_A^2 \neq 0$$

The test statistic is $F = MSA/MSE$ with $r - 1$ and $r(n - 1)$ degrees of freedom. Since $E(MSA) = E(MSE)$ when the null hypothesis is true, this ratio should be near 1 under $H_0$; reject when $F$ is large, and report the p-value.

Note that in the one-factor analysis, the test is the same as it was before. This WILL NOT be the case as we add more factors.

### SAS Coding and Output

Run `proc glm` with a `random` statement:

    proc glm data=interview;
      class officer;
      model rating=officer;
      random officer;
    run;

                                Sum of
    Source             DF      Squares     Mean Square   F Value   Pr > F
    Model               4   1579.700000     394.925000      5.39   0.0068
    Error              15   1099.250000      73.283333
    Corrected Total    19   2678.950000

`random` statement output:

    Source     Type III Expected Mean Square
    officer    Var(Error) + 4 Var(officer)

This is SAS's way of saying $E(MSA) = \sigma^2 + 4\sigma_A^2$ (note $n = 4$ replicates).

### proc varcomp

This procedure gets the variance components.

    proc varcomp data=interview;
      class officer;
      model rating=officer;
    run;

    MIVQUE(0) Estimates
    Variance Component      rating
    Var(officer)          80.41042
    Var(Error)            73.28333

Other methods are available for estimation; MIVQUE(0) is the default. SAS is now saying

$$\text{Var(Error)} = \hat{\sigma}^2 = 73.28333 \quad \text{(notice this is just } MSE\text{)}$$

$$\text{Var(officer)} = \hat{\sigma}_A^2 = \frac{MSA - MSE}{n} = \frac{394.925 - 73.2833}{4} = 80.4104$$

As an alternative to using `proc glm` with a `random` statement and `proc varcomp`, you could instead use `proc mixed`, which has some options specifically for mixed models.

### proc mixed

    proc mixed data=interview cl;
      class officer;
      model rating = ;
      random officer / vcorr;
    run;

- The `cl` option after `data=interview` asks for the confidence limits.
- The `class` statement lists all the categorical variables, just as in `glm`.
- The `model rating = ;` line looks strange. In `proc mixed`, the `model` statement lists only the *fixed effects*. Then the random effects are listed separately in the `random` statement. In our example, there were no fixed effects, so we had no predictors on the `model` line. We had one random effect, so it went on the `random` line.
- This is different from `glm`, where all the factors (fixed and random) are listed on the `model` line, and then the random ones are repeated in the `random` statement.
- Just in case you're not confused enough, `proc varcomp` assumes all factors are random effects unless they are specified as fixed.

`proc mixed` gives a huge amount of output. Here are some pieces of it.

    Covariance Parameter Estimates
    Cov Parm    Estimate    Alpha      Lower      Upper
    officer      80.4104     0.05    24.4572    1498.97
    Residual     73.2833     0.05    39.9896     175.54

The estimated intraclass correlation coefficient is

$$\frac{\hat{\sigma}_A^2}{\hat{\sigma}_A^2 + \hat{\sigma}^2} = \frac{80.4104}{80.4104 + 73.2833} = 0.5232.$$

About half the variance in rating is explained by interviewer.

Output from the `vcorr` option (this gives the intraclass correlation coefficient):

    Row      Col1      Col2      Col3      Col4
      1    1.0000    0.5232    0.5232    0.5232
      2    0.5232    1.0000    0.5232    0.5232
      3    0.5232    0.5232    1.0000    0.5232
      4    0.5232    0.5232    0.5232    1.0000

### Confidence Intervals

- For $\mu$, the estimate is $\bar{Y}_{..}$, and the variance of this estimate under the random effects model becomes $\sigma^2\{\bar{Y}_{..}\} = \dfrac{n\sigma_A^2 + \sigma^2}{rn}$, which may be estimated by $s^2\{\bar{Y}_{..}\} = \dfrac{MSA}{rn}$. See page 1038 for the derivation if you like. To get a CI we use a $t$ critical value with $r - 1$ degrees of freedom.
- Notice that the variance here involves a combination of the two errors, and we end up using $MSA$ instead of $MSE$ in the estimate (we used $MSE$ in the fixed effects case).
- We may also get point estimates and CIs for $\sigma^2$, $\sigma_A^2$, and the intraclass correlation $\sigma_A^2/(\sigma_A^2 + \sigma^2)$. See pages 1040-1047 for details. All of these are available in `proc mixed`.

### Applications

- In the KNNL example, we would like $\sigma_A^2/(\sigma_A^2 + \sigma^2)$ to be small, indicating that the variance due to interviewer is small relative to the variance due to applicants.
- In many other examples, we would like this quantity to be large. One example would be measurement error: if we measure $r$ items $n$ times each, $\sigma^2$ would represent the error inherent to the instrument of measurement.

## Two-way Random Effects Model

### Data for two-way design

- $Y$, the response variable
- Factor A with levels $i = 1$ to $a$
- Factor B with levels $j = 1$ to $b$
- $Y_{ijk}$ is the $k$th observation in cell $(i, j)$, $k = 1$ to $n_{ij}$
- For balanced designs, $n_{ij} = n$

### KNNL Example

- KNNL Problem 25.15, page 1080 (`knnl1080.sas`)
- $Y$ is fuel efficiency in miles per gallon.
- Factor A represents four different drivers, $a = 4$ levels.
- Factor B represents five different cars of the same model, $b = 5$.
- Each driver drove each car twice over the same 40-mile test course ($n = 2$).

### Read and check the data

    data efficiency;
      infile 'h:\System\Desktop\CH24PR15.DAT';
      input mpg driver car;
    proc print data=efficiency;
    run;

    Obs     mpg    driver    car
      1    25.3         1      1
      2    25.2         1      1
      3    28.9         1      2
      4    30.0         1      2
      5    24.8         1      3
      6    25.1         1      3
      7    28.4         1      4
      8    27.9         1      4
      9    27.1         1      5
     10    26.6         1      5

### Prepare the data for a plot, and plot the data

    data efficiency;
      set efficiency;
      dc = driver*10 + car;
    title1 'Plot of the data';
    symbol1 v=circle i=none c=black;
    proc gplot data=efficiency;
      plot mpg*dc;
    run;

*[Figure: Plot of the data, mpg vs. dc]*

### Find and plot the means

    proc means data=efficiency;
      output out=effout mean=avmpg;
      var mpg;
      by driver car;
    title1 'Plot of the means';
    symbol1 v='A' i=join c=black;
    symbol2 v='B' i=join c=black;
    symbol3 v='C' i=join c=black;
    symbol4 v='D' i=join c=black;
    symbol5 v='E' i=join c=black;
    proc gplot data=effout;
      plot avmpg*driver=car;
    run;

*[Figure: Plot of the means, avmpg vs. driver, one line per car]*

### Random Effects Model

Random cell means model:

$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk}$$

- $\mu_{ij} \sim N(\mu, \sigma_\mu^2)$. NOTE: THIS IS DIFFERENT!
- $\epsilon_{ijk} \sim$ iid $N(0, \sigma^2)$ as usual
- $\mu_{ij}$ and $\epsilon_{ijk}$ are independent
- The above imply that $Y_{ijk} \sim N(\mu, \sigma_\mu^2 + \sigma^2)$

Dependence among the $Y_{ijk}$ can be most easily described by specifying the covariance matrix of the vector $(Y_{ijk})$.

### Random factor effects model

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk},$$

where

$$\alpha_i \sim N(0, \sigma_A^2), \qquad \beta_j \sim N(0, \sigma_B^2), \qquad (\alpha\beta)_{ij} \sim N(0, \sigma_{AB}^2),$$

$$\sigma_Y^2 = \sigma_A^2 + \sigma_B^2 + \sigma_{AB}^2 + \sigma^2.$$

Now the component $\sigma_\mu^2$ from the cell means model can be divided up into three components: A, B, and AB. That is, $\sigma_\mu^2 = \sigma_A^2 + \sigma_B^2 + \sigma_{AB}^2$.

### Parameters

- There are five parameters in this model: $\mu$, $\sigma_A^2$, $\sigma_B^2$, $\sigma_{AB}^2$, $\sigma^2$.
- The cell means are random variables, not parameters!

### ANOVA Table

- The terms and layout of the ANOVA table are the same as what we used for the fixed effects model.
- However, the expected mean squares (EMS) are different.

### EMS and parameter estimates

$$
\begin{aligned}
E(MSA) &= \sigma^2 + bn\sigma_A^2 + n\sigma_{AB}^2 \\
E(MSB) &= \sigma^2 + an\sigma_B^2 + n\sigma_{AB}^2 \\
E(MSAB) &= \sigma^2 + n\sigma_{AB}^2 \\
E(MSE) &= \sigma^2
\end{aligned}
$$

Estimates of the variance components can be obtained from these equations or by other methods.

Note the patterns in the EMS (these hold for balanced data):

- They all contain $\sigma^2$.
- $E(MSA)$ also contains all the variance components that have an A in the subscript ($\sigma_A^2$ and $\sigma_{AB}^2$); similarly for the other MS terms.
- The coefficient of each term (except the first) is the product of $n$ and all letters not represented in the subscript. It is also the total number of observations at each fixed level of the factor corresponding to the subscript (e.g., there are $nb$ observations for each level of A).

### Hypotheses

$$
\begin{aligned}
H_{0A}&: \sigma_A^2 = 0 & H_{1A}&: \sigma_A^2 \neq 0 \\
H_{0B}&: \sigma_B^2 = 0 & H_{1B}&: \sigma_B^2 \neq 0 \\
H_{0AB}&: \sigma_{AB}^2 = 0 & H_{1AB}&: \sigma_{AB}^2 \neq 0
\end{aligned}
$$

### Hypothesis $H_{0A}$

- $H_{0A}: \sigma_A^2 = 0$; $H_{1A}: \sigma_A^2 \neq 0$
- $E(MSA) = \sigma^2 + bn\sigma_A^2 + n\sigma_{AB}^2$, $E(MSAB) = \sigma^2 + n\sigma_{AB}^2$, $E(MSE) = \sigma^2$
- We need to look for the ratio that will be 1 when $H_0$ is true and bigger than 1 when it is false. So this hypothesis will be tested by $F = MSA/MSAB$ (not the usual fixed effects test statistic).
- The degrees of freedom for the test will be the degrees of freedom associated with those mean squares: $a - 1$ and $(a - 1)(b - 1)$.
- Notice you can no longer assume that the denominator is $MSE$! Note that the test using $MSE$ is done by SAS, but it is not particularly meaningful (it sort of tests both main effect and interaction at once).

### Hypothesis $H_{0B}$

- $H_{0B}: \sigma_B^2 = 0$; $H_{1B}: \sigma_B^2 \neq 0$
- $E(MSB) = \sigma^2 + an\sigma_B^2 + n\sigma_{AB}^2$, $E(MSAB) = \sigma^2 + n\sigma_{AB}^2$, $E(MSE) = \sigma^2$
- So $H_{0B}$ is tested by $F = MSB/MSAB$ with degrees of freedom $b - 1$ and $(a - 1)(b - 1)$.

### Hypothesis $H_{0AB}$

- $H_{0AB}: \sigma_{AB}^2 = 0$; $H_{1AB}: \sigma_{AB}^2 \neq 0$
- $E(MSAB) = \sigma^2 + n\sigma_{AB}^2$, $E(MSE) = \sigma^2$
- So $H_{0AB}$ is tested by $F = MSAB/MSE$ with degrees of freedom $(a - 1)(b - 1)$ and $ab(n - 1)$.

### Run proc glm

    proc glm data=efficiency;
      class driver car;
      model mpg=driver car driver*car;
      random driver car driver*car / test;
    run;

Model and error output:

                                Sum of
    Source             DF      Squares    Mean Square   F Value   Pr > F
    Model              19   377.4447500    19.8655132    113.03   <.0001
    Error              20     3.5150000     0.1757500
    Corrected Total    39   380.9597500

Factor effects output:

    Source        DF     Type I SS    Mean Square   F Value   Pr > F
    driver         3   280.2847500     93.4282500
    car            4    94.7135000     23.6783750
    driver*car    12     2.4465000      0.2038750      1.16   0.3715

Only the interaction test is valid here: the test for interaction is $MSAB/MSE$, but the tests for main effects should be $MSA/MSAB$ and $MSB/MSAB$, which are done with the `test` option, not with $MSE$ as is done here. (However, if you do this the main effects are significant, as shown below.)

**Lesson:** just because SAS spits out a p-value doesn't mean it is for a meaningful test!

`random` statement output:

    Source        Type III Expected Mean Square
    driver        Var(Error) + 2 Var(driver*car) + 10 Var(driver)
    car           Var(Error) + 2 Var(driver*car) + 8 Var(car)
    driver*car    Var(Error) + 2 Var(driver*car)

`random / test` output:

    The GLM Procedure
    Tests of Hypotheses for Random Model Analysis of Variance
    Dependent Variable: mpg

    Source                   DF   Type III SS   Mean Square   F Value   Pr > F
    driver                    3    280.284750     93.428250    458.26   <.0001
    car                       4     94.713500     23.678375    116.14   <.0001
    Error                    12      2.446500      0.203875
    Error: MS(driver*car)

This last line says the denominator of the F tests is $MSAB$.

    Source                   DF   Type III SS   Mean Square   F Value   Pr > F
    driver*car               12      2.446500      0.203875      1.16   0.3715
    Error: MS(Error)         20      3.515000      0.175750

For the interaction term, this is the same test as was done above.

### proc varcomp

    proc varcomp data=efficiency;
      class driver car;
      model mpg=driver car driver*car;
    run;

    MIVQUE(0) Estimates
    Variance Component        mpg
    Var(driver)           9.32244
    Var(car)              2.93431
    Var(driver*car)       0.01406
    Var(Error)            0.17575

## Mixed Models

### Two-way mixed model

A two-way mixed model has

- One fixed main effect
- One random main effect
- The interaction, which is considered a random effect

### Tests

- The fixed main effect is tested with the interaction in the denominator.
- The random main effect is tested by error.
- The interaction is tested by error.
- Notice that these are backwards from what you might intuitively extrapolate from the two-way random effects and two-way fixed effects models.

See Table 25.5 (page 1052) and below for the EMS that justify these statements. Also see Table 25.6 for the tests (page 1053).

### Notation for two-way mixed model

- $Y$, the response variable
- A, the fixed effect, $a$ levels
- B, the random effect, $b$ levels
- We'll stick to balanced designs ($n_{ij} = n$).

### Factor effects parameterization

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk},$$

where

- $\mu$ is the overall mean,
- $\alpha_i$ are fixed (but unknown) main effects with $\sum_i \alpha_i = 0$,
- $\beta_j$ are $N(0, \sigma_B^2)$ independent random main effects,
- $(\alpha\beta)_{ij}$ are random interaction effects.

Randomness is "catching", so the interaction between a fixed and a random effect is considered random and has a distribution. However, the interactions are also subject to constraints, kind of like fixed effects:

$$(\alpha\beta)_{ij} \sim N\!\left(0, \frac{a-1}{a}\sigma_{AB}^2\right) \quad \text{subject to the constraint} \quad \sum_i (\alpha\beta)_{ij} = 0 \text{ for each } j.$$

Because of the constraints, $(\alpha\beta)_{ij}$ having the same $j$ but different $i$ are negatively correlated, with covariance

$$\mathrm{Cov}\big((\alpha\beta)_{ij}, (\alpha\beta)_{i'j}\big) = -\frac{\sigma_{AB}^2}{a}.$$

### Expected Mean Squares

$$
\begin{aligned}
E(MSA) &= \sigma^2 + \frac{bn}{a-1}\sum_i \alpha_i^2 + n\sigma_{AB}^2 \\
E(MSB) &= \sigma^2 + an\sigma_B^2 \\
E(MSAB) &= \sigma^2 + n\sigma_{AB}^2 \\
E(MSE) &= \sigma^2
\end{aligned}
$$

SAS `proc glm` writes these out for you, but it uses the notation Q(A) to denote the fixed quantity $\frac{bn}{a-1}\sum_i \alpha_i^2$. It uses the names Var(Error) $= \sigma^2$, Var(B) $= \sigma_B^2$, and Var(A × B) $= \sigma_{AB}^2$. (It doesn't actually use the names A and B; it uses the variable names.)

Looking at these EMS, we can see that different denominators will be needed to test for the various effects.

- $H_{0A}$: all $\alpha_i = 0$ is tested by $F = MSA/MSAB$.
- $H_{0B}: \sigma_B^2 = 0$ is tested by $F = MSB/MSE$.
- $H_{0AB}: \sigma_{AB}^2 = 0$ is tested by $F = MSAB/MSE$.

So, though it seems counterintuitive at first, the fixed effect is tested by the interaction, and the random effect is tested by the error.

### Example: KNNL Problem 25.16

(`knnl1080a.sas`)

- $Y$ = service time for disk drives
- A = make of drive (fixed, with $a = 3$ levels)
- B = technician performing service (random, with $b = 3$ levels)

The three technicians for whom we have data are selected at random from a large number of technicians who work at the company.

    data service;
      infile 'h:\stat512\datasets\ch19pr16.dat';
      input time tech make k;
      mt = make*10 + tech;
    proc print data=service;
    proc glm data=service;
      class make tech;
      model time = make tech make*tech;
      random tech make*tech / test;
    run;

    The GLM Procedure
    Dependent Variable: time
                                Sum of
    Source             DF      Squares    Mean Square   F Value   Pr > F
    Model               8   1268.177778    158.522222      3.05   0.0101
    Error              36   1872.400000     52.011111
    Corrected Total    44   3140.577778

    R-Square    Coeff Var    Root MSE    time Mean
    0.403804     12.91936    7.211873     55.82222

    Source       DF     Type I SS    Mean Square   F Value   Pr > F
    make          2     28.311111      14.155556
    tech          2     24.577778      12.288889      0.24   0.7908
    make*tech     4   1215.288889     303.822222      5.84   0.0010

We have $MSA = 14.16$, $MSB = 12.29$, $MSAB = 303.82$, and $MSE = 52.01$.

    The GLM Procedure
    Source       Type III Expected Mean Square
    make         Var(Error) + 5 Var(make*tech) + Q(make)
    tech         Var(Error) + 5 Var(make*tech) + 15 Var(tech)
    make*tech    Var(Error) + 5 Var(make*tech)

To test the fixed effect `make`, we must use the interaction:

$$F_A = MSA/MSAB = 14.16/303.82 = 0.05 \text{ with } (2, 4) \text{ df}, \quad p = 0.955.$$

To test the random effect `tech` and the interaction, we use error:

$$F_B = MSB/MSE = 12.29/52.01 = 0.24 \text{ with } (2, 36) \text{ df}, \quad p = 0.7908$$

$$F_{AB} = MSAB/MSE = 303.82/52.01 = 5.84 \text{ with } (4, 36) \text{ df}, \quad p = 0.001$$

    The GLM Procedure
    Tests of Hypotheses for Mixed Model Analysis of Variance
    Dependent Variable: time

    Source                   DF   Type III SS   Mean Square   F Value   Pr > F
    make                      2     28.311111     14.155556      0.05   0.9550
    tech                      2     24.577778     12.288889
    Error: MS(make*tech)      4   1215.288889    303.822222

    Source                   DF   Type III SS   Mean Square   F Value   Pr > F
    make*tech                 4   1215.288889    303.822222      5.84   0.0010
    Error: MS(Error)         36   1872.400000     52.011111

## Three-way models

We can have zero, one, two, or three random effects, etc. The EMS indicate how to do the tests. In some cases the situation is complicated and we need approximations; e.g., when all effects are random, use $MSAB + MSAC - MSABC$ as the denominator to test A.
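As a worked illustration of the confidence interval for $\mu$ in the one-way interviewer example, the sketch below plugs in the reported mean square. The 95% level and the critical value $t(0.975; 4) \approx 2.776$ are choices made here for illustration, and $\bar{Y}_{..} = 71.45$ is computed from the 20 listed ratings rather than quoted from the SAS output.

```latex
\begin{align*}
s^2\{\bar{Y}_{..}\} &= \frac{MSA}{rn} = \frac{394.925}{5 \cdot 4} \approx 19.746,
  \qquad s\{\bar{Y}_{..}\} \approx 4.444 \\
\bar{Y}_{..} \pm t(0.975;\, r-1)\, s\{\bar{Y}_{..}\}
  &= 71.45 \pm 2.776 \times 4.444 \approx 71.45 \pm 12.34 \\
59.1 &\le \mu \le 83.8 \quad \text{(approximately)}
\end{align*}
```

The interval is wide because only $r = 5$ interviewers were sampled: the standard error is driven by $MSA$ with 4 degrees of freedom, not by $MSE$ with 15.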
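The variance component estimates for the two-way random effects (driver and car) example can be recovered by solving the EMS equations from the bottom up. The following is a sketch using the mean squares reported in the `proc glm` output, with $a = 4$, $b = 5$, $n = 2$; in this balanced example the results agree with the MIVQUE(0) estimates printed by `proc varcomp`.

```latex
\begin{align*}
\hat{\sigma}^2 &= MSE = 0.17575 \\
\hat{\sigma}^2_{AB} &= \frac{MSAB - MSE}{n}
  = \frac{0.203875 - 0.17575}{2} \approx 0.01406 \\
\hat{\sigma}^2_{A} &= \frac{MSA - MSAB}{bn}
  = \frac{93.42825 - 0.203875}{10} \approx 9.32244 \\
\hat{\sigma}^2_{B} &= \frac{MSB - MSAB}{an}
  = \frac{23.678375 - 0.203875}{8} \approx 2.93431
\end{align*}
```

Each estimate subtracts the mean square whose expectation matches everything except the component of interest, which is the same logic that picks the denominator of each F test.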
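The three-way approximation mentioned above can be motivated from the EMS pattern. The sketch below assumes a balanced three-way model with all effects random, and introduces $c$ for the number of levels of a third factor C (a symbol not used in the notes). The degrees-of-freedom formula is the usual Satterthwaite approximation, stated here as an assumption about the method rather than something the notes spell out.

```latex
\begin{align*}
E(MSA)   &= \sigma^2 + n\sigma^2_{ABC} + nc\,\sigma^2_{AB} + nb\,\sigma^2_{AC} + nbc\,\sigma^2_{A} \\
E(MSAB)  &= \sigma^2 + n\sigma^2_{ABC} + nc\,\sigma^2_{AB} \\
E(MSAC)  &= \sigma^2 + n\sigma^2_{ABC} + nb\,\sigma^2_{AC} \\
E(MSABC) &= \sigma^2 + n\sigma^2_{ABC} \\[4pt]
E(MSAB) + E(MSAC) - E(MSABC)
  &= \sigma^2 + n\sigma^2_{ABC} + nc\,\sigma^2_{AB} + nb\,\sigma^2_{AC}
   = E(MSA) \text{ when } \sigma^2_A = 0 \\[4pt]
F^* &= \frac{MSA}{MSAB + MSAC - MSABC}, \qquad
df_{\text{denom}} \approx
  \frac{(MSAB + MSAC - MSABC)^2}
       {\dfrac{MSAB^2}{df_{AB}} + \dfrac{MSAC^2}{df_{AC}} + \dfrac{MSABC^2}{df_{ABC}}}
\end{align*}
```

No single mean square has the right expectation under $H_0$, so a linear combination is used, and its degrees of freedom must be approximated.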
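The notes name `proc mixed` as the main procedure but show it only for the one-way example. Below is a minimal sketch of how the two-way examples might be fit with it, assuming the `efficiency` and `service` data sets created earlier; this code is not from the original notes. By default `proc mixed` estimates variance components by REML rather than MIVQUE(0), and it does not impose the sum-to-zero constraint on the interaction effects, so its results for the mixed model need not match the hand calculations based on the restricted model.

```sas
* Two-way random effects model: no fixed effects, so the model line is empty;
proc mixed data=efficiency cl;
  class driver car;
  model mpg = ;
  random driver car driver*car;
run;

* Two-way mixed model: make is fixed, tech and make*tech are random;
proc mixed data=service cl;
  class make tech;
  model time = make;
  random tech make*tech;
run;
```

As in the one-way example, only fixed effects appear on the `model` line, and every random term (including the interaction) goes on the `random` line.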
