This 22-page set of class notes was uploaded by Orval Funk on Monday, September 28, 2015. The notes belong to STAT102 at University of Pennsylvania, taught by L. Zhao in Fall.
Date Created: 09/28/15
Lecture 16: Two-Way ANOVA (Stat 102)

• General Description (read Chapters 9.2 & 9.3)
  - Additive Model, including the Randomized Block Design (Chapter 9.2)
  - Model with interactions (Chapter 9.3)
• Example: Agent service times
• General Description (cont.)
  - Theory for the Additive Model
  - Formulas for analysis of the Balanced Additive Model
• Application to the example
• More theory and formulas: Model with interactions
• Example (cont.)
• Comments on the unbalanced case & other notes

General Description

• The 2-way model is an extension of the 1-way model.
• The 1-way model has 1 factor with $I$ categorical levels. (The example in One-Way ANOVA had the factor "Fund Type" with $I = 4$ different fund types, i.e. 4 levels.)
• The 2-way model has 2 factor types that can be combined in various ways. Typically they can be combined in all possible ways.
• In the Additive Model the effects of these 2 factor types add together. This can occur in a Randomized Block Design as described in Chapter 9.2, but it can also occur in other situations.
• In a more general interaction model, the factor effects may also interact with each other.

Example: Agent Service Times

• In a telephone call center, the AgentServiceTime of a call is the amount of time the agent spends on a given call.
• This depends on two types of factors:
  - The Agent (identified here by name)
  - The Type of Call. There are 3 CallTypes in our data: Regular, Stock transaction, New Customer.
• There are many other factors, but they are ignored in the following discussion.
• It is desirable to see which agents serve their customers faster, after controlling for call type. In general, faster is better.
• For example, the manager may wish to give a bonus to the fastest agents.

Example: Data

• From an Israeli bank call center.
• Service times for a random sample of calls handled by 8 different agents in Nov. 1999.
• There are 3 major types of calls handled by the center: Regular, Stock transaction, New Customer.
• By design, the random sample has 25 calls of each of the 3 types from each of the 8 agents (a balanced sample).
• Traditionally, service times of many sorts have been treated as if they had an exponential distribution, beginning with the research of Erlang in 1911.
• However, Brown, Zhao (2005) found that such times seem to have a lognormal distribution, i.e. their logs are normally distributed. We don't really understand WHY. But this has now been verified in several call-center situations.

Example: Use of 1-Way ANOVA

• A 1-way ANOVA on Server (ignoring CallType) is a useful preliminary analysis.
  - It can help us understand the data and the relation of Servers to call times.
  - It will help prepare for the explanation of the difference between the 1-way and 2-way analyses.
• The statistical MODEL in this 1-way analysis has $Y_{ij}$ = Log(ServiceTime) of the $j$-th call to service agent $i$:
  $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, $\varepsilon_{ij} \sim N(0, \sigma^2)$, $i = 1, \dots, 8$, $j = 1, \dots, 75$ (75 = total calls handled by agent $i$).
  Note: this model is homoscedastic.
• Here is output from the 1-way ANOVA:

[Figure: Side-by-side plot from the 1-way ANOVA, by server (AVNI, DORIT, MEIR, MORIAH, ROTH, SHARON, VICKY, YITZ), with All Pairs Tukey-Kramer 0.05 comparison.]

• Note that the Tukey-Kramer multiple comparison test shows that Avni is worse than Dorit and Sharon (not controlling for CallType).

2-Way ANOVA: General Model

• Observe $Y_{ijk}$ with $i = 1, \dots, I$; $j = 1, \dots, J$; and $k = 1, \dots, K_{ij}$.
• The subscripts $i$ and $j$ index the two types of factors.
• The subscript $k$ indexes the repetitions of a type of observation.
• A Balanced Model has $K_{ij} = K$, a constant, for all $i, j$.
• Our example has $I = 8$ servers, $J = 3$ CallTypes and $K = 25$ repetitions for each server & type combination. Our model is balanced.

Model Without Interactions (cont.)

• The additive model decomposes the population means $\mu_{ij} = E(Y_{ijk})$ as a sum of effects of the corresponding $i$ and $j$ factors. Thus
  $\mu_{ij} = \mu + \alpha_i + \beta_j$.
• We also assume normality, homoscedasticity & independence, i.e.
  $Y_{ijk} = \mu_{ij} + \varepsilon_{ijk}$ with $\varepsilon_{ijk}$ independent $N(0, \sigma^2)$.
• The general principle for estimating the unknown values $\mu, \alpha_i, \beta_j$ is, as before, Least Squares.
• Thus these are chosen to minimize
  $\sum_{i,j,k} (Y_{ijk} - \mu - \alpha_i - \beta_j)^2$.

Hypothesis Tests in the Additive Model

• There is an overall null hypothesis to be
tested: there are no factor effects, i.e.
  $H_0: \alpha_1 = \dots = \alpha_I = 0$ and $\beta_1 = \dots = \beta_J = 0$, vs. $H_a$: $H_0$ is not true.
• There are also two separate sub-hypotheses of additional interest:
  - No Factor A effects: $H_{0A}: \alpha_1 = \dots = \alpha_I = 0$ vs. $H_{aA}$: $H_{0A}$ is not true
  - No Factor B effects: $H_{0B}: \beta_1 = \dots = \beta_J = 0$ vs. $H_{aB}$: $H_{0B}$ is not true
  1. In ordinary language, these should be interpreted as testing whether there are factor A effects after controlling additively for the effects of factor B, and vice versa.
  2. Normally you would only be interested in these tests if the overall test of $H_0$ rejects.
• There are also tests and CIs for individual estimates of $\mu, \alpha_i, \beta_j$.
• And CIs and prediction CIs for the estimates of $\mu_{ij}$.

Example: Results for Additive Model

• ANOVA Table: gives the test of the overall null hypothesis $H_0$.

  Analysis of Variance
  Source     DF                    Sum of Squares   Mean Square   F Ratio
  Model      I + J - 2 = 9              80.45           8.94       12.87
  Error      n - I - J + 1 = 590       409.71           0.69      Prob > F
  C. Total   n - 1 = 599               490.16                      <.0001

• F = 12.87 with 9 & 590 DF has P < .0001.
• Hence we REJECT $H_0$.
• Here
  $DF_{Total} = n - 1$
  $DF_{Model} = (I - 1) + (J - 1) = I + J - 2$
  $DF_{Error} = (n - 1) - (I + J - 2) = n - I - J + 1$
• Mean Square = Sum of Squares / DF.
• Also $SS_{Total} = \sum_{i,j,k} (Y_{ijk} - \bar{Y})^2$ as always, & $SSE = \sum_{i,j,k} (Y_{ijk} - \hat{\mu}_{ij})^2$, where $\hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j$.

Results (cont.)

• Effect Test Table: gives tests of $H_{0A}$ and $H_{0B}$.

  Effect Tests
  Source     DF          Sum of Squares   F Ratio   Prob > F
  CallType   J - 1 = 2        56.19        40.46     <.0001
  server     I - 1 = 7        24.26         4.99     <.0001

• $DF_{CallType} = J - 1 = 2$; $DF_{server} = I - 1 = 7$.
• The F-ratios test the two sub-hypotheses $H_{0A}$ and $H_{0B}$, with DFs being $DF_{effect}$ & $DF_{Error}$, and P-values as given in the table.
• Both null hypotheses are rejected here.
• The Effect Sums of Squares have the usual interpretation: they are the decrease in SSE when going from the 1-way model without the effect type to the complete model with both types of effects.
• When the model is balanced, as here, the Effect Sums of Squares add up to $SS_{Model}$; otherwise usually not.

Results (cont.)

• Parameter Estimates: we recommend using the Expanded Estimates table. This has all the $\mu, \alpha_i, \beta_j$ parameter estimates.
• Here are some of the entries:

  Expanded Estimates
  Term             Estimate   Std Error   t Ratio   Prob>|t|
  Intercept          4.823      0.0340    141.78    0.0000
  CallType[New]     -0.395      0.0481     -8.21    <.0001
  CallType[Reg]      0.0438     0.0481      0.91    0.3628
  CallType[Stock]    0.351      0.0481      7.29    <.0001
  server[AVNI]       0.3438     0.0900      3.82    0.0001
  server[DORIT]     -0.199      0.0900     -2.21    0.0275
  etc.

• Hence, for example, the estimated LogServiceTime for Dorit to handle a Stock call is 4.823 + 0.351 - 0.199 = 4.975.
• The t-ratios and P-values here test that the individual coefficients are 0. This is not usually an interesting null hypothesis to test.
• You can find CIs the usual way with the Save Columns drop-down.

Results: Model Validation

• The residual plot provides a visual check for homoscedasticity, though it can be hard to read because of superimposed points.
• Saving the residuals and then looking at the normal quantile plot of their distribution provides a good check for normality of residuals.
• A good check of whether the additive model is suitable is provided by looking at the model with interactions and seeing whether the interaction effects are significantly nonzero. If so, the additive model is rejected as a null hypothesis versus the model with interactions.
• The next two pages show the results of the first two investigations. Then we'll discuss the model with interactions.

Results: Residual Plot

[Figure: Residual by Predicted Plot, LogSerTime Residual vs. LogSerTime Predicted.]

• There is no evident heteroscedasticity here, and that fact provides a visual confirmation that the homoscedasticity assumption is probably OK.
• There's also a formal test of homoscedasticity that can be used here, which we won't describe. The result is to fail to reject homoscedasticity.

Results: Normal Quantile Plot of Residuals

[Figure: Normal Quantile Plot of the residuals.]

• This shows an extremely good agreement with normality.
• It is because of many plots like this that we concluded that service times are lognormal for the Israeli call center.
• When we perform a similar 2-way analysis using ordinary ServiceTime, this type of residual plot comes out strongly skewed and non-normal.

Results: Notes (optional)

• You can find confidence intervals
for individual means, or prediction CIs, from the drop-down menu. For example, the 95% prediction interval for the LogServiceTime for Dorit to handle a Stock call is (3.33, 6.63). This is of course a pretty wide interval, which reflects the fact that the Root Mean Square Error here is still pretty large. It's 0.832 (from the Summary of Fit table), and this together with the estimate of 4.975 computed earlier suggests we should find prediction CIs of about 4.98 ± 2 × 0.83, and this is what we get from the formal procedure in JMP.
• This CI corresponds to a 95% prediction CI in terms of service times of $(e^{3.33}, e^{6.63}) = (27.9, 757.5)$. 2 of Dorit's 25 Stock calls fell outside of this interval, which is about par for a 95% interval.
• There is also a way to find multiple confidence intervals here for the effect of the Server after controlling for the Type of Service. But this is not automatic in JMP. It turns out that 95% intervals of this sort show that Avni is significantly longer-winded than Dorit and Sharon (as with the analysis ignoring Service Type).
• And also Roth and Meir are significantly longer-winded than Sharon.

General Model With Interactions

• As before, we have observations $Y_{ijk}$ corresponding to $K$ observations at level $(i, j)$ of the two factors.
• Now we include a possible effect of the interaction between the two factors, after controlling for the overall additive effect of the two factors.
• The interaction effect parameter for combination $(i, j)$ is labeled $\gamma_{ij}$.
• The full model for $\mu_{ij} = E(Y_{ijk})$ is thus
  $\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$.
• The model still has the same type of normality, homoscedasticity and independence assumptions as before.

Estimation of the Means

• In this model the means $\mu_{ij}$ can take any numerical values.
• Hence the best estimate of $\mu_{ij}$ is the $ij$-th cell mean,
  $\hat{\mu}_{ij} = \bar{Y}_{ij\cdot} = \frac{1}{K} \sum_{k=1}^{K} Y_{ijk}$.
• This value minimizes
  $SSE = \sum_{i,j,k} (Y_{ijk} - \hat{\mu}_{ij})^2$.
• The estimates of $\mu, \alpha_i, \beta_j, \gamma_{ij}$ are then found in JMP by solving
  $\hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij}$.
  This is done subject to the convenient minimal set of side conditions
  $0 = \sum_i \alpha_i = \sum_j \beta_j = \sum_i \gamma_{ij} = \sum_j \gamma_{ij}$.

Tests and DFs

• The ANOVA full-model test has $H_0$:
$\mu_{ij} = \mu$ for all $i, j$, vs. the alternative $H_a$: the $\mu_{ij}$ are any numbers that are not all equal.
• Hence the F-statistic in the ANOVA table has numerator DF $= IJ - 1$.
• It is also possible to test the null hypotheses
  $H_{0A}: \alpha_i = 0 \;\forall i$ (no Factor A effect) &
  $H_{0B}: \beta_j = 0 \;\forall j$ (no Factor B effect), as well as
  $H_{0,Inter}: \gamma_{ij} = 0 \;\forall i, j$.
• Each of these null hypotheses involves a test controlling for the other variables in the usual fashion.
• The DFs are given by the formulas shown in the tables below.

Example: Analysis With Interactions

• Use the Fit Model platform. Choose as Model Effects CallType & Server, & also cross CallType & Server. (Watch in lecture how this is done.)
• Here is the ANOVA table:

  Analysis of Variance
  Source     DF              Sum of Squares   Mean Square   F Ratio
  Model      IJ - 1 = 23          96.32           4.19        6.12
  Error      n - IJ = 576        393.84           0.68      Prob > F
  C. Total   n - 1 = 599         490.16                      <.0001

• Note the DF values.
• As usual, Mean Square = Sum of Squares / DF and F = $MS_{Model}$ / MSE, with 23 & 576 DF.
• We reject $H_0$: no model effect, since the P-value is < .0001.

Tests for the Effect Types

  Effect Tests
  Source            DF                     Sum of Squares   F Ratio   Prob > F
  CallType          J - 1 = 2                   56.19        41.09     <.0001
  server            I - 1 = 7                   24.26         5.07     <.0001
  server*CallType   (I - 1)(J - 1) = 14         15.87         1.66     0.0604

• Note the DF values. Note also that these add to the previous $DF_{Model}$.
• The Effect Sums of Squares have the usual interpretation: they are the decrease in SSE when going from the model without the effect type to the complete model with all types of effects.
• As usual, F = (Sum of Squares of Effect / DF of Effect) / MSE.
• Here the F for testing CallType has 2 and 576 DFs, etc.
• At $\alpha = .05$ we reject $H_{0A}$ and $H_{0B}$. We fail to reject $H_{0,Inter}$.

Estimates of Mean Service Times

• Overall, the interaction effects are not quite statistically significant.
• SO we could justifiably decide to just use the additive model from before.
• OR we could still use the full model. If we did that, the Expanded Estimates table says:

  Term                             Estimate   Std Error   t Ratio   Prob>|t|
  Intercept                          4.823      0.0338    142.88    0.0000
  CallType[Stock]                    0.351      0.0477      7.35    <.0001
  server[DORIT]                     -0.199      0.0893     -2.23    0.0263
  server[DORIT]*CallType[Stock]     -0.423      0.1263     -3.35    0.0009

Hence we
estimate the mean for Dorit to handle a Stock call as 4.823 + 0.351 - 0.199 - 0.423 = 4.552.

The estimates for the CallType and Server effects are equal to those in the no-interaction model because the model is balanced. Otherwise the estimates may differ.
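
The arithmetic behind these estimates can be checked with a short script. This is only a sketch: the numbers are copied from the Expanded Estimates and ANOVA tables in these notes, the signs of the effects are the ones implied by the stated totals 4.975 and 4.552, the variable names are made up for illustration, and the "estimate ± 2 × RMSE" interval is the rough rule used in the notes rather than the exact formula JMP applies.

```python
import math

# Estimates on the log service-time scale, copied from the notes.
# Signs are those implied by the stated totals 4.975 and 4.552.
intercept = 4.823
stock_effect = 0.351      # CallType[Stock]
dorit_effect = -0.199     # server[DORIT]
interaction = -0.423      # server[DORIT]*CallType[Stock]
rmse_additive = 0.832     # Root Mean Square Error, additive model

# Estimated cell mean for Dorit handling a Stock call under each model.
additive_mean = intercept + stock_effect + dorit_effect
full_mean = additive_mean + interaction

# Rough 95% prediction interval: estimate +/- 2 * RMSE (approximation only).
lower = additive_mean - 2 * rmse_additive
upper = additive_mean + 2 * rmse_additive

# Back-transform the endpoints from the log scale to the service-time scale.
lower_time = math.exp(lower)
upper_time = math.exp(upper)


def f_ratio(ss_effect, df_effect, mse):
    """F = (Sum of Squares of Effect / DF of Effect) / MSE."""
    return (ss_effect / df_effect) / mse


# Interaction test in the full model: SS = 15.87 on 14 DF, SSE = 393.84 on 576 DF.
mse_full = 393.84 / 576
f_inter = f_ratio(15.87, 14, mse_full)

print(f"additive estimate: {additive_mean:.3f}")
print(f"full-model estimate: {full_mean:.3f}")
print(f"approx. log-scale interval: ({lower:.2f}, {upper:.2f})")
print(f"approx. time-scale interval: ({lower_time:.1f}, {upper_time:.1f})")
print(f"interaction F ratio: {f_inter:.2f}")
```

Because the interval here uses the rough ± 2 × RMSE rule, its endpoints are close to, but not identical with, the (3.33, 6.63) interval that the notes report from JMP.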