This 7 page Class Notes was uploaded by Orval Funk on Monday September 28, 2015. The Class Notes belongs to STAT102 at University of Pennsylvania taught by Staff in Fall. Since its upload, it has received 36 views. For similar materials see /class/215434/stat102-university-of-pennsylvania in Statistics at University of Pennsylvania.
Date Created: 09/28/15
Statistics 102, Spring 2000
One-Way Analysis of Variance

Administrative Items

  Midterm
    Grades
    Makeup exams in general
  Getting help
    See me today 3-5:30 or Wednesday from 4-5:30
    Send an email to stine@wharton
    Visit TAs, particularly for help using the computer

Review Questions: Chi-square

  How do we calculate the chi-square statistic?
    (a) Textbook formula, given that you can figure out the expected frequencies (pages 399-404)
        Use the null hypothesis to obtain expected counts.
        Use the marginals if testing for independence.
        It's done on a case-by-case basis for goodness-of-fit.
    (b) Use JMP software in cases of independence.
        Use the column button to tell JMP that the column has frequencies.

  How do I use the value of the chi-square statistic?
    Large values of the chi-square statistic indicate a deviation from the null hypothesis.
    The chi-square gets larger as the data depart further from the null hypothesis.
    Look at the formula: deviations are squared and added.

  What assumptions should be checked?
    The observations used to build the table are independent.
    The expected counts are 5 or larger.

New Concepts and Terminology

  Analysis of variance: method for comparing many mean values
    Generalization of the two-sample t test
    Same assumptions
    Same null, H0: μ1 = μ2 = ... = μk vs. some difference
    The t test is replaced by an overall test, the overall F test.

  Analysis of Variance
    Source     DF   Sum of Squares   Mean Square   F Ratio
    Model       9         1173.290       130.366    1.0361
    Error      90        11324.500       125.828    Prob>F
    C Total    99        12497.790       126.240    0.4180

  Anova table summary
    Test H0 using a decomposition of the total variation.
    The Anova table summarizes the sources of variation:
      Model: differences between group averages
      Error: differences within groups
    DF: degrees of freedom
    Sum of squares: variation attributed to some source
    Mean square: variation divided by degrees of freedom
    F Ratio: ratio of model mean square to error mean square
    Prob>F: p-value of the test of H0

  Complicating issue
    It's too easy to find a significant effect when many comparisons are made.
    If you do 20 t tests at the 0.05 level, you expect to find one error (20 x 0.05 = 1).
    The chance for at least one error among 20 independent α = 0.05 tests is
      P(error somewhere) = 1 - P(no error) = 1 - 0.95^20 ≈ 0.64

  Bonferroni method
    Simple means to control the chance for an error somewhere.
    Based on the simple Bonferroni inequality
      P(Error_1 or Error_2 or ... or Error_20) ≤ Σ P(Error_i) = 20 x 0.05
    Want to do 10 honest t tests at one time?
      Use a p-value cutoff of 0.05/10 = 0.005 for each one rather than the usual 0.05 cutoff.
    Moral: Being punished for not having a better, sharper theory.

  Other multiple comparison methods
    Goal: Locate important differences while avoiding false claims of significant differences.
    Problem: Suppose you have 10 groups => 45 pairwise comparisons.
    Three methods of comparison, depending upon the question of interest:
      Least significant differences (LSD): generally avoid
      Hsu's comparisons (HSU): Which is best/worst?
      Tukey-Kramer comparisons (HSD): Are any differences significant?

  JMP
    How do I build an analysis of variance with one factor?
      Fit Y by X, with continuous response Y and single categorical predictor X.
      Graphically: comparison circles show which are different.
      Tables summarize the differences.
      Read the labels to interpret the output.

Examples for Today

  Selecting the best vendor (Repairs.jmp, page 233)

    Question: Does one vendor stand out as the best (lowest cost)?
    Data: 10 service calls for each of 10 vendors (price of comparable repair)

    Initially model as a multiple regression using one categorical variable.
      t-ratios for the slopes suggest one is significant (p = 0.0307).
    Is this appropriate?
      Bonferroni says no: none are significant; all p-values are > 0.005.
      F-ratio says no: the factor as a whole is not useful.
    Multiple comparisons via Hsu's procedure (Fit Y by X) also say no.
      Graphically (p. 236), via linked comparison circles
      Tabular (p. 237)
        The tables show one endpoint of the confidence interval for the difference in mean values.

    Decoding the table
      We want to know if the smallest value is really smaller than the rest.
      The difference Avg(smallest) - Avg(other group) will always be negative, so the lower endpoint of the confidence interval will also be negative.
      The upper endpoint might be negative or positive.
        If it's negative, then zero is not in the interval and the difference is significant.
        If it's positive, as all are in the table at the bottom of page 237, the differences are not significant.
      Use the hints that the JMP output provides:
        "If a column has any negative values, the mean for that column is significantly greater than the min" (smallest mean value).

    Checking assumptions: independence, constant variance, normality.

    Conclude: Differences among the vendors are not significant.
      Get more data and have a more focused comparison next time.
      Note the deceptive use of an inappropriate analysis to conclude a significant effect does exist (p. 242).

  Headache pain relief (Headache.jmp, page 243)

    Question: What claims can be made comparing these drugs?
    Data: Tests of several active compounds and placebo
      Similar in spirit to methods used in clinical trials.

    Outliers are still relevant; after all, it's still multiple regression.
    Use Tukey-Kramer, since we are interested in all possible comparisons for marketing claims rather than just deciding which, if any, offers the most relief.

    Conclude: Active compound 2 is clearly best, even allowing for all six possible pairwise comparisons.
      None of the others differs from each other in the amount of relief offered.

JMP Output: Vendor Example

  [Figure: response plotted by Vendor (1-10); vertical axis from about 120 to 180]

  [Analysis of Variance table for the vendor data, identical to the table shown under New Concepts and Terminology]

  [Figure: the same plot with comparison circles; panels "All Pairs, Tukey-Kramer" and "With Best, Hsu's MCB", both at 0.05]

  Comparisons for all pairs using Tukey-Kramer (q* = 3.24444)
    [Table of Abs(Dif)-LSD values; entries not recoverable]
    Positive values show pairs of means that are significantly different.

  Comparisons with the best using Hsu's MCB
    [Table of Mean[i]-Mean[j]-LSD values; entries not recoverable]
    If a column has any negative values, the mean is significantly greater than the min.
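
The arithmetic quoted in these notes (the 0.64 chance of at least one error, the Bonferroni cutoff, the 45 pairwise comparisons, and the F ratio in the Anova table) can be rechecked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the sums of squares assume the decimal points sit where the table's own arithmetic implies (for example 1173.290 rather than 1173290).

```python
from math import comb

alpha = 0.05  # usual per-test cutoff used in the notes

# Chance of at least one false positive among 20 independent tests at alpha = 0.05
n_tests = 20
p_any_error = 1 - (1 - alpha) ** n_tests

# Bonferroni inequality: P(any error) is at most the sum of the per-test error rates
# (capped at 1, since a probability cannot exceed 1)
bonferroni_bound = min(1.0, n_tests * alpha)

# Bonferroni cutoff when running 10 t tests at once
per_test_cutoff = alpha / 10

# Number of pairwise comparisons among 10 groups
n_pairs = comb(10, 2)

# F ratio rebuilt from the Anova table (decimal placement is an assumption)
ss_model, df_model = 1173.290, 9
ss_error, df_error = 11324.500, 90
ms_model = ss_model / df_model   # mean square = sum of squares / DF
ms_error = ss_error / df_error
f_ratio = ms_model / ms_error

print(f"P(at least one error in {n_tests} tests) = {p_any_error:.2f}")
print(f"Bonferroni bound on that probability = {bonferroni_bound:.2f}")
print(f"Per-test cutoff for 10 tests = {per_test_cutoff:.3f}")
print(f"Pairwise comparisons among 10 groups = {n_pairs}")
print(f"Mean squares: model {ms_model:.3f}, error {ms_error:.3f}; F = {f_ratio:.4f}")
```

With 20 tests the Bonferroni bound reaches 1, so it only becomes informative once the per-test cutoff is reduced, which is what the 0.05/10 rule does.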