Introduction to Statistics I STAT 1000

UCONN

Date Created: 09/17/15

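14 Comparing Two Population Means

1st data set: $X_1, \ldots, X_{n_1}$ with $E X_i = \mu_1$.
2nd data set: $Y_1, \ldots, Y_{n_2}$ with $E Y_i = \mu_2$.

How do we compare the two population means?

Independent Sampling: Large Samples

1. Setup
   Null hypothesis: $H_0: \mu_1 - \mu_2 = D_0$
   Alternative hypothesis:
   o $H_a: \mu_1 - \mu_2 > D_0$, or
   o $H_a: \mu_1 - \mu_2 < D_0$, or
   o $H_a: \mu_1 - \mu_2 \ne D_0$

2. Test Statistic
   The test statistic is given by

   $$z = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

   where $\sigma_1^2$ and $\sigma_2^2$ are the population variances (estimated by the sample variances $s_X^2$ and $s_Y^2$ when unknown).

3. Rejection Region
   The rejection region, matching the three alternatives in order, is
   o $(z_\alpha, \infty)$, or
   o $(-\infty, -z_\alpha)$, or
   o $(-\infty, -z_{\alpha/2}) \cup (z_{\alpha/2}, \infty)$

Assumptions:
   o The two samples are randomly selected in an independent manner from two populations.
   o Both $n_1$ and $n_2$ are greater than 30.

Independent Sampling: Small Samples

1. Setup
   Null hypothesis: $H_0: \mu_1 - \mu_2 = D_0$
   Alternative hypothesis:
   o $H_a: \mu_1 - \mu_2 > D_0$, or
   o $H_a: \mu_1 - \mu_2 < D_0$, or
   o $H_a: \mu_1 - \mu_2 \ne D_0$

2. Test Statistic
   The test statistic is given by

   $$t = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}}$$

3. Rejection Region
   The rejection region, matching the three alternatives in order, is
   o $(t_{\alpha,\nu}, \infty)$, or
   o $(-\infty, -t_{\alpha,\nu})$, or
   o $(-\infty, -t_{\alpha/2,\nu}) \cup (t_{\alpha/2,\nu}, \infty)$
   where the number of degrees of freedom is $\nu = \min(n_1 - 1, n_2 - 1)$.

Assumptions:
   o The two samples are randomly selected in an independent manner from two populations.
   o Both sampled populations are approximately normal.

Example 1

A processor of recycled aluminum cans is concerned about the level of impurities (principally other metals) contained in lots from two sources. Laboratory analysis of sample lots yields the following data (kilograms of impurities per hundred kilograms of product):

   $n_1 = 12$, $\bar{x} = 3.267$, $s_X = 0.676$
   $n_2 = 12$, $\bar{y} = 3.617$, $s_Y = 1.365$

Can the processor conclude, using a significance level of 5%, that there is a nonzero difference in means?

Solution

1. Setup
   Let $\mu_i$, $i = 1, 2$, be the true mean of all the lots from the $i$-th source.
   $H_0: \mu_1 - \mu_2 = 0$
   $H_a: \mu_1 - \mu_2 \ne 0$

2. Test Statistic
   Note that the sample sizes are small, so we use the $t$ statistic:

   $$t = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} = \frac{3.267 - 3.617 - 0}{\sqrt{\frac{0.676^2}{12} + \frac{1.365^2}{12}}} \approx -0.796$$

3. Rejection Region
   With $\nu = \min(12 - 1, 12 - 1) = 11$ and $\alpha/2 = 0.025$:

   $$RR = (-\infty, -t_{\alpha/2,\nu}) \cup (t_{\alpha/2,\nu}, \infty) = (-\infty, -t_{.025,11}) \cup (t_{.025,11}, \infty) = (-\infty, -2.201) \cup (2.201, \infty)$$

4. Conclusion
   The test statistic does not fall into the rejection region. Therefore, we cannot reject the null hypothesis at the 5% significance level. That is, our data do not support the research hypothesis that there is a nonzero difference between the two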
sources.

Paired Difference Experiment: Large Samples ($n_D > 30$)

Either the same subjects (for example, the same dogs) are measured twice, or the subjects are matched in pairs before the measurements.

1. Setup
   Null hypothesis: $H_0: \mu_D = D_0$
   Alternative hypothesis:
   o $H_a: \mu_D > D_0$, or
   o $H_a: \mu_D < D_0$, or
   o $H_a: \mu_D \ne D_0$

2. Test Statistic
   The test statistic is given by

   $$z = \frac{\bar{X}_D - D_0}{s_D / \sqrt{n_D}}$$

   where $\bar{X}_D$ is the sample mean difference, $s_D$ is the sample standard deviation of the differences, and $n_D$ is the number of pairs. Under the null hypothesis, it has approximately a standard normal distribution.

3. Rejection Region
   The rejection region, matching the three alternatives in order, is
   o $(z_\alpha, \infty)$, or
   o $(-\infty, -z_\alpha)$, or
   o $(-\infty, -z_{\alpha/2}) \cup (z_{\alpha/2}, \infty)$

Assumptions:
   o The differences are randomly selected from the population of differences.

Paired Difference Experiment: Small Samples ($n_D \le 30$)

1. Setup
   Null hypothesis: $H_0: \mu_D = D_0$
   Alternative hypothesis:
   o $H_a: \mu_D > D_0$, or
   o $H_a: \mu_D < D_0$, or
   o $H_a: \mu_D \ne D_0$

2. Test Statistic
   The test statistic is given by

   $$t = \frac{\bar{X}_D - D_0}{s_D / \sqrt{n_D}}$$

   where $\bar{X}_D$ is the sample mean difference, $s_D$ is the sample standard deviation of the differences, and $n_D$ is the number of pairs. Under the null hypothesis, it has a $t$ distribution with $n_D - 1$ degrees of freedom.

3. Rejection Region
   The rejection region, matching the three alternatives in order, is
   o $(t_{\alpha,n_D-1}, \infty)$, or
   o $(-\infty, -t_{\alpha,n_D-1})$, or
   o $(-\infty, -t_{\alpha/2,n_D-1}) \cup (t_{\alpha/2,n_D-1}, \infty)$

Assumptions:
   o The differences are randomly selected from the population of differences.
   o The population distribution of the differences is approximately normal.

Example 2

A tasting panel of 15 people is asked to rate two new kinds of tea on a scale ranging from 0 to 100: 25 means "I would try to finish it only to be polite," 50 means "I would drink it but not buy it," 75 means "It's about as good as any tea I know," and 100 means "It's superb; I would drink nothing else." The difference in rating is recorded for each person. The mean of the differences is 7 and the standard deviation is 16.08. Do these data indicate a difference in ratings at $\alpha = 5\%$?

Solution

1. Setup
   $H_0: \mu_D = 0$
   $H_a: \mu_D \ne 0$

2. Test Statistic
   The test statistic is equal to

   $$t = \frac{\bar{X}_D - D_0}{s_D / \sqrt{n_D}} = \frac{7 - 0}{16.08 / \sqrt{15}} \approx 1.69$$

3. Rejection Region
   With $n_D - 1 = 14$ degrees of freedom and $\alpha/2 = 0.025$:

   $$RR = (-\infty, -t_{\alpha/2,n_D-1}) \cup (t_{\alpha/2,n_D-1}, \infty) = (-\infty, -2.145) \cup (2.145, \infty)$$

4. Conclusion
   The test statistic does not fall into the rejection region.
   Therefore, we cannot reject the null hypothesis at the 5% significance level. That is, our data do not support the research hypothesis that there is a difference in ratings of these two kinds of tea.

Exercises: p. 420: 7.14-7.15, 7.19-7.20; p. 434: 7.35, 7.38

References

1. Chase and Bown, General Statistics.
2. Hildebrand and Ott, Statistical Thinking for Managers.
3. Keller and Warrack, Statistics for Management and Economics.
4. McClave, Benson, and Sincich, A First Course in Business Statistics.

Exercises

1. Business schools A and B reported the following summary of GMAT (Graduate Management Aptitude Test) verbal scores:

   School   n     x̄       s²
   A        201   34.75   48.59
   B        115   33.74   30.68

   At a 5% level of significance, is there sufficient evidence to believe there is a difference in the population means?
   a. Use the classical approach.
   b. Use the P-value approach.

2. Assume independent samples from approximately normal populations with equal variances. The data are given here:

   Sample   n    x̄    s²
   A        10   74   60
   B        13   81   40

   Is there sufficient evidence to conclude that the mean of A is smaller than the mean of B? Use the 5% significance level.

3. Twenty-four males, age 25-29, were selected from the Framingham Heart Study. Twelve were smokers and 12 were nonsmokers. The subjects were paired, with one being a smoker and the other a nonsmoker. Otherwise, each pair was similar with regard to age and physical characteristics. Systolic blood pressure readings were as follows:

   Smokers (A)   Nonsmokers (B)
   122           114
   146           134
   120           114
   114           116
   124           138
   126           110
   118           112
   128           116
   130           132
   134           126
   116           108
   130           116

   List the differences (A - B) and verify that $\bar{X}_D = 6$ and $s_D = 8.40$. Use a 5% level of significance to determine whether the data indicate a difference in mean systolic blood pressure levels for the populations from which the two groups were selected. You may assume that the population of differences is approximately normal.

4. A salesman for a shoe company claimed that runners would record quicker times, on the average, with the company's brand of sneaker. A track coach decided to test the claim. The coach selected eight
   runners. Each runner ran two 100-yard dashes on different days. In one 100-yard dash, the runners wore the sneakers supplied by the school; in the other, they wore the sneakers supplied by the salesman. Each runner was randomly assigned the sneakers to wear for the first run. Their times, measured in seconds, were as follows:

   A: with shoe company's sneakers   B: with school's sneakers
   10.8                              11.4
   12.3                              12.5
   10.7                              10.8
   12.0                              11.7
   10.6                              10.9
   11.5                              11.8
   12.1                              12.2
   11.2                              11.7

   Note: For the differences (A - B), $\bar{X}_D = -0.225$ and $s_D = 0.276$. Assume the population of differences is approximately normal.
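
As a numerical check on the two worked examples in these notes, the short Python sketch below recomputes the test statistics for Example 1 (independent small samples) and Example 2 (paired differences). The function names are illustrative and not part of the notes. The critical values 2.201 and 2.145 are copied from the examples rather than computed, so the sketch needs only the standard library.

```python
import math


def two_sample_t(xbar, ybar, s_x, s_y, n1, n2, d0=0.0):
    """Small-sample statistic: (xbar - ybar - D0) / sqrt(s_x^2/n1 + s_y^2/n2)."""
    standard_error = math.sqrt(s_x ** 2 / n1 + s_y ** 2 / n2)
    return (xbar - ybar - d0) / standard_error


def paired_t(dbar, s_d, n_d, d0=0.0):
    """Paired statistic: (dbar - D0) / (s_d / sqrt(n_d))."""
    return (dbar - d0) / (s_d / math.sqrt(n_d))


def rejects_two_sided(statistic, critical_value):
    """True when the statistic lies in (-inf, -c) U (c, inf)."""
    return abs(statistic) > critical_value


# Example 1: the critical value t_{.025,11} = 2.201 is taken from the notes.
t_example1 = two_sample_t(3.267, 3.617, 0.676, 1.365, 12, 12)
reject_example1 = rejects_two_sided(t_example1, 2.201)

# Example 2: the critical value t_{.025,14} = 2.145 is taken from the notes.
t_example2 = paired_t(7, 16.08, 15)
reject_example2 = rejects_two_sided(t_example2, 2.145)

print(f"Example 1: t = {t_example1:.3f}, reject H0: {reject_example1}")
print(f"Example 2: t = {t_example2:.3f}, reject H0: {reject_example2}")
```

Neither statistic falls in its two-sided rejection region, which agrees with the conclusions of both examples. The same two functions can be applied to the summary statistics given in the exercises, with the appropriate critical value looked up in a t table.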

