# ELEM STATISTICS [C3T1G1] MATH 220

JMU

These 23 pages of class notes were uploaded by Eunice Schoen on Saturday, September 26, 2015. The notes belong to MATH 220 at James Madison University, taught by Steven Garren in Fall.

Date Created: 09/26/15

## Chapter 9: Comparing Two Groups

January 8, 2009

- We may wish to compare two treatment groups in experimental design.
  - Example: In the agricultural setting, which type of seed produces a better yield per acre?
  - Example: Which of two drugs is better?
- We may wish to compare two populations in sample surveys.
  - Example: Compare the heights of females in Mexico vs. the United States, or the likelihood of developing cancer between females and males.

When comparing two treatment groups in experimental design or two populations in sample surveys, we may use

(a) Independent samples (Sections 9.1 and 9.2), OR

(b) Dependent samples (matched pairs), IDEAL (Section 9.4).

Example: Matched pairs from Section 4.4.

When matched pairs are not possible, use independent samples.

Example: Independent samples from Section 4.4.

### 9.1 Categorical Response: How Can We Compare Two Proportions?

Z-test and Z-confidence interval on the difference between two population proportions, $p_1 - p_2$.

Example: Let $p_1$ = unknown fixed population proportion of female adults at least 21 years old who have a high school diploma. Let $p_2$ = unknown fixed population proportion of male adults at least 21 years old who have a high school diploma. Are these two population proportions the same? What is the difference between these two population proportions?

Example: Consider an experiment involving prostate cancer and surgery, as reported by the New England Journal of Medicine (2002). Does surgery reduce the death rate due to prostate cancer within 6.2 additional years for prostate cancer patients? From 1989 through 1999, 695 Scandinavian men with newly diagnosed prostate cancer were randomly assigned to surgery (radical prostatectomy) or control.

| Treatment | Group | Died | Survived | Sample size | Death rate |
|---|---|---|---|---|---|
| 1 | control | 31 | 317 | $n_1 = 348$ | |
| 2 | surgery | 16 | 331 | $n_2 = 347$ | |

Let $p_1$ be the population proportion of the control group who would die within 6.2 years from prostate cancer. Let $p_2$ be the population proportion of the surgery group who would die within 6.2 years from prostate cancer. To be continued below.

What is a reasonable point estimate of $p_1 - p_2$?

$$\mu_{\hat{p}_1 - \hat{p}_2} = \mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2,$$

i.e., the population mean of the difference between the two sample proportions is the same as the difference between the two population proportions.

For independent or nearly independent observations,

$$\sigma^2_{\hat{p}_1 - \hat{p}_2} = \sigma^2_{\hat{p}_1} + \sigma^2_{\hat{p}_2} = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2},$$

and hence

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}.$$

Suppose all observations are independent or nearly independent, and the sample sizes are reasonably large. Then by the Central Limit Theorem:

1. $$\frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}} \overset{\text{approx}}{\sim} N(0, 1)$$ if there are at least 5 successes and at least 5 failures in each of the two samples, and

2. A confidence interval on unknown fixed $p_1 - p_2$ is $$(\hat{p}_1 - \hat{p}_2) \pm z \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}},$$ if there are at least 10 successes and at least 10 failures in each of the two samples.

The number of successes in sample 1 is $n_1 \hat{p}_1$. The number of successes in sample 2 is $n_2 \hat{p}_2$. The number of failures in sample 1 is $n_1(1 - \hat{p}_1)$. The number of failures in sample 2 is $n_2(1 - \hat{p}_2)$.

**Confidence Interval on $p_1 - p_2$**

Example: Prostate cancer and surgery.

(a) Determine the point estimate of $p_1 - p_2$.

(b) Interpret your above point estimate in regular English. We estimate that for 4.3% of patients, surgery makes a positive difference in terms of surviving vs. not surviving an additional 6.2 years, but NOT for the remaining 95.7% of patients.

(c) Check the assumptions for constructing a confidence interval.

(d) Construct a 95% confidence interval on $p_1 - p_2$.

t-table, p. A3:

| Confidence level | 80% | 90% | 95% | 98% | 99% | 99.8% |
|---|---|---|---|---|---|---|
| Right-tail probability | $t_{.100}$ | $t_{.050}$ | $t_{.025}$ | $t_{.010}$ | $t_{.005}$ | $t_{.001}$ |
| df = 50 | 1.299 | 1.676 | 2.009 | 2.403 | 2.678 | 3.261 |
| df = 60 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 | 3.232 |
| df = 80 | 1.292 | 1.664 | 1.990 | 2.374 | 2.639 | 3.195 |
| df = 100 | 1.290 | 1.660 | 1.984 | 2.364 | 2.626 | 3.174 |
| df = $\infty$ | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 | 3.090 |

(e) State the layman's interpretation and the mathematically rigorous interpretation of your above confidence interval.

Layman's interpretation: We are 95% confident that the difference in population death rates of control and surgery is between 0.58% and 8.02%.

Mathematically rigorous interpretation: If we repeat the sampling procedure many times to construct many 95% confidence intervals on $p_1 - p_2$, the difference in population death rates of control and surgery, then approximately 95% of these 95% confidence intervals will contain the true value of $p_1 - p_2$.

**Hypothesis Testing on $p_1 - p_2$**

Again, assume the observations are independent or nearly independent.

What is a reasonable point estimate of $p_1 - p_2$?

What is the overall sample proportion of successes?

Under $H_0$, the standard deviation of $\hat{p}_1 - \hat{p}_2$ is

$$\sqrt{\frac{p_0(1 - p_0)}{n_1} + \frac{p_0(1 - p_0)}{n_2}},$$

which is estimated by

$$\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}.$$

Recall: If all observations are independent or nearly independent, and the sample sizes are reasonably large, then by the Central Limit Theorem,

$$\frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}} \overset{\text{approx}}{\sim} N(0, 1).$$

Determine the standardized test statistic.

Rule of thumb for hypothesis tests on the difference between two proportions: If there are at least 5 successes and at least 5 failures in each of the two samples, then the standardized test statistic is approximately standard normal.

Example: Consider an experiment involving aspirin and heart attacks, as reported by the New England Journal of Medicine (1988). Male physicians aged 40 to 84 in the United States in 1982 participated in the double-blinded randomized controlled experiment. Treatment was one 325-milligram aspirin tablet every other day. Results were determined about 5 years later. Test at level 0.05 whether or not aspirin reduces the likelihood of a heart attack in this population, in comparison to a placebo.

| Treatment | Group | Heart attack: yes | Heart attack: no | Sample size | Sample proportion of heart attacks |
|---|---|---|---|---|---|
| 1 | placebo | 189 | 10,845 | $n_1 = 11{,}034$ | |
| 2 | aspirin | 104 | 10,933 | $n_2 = 11{,}037$ | |
| | total | 293 | 21,778 | 22,071 | |

(a) State the notation. Let $p_1$ be the population proportion of placebo users who would suffer a heart attack. Let $p_2$ be the population proportion of aspirin users who would suffer a heart attack.

(b) State the hypotheses.

(c) Check the assumptions for performing a significance test (i.e., hypothesis test).

(d) Determine the point estimate of $p_1 - p_2$.

(e) Determine the value of the standardized test statistic.

(f) Determine the P-value.

(g) State the conclusion in statistical terms and in regular English. We conclude that use of aspirin results in a lower likelihood of a heart attack in this population of male physicians aged 40 to 84 in the United States, in comparison to a placebo.

Note: For a two-sided test, the P-value is twice the tail probability of the appropriate one-sided test.

### 9.2 Quantitative Response: How Can We Compare Two Means?

In this section we focus on independent observations, not matched pairs. Construct independent t-test and independent t-confidence interval.

Population 1: Take independent or nearly independent observations from a population with mean $\mu_1$ and finite standard deviation $\sigma_1$. Let $\bar{X}_1$ be the sample mean and $s_1$ be the sample standard deviation, based on a sample of size $n_1$.

Population 2: Take independent or nearly independent observations from a population with mean $\mu_2$ and finite standard deviation $\sigma_2$. Let $\bar{X}_2$ be the sample mean and $s_2$ be the sample standard
deviation, based on a sample of size $n_2$.

Assume that the two samples are independent of each other.

Question: Is $\mu_1 = \mu_2$, OR is $\mu_1 - \mu_2 = 0$? Estimate $\mu_1 - \mu_2$.

What is the point estimate of $\mu_1 - \mu_2$?

What is the mean of $\bar{X}_1 - \bar{X}_2$?

It can be shown that since the samples are independent or nearly independent, then

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

For the rest of this section, assume that all observations in the samples are independent or nearly independent, and both $\sigma_1$ and $\sigma_2$ are finite.

If $n_1$ and $n_2$ are both large (usually $n_1 \ge 30$ and $n_2 \ge 30$ if none of the tails of the two distributions are too heavy), or if the two populations are approximately normal, then

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \quad \text{(for inference, NOT PRACTICAL)}$$

is approximately standard normal, and

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \quad \text{(PRACTICAL)}$$

is approximately $t$ distributed, so a confidence interval on $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

Degrees of freedom: When $s_1$ and $s_2$ are similar and $n_1$ and $n_2$ are close, then the degrees of freedom is close to $n_1 + n_2 - 2$. Otherwise, the degrees of freedom can be approximated conservatively by the smaller of $n_1 - 1$ and $n_2 - 1$.

Listed in your textbook is a very ugly but more accurate formula for degrees of freedom, so we simply will use the above approximation.

How can we verify the normality assumption? Again, the t procedures are robust.

Example: A study of zinc-deficient mothers was conducted to determine whether zinc supplementation during pregnancy results in babies with increased mean weights at birth. Data are available at Goldenberg et al., JAMA, 1995, August 9, 274(6), 463-468.

| | Treatment 1: Zinc supplement group | Treatment 2: Placebo group |
|---|---|---|
| Sample size | $n_1$ | $n_2$ |
| Sample mean | $\bar{X}_1 = 3214$ g | $\bar{X}_2 = 3088$ g |
| Sample standard deviation | $s_1 = 669$ g | $s_2 = 728$ g |

Is there sufficient evidence to support the claim that zinc supplementation results in increased mean birth weight, in comparison to a placebo? Test at level
$\alpha = 0.05$.

(a) Do we need to assume that the two populations for birth weight are approximately normally distributed?

(b) Define your notation. Let $\mu_1$ = unknown population mean birth weight in the zinc-supplemented group. Let $\mu_2$ = unknown population mean birth weight in the placebo group.

(c) State the hypotheses.

(d) Determine the value of the standardized test statistic. Let $\bar{X}_1$ = sample mean birth weight in the zinc-supplemented group. Let $\bar{X}_2$ = sample mean birth weight in the placebo group.

(e) Determine the estimated number of degrees of freedom.

(f) Determine the P-value.

t-table, p. A3:

| Confidence level | 80% | 90% | 95% | 98% | 99% | 99.8% |
|---|---|---|---|---|---|---|
| Right-tail probability | $t_{.100}$ | $t_{.050}$ | $t_{.025}$ | $t_{.010}$ | $t_{.005}$ | $t_{.001}$ |
| df = 50 | 1.299 | 1.676 | 2.009 | 2.403 | 2.678 | 3.261 |
| df = 60 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 | 3.232 |
| df = 80 | 1.292 | 1.664 | 1.990 | 2.374 | 2.639 | 3.195 |
| df = 100 | 1.290 | 1.660 | 1.984 | 2.364 | 2.626 | 3.174 |
| df = $\infty$ | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 | 3.090 |

Standard normal table, pp. A1-A2:

| $z$ | .00 | .01 | .02 | .03 | .04 | .05 | .06 | .07 | .08 | .09 |
|---|---|---|---|---|---|---|---|---|---|---|
| $-2.2$ | .0139 | .0136 | .0132 | .0129 | .0125 | .0122 | .0119 | .0116 | .0113 | .0110 |
| $-2.1$ | .0179 | .0174 | .0170 | .0166 | .0162 | .0158 | .0154 | .0150 | .0146 | .0143 |
| $-2.0$ | .0228 | .0222 | .0217 | .0212 | .0207 | .0202 | .0197 | .0192 | .0188 | .0183 |

(g) State the conclusion in statistical terms and in regular English. We conclude that zinc supplementation during pregnancy among zinc-deficient mothers results in babies with increased mean weight at birth, in comparison to a placebo.

(h) Construct a 99% confidence interval on $\mu_1 - \mu_2$, using the t-table excerpt from part (f).

Layman's interpretation: We are 99% confident that the difference in population mean birth weights between placebo users and zinc users among zinc-deficient mothers lies between $-23.7$
grams and 275.7 grams.

Mathematically rigorous interpretation: If we repeat the sampling procedure many times to produce many 99% confidence intervals on $\mu_1 - \mu_2$, the difference in population mean birth weights between placebo users and zinc users among zinc-deficient mothers, then approximately 99% of these 99% confidence intervals will contain the true value of $\mu_1 - \mu_2$.

(i) Construct a 99% confidence interval on $\mu_2 - \mu_1$.

### 9.4 How Can We Analyze Dependent Samples?

Here we pair the observations. Construct paired t-test and paired t-confidence interval.

What are some examples of paired observations?

We assume the pairs of observations are independent or nearly independent, but we do NOT necessarily have independence within a pair.

Let $d$ be the observation in sample 1 minus the observation in sample 2.

Again we make inferences on the difference between two means, $\mu_1 - \mu_2$, or the mean difference, $\mu_d$.

What is a reasonable point estimate of $\mu_d$?

Assumptions:

1. The observations are reasonably paired.
2. The differences are independent or nearly independent, and $\sigma_d$ is finite.
3. $n$ is large (usually $n \ge 30$ if neither tail of the distribution of the differences is too heavy), or the differences are approximately normal.

Then, the standardized test statistic is

$$\frac{\bar{X}_d - \mu_d}{s_d / \sqrt{n}} \overset{\text{approx}}{\sim} t_{n-1}.$$

Confidence interval on $\mu_d$ is

$$\bar{X}_d \pm t_{n-1} \frac{s_d}{\sqrt{n}}.$$

Example: Hypothetical data. Test at level $\alpha = 0.05$ whether the population mean systolic reading of blood pressure is reduced by more than 10 when using a placebo. The data consist of the following before and after blood pressure readings of five patients.

| Patient | Before | After |
|---|---|---|
| 1 | 190 | 180 |
| 2 | 220 | 205 |
| 3 | 242 | 214 |
| 4 | 175 | 156 |
| 5 | 201 | 177 |

(a) Define your notation. Let $d$ be the difference in blood pressure, before minus after. Let $\mu_d$ be the unknown population mean difference in blood pressure.

(b) State the hypotheses.

(c) Check the assumptions.

(d) Determine the value of the
standardized test statistic. Let $\bar{X}_d$ (or $\bar{d}$) be the sample mean difference in blood pressure. Let $s_d$ be the sample standard deviation of the difference in blood pressure. Goal: Construct a one-sample t-test on $\mu_d$.

(e) How many degrees of freedom are associated with this test?

(f) Determine the P-value.

t-table, p. A3:

| Confidence level | 80% | 90% | 95% | 98% | 99% | 99.8% |
|---|---|---|---|---|---|---|
| Right-tail probability | $t_{.100}$ | $t_{.050}$ | $t_{.025}$ | $t_{.010}$ | $t_{.005}$ | $t_{.001}$ |
| df = 1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 | 318.309 |
| df = 2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 | 22.327 |
| df = 3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 | 10.215 |
| df = 4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 | 7.173 |
| df = 5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 | 5.893 |

(g) State the conclusion in statistical terms and in regular English. We conclude that the population mean systolic reading of blood pressure is reduced by more than 10 when using a placebo.

(h) Construct a 98% confidence interval on $\mu_d$, using the t-table excerpt from part (f).

Layman's interpretation: We are 98% confident that $\mu_d$, the population mean reduction in systolic reading of blood pressure due to the placebo effect, is between 7.27 and 31.13 when using a placebo.

Mathematically rigorous interpretation: If we repeat the sampling procedure many times to produce many 98% confidence intervals on $\mu_d$, the population mean reduction in systolic reading of blood pressure due to the
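
As a rough numerical check of parts (d), (e), and (h) of the blood pressure example, the sketch below computes the paired test statistic and the 98% confidence interval on the mean difference. It assumes the ten readings are listed as consecutive (before, after) pairs, tests against a hypothesized mean reduction of 10, and takes the critical value 3.747 from the t-table row for 4 degrees of freedom. The function names are illustrative and do not come from the notes.

```python
import math
import statistics

# Readings from the example, assumed to be listed as consecutive
# (before, after) pairs for the five patients.
before = [190, 220, 242, 175, 201]
after = [180, 205, 214, 156, 177]


def paired_summary(before, after):
    """Return (n, mean of differences, sample sd of differences)."""
    diffs = [b - a for b, a in zip(before, after)]  # d = before minus after
    return len(diffs), statistics.mean(diffs), statistics.stdev(diffs)


def paired_t_statistic(before, after, mu0):
    """Standardized test statistic for H0: mu_d = mu0, with df = n - 1."""
    n, d_bar, s_d = paired_summary(before, after)
    return (d_bar - mu0) / (s_d / math.sqrt(n)), n - 1


def paired_t_interval(before, after, t_crit):
    """Confidence interval d_bar +/- t_crit * s_d / sqrt(n)."""
    n, d_bar, s_d = paired_summary(before, after)
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


t_stat, df = paired_t_statistic(before, after, mu0=10)
# 3.747 is the t-table entry for df = 4 and right-tail probability .010,
# which corresponds to 98% confidence.
low, high = paired_t_interval(before, after, t_crit=3.747)
print(f"t = {t_stat:.3f} with df = {df}")
print(f"98% CI on mu_d: ({low:.2f}, {high:.2f})")
```

With these data the statistic is about 2.889 on 4 degrees of freedom. Since 2.776 < 2.889 < 3.747 in the df = 4 row, the table brackets the one-sided P-value between .010 and .025, and the computed interval rounds to the endpoints quoted in the layman's interpretation.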

