# 562 Class Note for STAT 51100 with Professor Levine at Purdue

This 18-page set of class notes was uploaded by an elite notetaker on Friday, February 6, 2015.

Date Created: 02/06/15

Statistics 511: Statistical Methods, Purdue University, Dr. Levine, Fall 2006

Lecture 20: Two-Sample Test for Proportions and the Variance Test (Devore, Sections 9.4-9.5)

AUG 2006

## Difference between Population Proportions

- Consider two different populations in which the proportions of individuals possessing a desired property are $p_1$ and $p_2$, respectively. We denote the number of individuals in each sample possessing the desired property by $X$ and $Y$, respectively.
- If the respective sample sizes $m$ and $n$ are small relative to the corresponding population sizes, we can assume that $X \sim \mathrm{Bin}(m, p_1)$ and $Y \sim \mathrm{Bin}(n, p_2)$.
- The obvious estimator for $p_1 - p_2$ is the difference in sample proportions: if $\hat p_1 = X/m$ and $\hat p_2 = Y/n$, then $\hat p_1 - \hat p_2$ can be used to estimate $p_1 - p_2$.
- It is easy to show that $E(\hat p_1 - \hat p_2) = p_1 - p_2$ and

$$V(\hat p_1 - \hat p_2) = \frac{p_1(1 - p_1)}{m} + \frac{p_2(1 - p_2)}{n}.$$

- The commonly used notation is $q_i = 1 - p_i$, $i = 1, 2$.
- When $m$ and $n$ are large enough, we can claim that

$$\frac{\hat p_1 - \hat p_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{m} + \dfrac{p_2 q_2}{n}}}$$

is approximately standard normal, and use this fact to construct a $z$ test.

## A Large-Sample Test Procedure

- We consider $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \ne 0$.
- Under $H_0$, denote $p_1 = p_2 = p$ and $q_1 = q_2 = q = 1 - p$. Then the variable

$$Z = \frac{\hat p_1 - \hat p_2 - 0}{\sqrt{p q \left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$

has approximately a standard normal distribution under $H_0$.
- It cannot be used for testing. Why? Because $p$ and $q$ are not known.
- Therefore we estimate

$$\hat p = \frac{X + Y}{m + n} = \frac{m}{m + n}\,\hat p_1 + \frac{n}{m + n}\,\hat p_2, \qquad \hat q = 1 - \hat p,$$

and use the statistic

$$Z = \frac{\hat p_1 - \hat p_2}{\sqrt{\hat p \hat q \left(\dfrac{1}{m} + \dfrac{1}{n}\right)}},$$

which is approximately standard normal when both $m$ and $n$ are large.

## Example: Plea and Sentencing

- Does pleading guilty in a criminal trial result in a more lenient sentencing outcome?
- The first group of $m = 191$ consists of defendants who pleaded guilty; $x = 101$ of them were sentenced to prison terms, with $\hat p_1 = .529$. The second group of $n = 64$ pleaded not guilty; $y = 56$ of them received prison terms, with $\hat p_2 = .875$.
- Thus we have $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \ne 0$. At level $\alpha = 0.01$, $H_0$ should be rejected if $z \ge 2.58$ or $z \le -2.58$. The combined estimate of the success proportion is $\hat p = (101 + 56)/(191 + 64) = .616$.
- The value of the test statistic is

$$z = \frac{.529 - .875}{\sqrt{(.616)(.384)\left(\dfrac{1}{191} + \dfrac{1}{64}\right)}} = -4.94,$$

and therefore $H_0$ must be rejected. Note that the two-sided $P$-value, $2[1 - \Phi(4.94)]$, is below 0.0004 (it is on the order of $10^{-6}$).
- As a remark, this outcome has been confirmed for many different types of crimes (burglary, robbery, etc.) as well as for defendants with different prior records (none, some but no prison, prison, etc.).

## Type II Error Probabilities

- Under the alternative hypothesis $p_1$ and $p_2$ are different, so there is no common $p$ and $q$. Thus

$$\sigma_{\hat p_1 - \hat p_2} = \sqrt{\frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}}.$$

- The expressions are rather complicated; for example, for $H_a: p_1 - p_2 > 0$,

$$\beta(p_1, p_2) = \Phi\left[\frac{z_\alpha \sqrt{\bar p \bar q \left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma_{\hat p_1 - \hat p_2}}\right],$$

where $\bar p = (m p_1 + n p_2)/(m + n)$ and $\bar q = (m q_1 + n q_2)/(m + n)$.

## Sample Size Calculations

- For a specified alternative $p_1 - p_2 = d$, the common sample size $n = m$ needed to achieve $\beta(p_1, p_2) = \beta$ can be easily determined. As an example, for an upper-tailed test we have

$$n = \frac{\left[z_\alpha \sqrt{(p_1 + p_2)(q_1 + q_2)/2} + z_\beta \sqrt{p_1 q_1 + p_2 q_2}\right]^2}{d^2}.$$

## Example: Salk Polio Vaccine Trial

- Consider the 1954 Salk polio vaccine trials. The experiment was a double-blind one.
- $p_1$ and $p_2$ are the probabilities of getting paralytic polio in the control and vaccinated groups, respectively. We test $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 > 0$. If the true control rate is $p_1 = 0.0003$, then $p_2 = 0.00015$ (halving the rate) would be a success.
- We use a level $\alpha = 0.05$ test and want to find sample sizes such that $\beta = 0.1$ for the above values of $p_1$ and $p_2$.
- If the sample sizes are equal, then applying the sample size formula we have $n = m \approx 171{,}000$. In reality, samples of about 200,000 were used and $z = 6.43$.

## A Large-Sample Confidence Interval for $p_1 - p_2$

- It is easy to derive a simple $100(1 - \alpha)\%$ CI for the difference of proportions as

$$\hat p_1 - \hat p_2 \pm z_{\alpha/2} \sqrt{\frac{\hat p_1 \hat q_1}{m} + \frac{\hat p_2 \hat q_2}{n}}.$$

As a remark, note that the estimated standard deviation of $\hat p_1 - \hat p_2$ used here is not equal to the one used under the null hypothesis, which relies on the pooled estimate $\hat p$.
- It is usually recommended in practice to use $\tilde p_i$ and the respective $\tilde q_i$ instead of the traditional $\hat p_i$ and $\hat q_i$ because of problems with coverage.

## Example: Cancer Treatment

- Consider the effectiveness of the combined cancer treatment (radiation and chemotherapy) vs. just chemotherapy.

| | Chemotherapy | Chemotherapy and radiation |
|---|---|---|
| 15-year survival | 76 | 98 |
| Less than 15-year survival | 78 | 66 |

- Sample proportions are $\hat p_1 = 76/154 = .494$ and $\hat p_2 = 98/164 = .598$.
- The 99% confidence interval for the difference of proportions using the traditional interval is

$$.494 - .598 \pm 2.58 \sqrt{\frac{(.494)(.506)}{154} + \frac{(.598)(.402)}{164}} = -.104 \pm .143 = (-.247, .039).$$

- This is somewhat inconclusive since 0 is inside the confidence interval. You can easily check that the improved version in this case produces a nearly identical CI.

## Inferences about Two Population Variances

- The $F$ distribution depends on two parameters, $\nu_1$ and $\nu_2$; they are called the number of numerator degrees of freedom and the number of denominator degrees of freedom. An $F$ random variable is also nonnegative.
- For two independent chi-squared variables $X_1$ and $X_2$ with $\nu_1$ and $\nu_2$ degrees of freedom, the ratio

$$F = \frac{X_1/\nu_1}{X_2/\nu_2}$$

has an $F$ distribution.

Figure 1: The $F$ distribution density curve property ($F$ density curve with shaded tail area).

- The density is not symmetric; however, it can be shown that

$$F_{1-\alpha,\nu_1,\nu_2} = \frac{1}{F_{\alpha,\nu_2,\nu_1}}.$$

- The last property allows us to tabulate only upper- or lower-tail critical values.

## The Main Property

- Consider $X_1, \dots, X_m$ taken from a normal distribution with variance $\sigma_1^2$, and $Y_1, \dots, Y_n$ sampled from another normal distribution with variance $\sigma_2^2$, independently of the $X$'s. Then

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has an $F$ distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$.
- Clearly, it would be an indication of $\sigma_1^2$ being rather different from $\sigma_2^2$ if the ratio of sample variances $S_1^2/S_2^2$ is very different from 1.

## The Test for Equality of Variances

- We test $H_0: \sigma_1^2 = \sigma_2^2$ vs. one of the three possible alternatives.
- The test statistic value is $f = s_1^2/s_2^2$.
- For $H_a: \sigma_1^2 > \sigma_2^2$ the rejection region is $f \ge F_{\alpha,m-1,n-1}$. For the lower-tailed test it is $f \le F_{1-\alpha,m-1,n-1}$, and for the two-tailed test it is $f \ge F_{\alpha/2,m-1,n-1}$ or $f \le F_{1-\alpha/2,m-1,n-1}$.
- Note that the $F$ table is a little harder to use than $t$ tables because of the two parameters. The table we have in Devore's book gives only a very limited choice of four values for $\alpha$.
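The pooled two-sample $z$ test from the plea and sentencing example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `two_proportion_z_test` is an illustrative choice, not something from the notes. With unrounded intermediate values the statistic comes out near $-4.93$; the $-4.94$ in the notes arises from rounding the numerator and denominator before dividing.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x, m, y, n):
    """Return (z, two-sided P-value) for H0: p1 - p2 = 0 with the pooled estimate.

    Hypothetical helper for illustration; x successes out of m, y out of n.
    """
    p1_hat = x / m
    p2_hat = y / n
    p_pool = (x + y) / (m + n)      # combined estimate of the common p under H0
    q_pool = 1.0 - p_pool
    se = sqrt(p_pool * q_pool * (1.0 / m + 1.0 / n))
    z = (p1_hat - p2_hat) / se
    p_value = 2.0 * NormalDist().cdf(-abs(z))   # two-tailed normal P-value
    return z, p_value


# Plea example: 101 of 191 guilty-plea defendants and 56 of 64 others imprisoned.
z, p_value = two_proportion_z_test(x=101, m=191, y=56, n=64)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.1e}")
```

Since $|z|$ is far beyond the critical value 2.58, the conclusion at level 0.01 does not depend on the rounding.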

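The common sample size formula for an upper-tailed test can be evaluated for the Salk trial values. This is a sketch under the assumptions stated in the notes ($\alpha = 0.05$, $\beta = 0.1$, $p_1 = 0.0003$, $p_2 = 0.00015$); the function name `common_sample_size` is hypothetical, and the normal quantiles are taken from `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def common_sample_size(p1, p2, alpha, beta):
    """Common n = m for an upper-tailed level-alpha test with type II error beta.

    Hypothetical helper implementing the sample size formula in the notes.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    d = p1 - p2                                   # specified alternative difference
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper-tail critical value
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    root_sum = (z_alpha * sqrt((p1 + p2) * (q1 + q2) / 2.0)
                + z_beta * sqrt(p1 * q1 + p2 * q2))
    return root_sum ** 2 / d ** 2


n_required = common_sample_size(p1=0.0003, p2=0.00015, alpha=0.05, beta=0.10)
print(f"required common sample size is about {n_required:,.0f}")
```

The result is roughly 171,000 per group, in line with the figure quoted in the notes; small differences come from how many digits of $z_\alpha$ and $z_\beta$ are kept.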

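The confidence interval for the cancer treatment example can be reproduced as follows. This is a sketch only. The function name and the `adjusted` flag are hypothetical, and the adjusted version assumes that the "improved" estimates add one success and one failure to each sample, i.e. $\tilde p_1 = (x + 1)/(m + 2)$ and $\tilde p_2 = (y + 1)/(n + 2)$; the notes do not spell out that definition, so treat it as an assumption.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x, m, y, n, conf=0.99, adjusted=False):
    """Large-sample CI for p1 - p2 (hypothetical helper for illustration)."""
    if adjusted:
        # ASSUMPTION: add one success and one failure to each sample.
        x, m, y, n = x + 1, m + 2, y + 1, n + 2
    p1, p2 = x / m, y / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)   # z_{alpha/2}
    half_width = z * sqrt(p1 * (1.0 - p1) / m + p2 * (1.0 - p2) / n)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# 15-year survival: 76 of 154 (chemotherapy), 98 of 164 (chemotherapy and radiation).
lo, hi = diff_proportion_ci(76, 154, 98, 164)
lo_adj, hi_adj = diff_proportion_ci(76, 154, 98, 164, adjusted=True)
print(f"traditional: ({lo:.3f}, {hi:.3f})")
print(f"adjusted:    ({lo_adj:.3f}, {hi_adj:.3f})")
```

Under that assumption both intervals contain 0 and agree to about two decimal places, which matches the remark that the improved version gives a nearly identical CI here.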