Experimental Methods STAT 401
These class notes were uploaded by Hilbert Denesik on Sunday, November 1, 2015. They belong to STAT 401 at Pennsylvania State University, taught by Staff in Fall.
Date Created: 11/01/15
STAT 401, Ch. 8: Tests of Hypotheses

Sometimes we want answers to specialized questions, such as: is $\mu$ greater than a specified value $\mu_0$?

Ex. 1(a): A trucking firm suspects the claim made by a tire manufacturer that certain tires last, on average, at least 28,000 miles. This can be addressed as a hypothesis testing problem, where we are asked to decide between $H_0: \mu \ge 28{,}000$ and $H_a: \mu < 28{,}000$.

Note that confidence intervals (CIs), though they provide a region containing $\mu$, are not designed for choosing between two hypotheses. We will see that such decisions involve considerations that do not arise in the construction of CIs.

The hypothesis denoted by $H_0$ is called the null hypothesis, while $H_a$ is the alternative. There is a rule for which hypothesis gets designated the null hypothesis. Consider:

Ex. 1(b): The tire manufacturer wants to claim that certain tires last, on average, at least 28,000 miles. The decision is between $H_0: \mu \le 28{,}000$ and $H_a: \mu > 28{,}000$.

The rule for designating the null and alternative hypotheses is: the alternative is the hypothesis the investigator wants to claim as true. In other words, the alternative is the hypothesis the investigator wants evidence for.

Convention: The null hypothesis will be stated as an equality, e.g. $H_0: \mu = 28{,}000$ vs. $H_a: \mu < 28{,}000$.

Specification of a Testing Procedure

A test procedure is specified in terms of (i) a test statistic and (ii) a rejection rule (region).

Ex. 1(a), continued: In this case the test statistic will be $\bar X$, and the rejection rule specifies that $H_0$ is rejected whenever $\bar X < C$, for some constant $C$ (small values of $\bar X$ support $H_a: \mu < 28{,}000$).

Ex. 2: To investigate whether certain detonators used with explosives in coal mining meet the requirement that at least 90% will ignite, 20 are tested and it is found that 17 function correctly.
Hypotheses: $H_0: p = 0.9$, $H_a: p < 0.9$.
Test statistic: $\hat p = X/n$, or just $X$.
Rejection rule: Reject $H_0$ if $X \le C$ (some constant).

Question: How is $C$, and thus the testing procedure, specified? The answer comes from considering the different types of error and assigning priorities, as will be explained.

Type I and Type II Errors

It is a
fact that mistakes cannot be avoided. Thus $H_0$ can be rejected when it is true, or not be rejected when it is false.

Ex. 2, continued: Suppose the rejection rule is: Reject $H_0$ if $X \le 16$. Also suppose that $p = 0.9$, so $H_0$ is true. Then from binomial tables we get
$$P(X \le 16 \mid p = 0.9,\ n = 20) = 0.133.$$
That is, $H_0$ will be rejected 13.3% of the time even though it is true. Now suppose that $p = 0.8$, so $H_a$ is true. Then
$$P(X \le 16 \mid p = 0.8,\ n = 20) = 0.589.$$
That is, $H_0$ will be rejected 58.9% of the time, meaning we commit an error 41.1% of the time.

Remark: If $H_0$ were false in a more pronounced way, e.g. $p = 0.5$, then the probability of committing an error (i.e. not rejecting $H_0$) would be smaller.

The two types of error that can be committed are called type I and type II, and can be shown in the table:

                          Truth
                     H_0                 H_a
  Outcome    H_0     Correct decision    Type II error
  of test    H_a     Type I error        Correct decision

Ex. 2, continued: Suppose we change the rejection rule (RR) to: Reject $H_0$ if $X \le 17$. Then
$$P(\text{type I error}) = P(X \le 17 \mid p = 0.9,\ n = 20) = 0.323,$$
$$P(\text{type II error at } p = 0.8) = 1 - P(X \le 17 \mid p = 0.8,\ n = 20) = 1 - 0.794 = 0.206.$$

Remark: Changing the RR from $X \le 16$ to $X \le 17$, the probability of type I error increases from 0.133 to 0.323, while the probability of type II error decreases from 0.411 to 0.206.

Fact: We cannot control both types of error with the same sample size.

Philosophy: Type I error is more serious than type II. Thus we always want to control the probability of type I error. The level at which we decide to control $P(\text{type I error})$ is denoted by $\alpha$ (usually $\alpha = 0.1$, $0.05$, or $0.01$) and is called the level of significance.

Fact: Deciding on $\alpha$ determines the RR, and thus also the testing procedure.

Ex. 2, continued: Deciding on $\alpha = 0.133$ determines that the RR is $X \le 16$.

Remark: The discreteness of the binomial distribution does not allow the usual choices of $\alpha$.

Tests About the Population Mean

Case I: Sample from $N(\mu, \sigma^2)$, $\sigma^2$ known.

Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$, and suppose we want to test $H_0: \mu = \mu_0$ ($\mu_0$ is some specified value) vs. some alternative hypothesis. The test statistic is $\bar X$, but more conveniently we use

Test statistic: $$Z = \frac{\bar X - \mu_0}{\sigma/\sqrt{n}}.$$

Remark: It is important to realize that $Z \sim N(0, 1)$ only if $H_0$ is true, i.e. only if the true value of $\mu$
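is the specified $\mu_0$.

Suppose we decide on some significance level $\alpha$. The RR depends on the alternative hypothesis:

  H_a                  RR at level $\alpha$
  $\mu > \mu_0$        $Z \ge z_\alpha$
  $\mu < \mu_0$        $Z \le -z_\alpha$
  $\mu \ne \mu_0$      $|Z| \ge z_{\alpha/2}$  (i.e. $Z \ge z_{\alpha/2}$ or $Z \le -z_{\alpha/2}$)

Ex. 3: A tire company wants to change the tire design. Economically, the modification can be justified if the average lifetime with the new design exceeds 20,000 miles. A random sample of $n = 16$ new tires is tested. Assume lifetimes are $N(\mu, 1500^2)$. The 16 tires yield $\bar X = 20{,}758$. Should the new design be adopted? Test at $\alpha = 0.01$.

Sol.: Here $H_0: \mu = 20{,}000$, $H_a: \mu > 20{,}000$ (why?). The test statistic is
$$Z = \frac{\bar X - \mu_0}{\sigma/\sqrt{n}} = \frac{20{,}758 - 20{,}000}{1500/\sqrt{16}} = 2.02.$$
RR at $\alpha = 0.01$: $Z > z_{.01} = 2.33$. Since $2.02 \not> 2.33$, $H_0$ is not rejected, implying the new design is not adopted.

Remark: The fact that $20{,}758 > 20{,}000$ but the testing procedure decides against $H_a: \mu > 20{,}000$ serves to (a) highlight the bias of the testing procedure in favor of $H_0$, and (b) raise questions regarding the performance characteristics of the procedure. For example, it would be of interest to know what $P(\text{type II error})$ is when $\mu = 21{,}000$.

Ex. 3, continued: Find the probability of not rejecting $H_0$ when $\mu = 21{,}000$.

Sol.:
$$P(\text{type II error} \mid \mu = 21{,}000) = P\!\left(\frac{\bar X - \mu_0}{\sigma/\sqrt{n}} < z_\alpha \,\Big|\, \mu = 21{,}000\right) = P\!\left(\bar X < \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} \,\Big|\, \mu = 21{,}000\right)$$
$$= P\!\left(\frac{\bar X - 21{,}000}{\sigma/\sqrt{n}} < \frac{\mu_0 - 21{,}000}{\sigma/\sqrt{n}} + z_\alpha \,\Big|\, \mu = 21{,}000\right) = \Phi\!\left(\frac{20{,}000 - 21{,}000}{1500/\sqrt{16}} + 2.33\right) = \Phi(-0.34) = 0.3669.$$

Remarks:
(a) The probability of type II error at a value $\mu'$ is denoted by $\beta(\mu')$. Thus in the above example we found $\beta(21{,}000) = 0.3669$.
(b) In HW problems and in quizzes you can use directly the formulas on p. 319. Thus
$$\beta(21{,}000) = \Phi\!\left(2.33 + \frac{20{,}000 - 21{,}000}{1500/\sqrt{16}}\right) = \Phi(-0.34) = 0.3669.$$
See p. 319 for the formulas for $\beta(\mu')$ for $H_a: \mu < \mu_0$ and $H_a: \mu \ne \mu_0$.

If the probability of type II error is large for our purposes, we can increase the sample size in order to reduce it.

Ex. 3, continued: Find the sample size needed to have $\beta(21{,}000) = 0.1$.

The equation we must solve is
$$z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}} = -z_{\beta}, \quad\text{or}\quad n = \left[\frac{\sigma (z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2 .$$
Here
$$n = \left[\frac{1500\,(2.33 + 1.28)}{20{,}000 - 21{,}000}\right]^2 = (-5.415)^2 \approx 29.32.$$
As usual we round up and use $n = 30$.

Remark: See p. 319 for the sample size calculation for $H_a: \mu \ne \mu_0$.

Case II: Large Samples ($n > 30$).

Here the random sample $X_1, \ldots, X_n$ is allowed to come from any population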
distribution with mean $\mu$ and variance $\sigma^2$, both unknown. Interest lies in testing $H_0: \mu = \mu_0$ vs. some alternative.

Test statistic: $$Z = \frac{\bar X - \mu_0}{S/\sqrt{n}}.$$
If $H_0$ is true, then $Z$ is approximately $N(0, 1)$. The RRs at level $\alpha$ are:

  H_a                  RR at level $\alpha$
  $\mu > \mu_0$        $Z \ge z_\alpha$
  $\mu < \mu_0$        $Z \le -z_\alpha$
  $\mu \ne \mu_0$      $|Z| \ge z_{\alpha/2}$

Ex. 4: A trucking firm suspects the claim that certain tires last at least 28,000 miles. From a random sample of $n = 40$ tires, $\bar X = 27{,}463$ and $S = 1348$. Test at $\alpha = 0.01$: $H_0: \mu = 28{,}000$ vs. $H_a: \mu < 28{,}000$.

Sol.: Test statistic:
$$Z = \frac{\bar X - \mu_0}{S/\sqrt{n}} = \frac{27{,}463 - 28{,}000}{1348/\sqrt{40}} = -2.52.$$
RR: $Z < -z_{.01} = -2.33$. In this case $Z = -2.52 < -2.33$, so $H_0$ is rejected.

Case III: Sample from $N(\mu, \sigma^2)$, $\sigma^2$ unknown.

When $\sigma^2$ is unknown, we use as test statistic for testing $H_0: \mu = \mu_0$ vs. some alternative
$$T = \frac{\bar X - \mu_0}{S/\sqrt{n}}.$$
If $H_0$ is true, $T \sim t_{n-1}$, and the RRs are:

  H_a                  RR at level $\alpha$
  $\mu > \mu_0$        $T > t_{\alpha, n-1}$
  $\mu < \mu_0$        $T < -t_{\alpha, n-1}$
  $\mu \ne \mu_0$      $|T| > t_{\alpha/2, n-1}$

Ex. 5: The maximum acceptable level of exposure to microwave radiation in the US is an average of 10 microwatts per cm². It is feared that a large television transmitter may be pushing the average level of radiation above the safe limit. A random sample of $n = 25$ gives $\bar X = 10.3$, $S = 2.0$. Test $H_0: \mu = 10$ vs. $H_a: \mu > 10$ at $\alpha = 0.1$.

Sol.: Test statistic:
$$T = \frac{10.3 - 10}{2.0/\sqrt{25}} = 0.75.$$
RR: $T > t_{.1, 24} = 1.318$. Here $0.75 \not> 1.318$, so $H_0$ is not rejected.

Note: No sample size determination for the t-test.

Tests for the Binomial p

We consider testing $H_0: p = p_0$ vs. some alternative only for the case of large samples. In this case, large samples means $n p_0 \ge 5$ and $n(1 - p_0) \ge 5$. The test statistic is
$$Z = \frac{\hat p - p_0}{\sqrt{p_0 (1 - p_0)/n}}.$$
If $H_0$ is true, then $Z$ is approximately $N(0, 1)$. The RRs at level $\alpha$ are:

  H_a               RR at level $\alpha$
  $p > p_0$         $Z \ge z_\alpha$
  $p < p_0$         $Z \le -z_\alpha$
  $p \ne p_0$       $|Z| \ge z_{\alpha/2}$

Ex. 6: It is thought that more than 70% of all faults in transmission lines are caused by lightning. To gain evidence in support of this contention, a random sample of 200 faults from a large database yields that 151 of 200 are due to lightning. Test $H_0: p = 0.7$ vs. $H_a: p > 0.7$ at $\alpha = 0.01$.

Sol.: Here $\hat p = 151/200 = 0.755$. Test statistic:
$$Z = \frac{0.755 - 0.7}{\sqrt{(0.7)(0.3)/200}} = 1.697.$$
Since $200(0.7) \ge 5$ and $200(0.3) \ge 5$, the RR is $Z > z_{.01} = 2.33$. Here $1.697 \not> 2.33$, so $H_0$ is not rejected.

To evaluate the performance characteristics of the testing procedure, we want to look at the probability of type II error. Formulas for
the probability of type II error are given on p. 329. Here we give the formula, with a brief derivation, for $H_a: p > p_0$.

$\beta(p') = P(\text{type II error when } p = p')$. Here $p'$ is some value $> p_0$.
$$\beta(p') = P\!\left(\frac{\hat p - p_0}{\sqrt{p_0(1 - p_0)/n}} < z_\alpha \,\Big|\, p = p'\right) = P\!\left(\hat p < p_0 + z_\alpha \sqrt{p_0(1 - p_0)/n} \,\Big|\, p = p'\right)$$
$$= \Phi\!\left(\frac{p_0 - p' + z_\alpha \sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right).$$

Ex. 6, continued: Find $\beta(0.8)$.
$$\beta(0.8) = \Phi\!\left(\frac{0.7 - 0.8 + 2.33\sqrt{(0.7)(0.3)/200}}{\sqrt{(0.8)(0.2)/200}}\right) = \Phi(-0.866) = 0.1936.$$

P-values

We have seen that the rejection rule depends on the chosen $\alpha$ value. For example, in Example 6 the test statistic for $H_0: p = 0.7$ vs. $H_a: p > 0.7$ is $Z = 1.697$, and $H_0$ was not rejected at $\alpha = 0.01$ since $z_{.01} = 2.33$. However, had we chosen $\alpha = 0.05$, $H_0$ would have been rejected, since $z_{.05} = 1.645$. When we report only the outcome of the test for one particular significance level, we are not being as informative as we can be. Thus it is good practice to always report the so-called p-value.

P-value is the smallest level of significance at which $H_0$ would be rejected for a given data set.

Reporting the p-value is very informative, and in fact the p-value can be used instead of the test statistic to decide the outcome of the test:
If p-value $\le \alpha$, reject $H_0$ at level $\alpha$.
If p-value $> \alpha$, don't reject $H_0$ at level $\alpha$.

P-Value for a Z-test

Let $Z$ denote the test statistic of any one of the Z-tests (i.e. Case I, Case II, or the binomial case). Then the p-value is given by

  p-value = $1 - \Phi(Z)$           for an upper-tailed test
  p-value = $\Phi(Z)$               for a lower-tailed test
  p-value = $2[1 - \Phi(|Z|)]$      for a two-tailed test

Ex. 6, continued: The test statistic for $H_0: p = 0.7$ vs. $H_a: p > 0.7$ (right-tailed test) is $Z = 1.697$. Thus the p-value is $1 - \Phi(1.697) \approx 1 - \Phi(1.7) = 0.0446$.

P-Value for a T-test

This is determined in exactly the same way, except that you use the test statistic $T$ instead of $Z$ and the t-table instead of the Z-table.

Ex. 5, continued: The test statistic for $H_0: \mu = 10$ vs. $H_a: \mu > 10$ (right-tailed test) is $T = 0.75$. In this case the t-table is not detailed enough.

STAT 401, Ch. 10: One-Way Analysis of Variance (ANOVA)

In this chapter we consider a method for comparing the means of $I$ populations, and the related issue of multiple comparisons.

New Terminology: It is common to refer to the $I$ populations as $I$ factor levels. E.g., if we want to
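compare the mean yield of a chemical process under $I = 4$ different temperature settings, the factor is temperature, and the settings, each of which defines a population, are the levels.

Let $\mu_i$, $i = 1, \ldots, I$, denote the mean of the $i$th population or factor level. Of interest is testing
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_I \quad\text{vs.}\quad H_a: H_0 \text{ is false}.$$

Data: The data from the $i$th factor level are $X_{i1}, X_{i2}, \ldots, X_{iJ}$, so the sample size is $J$ for all $i$.
Note: We will cover only the case where the sample sizes are the same.

Assumptions: All $I$ populations are normal with the same variance. That is,
$$X_{ij} \sim N(\mu_i, \sigma^2) \quad\text{for all } i = 1, \ldots, I,\ j = 1, \ldots, J.$$

Model: This is the simplest situation for which it is common to write a statistical model for the data:
$$X_{ij} = \mu_i + \epsilon_{ij} = \mu + \alpha_i + \epsilon_{ij},$$
where $\mu = \frac{1}{I}\sum_{i=1}^{I} \mu_i$, $\alpha_i = \mu_i - \mu$, and $\epsilon_{ij} \sim N(0, \sigma^2)$. The difference $\alpha_i = \mu_i - \mu$ of $\mu_i$ from the average mean value $\mu$ is called the effect of the $i$th population (or factor level, or treatment). The null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ can be written equivalently as $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$.

Question: Since we can test the equality of two means, why don't we test $H_0$ by testing the equality of all pairwise means?

Basic Idea for Analysis of Variance

Suppose $I = 3$ and $\mu_1 < \mu_2 < \mu_3$. The three samples would tend to look like:

[Figure: dot plots of Sample 1, Sample 2, Sample 3, and the combined sample, $J = 6$, with the three samples centered at different locations]

On the other hand, if $\mu_1 = \mu_2 = \mu_3$, then the picture might look like:

[Figure: dot plots of Sample 1, Sample 2, Sample 3, and the combined sample, with the three samples overlapping]

Note that the variability in the combined sample is larger in the case that $\mu_1 < \mu_2 < \mu_3$. This is because the combined sample variability is due to (a) the variability within each sample, and (b) the variability between the sample means $\bar X_{i\cdot}$.

Notation:
(i) $\bar X_{i\cdot} = \frac{1}{J}\sum_{j=1}^{J} X_{ij}$ = $i$th sample mean, $i = 1, \ldots, I$.
(ii) $\bar X_{\cdot\cdot} = \frac{1}{IJ}\sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}$ = grand mean.
(iii) $S_i^2 = \frac{1}{J-1}\sum_{j=1}^{J} (X_{ij} - \bar X_{i\cdot})^2$ = $i$th sample variance.

To turn the above basic idea for analysis of variance into a test procedure, we argue as follows:

1. The common variance $\sigma^2$ can be estimated by pooling the $S_i^2$:
$$\mathrm{MSE} = \frac{S_1^2 + S_2^2 + \cdots + S_I^2}{I}.$$
From rules of expectation it follows that $E(\mathrm{MSE}) = \sigma^2$, regardless of the values of $\mu_1, \mu_2, \ldots, \mu_I$.

2. Since $\mathrm{Var}(\bar X_{i\cdot}) = \sigma^2/J$, $i = 1, \ldots, I$, it follows that when $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ is true, the sample variance of $\bar X_{1\cdot}, \ldots, \bar X_{I\cdot}$ will be an unbiased estimator of $\sigma^2/J$, or
$$\mathrm{MSTr} = \frac{J}{I-1}\sum_{i=1}^{I} (\bar X_{i\cdot} - \bar X_{\cdot\cdot})^2$$
is an unbiased estimator of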
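$\sigma^2$. Thus $E(\mathrm{MSTr}) = \sigma^2$ if $H_0$ is true. However, $E(\mathrm{MSTr}) > \sigma^2$ if $H_0$ is false.

The test procedure is based on a comparison of the two variance estimators:

Test statistic: $$F = \frac{\mathrm{MSTr}}{\mathrm{MSE}}.$$
If $H_0$ is true and the assumptions are met, $F \sim F_{I-1,\, I(J-1)}$, i.e. it has an F distribution with $I - 1$ and $I(J - 1)$ degrees of freedom.

Rejection rule at level $\alpha$: $F > F_{\alpha,\, I-1,\, I(J-1)}$.

The critical values for the F distribution are in Table A.7.

Computational Formulas

We will use the initials SST = Total Sum of Squares, SSTr = Treatment Sum of Squares, SSE = Error Sum of Squares, and the additional notation
$$X_{\cdot\cdot} = \sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}, \qquad X_{i\cdot} = \sum_{j=1}^{J} X_{ij}.$$
Then
$$\mathrm{SST} = \sum_i \sum_j (X_{ij} - \bar X_{\cdot\cdot})^2 = \sum_i \sum_j X_{ij}^2 - \frac{1}{IJ} X_{\cdot\cdot}^2,$$
$$\mathrm{SSTr} = \frac{1}{J}\sum_i X_{i\cdot}^2 - \frac{1}{IJ} X_{\cdot\cdot}^2,$$
$$\mathrm{SSE} = \sum_i \sum_j (X_{ij} - \bar X_{i\cdot})^2 = \mathrm{SST} - \mathrm{SSTr}.$$

Note: The identity SST = SSTr + SSE is responsible for the name Analysis of Variance.

Test statistic: $$F = \frac{\mathrm{SSTr}/(I-1)}{\mathrm{SSE}/[I(J-1)]}.$$
It is common to denote
$$\mathrm{MSTr} = \frac{\mathrm{SSTr}}{I-1} = \text{Mean SS due to Treatment}, \qquad \mathrm{MSE} = \frac{\mathrm{SSE}}{I(J-1)} = \text{Mean SS due to Error},$$
which equal the quantities we saw before. Thus

Test statistic: $$F = \frac{\mathrm{MSTr}}{\mathrm{MSE}}.$$

Ex. 1: The following data resulted from comparing the degree of soiling in fabric treated with three different mixtures of methacrylic acid.

  Mix 1:  .56  1.12   .90  1.07   .94     $X_{1\cdot} = 4.59$   $\bar X_{1\cdot} = .918$
  Mix 2:  .72   .69   .87   .78   .91     $X_{2\cdot} = 3.97$   $\bar X_{2\cdot} = .794$
  Mix 3:  .62  1.08  1.07   .99   .93     $X_{3\cdot} = 4.69$   $\bar X_{3\cdot} = .938$

Test $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a: H_0$ is false, at $\alpha = 0.01$.

Sol.: First, $\sum_i\sum_j X_{ij}^2 = .56^2 + 1.12^2 + \cdots + .93^2 = 12.1351$, and $X_{\cdot\cdot} = X_{1\cdot} + X_{2\cdot} + X_{3\cdot} = 13.25$. Thus
$$\mathrm{SST} = 12.1351 - \frac{1}{15}(13.25)^2 = .4309,$$
$$\mathrm{SSTr} = \frac{1}{5}\left(4.59^2 + 3.97^2 + 4.69^2\right) - 11.7042 = .0608,$$
$$\mathrm{SSE} = .4309 - .0608 = .3701.$$

It is customary to summarize the calculations in the following ANOVA table:

  Source       df               SS       MS = SS/df    F
  Treatment    I - 1 = 2        .0608    .0304         .99
  Error        I(J - 1) = 12    .3701    .0308
  Total        14               .4309

The $\alpha = 0.01$ critical value is $F_{.01, 2, 12} = 6.93$. Since the rejection rule specifies that $H_0$ be rejected if $F > F_{.01, 2, 12}$, $H_0$ is not rejected.

Multiple Comparisons

When $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ is rejected, it is not clear which of the $\mu_i$'s are different from each other. Methods for doing this further analysis while preserving the overall level $\alpha$ are called multiple comparison procedures. The one we present here is recommended for deciding whether $\mu_i = \mu_j$ for each $i$ and $j$.

Tukey's procedure: This depends on the so-called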
studentized range distribution, which is characterized by a numerator df $m$ and a denominator df $\nu$. Let $Q_{\alpha, m, \nu}$ denote the upper-tail $\alpha$ critical value of the studentized range distribution with $m, \nu$ df, which are given in Table A7.

Tukey's procedure is based on:

Proposition: With probability $1 - \alpha$,
$$\bar X_{i\cdot} - \bar X_{j\cdot} - Q_{\alpha, I, I(J-1)}\sqrt{\frac{\mathrm{MSE}}{J}} \;\le\; \mu_i - \mu_j \;\le\; \bar X_{i\cdot} - \bar X_{j\cdot} + Q_{\alpha, I, I(J-1)}\sqrt{\frac{\mathrm{MSE}}{J}}$$
for every $i$ and $j$ with $i \ne j$, $i, j = 1, \ldots, I$.

Note that the probability statement in the Proposition holds simultaneously for all $i \ne j$. It follows that if, for a pair $(i, j)$, $i \ne j$, the interval
$$\bar X_{i\cdot} - \bar X_{j\cdot} \pm Q_{\alpha, I, I(J-1)}\sqrt{\frac{\mathrm{MSE}}{J}}$$
does not contain zero, it can be concluded that $\mu_i$ and $\mu_j$ differ significantly at level $\alpha$.

The following steps for carrying out Tukey's procedure lead to an organized way of presenting the results from all pairwise comparisons:
1. Select $\alpha$ and find $Q_{\alpha, I, I(J-1)}$ from Table A8.
2. Calculate $w = Q_{\alpha, I, I(J-1)}\sqrt{\mathrm{MSE}/J}$.
3. List the sample means in increasing order and underline each pair that differs by less than $w$. Pairs that are not underlined indicate that the corresponding population means differ significantly at level $\alpha$.

Example: Four different concentrations of ethanol are compared for their effect on sleep time. Each concentration was given to a sample of 5 rats, and the REM sleep time for each rat was recorded, yielding $\bar X_{1\cdot} = 79.28$, $\bar X_{2\cdot} = 61.54$, $\bar X_{3\cdot} = 47.92$, $\bar X_{4\cdot} = 32.76$. The analysis of variance calculations are summarized in the ANOVA table:

  Source       df    SS           MS            F
  Treatment    3     5882.3575    1960.78583    21.09
  Error        16    1487.4000    92.9625
  Total        19    7369.7575

Since $F = 21.09 > F_{.01, 3, 16} = 5.29$, it follows that $H_0: \mu_1 = \cdots = \mu_4$ is rejected at level $\alpha = 0.01$. To identify which of the means differ, we apply Tukey's procedure:
1. $Q_{.01, 4, 16} = 5.19$.
2. $w = 5.19\sqrt{92.96/5} = 22.38$.
3. Ordered sample means:

     $\bar X_{4\cdot}$    $\bar X_{3\cdot}$    $\bar X_{2\cdot}$    $\bar X_{1\cdot}$
     32.76     47.92     61.54     79.28

   The adjacent pairs (32.76, 47.92), (47.92, 61.54), and (61.54, 79.28) each differ by less than $w$ and are underlined.

Thus $(\mu_1, \mu_3)$, $(\mu_1, \mu_4)$, and $(\mu_2, \mu_4)$ are significantly different at $\alpha = 0.01$.

STAT 401, Ch. 1: Introduction

Often we want to find out certain properties of a population of objects or subjects.

Examples:
1. Population: US citizens of age 18 and over. Property: proportion (or fraction) supporting nuclear energy.
2. Population: a certain component. Property: average length of lifetime.

Census, i.e. examination of all members
of a population, is not typically conducted. This is because of the cost and time required, but also because the population is often hypothetical or conceptual. Example 2 above is an example of a hypothetical population. An alternative to a census, which is typically adopted, is to examine a random sample, i.e. a representative subset of the population.

Statistics is the science of
(i) Collecting data (sampling)
(ii) Summarizing data (descriptive statistics)
(iii) Drawing conclusions from the data (inferential statistics)

Ch. 1: Descriptive Statistics I

Stem-and-Leaf Displays

Example: Stem-and-leaf of yardage, $N = 40$.
Stem: thousands and hundreds digits. Leaf: tens digits. Leaf unit = 10.

[Stem-and-leaf display of yardage: stems 64 through 72, leaf unit = 10]

Note: It suffices to specify the leaf unit.

The main idea of stem-and-leaf and other displays is to convey a reasonable impression regarding the distribution of the data. Had we chosen only thousands for stems (so only two stems), the display would not be as informative (too crude or clumpy). Occasionally we need to use repeated stems to get a better display.

[Stem-and-leaf display with repeated stems 5H, 5L, 4H, 4L, 3H]

Relative Frequency Distributions

If, instead of writing out the numerical values of the leaves in each stem, we provide only a count of them, we obtain a frequency histogram. Instead of stems, we now talk about class intervals, which can be chosen more arbitrarily (e.g. they need not have the same width).

Ex.:

  Class    Interval         Freq. $f_i$    Relative freq. $f_i/n$
  1        350 - <550       1              .010
  2        550 - <750       3              .030
  3        750 - <950       8              .079
  4        950 - <1150      17             .168
  5        1150 - <1350     19             .188
  6        1350 - <1550     19             .188
  7        1550 - <1750     11             .109
  8        1750 - <2550     23             .228
                            n = 101        1.000

Histograms

Histograms are pictorial representations of frequency distributions. They are constructed by drawing a box above each class interval whose height is
$$\text{height} = \frac{\text{relative frequency}}{\text{class width}}.$$

[Figure: histogram, "Frequency of Life Time"]

Typical histogram shapes (figure): symmetric, bimodal,
positively skewed, negatively skewed.

Bar Graphs for Qualitative Data

Frequency distributions and histograms (bar graphs) can be done with qualitative or categorical data.

Frequency distribution:

  Manufacturer            Frequency    Relative frequency
  1. Honda                41           .34
  2. Yamaha               27           .23
  3. Kawasaki             20           .17
  4. Suzuki               18           .15
  5. Harley-Davidson      3            .03
  6. Other                11           .09
                          120          1.01

[Figure: histogram (bar graph) of the relative frequencies by manufacturer]

Measures of Location

Let $X_1, X_2, \ldots, X_n$ denote the data. The sample mean is
$$\bar X = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
The sample median is the value which separates the sample into two parts: that with low values and that with high values. To define the sample median, let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ denote the ordered values. Then the sample median is
$$\tilde X = \begin{cases} X_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd,} \\[4pt] \dfrac{X_{\left(\frac{n}{2}\right)} + X_{\left(\frac{n}{2}+1\right)}}{2} & \text{if } n \text{ is even.} \end{cases}$$

Example: Let $n = 5$, $X_1 = 2.3$, $X_2 = 3.2$, $X_3 = 1.8$, $X_4 = 2.5$, $X_5 = 2.7$. Thus
$$\bar X = \frac{2.3 + 3.2 + 1.8 + 2.5 + 2.7}{5} = 2.5.$$
To find the median, we first order the values: $1.8,\ 2.3,\ 2.5,\ 2.7,\ 3.2$. Also, $(n + 1)/2 = 3$. Thus the median is $\tilde X = X_{(3)} = 2.5$.

Note: Had the largest observation been $X_2 = 4.2$ instead of 3.2, we would have $\bar X = 2.7$ and $\tilde X = 2.5$. Thus the value of $\bar X$ is affected by extreme observations (outliers), while the median is not.

Analogous to the sample mean and sample median, there is the population mean $\mu$ and the population median $\tilde\mu$.

[Figure: three population distributions: (a) positively skewed, $\mu > \tilde\mu$; (b) symmetric, $\mu = \tilde\mu$; (c) negatively skewed, $\mu < \tilde\mu$]

A compromise between the mean and the median is the trimmed mean. A 10% trimmed mean is computed by eliminating the largest 10% and the smallest 10% of the sample and then averaging the remaining values.

Example: Let $n = 5$ and $X_1, X_2, \ldots, X_5$ be as in the previous example. A 20% trimmed mean is
$$\bar X_{tr(20)} = \frac{X_{(2)} + X_{(3)} + X_{(4)}}{3} = \frac{2.3 + 2.5 + 2.7}{3} = 2.5.$$

Note: Typically the trimming proportion $\alpha$ should be such that $n\alpha$ is an integer, and certainly $n\alpha \ge 1$. Thus if $n = 5$, it makes no sense to consider a 10% or a 5% trimmed mean. Occasionally we may want to consider $\alpha$ such that $n\alpha > 1$ but is not an integer. For example, we may want $\bar X_{tr(10)}$ when $n = 22$. Thus $\alpha = 0.1$ and $n\alpha = 2.2$. One way of doing it is by interpolating between $\bar X_{\text{trim}(2)}$ and $\bar X_{\text{trim}(3)}$, where $\bar X_{\text{trim}(2)}$ is
obtained by eliminating the largest 2 observations and the smallest 2 observations and then averaging the remaining values; $\bar X_{\text{trim}(3)}$ is defined similarly. For example, suppose $\bar X_{\text{trim}(2)} = 230$ and $\bar X_{\text{trim}(3)} = 245$. Then, since $n\alpha = 2.2$ lies two tenths of the way from 2 to 3,
$$\bar X_{tr(10)} = 0.8\,(230) + 0.2\,(245) = 233.$$

Measures of Variability

Location (mean, median, trimmed mean) is an important characteristic of a data set, and often an important quality measure. For example, car manufacturers want to increase the mean gas mileage. Another important characteristic of a data set, and also an important measure of quality, is the variability. For example, a sample of car mileages 29.1, 29.6, 30, 30.5, 30.8 indicates better quality than the sample 21, 26, 30, 35, 38, even though the mean in both cases is 30.

An obvious measure of variability is the sample range,
$$\max_i X_i - \min_i X_i = X_{(n)} - X_{(1)},$$
but clearly this is very sensitive to extreme values (outliers). To remedy this drawback of the sample range, we can consider sample percentiles. The 90th percentile, for example, is defined to be the value that separates the upper (largest) 10% of the observations from the lower 90%. Thus the median is the 50th percentile. The 25th, the 50th, and the 75th percentiles are also called quartiles, as they divide the sample into four equal parts. We also refer to the 25th and the 75th percentiles as the Lower Quartile (LQ) and Upper Quartile (UQ), respectively. The interquartile range (IQR; in the book this is called the fourth spread and denoted by $f_s$),
$$\mathrm{IQR} = \mathrm{UQ} - \mathrm{LQ} = \text{Upper Quartile} - \text{Lower Quartile},$$
is another measure of variability. The LQ is obtained as

LQ = median of the smallest half of the values, i.e. the smallest $n/2$ or $(n + 1)/2$ values, as $n$ is even or odd.

The UQ is obtained similarly.

Example: Consider the $n = 8$ values 939, 704, 717, 1328, 746, 2106, 1519, 750. Since $n$ is even, we consider the
Smallest $n/2 = 4$: 704, 717, 746, 750, and the
Largest $n/2 = 4$: 939, 1328, 1519, 2106.
Thus
$$\mathrm{LQ} = \frac{717 + 746}{2} = 731.5, \qquad \mathrm{UQ} = \frac{1328 + 1519}{2} = 1423.5.$$
If there was an additional observation of 820 (so $n = 9$), then (check!) LQ = 746 and UQ = 1328.

The five numbers $X_{(1)}$, LQ, $\tilde X$, UQ, and $X_{(n)}$ are conveniently summarized in a
box plot.

Example: Consider the above $n = 9$ observations. Thus $X_{(1)} = 704$, LQ = 746, $\tilde X = 820$, UQ = 1328, $X_{(n)} = 2106$.

[Figure: box plot of the $n = 9$ observations]

Each observation that falls between 1.5 IQR and 3 IQR from the edge of the box to which it is closest is called a mild outlier. Observations that fall more than 3 IQR from the closest edge are called extreme outliers. A box plot can be embellished to show such outliers (see Figure 1.21, p. 43).

A final measure of variability can be constructed using the deviations of each observation from the mean:
$$X_1 - \bar X,\ X_2 - \bar X,\ \ldots,\ X_n - \bar X.$$
Note that the sum of deviations is $\sum_{i=1}^{n} (X_i - \bar X) = 0$. Using these deviations, we construct the following measure of variability:

Sample Variance: $$S^2 = \frac{S_{xx}}{n-1} = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)^2.$$
The positive square root of the variance is called the Sample Standard Deviation, $S$.

Computational formula: $$S_{xx} = \sum_{i=1}^{n} X_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2.$$

Properties of $\bar X$, $S^2$

Let $X_1, X_2, \ldots, X_n$ be a sample and $C$ denote any constant.
1. If $Y_1 = X_1 + C,\ Y_2 = X_2 + C,\ \ldots,\ Y_n = X_n + C$, then $\bar Y = \bar X + C$ and $S_Y^2 = S_X^2$.
2. If $Y_1 = C X_1,\ Y_2 = C X_2,\ \ldots,\ Y_n = C X_n$, then $\bar Y = C \bar X$ and $S_Y^2 = C^2 S_X^2$.

Example: $X_1 = 81.3001$, $X_2 = 81.3015$, $X_3 = 81.3006$, $X_4 = 81.3011$, $X_5 = 81.2997$, $X_6 = 81.3005$, $X_7 = 81.3021$. Code the data by subtracting 81.2997 and multiplying by 10000. Thus the coded data are 4, 18, 9, 14, 0, 8, 24. It is given that the mean and variance of the coded data are, respectively, 11.0 and 68.33. Find the mean and variance of the original data.

To solve this problem, reason as follows. To obtain the original data from the coded data, we must first divide the coded data by 10000 and then add 81.2997. Thus we proceed by first dividing the coded data by 10000. By Property 2, the mean and variance become
$$\frac{11.0}{10000} = 0.0011, \qquad \frac{68.33}{10000^2} = 0.0000006833.$$
Next, add 81.2997. By Property 1, the mean and variance become
$$0.0011 + 81.2997 = 81.3008, \qquad 0.0000006833.$$
These are the mean and variance, respectively, of the original data.
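
The coding example can be checked numerically. The following is a minimal sketch in Python using only the standard library; the decimal placement of the original data (81.3001, and so on) is inferred from the coded values, and the variable names are illustrative assumptions, not part of the notes.

```python
from statistics import mean, variance

# Original measurements (decimal placement inferred from the coded values).
original = [81.3001, 81.3015, 81.3006, 81.3011, 81.2997, 81.3005, 81.3021]
shift, scale = 81.2997, 10000

# Code the data: subtract the shift, then multiply by the scale.
coded = [round((x - shift) * scale) for x in original]

coded_mean = mean(coded)      # sample mean of the coded data
coded_var = variance(coded)   # sample variance (divisor n - 1)

# Undo the coding: Property 2 (divide by scale), then Property 1 (add shift).
recovered_mean = coded_mean / scale + shift
recovered_var = coded_var / scale**2

print(coded)
print(coded_mean, round(coded_var, 2))
print(round(recovered_mean, 4), recovered_var)
```

Dividing by the scale multiplies the variance by $1/10000^2$, while adding the shift leaves it unchanged, so the recovered variance agrees with the variance computed directly from the original values.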