Class Note for STAT 528 at OSU 38
This 37 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at Ohio State University taught by a professor in Fall. Since its upload, it has received 88 views.
Date Created: 02/06/15
Stat 528, Autumn 2008, Elly Kaizar

Introduction to Inference

Reading: Sections 6.1, 6.2, 6.3

• Large Sample Confidence Intervals
  - Constructing a confidence interval
  - Confidence intervals in MINITAB
• Hypothesis testing
  - The logic of testing
  - The components of a test
    * Formulating hypotheses
    * Defining test statistics
    * Calculating p-values
    * Drawing conclusions
  - Tests in MINITAB
• The relationship between tests and confidence intervals

Inference Revisited

• Build a model for the world.
• Confidence Intervals: Find values for aspects (parameters) of that model that are consistent with data.
• Tests: Test if particular values for aspects (parameters) of that model are consistent with data.

Runners Example

You measure the weights (in kilograms) of 24 male runners. You do not actually choose an SRS, but are willing to assume that these runners are a random sample from the population of male runners in your town or city. Here is the summary:

Variable         N   N*  Mean    SE Mean  StDev
runner weights   24  0   61.792  0.981    4.808

Variable         Minimum  Q1      Median  Q3      Maximum
runner weights   53.100   58.075  62.100  65.450  69.300

Suppose that the standard deviation of the population is known to be $\sigma = 4.5$ kg. We are interested in the population mean of male runners' weights. As usual, we let $\mu$ denote this value.

• Find a range of plausible values of $\mu$.
• Test to see if it is plausible that $\mu$ equals a particular value.

Building a two-sided confidence interval for $\mu$

• The situation: Let $X_1, X_2, \ldots, X_n$ be the random variables (RVs) associated with a simple random sample (SRS) of size $n$ drawn from a population with mean $\mu$ and stdev $\sigma$. We assume that the mean and variance are finite numbers.
• What is the approximate sampling distribution of $\bar{X}$?
• What range of values of $\bar{X}$ might you expect if you repeated the data collection many times?

Choosing a confidence limit

• Now we need to set our confidence limits: how likely are we to trap the value of the population parameter $\mu$, based on confidence intervals constructed from many random samples?
• Suppose we want a confidence level of $C = 95\%$.
• We know $Z = (\bar{X} - \mu)/\sigma_{\bar{X}}$, with $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, is approximately a standard normal $N(0,1)$.
• What $z$ values of the $N(0,1)$ distribution capture the central 95% of the data?

A large sample 95% confidence interval (CI) for $\mu$

• We have that
  $$P(-1.96 < Z < 1.96) \approx 0.95, \qquad Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}.$$
• Rearranging the term in the brackets:
  $$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \approx 0.95$$
  $$P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) \approx 0.95$$
• Thus a large-sample 95% confidence interval for $\mu$ is given by
  $$\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right).$$
• More compactly, we write this as
  $$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}.$$

Interpreting a confidence interval

"If I took repeated random samples from the population, in the long run 95% of the time the true population mean $\mu$ is captured by the calculated confidence limits."

Runners example

Recall the weights (in kilograms) of the 24 male runners summarized above, with sample mean 61.792 kg. Suppose that the standard deviation of the population is known to be $\sigma = 4.5$ kg.

(a) What is $\sigma_{\bar{X}}$, the standard deviation of $\bar{X}$?
(b) Calculate a 95% confidence interval for $\mu$, the mean of the population from which the sample is drawn.
(c) Calculate a 99.7% confidence interval for $\mu$.

The Confidence Level C and Critical Values

• Suppose we want some confidence level $C = 100(1 - \alpha)\%$.
• The interval now becomes
  $$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},$$
  where $z_{\alpha/2}$ is the $z$ value corresponding to an upper tail probability of $\alpha/2$ in the standard normal distribution.
• Common values of $z_{\alpha/2}$ (the critical values) are:

Confidence level C             90%    95%    99%
$\alpha$                       0.10   0.05   0.01
$\alpha/2$                     0.05   0.025  0.005
Critical value $z_{\alpha/2}$  1.645  1.960  2.576

One-sided confidence intervals

• A lower confidence interval: $\bar{x} - z_{\alpha}\,\sigma/\sqrt{n} < \mu$.
• An upper confidence interval: $\mu < \bar{x} + z_{\alpha}\,\sigma/\sqrt{n}$.
• We proceed as with the two-sided interval, but calculate the area under the standard normal curve differently. Instead of an area of $\alpha/2$ at either side of the curve, we put all $\alpha$ on the left (lower interval) or all $\alpha$ on the right (upper interval).

The analysis in MINITAB

1. Load the dataset into MINITAB.
2. Examine the data with graphs and summary statistics.
3. Select Stat > Basic Statistics > 1-Sample Z.
   • Variable: C1
   • Standard deviation: Enter the population value of $\sigma$.
   • Under Options:
     If you want a lower confidence limit for $\mu$, change the Alternative to "greater than".
     If you want an upper confidence limit for $\mu$, change the Alternative to "less than".
     You can also change the confidence level of the intervals.

Testing statistical hypotheses

• Proof by Transposition:
  Suppose it is raining.
  Then the grass must be wet.
  The grass is not wet.
  Contradiction! It is not raining.
• A hypothesis test mirrors the proof by transposition.
• Compare the proof to our hypothesis test. The two arguments match exactly until the "Contradiction" step:
  Proof by transposition: There is a logical contradiction.
  Hypothesis test: It is implausible to explain the data as anything but a contradiction.

Hypotheses

• A hypothesis is a claim or statement about the value of a population characteristic (parameter) or characteristics.
  Examples:
  1. Let $\mu$ = mean ball bearing weight on a production line. Hypothesis: $\mu = 100$ grams.
  2. Let $p$ = proportion of OSU researchers who use statistics in their research. Hypothesis: $p = 0.5$.
• Hypotheses should not depend on the sample data. These are not hypotheses:
  1. $\bar{x} = 95$
  2. $\hat{p} = 0$

The null and alternative hypotheses

• The null hypothesis, $H_0$, is the claim about one or more populations or population characteristics that is initially assumed to be true.
  We will assume the null hypothesis is of the form
  $H_0$: population parameter = some value.
• The alternative (or alternate) hypothesis, denoted by $H_a$, is the claim to which we wish to compare $H_0$ (some people use $H_1$ instead of $H_a$).
  We will assume that $H_a$ has one of three forms:
  1. $H_a$: population parameter $\neq$ some value
  2. $H_a$: population parameter $<$ some value
  3. $H_a$: population parameter $>$ some value
  These alternative hypotheses are one- or two-sided.

Hypothesis examples
In each of the following situations, a significance test for a population mean $\mu$ is called for. State the null hypothesis $H_0$ and the alternative hypothesis $H_a$ in each case.

• Experiments on learning in animals sometimes measure how long it takes a mouse to find its way through a maze. The mean time is 20 seconds for one particular maze. A researcher thinks that playing rap music will cause the mice to complete the maze faster. She measures how long each of 12 mice takes with rap music as a stimulus.
• In a botanical study there is interest in measuring the average nitrogen percentage of plants of the species Leucaena leucocephala grown in a laboratory. It is known that the average nitrogen percentage for the species found in nature is 3. The researcher believes that the average percentage may be higher for lab-grown plants.

Pollution example

• Thirty samples were taken from a stream, and the pollution level in parts per million (ppm) was recorded for each sample. The average pollution level for the data was $\bar{x} = 10.1$ ppm. Suppose that the population standard deviation is 2.7 ppm. The investigator who collected the samples is interested in determining whether or not there is evidence that the population mean pollution level is greater than 9.5 ppm.

The testing procedure

• We see how well the sample data support the null hypothesis.
• As a conclusion, we either reject $H_0$ or fail to reject $H_0$.
• Although you may hear people say it, we never accept $H_a$!

Errors in tests of significance

• Consider the truth about the world:

                            The truth
                            $H_0$ is true       $H_0$ is false
We reject $H_0$             Type I error        correct decision
We fail to reject $H_0$     correct decision    Type II error

• Type I error: we reject $H_0$ when $H_0$ is actually true.
• Type II error: we fail to reject $H_0$ when $H_0$ is actually false.
• When we base our inference on data from a study, we cannot completely eliminate either type of error.

The probabilities of making each error

• The significance level (or size) of the test is
  $\alpha$ = P(making a type I error) = P(reject $H_0$ given that $H_0$ is true).
  E.g., for an $\alpha = 0.05$ test, if $H_0$ is true and the test was repeatedly run on different random samples from the same population, in the long run $H_0$ would be rejected 5% of the time.
• The other probability of error is
  $\beta$ = P(making a type II error) = P(fail to reject $H_0$ given that $H_0$ is false).

A compromise in errors

• In practice, $\alpha$ increases as $\beta$ decreases, and vice versa.
• Common strategy:
  1. Select a test statistic that allows you to distinguish between the null and alternative hypotheses.
  2. Choose the significance level $\alpha$ for your test (e.g., people often use $\alpha = 0.05$ or $\alpha = 0.01$).
  3. Find a testing procedure that leads to a small $\beta$, given this choice of $\alpha$.
• The tests in this course were constructed in this way.

Defining a test statistic

• A test statistic is a function of the data which is calculated and used as a judge between $H_0$ and $H_a$.
• We calculate the test statistic for the sample data (the observed value) and ask the question: based on the sampling distribution of the test statistic under the assumption that $H_0$ holds, how likely is it to obtain data that clash with $H_0$ at least as much as the observed value does?

Pollution example, cont.

• We know that the sample mean is an unbiased estimate of the true population mean $\mu$.
• Thus we will base our test on the sample mean $\bar{X}$. For the sample size we have, $n = 30$, it seems reasonable to assume that $\bar{X}$ is approximately $N(\mu, \sigma_{\bar{X}})$, where $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
• Standardizing, we have that
  $$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
  is approximately $N(0,1)$.
• $Z$ is called the test statistic. The test is called the z-test.

The observed test statistic

• Question: is the observed test statistic for our data compatible with $H_0$?
• For the example, $\bar{x} = 10.10$, $\sigma = 2.7$ and $n = 30$.
• When $H_0$ is true, the population mean is $\mu = 9.5$, so the observed test statistic is $z = (10.10 - 9.5)/(2.7/\sqrt{30}) \approx 1.22$.
• Does this value of $z$ support the null hypothesis or not? We answer this question by calculating a P-value.

Under the null hypothesis

• Remember we are testing $H_0: \mu = 9.5$ versus $H_a: \mu > 9.5$.
• Under $H_0$, $Z = (\bar{X} - 9.5)/\sigma_{\bar{X}}$ has a $N(0,1)$ distribution.
• How does the observed test statistic $z$ compare with this distribution?

[Figure: standard normal density curve]

The p-value

• The p-value is the probability that the test statistic takes a value as extreme or more extreme than the observed test statistic. The probability calculation is based on the sampling distribution of the test statistic, assuming $H_0$ is true.
  The smaller the P-value, the more evidence in the data against $H_0$.
• For a test of significance level $\alpha$:
  If P-value $\le \alpha$, we reject $H_0$.
  If P-value $> \alpha$, we fail to reject $H_0$.
• The form of the P-value calculation depends on $H_a$, not $H_0$.
• Let $z$ be the observed test statistic and let $Z$ be a standard normal RV (the distribution of our test statistic under $H_0$). We will illustrate the cases when $z$ is negative or positive.

Calculating the P-value: one-sided $H_a$

• For $H_a: \mu < \mu_0$, the P-value is $P(Z \le z)$.

[Figure: two standard normal density curves, horizontal axis from -3 to 3 with the observed z marked]

• For $H_a: \mu > \mu_0$, the P-value is $P(Z \ge z)$.

[Figure: two standard normal density curves, horizontal axis from -3 to 3 with the observed z marked]

Calculating the P-value: two-sided $H_a$

• For $H_a: \mu \neq \mu_0$, the P-value is $2P(Z \ge |z|)$.

[Figure: standard normal density curve]

Testing example: The runners revisited

You measure the weights of 24 male runners. You do not actually choose an SRS, but are willing to assume that these runners are a random sample from the population of male runners in your town or city. The mean of the sample is 61.79 kg. Suppose that the standard deviation of the population is known to be $\sigma = 4.5$ kg. Is there evidence that the population mean weight is not equal to 64?

• Write down the hypotheses for your test. Make sure you define all the quantities in your hypotheses.
• Calculate the value of the observed $z$ test statistic.
• Calculate the P-value for your test.
• Is the result significant at the 5% level (i.e., $\alpha = 0.05$)? At the 1% level?
• Interpret your result in words.

The steps in MINITAB

1. Load the dataset into MINITAB.
2. Examine the data with graphs and summary statistics.
3. Select Stat > Basic Statistics > 1-Sample Z.
   • Variable: C1
   • Standard deviation: 4.5
   • Tick "Perform hypothesis test".
   • Test mean: 64
   • Under Options, select an Alternative of "not equal to".

One-Sample Z: runner weights

Test of mu = 64 vs not = 64
The assumed standard deviation = 4.5

Variable         N   Mean    StDev  SE Mean  95% CI            Z      P
runner weights   24  61.792  4.808  0.919    (59.991, 63.592)  -2.40  0.016

Thinking about hypothesis tests

Which of the following does a test of significance answer?

• Is the sample or experiment properly collected or designed?
• Is the observed effect important?
• Is the observed effect due to chance?
• Could the observed effect be due to chance?
• Is the hypothesized parameter value consistent with the data?

Validity of hypothesis tests

• The following causes can destroy the validity of a hypothesis test:
  - Bad survey or experimental design (e.g., lack of control, badly worded survey questions).
  - Faulty data collection (e.g., we do not have a random sample).
  - Poor approximations for sampling distributions (e.g., for a Z test based on $\bar{x}$: the data contain outliers, the population distribution is heavily skewed, $\sigma$ is unknown).
• We need to adjust our testing scheme if we carry out multiple tests at once. Even when $H_0$ is true in each case, it is possible to produce a significant result by chance alone.

Failing to reject

• Suppose we test $H_0: \mu = \mu_0$ versus $H_a: \mu \neq \mu_0$.
• If we fail to reject $H_0$, then either $H_0$ is true OR there is not enough evidence in the data to reject $H_0$.
• The second case is related to the power of a test: the probability of detecting an effect of the size you hope to find (yet to come).
• Remember: a lack of significance is still a positive result. (Journals tend to disagree.)

The perils of a 0.05 significance level

Suppose that SAT math scores vary normally with $\sigma = 100$. One hundred students go through a rigorous training program to raise their scores by improving their mathematics skills. We carry out a test of $H_0: \mu = 475$ versus $H_a: \mu > 475$.

• Suppose the average student score is $\bar{x} = 491.4$. Then $z = 1.64 < 1.645$, i.e., not significant at the 0.05 level.
• Suppose the average student score is $\bar{x} = 491.5$. Then $z = 1.65 > 1.645$, i.e., significant at the 0.05 level.

Beware attempts to treat $\alpha = 0.05$ as a sacred number!
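The two SAT calculations above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function name upper_tail_z_test and the results dictionary are illustrative choices, not part of the course materials, and the one-sided P-value is computed as $P(Z \ge z)$ under a standard normal.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(xbar, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 versus Ha: mu > mu0, sigma known.

    Hypothetical helper for illustration only.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 1.0 - NormalDist().cdf(z)  # upper-tail area P(Z >= z)
    return z, p_value


alpha = 0.05
results = {}
for xbar in (491.4, 491.5):
    z, p = upper_tail_z_test(xbar, mu0=475, sigma=100, n=100)
    results[xbar] = (z, p)
    verdict = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"xbar = {xbar}: z = {z:.2f}, P-value = {p:.4f}, {verdict}")
```

The two P-values come out at roughly 0.0505 and 0.0495: a change of 0.1 points in the sample mean flips the decision, although the evidence against $H_0$ is essentially the same in both cases.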
Statistical versus practical significance

• In a hypothesis test, when the P-value is smaller than the chosen significance level $\alpha$, we say the result is statistically significant.
• The observed deviation from what was expected under $H_0$ cannot be attributed to sampling variation alone.
• But is the actual difference significant in practice?
  Example: A pharmaceutical company markets a new cold treatment as "significantly better" than a competing medication. Would you pay twice as much for the new medicine?
• Practical significance will be specific to each problem.

Relating confidence intervals and hypothesis tests

• We observe a random sample $x_1, x_2, \ldots, x_n$ from a population with mean $\mu$.
• Recall: we used two-sided confidence intervals to identify plausible values for the population parameter $\mu$:
  $$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
• Recall the testing approach for a two-sided alternative:
  $H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$,
  $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
  Reject $H_0$ whenever $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$.
• How might we use the testing approach to find a range of plausible values for the population parameter $\mu$?

Using a Confidence Interval to test

• We can use the two-sided $100(1 - \alpha)\%$ confidence interval to get a test of significance level $\alpha$ for the following hypotheses:
  $H_0: \mu = \mu_0$ (for some constant $\mu_0$)
  $H_a: \mu \neq \mu_0$
• Procedure:
  Reject $H_0$ if $\mu_0$ is not contained in the CI.
  Fail to reject $H_0$ if $\mu_0$ is contained in the CI.

An example of relating CIs and tests

• In the confidence interval notes for the runners example, we calculated a 95% CI for $\mu$, the mean male runner weight for your town or city. It was given by (59.99 kg, 63.59 kg).
  We can use this CI to carry out the test of $H_0: \mu = 64$ versus $H_a: \mu \neq 64$ when $\alpha = 0.05$.
• What is your conclusion?
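The link between the interval and the test can be sketched for the runners example as follows. This is an illustrative Python check using only the standard library, not the MINITAB procedure from the notes; the variable names are assumptions, and the inputs (sample mean 61.792, known standard deviation 4.5, sample size 24, hypothesized mean 64) are taken from the example.

```python
from math import sqrt
from statistics import NormalDist

# Runners example: sample mean, known population sd, sample size.
xbar, sigma, n = 61.792, 4.5, 24
mu0, alpha = 64.0, 0.05

se = sigma / sqrt(n)                          # standard deviation of the sample mean
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05

# Two-sided 100(1 - alpha)% confidence interval for mu.
ci = (xbar - z_crit * se, xbar + z_crit * se)

# Two-sided z test of H0: mu = mu0 versus Ha: mu != mu0.
z = (xbar - mu0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_by_ci = not (ci[0] <= mu0 <= ci[1])    # mu0 falls outside the interval
reject_by_p = p_value <= alpha                # P-value rule

print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"z = {z:.2f}, P-value = {p_value:.3f}")
print(f"reject by CI: {reject_by_ci}, reject by P-value: {reject_by_p}")
```

Both rules lead to the same decision here, since 64 lies outside the 95% interval exactly when $|z| > z_{\alpha/2}$, which is the same event as the two-sided P-value falling below 0.05.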