
# Class Note for STAT 528 at OSU 67


This 31 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at Ohio State University taught by a professor in Fall. Since its upload, it has received 17 views.


Date Created: 02/06/15

Statistics 528: Data Analysis, Lecture 8 (July 18, 2006)

Christopher Holloman, The Ohio State University, Summer 2006

## Overview of Today's Lecture

- IPS Sections 6.2, 7.1
  - Tests of Significance
  - Inference for the Mean of a Population

## Intro to Hypothesis Tests

Two of the most common types of statistical inference:

1. Confidence intervals: the goal is to estimate a population parameter and communicate the uncertainty in our estimate.
2. Tests of significance: the goal is to assess the evidence provided by the data about some claim concerning the population.

## Basic Idea of Tests of Significance

Example: Each day Tom and Heather decide who pays for lunch based on a toss of Tom's favorite quarter. Heads: Tom pays. Tails: Heather pays.

- Tom claims that heads and tails are equally likely outcomes for this quarter.
- Heather thinks she pays more often.

Heather steals the quarter, tosses it 10 times, and gets 7 tails (70% tails). She is furious and claims that the coin is not fair.

There are two possibilities:

1. Tom is telling the truth: the chance of tails is 50%, and the observation of 7 tails out of 10 tosses was only due to sampling variability.
2. Tom is lying: the chance of tails is greater than 50%.

Suppose they call you to decide between the two possibilities. To be fair to both of them, you toss the quarter 25 times. Suppose you get 21 tails. What would you conclude? Why?

The coin is probably not fair. Even with sampling variability, it is unlikely that a fair coin would give such a high percentage of tails. The actual probability of getting 21 or more tails in 25 tosses is 0.000455 if the coin is fair.

Moral of the story: an outcome that would rarely happen if a claim were true is good evidence that the claim is in fact not true. This is the idea behind hypothesis testing.

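
The tail probability quoted for the coin example can be reproduced with a short calculation. The sketch below is not part of the original slides; the function name `binomial_upper_tail` is an illustrative choice, and it assumes the tosses are independent with a fixed chance of tails.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Heather's 10 tosses: 7 or more tails from a fair coin.
p_heather = binomial_upper_tail(10, 7)
# The 25 tosses made to settle the dispute: 21 or more tails from a fair coin.
p_referee = binomial_upper_tail(25, 21)

print(f"P(at least 7 tails in 10)  = {p_heather:.4f}")   # 0.1719
print(f"P(at least 21 tails in 25) = {p_referee:.6f}")   # 0.000455
```

Seven or more tails in ten tosses happens about 17% of the time with a fair coin, so Heather's own experiment is weak evidence. Twenty-one or more tails in 25 tosses is the rare outcome that casts doubt on Tom's claim.
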
- A hypothesis is a statement about the parameters in a population (we will be making statements about $\mu$ in Section 6.2).
- A hypothesis test, or significance test, is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess.
- The results of a test are expressed in terms of a probability that measures how well the data and hypothesis agree.

## Performing a Hypothesis Test

### 1. State Hypotheses

State your research question as two hypotheses, the null and the alternative hypotheses. These hypotheses are written in terms of the population parameters.

The null hypothesis, $H_0$, is the statement being tested. It is assumed true and compared to the data to see if there is evidence against it. A null hypothesis that we will see often is that the mean $\mu$ is equal to some standard value. Usually null hypotheses give a statement of "no difference" or "no effect".

Suppose we want to test the null hypothesis that $\mu$ is some specified value, say $\mu_0$. Then

$$H_0: \mu = \mu_0$$

Note: We will always express $H_0$ using an equality sign.

The alternative hypothesis, $H_a$, is the statement about the population parameter that we hope or suspect is true. We are interested in seeing if the data support this hypothesis.

- $H_a$ can be one-sided: $H_a: \mu > \mu_0$ or $H_a: \mu < \mu_0$
- $H_a$ can be two-sided: $H_a: \mu \neq \mu_0$

### Example: Strawberry Bars

Kellogg's says that its strawberry bars weigh on average 1.6 oz. A TV consumer reporter is suspicious that the bars weigh less than what is claimed. In order to check his suspicion, he weighs the contents of 20 randomly chosen bars. These 20 bars have an average weight of 1.56 oz. Assume that the weights follow a normal distribution with a standard deviation $\sigma = 0.07$ oz. Is there evidence that the reporter's suspicion is correct?

- The hypotheses are:

  $$H_0: \mu = 1.6 \qquad H_a: \mu < 1.6$$

- Is this a one-sided test or a two-sided test? This is a one-sided test: the reporter thought the bars were smaller than 1.6 oz.

### 2. Calculate P-value

We ask: does the sample give evidence against the null hypothesis?

P-value: the probability that the sample mean would take a value as extreme or more extreme than the one we actually observed, assuming $H_0$ is true.

In the strawberry bar example, this means: is the sample average much less than we would expect to see if $\mu$ really is 1.6?

- We need to find the probability that we get a sample of 20 bars whose mean is 1.56 or less, given that $\mu = 1.6$.

Question: Why 1.56 or less? Because smaller sample means are more extreme evidence against our null hypothesis and in support of the alternative hypothesis.

- What is the distribution of $\bar{x}$, the average weight of the bars sampled, if the null hypothesis is true? $N(1.6,\ 0.07/\sqrt{20})$
- What area under the $N(1.6,\ 0.07/\sqrt{20})$ curve corresponds to the p-value? The area to the left of 1.56.
- Calculate the test statistic (z-score):

  $$z = \frac{1.56 - 1.6}{0.07/\sqrt{20}} = -2.56$$

- Calculate the P-value: using Table A, the area to the left of $-2.56$ is 0.0052.

The probability of getting more extreme evidence against $H_0: \mu = 1.6$, given that $\mu = 1.6$, is 0.0052.

More on p-values:

- P-values correspond to tail areas under the density curve.
- For the one-sided test in the strawberry bar example, the p-value was the area to the left of the test statistic.
- What area corresponds to the p-value of a two-sided test? In the case of two-sided tests, we could observe something as extreme or more extreme in either direction, so the p-value includes two tail areas.

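
The z statistic and tail areas described above can be computed directly instead of being read from Table A. This is a minimal sketch using Python's standard library; the helper names `one_sample_z` and `z_p_value` and the `alternative` labels are illustrative assumptions, not anything defined in the lecture or in Minitab.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar: float, mu0: float, sigma: float, n: int) -> float:
    """z statistic for a sample mean when the population sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))


def z_p_value(z: float, alternative: str) -> float:
    """Tail area under the standard normal curve for the given alternative."""
    phi = NormalDist().cdf  # standard normal cumulative distribution function
    if alternative == "less":       # Ha: mu < mu0, area to the left of z
        return phi(z)
    if alternative == "greater":    # Ha: mu > mu0, area to the right of z
        return 1 - phi(z)
    if alternative == "two-sided":  # Ha: mu != mu0, both tails
        return 2 * (1 - phi(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


# Strawberry bar example: H0: mu = 1.6 vs Ha: mu < 1.6
z = one_sample_z(xbar=1.56, mu0=1.6, sigma=0.07, n=20)
p = z_p_value(z, "less")
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

Because the code does not round $z$ to two decimals before looking up the area, the p-value it prints is close to, but not exactly, the Table A value of 0.0052.
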
P-values in terms of the test statistic:

| $H_a$ | P-value | Area under curve |
|---|---|---|
| $\mu < \mu_0$ | $P(Z \le z)$ | Left of $z$ |
| $\mu > \mu_0$ | $P(Z \ge z)$ | Right of $z$ |
| $\mu \neq \mu_0$ | $2P(Z \ge \lvert z \rvert)$ | Both tails |

where $z$ is the observed value of the test statistic and the probabilities are found using the standard normal distribution given in Table A.

- A p-value is exact if the population distribution is normal.
- If the population is not normal, the p-value approximates the true probability for large $n$ because of the Central Limit Theorem.

### 3. State Your Conclusions

- The final step is to decide if there is strong enough evidence to reject $H_0$ in favor of $H_a$. This is accomplished using the P-value.
- In our example we got a P-value of 0.0052. What does this tell us? If $H_0$ is true (i.e., the true mean weight is 1.6 oz), then the chance of getting a sample whose mean weight is 1.56 oz or less is 0.52%.

Does it give evidence against $H_0$? Yes: it is very unlikely that we would observe a sample mean as low as we did if $H_0$ is true.

Conclusion: We reject the null hypothesis.

- A small p-value is strong evidence against $H_0$. Such a p-value says that if $H_0$ is true, then the observed data are unlikely to occur just by chance.
- The smaller the P-value, the stronger the evidence against the null.

Question: How small does the P-value need to be?

Prior to testing, it is determined how small the P-value must be to be considered decisive evidence against $H_0$. This value is called the significance level and is usually represented as $\alpha$. Typical $\alpha$ levels used are 0.1, 0.05, and 0.01.

- If the P-value $\le \alpha$, reject the null hypothesis.
- If the P-value $> \alpha$, do not reject the null hypothesis.
- If we use an $\alpha$ level of 0.01 and the p-value is smaller than $\alpha$, we can say that there is a less than one in one hundred
chance that we would observe data as extreme or more extreme than what we saw if the null hypothesis is true.
- If the P-value $\le \alpha$, we say the data are statistically significant at level $\alpha$.

Note: When we do not reject $H_0$, we are not claiming $H_0$ is true. We are just concluding there is not sufficient evidence to reject it.

### Example: Ameritech

Suppose last year Ameritech's repair service took an average of 3.1 days to fix customer complaints. One of the managers is assigned to check if this year's data show a different average time to fix problems. He collects a random sample of 30 customer complaints and finds that the average time taken to fix them is 2.1 days. Assume that the standard deviation of the time taken to fix the complaints is 2.5 days. Is this good evidence that the average time taken to fix the complaints is not 3.1 days?

- State the hypotheses:

  $$H_0: \mu = 3.1 \qquad H_a: \mu \neq 3.1$$

- Calculate the test statistic:

  $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{2.1 - 3.1}{2.5/\sqrt{30}} = -2.19$$

- Calculate the P-value: since this is a two-sided test, we find the area to the left of $-2.19$ and then double it to get the p-value.

  $$\text{P-value} = 2 \times 0.0143 = 0.0286$$

- What is our conclusion at the 0.05 level? Since the p-value is less than 0.05, the test is significant at the 0.05 level. We reject the null.
- What is our conclusion at the 0.01 level? The p-value is larger than 0.01, so the test is not significant at this level. We would not reject the null at this level.

### Example: EPA

The EPA limit on concentration of PCB in drinking water is 5 ppm. Wells are regularly tested to make sure they are not over the limit. A random sample of 100 water specimens from a well was collected and has an average PCB concentration of 5.1 ppm. Is there evidence at the 5% level that the well is over the limit? Assume that the PCB concentration varies with standard deviation 0.8.

- State the hypotheses:

  $$H_0: \mu = 5 \qquad H_a: \mu > 5$$

- Calculate the test statistic:

  $$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.1 - 5}{0.8/\sqrt{100}} = 1.25$$

- Calculate the P-value: the P-value is the area under the normal curve to the right of 1.25.

  $$\text{P-value} = 0.1056$$

Since the p-value is larger than 0.05, we do not have evidence at the 5% level that the PCB level exceeds the limit. We do not reject the null hypothesis.

## Tests from Confidence Intervals

We have covered two types of statistical inference procedures for the population mean $\mu$: confidence intervals (CI) and tests of significance.

Question: Is there any relationship between hypothesis tests and confidence intervals?

Answer: Yes. A level $\alpha$ two-sided test rejects a hypothesis $H_0: \mu = \mu_0$ exactly when the value $\mu_0$ falls outside a level $1 - \alpha$ CI for $\mu$.

- Bud is interested in knowing if the mean weight of all blocks is 68 lbs or not. State the appropriate hypotheses. Using the above CI, what do you conclude?

  $$H_0: \mu = 68 \qquad H_a: \mu \neq 68$$

  Our confidence interval was 64.373 to 66.627. Since 68 does not fall in this interval, we reject $H_0$. The significance level of the above test is 5% (0.05) because we used a 95% confidence interval.

### Example: Jupiter

The diameter of Jupiter is measured 100 times independently using a new unbiased process. Using these 100 measurements, a 99% CI for the true diameter is computed to be 88707 miles to 88733 miles. Is there evidence at the 1% level that the true diameter of Jupiter is not 88720 miles?

- What would the hypotheses be?

  $$H_0: \mu = 88720 \qquad H_a: \mu \neq 88720$$

- What do you conclude? Since 88720 falls in the 99% confidence interval (88707 miles, 88733 miles), we do not reject the null hypothesis when testing at the 1% level.

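
The link between two-sided tests and confidence intervals can be checked numerically. The sketch below reuses the figures from the Ameritech example earlier in these notes (sample mean 2.1, null value 3.1, $\sigma = 2.5$, $n = 30$). The function names are illustrative assumptions, and the critical values come from `statistics.NormalDist` rather than from Table A.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_interval(xbar: float, sigma: float, n: int, level: float) -> tuple[float, float]:
    """Confidence interval for mu at the given level when sigma is known."""
    z_star = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)  # upper critical value
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def two_sided_p_value(xbar: float, mu0: float, sigma: float, n: int) -> float:
    """P-value for H0: mu = mu0 against Ha: mu != mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


# Ameritech figures from the notes
xbar, mu0, sigma, n = 2.1, 3.1, 2.5, 30
p_value = two_sided_p_value(xbar, mu0, sigma, n)

results = {}
for alpha in (0.05, 0.01):
    low, high = z_interval(xbar, sigma, n, level=1 - alpha)
    test_rejects = p_value <= alpha
    mu0_outside_ci = not (low <= mu0 <= high)
    results[alpha] = (test_rejects, mu0_outside_ci)
    print(f"alpha={alpha}: CI=({low:.3f}, {high:.3f}), "
          f"test rejects={test_rejects}, mu0 outside CI={mu0_outside_ci}")
```

At each level the two booleans agree: 3.1 lies outside the 95% interval and the test rejects at 0.05, while 3.1 lies inside the 99% interval and the test does not reject at 0.01.
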
## More on Stating Your Conclusions

- When working with hypothesis tests, there are many ways to state your conclusion.
- The following four statements convey the same conclusion:
  1. The test is significant.
  2. Reject the null hypothesis.
  3. The data show strong evidence against $H_0$.
  4. The p-value is smaller than $\alpha$.
- These four statements also convey the same conclusion:
  1. The test is not significant.
  2. Do not reject the null hypothesis.
  3. The data do not show evidence against $H_0$.
  4. The p-value is larger than $\alpha$.
- Usually 1, 2, or 3 is given as the conclusion, and 4 is given as the explanation of the conclusion.

## Confidence Intervals and Hypothesis Tests in Minitab

1. Use Minitab to get descriptive statistics and then use formulas.
2. Use Minitab directly to compute confidence intervals and perform tests: Stat > Basic Statistics > 1-Sample Z

Note: This function is for computing confidence intervals and hypothesis tests of $\mu$, the population mean, assuming the population standard deviation is known (Section 6.1 and Section 6.2).

Stat > Basic Statistics > 1-Sample Z

- Variables: enter column of data
- Sigma: known value of the population standard deviation
- Test Mean: value of the mean under the null hypothesis $H_0$

Click on the Options box:

- Confidence level: level $C$ of the confidence interval, or level $1 - \alpha$ of a hypothesis test
- Alternative: form of the alternative hypothesis
  - Not equal: $H_a: \mu \neq \mu_0$
  - Less than: $H_a: \mu < \mu_0$
  - Greater than: $H_a: \mu > \mu_0$

Note: You need to select "not equal" as the alternative to calculate an equal-tails confidence interval like the ones we've been doing.

## t-Tests

Previously, when making inferences about the population mean $\mu$, we were assuming:

1. Our data (observations) are an SRS of size $n$ from the population.
2. The observations come from a normal distribution with parameters $\mu$ and $\sigma$.
3. The population standard
deviation $\sigma$ is known.

To perform statistical inference we were using the one-sample $z$ statistic

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

which has a normal distribution. This holds approximately for large samples even if assumption 2 is not satisfied. Why? The Central Limit Theorem.

Issue: In a more realistic setting, assumption 3 is not satisfied. That is, $\sigma$ is unknown.

In more realistic situations where $\sigma$ is unknown, we can use the sample standard deviation $s$ as an estimate of the population standard deviation $\sigma$:

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

$s/\sqrt{n}$ is called the standard error of the sample mean.

When making inferences about the population mean $\mu$ with $\sigma$ unknown, we use the one-sample $t$ statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

Question: What is the distribution of a one-sample t statistic?

Answer: It has a t distribution. Note: It is NOT normally distributed.

We specify a particular t-distribution by giving its degrees of freedom (df). When we are using the one-sample t statistic, the degrees of freedom are equal to $n - 1$, where $n$ is the sample size. There is a different t-distribution for each sample size.

Question: How does the t-distribution compare with the normal distribution?

The t-distribution is similar in shape to the standard normal distribution:

1. Symmetric about zero
2. Unimodal
3. Bell-shaped

- The spread of a t-distribution is larger than that of a standard normal distribution. That is, there is more probability in the tails of a t-distribution.
- This makes sense because the t statistic should have more variability than the test statistic $z$ that we used in Chapter 6.
- Why? There is added variability in the t statistic since it uses $s$, an estimate of $\sigma$, rather than a known fixed value of $\sigma$.

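
The claim that a t-distribution carries more probability in its tails than the standard normal can be illustrated numerically. Python's standard library has no t-distribution, so this sketch integrates the t density with Simpson's rule; that numerical approach, the function names, and the cutoff of 2.0 are all assumptions made for illustration and stand in for a table or statistical package.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with `df` degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t), using Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    h = a / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3  # half the area lies to the right of zero
    return upper if t >= 0 else 1 - upper


cutoff = 2.0
normal_tail = 1 - NormalDist().cdf(cutoff)
tails = {df: t_upper_tail(cutoff, df) for df in (2, 5, 30, 100)}

for df, tail in tails.items():
    print(f"P(T >= {cutoff}) with df = {df:>3}: {tail:.4f}")
print(f"P(Z >= {cutoff}) for N(0,1):    {normal_tail:.4f}")
```

Every t tail area printed is larger than the normal tail area, and the gap shrinks as the degrees of freedom grow.
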
- Notation: $t(k)$ or $t_k$ represents the t-distribution with $k$ df.
- As the df $k$ increases, the $t(k)$ distribution approaches the $N(0,1)$ distribution. As the sample size increases, $s$ estimates $\sigma$ more accurately, so there is little extra variation.

[Figure: density curves of the $N(0,1)$ and $t(5)$ distributions]

## Finding Areas Under the t Distributions

- We use Table C to find areas under the t distributions. Note: Table C works very differently from Table A.
- Table C gives critical values of t distributions for various df.
- The numbers in the middle of the chart are values from t distributions. Each row corresponds to a t distribution with the degrees of freedom given at the beginning of the row. The numbers in the top row are right-tail areas.

Areas under t distributions can also be found using Minitab: Stat > Probability Distributions > t

Example: If we go across the row for nine degrees of freedom and down the column for an area of .05, we get a t value of 1.833. That means that for a $t(9)$ distribution, the area under the curve to the right of 1.833 is 0.05.

Note: There are no negative t values in the table. To get the area to the left of a negative t statistic, take advantage of the symmetry of the distribution.

## Inference Procedures Using the t Statistics

We use the t distributions to do the same types of inferences we did in Chapter 6: confidence intervals and hypothesis tests. The procedures are very similar to what we learned in Chapter 6, only we replace the $\sigma$'s with $s$'s and $z$'s with $t$'s.

Both confidence intervals and hypothesis tests can be performed using Minitab: Stat > Basic Statistics > 1-Sample t

This function is almost identical to the 1-Sample Z function.

## Confidence Interval for $\mu$ When $\sigma$ Is Unknown

A $100(1-\alpha)\%$ confidence interval for $\mu$ is given by $\bar{x} \pm t^*_{\alpha/2}\,\frac{s}{\sqrt{n}}$, where $t^*_{\alpha/2}$ is the upper $\alpha/2$ critical value for the $t(n-1)$
distribution, i.e., the area between $-t^*_{\alpha/2}$ and $t^*_{\alpha/2}$ is $1 - \alpha$.

### Example

A nationwide chain of electronics stores wishes to know the average income of the households in Franklin County before they decide to build a new store here. A random sample of 21 households is taken, and the income of these sampled households turns out to average 35000 with a standard deviation of 15000. Give a 90% confidence interval for the unknown average income of households in Franklin County.

$$\bar{x} = 35000, \quad s = 15000, \quad n = 21, \quad df = 20, \quad t^* = 1.725$$

$$35000 \pm 1.725 \cdot \frac{15000}{\sqrt{21}} = (29354,\ 40646)$$

## Steps for Doing Tests of Significance about $\mu$ When $\sigma$ Is Unknown

1. State the null and the alternative hypotheses.
2. Calculate the one-sample t statistic, assuming $H_0$ to be true:

   $$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

3. Calculate the P-value and use it to draw a conclusion.

Note: This procedure is known as a t test.

- Exact p-values cannot be obtained using Table C, but can be obtained using Minitab.
- To approximate p-values using Table C, locate the row corresponding to the $n - 1$ degrees of freedom. Go across until you find the critical values that your test statistic falls between. Use the corresponding upper-tail probabilities at the top of the table to calculate an interval in which the p-value falls.

Note: Remember to ONLY look in the row for the degrees of freedom that correspond to your sample size.

### Example

Suppose testing $H_0: \mu = 0$ vs. $H_a: \mu > 0$ yields a one-sample t statistic of 1.82 from a sample of size 15.

- What are the degrees of freedom for this statistic?
- Give two critical values $t^*$ from Table C that bracket $t$. What are the right-tail probabilities for these two entries?
- Between what two values does the P-value fall?
- Is $t = 1.82$ statistically
significant at the 5% level? At the 1% level?

### Example

The National Center for Health Statistics reports that the mean systolic blood pressure for males aged 35 to 44 years is 128. The medical director of a large company looks at the medical records of 21 executives in this age group and finds that the mean systolic blood pressure is 127.07 and the standard deviation is 15 in this sample. Is there evidence that the company's executives have a different mean blood pressure than the general population?

- State the hypotheses:

  $$H_0: \mu = 128 \qquad H_a: \mu \neq 128$$

- Calculate the t statistic:

  $$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{127.07 - 128}{15/\sqrt{21}} = -0.28$$

- Estimate the p-value and state the conclusion of the test. The degrees of freedom is 20. Look to see where 0.28 falls in the row for 20 df. It's off the chart: the area to the left of $-0.28$ is larger than 0.25, so our p-value is larger than $2 \times 0.25 = 0.50$. We do not reject the null. We do not have evidence that this company's executives have different blood pressure.

## Cautions about Using the t Procedures

- If the sample size is less than 15 and the data are close to normal, it is okay to use the t procedure. Do not use it if the data are clearly nonnormal and/or if outliers exist.
- If the sample size is $\ge 15$, it is okay to use the t procedure unless outliers or strong skewness exist.
- If the sample size is $\ge 40$, it is okay to use the t procedure even if strong skewness exists.

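
As a numerical check on the blood pressure example, the three steps of the t test can be sketched in code. Table C only brackets the p-value, so this sketch approximates it by integrating the t density with Simpson's rule. The integration approach and every function name here are illustrative assumptions, not the method used in the lecture, which relies on Table C or Minitab.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with `df` degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def one_sample_t(xbar: float, mu0: float, s: float, n: int) -> tuple[float, int]:
    """Return the one-sample t statistic and its degrees of freedom."""
    return (xbar - mu0) / (s / sqrt(n)), n - 1


# Step 1: H0: mu = 128 vs Ha: mu != 128 (blood pressure example)
# Step 2: compute the t statistic from the sample summary
t_stat, df = one_sample_t(xbar=127.07, mu0=128, s=15, n=21)
# Step 3: two-sided p-value, doubling the area beyond |t|
p_value = 2 * t_upper_tail(abs(t_stat), df)

print(f"t = {t_stat:.2f}, df = {df}, two-sided p-value = {p_value:.2f}")
```

The computed p-value is well above the 0.50 lower bound obtained from Table C, which agrees with the decision not to reject the null hypothesis.
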
