# 522 Class Note for STAT 30100 with Professor Howell at Purdue


This 15 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at Purdue University taught by a professor in Fall. Since its upload, it has received 26 views.

Date Created: 02/06/15

### Lecture 7: Chapter 6

Statistical inference is the next topic we will cover in this course. We have been preparing for this by describing and analyzing data (graphs and plots, descriptive statistics), discussing the means to find and/or generate data (studies, samples, experiments), and defining sampling distributions. We are now ready for statistical inference.

The purpose of statistical inference is to draw conclusions from data. It adds to the graphing and analyzing because we substantiate our conclusions by probability calculations. Formal inference is based on the long-run regular behavior that probability describes. When you use statistical inference, the data should come from a random sample or a randomized experiment.

In this chapter we will learn the two most prominent types of formal statistical inference: confidence intervals, for estimating the value of a population parameter, and tests of significance, which assess the evidence for a claim.

#### Section 6.1: Estimating with confidence

A confidence interval for a population parameter includes a point estimate and a margin of error. The point estimate is a single statistic calculated from a random sample of units. For example, $\bar{x}$, the sample mean, is a point estimate of $\mu$, the population mean. Point estimates give us very little information, so we add a margin of error to our point estimate, making up a confidence interval.

**Example 1.** You want to estimate the mean SAT Math score for high school seniors in California. At considerable effort and expense, you give the test to a simple random sample of 500 high school seniors. The mean score for your sample is $\bar{x} = 461$ points. The standard deviation of the SAT Math test is a known $\sigma = 100$ points.

Questions:

1. Can we include a measure of the precision associated with the point estimate?
2. Can we include a measure of our confidence in our results?

Answer: Yes, we can construct a confidence interval for $\mu$.

A confidence interval is calculated from the sample data, and it represents an interval
estimate of the population parameter. A confidence interval includes:

1. An interval computed from the sample. The interval is a measure of the variability of our point estimate.
2. A confidence level. This confidence level measures the confidence that our inference is correct.

In this lesson we want to find a confidence interval for a population mean $\mu$.

Given an SRS of $n$ units from the population, we can calculate the sample mean $\bar{x}$. This single calculated value $\bar{x}$ represents an outcome in the sampling distribution of $\bar{X}$. To obtain a 95% confidence interval for $\mu$ based on this single observed value, we treat our observed outcome $\bar{x}$ as though it is the true mean of the sampling distribution of $\bar{X}$. We then construct our interval about this observed value $\bar{x}$. For example, we find the values of both the 2.5th percentile (lower bound) and the 97.5th percentile (upper bound) of the normal distribution with a mean of $\bar{x}$ and a standard deviation of $\sigma/\sqrt{n}$. Using the techniques for normal distributions, we find that the 2.5th percentile of this distribution is $\bar{x} - 1.96\,\sigma/\sqrt{n}$ and the 97.5th percentile is $\bar{x} + 1.96\,\sigma/\sqrt{n}$. These two percentiles form the lower and upper limits of our 95% confidence interval for $\mu$.

From Table A we find the following:

- For a 95% confidence interval: $z^* = 1.960$
- For a 90% confidence interval: $z^* = 1.645$
- For a 99% confidence interval: $z^* = 2.576$

**Example 1.** Suppose we wish to estimate $\mu$, the driving time between Lafayette and Indianapolis. We select an SRS of $n = 25$ drivers. The observed sample mean is $\bar{x}_1 = 1.10$ hours (the first sample taken, the sample mean calculated). Let's assume that we know the standard deviation of $X$ is $\sigma = 0.5$ hours. Let $\bar{X}$ denote the sample mean. The standard deviation of $\bar{X}$ is $\sigma_{\bar{X}} = 0.5/\sqrt{25} = 0.1$. Then $\bar{X} \sim N(\mu, 0.1)$.

Remember, our first observed sample mean is $\bar{x}_1 = 1.10$ hours. An approximate 95% confidence interval for $\mu$ can be constructed as follows:

- The lower endpoint is $\bar{x}_1 - 1.96\,\sigma/\sqrt{n} = 1.10 - 1.96(0.1) = 0.904$.
- The upper endpoint is $\bar{x}_1 + 1.96\,\sigma/\sqrt{n} = 1.10 + 1.96(0.1) = 1.296$.

So our approximate 95% confidence interval for $\mu$
is (0.904, 1.296).

Note: We are not guaranteed that the true value of $\mu$ is in the above interval. Suppose we conducted another trial of our random phenomenon and obtained $\bar{x}_2 = 1.00$ hours (the second sample taken, the sample mean calculated). If we calculate an approximate 95% confidence interval for $\mu$ based on $\bar{x}_2$, we get (0.804, 1.196), which is a different interval estimate. If we repeatedly selected SRSs of 25 drivers and, for each SRS, calculated the approximate 95% confidence interval for $\mu$, the population mean, then in the long run 95% of the intervals would contain the true value of $\mu$.

**Formal definitions**

**Confidence interval.** A level $C$ confidence interval for a parameter is an interval computed from sample data by a method that has probability $C$ of producing an interval containing the true value of the parameter.

**Confidence interval for a population mean.** Choose an SRS of size $n$ from a population having unknown mean $\mu$ and known standard deviation $\sigma$. A level $C$ confidence interval for $\mu$ is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}.$$

Here $z^*$ is the value on the standard normal curve with area $C$ between $-z^*$ and $z^*$. This interval is exact when the population distribution is normal and approximately correct for large $n$ in other cases.

The margin of error is $m = z^* \sigma/\sqrt{n}$, so the confidence interval for a population mean can also be written as $\bar{x} \pm$ margin of error.

**Example 1.** Suppose $X$, Bob's golf scores, has a normal distribution with unknown mean $\mu$ and standard deviation $\sigma = 3$. An SRS of $n = 16$ units is selected, and a sample mean of $\bar{x} = 77$ is observed.

a. Calculate a 90% confidence interval for $\mu$.
b. Calculate a 95% confidence interval for $\mu$.
c. Calculate a 99% confidence interval for $\mu$.

There is a tradeoff between the precision with which we estimate the unknown parameter and the confidence we have in the result. Higher confidence (a larger level $C$) requires a wider interval. The margin of error also changes with the sample size: a larger sample size will result in a smaller margin of error.

Choosing the sample size: sample
size for desired margin of error. The confidence interval for a population mean will have a specified margin of error $m$ when the sample size is

$$n = \left(\frac{z^* \sigma}{m}\right)^2.$$

**Example.** You are planning a survey of starting salaries for recent liberal arts graduates from your college. From a pilot study, you estimate that the standard deviation is about \$8000. What sample size do you need to have a margin of error equal to \$500 with 95% confidence?

**Some cautions**

- The above interval estimation method applies only to SRSs. Slightly different methods are required when the data are obtained from more complicated surveys and experiments.
- The above formulas do not correct the data for any unknown bias. Consequently, if the data are biased, then any inferences based on those data are also biased. This includes biases arising from nonresponse, undercoverage, and response error, or hidden bias in experiments.
- Because the sample mean is not resistant, confidence intervals are not resistant to outliers.
- If the sample size is small (e.g., $n < 15$) and the sampling distribution of $\bar{X}$ is not well approximated by the normal density curve, then the above formulas should not be used.
- Typically we do not know the population standard deviation $\sigma$.

#### Section 6.2

A confidence interval is one of the two common types of statistical inference; it is used to estimate a population parameter from a sample statistic with a certain confidence. The second common type is a significance test, which assesses the evidence provided by gathered data in favor of some claim about the population (a hypothesis about the population).

**Tests of significance**

Often we have specific questions regarding a particular value of a population parameter.

**Example 1.** Bob's golf scores are historically normally distributed with $\mu = 77$ strokes and $\sigma = 3$ strokes. Bob has recently made two improvements to his game: he has taken a lesson from a local professional, and he has read a book on the mental approach to putting. With these
improvements, Bob thinks his game is better. Let's examine the truth of Bob's thinking using a test of significance. Bob has played several rounds after these improvements. His scores after the improvements were 77, 73, 74, 78, 78, 75, 75, 74, 71. We now ask if Bob's average score $\mu = 77$ is still a reasonable value, or whether he has improved so that his average would be less. We ask:

- Question 1: Is $\mu = 77$?
- Question 2: Is $\mu < 77$?

**Example 2.** Bob has a driver's license that gives his weight as 190 pounds. Bob did not change this information the last two times he renewed his license, but has been eating too well in the interim. Recently Bob decided to go on a diet and feels he has changed his eating habits. Let's test the truth of the weight on Bob's license using a test of significance. We know that, over time, Bob's weight is approximately normally distributed with a standard deviation of 3 pounds. Bob has weighed himself during the last month, and his weights are 193, 194, 192, 191. We wonder if the weight on Bob's license is still correct.

- Question 1: Is $\mu = 190$?
- Question 2: Is $\mu \neq 190$?

If we look at the two examples, we see that Questions 1 and 2 are similar in form. In the first question, a particular value of the parameter is specified. We call this statement about the parameter the null hypothesis, or $H_0$. In the second question, an alternative set of possible parameter values is specified. This second statement about the parameter is called the alternative hypothesis, or $H_a$.

Note: The parameter value specified in the null hypothesis usually represents an established standard or generally accepted norm. The suspected change in the parameter value is summarized by the alternative hypothesis.

Now let's express our above examples in terms of the null and alternative hypotheses.

- Example 1: $H_0: \mu = 77$, $H_a: \mu < 77$. Note: This is a one-sided significance test.
- Example 2: $H_0: \mu = 190$, $H_a: \mu \neq 190$. Note: This is a two-sided significance test.

The goal of a significance test is the following. Based
on a random sample from the population, we want to determine if a change has resulted in a shift in the mean. Because the null hypothesis represents the established or accepted mean value, and we believe a change has occurred as specified in the alternative hypothesis, we want to use the data to determine statistically whether we can reject the null hypothesis in favor of the alternative.

Let's look at the first example using the statistical techniques we already know. Let $X$ = Bob's golf score. We use the scores Bob has recorded after his lesson and book reading as a sample of $X$. We know that Bob's golf scores are normally distributed; hence the sampling distribution of the sample mean $\bar{X}$ is also normal. If his scores were not normally distributed, we would need to assume that our sample size is large enough that the central limit theorem applies and the sampling distribution of $\bar{X}$ can be approximated by the normal density curve. The standard deviation of $\bar{X}$ is $\sigma_{\bar{X}} = 3/\sqrt{9} = 1$ stroke.

Now let us consider the mean of the sampling distribution in the context of our two hypotheses.

- If the null hypothesis is true, then the mean of the sampling distribution is $\mu_{\bar{X}} = 77$. That is, if $H_0$ is true, then $\bar{X} \sim N(77, 1)$.
- If the null hypothesis is false and $H_a$ is true, then the mean is some value less than 77 strokes. That is, if $H_0$ is false and $H_a$ is true, then $\bar{X} \sim N(\mu, 1)$ for some value $\mu < 77$.

With this in mind:

- Values of $\bar{x}$ close to 77 would tend to support $H_0$, and values that are less than 77 would provide evidence against $H_0$.
- With the sample of Bob's last 9 scores, we get a sample mean of $\bar{x} = 75$. Can we conclude that we should reject $H_0$ in favor of $H_a$?
- To figure this out, we calculate a p-value: $P(\bar{X} < 75) = P\left(Z < \frac{75 - 77}{1}\right) = P(Z < -2.00) = 0.0228$.
- Note that to calculate this probability we assumed $H_0$ was true, so that $\bar{X} \sim N(77, 1)$.
- Because the probability of obtaining a sample mean $\bar{x} < 75$ is small, we would reject $H_0$.
- Let's say Bob wanted quicker feedback on his golf improvements and wanted it after his first 5 scores. In this case
the sample size is smaller, and as a result we get a sample mean of $\bar{x} = 76$ and $\sigma_{\bar{X}} = 3/\sqrt{5} = 1.342$. Then $P(\bar{X} < 76) = P\left(Z < \frac{76 - 77}{1.342}\right) = 0.2266$, which is not very small.

- This suggests that it is reasonable to see $\bar{x} = 76$ as an observed outcome from $N(77, 1.342)$. So we would fail to reject $H_0$.
- As you can see, the sample size often affects whether we reject $H_0$. That is why we never accept $H_0$, but rather fail to reject $H_0$. This failure to reject $H_0$ may be because we did not select a large enough sample.

**Tests of significance (hypothesis tests): formal steps**

**Step 1: State the null and alternative hypotheses.**

Null hypothesis $H_0$: The statement being tested in a statistical test is called the null hypothesis. The test is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of no effect or no difference: $H_0: \mu = \mu_0$.

Alternative hypothesis $H_a$: The claim about the population that we are trying to find evidence for. Choose one of the three potential hypotheses:

- One-sided tests: $H_a: \mu > \mu_0$ or $H_a: \mu < \mu_0$
- Two-sided test: $H_a: \mu \neq \mu_0$

**Step 2: Find the test statistic.** If $\mu_0$ is the value of the population mean $\mu$ specified by the null hypothesis, the one-sample $z$ statistic is

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$$

**Step 3: Calculate the p-value.** For one-sided tests, the p-value is $P(Z \le z)$ or $P(Z \ge z)$, matching the direction of $H_a$. For two-sided tests, the p-value is $2P(Z \ge |z|)$.

**Step 4: State conclusions in terms of the problem.** One way to do this is to choose a significance level that defines how much evidence we desire; we choose an $\alpha$. Then compare the p-value to the $\alpha$ level:

- If p-value $< \alpha$, then reject $H_0$.
- If p-value $> \alpha$, then fail to reject $H_0$.

Your conclusion should be in this form: "At $\alpha$ = (level), we reject / fail to reject $H_0$. There is evidence / no evidence that (restate the null hypothesis in language a layperson can understand)."

Even though $H_a$ is what we hope or believe to be true, our test gives evidence for or against $H_0$ only. We never prove $H_0$ true; we can only state whether we have enough evidence to reject $H_0$, which is evidence in favor of $H_a$ but not proof that $H_a$ is
true, or that we don't have enough evidence to reject $H_0$.

**Example 1.** A shipment of machined parts has a critical dimension that is normally distributed with mean 12 centimeters and standard deviation 0.1 centimeters. The acceptance sampling team believes that the measurements are less than 12 centimeters. Consequently, they take a random sample of 25 of these parts and obtain a mean of 11.99. Is the acceptance sampling team correct in their assertion? Use an $\alpha$ level of 0.01.

**Confidence intervals and two-sided tests**

A level $\alpha$ two-sided significance test rejects a hypothesis $H_0: \mu = \mu_0$ exactly when the value $\mu_0$ falls outside a level $1 - \alpha$ confidence interval for $\mu$.

**Example 1.** Bob has a driver's license that gives his weight as 190 pounds. Bob did not change this information the last two times he renewed his license, but has been eating too well in the interim. Recently Bob decided to go on a diet and feels he has changed his eating habits. Let's test the truth of the weight on Bob's license using a test of significance. We know that, over time, Bob's weight is approximately normally distributed with a standard deviation of 3 pounds. Bob has weighed himself during the last month, and his weights are 193, 194, 192, 191.

#### Section 6.3: Use and abuse of tests

**Choosing a level of significance.** If we want to make a decision based on our test, we choose a level of significance in advance. We do not have to do this, however, if we are only interested in describing the strength of our evidence. If we do choose a level of significance in advance, we must choose $\alpha$ by asking how much evidence is required to reject $H_0$. The choice of $\alpha$ depends on the type of study we are doing.

**Practical significance.** Statistically significant does not mean practically significant. If we use a large sample, we may get a statistically significant result that is not practically significant.

**Don't ignore lack of significance.** Often only research that shows
statistically significant results is published. This is a problem because other researchers may repeat the experiment thinking that there is a difference when there may not be. Sometimes a lack of significance may be due to a small sample size. In this case the power of the test is low.

**Some cautions about statistical tests**

- As with CIs, badly designed surveys or experiments often produce invalid results. Formal statistical inference cannot correct basic flaws in data collection.
- As with CIs, tests of significance are based on laws of probability. Random sampling or random assignment ensures that these laws apply.
- Statistical significance is not the same thing as practical significance.
- There is no sharp border between significant and nonsignificant, only increasingly strong evidence as the p-value gets smaller.
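The confidence-interval, sample-size, and one-sample z procedures in these notes can be checked numerically. The following is a minimal sketch, not part of the original lecture: the helper names (`z_star`, `z_interval`, `sample_size`, `z_test`) are hypothetical, and it assumes Python 3.8 or later so that `statistics.NormalDist` can stand in for Table A. It also assumes, as the notes do, that the population standard deviation $\sigma$ is known.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal curve: mean 0, standard deviation 1


def z_star(level):
    """Critical value z* with central area `level` between -z* and z*."""
    return STD_NORMAL.inv_cdf(0.5 + level / 2)


def z_interval(xbar, sigma, n, level=0.95):
    """Level-C confidence interval xbar +/- z* sigma / sqrt(n), sigma known."""
    margin = z_star(level) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def sample_size(sigma, margin, level=0.95):
    """Smallest n with margin of error at most `margin`: n = (z* sigma / m)^2, rounded up."""
    return ceil((z_star(level) * sigma / margin) ** 2)


def z_test(xbar, mu0, sigma, n, alternative="less"):
    """One-sample z statistic and p-value; alternative is 'less', 'greater', or 'two-sided'."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "less":
        p = STD_NORMAL.cdf(z)
    elif alternative == "greater":
        p = 1 - STD_NORMAL.cdf(z)
    else:
        p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p


# Driving-time example: xbar = 1.10, sigma = 0.5, n = 25
low, high = z_interval(1.10, 0.5, 25)
print(round(low, 3), round(high, 3))  # 0.904 1.296

# Salary survey: sigma about 8000, desired margin 500, 95% confidence
print(sample_size(8000, 500))  # 984

# Bob's golf scores: H0 mu = 77 against Ha mu < 77, with xbar = 75, n = 9
z, p = z_test(75, 77, 3, 9, alternative="less")
print(round(z, 2), round(p, 4))  # -2.0 0.0228
```

The same helpers can be applied to the other exercises, for example `z_test(11.99, 12, 0.1, 25, alternative="less")` for the machined-parts question or `z_interval(192.5, 3, 4)` for Bob's weight. Rounding the sample size up rather than to the nearest integer is a design choice that keeps the margin of error at or below the target.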
