# Class Note for STAT 528 at OSU 07

This 19-page set of class notes was uploaded by an elite notetaker on Friday, February 6, 2015.

Date Created: 02/06/15

Statistics 528: Data Analysis, Lecture 7 (July 13, 2006). Christopher Holloman, The Ohio State University, Summer 2006.

## Overview of Today's Lecture

- IPS Sections 5.2 and 6.1
- Sampling Distribution of a Sample Mean
- Statistical Confidence

## The Sampling Distribution of the Sample Mean

- Imagine that we have an SRS of size $n$ from a population and measure a variable $X$ on each individual in the sample.
- Each $X_i$ is a measurement from the population and therefore has the distribution of the population: $\mu_{X_i} = \mu$ and $\sigma_{X_i} = \sigma$.
- The sample mean of an SRS of size $n$ is

$$\bar{x} = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right)$$

**Question:** What are the mean and standard deviation of $\bar{x}$?

The mean of the distribution of the sample mean:

$$\mu_{\bar{x}} = \frac{1}{n}\left(\mu_{X_1} + \mu_{X_2} + \cdots + \mu_{X_n}\right) = \frac{1}{n}\left(\mu + \mu + \cdots + \mu\right) = \mu$$

Therefore $\bar{x}$ is an unbiased estimate of $\mu$.

The standard deviation of the distribution of the sample mean: the observations are independent, so we can use the addition rule for variances.

$$\sigma^2_{\bar{x}} = \left(\frac{1}{n}\right)^2\left(\sigma^2_{X_1} + \sigma^2_{X_2} + \cdots + \sigma^2_{X_n}\right) = \frac{1}{n^2}\left(\sigma^2 + \sigma^2 + \cdots + \sigma^2\right) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$

so

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

**Example:** The height $X$ of a single randomly chosen young woman varies according to the $N(64.5, 2.5)$ distribution. Suppose we randomly sample 100 young women. What are the mean and variance of the distribution of $\bar{x}$?

- We know the mean and variance, but that doesn't tell us everything we need to know about the distribution of the sample mean.
- First, let's examine one special case: the normal distribution.

**Sampling distribution of the sample mean:** If a population is distributed $N(\mu, \sigma)$, then the sample mean $\bar{x}$ of $n$ independent observations has the $N(\mu, \sigma/\sqrt{n})$ distribution.

The sampling distribution of $\bar{x}$ depends on the sample size $n$: the distribution is more spread out (larger variance) the smaller the sample size.
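As a rough numerical check of the mean and standard deviation formulas above, here is a minimal simulation sketch for the height example. It assumes that $N(64.5, 2.5)$ follows the IPS convention of listing the mean and the standard deviation. The function names, the seed, and the number of repetitions are illustrative choices and are not part of the lecture.

```python
import math
import random
import statistics


def theoretical_sampling_dist(mu, sigma, n):
    """Return (mean, standard deviation) of the sample mean of an SRS of size n."""
    return mu, sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, reps=5000, seed=528):
    """Draw `reps` samples of size n from N(mu, sigma); return their sample means."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is reproducible
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


# Height example: population assumed N(64.5, 2.5), sample size n = 100.
mean_xbar, sd_xbar = theoretical_sampling_dist(64.5, 2.5, 100)
means = simulate_sample_means(64.5, 2.5, 100)

print("theory:     mean =", mean_xbar, "sd =", sd_xbar, "variance =", sd_xbar ** 2)
print("simulation: mean =", round(statistics.fmean(means), 3),
      "sd =", round(statistics.stdev(means), 3))
```

The simulated mean and standard deviation of the sample means should land close to $\mu$ and $\sigma/\sqrt{n}$; the agreement improves as the number of repetitions grows.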

*Figure: sampling distributions of $\bar{x}$ for 10 observations and for 1 observation.*

- More generally, any linear combination of independent normal random variables is also normally distributed.
- The sampling distribution of the sample mean is a special case of this: $\bar{x}$ is a linear combination of $n$ independent random variables.

**Question:** What if we have an SRS from a population that is not normally distributed?

**Answer:** The Central Limit Theorem (CLT). Draw an SRS of size $n$ from any population with finite mean $\mu$ and finite standard deviation $\sigma$. When $n$ is large, the sampling distribution of the sample mean $\bar{x}$ is approximately normal:

$$\bar{x} \text{ is approximately } N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$

## Central Limit Theorem in Action

$X \sim \text{Exp}(1)$, with sample sizes of 1 (figure a), 2 (figure b), 10 (figure c), and 25 (figure d).

*Figure: distributions of sample means, panels (a) through (d).*

This even works for discrete random variables.

**Question:** How large must the sample size be for the Central Limit Theorem to apply?

**Answer:** It depends on the shape of the distribution we are sampling from. More observations are required if the distribution of $X$ is far from normal.

**Rule of thumb:** The CLT is usually applicable for $n > 30$.

- Newly manufactured automobile radiators may have 0, 1, 2, or more leaks in them. The number of leaks in radiators made by one supplier has mean 0.15 and standard deviation 0.4. What type of distribution is this?
- Suppose the supplier ships 400 radiators to an assembly plant. What is the distribution of the mean number of leaks in this shipment?
- Over many years, many of these shipments are made. What range of values will contain the middle 95% of the mean number of leaks?

## Statistical Inference

**Idea:** Estimate parameters of the population distribution using data.

**How:** Use the sampling distribution of sample statistics and methods based on what would
happen if we used this inference procedure many times:

1. Confidence Intervals
2. Hypothesis Tests

**Note:** Be sure that you understand the meaning of these procedures in addition to being able to use them.

## Confidence Intervals

**Idea:** We use a sample statistic to estimate a population parameter (e.g., use $\bar{x}$ to estimate $\mu$). A confidence interval tells us how confident we are in our estimate.

A confidence interval will have the form

$$\text{estimate} \pm \text{margin of error}$$

At a given confidence level, a smaller margin of error means a more precise estimate.

**Example:** Assume that the sampling distribution of $\bar{x}$ is $N(\mu, 4.5)$. Since 9 is two standard deviations ($2 \times 4.5$), $\bar{x}$ lies within 9 of $\mu$ in 95% of all samples, so $\mu$ also lies within 9 of $\bar{x}$ in those samples.

*Figure: density curve of $\bar{x}$, with probability 0.95 between $\mu - 9$ and $\mu + 9$; $\mu$ is unknown.*

In 95% of samples,

$$\bar{x} - 9 < \mu < \bar{x} + 9$$

We say that $(\bar{x} - 9,\ \bar{x} + 9)$ is a 95% confidence interval for $\mu$.

## Requirements of a Confidence Interval for an Unknown Parameter

1. An interval of the form $(a, b)$, where $a$ and $b$ are numbers computed from the data.
2. A confidence level that gives the probability that an interval computed this way covers the parameter.

Usually, confidence levels are 90% or 95%.

## Definition of a Confidence Interval

A level $C$ confidence interval for a parameter is an interval computed from sample data by a method that has probability $C$ of producing an interval containing the true value of the parameter.

**Note:** The following statement is INCORRECT: "The probability that the unknown parameter is contained within a level $C$ confidence interval is $C$." Why is this wrong?

*Figure: density curve of $\bar{x}$.*

## Confidence Intervals for the Population Mean

Recall: $\bar{x}$ is approximately $N(\mu, \sigma/\sqrt{n})$ by the Central Limit Theorem.

To construct a level $C$ confidence interval for $\mu$, assuming we know $\sigma$:

Let $z^*$ be the point such that
the area under the $N(0, 1)$ curve between $-z^*$ and $z^*$ is $C$.

Notice that any normal curve has probability $C$ between the points $z^*$ standard deviations below the mean and $z^*$ standard deviations above the mean. Why?

So there is probability $C$ that $\bar{x}$ lies between

$$\mu - z^* \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \mu + z^* \frac{\sigma}{\sqrt{n}}$$

This is the same as saying that, a proportion $C$ of the time in repeated sampling from the population with mean $\mu$ and standard deviation $\sigma$, $\mu$ will lie between

$$\bar{x} - z^* \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \bar{x} + z^* \frac{\sigma}{\sqrt{n}}$$

This is our level $C$ confidence interval for $\mu$; i.e., our estimate of $\mu$ is $\bar{x}$ and our margin of error is $z^* \sigma / \sqrt{n}$.

The most commonly used confidence levels are:

| Confidence level $C$ | 90% | 95% | 99% |
|---|---|---|---|
| $z^*$ | 1.645 | 1.960 | 2.576 |

$z^*$ for other confidence levels can be found similarly from the Normal Table (Table A), from the bottom row of Table D (t distribution critical values), or using Minitab.

**Example:** Scores on a test of quantitative skills range from 0 to 500. A simple random sample of 840 men aged 21 to 25 took the exam. Their average score was $\bar{x} = 272$. Suppose we know that the population standard deviation for this test, $\sigma$, is equal to 60. What can we say about the population mean score $\mu$ of all 9.5 million men in this age group?

- (a) Find a 90% confidence interval for the mean test score.
- (b) Find a 99% confidence interval for the mean test score.
- (c) Find an 80% confidence interval for the mean test score.

## Meaning of Confidence

**Note:** We don't know if any of the above confidence intervals contain $\mu$ or not. Then what do we mean by confidence?

**The meaning of confidence:** When we say "95% confident," we mean that if you use 95% confidence intervals often, in the long run 95% of your intervals will contain the
true value of $\mu$.

Our confidence is in the process, not in any one specific interval.

Remember that probability (chance) is associated only with a random phenomenon. After you have constructed a confidence interval from a random sample, there is no randomness left in it. Hence it doesn't make sense to attach any probability statement to a specific numerical confidence interval.

## Behavior of Confidence Intervals

- **Question:** What happens to the margin of error when the sample size increases? Does it increase, decrease, or stay the same?
- **Question:** How does changing the sample size affect the size of the resulting confidence interval?
- **Question:** What happens to the size of the confidence interval as we decrease the confidence level $C$? (Hint: what happens to the value of $z^*$?)

**Note the tradeoff:** We would like to have a smaller margin of error (narrower interval) as well as high confidence, but the interval gets wider as our confidence gets higher.

- **Question:** How does the size of $\sigma$ affect the margin of error?

Thus we have 3 ways of reducing the width of the confidence interval:

1. Increase the sample size $n$.
2. Decrease the confidence level $C$.
3. Decrease the standard deviation $\sigma$.

## Choosing the Sample Size

- We saw that we can have a high degree of confidence as well as a small margin of error by using a large sample size.
- Usually researchers will have a desired confidence level and margin of error they want to attain.
- So one aspect of designing any study is to decide the number of observations needed.

Let $m$ represent the desired margin of error.
Recall the formula for the margin of error:

$$m = z^* \frac{\sigma}{\sqrt{n}}$$

Solving for $n$, we get

$$n = \left(\frac{z^* \sigma}{m}\right)^2$$

Always round your answer up.

**Example:** Suppose the GSA at Ohio State wants to estimate the mean monthly income of OSU graduate students to within 100 with 95% confidence. How many students should the GSA sample? Assume that the standard deviation of incomes of OSU graduate students is 421.
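The margin-of-error and sample-size formulas can be sketched in a few lines of Python. This is a minimal illustration that assumes $\sigma$ is known and uses the critical values $z^*$ from the lecture's table of common confidence levels. The function names and the `Z_STAR` dictionary are hypothetical helpers introduced here, not something from the course materials.

```python
import math

# Critical values z* for the common confidence levels listed in the lecture.
Z_STAR = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def margin_of_error(z_star, sigma, n):
    """Margin of error m = z* * sigma / sqrt(n), with sigma assumed known."""
    return z_star * sigma / math.sqrt(n)


def confidence_interval(xbar, z_star, sigma, n):
    """Return (lower, upper) = xbar -/+ margin of error."""
    m = margin_of_error(z_star, sigma, n)
    return xbar - m, xbar + m


def required_sample_size(z_star, sigma, m):
    """Smallest n whose margin of error is at most m (always rounds up)."""
    return math.ceil((z_star * sigma / m) ** 2)


# GSA example: sigma = 421, desired margin of error 100, 95% confidence.
n_needed = required_sample_size(Z_STAR[0.95], 421, 100)
print("students to sample:", n_needed)
print("margin of error at that n:", margin_of_error(Z_STAR[0.95], 421, n_needed))

# Test-score example: n = 840, xbar = 272, sigma = 60, 90% confidence.
ci_90 = confidence_interval(272, Z_STAR[0.90], 60, 840)
print("90% confidence interval:", ci_90)
```

Under these assumptions the GSA example gives $n = 69$: the unrounded value is just above 68, and rounding up is what keeps the margin of error at or below the target.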

