Introduction to Statistics I
Introduction to Statistics I STA 2122
This 25 page Class Notes was uploaded by Jonatan Kunde on Monday October 12, 2015. The Class Notes belongs to STA 2122 at Florida International University taught by Dane McGuckian in Fall. Since its upload, it has received 21 views. For similar materials see /class/221817/sta-2122-florida-international-university in Statistics at Florida International University.
Date Created: 10/12/15
Created by Professor Dane McGuckian

Identifying the Target Parameter

Recall: Inferential statistics are used to make predictions and decisions about a population based on information from a sample. The two major applications of inferential statistics involve the use of sample data to (1) estimate the value of a population parameter and (2) test some claim or hypothesis about a population.

In this chapter we introduce methods for estimating the values of some important population parameters. We also present methods for determining the sample sizes necessary to estimate those parameters.

The unknown population parameter that we are interested in estimating is called the target parameter. The key words below help identify the target parameter.

Parameter | Key Words or Phrases | Type of Data
$\mu$ | Mean; Average | Quantitative
$p$ | Proportion; Percentage; Fraction; Rate | Qualitative

Estimating a Population Mean Using a Confidence Interval

Recall: A point estimator of a population parameter is a rule or formula that tells us how to use the sample data to calculate a single number that can be used to estimate the population parameter.

For all populations, the sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$, meaning that the distribution of sample means tends to center about the value of the population mean $\mu$. For many populations, the distribution of sample means $\bar{x}$ tends to be more consistent (with less variation) than the distributions of other sample statistics.

We have used point estimators before to estimate target parameters; however, we cannot assign any level of certainty to those point estimators. To remove this drawback, we can use what is called an interval estimator.

An interval estimator (or Confidence Interval) is a formula that tells us how to use sample data to calculate an interval that estimates a population parameter.

The confidence coefficient is the relative frequency with which the interval estimator encloses the population parameter when the
estimator is used repeatedly a very large number of times.

The diagram below shows the coverage of 8 confidence intervals (CIs). The vertical line shows the location of the parameter $\mu$; all the intervals capture the parameter except CI 2. If the confidence level was 95% for each of these intervals, we would expect only 5% of the intervals to fail to capture the parameter, as CI 2 has done.

[Figure: eight confidence intervals, CI 1 through CI 8, drawn against a vertical line at $\mu$]

The most common choices for the confidence level are 90%, 95%, or 99% ($\alpha$ = 10%, $\alpha$ = 5%, $\alpha$ = 1%).

A little notation: The value $z_\alpha$ is defined as the value of the standard normal random variable $Z$ such that the area to its right is $\alpha$.

[Figure: standard normal curve with $\alpha$ = area in the right tail beyond $z_\alpha$]

Our goal for this section is to be able to estimate the true value of the population parameter $\mu$ (mu = mean).

What we will need is sample data. This should include the sample mean, the sample size, and the sample or population standard deviation. We will also need a confidence level and a z-chart.

The logic of a confidence interval can be understood by considering the following ideas. First, recall that under the empirical rule approximately 95% of the data will fall within 2 standard deviations of the mean. Also recall that the CLT tells us that for samples of size $n$ drawn from the population, approximately $\bar{X} \sim N(\mu, \sigma^2/n)$.

Note: $\bar{X} \sim N(\mu, \sigma^2/n)$ means the sample mean is normally distributed with mean $\mu$ and variance $\sigma^2/n$.

It should make sense that the interval from $\mu - 2\frac{\sigma}{\sqrt{n}}$ to $\mu + 2\frac{\sigma}{\sqrt{n}}$ would capture about 95% of all the sample means possible. Now consider the drawing below.

The z score separating the right tail is commonly denoted by $z_{\alpha/2}$ and is referred to as a critical value because it is on the borderline separating sample mean values that are likely to occur from those that are unlikely to occur. A sample mean has a small chance (with probability denoted by $\alpha$) of falling in one of the red tails of the figure. Denoting the area of each shaded tail by $\alpha/2$, we see that there is a total probability of $\alpha$ that a sample mean will fall in either of the
two red tails. By the rule of complements (from probability), there is a probability of $1 - \alpha$ that a sample mean will fall within the inner region of the figure below.

[Figure: standard normal curve with central area $1 - \alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$]

The critical value $z_{\alpha/2}$ is the positive z value that is at the vertical boundary separating an area of $\alpha/2$ in the right tail of the standard normal distribution. The value of $-z_{\alpha/2}$ is at the vertical boundary for the area of $\alpha/2$ in the left tail. The subscript $\alpha/2$ is simply a reminder that the z score separates an area of $\alpha/2$ in the right tail of the standard normal distribution.

We can see from the drawing that

$$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$

Now substitute $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ to get

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$$

Next we may solve the compound inequality for $\mu$:

$$P\left(-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Multiply all three sides of the inequality by negative one:

$$P\left(z_{\alpha/2}\frac{\sigma}{\sqrt{n}} > \mu - \bar{X} > -z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Add $\bar{X}$ to all three sides and write the inequality in proper order:

$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Now we must drop the probability notation, because $\mu$ is not a random variable but is instead an unknown constant. It is either in the interval or it isn't; there can't be any notion of probability here if the value of $\mu$ does not vary randomly. Finally, we can say that we are $(1 - \alpha)100\%$ confident that

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

The $(1 - \alpha)100\%$ Confidence Interval for the Mean:

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

The part of the interval above given by $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is called the margin of error. The margin of error is the maximum likely difference observed between the sample mean $\bar{x}$ and the population mean $\mu$, and is denoted by $E$.

The formula above has only one quantity which we will not be given directly: $z_{\alpha/2}$.

Here are the steps to finding $z_{\alpha/2}$ when table A3 can't be used:
1. Identify the C-level.
2. Find the C-level / 2.
3. Go to the Z-table; in the body of the table, look up the number found in step 2.
4. Read $z_{\alpha/2}$ from the bold numbers on the side and top of the table.

Example: Find $z_{\alpha/2}$ for a 95% confidence interval.

The confidence level is 0.95. Dividing that in two gives us
0.4750. Looking up 0.4750 in our z-table gives us our answer: $z_{\alpha/2} = 1.96$.

[Figure: standard normal curve for confidence level 95%, with area $\alpha/2 = 0.025$ in each tail, beyond $-z_{\alpha/2} = -1.96$ on the left and $z_{\alpha/2} = 1.96$ on the right, centered at $z = 0$]

Note: Using table A3, provided with the formula card on my web site, is much easier than the above method.

Find $z_{\alpha/2}$ for a 90% CI.
Find $z_{\alpha/2}$ for a 95% CI.
Find $z_{\alpha/2}$ for a 99% CI.

Now that we can find $z_{\alpha/2}$, it will be easy to create our confidence intervals. Before we create a confidence interval to estimate the mean, we should look at the requirements for constructing these intervals:
1. The sample is a simple random sample. (All samples of the same size have an equal chance of being selected.)
2. The value of the population standard deviation $\sigma$ is known.
3. Either or both of these conditions is satisfied: the population is normally distributed or $n > 30$.

Steps to Create a Confidence Interval
1. List all given sample data from the problem, including sample size and C-level.
2. Find $z_{\alpha/2}$.
3. Calculate the margin of error: $E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.
4. Calculate $\bar{x} - E$ and $\bar{x} + E$.

A note about notation: $\bar{x} \pm E$ is often written as $\bar{x} - E < \mu < \bar{x} + E$.

Example: In sociology, a social network is defined as the people you make frequent contact with. The personal network size for each adult in a sample of 2819 adults was calculated. The sample had a mean personal network size of 14.6 with a known population standard deviation of 9.8.
a. Give a point estimate for the mean personal network size of all adults.
b. Form a 95% confidence interval for the mean personal network size of all adults.
c. Give the practical interpretation of the interval created in part b.
d. Give the conditions required for the interval to be valid. (Answer: The sample must be random and $n$ should be large, $n \geq 30$.)

Example: A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the margin of error $E$ and the 95% confidence interval for $\mu$.

Conclusion: There are some relationships that should be
understood:
If confidence goes up, so does the interval width.
If confidence goes down, so does the interval width.
If sample size goes up, interval width goes down.

Sample Size for Estimating the Mean $\mu$

Suppose we want to collect sample data with the objective of estimating some population mean. The question is: how many sample items must be obtained? By solving the margin of error formula for $n$, we arrive at the following sample size formula:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

where
$z_{\alpha/2}$ = critical z score based on the desired confidence level
$E$ = desired margin of error
$\sigma$ = population standard deviation

When finding the sample size $n$, if the formula above does not result in a whole number, always increase the value of $n$ to the next larger whole number.

Example: Nielsen Media Research wants to estimate the mean time that full-time college students spend watching TV each weekday. Find the sample size necessary to estimate that mean with a 15 minute margin of error. Assume that 96% confidence is desired and assume that the population standard deviation is 112.2 minutes.

Estimating a Population Mean: $\sigma$ Not Known

This section presents methods for finding a confidence interval estimate of a population mean when the population standard deviation is not known. With $\sigma$ unknown, we will use the Student t distribution, assuming that certain requirements are satisfied.

Recall: If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu, \sigma^2/n)$ for any sample size $n$, no matter how small. However, if $X$ is not normal, we need a sufficiently large sample size to assume normality.

When the population standard deviation $\sigma$ is unknown, we use the sample standard deviation $s$ as a substitute, but for small sample sizes $s$ may not be a very good substitute.

Our goal for this section is to be able to estimate the true value of the population parameter $\mu$ (mu = mean) when
1. $\sigma$ is unknown
2. The population is normally distributed or $n \geq 30$

As before, we will need certain information from the sample to form our confidence interval.

Note: We no
longer are using a z value for our confidence interval; instead we will need a value from a related distribution, the Student's t distribution.

The t distribution has a shape like that of the standard normal distribution (Z distribution), but it is a little heavier in the tails and consequently a little lower at its center. Actually, the specific shape of the t distribution is determined by its degrees of freedom, $df = n - 1$.

Important Properties of the Student t Distribution
1. The Student t distribution is different for different sample sizes (see the figures below).
2. The Student t distribution has the same general symmetric bell shape as the standard normal distribution, but it reflects the greater variability (with wider distributions) that is expected with small samples.
3. The Student t distribution has a mean of $t = 0$ (just as the standard normal distribution has a mean of $z = 0$).
4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has $\sigma = 1$).
5. As the sample size $n$ gets larger, the Student t distribution gets closer to the normal distribution.

[Figure: the standard normal distribution compared with Student t distributions for $n = 3$ and $n = 12$]

[Figure: t distribution and normal z distribution on one axis, marking $z_{0.025} = 1.96$ and $t_{0.025} = 2.776$]

Skip this if you do not want to know how Z and t are related: By taking a standard normal random variable $Z$ and dividing it by the square root of an independent chi-squared random variable $V$ which is divided by its degrees of freedom $\nu$, we get a random variable that has a Student's t distribution:

$$t = \frac{Z}{\sqrt{V/\nu}}$$

To get the needed critical value $t_{\alpha/2}$ we need to:
1. Get the degrees of freedom, $df = n - 1$.
2. Find $\alpha = 1 - \text{CI level}$.
3. Look up the degrees of freedom and $\alpha$ on the t table from the inside cover of our text, or use the table titled A-3 on the web page.

Example: Find $t_{\alpha/2}$ for a 90% confidence interval with a sample size of $n = 31$.

Example: Find $t_{\alpha/2}$ for a 99% confidence interval with a sample size
of $n = 29$.

Example: Find $t_0$ such that $P(t \leq t_0) = 0.025$ when $n = 41$.

Now that we can find $t_{\alpha/2}$, it is time to learn how to create our confidence interval to estimate the mean.

Steps to Create a Confidence Interval
1. List all given sample data from the problem, including sample size and C-level.
2. Find $t_{\alpha/2}$.
3. Calculate the margin of error: $E = t_{\alpha/2}\frac{s}{\sqrt{n}}$.
4. Calculate $\bar{x} - E$ and $\bar{x} + E$.

Example: A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the margin of error $E$ and the 95% confidence interval for $\mu$. (Answer: $98.08 < \mu < 98.32$)

Example: Because cardiac deaths appear to increase after heavy snowfalls, an experiment was designed to determine the cardiac demands of manually shoveling snow. Ten subjects cleared tracts of snow and their maximum heart rates were recorded. Their average maximum heart rate was 175 with a standard deviation of 15. Assuming maximum heart rates are normally distributed, find the 95% confidence interval estimate of the population mean for those people who shovel snow manually.

Example: Flesch ease of reading scores were found for 12 different pages randomly selected from J.K. Rowling's Harry Potter and the Sorcerer's Stone. Find the 95% interval estimate of $\mu$, the mean Flesch ease of reading score. (The distribution of the 12 pages appears to be bell shaped, with $\bar{x} = 80.75$ and $s = 4.68$.)

Choosing Between z and t

Method | Conditions
z distribution | $\sigma$ known and normally distributed, or $\sigma$ known and $n > 30$
t distribution | $\sigma$ not known and normally distributed, or $\sigma$ not known and $n > 30$
nonparametric | Population is not normally distributed and $n \leq 30$

[Flowchart: choosing among the normal distribution, the t distribution, and nonparametric or bootstrapping methods, based on whether the population is normally distributed]

Large-Sample Confidence Interval for a Population Proportion

In many real world scenarios, we would like to estimate a
population proportion. If we look at $n$ randomly selected subjects and $x$ of them have some trait we are interested in, we can form a sample proportion from the data:

$$\hat{p} = \frac{x}{n}$$

where $x$ = the number of subjects having the trait we are interested in.

This proportion is a sample proportion, since it is only based on $n$ subjects from some larger population. We can use this $\hat{p} = \frac{x}{n}$ to estimate the population proportion $p$.

Since for each sample drawn of size $n$ a different amount $x$ of subjects will have the desired trait, the probabilities associated with each possible value of $\frac{x}{n}$ will be equal to the probability associated with each possible value of $x$. $X$ has a binomial distribution; we can approximate this distribution when $n$ is large (as long as $n$ is large enough that $\hat{p} \pm 2\sigma_{\hat{p}}$ will fit inside of $(0, 1)$) using the standard normal (Z) distribution.

Remember that $X \sim \text{Binomial}(\mu = np,\ \sigma^2 = npq)$. $\frac{x}{n}$ will have the following mean and standard deviation:

$$\mu_{\hat{p}} = p \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$

To understand why these values are as they are, look at the following properties of expectation: $E(aX) = aE(X)$ and $Var(aX) = a^2 Var(X)$.

Since $E(X) = np$, $E\left(\frac{X}{n}\right) = \frac{1}{n}\,np = p$.

Since $Var(X) = npq$, $Var\left(\frac{X}{n}\right) = \frac{1}{n^2}\,npq = \frac{pq}{n}$.

As before, we would like to have more than just a good point estimator of the population proportion, so in this section we will learn how to form an interval estimate of the true population proportion. From above, we can recall that the mean and standard deviation of $\frac{x}{n}$ are $p$ and $\sqrt{\frac{pq}{n}}$.

Now, using the assumption that $n$ is large, we can approximate the sampling distribution of $\frac{x}{n}$ by the normal distribution.

As when estimating the mean, we will use the following interval form to estimate the true population proportion:

Point Estimate ± (Number of Standard Deviations) × (Standard Error)

Steps to creating a Confidence Interval for a population proportion
1. Gather sample data: $x$ (or $\hat{p}$), $n$, and C-level.
2. Calculate $\hat{p} = \frac{x}{n}$ and $\hat{q} = 1 - \hat{p}$.
3. Calculate the standard error: $\sigma_{\hat{p}} \approx \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
4. Find $z_{\alpha/2}$.
5. Calculate the Margin of Error: $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$.
6.
Finally, form the interval $(\hat{p} - E,\ \hat{p} + E)$.

Example: A nationwide poll of nearly 1500 people conducted by the syndicated cable television show Dateline: USA found that 70 percent of those surveyed believe there is intelligent life in the universe, perhaps even in our own Milky Way Galaxy. What proportion of the entire population agrees, at the 95% confidence level?

Example: The Ford Motor Company commissioned a study to determine where professional leaders stand on government policy issues. Among the 7000 leaders in business, education, media, government, and environmental advocacy that took part in the survey, 80% felt that industry should be held liable for environmental damage caused by their actions. Construct a 90% confidence interval estimate of the true proportion of industry leaders who believe companies who harm the environment should be held liable.

Example: When Mendel conducted his famous genetics experiments with peas, one sample of offspring consisted of 428 green peas and 152 yellow peas. Mendel expected that 25% of the offspring peas would be yellow.
a. Find a 95% confidence interval estimate for the true proportion of yellow peas.
b. Do the results from part a contradict Mendel's theory?

Note: Watch out for problems where $p$ is close to 0 or 1; $n$ would have to be very large for the sampling distribution of $\hat{p}$ to be approximately normal.

If $p$ is close to 0 or 1, Wilson's adjustment for estimating $p$ yields better results:

$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}} \quad \text{where} \quad \tilde{p} = \frac{x + 2}{n + 4}$$

Example: Suppose in a particular year the percentage of firms declaring bankruptcy that had shown profits the previous year is 0.02. If 100 firms are sampled and one had declared bankruptcy, what is the 95% CI on the proportion of profitable firms that will tank the next year?

Solution:

$$\tilde{p} = \frac{x + 2}{n + 4} = \frac{1 + 2}{100 + 4} = 0.0289$$

$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}} = 0.0289 \pm 1.96\sqrt{\frac{0.0289(1 - 0.0289)}{100 + 4}} = 0.0289 \pm 0.032$$

Inferences Based on a Single Sample: Tests of Hypotheses

In this chapter we will begin to learn how to test a hypothesis. Of course, we will then need a hypothesis to test. We will form our
hypotheses from a claim.

Example: The Federal Aviation Administration claims that the mean weight of an airline passenger (with carry-on baggage) is greater than the 185 lb that it was 20 years ago. Express this claim symbolically.

This claim can be reworked symbolically to form two competing hypotheses.

The first hypothesis we will form is called the Null Hypothesis, which usually expresses the status quo scenario. It is denoted by $H_0$. In a hypothesis test, we start out by assuming $H_0$ is true. We are always testing $H_0$ during the test. $H_0$ must always have an equal sign.

The second hypothesis we will form from the claim is the Alternative Hypothesis (alternative because it is the alternative to the Null hypothesis). It is also often referred to as the research hypothesis. It is denoted by $H_A$. $H_A$ always has one of the symbols <, >, or ≠. The $H_A$ will determine the kind of test we conduct on the null hypothesis (more on that later). Note: some texts use $H_1$ instead of $H_A$ to denote the alternative hypothesis.

Example: Using the claim from the FAA above, create our set of competing hypotheses, $H_0$ and $H_A$.

$H_0: \mu = 185$ and $H_A: \mu > 185$

The way we will test this claim is to collect a sample, assume that the Null Hypothesis is true, and then determine if our sample supplies evidence that indicates that the Null is not true. In other words, suppose that our sample result seems very extreme assuming the Null Hypothesis is true; we would then be left with two possible explanations:
1. The Null is really correct and we just happened to select an unusual sample, or
2. The sample isn't really that unusual; it just seems so because we are assuming the Null Hypothesis is correct.

The whole process relies on the idea that, in order for us to reject the Null Hypothesis, we need to observe a sample that would only show up very rarely just by chance when the Null is true. But how do we know how rare our sample results are?

The only way to know how rare the sample is would be
for us to use a summary statistic to extract the information contained in the sample. After summarizing the information contained in the sample into a statistic, we will need to know the probability distribution for the statistic formed. This statistic will be called the Test Statistic. The Test Statistic will be a z value that measures the distance (as a number of standard errors) between the value of $\bar{x}$ and the mean $\mu$ specified under the Null Hypothesis.

Example: Suppose that after taking a random sample of airline passengers we have the following data: $n = 36$, $\bar{x} = 200$ lbs, and $\sigma = 33.3$ (known from previous studies). Use the CLT, the null hypothesis from above, and the data above to create a test statistic that has a Z distribution.

Since we know the distribution of the normal random variable $Z$, we can determine how unusual our test statistic is under the assumption that the Null hypothesis is correct. In fact, we might decide that if the chance of getting a statistic as extreme or even more extreme is less than some predetermined value, we will conclude that the Null can be rejected. That predetermined probability fixes a cutoff for the test statistic, which will be called the Critical Value. The Critical Value will be the boundary point between the Rejection Region and the rest of the number line. The Rejection Region refers to the values of the test statistic for which we will decide to reject the null hypothesis.

If the test statistic has a high probability when $H_0$ is true, then $H_0$ is not rejected.
If the test statistic has a very low probability when $H_0$ is true, then $H_0$ is rejected.

When testing a hypothesis, we will need to know what kind of extreme test statistic would make us question the validity of our Null hypothesis. For example, we need to know whether we are concerned if the test statistic seems abnormally large, abnormally small, or either too small or too large. Abnormally small or large values in a distribution will fall into the tails of the distribution. The tails in a distribution are the extreme regions we will bound by our critical values.
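As a rough illustration of the test statistic described above, the following Python sketch computes z for the airline passenger data ($n = 36$, $\bar{x} = 200$, $\sigma = 33.3$, null mean 185) and the right-tail area beyond it. The helper name `z_test_statistic` is hypothetical and chosen only for this illustration; the sketch assumes nothing beyond the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_test_statistic(sample_mean, null_mean, sigma, n):
    """Distance between the sample mean and the null mean, in standard errors."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Data from the airline passenger example in the notes.
z = z_test_statistic(sample_mean=200, null_mean=185, sigma=33.3, n=36)

# Chance, if the null is true, of a z value at least this large (right tail).
right_tail_area = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}")
print(f"right-tail area = {right_tail_area:.4f}")
```

A small right-tail area means a sample mean this far above 185 would rarely occur by chance if the null hypothesis were true.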
[Figure: rejection regions by the sign used in $H_A$. With "<", reject $H_0$ in the left tail; with ">", reject $H_0$ in the right tail; with "≠", reject $H_0$ in both tails; otherwise fail to reject $H_0$]

Before we can determine the critical value for a hypothesis test, we need to decide the maximum amount of error we are willing to allow, assuming the Null hypothesis is true. To understand this, consider the hypothesis we have been using in our examples above. What if, in reality, the true weight of airline passengers including carry-on luggage was 185 pounds? Is it impossible for us to have unusual samples even given that the null is true? Of course it is possible to have extreme or unusual samples, but they would be rare occurrences.

Using our z chart, we can select the critical value such that, when the Null hypothesis is true, our test statistic has only a very small chance of being more extreme than the chosen critical value. The significance level is defined as the maximum probability of committing the error of rejecting a true Null hypothesis. It is denoted as $\alpha$. This critical value might remind you of $z_\alpha$. (Recall the area in the tail here is equal to alpha.)

In the diagram below, the researchers were testing a claim that said $\mu > 2400$, and they had sample data that produced a sample mean of $\bar{x} = 2430$, which was not far enough to the right on the number line to put it in the rejection region.

[Figure: sampling distribution with the rejection region in the right tail; the sample mean $\bar{x} = 2430$ falls outside it]

Example: Find the critical value for a test of the above FAA hypothesis ($H_A: \mu > 185$) at the 1% significance level.

If the test statistic we created earlier is greater than the above found critical value, we will reject the Null hypothesis.

Example: Assume that the data have a normal distribution and the number of observations is greater than 50. Using $\alpha = 0.05$ for a left-tailed test, find the critical z value used to test the null hypothesis. (Answer: $-1.645$)

Let us summarize what we have done with the FAA example above.

Claim: The Federal Aviation Administration
claims that the mean weight of an airline passenger (with carry-on baggage) is greater than the 185 lb that it was 20 years ago.

Hypotheses: $H_0: \mu = 185$, $H_A: \mu > 185$
Sample Data: $n = 36$, $\bar{x} = 200$ lbs, and $\sigma = 33.3$ (known from previous studies)
Test Statistic: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = 2.703$
Significance Level: $\alpha = 0.01$
Type of Test: Right-tailed ($H_A: \mu > 185$)
Critical Value: 2.326
Initial Conclusion: Reject the Null Hypothesis

It is important to word our conclusions carefully. We want to make sure that we address the original claim, and we want to make sure that we do not say more than the evidence grants us to say. To learn how to word our conclusions properly, look at the flow chart provided with the formula card on the web site (it has been reproduced below).

Wording of final conclusion

If the original claim contains the condition of equality:
- Reject $H_0$: "There is sufficient evidence to warrant rejection of the claim that ... (original claim)." (This is the only case in which the original claim is rejected.)
- Fail to reject $H_0$: "There is not sufficient evidence to warrant rejection of the claim that ... (original claim)."

If the original claim does not contain equality (and becomes $H_1$):
- Reject $H_0$: "The sample data support the claim that ... (original claim)." (This is the only case in which the original claim is supported.)
- Fail to reject $H_0$: "There is not sufficient sample evidence to support the claim that ... (original claim)."

Finally, let's look at the four possible outcomes for our hypothesis test:

Conclusion | Reality: The Null Is True | Reality: The Null Isn't True
We Fail to Reject the Null | Correct Decision | Type II Error
We Reject the Null | Type I Error | Correct Decision

A Type I error is the mistake of rejecting the null hypothesis when it is true.
A Type II error is the mistake of failing to reject the null hypothesis when it is false.

In hypothesis testing, we want to make sure the worst of the two possible errors is the type I error. The reason for this is that we design the test to control
the probability of the type one error. The table below explains how the significance level is related to the type one error:

For a left-tailed test: at most $\alpha$
For a right-tailed test: at most $\alpha$
For a two-tailed test: exactly equal to $\alpha$

Note: We have notation for the probability of a type two error: $\beta = P(\text{Type II error})$.

To reduce the error rate for a type one error, we lower alpha ($\alpha$), but this will increase the likelihood of committing a type II error.

Consider a criminal trial to understand why. If a country decides to convict any person who ends up in court regardless of the evidence, it will not let any guilty people who end up in court go free, but as a result a lot of innocent people will end up in jail. If you decide to let people off the hook for a crime unless the evidence is overwhelmingly against them, you will end up letting many guilty people go free. This tug of war always exists: we cannot reduce both kinds of errors at once. Guarding against one will produce more of the other, unless we can get more evidence. For us, this would mean taking a larger sample size.

To reduce the error rate for both a type one error and a type two error, we need to increase the sample size.

To summarize, the following steps can be used to conduct a test of hypothesis.

Steps to test a hypothesis
1. Express the original claim symbolically.
2. Identify the Null and Alternative hypothesis.
3. Record the data from the problem.
4. Calculate the test statistic.
5. Determine your rejection region.
6. Find the initial conclusion (reject the null hypothesis with possible Type I error, or do not reject it with possible Type II error).
7. Word your final conclusion.

Example: When 40 people used the Atkins Diet for one year, their mean weight change was −2.1 lbs. Assume that the population standard deviation for all such weight changes is known to be 4.8 pounds. Use a significance level of 0.05 to test the claim that the mean weight change is less than zero. Does the diet seem to be
effective? Does the mean weight change seem substantial enough to justify the diet? What assumptions are necessary for the test we just conducted to be valid?

Example: We have a sample of 106 body temperatures having a mean of 98.20°F. Assume that the sample is a simple random sample and that the population standard deviation $\sigma$ is known to be 0.62°F. Use a 0.05 significance level to test the common belief that the mean body temperature of healthy adults is equal to 98.6°F.

The following assumptions need to be upheld in order for the results of the above tests to be valid:
- The sample was selected randomly.
- $\sigma$ known and normally distributed, or $\sigma$ known and $n > 30$.

P-Value Method

The Observed Significance Level, or P-Value, for a specific statistical test is the probability (assuming the null is true) of observing a value of the test statistic that is at least as extreme as the test statistic computed from your sample data.

Recall that we always test the null hypothesis, so the initial conclusion will always be one of the following:
1. Reject the null hypothesis.
2. Fail to reject the null hypothesis.

In the Traditional method of hypothesis testing, we:
Reject $H_0$ if the test statistic falls within the critical region.
Fail to reject $H_0$ if the test statistic does not fall within the critical region.

In this section we will learn the P-value method:
Reject $H_0$ if the P-value < $\alpha$ (where $\alpha$ is the significance level, such as 0.05).
Fail to reject $H_0$ if the P-value > $\alpha$.

Example: In a study of the effects of prenatal cocaine use on infants, the following sample data on birth weights was obtained: $n = 36$, $\bar{x} = 2800$ g, $\sigma = 645$ g. Using a significance level $\alpha = 0.01$, test the claim that the mean birth weight for children of cocaine users is less than 3103 grams (the mean weight for children who had mothers who did not use cocaine).

Claim: $\mu < 3103$ (mean for children who had mothers who did not use cocaine)
1. Calculate the test statistic.
2. Find the p-value.

To help you with part two of the example question above, you
may bring the page contained in the website's formula card. It is recreated below.

Finding the P-value, by type of test:
- Left-tailed: P-value = area to the left of the test statistic.
- Right-tailed: P-value = area to the right of the test statistic.
- Two-tailed, test statistic to the left of center: P-value = twice the area to the left of the test statistic.
- Two-tailed, test statistic to the right of center: P-value = twice the area to the right of the test statistic.

To use the p-value to test a hypothesis, we simply need to compare it to our stated significance level.

If p < $\alpha$, we reject the null hypothesis.
If p > $\alpha$, we do not reject the null hypothesis.
Note: we should not encounter too many situations where the p-value is equal to the significance level; however, if it happens, it is up to the statistician to decide if the evidence warrants rejection of the null or not.

Example: A researcher claims that the average age of cable TV viewers who buy products from home shopping networks is 51 years old. If you randomly selected 50 cable TV shoppers with a mean age of 52.3, find the p-value to test the researcher's claim. Assume the population standard deviation is known to be 6.2.

Example: In 1980, the average time to complete a four-year degree was 4.9 years. In 2006, a study of 31 randomly selected students had an average completion time for their four-year degree of 5.3 years, with a population standard deviation of 1 year. Use the p-value method to test the claim that the mean time to complete a four-year degree is now more than 4.9 years.

The t-test

If you recall the situation we faced when constructing confidence intervals when the population standard deviation was unknown, it will come as no surprise to you that when testing hypotheses without knowledge of the population standard deviation, we will need to use the t distribution. The conditions are: $\sigma$ not known and the population normally distributed, or
$\sigma$ not known and $n > 30$.

Aside from the change from Z distribution to t distribution, the problems in this section are the same. The only step that changes is step 5 below. This change is minor because we will use table A-3 provided on the web.

Steps to test a hypothesis
1. Express the original claim symbolically.
2. Identify the Null and Alternative hypothesis.
3. Record the data from the problem.
4. Calculate the test statistic.
5. Determine your rejection region.
6. Find the initial conclusion.
7. Word your final conclusion.

Example: The Windsor bottling company received complaints that their 12 oz root beer bottles contained less than 12 ounces. When 24 bottles are randomly selected and measured, the amounts had a mean of 11.4 ounces and a standard deviation of 0.62 ounces. Test the claim that consumers are being cheated. If the company says the sample is too small for the results to be meaningful, is that reasoning valid here?

Example: In previous tests, baseballs were dropped 24 ft onto a concrete surface, and they bounced an average of 92.84 inches. In a test of a sample of 28 new balls, the bounce heights had a mean of 92.67 inches and a standard deviation of 1.79 inches. Use a 5% level of significance to test the claim that the new balls have a mean bounce height different from 92.84.

Testing a Claim about a Proportion

A lot of times the most interesting problems arise when we are looking at survey data. Usually the data generated from surveys is proportional in nature. This means the distribution of the sample proportion will often be Binomial; however, we can approximate the distribution of the sample proportion with the Normal distribution if the sample size is sufficiently large.

Note: Sufficiently large means that the interval $p_0 \pm 3\sqrt{\frac{p_0 q_0}{n}}$ is entirely contained within $(0, 1)$.

The main change in our steps to testing a hypothesis will be our test statistic.

Test Stat for Testing Claims About a Proportion: Sample Proportion − Null
Hypothesized Proportion, divided by the Standard Error for the Sample Proportion:

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}}$$

Another change will be the parameter used in our hypotheses. It will be rho ($\rho$), the Greek symbol these notes use for the population proportion $p$. For example, $H_0: \rho = 0.56$ is the symbolic form of the claim that the population proportion is equal to 56%.

Other than the two changes mentioned above, the seven steps given in earlier sections will work on these problems as well.

Steps to test a hypothesis
1. Express the original claim symbolically.
2. Identify the Null and Alternative hypothesis.
3. Record the data from the problem.
4. Calculate the test statistic.
5. Determine your rejection region.
6. Find the initial conclusion.
7. Word your final conclusion.

Example: Glamour magazine sponsored a survey of 2500 prospective brides and found that 60% of them spent less than 750 dollars on their wedding gown. Use a 0.01 significance level to test the claim that less than 62% of brides spend less than 750 dollars on their wedding gown. If these results were obtained from internet users who voluntarily went to the web to answer the survey, does that affect the result of the survey in any way?

Example: An article distributed by the Associated Press included these results from a nationwide survey: Of 880 randomly selected drivers, 56% admitted that they run red lights. Test the claim that the majority of all Americans run red lights.
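The proportion test statistic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the figures from the Glamour survey exercise ($n = 2500$, $\hat{p} = 0.60$, claim $p < 0.62$, $\alpha = 0.01$), treats the claim as the alternative hypothesis of a left-tailed test, and relies on a hypothetical helper named `proportion_z_statistic`. It is not the author's worked solution.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_statistic(p_hat, p0, n):
    """z statistic for a proportion, with the null value p0 in the standard error."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Figures from the Glamour survey exercise; H0: p = 0.62, HA: p < 0.62.
n, p_hat, p0, alpha = 2500, 0.60, 0.62, 0.01

z = proportion_z_statistic(p_hat, p0, n)
critical_value = NormalDist().inv_cdf(alpha)  # boundary of the left-tail rejection region
p_value = NormalDist().cdf(z)                 # area to the left of the test statistic

reject_null = z < critical_value
print(f"z = {z:.2f}, critical value = {critical_value:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Both decision rules in the notes appear here: comparing z with the critical value (traditional method) and comparing the p-value with $\alpha$ (P-value method). They should always agree.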