These 20 pages of class notes were uploaded by Theresia Dare on Wednesday, September 23, 2015. They belong to STAT201 at Drexel University, taught by Staff in Fall.
Date Created: 09/23/15
Estimation Using Random Samples
Stat 201, Prof. Yanni Papadakis

Key Terms
- Point Estimate: Mean, Std Deviation, Proportion
- Interval Estimate: Mean ($\sigma$ known), Mean ($\sigma$ unknown, t distribution), Proportion
- Determination of Sample Size

Unbiased Point Estimates

Population Parameter     Sample Statistic     Formula
Mean $\mu$               $\bar{x}$            $\bar{x} = \sum x_i / n$
Variance $\sigma^2$      $s^2$                $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$
Proportion $\pi$         $p$                  $p = x / n$ ($x$ successes in $n$ trials)

Example: Estimating $\mu$, $\sigma$
Bob has been weighing himself once a week for the past month. His measurements are 190.5, 189.0, 195.5, 187.0.
- Estimate his average weight.
- Estimate the std deviation of his measurements.
- His long-term weight is $X \sim N(190\ \text{lb}, 5\ \text{lb})$. What is the probability that his average weight in 4 measurements is the one calculated above or higher?

Example: Estimating $\mu$, $\sigma$ (solution)
Estimate of $\mu$:
$\bar{x} = \sum x_i / n = (190.5 + 189.0 + 195.5 + 187.0) / 4 = 190.5$
Estimate of $\sigma$:
$s = \sqrt{\sum (x_i - \bar{x})^2 / (n - 1)} = \sqrt{[(190.5 - 190.5)^2 + (189.0 - 190.5)^2 + (195.5 - 190.5)^2 + (187.0 - 190.5)^2] / (4 - 1)} = \sqrt{(0 + 2.25 + 25 + 12.25) / 3} = 3.63$

Example: Estimating Proportion
ATM time use per service:

 #   Age                  Gender   Seconds
 1   a. under 30 years    Female   50.1
 2   a. under 30 years    Male     53.0
 3   b. 30-60 years       Female   43.2
 4   a. under 30 years    Female   34.9
 5   c. over 60 years     Male     37.5
 6   b. 30-60 years       Male     37.8
 7   c. over 60 years     Male     49.4
 8   a. under 30 years    Male     50.5
 9   c. over 60 years     Male     48.1
10   a. under 30 years    Female   27.6
11   c. over 60 years     Female   55.6
12   b. 30-60 years       Female   50.8

Estimate the proportion of female ATM users from the sample. What is its std deviation?

Example: Estimating Proportion (solution)
$x = 6$, $n = 12$
$p = x / n = 6 / 12 = 0.5$
$\sigma_p = \sqrt{p(1 - p) / n} = \sqrt{0.5(1 - 0.5) / 12} = 0.14$

Example: Estimating Proportion
A poll using a random sample of 500 registered voters estimates that a mayoral candidate is expected to gain the approval of 51% of the vote, with margin of error 3%.
- What is the std deviation of the proportion estimate?
- Using the Normal approximation (does it make sense?), calculate the probability that the candidate's approval rate will be within the margin of error.

Confidence Intervals: Overview
Strategy: provide a region inside which we expect, with great confidence, the real parameters to be.
- The confidence level is usually 95% or 90% (e.g., forecast proportion in 52% ± 2%).
- For the mean $\mu$: if $\sigma$ is known, use the Normal distribution; if $\sigma$ is unknown, use the t-distribution.
- For the proportion $p$: use the Normal approximation.

CI for $\mu$, $\sigma$ known
Confidence level $C$: usually $C$ = 95%; otherwise clearly stated, say $C$ = 90%.
Calculate:
- Sample mean $\bar{x}$
- Std Error $= \sigma / \sqrt{n}$
- Critical $z$ (typically $z = 1.96$)
- Margin of Error $= z \times$ Std Error
- CI = Sample Mean ± Margin of Error

CI for $\mu$, $\sigma$ unknown
Confidence level $C$: usually $C$ = 95%; otherwise clearly stated, say $C$ = 90%.
Valid if either:
- the underlying distribution is Normal, or
- the sample size is $n > 30$ (CLT).
Calculate:
- Sample mean $\bar{x}$
- Sample std deviation $s$
- Std Error $= s / \sqrt{n}$
- Degrees of freedom $df = n - 1$
- Critical $t = t(\alpha = (1 - C)/2,\ df)$
- Margin of Error $= t \times$ Std Error
- CI = Sample Mean ± Margin of Error

CI for the Mean: Quality Control
The long-term std deviation of rod diameters is $\sigma = 0.053$ in. A sample taken for quality control purposes had size $n = 30$ and sample average equal to 1.4 in.
- Calculate a CI for the true mean at level 95%.
- Calculate a CI for the true mean at level 90%.

CI for the Mean: Quality Control (95%)
$\bar{x} = 1.4$, $\sigma_{\bar{x}} = 0.053 / \sqrt{30} = 0.00968$ in, $z^* = 1.96$
CI95 = 1.4 ± 0.019
LCL = 1.381, UCL = 1.419

CI for the Mean: Quality Control (90%)
$\bar{x} = 1.4$, $\sigma_{\bar{x}} = 0.00968$ in, $z^* = 1.645$
CI90 = 1.4 ± 0.016
LCL = 1.384, UCL = 1.416

CI for the Mean: Example
A radio talk show, in order to settle a dispute about the wages of city council members, asks: "Is 60000 enough? How much should council members be paid annually?" 958 people respond, with sample mean 9740 and sample std deviation 1125. The radio station needs a confidence interval calculated for the mean salary.

CI for the Mean: Example (solution)
$\bar{x} = 9740$, $s_{\bar{x}} = 1125 / \sqrt{958} \approx 36.4$
$t(\text{level} = 0.95,\ df = 957) \approx z(\text{level} = 0.95) = 1.96$
Error margin $= 1.96 \times 36.4 = 71.2$
CI0.95 = 9740 ± 71.2 = [9668.8, 9811.2]
Is this a reliable statistical estimate?

CI for the Mean: Example
A random sample of 10 employees has been selected from a firm in which a large number of tax preparers are employed. Each of the subjects was given identical information and was asked to complete the tax return.
- Sample average: 39.42 min to complete
- Sample std dev: 13.75 min

CI for the Mean: Example (solution, C = 0.95)
$\bar{x} = 39.42$, $s = 13.75$
$s_{\bar{x}} = s / \sqrt{n} = 13.75 / \sqrt{10} = 4.35$
$t^* = t(\alpha = 0.025,\ df = 10 - 1 = 9) = 2.262$
CI95 = 39.42 ± 9.84
LCL = 29.58, UCL = 49.26

CI for the Mean: Example (solution, C = 0.99)
$\bar{x} = 39.42$
$t^* = t(\alpha = 0.005,\ df = 10 - 1 = 9) = 3.25$
CI99 = 39.42 ± 14.14
LCL = 25.28, UCL = 53.56

Example: Supermarket Shoppers
A marketing consultant observed 50 consecutive shoppers at a supermarket. One variable of interest was how much each shopper spent in the store. The sample statistics were mean = 27.40, stdev = 22.03.
- Calculate the 95% CI for the mean shopper expenditure under similar conditions.
- The supermarket needs an estimate with margin of error at most equal to 4. What should the sample size be in order to meet this requirement?

Example: Supermarket Shoppers (solution, C = 0.95)
$\bar{x} = 27.40$, $s_{\bar{x}} = 22.03 / \sqrt{50} = 3.12$
$t(\alpha = 0.025,\ df = 50 - 1 = 49) = 2.01$
CI95 = 27.40 ± 6.26
LCL = 21.14, UCL = 33.66
CI width requirement:
$4 \ge t \cdot s / \sqrt{n} \approx 1.96 \cdot s / \sqrt{n} \Rightarrow n \ge (1.96 \cdot s / 4)^2 = 116.5$
so at least 117 shoppers are needed.

CI: Population Proportion
Recall that when the Normal approximation assumptions are met, the sample proportion $p$ is approximately Normal with std deviation $\sqrt{p(1 - p)/n}$. Thus
CI: $p \pm z \sqrt{p(1 - p) / n}$

Critical z scores:
C     90%      95%      99%
z     1.645    1.960    2.576

Proportion CI: Example
In a USA Today/CNN poll, 1406 adults were randomly selected from across the US. In response to the question "Do you agree that the current system discourages the best candidates from running for president?", 22% responded "strongly agree". Calculate the 95% CI for the proportion who "strongly agree".

Proportion CI: Example (solution)
CI: $p \pm z \sqrt{p(1 - p) / n} = 0.22 \pm 1.96 \sqrt{0.22(1 - 0.22) / 1406} = [0.198, 0.242]$
Calculate the margin of error for $p = 0.5$.

Sample Size Determination: Mean
With CI margin of error limit $e$:
$e \ge z \sigma / \sqrt{n} \Rightarrow \sqrt{n} \ge z \sigma / e \Rightarrow n \ge z^2 \sigma^2 / e^2$
Example: In order to estimate the CI0.95 of the amount earned by teenagers in a state during the past summer vacation period with margin of error 50, how large a sample is required? It is known that $\sigma = 400$.
$n \ge (1.96 \times 400 / 50)^2 = 245.9$
Min integer $n = 246$

Required Sample Size: Proportion
With margin of error limit $e$:
$e \ge z \sqrt{p(1 - p) / n} \Rightarrow n \ge p(1 - p) z^2 / e^2$
When $p$ is unknown, use $p = 0.5$.
Example: A tourist agency would like to determine the proportion of US adults who have ever vacationed in Mexico. It wishes to be 95% confident and have a margin of error of no more than 3%. Determine the required sample size. What would the sample size be if $C = 0.90$?

Required Sample Size: Example
$p$ unknown, so use $p = 0.5$:
$n \ge p(1 - p) z^2 / e^2 = 0.5(1 - 0.5)(1.96 / 0.03)^2 = 1067.1$
Min integer $n = 1068$
For $C = 0.90$: $n \ge 0.5(1 - 0.5)(1.645 / 0.03)^2 = 751.7$
Min integer $n = 752$

When Population is Finite
CI: $p \pm z \sqrt{p(1 - p) / n} \cdot \sqrt{(N - n) / (N - 1)}$
Sample Size Determination:
$n \ge \dfrac{p(1 - p)}{(e / z)^2 + p(1 - p) / N}$

Finite Population: Example
The FAA lists 8719 pilots holding commercial helicopter certificates. It wants to calculate the proportion of pilots planning to switch to another job in the next three years. Design a study to estimate this proportion with margin of error 4% at $C = 0.95$. Is this result different from the one obtained by assuming an infinite population?

Finite Population: Example (solution)
Use $p = 0.5$ ($p$ unknown):
$n \ge \dfrac{p(1 - p)}{(e / z)^2 + p(1 - p) / N} = \dfrac{0.5(1 - 0.5)}{(0.04 / 1.96)^2 + 0.5(1 - 0.5) / 8719} = 561.6$
Min integer $n = 562$

Exercise
The ABA survey of community banks also asked about the loan-to-deposit ratio (LTDR), a bank's total loans as a percent of total deposits. The mean LTDR for the 110 banks in a random sample is 76.7 and the standard deviation is $s = 12.3$. The sample size is large enough to assume that the population $\sigma \approx s$. Calculate the 95% CI for LTDR.

The Standard Deviation and the Distribution of Data Values: The Empirical Rule and Tchebysheff's Theorem

Suppose that a data set has mean $\bar{x}$ and standard deviation $s$. We're used to working with and interpreting the mean $\bar{x}$, but what does the value of the standard deviation $s$ tell us? It's a measure of dispersion, or variability, in the data set, but we can be more specific than that. Here are two useful rules.

I. The empirical rule

The empirical rule tells us that if the data follows a normal distribution, then:
- Approximately 68% of the data values can be expected to lie within a one standard deviation interval around the mean, i.e. in the interval $\bar{x} \pm s$.
- Approximately 95% of the data values can be expected to lie within a two standard deviation interval around the mean, i.e. in the interval $\bar{x} \pm 2s$.
- Virtually all (approximately 99.73%) of the data values can be expected to lie within a three standard deviation interval around the mean, i.e. in the interval $\bar{x} \pm 3s$.

So, for example, if $\bar{x} = 50$ and $s = 8$, we'd expect about 68% of the data in the interval 50 ± 1(8), or (42, 58); we'd expect about 95% of the data in the interval 50 ± 2(8), or (34, 66); and we'd expect virtually all of the data in the interval 50 ± 3(8), or (26, 74). We now have information regarding the dispersion of the data around the mean. If you're interested in intervals having widths other than one, two or three standard deviations, you can use normal curve tables to find the appropriate percentages.

II. Tchebysheff's Theorem

The empirical rule is limited in that it only applies to data that follows (at least approximately) a normal distribution. Here is a rule, called Tchebysheff's Theorem, that applies to a distribution of any shape:

For any value of $k$ that is $\ge 1$, at least $100(1 - 1/k^2)$% of the data will lie within $k$ standard deviations of the mean.

This is a general formula that can be used with any value of $k$ of interest, as long as $k$ is at least one. For example, if we'd like to build a one standard deviation interval around the mean, apply the formula with $k = 1$. It says that at least $100(1 - 1/1^2)$% = 0% of the data will lie within one standard deviation of the mean. That's useless information, of course; we already knew that. The weakness of Tchebysheff's theorem is that, since it must apply to a distribution of any shape, it can't be very specific about the percentage of data in any interval. Also, it only gives a lower bound for that percentage. In this case, the lower bound provides no useful information.

Let's try a two standard deviation interval. Plugging $k = 2$ into the formula, we learn that at least $100(1 - 1/2^2)$% = 75% of the data will lie within two standard deviations of the mean. That's much more useful: it says that for any distribution, we have assurance that at least three-quarters of the data is within two standard deviations. Once again, keep in mind that this is a lower bound; the actual percentage could be as low as 75% or as high as 100%. We already know that if the distribution has a normal shape, then the actual percentage is closer to 95%.

Let's suppose once again that we're working with a data set that has a mean of $\bar{x} = 50$ and a standard deviation of $s = 8$. Here's a table showing what we can say about the distribution of the data using both the empirical rule and Tchebysheff's Theorem. For practice with the formula, you should verify the results shown in the Tchebysheff column at $k$ = 1.5, 2.5, 3 and 4.

Standard Deviations (k)   Interval                     Tchebysheff (%)   Empirical Rule (%)
1                         50 ± 1(8), or (42, 58)       At least 0        Approx. 68
1.5                       50 ± 1.5(8), or (38, 62)     At least 56
2                         50 ± 2(8), or (34, 66)       At least 75       Approx. 95
2.5                       50 ± 2.5(8), or (30, 70)     At least 84
3                         50 ± 3(8), or (26, 74)       At least 89       Virtually all (99.73)
4                         50 ± 4(8), or (18, 82)       At least 94

If the data appears to follow a normal distribution, then the empirical rule is preferred, as it is more specific. Otherwise, the empirical rule does not apply and Tchebysheff's Theorem should be used.
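
For readers who would like to check the Tchebysheff percentages numerically rather than by hand, the following is a minimal Python sketch. It assumes the same mean of 50 and standard deviation of 8 used in the example. The function names `chebyshev_lower_bound` and `interval` are illustrative choices, not part of the original notes, and percentages are rounded to whole numbers as in the table.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of data within k standard deviations of the mean.

    Implements 100 * (1 - 1/k^2), which is only valid for k >= 1.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return 100 * (1 - 1 / k**2)


def interval(mean, s, k):
    """Return the (lower, upper) endpoints of mean +/- k * s."""
    return (mean - k * s, mean + k * s)


# Assumed example values from the notes: mean 50, standard deviation 8.
mean, s = 50, 8

rows = []
for k in (1, 1.5, 2, 2.5, 3, 4):
    lower, upper = interval(mean, s, k)
    bound = round(chebyshev_lower_bound(k))  # whole-number percentage
    rows.append((k, lower, upper, bound))
    print(f"k = {k}: ({lower:g}, {upper:g}), at least {bound}% of the data")
```

The sketch deliberately covers only the Tchebysheff bound. The empirical-rule percentages depend on the normal curve and would need a normal table or a statistics library instead.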