Fall 2007 Stats 250 Exam 1 with detailed Solutions and Explanations
This 11 page Bundle was uploaded by Debra Tee on Wednesday October 5, 2016. The Bundle belongs to STATS 250 at University of Michigan taught by Brenda Gunderson in Fall 2016. Since its upload, it has received 9 views. For similar materials see Introduction to Statistics in Statistics at University of Michigan.
Date Created: 10/05/16
Statistics 350 Fall 2007 Exam 1 Explanations

1. a. This is an experiment because the employees were randomly assigned to the treatment groups.
b. Weight loss is a response variable because it is the outcome we wish to study. A response variable cannot also be an explanatory or a confounding variable, so neither of these is appropriate. Weight loss is also a (continuous) quantitative variable since the amount of weight lost is something that is measured. A quantitative variable cannot also be categorical, and there is insufficient information to conclude that weight loss is normal.
c. The value of 2 pounds is a statistic since it is calculated from the sample data.
d. Use the boxplots to answer the remaining questions.
i. For the no incentive group, 75% lost at least 1 pound. ‘At least’ means this much or more, so we want the value above which 75% of the data in this group lie. This is Q1, the bottom of the box.
ii. The most weight lost by an employee in the $14 incentive group was 15 pounds. This value may be an outlier, but it is still the maximum.
iii. False; boxplots should never be used to confirm the shape of a distribution.

2. a. Randomly selected means you want the marginal probability of conviction; i.e., you should include everyone. Hence, P(convicted) = #convicted/Total = 185/500 = 0.37.
b. i. P(convicted within 2 years given education level of at least 10 years). Recall that if A and B are independent events, then P(A|B) = P(A).
ii. Using only the row “At least 10 years”, find: P(convicted | educ of at least 10 years) = 50/200 = 0.25. The probability = 0.25. Since 0.25 differs from the 0.37 found in part (a), it appears that conviction status is not independent of education level.

3. No, this is a time plot (not a histogram).

4. a. Since the distribution of scores for class A is symmetric, the mean is also the median, and you can look for the middle value. In this case, the mean score is 5.
b. In class B the scores are more concentrated near the middle, so the standard deviation is smaller.
c. Correct: iv.
The individual scores vary from the mean by about 3.2 points, on average.
i. Individual scores may be more than one standard deviation away from the mean, so this can’t be correct. Remember, the standard deviation does not determine the minimum or maximum.
ii. Standard deviation is roughly the average distance between individual scores and the mean, not between individual scores.
iii. For one, the empirical rule only applies when the data are approximately normally distributed. Even then, 95% of the data are within two standard deviations of the mean rather than one.
d. Class B has an approximately normal distribution.

5. a. Bar chart. (Each bar corresponds to one value of a categorical variable.)
b. 1000(0.15) = 150 people
c. A confidence interval for a population proportion is given by \( \hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n} \), where z* depends on the desired confidence level. To find z* for a 90% confidence level, recall that we would like 90% of the area in the middle of a normal distribution, which gives z* = 1.645. Compute:
\( 0.15 \pm 1.645\sqrt{0.15(0.85)/1000} = 0.15 \pm 1.645(0.0113) = 0.15 \pm 0.0186 \), or (0.1314, 0.1686).
d. Yes, the entire interval of values is less than (below) 0.50 or 50%.
e. No, the sample of college football fans may not be representative of all adults.
f. Both i and iii describe the meaning of the confidence level. Choice ii is incorrect because the population parameter is a single, fixed (but unknown) value, so the probability that it is in an already computed interval is either 0 or 1.
g. The only correct choice is iv. Choice i is incorrect for the same reason as choice ii in part f. Choice iii would be correct if we changed “sample” to “population”. Choice ii isn’t even close, as it interprets the confidence level, and incorrectly at that.
h. \( n = \left(\frac{z^*}{2m}\right)^2 = \left(\frac{1.645}{2(0.02)}\right)^2 = (41.125)^2 = 1691.27 \), so we need at least 1692 adult college football fans.

6. First, observe that 45 seconds = 0.75 minutes.
a. Complete the sentence: 20% of the customers will have a service time shorter than 3.37 minutes. Here we want to find the 20th percentile of the service time distribution.
From Table A.1, the closest z value is -0.84. Recall that z = (x – μ)/σ. Solving for x gives us: x = μ + zσ = 4 – 0.84(0.75) = 4 – 0.63 = 3.37 minutes.
b. The probability that a customer will take longer than 5 minutes is: P(X > 5) = P(Z > (5 – 4)/0.75) = P(Z > 1.33) = 1 – 0.9082 = 0.0918. It might help to make a sketch.
c. In part b, we found that the probability a customer takes more than 5 minutes is p = 0.0918. The service time for each of the 4 customers can be thought of as an independent trial. So, the number of customers who take more than 5 minutes (call this Y) has the Binomial(4, 0.0918) distribution. Calculate: \( P(Y = 2) = \binom{4}{2}(0.0918)^2(0.9082)^2 = 6(0.0084)(0.8248) \approx 0.0416 \).
Want more review? Visit www.rossmanchance.com/applets/NormalCalcs/NormalCalculations.html

7. a. Each of the n = 6 cars is either out-of-state (“success”) or in-state (“failure”), independently, with probability of success p = 0.20. So, the distribution is Binomial(n = 6, p = 0.20).
b. P(1≤X≤4) = P(X=1) + P(X=2) + P(X=3) + P(X=4) = 0.39322 + 0.24576 + 0.08192 + 0.01536 = 0.7363. You could also do: P(1≤X≤4) = 1 – P(X=0) – P(X=5) – P(X=6) = 0.7363.
c. The mean, or expected value, of a binomial random variable is np. Here E(X) = 6(0.20) = 1.2 cars.
Want more practice finding binomial probabilities? Visit www.ltcconline.net/greenl/java/Statistics/Binomial/Binomial.htm (in Internet Explorer or Chrome).

8. Both ii and iii are correct. Since it was not a random sample, we should not assume independence, and the confidence interval assumptions are not met. Now, without a confidence interval it makes little sense to speak of the margin of error, so (i) is incorrect. Moreover, (i) is also false since a large sample size gives a small margin of error.

9. All of i, ii, and iv are correct.
i. Since Bob’s z-score is positive, he must be taller than the mean of 69 inches.
ii. This is the definition of a z-score!
iii. One standard deviation above the mean is 69 + 2 = 71 inches, so Bob is 71 inches tall, not 70.
iv.
Since the heights are normally distributed, we may use the empirical rule. Since about 68% of values are within one standard deviation of the mean, (100% – 68%)/2 = 16% are more than 1 standard deviation above the mean.

10. a. This is a binomial distribution where a “success” is a passenger showing up, with probability p = 0.9, while a “failure” is a passenger who fails to show up. There are n = 130 passengers who buy tickets. Hence, E(X) = np = 130(0.9) = 117 passengers.
b. The standard deviation of a binomial random variable is given by: \( \sqrt{np(1-p)} = \sqrt{130(0.90)(0.10)} = \sqrt{11.7} \approx 3.42 \).
c. We have a large sample size, so we can approximate the binomial with a normal distribution that has a mean of 117 and a standard deviation of 3.42. Every passenger who shows up gets a seat when 120 or fewer actually show up, so we want P(X ≤ 120). The z-score for 120 is \( z = (120 - 117)/3.42 \approx 0.88 \), so P(X ≤ 120) = P(Z ≤ 0.88) = 0.8106. A sketch might be useful here.

Statistics 350 Fall 2007 Exam 1

1. This past September the results of a study were reported by the Associated Press. The headline read “Study: Dieting for dollars works”. The study involved about 200 overweight employees at several campuses in North Carolina, who were randomly divided up into three groups. One group received no incentives while the other two groups received $7 or $14 for each percentage point of weight lost. (For example, someone in the middle group weighing 200 pounds who lost 10 pounds, or 5 percent, would get $35.) In the end, employees who received the most incentives lost the most weight, an average of nearly 5 pounds after three months.
a. This study is: (circle one)
   an observational study    an experiment
b. Weight loss (in pounds) is: (circle all that are appropriate)
   an explanatory variable    a response variable    a confounding variable
   a categorical variable    a quantitative variable    a normal variable
c. The study reported that those offered no incentives lost an average of 2 pounds, whereas those in the $7 group lost about 3 pounds on average.
The value of 2 pounds is:
   a parameter    a statistic    a sample    a population
d. Use the boxplots to answer the remaining questions.
i. For the no incentive group, 75% lost at least _________ pound(s).
ii. The most weight lost by an employee in the $14 incentive group was _______ pound(s).
iii. True or False: The boxplot of the weight losses for the $7 incentive group confirms the distribution is bell-shaped.    True    False

2. A study of the behavior of a large number of drug offenders after treatment for drug abuse suggests that the likelihood of conviction within a 2-year period after treatment may depend on the offender’s education. Use the following table of results to answer parts (a) and (b):

                          Status within 2 years of treatment
Education level           Convicted    Not convicted    Total
At least 10 years             50           150           200
Less than 10 years           135           165           300
Total                        185           315           500

a. What is the probability that a randomly selected drug offender will be convicted within 2 years of treatment?
Final Answer: ______________________
b. The researcher would like to assess if conviction status is independent of education level.
i. To check for independence, the probability found in part (a) should be compared to which of the following probabilities? Clearly circle your answer.
   P(education level of at least 10 years)
   P(education level of at least 10 years given convicted within 2 years)
   P(convicted within 2 years given education level of at least 10 years)
ii. Find the probability you circled above and circle the appropriate conclusion.
The probability = ____________________________________________
Thus it appears that conviction status (is / is not) independent of education level.

3. “Seeking a cure” was the title of this USA Today Snapshot regarding diabetes. Based on the graph, is it appropriate to say that the distribution of the number diagnosed with diabetes is right skewed?
Circle one:    Yes    No
Briefly explain:

4.
The scores for a recent pretest were recorded for the two sections of a class (all scores were integer values). A summary of the two score distributions is provided below.
a. What is the mean score for the students in class A?
Final answer = ____________________
b. The standard deviation for the scores in Class A is 3.2. The standard deviation for the scores in Class B would be … (circle one)
   larger    smaller
c. Which of the following is an appropriate interpretation of that standard deviation for Class A?
i. All of the individual scores are within 3.2 points, on average.
ii. The average difference between the individual scores is roughly 3.2 points.
iii. We expect 95% of the scores to be within 3.2 points of the mean.
iv. The individual scores vary from the mean by about 3.2 points, on average.
d. At the right we have one more plot, but the title is missing the letter for the class. Which class is this a plot of?
Clearly circle your answer:    Class A    Class B

5. For this problem, consider the chart at the right, which can be found on the Gallup Organization’s website (www.galluppoll.com).
a. What type of chart is this?
__________________________________
b. Suppose that this chart summarizes the results of a survey of a random sample of 1,000 adult college football fans. How many people in the sample reported that they prefer the current bowl game system?
Final Answer: ______________________
c. Create a 90% confidence interval to estimate the proportion of all adult college football fans who prefer the current bowl game system.
Final Answer: ______________________
d. Does the interval from part (c) provide evidence that a minority of all adult college football fans prefer the current bowl game system?
Circle one:    yes    no
Explain in one simple sentence.
e. Does the interval from part (c) provide evidence that a minority of all adults prefer the current bowl game system?
Circle one:    yes    no
Explain in one simple sentence.
f.
Consider the following statements below. Clearly circle all which correctly explain the meaning of the 90% confidence level.
i. Confidence intervals computed by using the same procedure will include the true population value for 90% of all possible random samples taken from the population.
ii. The probability that the population parameter falls between the bounds of an already computed confidence interval is roughly 90%.
iii. If we consider all possible randomly selected samples of the same size from a population, the 90% is the percentage of those samples for which the corresponding confidence interval will contain the population parameter.
g. The 90% confidence interval for the population proportion of all adult college football fans who would prefer a playoff tournament to the current system was found to be (0.67, 0.72). Which of the following is a correct interpretation of this 90% confidence interval? Clearly circle your one answer.
i. There is a 90% probability that the proportion of all adult college football fans who would prefer a playoff tournament is between 0.67 and 0.72.
ii. If this study were to be repeated with a sample of the same size, there is a 90% probability that the sample proportion would be between 0.67 and 0.72.
iii. We estimate, with 90% confidence, that the sample proportion of all adult college football fans who would prefer a playoff tournament is between 0.67 and 0.72.
iv. We estimate, with 90% confidence, that the population proportion of all adult college football fans who would prefer a playoff tournament is between 0.67 and 0.72.
h. Suppose another survey is to be conducted for estimating the population proportion of all adult college football fans who prefer the current bowl game system. The sample size should be large enough so that, with 90% confidence (conservative approach), we are within 2% of the population proportion. How large a sample size do we need?
Final Answer: _________________

6.
A study at the United States Postal Service suggests that the time taken to serve an individual customer at a post office is normally distributed with a mean of 4 minutes and standard deviation of 45 seconds.
a. Complete the following sentence: (show all work)
20% of the customers will have a service time shorter than __________ minutes.
b. What is the probability that a customer will take longer than 5 minutes?
Final Answer: ______________________
c. Suppose we track 4 randomly selected customers. What is the probability that exactly two of the four customers will take more than 5 minutes? Show all work.
Final Answer: ______________________

7. Suppose that among all of the many cars parked at the stadium on a game day, 20% are out-of-state. Let X represent the number of cars in a random sample of size 6 that are out-of-state. The probability distribution for X is given below.

X = x         0         1         2         3         4         5         6
P(X = x)   0.26214   0.39322   0.24576   0.08192   0.01536   0.00154   0.000064

a. What is the distribution for the random variable X = the number of out-of-state cars in a random sample of 6 cars? (Be specific.)
_____________________________________________________________
b. What is the probability of finding at least 1 but no more than 4 out-of-state cars in your random sample of 6 cars?
Final Answer: ______________________
c. What is the expected number of out-of-state cars in a random sample of size 6?
Final answer: ______________________

8. On August 7, 2007, Barry Bonds became the all-time home run leader in major league baseball. Bonds’ record is controversial due to his use of steroids, which are not allowed in professional baseball. Mark Ecko purchased the record-making ball and set up an internet survey to let the public decide what he should do with it. Below are excerpts from the September 27th New York Times article giving the results of that poll.
“After more than 10 million online votes, 47% of voters wanted the ball to be adorned with an asterisk [and donated to the Baseball Hall of Fame], 34% said it should not be changed [and donated to the Baseball Hall of Fame] and 19% wanted it to be shot into space.… Ecko, who cast three online votes for the ball to have an asterisk, said the voting was emblematic of the cynical way fans often view sports these days.… Based on Ecko’s poll, almost two-thirds of the voters had doubts about Bonds’ achievements.”
Based on the above, clearly circle all of the statements below that are true.
i. With such a large sample size, there would be a very large margin of error, so this poll gives very accurate information about the opinions of the population of all baseball fans on Bonds’ achievements.
ii. Since the sample was a volunteer sample, it may not give information that is representative of the population of all baseball fans.
iii. The responses are not independent, therefore this data should not be used to create a confidence interval.

9. The distribution of heights of adult men is approximately normal with a mean of 69 inches and a standard deviation of 2 inches. Bob's height has a z-score of 1 when compared to all adult men.
Consider the following statements. Circle all true statements based on the above information.
i. Bob is taller than 69 inches.
ii. Bob's height is 1 standard deviation above the mean.
iii. Bob is 70 inches tall.
iv. About 16% of adult men are taller than Bob.

10. The overbooking problem: Airlines find that a small percentage of ticket holders fail to show up to board a flight. Assume that the percentage is 10%. As a result, the airlines often sell more tickets than the capacity of the plane. Suppose for planes with 120 seats, the airline sells 130 tickets. Let X be the number of passengers out of 130 that actually show up.
a. If 130 tickets are sold, how many passengers are expected to actually show up?
Final Answer: ______________________
b.
If 130 tickets are sold, what is the standard deviation for the number of passengers that actually show up?
Final Answer: ______________________
c. If 130 tickets are sold, what is the probability that every passenger that actually shows up will get a seat? Show all work.
Final Answer: ______________________
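
The arithmetic in the explanations for problems 2, 5, 6, 7, and 10 can be re-checked with a short script. What follows is a minimal sketch, assuming Python 3.8 or later and only its standard library; the variable names are illustrative and are not part of the exam. It uses exact normal probabilities instead of Table A.1, so a few results differ slightly from the table-based answers (for example, roughly 0.091 rather than 0.0918 in problem 6b). The 90% multiplier is hard-coded to the table value 1.645 used in the explanations.

```python
from math import ceil, comb, sqrt
from statistics import NormalDist

# Problem 2: marginal vs. conditional probability of conviction
p_convicted = 185 / 500                 # part (a)
p_convicted_given_educ = 50 / 200       # part (b), row "At least 10 years"
independent = abs(p_convicted - p_convicted_given_educ) < 1e-9

# Problem 5c: 90% confidence interval for a proportion
Z_STAR_90 = 1.645                       # table value, as in the explanations
p_hat, n = 0.15, 1000
std_err = sqrt(p_hat * (1 - p_hat) / n)
ci_low = p_hat - Z_STAR_90 * std_err
ci_high = p_hat + Z_STAR_90 * std_err

# Problem 5h: conservative sample size for margin of error m
m = 0.02
n_needed = ceil((Z_STAR_90 / (2 * m)) ** 2)

# Problem 6: service time ~ Normal(mean 4 min, sd 0.75 min)
service = NormalDist(mu=4, sigma=0.75)
pct20 = service.inv_cdf(0.20)           # 20th percentile
p_long = 1 - service.cdf(5)             # P(X > 5), exact rather than table-based
p_two_of_four = comb(4, 2) * p_long**2 * (1 - p_long) ** 2

# Problem 7b: Binomial(6, 0.20), P(1 <= X <= 4)
p_1_to_4 = sum(comb(6, k) * 0.2**k * 0.8 ** (6 - k) for k in range(1, 5))

# Problem 10: Binomial(130, 0.9) with a normal approximation
tickets, p_show = 130, 0.9
mean_show = tickets * p_show
sd_show = sqrt(tickets * p_show * (1 - p_show))
p_all_seated = NormalDist(mean_show, sd_show).cdf(120)

print(f"2a: {p_convicted:.2f}  2b: {p_convicted_given_educ:.2f}  independent? {independent}")
print(f"5c: ({ci_low:.4f}, {ci_high:.4f})  5h: n >= {n_needed}")
print(f"6a: {pct20:.2f} min  6b: {p_long:.4f}  6c: {p_two_of_four:.4f}")
print(f"7b: {p_1_to_4:.4f}")
print(f"10a: {mean_show:.0f}  10b: {sd_show:.2f}  10c: {p_all_seated:.4f}")
```

The small gaps between these values and the worked answers come from rounding z-scores to two decimals before using the table; both approaches are acceptable at the precision the exam asks for.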