This 13 page Class Notes was uploaded by Orval Funk on Monday September 28, 2015. The Class Notes belongs to STAT101 at University of Pennsylvania taught by Staff in Fall. Since its upload, it has received 12 views. For similar materials see /class/215432/stat101-university-of-pennsylvania in Statistics at University of Pennsylvania.
Date Created: 09/28/15
Statistics 101
Review Example on Bivariate
November 8, 2004

A company makes two calculators: a scientific calculator and a business calculator.
Let X = monthly demand (in thousands) for the business calculators.
Let Y = monthly demand (in thousands) for the scientific calculators.
We are given the following joint probability distribution function:

             x = 1   x = 2   x = 3   x = 4
    y = 1    .2      .1      .1      0
    y = 2    0       .1      .1      0
    y = 3    0       .1      .1      .2

                          Business Calculators   Scientific Calculators
    Selling price
    Manufacturing cost    20                     30
    Total fixed cost: $50,000 per month

Questions

1. a. Find p(x). Use p(x) to find the expected value and variance of X.
   b. Find p(y). Use p(y) to find the expected value and variance of Y.
2. If we sell 1,000 scientific calculators, find the probability distribution, expected value and variance for the number of business calculators sold.
3. Let Z be monthly profit (in $10,000 units) from both types of calculators.
   a. Express Z in terms of X and Y.
   b. Find p(z). Use p(z) to find the expected value and variance of Z.
   c. Find Cov(X,Y) and Corr(X,Y). Interpret the correlation in context.
   d. Use the rules for E(Z) and V(Z) to verify your answer in part b.
4. a. What is the probability that total monthly profit exceeds $5,000?
   b. What is the probability that total monthly profit is negative?

Solutions

1.  x    p(x)   x p(x)   (x - mu)^2 p(x)
    1    .2     .2       2.25(.2) = .45
    2    .3     .6       .25(.3) = .075
    3    .3     .9       .25(.3) = .075
    4    .2     .8       2.25(.2) = .45
    E(X) = 2.5, V(X) = 1.05

    y    p(y)   y p(y)   (y - mu)^2 p(y)
    1    .4     .4       1(.4) = .4
    2    .2     .4       0(.2) = 0
    3    .4     1.2      1(.4) = .4
    E(Y) = 2.0, V(Y) = .8

2.  x    p(x | y=1)     x p(x | y=1)   x^2 p(x | y=1)
    1    .2/.4 = .5     .5             1(.5) = .5
    2    .1/.4 = .25    .5             4(.25) = 1.0
    3    .1/.4 = .25    .75            9(.25) = 2.25
    E(X | Y=1) = 1.75, E(X^2 | Y=1) = 3.75, V(X | Y=1) = 3.75 - 1.75^2 = .6875

3. a. Z = X + 2Y - 5
   b. Table of values of Z:

                 x = 1   x = 2   x = 3   x = 4
        y = 1    -2      -1      0       1
        y = 2    0       1       2       3
        y = 3    2       3       4       5

        z     p(z)   z p(z)   (z - mu)^2 p(z)
        -2    .2     -.4      12.25(.2) = 2.45
        -1    .1     -.1      6.25(.1) = .625
        0     .1     0        2.25(.1) = .225
        1     .1     .1       .25(.1) = .025
        2     .1     .2       .25(.1) = .025
        3     .1     .3       2.25(.1) = .225
        4     .1     .4       6.25(.1) = .625
        5     .2     1.0      12.25(.2) = 2.45
        E(Z) = 1.5, V(Z) = 6.65

   c. E(XY) = 1(1)(.2) + 2(1)(.1) + 3(1)(.1) + 2(2)(.1) + 3(2)(.1) + 2(3)(.1) + 3(3)(.1) + 4(3)(.2) = 5.6
      Cov(X,Y) = E(XY) - E(X)E(Y) = 5.6 - 2.5(2) = .6
      Corr(X,Y) = .6 / (1.05^.5 x .8^.5) = .6547
      The demand for the two calculators tends to go in the same direction. This is a relatively strong (.65 on a 0 to 1 scale) linear relationship.
   d. E(Z) = E(X + 2Y - 5) = E(X) + 2E(Y) - 5 = 2.5 + 2(2) - 5 = 1.5
      V(Z) = V(X + 2Y - 5) = V(X) + 4V(Y) + 2(1)(2)Cov(X,Y) = 1.05 + 3.2 + 4(.6) = 6.65

4. a. Profit is greater than .5 ($5,000). Just
add up the probabilities of the cases: .1 + .1 + .1 + .1 + .2 = .6
   b. Profit is negative only when X = 1 and Y = 1, or X = 2 and Y = 1. Prob = .2 + .1 = .3

Review Problems on Last Part of the Material

1. Business Week's rankings of business schools are in part based on a survey of students. For example, 150 MBA students at Wharton are randomly selected and asked to rate the quality of teaching on a 0-10 scale. The following summary statistics are calculated from the 90 students who responded to the questionnaire: sample average = 7.3, sample standard deviation = 1.9, and 18 of the 90 students rated Wharton a 10.
   a. Find the 95% confidence interval for the true population mean of the rankings of quality of teaching.
   b. Find the 95% confidence interval for the true proportion of students who rate Wharton a 10.
   c. Do the data support the claim that the population mean μ for the teaching quality at Wharton exceeds 7? Use α = .05.
   d. Do the data support the claim that the population proportion of ratings that are 10 exceeds 15%? Use α = .01.
   e. How large a sample would we need for the confidence interval in b to have a margin of error of .04? How would this change if we were certain that p < .30?
   f. Which of the assumptions you made in doing the calculations in parts a-b is most suspect?

2. A company called ESP (extra special people) claims that they can predict the behavior of the stock market. A class action suit against ESP was launched by disgruntled former clients. The following experiment was used in the court proceedings. Stocks are randomly chosen on Monday morning. ESP can either predict that the price at the end of the trading day on Friday is above or below the opening price on Monday morning for each stock. For simplicity, assume that the probability is zero that the prices on Monday morning and Friday afternoon are the same. It is decided that ESP has to be correct more than 60% of the time to prove its claim.
   a. i. What are the null and alternative hypotheses in a test designed to show ESP's claim?
      ii. What sample size would be required so that a 95% confidence interval for
the probability that ESP is correct has a margin of error of .08?
   b. If a sample of size 400 stocks is taken, what proportion of them must ESP get correct to prove its claim? Use α = .05.
   c. Would the error be greater in the test in b if the true proportion were .63 or .65? What kind of error would this be?

3. It is argued that a review course for the SAT tests does not increase SAT scores. The following test is performed. Twenty-five high school juniors are randomly chosen on March 1. Each of the students takes the SAT exam on March 15 prior to any instruction. The twenty-five students are then given the course to improve the verbal score on the SAT. They take the SAT exam after the course. Below are the observed data.

    Student    Verbal Score   Verbal Score   Difference
    Number     Before         After
    1          460            630            170
    2          610            630            20
    3          510            520            10
    4          600            640            40
    5          390            440            50
    6          610            590            -20
    7          410            610            200
    8          530            520            -10
    9          620            640            20
    10         360            400            40
    11         590            580            -10
    12         550            600            50
    13         640            700            60
    14         510            590            80
    15         430            500            70
    16         380            460            80
    17         600            610            10
    18         590            560            -30
    19         620            670            50
    20         450            500            50
    21         480            450            -30
    22         510            600            90
    23         480            510            30
    24         600            590            -10
    25         630            650            20
    Average            526.4      567.60     41.20
    Sample Variance    7749.00    6135.67    3077.67

   a. i. Perform an appropriate test to see if the course improved the mean score. Use α = .05.
      ii. A victory for the course will only be declared if it increases the mean score by more than 20 points. What conclusion would you draw now using α = .05?
   b. Do the data support the claim that the proportion of people who improved is greater than 50%? Assume α = .05.

4. The government passed a law that receipts must accompany charitable contributions of $250 or more on an individual's 1994 IRS itemized deductions form. A study of 1993 returns shows the following population values. The mean μ1993 and standard deviation σ1993 for the percentage of an individual's income that was claimed as charitable contributions were 3 and 2.7 respectively, and the proportion p1993 that didn't claim any deductions for charitable contributions was .80.
Let μ1994, σ1994 and p1994 denote the corresponding values for 1994. A sample of 150 returns for 1994 was randomly chosen. The sample average and standard deviation for the percentage of an individual's income that was claimed as deductions for charitable contributions were 2.5 and 2.0 respectively. Also, of the 150 people, 125 of them did not claim any deductions for charitable contributions. Use α = .05 for all tests.
   a. i. Test whether μ1994 is lower than the corresponding value in 1993.
      ii. What is the probability that the test in i would give the correct answer if μ1994 were really 2.5? Assume σ1994 = 2.0.
   b. What sample size would be required if, in addition to α = .05, we wanted the probability of making an error for the test in part a to be .05 when μ1994 = 2.5?
   c. Test whether p1994 is higher than the corresponding value in 1993.

5. A study is undertaken to see if providing additional funds by state government for high school students increases the percentage that go to college. There are eight districts in the state. Two high schools are randomly chosen in each district. One of the schools, randomly selected, is given an additional stipend of $1,000 per student and the other school is not given this additional stipend. The following data are collected on the percentage going to college in each school.

    District             School with   School without   Difference
                         Stipend       Stipend
    1                    53            46               7
    2                    27            21               6
    3                    70            41               29
    4                    39            41               -2
    5                    35            32               3
    6                    53            56               -3
    7                    61            58               3
    8                    50            43               7
    Sample Average       48.5          42.25            6.25
    Sample Standard      14.1219       12.0208          9.9535
    Deviation

   Use α = .05 for all tests.
   a. Perform an appropriate t-test to see if the mean percentage going to college with the stipend exceeds the mean percentage going to college without the stipend.
   b. The school receiving stipends in school district 3 used the money for scholarships rather than to improve the education. Redo part a after eliminating the 3rd district.
   c. Compare your answers in parts a and b. Give explanations for differences wherever appropriate.

6. Two airlines are competing for the route from
NY to Washington, DC. The following are summary statistics for the number of minutes that each airline was late.
   Airline 1: In a sample of size 15, sample average = 20 minutes and s = 20 minutes.
   Airline 2: In a sample of size 15, sample average = 15 minutes and s = 17.5 minutes.
   a. A test is run for the claim that the average amount of minutes that airline 1 is late is less than 30. What can you say about the P-value of the test? What does the P-value imply as to the values of alpha for which the claim would be supported?
   b. What do the data suggest about the appropriateness of the assumptions that were made in answering part a?
   c. If the true mean time that airline 1 is late is 20 minutes and the true standard deviation is 20 minutes, how likely would a test to show that the mean is less than 30 arrive at the correct conclusion? Assume n = 15 and α = .05.
   d. Of the 15 observations for airline 1, 12 of them were below 30 minutes and only 3 of them were above 30 minutes. Is this sufficient in showing that the median time for airline 1 is below 30 minutes? Use α = .05.
   e. Do the data support the claim that the mean time for airline 1 is higher than the mean time for airline 2? JMP returned a value of t that is .73 from the above data. Use α = .05.
   f. The reason the average time for airline 1 is relatively high is that one of the flights was in inclement weather and arrived an hour and a half late. What would the average for airline 1 be without this observation? What does this say about the appropriateness of the test in part d?

7. Let H0 be the hypothesis that a tumor is benign and Ha be the hypothesis that the tumor is malignant. A physician gives a screening test that gives ratings of 1, 2, 3, 4 or 5. The following probabilities are available for the distribution of ratings from individuals with benign and malignant tumors.

    Rating   1      2      3     4     5
    H0       .05    .05    .2    .3    .4
    Ha       .2     .2     .3    .2    .1

   a. If a person is classified as having a benign tumor if the ratings are 2, 3, 4 or 5, what are the values of α and β? What do these values mean in the context of this problem?
   b. The form
of the test is to classify a tumor as benign if the rating is at least c. What value of c minimizes α + β?

8. Consider the regression analysis example discussed in class relating the number of hours a student works on school related functions (X) to the student's GPA (Y).

   [Figure: Bivariate Fit of GPA By School Hrs. Scatterplot of GPA against School Hrs with the linear fit.]

   Linear Fit
       GPA = 1.7644708 + 0.0342453 School Hrs

   Summary of Fit
       RSquare                       0.468111
       RSquare Adj                   0.466774
       Root Mean Square Error        0.402178
       Mean of Response              3.10175
       Observations (or Sum Wgts)    400

   Analysis of Variance
       Source      DF     Sum of Squares   Mean Square   F Ratio
       Model       1      56.65629         56.6563       350.2763
       Error       398    64.37548         0.1617        Prob > F
       C. Total    399    121.03177                      <.0001

   Parameter Estimates
       Term          Estimate     Std Error   t Ratio   Prob>|t|
       Intercept     1.7644708    0.074228    23.77     <.0001
       School Hrs    0.0342453    0.00183     18.72     <.0001

   a. Interpret the effect of increasing the number of school hours by one on GPA.
   b. What would be the predicted GPA for someone who only spent 20 hours per week on school related functions? Find the range that has a 95% chance of including the GPA for students who work 20 hours on school related functions.
   c. Consider the row labeled School Hrs in the Parameter Estimates part of the output.
      i. The column labeled Prob>|t| is the P-value for the test of the null hypothesis that the true population slope for school hrs is zero. The actual P-value is less than 1 in ten thousand. Interpret what this means.
      ii. The Std Error column is giving the standard deviation of the estimate of the slope (analogous to s/sqrt(n) for the sample average). Construct a 95% confidence interval for the slope (since n is large you may use z) and interpret this interval in the context of this problem.

Note: We did not cover material in Question 2 in Fall 2004, so omit this question for review.

Statistics 101
Final Exam
April 30, 2004

Notes:
1. The exam is open book and notes. Calculators are permitted but not computers.
2. Please include all of your work in the blue book.
3. Please indicate the null and alternative hypotheses when
appropriate.

1. (15 pts) Television shows are pretested by inviting 100 individuals to come to the studio, showing these individuals a pilot of the television show, and asking each individual to rate the show on a scale of 0 (do not like it at all) to 100 (would be a faithful viewer of the show). It is decided that if the population mean of the rating exceeds 60, then the show is worth airing. For one particular show, the average rating of the 100 individuals who observed the pilot was 65 with a sample standard deviation of 15. Of the 100 ratings, 59 were greater than 60 and the remaining 41 were 60 or below.
   A. Are the data sufficient in showing that the population mean exceeds 60? Use α = .01.
   B. Are the data sufficient in showing that the proportion of people who give the show a rating of above 60 exceeds 50%? Use α = .05.
   C. Are 100 individuals a sufficient sample size so that a 95% confidence interval for the proportion of individuals who rate a television show to be greater than 60 has a margin of error of .07? If not, what sample size would be required?

2. (20 pts) The television studio has an option of running one of two television shows. The first television show is the one described in problem 1. You are therefore to use the data that are provided on the ratings from television program 1 that are indicated in problem 1. The second television show is viewed by a different audience of 100 individuals. This show is the favorite of the professionals of the studio. The average rating that this show gets is 70 with a sample standard deviation of 13. Of the 100 ratings, 69 were greater than 60 and the remaining 31 were 60 or below.
   A. Do the data justify that the mean rating for this show exceeds the mean rating for the show described in question 1? Use α = .05.
   B. Do the data justify that the proportion of people in the population who rate this show higher than 60 is greater for this show than the one in problem 1? Use α = .05.
   C. Assume each of the shows' ratings has a known population standard deviation of 15. All
questions refer to H0: μ1 = μ2 versus Ha: μ1 ≠ μ2 at α = .05.
      Situation 1: μ1 = 60 and μ2 = 70.
      Situation 2: μ1 = 55 and μ2 = 75.
      For each of the four descriptions below (i, ii, iii and iv), answer (a) Situation 1 is higher, (b) Situation 2 is higher, or (c) no difference. Also include a one-sentence explanation of your answer.
      i. The probability of making a Type II error.
      ii. The probability of making a Type I error.
      iii. The P-value in the test (assuming given sample sizes and sample averages).
      iv. The sample size (assume equal sample size) for a Type II error of .05.

3. (25 points) In a study of mutual funds (based on real data and analysis in the literature), 100 mutual funds are randomly chosen and their returns in 1992 and 1993 are found. Refer to Figure 1 below in answering parts A and B.
   A. i. What is the meaning of the R-squared value in the context of the problem?
      ii. The P-value for the test of the null hypothesis that the population R-squared value is zero versus the alternative that it is greater than zero is .0399. What does the P-value mean in the context of this problem?
   B. i. What is the predicted return in 1993 for a mutual fund that had a return of 10 percent in 1992?
      ii. What would be the range of returns that has a 95% chance of including the true return in 1993 for a mutual fund with a return of 10 percent in 1992 (assuming a normal distribution)?
   C. Refer to either Figure 2 or Figure 3, whichever you think is more appropriate.
      i. Do the data show that the mean return in 1993 is different from the mean return in 1992, using α = .01?
      ii. Refer to output and comment on whether the assumptions you made in answering C.i are valid.
   D. Mutual Fund A has a standard deviation of 10. Mutual Fund B has a standard deviation of 12. For what correlations between Mutual Fund A and Mutual Fund B would the portfolio that puts half of your money in each of mutual funds A and B have a lower standard deviation than putting all of your money in Mutual Fund A?

Figure 1: Regression of 1993 Returns on X = 1992 Returns

   Linear Fit
       Return 93 = 16.6514 - 0.2998368 Return 92
   Summary of Fit
       RSquare                       0.04236
       RSquare Adj                   0.032589
       Root Mean Square Error        11.54893
       Mean of Response              14.4843
       Observations (or Sum Wgts)

   Analysis of Variance
       Source      DF    Sum of Squares   Mean Square   F Ratio
       Model       1     578.186          578.186       4.3349
       Error       98    13071.026        133.378       Prob > F
       C. Total    99    13649.212                      0.0399

   Parameter Estimates
       Term         Estimate     Std Error   t Ratio   Prob>|t|
       Intercept    16.6514      1.554716    10.71     <.0001
       Return 92    -0.299837    0.14401     -2.08     0.0399

Figure 2: 1992 and 1993 Returns

   [Figure: Oneway Analysis of Returns By Year. Returns plotted for Return 92 and Return 93.]

   Means and Std Deviations
       Level        Number   Mean      Std Dev   Std Err Mean   Lower 95%   Upper 95%
       Return 92    100      7.2276    8.0599    0.8060         5.628       8.827
       Return 93    100      14.4843   11.7418   1.1742         12.1154     16.814

Figure 3: 1993 Returns Minus 1992 Returns

   [Figure: Normal Quantile Plot of 1993-1992.]

   Moments
       Mean              7.2567
       Std Dev           15.549585
       Std Err Mean      1.5549585
       upper 95% Mean    10.342075
       lower 95% Mean    4.171325
       N

4. (24 points) This problem is based on a real court case, although the data have been changed because of confidentiality. A manufacturer of Polio vaccines is required by law to test the virulence of the vaccine on n animals. The scores on virulence are on a 1-10 scale. We need to feel comfortable (translated to be a type I error of .05) that the mean score is below four for the vaccine to be safe to be released. Assume that σ = 3 in answering all of the parts to question 4.
   A. The average score and sample standard deviation from the five animals that were tested were 2.1 and 2.7 respectively. Should the vaccine be released? Use α = .05.
   B. If the true mean score is 2, what is the probability of correctly releasing the vaccine using the test in Part A?
   C. Since we want to release vaccines that have population means of 2 frequently, more animals need to be tested. How many animals would have to be tested so that there is only a 5% chance of releasing vaccines that should not be released and a 10% chance of not releasing vaccines that should be released when the
population mean is 2?
   D. There is a lawsuit against the pharmaceutical company claiming that they are not following statistical practice. The following data are collected on four lots of vaccine.

       Lot   Sample Average   Number of Animals   Released
       1     2.7              18                  Yes
       2     2.5              8                   No
       3     2.9              16                  Yes
       4     2.6              11                  No

      Is this consistent, in that there is the same α (does not have to be .05) that gives rise to releasing lots 1 and 3 but not lots 2 and 4?

5. (16 points) For purposes of analysis, assume that movies are categorized into one of two groups: H0: Movie is nonprofitable (bust) or Ha: Movie is profitable (success). From past experience it has been determined that the number of stars that a critic gives to a movie is related, although not perfectly, to whether the movie is a bust or success.
   Movies that are H0: respective probabilities of 1, 2, 3 and 4 stars are .4, .39, .16 and .05.
   Movies that are Ha: respective probabilities of 1, 2, 3 and 4 stars are .1, .2, .32 and .38.
   You decide to classify a movie as a success (i.e., reject the null hypothesis) if it receives at least c stars.
   A. i. What is the value of c so that the probability of a type I error is .05?
      ii. What is the probability of making a type II error if the rule in A.i is adopted?
      iii. What is the p-value associated with a movie that receives 3 stars?
   The following information is available on movies:
   1. 25% of the movies are successes and the remaining 75% are busts.
   2. i. Not releasing a movie gives zero profit regardless of whether the movie is a bust or a success.
      ii. Releasing a movie that is a bust loses 25 million dollars.
      iii. Releasing a movie that is a success makes 50 million dollars.
   B. i. What is the probability that a movie is a success if it receives a 3 star rating?
      ii. If a movie is not released, it of course has an expected profit of zero. Compute the expected profit if a movie with 3 stars is released to determine if it is profitable on average to release a movie with 3 stars.
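
The quantities asked for in problem 5 can be checked numerically. The Python sketch below is only an illustration and is not part of the exam. It assumes the star probabilities read as .4, .39, .16, .05 under H0 and .1, .2, .32, .38 under Ha, and that profits are measured in millions of dollars. All function and variable names are hypothetical.

```python
# Star-rating probabilities from problem 5 (index 0 = 1 star, ..., index 3 = 4 stars).
# Assumption: decimals are read as .4/.39/.16/.05 (bust) and .1/.2/.32/.38 (success).
P_STARS_BUST = [0.40, 0.39, 0.16, 0.05]     # H0: movie is a bust
P_STARS_SUCCESS = [0.10, 0.20, 0.32, 0.38]  # Ha: movie is a success
PRIOR_SUCCESS = 0.25                        # 25% of movies are successes
PROFIT_SUCCESS = 50.0                       # millions of dollars if a success is released
PROFIT_BUST = -25.0                         # millions of dollars if a bust is released


def tail_prob(probs, c):
    """P(stars >= c) under the given star distribution."""
    return sum(probs[c - 1:])


def type_one_error(c):
    """P(classify as success | bust) for the rule 'reject H0 if stars >= c'."""
    return tail_prob(P_STARS_BUST, c)


def type_two_error(c):
    """P(classify as bust | success) for the same rule."""
    return 1.0 - tail_prob(P_STARS_SUCCESS, c)


def posterior_success(stars):
    """P(success | observed number of stars), by Bayes' rule."""
    num = PRIOR_SUCCESS * P_STARS_SUCCESS[stars - 1]
    den = num + (1.0 - PRIOR_SUCCESS) * P_STARS_BUST[stars - 1]
    return num / den


def expected_profit_if_released(stars):
    """Expected profit (millions) of releasing a movie with the given rating."""
    p = posterior_success(stars)
    return p * PROFIT_SUCCESS + (1.0 - p) * PROFIT_BUST


if __name__ == "__main__":
    for c in range(1, 5):
        print(c, round(type_one_error(c), 4), round(type_two_error(c), 4))
    print(round(posterior_success(3), 4), round(expected_profit_if_released(3), 4))
```

Under this reading, `type_one_error(c)` and `type_two_error(c)` correspond to parts A.i and A.ii, `tail_prob(P_STARS_BUST, 3)` to the p-value in A.iii, and the last two functions to parts B.i and B.ii. Comparing `expected_profit_if_released(3)` with zero gives the release decision.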