This 7 page Study Guide was uploaded by kimwood Notetaker on Friday November 6, 2015.

Date Created: 11/06/15
Lane Chap 14

2. The formula for a regression equation is Y = 2X + 9.

(a) What would be the predicted score for a person scoring 6 on X?

The predicted score for that person would be Y = 2(6) + 9 = 21.

(b) If someone's predicted score was 14, what was this person's score on X?

In this case Y = 14, so 14 = 2X + 9, which gives X = (14 - 9)/2 = 2.5. Thus the score on X was 2.5.

6. For the X, Y data below, compute:

X     Y
4     6
3     7
5     12
11    17
10    9
14    21

(a) r, and determine if it is significantly different from zero.

The output obtained from Minitab is given below.

Correlation: X, Y
Pearson correlation of X and Y = 0.849
P-Value = 0.032

From this output, the correlation coefficient between X and Y is 0.849, with a corresponding P-value of 0.032. Because the P-value is smaller than the significance level of 0.05, we conclude that the correlation coefficient is significantly different from zero.

(b) the slope of the regression line, and test if it differs significantly from zero.

Using the Data Analysis ToolPak in Excel, the output obtained is given below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.8492
R Square             0.7211
Adjusted R Square    0.6514
Standard Error       3.5028
Observations         6

ANOVA
              df    SS          MS          F          Significance F
Regression    1     126.9207    126.9207    10.3441    0.0324
Residual      4     49.0793     12.2698
Total         5     176

             Coefficients    Standard Error    t Stat    P-value    Lower 95%    Upper 95%
Intercept    3.1231          3.1085            1.0047    0.3719     -5.5075      11.7537
X            1.1332          0.3523            3.2162    0.0324     0.1550       2.1115

From this output, the slope of the regression line is 1.1332, with a corresponding P-value of 0.0324. Because the P-value is smaller than the significance level of 0.05, we reject the null hypothesis that the slope is zero and conclude that the slope differs significantly from zero.

(c) the 95% confidence interval for the slope.

The required 95% confidence interval for the slope is given in the output above: (0.1550, 2.1115).

Lane Chap 17

5. At a school pep rally, a group of sophomore students organized a free raffle for prizes. They claim that they put the names of all of the students in the school in the basket and that they randomly drew 36 names out of this basket. Of the prize winners, 6 were freshmen, 14 were sophomores, 9 were juniors, and 7 were seniors. The results do not seem that random to you. You think it is a little fishy that sophomores organized the raffle and also won the most prizes. Your school is composed of 30% freshmen, 25% sophomores, 25% juniors, and 20% seniors.

(a) What are the expected frequencies of winners from each class?

The expected frequencies of winners from each class are:

Class         Expected Frequency
Freshmen      36(0.30) = 10.80
Sophomores    36(0.25) = 9
Juniors       36(0.25) = 9
Seniors       36(0.20) = 7.20

(b) Conduct a significance test to determine whether the winners of the prizes were distributed throughout the classes as would be expected based on the percentage of students in each group. Report your Chi Square and p values.

The chi-square value can be calculated using the formula

$$\chi^2 = \sum \frac{(\text{Observed Frequency} - \text{Expected Frequency})^2}{\text{Expected Frequency}} = \frac{(6 - 10.80)^2}{10.80} + \frac{(14 - 9)^2}{9} + \frac{(9 - 9)^2}{9} + \frac{(7 - 7.20)^2}{7.20} = 4.917$$

Df = number of groups - 1 = 4 - 1 = 3.

So p-value = $P(\chi^2_3 > 4.917) = 0.1780$.

(c) What do you conclude?

The P-value is 0.1780, which is larger than the significance level of 0.05. Thus we fail to reject the null hypothesis: there is not sufficient evidence to conclude that the result is not random.

14. A geologist collects hand-specimen-sized pieces of limestone from a particular area. A qualitative assessment of both texture and color is made, with the following results. Is there evidence of association between color and texture for these limestones? Explain your answer.

           COLOUR
Texture    Light    Medium    Dark
Fine       4        20        8
Medium     5        23        12
Coarse     21       23        4

Here I'm using Minitab to perform the hypothesis test. The Minitab output obtained is given below.

Chi-Square Test for Association: Worksheet rows, Worksheet columns

Rows: Worksheet rows   Columns: Worksheet columns

       Light    Medium    Dark    All
1      4        20        8       32
       8.00     17.60     6.40
2      5        23        12      40
       10.00    22.00     8.00
3      21       23        4       48
       12.00    26.40     9.60
All    30       66        24      120

Cell Contents: Count
               Expected count

Pearson Chi-Square = 17.727, DF = 4, P-Value = 0.001
Likelihood Ratio Chi-Square = 18.141, DF = 4, P-Value = 0.001

The output shows that the P-value of this test is 0.001, which is smaller than the significance level of 0.05. Thus we reject the null hypothesis of no association and conclude that there is a significant association between color and texture for these limestones at the 0.05 significance level.

Illowsky Chap 11

Decide whether the following statements are true or false.

70. The standard deviation of the chi-square distribution is twice the mean.

For a chi-square distribution, the mean is the degrees of freedom and the variance is 2 times the degrees of freedom. The standard deviation is therefore the square root of twice the mean, not twice the mean, so the statement is false.

Use the following information to answer the next exercise: Suppose an airline claims that its flights are consistently on time with an average delay of at most 15 minutes. It claims that the average delay is so consistent that the variance is no more than 150 minutes. Doubting the consistency part of the claim, a disgruntled traveler calculates the delays for his next 25 flights. The average delay for those 25 flights is 22 minutes with a standard deviation of 15 minutes.

113. df = 24

Illowsky Chap 12

66. Can a coefficient of determination be negative? Why or why not?

The coefficient of determination cannot be negative because it is the ratio of two sums of squares. A sum of squares is a sum of nonnegative terms and so is itself nonnegative; the coefficient of determination is therefore a nonnegative quantity divided by a positive one, and cannot be negative.

The cost of a leading liquid laundry detergent in different sizes is given.

Size (Ounces)    Cost ($)    Cost per ounce
16               3.99
32               4.99
64               5.99
200              10.99

82. (a) Using "size" as the independent variable and "cost" as the dependent variable, draw a scatter plot.

The obtained scatter plot is given below.

[Scatterplot of Cost vs Size (Ounces); x-axis: Size (Ounces), y-axis: Cost]

(b) Does it appear from inspection that there is a relationship between the variables? Why or why not?

From the scatter plot above, we can see a positive association between the two variables, so there is a positive relationship between them.

(c) Calculate the least-squares line. Put the equation in the form of ŷ = a + bx.

The output obtained is given below.

Regression Analysis: Cost versus Size (Ounces)

Analysis of Variance
Source           DF    Adj SS     Adj MS     F-Value    P-Value
Regression       1     28.9163    28.9163    691.36     0.001
Size (Ounces)    1     28.9163    28.9163    691.36     0.001
Error            2     0.0837     0.0418
Total            3     29.0000

Model Summary
S           R-sq      R-sq(adj)    R-sq(pred)
0.204512    99.71%    99.57%       98.23%

Coefficients
Term             Coef       SE Coef    T-Value    P-Value    VIF
Constant         3.598      0.150      23.96      0.002
Size (Ounces)    0.03707    0.00141    26.29      0.001      1.00

Regression Equation
Cost = 3.598 + 0.03707 Size (Ounces)

So the regression equation is Cost = 3.598 + 0.03707 Size (Ounces).

(d) Find the correlation coefficient. Is it significant?

The output in this case is:

Correlation: Size (Ounces), Cost
Pearson correlation of Size (Ounces) and Cost = 0.999
P-Value = 0.001

Because the P-value of 0.001 is smaller than the significance level of 0.05, we can say that the correlation is significant.

(e) If the laundry detergent were sold in a 40-ounce size, find the estimated cost.

Estimated cost = 3.598 + 0.03707(40) = $5.0808

(f) If the laundry detergent were sold in a 90-ounce size, find the estimated cost.

Estimated cost = 3.598 + 0.03707(90) = $6.9343

(g) Does it appear that a line is the best way to fit the data? Why or why not?

The scatter plot shows a clear linear relationship between the variables, so it appears that a line is the best way to fit the data.

(h) Are there any outliers in the given data?

No, there are no outliers in the given data. One value lies far from the others, but because it falls on the regression line it cannot be treated as an outlier.

(i) Is the least-squares line valid for predicting what a 300-ounce size of the laundry detergent would cost you? Why or why not?

The value of 300 ounces is outside the range of sizes used to fit the regression line (16 to 200 ounces), so using this line to predict the cost of a 300-ounce size is not valid.

(j) What is the slope of the least-squares best-fit line? Interpret the slope.

The slope of the best-fit line is 0.03707. This implies that each 1-ounce increase in size leads to a $0.03707 increase in cost.
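
The coefficients and estimated costs in the detergent problem can be cross-checked without Minitab. The following is only a minimal sketch in plain Python, not the method used in this guide (the answers above come from Minitab output). The helper names `least_squares` and `predict` are hypothetical and chosen just for this illustration. The sketch applies the usual least-squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x), to the four size and cost pairs from the table.

```python
# Detergent data from the table: size in ounces, cost in dollars.
sizes = [16, 32, 64, 200]
costs = [3.99, 4.99, 5.99, 10.99]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Estimated cost for a size x, using the fitted line."""
    return intercept + slope * x


a, b = least_squares(sizes, costs)
print("intercept:", round(a, 3))
print("slope:", round(b, 5))

# Parts (e) and (f): sizes inside the observed range of 16 to 200 ounces.
for size in (40, 90):
    print(size, "ounces:", round(predict(a, b, size), 2))
```

Because the sketch keeps the unrounded coefficients, its estimates can differ from the hand calculations in (e) and (f) in the later decimal places; they should agree to the nearest cent. As noted in part (i), the line should not be used for sizes outside the 16 to 200 ounce range, such as 300 ounces.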

