
Statistics: Informed Decisions Using Data | 4th Edition | ISBN: 9780321757272 | Authors: Michael Sullivan, III

Textbook Solutions for Statistics: Informed Decisions Using Data

Chapter 14.3 Problem 13

Question

For the data set

x1    x2      x3     x4    y
43    19.6    7.1    32    200
44    13.1    58.5   37    204
40    24.7    2.1    32    215
35    30.4    41.4   39    229
38    28.2    7.7    30    231
39    24.9    25.0   26    243
39    45.7    28.5   25    266
40    38.4    27.7   24    278
47    36.9    26.2   17    287
35    66.3    4.2    23    298
36    112.8   26.2   21    339
44    108.4   22.3   24    359

(a) Construct a correlation matrix between x1, x2, x3, x4, and y. Is there any evidence that multicollinearity may be a problem?
(b) Determine the multiple regression line using all the explanatory variables listed. Does the F-test indicate that we should reject H0: b1 = b2 = b3 = b4 = 0? Which explanatory variables have slope coefficients that are not significantly different from zero?
(c) Remove the explanatory variable with the highest P-value from the model and recompute the regression model. Does the F-test still indicate that the model is significant? Remove any additional explanatory variables on the basis of the P-value of the slope coefficient. Then compute the model with the variable removed.
(d) Draw residual plots and a boxplot of the residuals to assess the adequacy of the model.
(e) Use the model constructed in part (c) to predict the value of y if x1 = 34, x2 = 35.6, x3 = 12.4, and x4 = 29.
(f) Draw a normal probability plot of the residuals. Is it reasonable to construct confidence and prediction intervals?
(g) Construct 95% confidence and prediction intervals if x1 = 34, x2 = 35.6, x3 = 12.4, and x4 = 29.

Solution

Step 1 of 7)

The first step in solving Chapter 14.3, Problem 13 is to refer back to the textbook question stated above.
The key concepts needed to solve it come from the textbook chapter on multiple regression.
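
As a starting point for parts (a) and (b), the correlation matrix and the full least-squares fit can be sketched in Python. This is only an illustration, not the textbook's or the site's worked solution: it assumes NumPy is available, the variable names (data, corr, coef, r2, f_stat) are made up for this sketch, and the F statistic is computed from R² using the standard formula for a model with k explanatory variables.

```python
import numpy as np

# Data transcribed from the problem statement; columns are x1, x2, x3, x4, y.
data = np.array([
    [43, 19.6, 7.1, 32, 200],
    [44, 13.1, 58.5, 37, 204],
    [40, 24.7, 2.1, 32, 215],
    [35, 30.4, 41.4, 39, 229],
    [38, 28.2, 7.7, 30, 231],
    [39, 24.9, 25.0, 26, 243],
    [39, 45.7, 28.5, 25, 266],
    [40, 38.4, 27.7, 24, 278],
    [47, 36.9, 26.2, 17, 287],
    [35, 66.3, 4.2, 23, 298],
    [36, 112.8, 26.2, 21, 339],
    [44, 108.4, 22.3, 24, 359],
])

# Part (a): 5x5 correlation matrix, variables in column order x1, x2, x3, x4, y.
corr = np.corrcoef(data, rowvar=False)
print(np.round(corr, 3))

# Part (b): least-squares fit of y on x1..x4 with an intercept column.
X = np.column_stack([np.ones(len(data)), data[:, :4]])
y = data[:, 4]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # [b0, b1, b2, b3, b4]
residuals = y - X @ coef

# R^2 and the overall F statistic for H0: b1 = b2 = b3 = b4 = 0.
n, k = len(y), 4
sse = float(residuals @ residuals)
sst = float(((y - y.mean()) ** 2).sum())
r2 = 1 - sse / sst
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

print("coefficients:", np.round(coef, 4))
print("R^2:", round(r2, 4), "F:", round(f_stat, 2))
```

The sketch stops short of P-values for the individual slopes; those require the coefficient standard errors and a t distribution, which statistical software reports directly. The residuals array is the input for the plots asked for in parts (d) and (f).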

Chapter 14.3 textbook questions

Chapter 14: Problem 1 Statistics: Informed Decisions Using Data 4
A ________ shows the linear correlation between each pair of variables under consideration in a multiple regression model.
Chapter 14: Problem 2 Statistics: Informed Decisions Using Data 4
If the correlation between two explanatory variables is high, the least-squares regression model may suffer from ________.
Chapter 14: Problem 3 Statistics: Informed Decisions Using Data 4
Suppose a multiple regression model is given by ŷ = 4.39x1 - 8.75x2 + 34.09. An interpretation of the coefficient of x1 would be, if x1 increases by 1 unit, then the response variable will increase by ________ units, on average, while holding x2 constant.
Chapter 14: Problem 4 Statistics: Informed Decisions Using Data 4
If there is ________ between x1 and x2 in a least-squares regression model, we would build a model of the form yi = b0 + b1 x1i + b2 x2i + b3 x1i x2i + ei.
Chapter 14: Problem 5 Statistics: Informed Decisions Using Data 4
A(n) ________ or ________ variable is a qualitative explanatory variable in a multiple regression model that takes on the value 0 or 1.
Chapter 14: Problem 6 Statistics: Informed Decisions Using Data 4
True or False: The value of R² never decreases as more explanatory variables are added to a regression model.
Chapter 14: Problem 7 Statistics: Informed Decisions Using Data 4
You obtain the multiple regression equation ŷ = 5 + 3x1 - 4x2 from a set of sample data. (a) Interpret the slope coefficients for x1 and x2. (b) Determine the regression equation with x1 = 10. Graph the regression equation with x1 = 10. (c) Determine the regression equation with x1 = 15. Graph the regression equation with x1 = 15. (d) Determine the regression equation with x1 = 20. Graph the regression equation with x1 = 20. (e) What is the effect of changing the value x1 on the graph of the regression equation?
Chapter 14: Problem 8 Statistics: Informed Decisions Using Data 4
You obtain the multiple regression equation ŷ = -5 - 9x1 + 4x2 from a set of sample data. (a) Interpret the slope coefficients for x1 and x2. (b) Determine the regression equation with x1 = 10. Graph the regression equation with x1 = 10. (c) Determine the regression equation with x1 = 15. Graph the regression equation with x1 = 15. (d) Determine the regression equation with x1 = 20. Graph the regression equation with x1 = 20. (e) What is the effect of changing the value x1 on the graph of the regression equation?
Chapter 14: Problem 9 Statistics: Informed Decisions Using Data 4
A multiple regression model has k = 3 explanatory variables. The coefficient of determination, R², is found to be 0.653 based on a sample of n = 25 observations. (a) Compute the adjusted R². (b) Compute the F-test statistic. (c) If one additional explanatory variable is added to the model and R² increases to 0.665, compute the adjusted R². Would you recommend adding the additional explanatory variable to the model? Why or why not?
Chapter 14: Problem 10 Statistics: Informed Decisions Using Data 4
A multiple regression model has k = 4 explanatory variables. The coefficient of determination, R², is found to be 0.542 based on a sample of n = 40 observations. (a) Compute the adjusted R². (b) Compute the F-test statistic. (c) If one additional explanatory variable is added to the model and R² increases to 0.579, compute the adjusted R². Would you recommend adding the additional explanatory variable to the model? Why or why not?
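Problems 9 and 10 both compute the adjusted R² and the F-test statistic from R², n, and k. A minimal sketch of that arithmetic follows, assuming the standard definitions: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1) and F = (R²/k) / ((1 - R²)/(n - k - 1)). The function names are hypothetical and chosen only for this illustration.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k explanatory variables and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic for H0: all k slope coefficients equal zero."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Values taken from Problem 9: k = 3, R^2 = 0.653, n = 25,
# then one more variable raises R^2 to 0.665.
before = adjusted_r2(0.653, n=25, k=3)
after = adjusted_r2(0.665, n=25, k=4)
print("adjusted R^2 with 3 variables:", round(before, 4))
print("adjusted R^2 with 4 variables:", round(after, 4))
print("F statistic with 3 variables:", round(f_statistic(0.653, n=25, k=3), 2))
```

Comparing the two adjusted values is the basis for the recommendation asked for in part (c): an added variable is only worth keeping if the adjusted R² goes up, not merely R².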
Chapter 14: Problem 11 Statistics: Informed Decisions Using Data 4
For the data set
x1     x2    x3    y
0.8    2.8   2.5   11.0
3.9    2.6   5.7   10.8
1.8    2.4   7.8   10.6
5.1    2.3   7.1   10.3
4.9    2.5   5.9   10.3
8.4    2.1   8.6   10.3
12.9   2.3   9.2   10.0
6.0    2.0   1.2   9.4
14.6   2.2   3.7   8.7
9.3    1.1   5.5   8.7
(a) Construct a correlation matrix between x1, x2, x3, and y. Is there any evidence that multicollinearity exists? Why? (b) Determine the multiple regression line with x1, x2, and x3 as the explanatory variables. (c) Assuming that the requirements of the model are satisfied, test H0: b1 = b2 = b3 = 0 versus H1: at least one of the bi is different from zero at the α = 0.05 level of significance. (d) Assuming that the requirements of the model are satisfied, test H0: bi = 0 versus H1: bi ≠ 0 for i = 1, 2, 3 at the α = 0.05 level of significance.
Chapter 14: Problem 12 Statistics: Informed Decisions Using Data 4
For the data set
x1     x2     x3     y
24.9   13.5   3.7    59.8
26.7   15.7   11.4   66.3
30.6   13.8   15.7   76.5
39.6   8.8    8.8    77.1
33.1   10.6   18.3   81.9
41.1   9.7    21.8   84.6
25.4   9.8    16.4   87.3
33.8   6.8    25.9   88.5
23.5   7.5    15.5   90.7
39.8   6.8    30.8   93.4
(a) Construct a correlation matrix between x1, x2, x3, and y. Is there any evidence that multicollinearity exists? Why? (b) Determine the multiple regression line with x1, x2, and x3 as the explanatory variables. (c) Assuming that the requirements of the model are satisfied, test H0: b1 = b2 = b3 = 0 versus H1: at least one of the bi is different from zero at the α = 0.05 level of significance. (d) Assuming that the requirements of the model are satisfied, test H0: bi = 0 versus H1: bi ≠ 0 for i = 1, 2, 3 at the α = 0.05 level of significance. Should a variable be removed from the model? Why? (e) Remove the variable identified in part (d) and recompute the regression model. Test whether at least one regression coefficient is different from zero. Then test whether each individual regression coefficient is significantly different from zero.
Chapter 14: Problem 14 Statistics: Informed Decisions Using Data 4
For the data set
x1    x2      x3     x4    y
43    19.6    7.1    32    200
44    13.1    58.5   37    204
40    24.7    2.1    32    215
35    30.4    41.4   39    229
38    28.2    7.7    30    231
39    24.9    25.0   26    243
39    45.7    28.5   25    266
40    38.4    27.7   24    278
47    36.9    26.2   17    287
35    66.3    4.2    23    298
36    112.8   26.2   21    339
44    108.4   22.3   24    359
(a) Construct a correlation matrix between x1, x2, x3, x4, and y. Is there any evidence that multicollinearity may be a problem? (b) Determine the multiple regression line using all the explanatory variables listed. Does the F-test indicate that we should reject H0: b1 = b2 = b3 = b4 = 0? Which explanatory variables have slope coefficients that are not significantly different from zero? (c) Remove the explanatory variable with the highest P-value from the model and recompute the regression model. Does the F-test still indicate that the model is significant? Remove any additional explanatory variables on the basis of the P-value of the slope coefficient. Then compute the model with the variable removed. (d) Draw residual plots and a boxplot of the residuals to assess the adequacy of the model. (e) Use the final model constructed in part (c) to predict the value of y if x1 = 44.3, x2 = 1.1, x3 = 7, and x4 = 69. (f) Draw a normal probability plot of the residuals. Is it reasonable to construct confidence and prediction intervals? (g) Construct 95% confidence and prediction intervals if x1 = 44.3, x2 = 1.1, x3 = 7, and x4 = 69.
Chapter 14: Problem 15 Statistics: Informed Decisions Using Data 4
Suppose we wish to develop a model with three explanatory variables, x1, x2, and x3. (a) Write a model that utilizes all three explanatory variables with no interaction or quadratic terms. (b) Write a model that utilizes the explanatory variables x1 and x2 along with interaction between x1 and x2. (c) Write a model that utilizes all three explanatory variables, interaction between x2 and x3, and a quadratic term involving x3.
Chapter 14: Problem 16 Statistics: Informed Decisions Using Data 4
Suppose you want to develop a model that predicts the gas mileage of a car. The explanatory variables you are going to utilize are x1: city or highway driving; x2: weight of the car; x3: tire pressure. (a) Write a model that utilizes all three explanatory variables in an additive model with linear terms and define any indicator variables. (b) Suppose you suspect there is interaction between weight and tire pressure. Write a model that incorporates this interaction term into the model from part (a).
Chapter 14: Problem 17 Statistics: Informed Decisions Using Data 4
Suppose that the response variable y is related to the explanatory variables x1 and x2 by the regression equation ŷ = 4 + 0.4x1 - 1.3x2. (a) Construct a graph similar to Figure 17 showing the relationship between the expected value of y and x1 for x2 = 10, 20, and 30. (b) Construct a graph showing the relationship between the expected value of y and x2 for x1 = 40, 50, and 60. (c) How can we tell from the graphs alone that there is no interaction between x1 and x2? (d) Redo parts (a) and (b) with the interaction term 0.05x1x2 added to the regression equation. How do the graphs differ?
Chapter 14: Problem 18 Statistics: Informed Decisions Using Data 4
Suppose that the response variable y is related to the explanatory variables x1 and x2 by the regression equation ŷ = 6 - 0.3x1 + 1.7x2. (a) Construct a graph similar to Figure 17 showing the relationship between the expected value of y and x1 for x2 = 10, 20, and 30. (b) Construct a graph showing the relationship between the expected value of y and x2 for x1 = 40, 50, and 60. (c) How can we tell from the graphs alone that there is no interaction between x1 and x2? (d) Redo parts (a) and (b) with the interaction term 0.04x1x2 added to the regression equation. How do the graphs differ?
Chapter 14: Problem 19 Statistics: Informed Decisions Using Data 4
Resisting the Computer. Researchers Alfred P. Rovai and Marcus D. Childress asked the following question: "How can resistance to reduction of computer anxiety among teacher education students be explained and predicted?" (Journal of Research on Technology in Education, 35(2)). To answer this research question, they identified 86 undergraduate teacher education students enrolled in a computer literacy course and administered a series of questionnaires that quantified various predictors of the response variable, computer anxiety, y. For example, computer anxiety was measured by administering each student the Computer Anxiety Scale. This score ranges from 20 to 100, with higher scores indicating higher levels of computer anxiety. The explanatory variables were:
x1: Computer confidence on a scale from 10 to 40, with higher scores indicating higher confidence
x2: Computer knowledge on a scale from 0 to 33, with 0 indicating no computer knowledge and 33 indicating superior computer knowledge
x3: Computer liking on a scale from 10 to 40, with higher scores indicating a greater like of computers
x4: Trait anxiety on a scale from 20 to 80, with higher scores indicating a higher level of overall anxiety
The multiple regression model was ŷ = 84.04 - 0.87x1 - 0.51x2 - 0.45x3 + 0.33x4. (a) The reported P-value of the regression model was less than 0.0001. Would you reject the null hypothesis H0: b1 = b2 = b3 = b4 = 0? (b) Interpret the slope coefficients of the model in part (a). Are they all reasonable? (c) Predict the computer anxiety score of an individual whose computer confidence score was 25, computer knowledge score was 19, computer liking score was 20, and trait anxiety score was 43. (d) The coefficient of determination for this model is 0.69. Interpret this value. (e) The article states that regression assumptions were tested and found to be tenable. Explain what this means.
Chapter 14: Problem 20 Statistics: Informed Decisions Using Data 4
Pistol Shooting. Researchers at Victoria University wanted to determine the factors that affect precision in shooting air pistols ("Inter- and Intra-Individual Analysis in Elite Sport: Pistol Shooting," Journal of Applied Biomechanics, 28-38, 2003). The explanatory variables were
x1: Percent of the time the shooter's aim was on target (a measure of accuracy)
x2: Percent of the time the shooter's aim was within a certain region (a measure of consistency or steadiness)
x3: Distance (mm) the barrel of the pistol moves horizontally while aiming
x4: Distance (mm) the barrel of the pistol moves vertically while aiming
(a) One response variable in the study was the score that the individual received on the shot, with a higher score indicating a better shooter. The regression model presented was ŷ = 10.6 + 0.02x1 - 0.03x3. The reported P-value of the regression model was 0.05. Would you reject the null hypothesis H0: b1 = b3 = 0? (b) Interpret the slope coefficients of the model in part (a). (c) Predict the score of an individual whose aim was on target x1 = 20% of the time with a distance the pistol barrel moves horizontally of x3 = 12 mm using the model from part (a). (d) A second response variable in the study was the vertical distance that the bullet hole was from the target. The regression model for this response variable was ŷ = -24.6 - 0.13x1 + 0.21x2 + 0.13x3 + 0.22x4. The reported P-value of the regression model was 0.04. Would you reject the null hypothesis H0: b1 = b2 = b3 = b4 = 0? (e) Interpret the slope coefficients of the model in part (d). (f) Based on your answer to part (e), do you think that the model is useful in predicting vertical distance from the target? Why?
Chapter 14: Problem 21 Statistics: Informed Decisions Using Data 4
Life Cycle Hypothesis. In the 1950s, Franco Modigliani developed the Life Cycle Hypothesis. One tenet of this hypothesis is that income varies with age. The regression equation ŷ = -55,961.675 + 4314.374x - 45.66x² describes the median income, y, at different ages, x, for U.S. residents in 2006. (a) Graph the equation for values of x between 20 and 70 by evaluating the equation at x = 20, 30, 40, 50, 60, and 70, plotting the points, and connecting the points in a smooth curve. (b) Is the median income of a U.S. resident higher when an individual is 40 years old or 50 years old? (c) How much does median income change from 50 to 60 years of age?
Chapter 14: Problem 22 Statistics: Informed Decisions Using Data 4
Divorce Rates. The given data represent the percentage, y, of the population that is divorced for various ages, x, in the United States in 2007 based on sample data obtained from the United States Statistical Abstract in 2009.
Age, a    Percentage Divorced, D
22        0.8
27        2.8
32        6.4
37        8.7
42        12.3
50        14.5
60        13.8
70        9.6
80        4.9
Source: United States Statistical Abstract, 2009
(a) Draw a scatter diagram treating age as the explanatory variable and percentage divorced as the response variable. Comment on the shape of the scatter diagram. (b) A regression equation that describes the relation between age and percentage divorced is ŷ = -26.3412 + 1.4794x - 0.0136x². Use this equation to predict the percentage of 40-year-olds that were divorced in 2007. (c) Can we interpret the coefficients of x or x² as we did for additive linear models?
Chapter 14: Problem 23 Statistics: Informed Decisions Using Data 4
Estimating Age. In the European Union, it has become important to be able to determine an individual's age when legal documentation of the birth date of an individual is unavailable. In the article "Age Estimation in Children by Measurement of Open Apices in Teeth: a European Formula" (International Journal of Legal Medicine [2007] 121: 449-453), researchers developed a model to predict the age, y, of an individual based on the gender of the individual, x1 (0 = female, 1 = male), the height of the second premolar, x2, the number of teeth with root development, x3, and the sum of the normalized heights of seven teeth on the left side of the mouth, x4. The normalized height of the seven teeth was found by dividing the distance between teeth by the height of the tooth. Their model is ŷ = 9.063 + 0.386x1 + 1.268x2 + 0.676x3 - 0.913x4 - 0.175x3x4. (a) Based on this model, what is the expected age of a female with x2 = 28 mm, x3 = 8, and x4 = 18 mm? (b) Based on this model, what is the expected age of a male with x2 = 28 mm, x3 = 8, and x4 = 18 mm? (c) What is the interaction term? What variables interact? (d) The coefficient of determination for this model is 86.3%. Explain what this means.
Chapter 14: Problem 24 Statistics: Informed Decisions Using Data 4
More Age Estimation. In the article "Bigger Teeth for Longer Life? Longevity and Molar Height in Two Roe Deer Populations" (Biology Letters [June, 2007] vol. 3 no. 3 268-270), researchers developed a model to predict the tooth height (in mm), y, of roe deer based on their age, x1, gender, x2 (0 = female, 1 = male), and location, x3 (Trois Fontaines deer, which have a shorter life expectancy, and Chizé, which have a longer life expectancy; x3 = 0 for Trois Fontaines, x3 = 1 for Chizé). The model is ŷ = 7.790 - 0.382x1 - 0.587x2 - 0.925x3 + 0.091x2x3. (a) What is the expected tooth length of a female roe deer who is 12 years old and lives in Trois Fontaines? (b) What is the expected tooth length of a male roe deer who is 8 years old and lives in Chizé? (c) What is the interaction term? What does the coefficient of the interaction term imply about tooth length?
Chapter 14: Problem 25 Statistics: Informed Decisions Using Data 4
Wind Chill Temperature. A researcher wanted to determine if there was a linear relation among wind chill temperature, air temperature, and wind speed. The following data show wind chill temperature, air temperature (in degrees Fahrenheit), and wind speed (in miles per hour) for various days.
Air Temp.   Wind Speed   Wind Chill
15          10           3
15          15           0
15          25           -4
0           5            -11
0           20           -22
-5          10           -22
-5          25           -31
-10         15           -32
-10         20           -35
-15         25           -44
-15         35           -48
-15         50           -52
5           40           -22
10          45           -16
(a) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2, where x1 is air temperature, x2 is wind speed, and y is the response variable wind chill. (b) Draw residual plots to assess the adequacy of the model. What might you conclude based on the plot of residuals against wind speed?
Chapter 14: Problem 26 Statistics: Informed Decisions Using Data 4
Heat Index. A researcher wanted to determine whether there was a linear relation among heat index, air temperature, and dew point. The following data show the heat index, air temperature (in degrees Fahrenheit), and dew point for various days.
Air Temp.   Dew Point   Heat Index
90          64          93
90          68          95
94          66          99
94          70          102
96          70          105
96          76          111
99          68          107
99          72          111
100         74          114
100         80          123
93          72          103
93          78          109
97          80          118
92          82          114
95          66          100
95          82          118
(a) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2, where x1 is air temperature, x2 is dew point, and y is the response variable, heat index. (b) Draw residual plots to assess the adequacy of the model. What might you conclude based on the residual plots?
Chapter 14: Problem 27 Statistics: Informed Decisions Using Data 4
Concrete. A researcher wants to determine a model that can be used to predict the 28-day strength of a concrete mixture. The following data represent the 28-day and 7-day strength (in pounds per square inch) of a certain type of concrete along with the concrete's slump. Slump is a measure of the uniformity of the concrete, with a higher slump indicating a less uniform mixture.
Slump (inches)   7-Day psi   28-Day psi
4.5              2330        4025
4.25             2640        4535
3                3360        4985
4                1770        3890
3.75             2590        3810
2.5              3080        4685
4                2050        3765
5                2220        3350
4.5              2240        3610
5                2510        3875
2.5              2250        4475
(a) Construct a correlation matrix between slump, 7-day psi, and 28-day psi. Is there any reason to be concerned with multicollinearity based on the correlation matrix? (b) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2, where x1 is slump, x2 is 7-day strength, and y is the response variable, 28-day strength. (c) Draw residual plots and a boxplot of the residuals to assess the adequacy of the model. (d) Interpret the regression coefficients for the least-squares regression equation. (e) Determine and interpret R² and the adjusted R². (f) Test H0: b1 = b2 = 0 versus H1: at least one of the bi ≠ 0 at the α = 0.05 level of significance. (g) Test the hypotheses H0: b1 = 0 versus H1: b1 ≠ 0 and H0: b2 = 0 versus H1: b2 ≠ 0 at the α = 0.05 level of significance. (h) Predict the mean 28-day strength of all concrete for which slump is 3.5 inches and 7-day strength is 2450 psi. (i) Predict the 28-day strength of a specific sample of concrete for which slump is 3.5 inches and 7-day strength is 2450 psi. (j) Construct 95% confidence and prediction intervals for concrete for which slump is 3.5 inches and 7-day strength is 2450 psi. Interpret the results.
Chapter 14: Problem 28 Statistics: Informed Decisions Using Data 4
Income. An economist was interested in modeling the relation among annual income, level of education, and work experience. The level of education is the number of years of education beyond eighth grade, so 1 represents completing 1 year of high school, 8 means completing 4 years of college, and so on. Work experience is the number of years employed in the current profession. From a random sample of 12 individuals, he obtained the following data: (g) Test the hypotheses H0: b1 = 0 versus H1: b1 ≠ 0 and H0: b2 = 0 versus H1: b2 ≠ 0 at the α = 0.05 level of significance. (h) Predict the mean income of all individuals whose experience is 12 years and level of education is 4. (i) Predict the income of a single individual whose experience is 12 years and level of education is 4. (j) Construct 95% confidence and prediction intervals for income when experience is 12 years and level of education is 4.
Chapter 14: Problem 29 Statistics: Informed Decisions Using Data 4
Housing Prices. A realtor wanted to find a model that relates the asking price of a house to the square footage, number of bedrooms, and number of baths. The following data are from houses in Greenville, South Carolina.
Square Footage   Bedrooms   Baths   Asking Price ($ thousands)
3800             4          3.5     498
2600             4          3       449
2600             5          3.5     435
2250             4          4       400
3300             4          3       379
2750             3          2.5     375
2200             3          2.5     356
3000             4          2.5     350
2300             3          2       340
2600             4          2.5     332
2300             4          2       298
2000             4          3       280
2200             3          2.5     260
Source: remax.com
(a) Construct the correlation matrix. Is there any reason to be concerned with multicollinearity? (b) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2 + b3x3, where x1 is square footage, x2 is number of bedrooms, x3 is number of baths, and y is the response variable asking price. (c) Test H0: b1 = b2 = b3 = 0 versus H1: at least one of the bi ≠ 0 at the α = 0.05 level of significance. (d) Test the hypotheses H0: b1 = 0 versus H1: b1 ≠ 0, H0: b2 = 0 versus H1: b2 ≠ 0, and H0: b3 = 0 versus H1: b3 ≠ 0 at the α = 0.05 level of significance. (e) Remove the explanatory variable with the highest P-value and compute the least-squares regression equation. Are all the slope coefficients significantly different from zero? If not, remove the explanatory variable with the higher P-value and compute the least-squares regression equation. (f) Draw residual plots, a boxplot of the residuals, and a normal probability plot of the residuals to assess the adequacy of the model found in part (e). (g) Interpret the regression coefficients for the least-squares regression equation found in part (e). (h) Construct 95% confidence and prediction intervals for the asking price of a 2900-square-foot house in Greenville, South Carolina, with 4 bedrooms and 3 baths. Interpret the results.
Chapter 14: Problem 30 Statistics: Informed Decisions Using Data 4
Head Circumference. A pediatrician wants to determine the relation that may exist between a child's head circumference (in centimeters), height (in inches), and weight (in ounces). She randomly selects 14 three-year-old children from her practice and obtains the following data:
Height   Weight   Head Circumference
30       339      47
26.25    267      42
25       289      43
27       332      44.5
27.5     272      44
24.5     214      40.5
27.75    311      44
25       259      41.5
28       298      46
27.25    288      44
26       277      44
27.25    292      44.5
27       302      42.5
28.25    336      44.5
Source: Denise Slucki, student at Joliet Junior College
(a) Construct a correlation matrix. Is there any reason to be concerned with multicollinearity? (b) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2, where x1 is height, x2 is weight, and y is the response variable, head circumference. (c) Test H0: b1 = b2 = 0 versus H1: at least one of the bi ≠ 0 at the α = 0.05 level of significance. (d) Test the hypotheses H0: b1 = 0 versus H1: b1 ≠ 0 and H0: b2 = 0 versus H1: b2 ≠ 0 at the α = 0.05 level of significance. (e) Compute the regression line after removing any explanatory variable that is not significant from the regression model. (f) Draw residual plots, a boxplot of the residuals, and a normal probability plot of the residuals to assess the adequacy of the model found in part (e). (g) Interpret the regression coefficients for the least-squares regression equation found in part (e). (h) Determine and interpret R² and the adjusted R². (i) Construct 95% confidence and prediction intervals for the head circumference of a child whose height is 27.5 inches and whose weight is 285 ounces. Interpret the results.
Chapter 14: Problem 31 Statistics: Informed Decisions Using Data 4
Gas Mileage. A researcher is interested in developing a model that describes the gas mileage, measured in miles per (a) Construct a correlation matrix. Is there any reason to be concerned about multicollinearity? (b) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2 + b3x3, where x1 is engine size, x2 is curb weight, x3 is horsepower, and y is the response variable, miles per gallon. (c) Test H0: b1 = b2 = b3 = 0 versus H1: at least one of the bi ≠ 0 at the α = 0.05 level of significance. (d) Test the hypotheses H0: bi = 0 versus H1: bi ≠ 0 for i = 1, 2, 3 at the α = 0.05 level of significance. Should any of the explanatory variables be removed from the model? If so, which one? (e) Determine the regression model with the explanatory variable identified in part (d) removed. Are both slope coefficients significantly different from zero? If not, remove the appropriate explanatory variable and compute the least-squares regression equation. (f) Draw residual plots, a boxplot of the residuals, and a normal probability plot of the residuals to assess the adequacy of the model found in part (e). (g) Interpret the regression coefficients for the least-squares regression equation found in part (e). (h) Construct 95% confidence and prediction intervals for the gas mileage of an automobile that weighs 3100 pounds and has a 2.5-liter engine and 200 horsepower. Interpret the results.
Chapter 14: Problem 32 Statistics: Informed Decisions Using Data 4
Suppose we wish to develop a regression equation that models the selling price of a home. The researcher wishes to include the variable garage in the model. She has identified three possibilities for a garage: (1) attached, (2) detached, (3) no garage. Define the indicator variables necessary to incorporate the variable garage into the model.
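A categorical variable with three levels needs two indicator variables, with the third level serving as the baseline. The sketch below shows one possible encoding. Treating "no garage" as the baseline and the names `GARAGE_LEVELS` and `garage_indicators` are assumptions made for illustration; any of the three levels could serve as the baseline.

```python
# Hypothetical encoding: "none" (no garage) is assumed to be the baseline level.
GARAGE_LEVELS = ("attached", "detached", "none")


def garage_indicators(garage):
    """Map a garage category to the pair (x_attached, x_detached).

    Both indicators equal 0 for the baseline level, "none".
    """
    if garage not in GARAGE_LEVELS:
        raise ValueError(f"unknown garage category: {garage!r}")
    x_attached = 1 if garage == "attached" else 0
    x_detached = 1 if garage == "detached" else 0
    return x_attached, x_detached


for level in GARAGE_LEVELS:
    print(level, garage_indicators(level))
```

Using three indicators instead of two would make the columns sum to the intercept column, so the regression coefficients could not be determined uniquely.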
Chapter 14: Problem 33 Statistics: Informed Decisions Using Data 4
Does Size Matter? Researchers wondered whether the size of a person's brain was related to the individual's mental capacity. They selected a sample of right-handed Anglo introductory psychology students who had Scholastic Aptitude Test scores higher than 1350. The subjects were administered the Wechsler Adult Intelligence Scale-Revised to obtain their IQ scores. The MRI scans, performed at the same facility, consisted of 18 horizontal MR images. The computer counted all pixels with nonzero gray scale in each of the 18 images, and the total count served as an index for brain size. The resulting data are presented in the following table:

Gender   MRI Count   IQ
Female   816,932     133
Female   951,545     137
Female   991,305     138
Female   833,868     132
Female   856,472     140
Female   852,244     132
Female   790,619     135
Female   866,662     130
Female   857,782     133
Female   948,066     133
Male     949,395     140
Male     1,001,121   140
Male     1,038,437   139
Male     965,353     133
Male     955,466     133
Male     1,079,549   141
Male     924,059     135
Male     955,003     139
Male     935,494     141
Male     949,589     144

Source: L. Willerman, R. Schultz, I. N. Rutledge, and E. Bigler. "In Vivo Brain Size and Intelligence." Intelligence, 15 (223-228), 1991.

(a) Find the least-squares regression equation ŷ = b0 + b1x1, where x1 is MRI count and y is the response variable IQ.
(b) Test the hypotheses H0: β1 = 0 versus H1: β1 ≠ 0. What do you conclude?
(c) Draw a scatter diagram, treating MRI count as the explanatory variable and IQ as the response variable, but use a different plotting symbol for males and females. For example, use a circle for males and a square for females.
(d) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2, where x1 is MRI count and x2 = 0 for males and x2 = 1 for females.
(e) Test the hypotheses H0: β1 = 0 versus H1: β1 ≠ 0 and H0: β2 = 0 versus H1: β2 ≠ 0.
(f) What do you conclude from this analysis?
Chapter 14: Problem 34 Statistics: Informed Decisions Using Data 4
Drill Time The following data represent the time (minutes) it takes to drill an additional 5 feet from the depth indicated for both wet drilling and dry drilling conditions.
(a) Draw a scatter diagram, treating depth as the explanatory variable and time as the response variable, but use a different plotting symbol for wet and dry. For example, use a circle for wet and a square for dry.
(b) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2, where x1 is depth and x2 = 0 for wet and x2 = 1 for dry.
(c) Test the hypotheses H0: β1 = 0 versus H1: β1 ≠ 0 and H0: β2 = 0 versus H1: β2 ≠ 0.
(d) Construct 95% confidence and prediction intervals for time to drill an additional 5 feet in dry conditions where drilling starts at 100 feet. Interpret the results.
Chapter 14: Problem 35 Statistics: Informed Decisions Using Data 4
Putting It Together: Purchasing Diamonds The value of a diamond is determined by the 4 Cs: carat weight, color, clarity, and cut. Carat weight is the standard measure for the size of a diamond. Generally, the more a diamond weighs, the more valuable it will be. The Gemological Institute of America (GIA) determines the color of diamonds using a 22-grade scale from D (almost clear white) to Z (light yellow). Colorless diamonds are generally considered the most desirable. Diamonds also exist in other colors such as blue, red, and green, but these "fancy" colors will not be considered here. The clarity of a diamond refers to how free the diamond is of imperfections. The GIA determines the clarity of diamonds using an 11-grade scale: flawless (FL), internally flawless (IF), very very slightly imperfect (VVS1, VVS2), very slightly imperfect (VS1, VS2), slightly imperfect (SI1, SI2), and imperfect (I1, I2, I3). The cut of a diamond refers to the diamond's proportions and finish. Put simply, the better the diamond's cut is, the better it reflects and refracts light, which makes it more beautiful and thus more valuable. The cut of a diamond is rated using a 5-grade scale: Excellent, Very Good, Good, Fair, and Poor. Finally, the shape of a diamond (which is not one of the 4 Cs) refers to its basic form: round, oval, pear-shaped, marquis, etc. A novice might confuse shape with cut, so be careful not to confuse the two. The given data provide the 4 Cs and the retail price for a random sample of 40 unmounted, round-shaped diamonds. Use the data to answer the questions that follow: (a) Determine the level of measurement for each variable.
Chapter 14: Problem 36 Statistics: Informed Decisions Using Data 4
When testing whether or not there is a linear relation between the response variable and the explanatory variables, we use an F-test. If the P-value indicates that we reject the null hypothesis, H0: β1 = β2 = ... = βk = 0, what conclusion should we come to? Is it possible that one of the βi is zero if we reject the null hypothesis?
Chapter 14: Problem 37 Statistics: Informed Decisions Using Data 4
What does it mean when we say that the explanatory variables have an additive effect or do not interact?
Chapter 14: Problem 38 Statistics: Informed Decisions Using Data 4
Explain the difference between the coefficient of determination, R², and the adjusted coefficient of determination, R²_adj. Which is better for determining whether an additional explanatory variable should be added to the regression model?
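The distinction can be illustrated numerically. The sketch below uses the standard formula R²_adj = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k is the number of explanatory variables. The function name `adjusted_r_squared` and the R² values in the usage lines are hypothetical, chosen only to show the penalty at work.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R² for a model with k explanatory variables fit to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical values: a third explanatory variable raises R² only slightly.
two_variable_model = adjusted_r_squared(0.800, n=14, k=2)
three_variable_model = adjusted_r_squared(0.801, n=14, k=3)
print(round(two_variable_model, 4), round(three_variable_model, 4))
```

R² never decreases when an explanatory variable is added, whereas the adjusted value falls when the gain in R² is too small to offset the lost degree of freedom, as it does in the hypothetical comparison above.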
Chapter 14: Problem 39 Statistics: Informed Decisions Using Data 4
What is multicollinearity? How can we check for it? What are the consequences of multicollinearity?
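Besides inspecting the correlation matrix, one widely used check is the variance inflation factor, VIF_j = 1/(1 − R_j²), where R_j² comes from regressing explanatory variable j on the other explanatory variables. The VIF is not mentioned in the problem text, so the sketch below is offered as a supplementary diagnostic, and the function names are illustrative.

```python
def vif_from_r_squared(r_squared_j):
    """Variance inflation factor given R² of x_j regressed on the other x's."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("R² must satisfy 0 <= R² < 1")
    return 1 / (1 - r_squared_j)


def vif_two_predictors(r):
    """With exactly two explanatory variables, R_j² is the squared correlation r²."""
    return vif_from_r_squared(r ** 2)


# Hypothetical correlations between two explanatory variables.
for r in (0.0, 0.5, 0.9, 0.99):
    print(r, round(vif_two_predictors(r), 2))
```

A VIF of 1 means the variable is uncorrelated with the other explanatory variables. Values above roughly 10 are often treated as a warning sign, which is a common rule of thumb and not a threshold taken from the textbook.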