Residuals Suppose you have fit a linear model to some data and now take a look at the residuals. For each of the following possible residuals plots, tell whether you would try a re-expression and, if so, why.
Textbook Solutions for Stats: Data and Models
Question
The value of a log is based on the number of board feet of lumber the log may contain. (A board foot is the equivalent of a piece of wood 1 inch thick, 12 inches wide, and 1 foot long. For example, a \(2^{\prime \prime} \times 4^{\prime \prime}\) piece that is 12 feet long contains 8 board feet.) To estimate the amount of lumber in a log, buyers measure the diameter inside the bark at the smaller end. Then they look at a table based on the Doyle Log Scale. The table below shows the estimates for logs 16 feet long.
\(\begin{array}{|l|rrrrrr|} \hline \text { Diameter of Log } & 8^{\prime \prime} & 12^{\prime \prime} & 16^{\prime \prime} & 20^{\prime \prime} & 24^{\prime \prime} & 28^{\prime \prime} \\ \text { Board Feet } & 16 & 64 & 144 & 256 & 400 & 576 \\ \hline \end{array}\)
a) What model does this scale use?
b) How much lumber would you estimate that a log 10 inches in diameter contains?
c) What does this model suggest about logs 36 inches in diameter?
Solution
The first step in solving Chapter 9, Problem 36 is to refer back to the textbook question stated above.
From the textbook chapter Re-expressing Data: Get It Straight! you will find a few key concepts needed to solve this.
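One way to explore the table, offered as a sketch and not as the textbook's own solution, is to re-express Board Feet with a square root and compare the result with Diameter. The function name `doyle_16ft` is a hypothetical label introduced here; the candidate formula is only what the six tabulated values suggest.

```python
import math

# Doyle Log Scale estimates for 16-foot logs, copied from the table in the question.
diameters = [8, 12, 16, 20, 24, 28]        # inches
board_feet = [16, 64, 144, 256, 400, 576]

# Re-express the response: the square roots rise in equal steps with diameter.
roots = [math.sqrt(b) for b in board_feet]

# Each square root sits a constant amount below the diameter.
offsets = [d - r for d, r in zip(diameters, roots)]


def doyle_16ft(diameter):
    """Candidate model suggested by the table: board feet = (diameter - 4)^2."""
    return (diameter - 4) ** 2


print(roots)            # square roots of the tabulated board feet
print(offsets)          # diameter minus square root, one value per log
print(doyle_16ft(10))   # estimate for a 10-inch log
print(doyle_16ft(36))   # extrapolation to a 36-inch log, outside the table's range
```

The 36-inch value lies beyond the largest tabulated diameter, so it should be read as an extrapolation of the pattern, not a tabulated estimate.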
Chapter 9 textbook questions
Chapter 9: Problem 1 Stats: Data and Models 4
Problem 1RE College Every year, US News and World Report publishes a special issue on many U.S. colleges and universities. The scatterplots below have Student/Faculty Ratio (number of students per faculty member) for the colleges and universities on the y-axes plotted against 4 other variables. The correct correlations for these scatterplots appear in this list. Match them.
Chapter 9: Problem 7 Stats: Data and Models 4
Problem 7RE Acid rain Biologists studying the effects of acid rain on wildlife collected data from 163 streams in the Adirondack Mountains. They recorded the pH (acidity) of the water and the BCI, a measure of biological diversity, and they calculated \(R^2 = 27\%\). Here’s a scatterplot of BCI against pH: a) What is the correlation between pH and BCI? b) Describe the association between these two variables. c) If a stream has average pH, what would you predict about the BCI? d) In a stream where the pH is 3 standard deviations above average, what would you predict about the BCI?
Chapter 9: Problem 4 Stats: Data and Models 4
Problem 4RE Vineyards again Instead of Age, perhaps the Size of the vineyard (in acres) is associated with the price of the wines. Look at the scatterplot (Data in Vineyards full): a) Do you see any evidence of an association? b) What concern do you have about this scatterplot? c) If the red “+” data point is removed, would the correlation become stronger or weaker? Explain. d) If the red “+” data point is removed, would the slope of the line increase or decrease? Explain.
Chapter 9: Problem 2 Stats: Data and Models 4
Problem 2E College Every year US News and World Report publishes a special issue on many U.S. colleges and universities. The scatterplots below have Student/Faculty Ratio (number of students per faculty member) for the colleges and universities on the y-axes plotted against 4 other variables. The correct correlations for these scatterplots appear in this list. Match them. -0.98 -0.71 -0.51 0.09 0.23 0.69
Chapter 9: Problem 2 Stats: Data and Models 4
Problem 2RE How old is that tree? One can determine how old a tree is by counting its rings, but that requires cutting the tree down. Can we estimate the tree’s age simply from its diameter? A forester measured 27 trees of the same species that had been cut down, and counted the rings to determine the ages of the trees.
\(\begin{array}{|rr|rr|} \hline \text { Diameter (in.) } & \text { Age (yr) } & \text { Diameter (in.) } & \text { Age (yr) } \\ \hline 1.8 & 4 & 10.3 & 23 \\ 1.8 & 5 & 14.3 & 25 \\ 2.2 & 8 & 13.2 & 28 \\ 4.4 & 8 & 9.9 & 29 \\ 6.6 & 8 & 13.2 & 30 \\ 4.4 & 10 & 15.4 & 30 \\ 7.7 & 10 & 17.6 & 33 \\ 10.8 & 12 & 14.3 & 34 \\ 7.7 & 13 & 15.4 & 35 \\ 5.5 & 14 & 11.0 & 38 \\ 9.9 & 16 & 15.4 & 38 \\ 10.1 & 18 & 16.5 & 40 \\ 12.1 & 20 & 16.5 & 42 \\ 12.8 & 22 & & \\ \hline \end{array}\)
a) Find the correlation between Diameter and Age. Does this suggest that a linear model may be appropriate? Explain. b) Create a scatterplot and describe the association. c) Create the linear model. d) Check the residuals. Explain why a linear model is probably not appropriate. e) If you used this model, would it generally overestimate or underestimate the ages of very large trees? Explain.
Chapter 9: Problem 3 Stats: Data and Models 4
Vineyards, more information Here are the scatterplot and regression analysis for Case Prices of 36 wines from vineyards in the Finger Lakes region of New York State and the Ages of the vineyards. (Data in Vineyards full): a) Does it appear that vineyards in business longer get higher prices for their wines? Explain. b) What does this analysis tell us about vineyards in the rest of the world? c) Write the regression equation. d) Explain why that equation is essentially useless.
Chapter 9: Problem 10 Stats: Data and Models 4
More models For each of the models listed below, predict y when x = 2. a. \(\hat{y}=1.2+0.8 \log x\) b. \(\log \hat{y}=1.2+0.8 x\) c. \(\ln \hat{y}=1.2+0.8 \ln x\) d. \(\hat{y}^{2}=1.2+0.8 x\) e. \(\frac{1}{\sqrt{\hat{y}}}=1.2+0.8 x\)
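As a hedged illustration of how each re-expressed model is inverted to recover a prediction for y, the sketch below evaluates the five equations at x = 2. It assumes that "log" means the base-10 logarithm and that only the positive root is wanted in part d; the variable names are introduced here for illustration.

```python
import math

x = 2

# a) y-hat = 1.2 + 0.8 log x: no back-transformation needed (log assumed base 10).
y_a = 1.2 + 0.8 * math.log10(x)

# b) log y-hat = 1.2 + 0.8 x: undo the base-10 log with a power of 10.
y_b = 10 ** (1.2 + 0.8 * x)

# c) ln y-hat = 1.2 + 0.8 ln x: undo the natural log with exp.
y_c = math.exp(1.2 + 0.8 * math.log(x))

# d) y-hat^2 = 1.2 + 0.8 x: take the square root (positive root assumed).
y_d = math.sqrt(1.2 + 0.8 * x)

# e) 1/sqrt(y-hat) = 1.2 + 0.8 x: take the reciprocal, then square it.
y_e = (1 / (1.2 + 0.8 * x)) ** 2

for label, value in zip("abcde", [y_a, y_b, y_c, y_d, y_e]):
    print(f"{label}) {value:.4f}")
```

The common pattern is to compute the right-hand side first and then apply the inverse of whatever was done to y on the left-hand side.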
Chapter 9: Problem 10 Stats: Data and Models 4
Problem 10RE Grades A Statistics instructor created a linear regression equation to predict students’ final exam scores from their midterm exam scores. The regression equation was a) If Susan scored a 70 on the midterm, what did the instructor predict for her score on the final? b) Susan got an 80 on the final. How big is her residual? c) If the standard deviation of the final was 12 points and the standard deviation of the midterm was 10 points, what is the correlation between the two tests? d) How many points would someone need to score on the midterm to have a predicted final score of 100? e) Suppose someone scored 100 on the final. Explain why you can’t estimate this student’s midterm score from the information given. f) One of the students in the class scored 100 on the midterm but got overconfident, slacked off, and scored only 15 on the final exam. What is the residual for this student? g) No other student in the class “achieved” such a dramatic turnaround. If the instructor decides not to include this student’s scores when constructing a new regression model, will the \(R^2\) value of the regression increase, decrease, or remain the same? Explain. h) Will the slope of the new line increase or decrease?
Chapter 9: Problem 12 Stats: Data and Models 4
Problem 12E Crowdedness In a Chance magazine article (Summer 2005), Danielle Vasilescu and Howard Wainer used data from the United Nations Center for Human Settlements to investigate aspects of living conditions for several countries. Among the variables they looked at were the country’s per capita gross domestic product (GDP, in $) and Crowdedness, defined as the average number of persons per room living in homes there. This scatterplot displays these data for 56 countries: a) Explain why you should re-express these data before trying to fit a model. b) What re-expression of GDP would you try as a starting point?
Chapter 9: Problem 11 Stats: Data and Models 4
Problem 11E Vineyards again Instead of Age, perhaps the Size of the vineyard (in acres) is associated with the price of the wines. Look at the scatterplot: a) Do you see any evidence of an association? b) What concern do you have about this scatterplot? c) If the red “+” data point is removed, would the correlation become stronger or weaker? Explain. d) If the red “+” data point is removed, would the slope of the line increase or decrease? Explain.
Chapter 9: Problem 11 Stats: Data and Models 4
Problem 11RE Tips It’s commonly believed that people use tips to reward good service. A researcher for the hospitality industry examined tips and ratings of service quality from 2645 dining parties at 21 different restaurants. The correlation between ratings of service and tip percentages was 0.11. (M. Lynn and M. McCall, “Gratitude and Gratuity.” Journal of Socio-Economics 29: 203–214) a) Describe the relationship between Quality of Service and Tip Size. b) Find and interpret the value of \(R^2\) in this context.
Chapter 9: Problem 9 Stats: Data and Models 4
Problem 9E Models For each of the models listed below, predict y when x = 2.
Chapter 9: Problem 12 Stats: Data and Models 4
Problem 12RE Cramming One Thursday, researchers gave students enrolled in a section of basic Spanish a set of 50 new vocabulary words to memorize. On Friday, the students took a vocabulary test. When they returned to class the following Monday, they were retested—without advance warning. Here are the test scores for the 25 students. a) What is the correlation between Friday and Monday scores? b) What does a scatterplot show about the association between the scores? c) What does it mean for a student to have a positive residual? d) What would you predict about a student whose Friday score was one standard deviation below average? e) Write the equation of the regression line. f) Predict the Monday score of a student who earned a 40 on Friday.
Chapter 9: Problem 13 Stats: Data and Models 4
Gas mileage, revisited Let’s try the re-expressed variable Fuel Consumption (gal/100 mi) to examine the fuel efficiency of the 11 cars in Exercise 11. Here are the revised regression analysis and residuals plot: a) Explain why this model appears to be better than the linear model. b) Using the regression analysis above, write an equation of this model. c) Interpret the slope of this line. d) Based on this model, how many miles per gallon would you expect a 3500-pound car to get?
Chapter 9: Problem 13 Stats: Data and Models 4
Problem 13RE Logs (not logarithms) The value of a log is based on the number of board feet of lumber the log may contain. (A board foot is the equivalent of a piece of wood 1 inch thick, 12 inches wide, and 1 foot long. For example, a 2" × 4" piece that is 12 feet long contains 8 board feet.) To estimate the amount of lumber in a log, buyers measure the diameter inside the bark at the smaller end. Then they look in a table based on the Doyle Log Scale. The table below shows the estimates for logs 16 feet long.
\(\begin{array}{|l|rrrrrr|} \hline \text { Diameter of Log } & 8^{\prime \prime} & 12^{\prime \prime} & 16^{\prime \prime} & 20^{\prime \prime} & 24^{\prime \prime} & 28^{\prime \prime} \\ \text { Board Feet } & 16 & 64 & 144 & 256 & 400 & 576 \\ \hline \end{array}\)
a) What model does this scale use? b) How much lumber would you estimate that a log 10 inches in diameter contains? c) What does this model suggest about logs 36 inches in diameter?
Chapter 9: Problem 14 Stats: Data and Models 4
Problem 14RE US cities Data from 50 large U.S. cities show the mean January Temperature and the Latitude. Describe what you see in the scatterplot.
Chapter 9: Problem 15 Stats: Data and Models 4
Problem 15RE Correlations The study of U.S. cities in Exercise found the mean January Temperature (degrees Fahrenheit), Altitude (feet above sea level), and Latitude (degrees north of the equator) for 55 cities. Here’s the correlation matrix:
\(\begin{array}{|l|rrr|} \hline & \text { Jan. Temp } & \text { Latitude } & \text { Altitude } \\ \hline \text { Jan. Temp } & 1.000 & & \\ \text { Latitude } & -0.848 & 1.000 & \\ \text { Altitude } & -0.369 & 0.184 & 1.000 \\ \hline \end{array}\)
a) Which seems to be more useful in predicting January Temperature, Altitude or Latitude? Explain. b) If the Temperature were measured in degrees Celsius, what would be the correlation between Temperature and Latitude? c) If the Temperature were measured in degrees Celsius and the Altitude in meters, what would be the correlation? Explain. d) What would you predict about the January Temperatures in a city whose Altitude is two standard deviations higher than the average Altitude?
Exercise US cities Data from 50 large U.S. cities show the mean January Temperature and the Latitude. Describe what you see in the scatterplot.
Chapter 9: Problem 16 Stats: Data and Models 4
Colorblind Although some women are colorblind, this condition is found primarily in men. Why is it wrong to say there’s a strong correlation between Sex and Colorblindness?
Chapter 9: Problem 14 Stats: Data and Models 4
Crowdedness again In Exercise 12 we looked at United Nations data about a country’s GDP and the average number of people per room (Crowdedness) in housing there. For a re-expression, a student tried the reciprocal -10000/GDP, representing the number of people per $10,000 of gross domestic product. Here are the results, plotted against Crowdedness: a) Is this a useful re-expression? Explain. b) What re-expression would you suggest this student try next?
Chapter 9: Problem 17 Stats: Data and Models 4
Problem 17RE Life expectancy The data in the table below list the Life Expectancy for white males in the United States every decade during the last century (1 = 1900 to 1910, 2 = 1911 to 1920, etc.). Create a model to predict future increases in life expectancy. (National Vital Statistics Report)
\(\begin{array}{|l|rrrrrrrrrrr|} \hline \text { Decade } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ \text { Life exp. } & 48.6 & 54.4 & 59.7 & 62.1 & 66.5 & 67.4 & 68.0 & 70.7 & 72.7 & 74.9 & 76.5 \\ \hline \end{array}\)
Chapter 9: Problem 20 Stats: Data and Models 4
Problem 20RE Improving trees In the last exercise, you saw that the linear model had some deficiencies. Let’s create a better model. a) Perhaps the cross-sectional area of a tree would be a better predictor of its age. Since area is measured in square units, try re-expressing the data by squaring the diameters. Does the scatterplot look better? b) Create a model that predicts Age from the square of the Diameter. c) Check the residuals plot for this new model. Is this model more appropriate? Why? d) Estimate the age of a tree 18 inches in diameter.
Chapter 9: Problem 21 Stats: Data and Models 4
Problem 21RE Big screen An electronics website collects data on the size of new HD flat panel televisions (measuring the diagonal of the screen in inches) to predict the cost (in hundreds of dollars). Which of these is most likely to be the slope of the regression line: 0.03, 0.3, 3, 30? Explain.
Chapter 9: Problem 19 Stats: Data and Models 4
Problem 19E Gas mileage revisited Let’s try the re-expressed variable Fuel Consumption (gal/100 mi) to examine the fuel efficiency of the 11 cars in Exercise. Here are the revised regression analysis and residuals plot:
Dependent variable is: Fuel Consumption
R-squared = 89.2%
Variable Coefficient
Intercept 0.624932
Weight 1.17791
a) Explain why this model appears to be better than the linear model. b) Using the regression analysis above, write an equation of this model. c) Interpret the slope of this line. d) Based on this model, how many miles per gallon would you expect a 3500-pound car to get?
Exercise Gas mileage As the example in the chapter indicates, one of the important factors determining a car’s Fuel Efficiency is its Weight. Let’s examine this relationship again, for 11 cars. a) Describe the association between these variables shown in the scatterplot. b) Here is the regression analysis for the linear model. What does the slope of the line say about this relationship?
Dependent variable is: Fuel Efficiency
R-squared = 85.9%
Variable Coefficient
Intercept 47.9636
Weight -7.65184
c) Do you think this linear model is appropriate? Use the residuals plot to explain your decision.
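Part d requires converting a prediction made on the re-expressed scale (gallons per 100 miles) back to miles per gallon. The sketch below shows that back-transformation using the coefficients from the regression output above. It assumes Weight is recorded in thousands of pounds, which the exercise text here does not state; `predicted_mpg` is a hypothetical helper name.

```python
# Coefficients from the Fuel Consumption regression output in the exercise.
INTERCEPT = 0.624932   # gal/100 mi
SLOPE = 1.17791        # gal/100 mi per unit of Weight (assumed: 1000 lb)


def predicted_mpg(weight_lb):
    """Predict fuel consumption, then invert it to get miles per gallon."""
    weight_thousands = weight_lb / 1000           # assumption: Weight is in 1000s of lb
    gal_per_100_mi = INTERCEPT + SLOPE * weight_thousands
    return 100 / gal_per_100_mi                   # 100 miles divided by gallons used


print(round(predicted_mpg(3500), 2))
```

If Weight were recorded in some other unit, only the division by 1000 would need to change.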
Chapter 9: Problem 20 Stats: Data and Models 4
Problem 20E Pendulum A student experimenting with a pendulum counted the number of full swings the pendulum made in 20 seconds for various lengths of string. Her data are shown below. a) Explain why a linear model is not appropriate for using the Length of a pendulum to predict the Number of Swings in 20 seconds. b) Re-express the data to straighten the scatterplot. c) Create an appropriate model. d) Estimate the number of swings for a pendulum with a 4-inch string. e) Estimate the number of swings for a pendulum with a 48-inch string. f) How much confidence do you place in these predictions? Why?
Chapter 9: Problem 19 Stats: Data and Models 4
How old is that tree? One can determine how old a tree is by counting its rings, but that requires either cutting the tree down or extracting a sample from the tree’s core. Can we estimate the tree’s age simply from its diameter? A forester measured 27 trees of the same species that had been cut down, and counted the rings to determine the ages of the trees. a) Find the correlation between Diameter and Age. Does this suggest that a linear model may be appropriate? Explain. b) Create a scatterplot and describe the association. c) Create the linear model. d) Check the residuals. Explain why a linear model is probably not appropriate. e) If you used this model, would it generally overestimate or underestimate the ages of very large trees? Explain.
Chapter 9: Problem 18 Stats: Data and Models 4
Problem 18E Crowdedness In a Chance magazine article (Summer 2005), Danielle Vasilescu and Howard Wainer used data from the United Nations Center for Human Settlements to investigate aspects of living conditions for several countries. Among the variables they looked at were the country’s per capita gross domestic product (GDP, in $) and Crowdedness, defined as the average number of persons per room living in homes there. This scatterplot displays these data for 56 countries: a) Explain why you should re-express these data before trying to fit a model. b) What re-expression of GDP would you try as a starting point?
Chapter 9: Problem 23 Stats: Data and Models 4
Problem 23E Planets, distances and order Let’s look again at the pattern in the locations of the planets in our solar system seen in the table in Exercise 22. a) Re-express the distances to create a model for the Distance from the sun based on the planet’s Position. b) Based on this model, would you agree with the International Astronomical Union that Pluto is not a planet? Explain. Exercise 22: Planets, distances and years At a meeting of the International Astronomical Union (IAU) in Prague in 2006, Pluto was determined not to be a planet, but rather the largest member of the Kuiper belt of icy objects. Let’s examine some facts. Here is a table of the 9 sun-orbiting objects formerly known as planets: a) Plot the Length of the year against the Distance from the sun. Describe the shape of your plot. b) Re-express one or both variables to straighten the plot. Use the re-expressed data to create a model describing the length of a planet’s year based on its distance from the sun. c) Comment on how well your model fits the data.
Chapter 9: Problem 22 Stats: Data and Models 4
Problem 22E Crowdedness again In Exercise we looked at United Nations data about a country’s GDP and the average number of people per room (Crowdedness) in housing there. For a re-expression, a student tried the reciprocal -10000/GDP, representing the number of people per $10,000 of gross domestic product. Here are the results, plotted against Crowdedness: a) Is this a useful re-expression? Explain. b) What re-expression would you suggest this student try next?
Chapter 9: Problem 24 Stats: Data and Models 4
Problem 24E Traffic Highway planners investigated the relationship between traffic Density (number of automobiles per mile) and the average Speed of the traffic on a moderately large city thoroughfare. The data were collected at the same location at 10 different times over a span of 3 months. They found a mean traffic Density of 68.6 cars per mile (cpm) with standard deviation of 27.07 cpm. Overall, the cars’ average Speed was 26.38 mph, with standard deviation of 9.68 mph. These researchers found the regression line for these data to be Speed = 50.55 - 0.352 Density. a) What is the value of the correlation coefficient between Speed and Density? b) What percent of the variation in average Speed is explained by traffic Density? c) Predict the average Speed of traffic on the thoroughfare when the traffic Density is 50 cpm. d) What is the value of the residual for a traffic Density of 56 cpm with an observed Speed of 32.5 mph? e) The data set initially included the point Density = 125 cpm, Speed = 55 mph. This point was considered an outlier and was not included in the analysis. Will the slope increase, decrease, or remain the same if we redo the analysis and include this point? f) Will the correlation become stronger, weaker, or remain the same if we redo the analysis and include this point (125, 55)? g) A European member of the research team measured the Speed of the cars in kilometers per hour (1 km ≈ 0.6 miles) and the traffic Density in cars per kilometer. Find the value of his calculated correlation between speed and density.
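The arithmetic behind parts a to d can be sketched from the summary statistics alone, using the standard identity that the least-squares slope equals r times (SD of y)/(SD of x). The variable and function names below are introduced for illustration and are not from the textbook.

```python
# Summary statistics and regression line quoted in the exercise.
sd_density = 27.07              # cars per mile
sd_speed = 9.68                 # mph
intercept, slope = 50.55, -0.352


def predict_speed(density):
    """Predicted average Speed (mph) for a given traffic Density (cpm)."""
    return intercept + slope * density


# a) slope = r * sd_y / sd_x, so r = slope * sd_x / sd_y.
r = slope * sd_density / sd_speed

# b) Fraction of the variation in Speed accounted for by Density.
r_squared = r ** 2

# c) Prediction at 50 cpm.
speed_at_50 = predict_speed(50)

# d) Residual = observed - predicted.
residual_at_56 = 32.5 - predict_speed(56)

print(round(r, 3), round(r_squared, 3), round(speed_at_50, 2), round(residual_at_56, 3))
```

For part g, correlation has no units, so rescaling both variables by positive constants leaves r unchanged.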
Chapter 9: Problem 24 Stats: Data and Models 4
Problem 24RE Orange production The table below shows that as the number of oranges on a tree increases, the fruit tends to get smaller. Create a model for this relationship, and express any concerns you may have.
\(\begin{array}{|r|r|} \hline \text { Number of Oranges/Tree } & \text { Average Weight/Fruit (lb) } \\ \hline 50 & 0.60 \\ 100 & 0.58 \\ 150 & 0.56 \\ 200 & 0.55 \\ 250 & 0.53 \\ 300 & 0.52 \\ 350 & 0.50 \\ 400 & 0.49 \\ 450 & 0.48 \\ 500 & 0.46 \\ 600 & 0.44 \\ 700 & 0.42 \\ 800 & 0.40 \\ 900 & 0.38 \\ \hline \end{array}\)
Chapter 9: Problem 25 Stats: Data and Models 4
Problem 25RE U.S. cities Data from 50 large U.S. cities show the mean January Temperature and the Latitude. Describe what you see in the scatterplot.
Chapter 9: Problem 25 Stats: Data and Models 4
Problem 25E Cramming One Thursday, researchers gave students enrolled in a section of basic Spanish a set of 50 new vocabulary words to memorize. On Friday the students took a vocabulary test. When they returned to class the following Monday, they were retested, without advance warning. Here are the test scores for the 25 students.
\(\begin{array}{|rr|rr|rr|} \hline \text { Fri. } & \text { Mon. } & \text { Fri. } & \text { Mon. } & \text { Fri. } & \text { Mon. } \\ \hline 42 & 36 & 48 & 37 & 39 & 41 \\ 44 & 44 & 43 & 41 & 46 & 32 \\ 45 & 46 & 45 & 32 & 37 & 36 \\ 48 & 38 & 47 & 44 & 40 & 31 \\ 44 & 40 & 50 & 47 & 41 & 32 \\ 43 & 38 & 34 & 34 & 48 & 39 \\ 41 & 37 & 38 & 31 & 37 & 31 \\ 35 & 31 & 43 & 40 & 36 & 41 \\ 43 & 32 & & & & \\ \hline \end{array}\)
a) What is the correlation between Friday and Monday scores? b) What does a scatterplot show about the association between the scores? c) What does it mean for a student to have a positive residual? d) What would you predict about a student whose Friday score was one standard deviation below average? e) Write the equation of the regression line. f) Predict the Monday score of a student who earned a 40 on Friday.
Chapter 9: Problem 26 Stats: Data and Models 4
Planets, models, and laws The model you found in Exercise 22 is a relationship noted in the 17th century by Kepler as his Third Law of Planetary Motion. It was subsequently explained as a consequence of Newton’s Law of Gravitation. The models for Exercises 23–25 relate to what is sometimes called the Titius-Bode “law,” a pattern noticed in the 18th century but lacking any scientific explanation. Compare how well the re-expressed data are described by their respective linear models. What aspect of the model of Exercise 22 suggests that we have found a physical law? In the future, we may learn enough about a planetary system around another star to tell whether the Titius-Bode pattern applies there. If you discovered that another planetary system followed the same pattern, how would it change your opinion about whether this is a real natural “law”? What would you think if the next system we find does not follow this pattern?
Chapter 9: Problem 26 Stats: Data and Models 4
Problem 26RE French Consider the association between a student’s score on a French vocabulary test and the weight of the student. What direction and strength of correlation would you expect in each of the following situations? Explain. a) The students are all in third grade. b) The students are in third through twelfth grades in the same school district. c) The students are in tenth grade in France. d) The students are in third through twelfth grades in France.
Chapter 9: Problem 31 Stats: Data and Models 4
Slower is cheaper? Researchers studying how a car’s Fuel Efficiency varies with its Speed drove a compact car 200 miles at various speeds on a test track. Their data are shown in the table.
Chapter 9: Problem 31 Stats: Data and Models 4
Problem 31RE Gasoline Since clean-air regulations have dictated the use of unleaded gasoline, the supply of leaded gas in New York state has diminished. The table below was given on the August 2001 New York State Math B exam, a statewide achievement test for high school students.
\(\begin{array}{|l|rrrrr|} \hline \text { Year } & 1984 & 1988 & 1992 & 1996 & 2000 \\ \text { Gallons (1000's) } & 150 & 124 & 104 & 76 & 50 \\ \hline \end{array}\)
a) Create a linear model and predict the number of gallons that will be available in 2005. b) The exam then asked students to estimate the year when leaded gasoline will first become unavailable, expecting them to use the model from part a to answer the question. Explain why that method is incorrect. c) Create a model that would be appropriate for that task, and make the estimate. d) The “wrong” answer from the other model is fairly accurate in this case. Why?
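For part a, a least-squares line can be fitted directly from the five tabulated points. The sketch below does this with plain Python so the steps are visible; `least_squares` is a hypothetical helper written for this illustration, not a library function.

```python
# Data from the exam table: year and leaded gasoline supply (thousands of gallons).
years = [1984, 1988, 1992, 1996, 2000]
gallons = [150, 124, 104, 76, 50]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line predicting ys from xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = least_squares(years, gallons)
prediction_2005 = intercept + slope * 2005   # thousands of gallons

print(round(slope, 2), round(prediction_2005, 1))
```

This line predicts Gallons from Year. A line fitted with the roles of the two variables swapped is a different regression, which is the distinction part b asks about.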
Chapter 9: Problem 28 Stats: Data and Models 4
Problem 28RE Tree growth A 1996 study examined the growth of grapefruit trees in Texas, determining the average trunk Diameter (in inches) for trees of varying Ages:
\(\begin{array}{|l|rrrrrrrrrr|} \hline \text { Age (yr) } & 2 & 4 & 6 & 8 & 10 & 12 & 14 & 16 & 18 & 20 \\ \text { Diameter (in.) } & 2.1 & 3.9 & 5.2 & 6.2 & 6.9 & 7.6 & 8.3 & 9.1 & 10.0 & 11.4 \\ \hline \end{array}\)
a) Fit a linear model to these data. What concerns do you have about the model? b) If data had been given for individual trees instead of averages, would you expect the fit to be stronger, less strong, or about the same? Explain.
Chapter 9: Problem 27 Stats: Data and Models 4
Problem 27RE Twins Twins are often born after a pregnancy that lasts less than 9 months. On the next page is a graph from the Journal of the American Medical Association (JAMA) showing the rate of preterm twin births in the United States over the past 20 years. In this study, JAMA categorized mothers by the level of prenatal medical care they received: inadequate, adequate, or intensive. a) Describe the overall trend in preterm twin births. b) Describe any differences you see in this trend, depending on the level of prenatal medical care the mother received. c) Should expectant mothers be advised to cut back on the level of medical care they seek in the hope of avoiding preterm births? Explain.
Figure caption: Preterm Birth Rate per 100 live twin births among U.S. twins by intensive, adequate, and less than adequate prenatal care utilization, 1981–1997. (JAMA 284[2000]: 335–341)
Chapter 9: Problem 27 Stats: Data and Models 4
Problem 27E GDP The scatterplot shows the gross domestic product (GDP) of the United States in billions of dollars plotted against years since 1950. A linear model fit to the relationship looks like this:
Dependent variable is: GDP
R-squared = 87.6%
s = 1597.7456
Variable Coefficient
Intercept -2561.3552
Year-1950 237.74577
a) Does the value 87.6% suggest that this is a good model? Explain. b) Here’s a scatterplot of the residuals. Now do you think this is a good model for these data? Explain.
Chapter 9: Problem 33 Stats: Data and Models 4
Lunchtime Does how long toddlers sit at the lunch table help predict how much they eat? The table and graph show the number of minutes the kids stayed at the table and the number of calories they consumed. Create and interpret a model for these data.
Chapter 9: Problem 34 Stats: Data and Models 4
Problem 34E Pressure Scientist Robert Boyle examined the relationship between the volume in which a gas is contained and the pressure in its container. He used a cylindrical container with a moveable top that could be raised or lowered to change the volume. He measured the Height in inches by counting equally spaced marks on the cylinder, and measured the Pressure in inches of mercury (as in a barometer). Some of his data are listed in the table. Create an appropriate model.
\(\begin{array}{|l|rrrrrrrrrrrr|} \hline \text { Height } & 48 & 44 & 40 & 36 & 32 & 28 & 24 & 20 & 18 & 16 & 14 & 12 \\ \text { Pressure } & 29.1 & 31.9 & 35.3 & 39.3 & 44.2 & 50.3 & 58.8 & 70.7 & 77.9 & 87.9 & 100.4 & 117.6 \\ \hline \end{array}\)
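A quick diagnostic before choosing a re-expression, offered as a sketch and not as the textbook's solution, is to multiply each Height by its Pressure. If the products are nearly constant, a reciprocal re-expression (Pressure against 1/Height) is a natural candidate. The variable names are introduced here for illustration.

```python
# Boyle's data as listed in the exercise.
heights = [48, 44, 40, 36, 32, 28, 24, 20, 18, 16, 14, 12]       # inches
pressures = [29.1, 31.9, 35.3, 39.3, 44.2, 50.3,
             58.8, 70.7, 77.9, 87.9, 100.4, 117.6]                # inches of mercury

# If Pressure is proportional to 1/Height, Height * Pressure should be roughly constant.
products = [h * p for h, p in zip(heights, pressures)]

mean_product = sum(products) / len(products)
relative_spread = (max(products) - min(products)) / mean_product

print([round(v, 1) for v in products])
print(round(mean_product, 1), round(relative_spread, 4))
```

A small relative spread supports, but does not prove, a model of the form Pressure = k / Height. A residuals plot for the re-expressed fit is still the appropriate check.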
Chapter 9: Problem 34 Stats: Data and Models 4
Problem 34RE Tobacco and alcohol Are people who use tobacco products more likely to consume alcohol? Here are data on household spending (in pounds) taken by the British Government on 11 regions in Great Britain. Do tobacco and alcohol spending appear to be related? What questions do you have about these data? What conclusions can you draw?
\(\begin{array}{|l|rr|} \hline \text { Region } & \text { Alcohol } & \text { Tobacco } \\ \hline \text { North } & 6.47 & 4.03 \\ \text { Yorkshire } & 6.13 & 3.76 \\ \text { Northeast } & 6.19 & 3.77 \\ \text { East Midlands } & 4.89 & 3.34 \\ \text { West Midlands } & 5.63 & 3.47 \\ \text { East Anglia } & 4.52 & 2.92 \\ \text { Southeast } & 5.89 & 3.20 \\ \text { Southwest } & 4.79 & 2.71 \\ \text { Wales } & 5.27 & 3.53 \\ \text { Scotland } & 6.08 & 4.51 \\ \text { Northern Ireland } & 4.02 & 4.56 \\ \hline \end{array}\)
Chapter 9: Problem 35 Stats: Data and Models 4
Tobacco and alcohol Are people who use tobacco products more likely to consume alcohol? Here are data on household spending (in pounds) taken by the British government on 11 regions in Great Britain. Do tobacco and alcohol spending appear to be related? What questions do you have about these data? What conclusions can you draw?
Chapter 9: Problem 36 Stats: Data and Models 4
Problem 36RE Football weights The Sears Cup was established in 1993 to honor institutions that maintain a broad-based athletic program, achieving success in many sports, both men’s and women’s. Since its Division III inception in 1995, the cup has been won by Williams College 15 of 17 years. Their football team has an 85.3% winning record under their current coach. Why does the football team win so much? Is it because they’re heavier than their opponents? The table shows the average team weights for selected years from 1973 to 1993. a) Fit a straight line to the relationship between Weight and Year. b) Does a straight line seem reasonable? c) Predict the average weight of the team for the year 2015. Does this seem reasonable? d) What about the prediction for the year 2103? Explain. e) What about the prediction for the year 3003? Explain.
Chapter 9: Problem 37 Stats: Data and Models 4
Problem 37RE Models Find the predicted value of y, using each model for x = 10.
a) \(\hat{y} = 2 + 0.8 \ln x\)
b) \(\log \hat{y} = 5 - 0.23x\)
c)
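Parts a) and b) come down to substituting x = 10 and, for b), undoing the logarithm. The sketch below assumes that "log" means the base-10 logarithm and "ln" the natural logarithm; the variable names are illustrative only.

```python
import math

x = 10

# a) predicted y = 2 + 0.8 ln x (natural log)
y_a = 2 + 0.8 * math.log(x)

# b) log(predicted y) = 5 - 0.23 x, so predicted y = 10 ** (5 - 0.23 x)
#    Assumption: "log" is base 10.
y_b = 10 ** (5 - 0.23 * x)

print(f"a) {y_a:.3f}   b) {y_b:.2f}")
```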
Chapter 9: Problem 32 Stats: Data and Models 4
Orange production The table below shows that as the number of oranges on a tree increases, the fruit tends to get smaller. Create a model for this relationship, and express any concerns you may have.
Chapter 9: Problem 38 Stats: Data and Models 4
Williams vs. Texas Here are the average weights of the football team for the University of Texas for various years in the 20th century.
a) Fit a straight line to the relationship of Weight by Year for Texas football players.
b) According to these models, in what year will the predicted weight of the Williams College team from Exercise 36 first be more than the weight of the University of Texas team?
c) Do you believe this? Explain.
Chapter 9: Problem 39 Stats: Data and Models 4
Problem 39RE Vehicle weights The Minnesota Department of Transportation hoped that they could measure the weights of big trucks without actually stopping the vehicles by using a newly developed “weigh-in-motion” scale. After installation of the scale, a study was conducted to find out whether the scale’s readings correspond to the true weights of the trucks being monitored. In Exercise of Chapter 6, you examined the scatterplot for the data they collected, finding the association to be approximately linear with \(R^2 = 93\%\). Their regression equation is Wt = 10.85 + 0.64 Scale, where both the scale reading and the predicted weight of the truck are measured in thousands of pounds.
a) Estimate the weight of a truck if this scale read 31,200 pounds.
b) If that truck actually weighed 32,120 pounds, what was the residual?
c) If the scale reads 35,590 pounds, and the truck has a residual of -2440 pounds, how much does it actually weigh?
d) In general, do you expect estimates made using this equation to be reasonably accurate? Explain.
e) If the police plan to use this scale to issue tickets to trucks that appear to be overloaded, will negative or positive residuals be a greater problem? Explain.
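The arithmetic for parts a) through c) can be sketched as follows, using the stated equation and the usual definition residual = actual − predicted. All quantities are converted to thousands of pounds first; the function name `predicted_weight` is illustrative.

```python
def predicted_weight(scale_reading):
    """Regression equation from the problem; both values in thousands of pounds."""
    return 10.85 + 0.64 * scale_reading

# a) scale reads 31,200 lb = 31.2 thousand lb
weight_a = predicted_weight(31.2)

# b) residual = actual - predicted, with actual = 32.12 thousand lb
residual_b = 32.12 - weight_a

# c) actual = predicted + residual, with residual = -2.44 thousand lb
actual_c = predicted_weight(35.59) + (-2.44)

print(f"a) {weight_a * 1000:.0f} lb  b) {residual_b * 1000:.0f} lb  c) {actual_c * 1000:.0f} lb")
```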
Chapter 9: Problem 41 Stats: Data and Models 4
Problem 41RE Down the drain Most water tanks have a drain plug so that the tank may be emptied when it’s to be moved or repaired. How long it takes a certain size of tank to drain depends on the size of the plug, as shown in the table. Create a model.
\(\begin{array}{|l|rrrrrrr|} \hline \text { Plug Dia (in.) } & 3/8 & 1/2 & 3/4 & 1 & 1\tfrac{1}{4} & 1\tfrac{1}{2} & 2 \\ \text { Drain Time (min.) } & 140 & 80 & 35 & 20 & 13 & 10 & 5 \\ \hline \end{array}\)
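One candidate re-expression is to take logs of both variables and look at the slope, which hints at the power of the diameter involved. The sketch below is only an illustration: it uses the six plug sizes from 1/2 inch upward, assumes base-10 logs, and `fit_line` is a hypothetical helper rather than anything from the textbook.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Six plug sizes from 1/2 in. upward, with their drain times in minutes.
diameters = [0.5, 0.75, 1.0, 1.25, 1.5, 2.0]
drain_times = [80, 35, 20, 13, 10, 5]

log_d = [math.log10(d) for d in diameters]
log_t = [math.log10(t) for t in drain_times]
slope, intercept = fit_line(log_d, log_t)
print(f"log(Time)-hat = {intercept:.3f} + ({slope:.3f}) * log(Diameter)")
```

A slope close to −2 would point toward modeling Drain Time against 1/Diameter², which fits the idea that drain rate depends on the area of the opening.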
Chapter 9: Problem 40 Stats: Data and Models 4
Problem 40RE Profit How are a company’s profits related to its sales? Let’s examine data from 71 large U.S. corporations. All amounts are in millions of dollars.
a) Histograms of Profits and Sales and histograms of the logarithms of Profits and Sales appear below. Why are the re-expressed data better for regression?
b) Here are the scatterplot and residuals plot for the regression of logarithm of Profits vs. Log(Sales). Do you think this model is appropriate? Explain.
c) Here’s the regression analysis. Write the equation.
Dependent variable is: Log Profit
R-squared = 48.1%
\(\begin{array}{|l|r|} \hline \text { Variable } & \text { Coefficient } \\ \hline \text { Intercept } & -0.106259 \\ \text { LogSales } & 0.647798 \\ \hline \end{array}\)
d) Use your equation to estimate profits earned by a company with sales of 2.5 billion dollars. (That’s 2500 million.)
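For part d), the prediction comes out on the log scale and has to be converted back to dollars. A minimal sketch, assuming both logarithms in the regression output are base 10 and that Sales is entered in millions of dollars:

```python
import math

# Coefficients from the regression output in part c).
intercept = -0.106259
slope = 0.647798

sales = 2500  # millions of dollars (2.5 billion)

# Predicted log(Profit), assuming base-10 logs throughout.
log_profit = intercept + slope * math.log10(sales)

# Back-transform to get predicted Profit in millions of dollars.
profit = 10 ** log_profit
print(f"log(Profit)-hat = {log_profit:.3f}; Profit-hat is about {profit:.0f} million dollars")
```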
Chapter 9: Problem 42 Stats: Data and Models 4
Problem 42RE Chips A start-up company has developed an improved electronic chip for use in laboratory equipment. The company needs to project the manufacturing cost, so it develops a spreadsheet model that takes into account the purchase of production equipment, overhead, raw materials, depreciation, maintenance, and other business costs. The spreadsheet estimates the cost of producing 10,000 to 200,000 chips per year, as seen in the table. Develop a regression model to predict Costs based on the Level of production.
\(\begin{array}{|rr|rr|} \hline \text { Chips Produced (1000s) } & \text { Cost per Chip (\$) } & \text { Chips Produced (1000s) } & \text { Cost per Chip (\$) } \\ \hline 10 & 146.10 & 90 & 47.22 \\ 20 & 105.80 & 100 & 44.31 \\ 30 & 85.75 & 120 & 42.88 \\ 40 & 77.02 & 140 & 39.05 \\ 50 & 66.10 & 160 & 37.47 \\ 60 & 63.92 & 180 & 35.09 \\ 70 & 58.80 & 200 & 34.04 \\ 80 & 50.91 & & \\ \hline \end{array}\)
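Because cost per chip falls quickly at first and then levels off, a straight line on the raw data is unlikely to fit well. One re-expression to try, offered here as an assumption rather than the required answer, is a log-log model: regress log(Cost per Chip) on log(Chips Produced). The helper `fit_line` is a hypothetical name for this sketch.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; also returns r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / (sxx * syy) ** 0.5

# Data from the table: chips produced (1000s) and cost per chip ($).
chips = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200]
cost = [146.10, 105.80, 85.75, 77.02, 66.10, 63.92, 58.80, 50.91,
        47.22, 44.31, 42.88, 39.05, 37.47, 35.09, 34.04]

# Assumed re-expression: base-10 logs of both variables.
log_chips = [math.log10(c) for c in chips]
log_cost = [math.log10(c) for c in cost]
slope, intercept, r = fit_line(log_chips, log_cost)
print(f"log(Cost)-hat = {intercept:.3f} + ({slope:.3f}) * log(Chips), r^2 = {r * r:.3f}")
```

As with any re-expression, the residuals plot of the log-log fit should be checked for remaining curvature before the model is used for prediction.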