# Homework 2 STT 421

These 11 pages of class notes were uploaded by Jacob Decker on Sunday, October 11, 2015. They belong to STT 421 at Michigan State University, taught by V. Melfi in Fall 2015.

Date Created: 10/11/15

## 1. 2.22 Fuel consumption and speed

(a) The speed should be placed on the x-axis because it is the explanatory variable. Next the data are read into R and displayed, and a scatterplot is drawn.

    > location <- "http://www.stt.msu.edu/melfi/stt421/textdata/ch02/ex02022.csv"
    > degrees <- read.csv(location, header = TRUE, stringsAsFactors = FALSE)
    > degrees
       speed  fuel
    1     10 21.00
    2     20 13.00
    3     30 10.00
    4     40  8.00
    5     50  7.00
    6     60  5.90
    7     70  6.30
    8     80  6.95
    9     90  7.57
    10   100  8.27
    11   110  9.03
    12   120  9.87
    13   130 10.79
    14   140 11.77
    15   150 12.83
    > plot(degrees$speed, degrees$fuel, xlab = "Speed", ylab = "Fuel", main = "Scatterplot of Fuel vs Speed")

[Figure: Scatterplot of Fuel vs Speed (x-axis: Speed, y-axis: Fuel)]

(b) The relationship is curved, not linear. Because low "mileage" is actually good (it means that we use less fuel to travel 100 km), this makes sense. Note that 60 km/hr is about 37 mph. Moderate speeds yield the best performance, which is reasonable: it takes more fuel to start moving, and higher speeds also require more fuel to maintain the speed.

(c) For speeds less than 60 kph there is a negative association, but for speeds greater than 60 kph it is positive. We have to think of the relationship as a whole, however, and there is neither a positive nor a negative association for the entire relationship. Above-average values of fuel used are found with both below-average and above-average values of speed.

(d) The relationship is very strong. If we were to draw a curve that fits the points, the points would fall on or very, very close to that curve. The curve is very useful for prediction.

## 2. 2.44 Gas mileage and speed

(a) R command:

    > cor(degrees)
               speed       fuel
    speed  1.0000000 -0.1716216
    fuel  -0.1716216  1.0000000

r = -0.172. It is close to zero because the relationship is a curve rather than a line; correlation measures linear association.

## 3. 2.52 Effect of a change in units

(a) The new speed and fuel consumption values are, respectively, x* = x / 1.609 and y* = y × 1.609 / (100 × 3.785) = 0.004251y. The factor of 1/100 is needed since we were measuring fuel consumption in liters per 100 km. The transformed data have the same correlation as the original, r = -0.172, since a linear transformation does not alter the correlation.

(b) Scatterplot for original data:

[Figure: Scatterplot of Fuel vs Speed, original units (x-axis: Speed, y-axis: Fuel)]

Scatterplot for transformed data:

[Figure: Scatterplot of Fuel vs Speed, transformed units (x-axis: Speed, y-axis: Fuel)]

(d) R commands:

    > location <- "http://www.stt.msu.edu/melfi/stt421/textdata/ch02/ex02052.csv"
    > degrees <- read.csv(location, header = TRUE, stringsAsFactors = FALSE)
    > degrees
       speed  fuel
    1     10 21.00
    2     20 13.00
    3     30 10.00
    4     40  8.00
    5     50  7.00
    6     60  5.90
    7     70  6.30
    8     80  6.95
    9     90  7.57
    10   100  8.27
    11   110  9.03
    12   120  9.87
    13   130 10.79
    14   140 11.77
    15   150 12.83
    > degrees$SpeedMPH <- degrees$speed * 1/1.609
    > degrees
       speed  fuel SpeedMPH
    1     10 21.00  6.21504
    2     20 13.00 12.43008
    3     30 10.00 18.64512
    4     40  8.00 24.86016
    5     50  7.00 31.07520
    6     60  5.90 37.29024
    7     70  6.30 43.50528
    8     80  6.95 49.72032
    9     90  7.57 55.93536
    10   100  8.27 62.15040
    11   110  9.03 68.36544
    12   120  9.87 74.58048
    13   130 10.79 80.79553
    14   140 11.77 87.01057
    15   150 12.83 93.22561
    > degrees$FuelGPM <- degrees$fuel * 1.609 / 100 * 1/3.785
    > degrees
       speed  fuel SpeedMPH    FuelGPM
    1     10 21.00  6.21504 0.08927081
    2     20 13.00 12.43008 0.05526288
    3     30 10.00 18.64512 0.04250991
    4     40  8.00 24.86016 0.03400793
    5     50  7.00 31.07520 0.02975694
    6     60  5.90 37.29024 0.02508085
    7     70  6.30 43.50528 0.02678124
    8     80  6.95 49.72032 0.02954439
    9     90  7.57 55.93536 0.03218000
    10   100  8.27 62.15040 0.03515569
    11   110  9.03 68.36544 0.03838645
    12   120  9.87 74.58048 0.04195728
    13   130 10.79 80.79553 0.04586819
    14   140 11.77 87.01057 0.05003416
    15   150 12.83 93.22561 0.05454021
    > plot(degrees$speed, degrees$fuel, xlab = "Speed", ylab = "Fuel", main = "Scatterplot of Fuel vs Speed")
    > cor(degrees)
                  speed       fuel   SpeedMPH    FuelGPM
    speed     1.0000000 -0.1716216  1.0000000 -0.1716216
    fuel     -0.1716216  1.0000000 -0.1716216  1.0000000
    SpeedMPH  1.0000000 -0.1716216  1.0000000 -0.1716216
    FuelGPM  -0.1716216  1.0000000 -0.1716216  1.0000000

(e) Scatterplot for Fuel in MPG:

[Figure: Scatterplot of Fuel vs Speed, fuel in MPG (x-axis: Speed, y-axis: Fuel)]

(f) R commands:

    > degrees$FuelMPG <- 1/degrees$FuelGPM
    > plot(degrees$SpeedMPH, degrees$FuelMPG, xlab = "Speed", ylab = "Fuel", main = "Scatterplot of Fuel VS Speed")
    > cor(degrees)
                  speed       fuel   SpeedMPH    FuelGPM    FuelMPG
    speed     1.0000000 -0.1716216  1.0000000 -0.1716216 -0.0429662
    fuel     -0.1716216  1.0000000 -0.1716216  1.0000000 -0.9172232
    SpeedMPH  1.0000000 -0.1716216  1.0000000 -0.1716216 -0.0429662
    FuelGPM  -0.1716216  1.0000000 -0.1716216  1.0000000 -0.9172232
    FuelMPG  -0.0429662 -0.9172232 -0.0429662 -0.9172232  1.0000000

The new correlation is r = -0.043; the new plot is even less linear than the first.

## 4. 2.63 Water discharged by the Mississippi River

Based on the slope, volume increases at an average rate of 4.2255 km³/year. The estimate for 1780 is -271 km³; a negative number makes no sense in this context. The estimate for 1990 is 617 km³. Based on the time plot, it appears that the actual discharge in 1990 was around 680 km³, so the prediction error is about 680 - 617 = 63 km³. There are high spikes in the time plot in the two flood years.

## 5. 2.104 Dangers of not looking at a plot

(a), (b) To three decimal places, the correlations are all approximately 0.816 (for set D, r actually rounds to 0.817), and the regression lines are all approximately ŷ = 3.000 + 0.500x. For all four sets, we predict ŷ = 8 when x = 10.

R commands (scatterplots below):

    > cor(forsets$x, forsets$y1)
    [1] 0.8164205
    > cor(forsets$x, forsets$y2)
    [1] 0.8162365
    > cor(forsets$x, forsets$y3)
    [1] 0.8162867
    > cor(forsets$x4, forsets$y4)
    [1] 0.8165214
    > forsets.lm <- lm(x ~ y1, data = forsets)
    > coef(forsets.lm)
    (Intercept)          y1 
     -0.9975311   1.3328426 
    > forsets.lm <- lm(x ~ y2, data = forsets)
    > coef(forsets.lm)
    (Intercept)          y2 
     -0.9948419   1.3324841 
    > forsets.lm <- lm(x ~ y3, data = forsets)
    > coef(forsets.lm)
    (Intercept)          y3 
      -1.000315    1.333375 
    > forsets.lm <- lm(x4 ~ y4, data = forsets)
    > coef(forsets.lm)
    (Intercept)          y4 
      -1.003640    1.333657 
    > predict(forsets.lm)
            1         2         3         4         5 
     7.771823  6.678224  9.278856 10.785888 10.292435 
            6         7         8         9        10 
     8.385305  5.998059 15.667073  6.411493  9.545587 
           11 
     8.185257 
    > plot(forsets$x, forsets$y2, xlab = "x", ylab = "y", main = "Data set A", col = "blue")
    > abline(a = -0.9948419, b = 1.3324841, col = "red")
    > plot(forsets$x, forsets$y2, xlab = "x", ylab = "y", main = "Data set B", col = "blue", lwd = 2)
    > abline(a = -0.9948419, b = 1.3324841, col = "red", lwd = 2)
    > plot(forsets$x, forsets$y3, xlab = "x", ylab = "y", main = "Data set C", col = "blue")
    > abline(a = -1.000315, b = 1.333375, col = "red", lwd = 2)
    > plot(forsets$x4, forsets$y4, xlab = "x", ylab = "y", main = "Data set D", col = "blue")
    > abline(a = -1.003640, b = 1.333657, col = "red", lwd = 2)

[Figures: scatterplots titled Data set A, Data set B, Data set C, and Data set D, each with a line drawn in red]

Note that the `lm` calls above regress x on y (for example `lm(x ~ y1)`), so the printed coefficients, and the lines drawn with `abline`, are those of x as a function of y rather than the ŷ = 3.000 + 0.500x line quoted in the answer.

(c) For Set A, the use of the regression line seems to be reasonable: the data do seem to have a moderate linear association, albeit with a fair amount of scatter. For Set B, there is an obvious nonlinear relationship; we should fit a parabola or other curve. For Set C, the point (13, 12.74) deviates from the highly linear pattern of the other points; if we can exclude it, the new regression formula would be very useful for prediction. For Set D, the data point with x = 19 is a very influential point; the other points alone give no indication of slope for the line. Seeing how widely scattered the y coordinates of the other points are, we cannot place too much faith in the y coordinate of the influential point. Thus we cannot depend on the slope of the line, so we cannot depend on the estimate when x = 10. We also have no evidence as to whether or not a line is an appropriate model for this relationship.

## 6. 2.116 Condition on gender

(a) For each gender, the conditional distribution of status is found by dividing the counts in that column by that column's total. For example, 890/4842 = 0.1838, 340/4842 = 0.0702, etc., meaning that of all male college students, about 18.38% are enrolled full-time in two-year colleges, 7.02% are attending a two-year college part-time, and so on. Note that each set of six numbers should add up to 1, except for rounding error. Graphical presentations may vary; one possibility is shown below. We see that there is little difference between genders in the distribution of status: the percentages of men and women in each status category are quite similar.
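
The claim in problem 3, that a change of units leaves the correlation unchanged, can be checked with a short derivation. This is only a sketch in standard notation: the symbols a, b, c, d and the starred variables are introduced here for illustration and are not part of the original solution, and it assumes the usual definition of r as the average product of standardized values.

```latex
\begin{aligned}
x_i^* &= a x_i + b, \qquad y_i^* = c y_i + d, \qquad a > 0,\ c > 0 \\
\bar{x}^* &= a\bar{x} + b, \qquad s_{x^*} = a\, s_x
  \quad\Longrightarrow\quad
  \frac{x_i^* - \bar{x}^*}{s_{x^*}} = \frac{x_i - \bar{x}}{s_x} \\
r^* &= \frac{1}{n-1} \sum_{i=1}^{n}
  \left(\frac{x_i^* - \bar{x}^*}{s_{x^*}}\right)
  \left(\frac{y_i^* - \bar{y}^*}{s_{y^*}}\right)
  = \frac{1}{n-1} \sum_{i=1}^{n}
  \left(\frac{x_i - \bar{x}}{s_x}\right)
  \left(\frac{y_i - \bar{y}}{s_y}\right) = r
\end{aligned}
```

For the unit conversion in problem 3, a = 1/1.609 and c = 1.609/(100 × 3.785) ≈ 0.004251, with b = d = 0. Both scale factors are positive, so the standardized values, and therefore r, do not change. Miles per gallon is the reciprocal of gallons per mile, which is not of this linear form, so the argument does not apply and the correlation is free to change.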
