
# Applied Regression Analysis STAT 333


This 43 page Class Notes was uploaded by Mrs. Triston Collier on Thursday September 17, 2015. The Class Notes belongs to STAT 333 at University of Wisconsin - Madison taught by Staff in Fall. Since its upload, it has received 89 views. For similar materials see /class/205078/stat-333-university-of-wisconsin-madison in Statistics at University of Wisconsin - Madison.


Date Created: 09/17/15

## Statistics 333: Applied Regression Analysis (lecture of January 25, 2007)

### Announcements

- Some course materials will be uploaded to the course website soon.
- The issue with course enrollment will be handled soon.
- Any other issues? Please let Prof. Johnson know by email: rich@stat.wisc.edu

### Agenda for Today's Lecture

Plan to cover:

1. Overview of simple regression
2. Introduction to R

### Regression Analysis

Regression analysis concerns the study of relationships between variables, with the objectives of

1. identifying the relationship,
2. estimating the parameters,
3. validating the estimates,
4. predicting the response variable.

### Regression with a Single Predictor

Notation:

- $x$: independent variable, predictor variable, or input variable
- $y$: dependent variable or response variable

Example: dosage $x$ (in milligrams) and the number of days of relief $y$ from allergy for ten patients.

[Table: dosage and duration of relief for the ten patients; the values are not legible in the source.]

What would you do as a first step to analyze the relationship?

[Figure: scatter diagram of duration of relief $y$ against dosage $x$.]

### Straight Line Regression Model

Statistical model for a straight line regression: we assume that the response $Y$ is a random variable that is related to the input variable $x$ by

$$Y_i = \beta_0 + \beta_1 x_i + e_i, \qquad i = 1, 2, \dots, n,$$

where

1. $Y_i$ denotes the response corresponding to the $i$th experimental run, in which the input variable $x$ is set at the value $x_i$.
2. $e_1, \dots, e_n$ are the unobservable random errors, which we assume are independently and normally distributed with mean zero and an unknown standard deviation $\sigma$.
3. The parameters $\beta_0$ and $\beta_1$, which together locate the straight line, are unknown.

Assume the statistical model above is correct. How do we then fit the straight line that best describes the relationship of $y$ to $x$ in the scatter diagram?

[Figure: scatter diagram of duration of relief $y$ against dosage $x$, with candidate straight lines.]

### Method of Least Squares

Suppose that an arbitrary line $y = b_0 + b_1 x$ is drawn on the scatter diagram.

Principle of least squares: determine the values for the parameters so that the overall discrepancy

$$D = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$

is minimized. The parameter values thus determined are called the least squares estimates.

- Least squares estimator of $\beta_0$: $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$
- Least squares estimator of $\beta_1$: $\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}$
- Fitted (or estimated) regression line: $\hat y = \hat\beta_0 + \hat\beta_1 x$
- Residuals: $\hat e_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$, $i = 1, \dots, n$

[Figure: scatter diagram with the fitted line $\hat y = -1.07 + 2.74x$.]

By using the fitted regression line, we can predict a response $y$ for a specified $x$ value.

### Inferences for $\beta_1$ and $\beta_0$

[R output: summary of the linear model fit of $y$ on $x$ (residual quartiles, coefficient estimates with standard errors, t values and p-values, residual standard error, multiple and adjusted R-squared, F-statistic); the numeric values are not legible in the source.]

## Stat 333, Spring 2004: Discussion 11 (4/15/2004)

### 1. Dummy Variables

Example (Exercise C, Chapter 14, page 318). Bars of soap are scored for their appearance in a manufacturing operation. These scores are on a 1-10 scale, and the higher the score the better. Differences in operator performance and in the speed of the manufacturing line are believed to measurably affect the quality of the appearance. The following data were collected on this problem:

| Operator | Line Speed | Appearance (Sum for 30 Bars) |
|---|---|---|
| 1 | 150 | 255 |
| 1 | 175 | 246 |
| 1 | 200 | 249 |
| 2 | 150 | 260 |
| 2 | 175 | 223 |
| 2 | 200 | 231 |
| 3 | 150 | 265 |
| 3 | 175 | 247 |
| 3 | 200 | 256 |

1. Using dummy variables, fit a multiple regression model to these data.
2. Using the regression model, demonstrate whether operator differences are important in bar appearance.
3. Does
line speed affect appearance?
4. What model would you use to predict bar appearance?

[Figure: scatterplot of Appearance (Y) versus Line Speed ($x_1$).]

Ting-Li Lin, 4268 CSSC, ting-li@stat.wisc.edu

#### Model with operator dummies and interactions

Consider the model

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_1 X_2 + \beta_5 X_1 X_3 + \epsilon.$$

Regression Analysis: Appearance (Y) versus Line Speed ($x_1$), $x_2$, $x_3$, $x_1 x_2$, $x_1 x_3$

The regression equation is

Appearance (Y) = 288 - 0.180 Line Speed ($x_1$) - 16.5 $x_2$ + 52.0 $x_3$ + 0.060 $x_1 x_2$ - 0.400 $x_1 x_3$

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 287.50 | 63.23 | 4.55 | 0.020 |
| Line Speed ($x_1$) | -0.1800 | 0.3588 | -0.50 | 0.650 |
| $x_2$ | -16.50 | 89.42 | -0.18 | 0.865 |
| $x_3$ | 52.00 | 89.42 | 0.58 | 0.602 |
| $x_1 x_2$ | 0.0600 | 0.5075 | 0.12 | 0.913 |
| $x_1 x_3$ | -0.4000 | 0.5075 | -0.79 | 0.488 |

S = 12.6886, R-Sq = 67.1%, R-Sq(adj) = 12.1%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 5 | 983.0 | 196.6 | 1.22 | 0.463 |
| Residual Error | 3 | 483.0 | 161.0 | | |
| Total | 8 | 1466.0 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Line Speed ($x_1$) | 1 | 322.7 |
| $x_2$ | 1 | 18.0 |
| $x_3$ | 1 | 486.0 |
| $x_1 x_2$ | 1 | 56.3 |
| $x_1 x_3$ | 1 | 100.0 |

[Figure: residual plots for Appearance (Y): normal probability plot, residuals versus the fitted values, histogram of the residuals, residuals versus the order of the data.]

#### Model with operator dummies only

Consider the model

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \epsilon.$$

Regression Analysis: Appearance (Y) versus Line Speed ($x_1$), $x_2$, $x_3$

The regression equation is

Appearance (Y) = 307 - 0.293 Line Speed ($x_1$) - 6.00 $x_2$ - 18.0 $x_3$

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | (illegible) | (illegible) | (illegible) | (illegible) |
| Line Speed ($x_1$) | -0.2933 | 0.1847 | -1.59 | 0.173 |
| $x_2$ | -6.000 | (illegible) | (illegible) | (illegible) |
| $x_3$ | -18.000 | 9.233 | -1.95 | 0.109 |

S = 11.3078, R-Sq = 56.4%, R-Sq(adj) = 30.2%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 3 | 826.7 | 275.6 | 2.16 | 0.212 |
| Residual Error | 5 | 639.3 | 127.9 | | |
| Total | 8 | 1466.0 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Line Speed ($x_1$) | 1 | 322.7 |
| $x_2$ | 1 | 18.0 |
| $x_3$ | 1 | 486.0 |

[Figure: residual plots for Appearance (Y), same four panels as above.]

#### Model with interactions only

Consider the model

$$Y = \beta_0 + \beta_1 X_1 + \beta_4 X_1 X_2 + \beta_5 X_1 X_3 + \epsilon.$$

Regression Analysis: Appearance (Y) versus Line Speed ($x_1$), $x_1 x_2$, $x_1 x_3$

The regression equation is

Appearance (Y) = 299 - 0.247 Line Speed ($x_1$) - 0.0330 $x_1 x_2$ - 0.107 $x_1 x_3$

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 299.33 | 31.15 | (illegible) | (illegible) |
| Line Speed ($x_1$) | (illegible) | (illegible) | (illegible) | (illegible) |
| $x_1 x_2$ | (illegible) | (illegible) | (illegible) | (illegible) |
| $x_1 x_3$ | -0.10685 | 0.05017 | -2.13 | 0.086 |

S = 10.8253, R-Sq = 60.0%, R-Sq(adj) = 36.1%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 3 | 880.1 | 293.4 | 2.50 | 0.174 |
| Residual Error | 5 | 585.9 | 117.2 | | |
| Total | 8 | 1466.0 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Line Speed ($x_1$) | 1 | 322.7 |
| $x_1 x_2$ | 1 | 25.8 |
| $x_1 x_3$ | 1 | 531.6 |

[Figure: residual plots for Appearance (Y), same four panels as above.]

#### Model with operator dummies and a quadratic term in line speed

Consider the model

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{11} X_1^2 + \epsilon.$$

Regression Analysis: Appearance (Y) versus Line Speed ($x_1$), $x_2$, $x_3$, $x_1^2$

The regression equation is

Appearance (Y) = 984 - 8.13 Line Speed ($x_1$) - 6.00 $x_2$ - 18.0 $x_3$ + 0.0224 $x_1^2$

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 984.0 | 269.7 | 3.65 | 0.022 |
| Line Speed ($x_1$) | -8.133 | 3.116 | -2.61 | 0.059 |
| $x_2$ | -6.000 | 6.420 | -0.93 | 0.403 |
| $x_3$ | -18.000 | 6.420 | -2.80 | 0.049 |
| $x_1^2$ | 0.022400 | 0.008896 | 2.52 | 0.066 |

S = 7.86342, R-Sq = 83.1%, R-Sq(adj) = 66.3%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 4 | 1218.67 | 304.67 | 4.93 | 0.076 |
| Residual Error | 4 | 247.33 | 61.83 | | |
| Total | 8 | 1466.00 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Line Speed ($x_1$) | 1 | 322.67 |
| $x_2$ | 1 | 18.00 |
| $x_3$ | 1 | 486.00 |
| $x_1^2$ | 1 | 392.00 |

[Figure: residual plots for Appearance (Y), same four panels as above.]

## Stat 333, Spring 2004: Discussion 5 (2/26/2004)

### 1. Matrix

Let $A_{p \times q} = (a_{ij})$, $B_{p \times q} = (b_{ij})$, and $C_{q \times p} = (c_{ij})$. Show that $(A')' = A$, $(A + B)' = A' + B'$, $(AC)' = C'A'$, and, when $A$ is square and invertible, $(A^{-1})' = (A')^{-1}$.

### 2. General Linear Regression

Consider Galileo's data:

| Initial Height (punti) | Horizontal Distance (punti) |
|---|---|
| 100 | 253 |
| 200 | 337 |
| 300 | 395 |
| 450 | 451 |
| 600 | 495 |
| 800 | 534 |
| 1000 | 573 |

Determine the matrices $X$, $Y$, $X'X$, $(X'X)^{-1}$, and $\hat\beta = (X'X)^{-1}X'Y$ for the following models:

(a) $\text{distance}_i = \beta_0 + \beta_1\,\text{height}_i + \epsilon_i$

(b) $\text{distance}_i = \beta_0 + \beta_1\,\text{height}_i + \beta_2\,\text{height}_i^2 + \epsilon_i$

Now fit the models using Minitab.

#### (a) $\text{distance}_i = \beta_0 + \beta_1\,\text{height}_i + \epsilon_i$

Regression Analysis: Distance versus Height

The regression equation is

Distance = 270 + 0.333 Height

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 269.71 | 24.31 | 11.09 | 0.000 |
| Height | 0.33334 | 0.04203 | 7.93 | 0.001 |

S = 33.6785, R-Sq = 92.6%, R-Sq(adj) = 91.2%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 1 | 71351 | 71351 | 62.91 | 0.001 |
| Residual Error | 5 | 5671 | 1134 | | |
| Total | 6 | 77022 | | | |

Regression Analysis: Distance versus Height/100

The regression equation is

Distance = 270 + 33.3 Height/100

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 269.71 | 24.31 | 11.09 | 0.000 |
| Height/100 | 33.334 | 4.203 | 7.93 | 0.001 |

S = 33.6785, R-Sq = 92.6%, R-Sq(adj) = 91.2%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 1 | 71351 | 71351 | 62.91 | 0.001 |
| Residual Error | 5 | 5671 | 1134 | | |
| Total | 6 | 77022 | | | |

#### (b) $\text{distance}_i = \beta_0 + \beta_1\,\text{height}_i + \beta_2\,\text{height}_i^2 + \epsilon_i$

Regression Analysis: Distance versus Height, Height^2

The regression equation is

Distance = 200 + 0.708 Height - 0.000344 Height^2

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 199.91 | 16.76 | 11.93 | 0.000 |
| Height | 0.70832 | 0.07482 | 9.47 | 0.001 |
| Height^2 | -0.00034369 | 0.00006678 | -5.15 | 0.007 |

S = 13.6389, R-Sq = 99.0%, R-Sq(adj) = 98.6%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 76278 | 38139 | 205.03 | 0.000 |
| Residual Error | 4 | 744 | 186 | | |
| Total | 6 | 77022 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Height | 1 | 71351 |
| Height^2 | 1 | 4927 |

Regression Analysis: Distance versus Height/100, (Height/100)^2

The regression equation is

Distance = 200 + 70.8 Height/100 - 3.44 (Height/100)^2

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 199.91 | 16.76 | 11.93 | 0.000 |
| Height/100 | 70.832 | 7.482 | 9.47 | 0.001 |
| (Height/100)^2 | -3.4369 | 0.6678 | -5.15 | 0.007 |

S = 13.6389, R-Sq = 99.0%, R-Sq(adj) = 98.6%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 76278 | 38139 | 205.03 | 0.000 |
| Residual Error | 4 | 744 | 186 | | |
| Total | 6 | 77022 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Height/100 | 1 | 71351 |
| (Height/100)^2 | 1 | 4927 |

## Stat 333, Spring 2004: Discussion 8 (3/25/2004)

### 1. Durbin-Watson Statistic

Test procedures (pp. 184-185, Draper and Smith):

1. $H_0: \rho = 0$ vs $H_A: \rho > 0$
2. $H_0: \rho = 0$ vs $H_A: \rho < 0$
3. $H_0: \rho = 0$ vs $H_A: \rho \neq 0$

Example: Consider the steam data. Fit the model $Y = \beta_0 + \beta_1 X_6 + \beta_2 X_8 + \epsilon$.
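The statistic behind these three tests is computed from the residuals taken in time order. The sketch below is a minimal illustration in plain Python, using the standard definition $d = \sum_{t=2}^{n}(e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$. The function name `durbin_watson` and the residual values are hypothetical and are not taken from the steam data.

```python
def durbin_watson(residuals):
    """Return d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    # Squared differences between each residual and the one before it.
    numerator = sum(
        (cur - prev) ** 2 for prev, cur in zip(residuals, residuals[1:])
    )
    return numerator / denominator


# Hypothetical residuals in time order (an assumption, not the steam data).
example_residuals = [0.5, -0.3, 0.4, -0.6, 0.2, -0.1, 0.3, -0.4]
d = durbin_watson(example_residuals)
print(f"Durbin-Watson d = {d:.4f}")
```

The value of $d$ always lies between 0 and 4. Values near 2 are consistent with $\rho = 0$, values toward 0 point to positive serial correlation, and values toward 4 point to negative serial correlation. The decision itself still requires the tabulated bounds referenced in the test procedures.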
Regression Analysis: Y versus x6, x8

The regression equation is

Y = 9.13 + 0.203 x6 - 0.0724 x8

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 9.127 | 1.103 | 8.28 | 0.000 |
| x6 | 0.20282 | 0.04577 | 4.43 | 0.000 |
| x8 | -0.072393 | 0.007999 | -9.05 | 0.000 |

S = 0.661565, R-Sq = 84.9%, R-Sq(adj) = 83.5%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 54.187 | 27.094 | 61.90 | 0.000 |
| Residual Error | 22 | 9.629 | 0.438 | | |
| Total | 24 | 63.816 | | | |

| Source | DF | Seq SS |
|---|---|---|
| x6 | 1 | 18.342 |
| x8 | 1 | 35.845 |

Unusual Observations

| Obs | x6 | Y | Fit | SE Fit | Residual | St Resid |
|---|---|---|---|---|---|---|
| 7 | 11.0 | 6.360 | 5.972 | 0.443 | 0.388 | 0.79 X |
| 11 | 20.0 | 8.240 | 9.824 | 0.143 | -1.584 | -2.45 R |
| 19 | 11.0 | 6.830 | 6.290 | 0.437 | 0.540 | 1.09 X |

R denotes an observation with a large standardized residual. X denotes an observation whose X value gives it large influence.

Durbin-Watson statistic = 2.19546

[Figure: residual plots for Y: normal probability plot, residuals versus the fitted values, histogram of the residuals, residuals versus the order of the data.]

## Stat 333, Spring 2004: Discussion 9 (4/1/2004)

### 1. Internally and Externally Studentized Residuals

Consider the model $Y = X\beta + \epsilon$.

(a) The residuals can be expressed as

$$e = Y - \hat Y = Y - HY = (I - H)Y,$$

where $H = X(X'X)^{-1}X'$ is called the hat matrix.

(b) Internally studentized residuals:

$$s_i = \frac{e_i}{s\sqrt{1 - h_{ii}}}$$

(c) Externally studentized residuals:

$$t_i = \frac{e_i}{s(i)\sqrt{1 - h_{ii}}},$$

which follows the $t$ distribution with $n - p - 1$ degrees of freedom. Here $h_{ii}$ is the $i$th diagonal element of $H$, and $s(i)$ is the residual standard deviation estimated with the $i$th observation deleted.

Example 1. Given the data

[Table: five observations of $X$ and $Y$; the values are not legible in the source.]

consider the model $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$. Compute the internally studentized residual and the externally studentized residual for the observation $X = 3$.

$$H = X(X'X)^{-1}X' = \begin{pmatrix} 0.6 & 0.4 & 0.2 & 0 & -0.2 \\ 0.4 & 0.3 & 0.2 & 0.1 & 0 \\ 0.2 & 0.2 & 0.2 & 0.2 & 0.2 \\ 0 & 0.1 & 0.2 & 0.3 & 0.4 \\ -0.2 & 0 & 0.2 & 0.4 & 0.6 \end{pmatrix}$$

Example 2. Consider the data from Table 8.1: Age at First Word ($X$) and Gesell Adaptive Score ($Y$). Fit the model $Y = \beta_0 + \beta_1 X + \epsilon$ using Minitab. Is there any outlier?

Data Display

| Row | X | Y | RESI1 | SRES1 | TRES1 | COOK1 | FITS1 |
|---|---|---|---|---|---|---|---|
| 1 | 15 | 95 | 2.0310 | 0.18883 | 0.18397 | 0.000897 | 92.969 |
| 2 | 26 | 71 | -9.5721 | -0.94441 | -0.94158 | 0.081498 | 80.572 |
| 3 | 10 | 83 | -15.6040 | -1.46226 | -1.51081 | 0.071658 | 98.604 |
| 4 | 9 | 91 | -8.7309 | -0.82158 | -0.81426 | 0.025616 | 99.731 |
| 5 | 15 | 102 | 9.0310 | 0.83966 | 0.83286 | 0.017744 | 92.969 |
| 6 | 20 | 87 | -0.3341 | -0.03147 | -0.03063 | 0.000039 | 87.334 |
| 7 | 18 | 93 | 3.4120 | 0.31892 | 0.31125 | 0.003131 | 89.588 |
| 8 | 11 | 100 | 2.5230 | 0.23567 | 0.22972 | 0.001668 | 97.477 |
| 9 | 8 | 104 | 3.1421 | 0.29716 | 0.28991 | 0.003832 | 100.858 |
| 10 | 20 | 94 | 6.6659 | 0.62797 | 0.61766 | 0.015440 | 87.334 |
| 11 | 7 | 113 | 11.0151 | 1.04798 | 1.05085 | 0.054810 | 101.985 |
| 12 | 9 | 96 | -3.7309 | -0.35108 | -0.34283 | 0.004678 | 99.731 |
| 13 | 10 | 83 | -15.6040 | -1.46226 | -1.51081 | 0.071658 | 98.604 |
| 14 | 11 | 84 | -13.4770 | -1.2588 | -1.27978 | 0.047598 | 97.477 |
| 15 | 11 | 102 | 4.5230 | 0.42248 | 0.41315 | 0.005361 | 97.477 |
| 16 | 10 | 100 | 1.3960 | 0.13083 | 0.12739 | 0.000574 | 98.604 |
| 17 | 12 | 105 | 8.6500 | 0.80601 | 0.79828 | 0.017856 | 96.350 |
| 18 | 42 | 57 | -5.5403 | -0.85154 | -0.84511 | 0.678112 | 62.540 |
| 19 | 17 | 121 | 30.2850 | 2.82337 | 3.60698 | 0.223288 | 90.715 |
| 20 | 11 | 86 | -11.4770 | -1.07201 | -1.07648 | 0.034519 | 97.477 |
| 21 | 10 | 100 | 1.3960 | 0.13083 | 0.12739 | 0.000574 | 98.604 |

[Figure: residual plots, including residuals versus the fitted values.]

### 2. Cook's Statistic

$$D_i = \frac{(\hat Y - \hat Y(i))'(\hat Y - \hat Y(i))}{p s^2} = \frac{(b - b(i))'X'X(b - b(i))}{p s^2} = \frac{s_i^2}{p} \cdot \frac{h_{ii}}{1 - h_{ii}}$$

Here $b(i)$ and $\hat Y(i)$ are the coefficient estimates and fitted values obtained with the $i$th observation deleted, and $s_i$ is the internally studentized residual.

Example 3. In Example 1, compute $D_3$.

Example 4. In Example 2, is there any influential observation?

[Figure: plot of Cook's statistic against observation number.]

### 3. Some Comments on SAS Output

y versus x1 and x2:

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| x1 | 1 | 33.82727273 | 33.82727273 | 113.22 | 0.0001 |
| x2 | 1 | 0.17881773 | 0.17881773 | 0.60 | 0.4741 |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| x1 | 1 | 25.63685868 | 25.63685868 | 85.80 | 0.0002 |
| x2 | 1 | 0.17881773 | 0.17881773 | 0.60 | 0.4741 |

y versus x2 and x1:

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| x2 | 1 | 8.36823077 | 8.36823077 | 28.01 | 0.0032 |
| x1 | 1 | 25.63685868 | 25.63685868 | 85.80 | 0.0002 |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| x2 | 1 | 0.17881773 | 0.17881773 | 0.60 | 0.4741 |
| x1 | 1 | 25.63685868 | 25.63685868 | 85.80 | 0.0002 |

## Stat 333, Spring 2004: Discussion 4 (2/19/2004)

### 1. Inverse of a Matrix

The inverse of

$$S = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & k \end{pmatrix}$$

is given by

$$S^{-1} = \frac{1}{Z}\begin{pmatrix} A & B & C \\ D & E & F \\ G & H & K \end{pmatrix},$$

where

$$\begin{aligned} A &= ek - fh, & B &= -(bk - ch), & C &= bf - ce, \\ D &= -(dk - fg), & E &= ak - cg, & F &= -(af - cd), \\ G &= dh - eg, & H &= -(ah - bg), & K &= ae - bd, \end{aligned}$$

and

$$Z = a(ek - fh) - b(dk - fg) + c(dh - eg) = aek + bfg + cdh - ahf - dbk - gec.$$

### 2. LSE for $\beta$

Model: $Y = X\beta + \epsilon$. If $(X'X)^{-1}$ exists, then the LSE for $\beta$ is given by

$$\hat\beta = (X'X)^{-1}X'Y.$$

Furthermore, the variance-covariance matrix of $\hat\beta$ is

$$\operatorname{Var}(\hat\beta) = (X'X)^{-1}\sigma^2.$$

### 3. The ANOVA Table

| Source | df | SS | MS |
|---|---|---|---|
| Regression $\mid b_0$ | $p - 1$ | $\hat\beta'X'Y - n\bar Y^2$ | |
| Residual | $n - p$ | $Y'Y - \hat\beta'X'Y$ | $s^2$ |
| Total (corrected) | $n - 1$ | $Y'Y - n\bar Y^2$ | |

### 4. Example

Consider the following data:

| Y | X1 |
|---|---|
| 8 | 1 |
| 13 | 1 |
| 12 | 1 |
| 11 | 1 |
| 9 | 1 |
| 8 | 1 |
| 7 | 1 |
| 13 | 1 |
| 11 | 0 |
| 13 | 0 |

[The X2 column is mostly illegible in the source: only four entries, all equal to 1, survive, and their row positions cannot be recovered.]

Also, consider the model $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i$. Answer the following questions:

(a) Calculate $X'Y$ and $X'X$.

(b) Find the LSE for $\beta$.

(c) Construct the ANOVA table.

(d) Find the estimate of $\sigma^2$.

(e) Find the variance-covariance matrix of $\hat\beta$.
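The steps of this example can be traced numerically. The sketch below is one possible illustration in plain Python: it inverts $X'X$ with the 3x3 cofactor formula from section 1, forms $\hat\beta = (X'X)^{-1}X'Y$, and computes the ANOVA sums of squares from section 3. The helper names (`transpose`, `matmul`, `inverse_3x3`, `fit`) and the small data set are hypothetical; the handout's own X2 column is not recoverable, so its data are not used here.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def inverse_3x3(s):
    """Invert a 3x3 matrix: signed cofactors A..K divided by the determinant Z."""
    (a, b, c), (d, e, f), (g, h, k) = s
    z = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    if z == 0:
        raise ValueError("matrix is singular; (X'X)^-1 does not exist")
    cof = [
        [e * k - f * h, -(b * k - c * h), b * f - c * e],
        [-(d * k - f * g), a * k - c * g, -(a * f - c * d)],
        [d * h - e * g, -(a * h - b * g), a * e - b * d],
    ]
    return [[v / z for v in row] for row in cof]


def fit(x_rows, y):
    """Least squares fit for Y = b0 + b1*X1 + b2*X2 + error, with ANOVA pieces."""
    n, p = len(y), 3
    X = [[1.0, x1, x2] for x1, x2 in x_rows]        # design matrix with intercept
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [row[0] for row in matmul(Xt, [[v] for v in y])]
    XtX_inv = inverse_3x3(XtX)
    beta = [sum(r * v for r, v in zip(row, XtY)) for row in XtX_inv]
    yty = sum(v * v for v in y)                     # Y'Y
    bxy = sum(b * v for b, v in zip(beta, XtY))     # beta_hat' X'Y
    correction = sum(y) ** 2 / n                    # n * Ybar^2
    ss_residual = yty - bxy
    s2 = ss_residual / (n - p)                      # estimate of sigma^2
    return {
        "beta": beta,
        "ss_regression": bxy - correction,
        "ss_residual": ss_residual,
        "ss_total": yty - correction,
        "s2": s2,
        "var_beta": [[v * s2 for v in row] for row in XtX_inv],
    }


# Hypothetical data (an assumption for illustration only).
x_rows = [(1, 0), (2, 1), (3, 0), (4, 2), (5, 1)]
y = [3.2, 7.9, 7.1, 14.8, 14.1]
result = fit(x_rows, y)
print(result["beta"], result["s2"])
```

The returned dictionary lines up with questions (b) through (e): `beta` is the LSE, the three sums of squares fill the ANOVA table, `s2` estimates $\sigma^2$, and `var_beta` is $(X'X)^{-1}s^2$, the estimated variance-covariance matrix of $\hat\beta$.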
