Environmental Analysis & Modeling, IES 612
This 0 page Class Notes was uploaded by Mrs. Gerson Lind on Sunday November 1, 2015. The Class Notes belongs to IES 612 at Miami University taught by Staff in Fall. Since its upload, it has received 12 views. For similar materials see /class/233371/ies-612-miami-university in Environmental Science at Miami University.
Date Created: 11/01/15
IES 612/STA 4573/STA 4576    07:40 Sunday, January 08, 2006
Spring 2006, Week 02 - IES612 lecture week02.doc - UPDATED: 08 Jan 2006

Using SAS

Check out www.muohio.edu/quantapps for links to SAS help.

Confidence Interval for $\beta_1$

$$b_1 \pm t_{\alpha/2,\,n-2}\,SE(b_1)$$

Example: Manatee data - 90% CI for the SLOPE

90% CI => $\alpha = 0.10$ => $\alpha/2 = 0.05$ => $t_{0.05,\,12} = 1.782$
n = 14 => n - 2 = 12;  SE(b1) = 0.0129;  b1 = 0.125

0.125 ± 1.782(0.0129) = 0.125 ± 0.023  =>  0.102 < $\beta_1$ < 0.148

Could use SAS to do this calculation (tconfint.sas; tinv / quantile, v8/9 function):

    options ls=80;
    data myci;
      b1 = 0.12486;                    /* slope estimate */
      SE = 0.01290;                    /* Std Error of b1 */
      Tcrit = quantile('T', .95, 12);  /* Pr(T(12) < Tcrit) = 0.95 */
      /* Tcrit = tinv(.95, 12); */
      /* Comment: Area LEFT of Tcrit = .95, Area RIGHT of Tcrit = .05 */
      ME  = Tcrit*SE;
      LCL = b1 - ME;
      UCL = b1 + ME;
    proc print;
    run;

From PROC PRINT results in the SAS LISTING file:

    Obs      b1        SE      Tcrit        ME        LCL       UCL
      1   0.12486   0.0129   1.78229   0.022992   0.10187   0.14785

F-Test of $\beta_1$

H0: $\beta_1 = 0$
Ha: $\beta_1 \ne 0$
TS: $F_{obs} = \dfrac{SSReg/1}{SSResid/(n-2)}$
RR: Reject H0 if $F_{obs} > F_{\alpha,\,1,\,n-2}$
Conclusions

where

$$SSReg = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad SSResid = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

FROM SAS OUTPUT

                        Analysis of Variance
                               Sum of         Mean
    Source             DF     Squares       Square   F Value   Pr > F
    Model               1  1711.97866   1711.97866     93.61   <.0001
    Error              12   219.44991     18.28749
    Corrected Total    13  1931.42857

$F_{obs}$ = 93.61 with associated P-value < 0.0001
SSReg = 1711.97866 and SSResid = 219.44991
$s^2$ = MSE = 18.28749

Thoughts about the ingredients of an ANOVA table:

1. ANOVA = ANalysis Of VAriance.
2. Sum of Squares represents a partitioning of the TOTAL variation into variability explained by a model (the linear regression model here) and the variability NOT explained (residual/error).
3. SSTotal (Corrected Total SS = 1931.43 above) is partitioned into the SSRegression (Model SS = 1711.98 above) and SSResidual (Error SS = 219.45).
4. Mean Squares (MS) are defined as SS / degrees of freedom.
5. A good regression model will have SSRegression >> SSResidual, which often translates into a large value of $F_{obs}$.
6. Alternative interpretation: SSResidual = error in predicting response y when using the linear regression model; SSTotal = error in predicting response y when
using YBAR; SSRegression = SSTotal - SSResidual measures how much better the YHAT prediction model is when compared to YBAR (more to come later).

Alternatively: T-Test of $\beta_1$

H0: $\beta_1 = 0$
Ha: $\beta_1 \ne 0$ (some assoc.)  |  $\beta_1 < 0$ (negative assoc.)  |  $\beta_1 > 0$ (positive association)
TS: $t_{obs} = \dfrac{b_1 - 0}{s/\sqrt{S_{xx}}}$
RR: Reject H0 if $|t_{obs}| > t_{\alpha/2,\,n-2}$  |  $t_{obs} < -t_{\alpha,\,n-2}$  |  $t_{obs} > t_{\alpha,\,n-2}$
Conclusions: Reject / Fail to reject H0
P-value: $P(|t_{n-2}| > |t_{obs}|)$  |  $P(t_{n-2} < t_{obs})$  |  $P(t_{n-2} > t_{obs})$

Take a look at the Manatee example from SAS output:

                        Parameter Estimates
                      Parameter     Standard
    Variable    DF     Estimate        Error   t Value   Pr > |t|
    Intercept    1    -41.43044      7.41222     -5.59     0.0001
    nboats       1      0.12486      0.01290      9.68     <.0001

H0: $\beta_1 = 0$
Ha: $\beta_1 \ne 0$ (some assoc.)
TS: $t_{obs} = \dfrac{b_1 - 0}{s/\sqrt{S_{xx}}} = \dfrac{0.12486}{0.01290} = 9.68$
P-value < 0.0001
Decision/Conclusion: REJECT H0 and conclude that there is a linear relationship between the number of manatees killed and the number of boats registered in Florida.

Comments:
* Always write your conclusions in the words of the problem. Translate the symbol representation back to the real world.
* A confidence interval demonstrates the magnitude of the linear effect.
* Tests and confidence intervals are related. For example, if a 100(1-$\alpha$)% confidence interval for a parameter, say $\beta_1$, does NOT contain 0 [e.g. 0.102 < $\beta_1$ < 0.148], then you would reject H0: $\beta_1 = 0$ in favor of Ha: $\beta_1 \ne 0$ at significance level $\alpha$.

P-value calculation in SAS:

    options ls=80;
    data myci;
      b1 = 0.12486;      /* slope estimate */
      SE = 0.01290;      /* Std Error of b1 */
      tcalc = 9.68;      /* t statistic value */
      df = 12;
      Plower   = probt(tcalc, df);
      Pupper   = 1 - probt(tcalc, df);
      Ptwotail = 2*(1 - probt(abs(tcalc), df));
      /* Note: SAS version 9 uses CDF as a generalization of probt */
    proc print;
    run;

From PROC PRINT results in the SAS LISTING file:

    Obs      b1        SE     tcalc   df    Plower       Pupper      Ptwotail
      1   0.12486   0.0129    9.68    12   1.00000   .000000254   .000000508

Hypothesis tests / confidence intervals for the intercept $\beta_0$ are similar.

Can you select design points to have more precision when estimating the slope?

Remedial Measures and Transformations

RECALL: Basic Model

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \quad \text{(simple linear regression)}$$

Y = response
variable (dependent variable); X = predictor variable (independent variable, covariate)

Formal assumptions:
1. Relation linear - on average, error = 0: $E(\varepsilon_i) = 0$ => $E(Y_i) = \beta_0 + \beta_1 X_i$
2. Constant variance: $V(\varepsilon_i) = \sigma^2$ => $V(Y_i) = \sigma^2$
3. $\varepsilon_i$ independent
4. $\varepsilon_i \sim$ Normal

We will talk more about model adequacy. Now, a few remarks about a special case when the first assumption might be violated. There may be times when a nonlinear relationship might be modeled by linear regression.

Example: MPH and Vehicle Density on a Connecticut Highway

[Figure: Plot of MPH vs Vehicle Density on a CT highway; x-axis: Vehicle Density (vehicles/mile), 0 to 160]

What if we plot the Log(MPH) vs Vehicle Density?

[Figure: Plot of Log10(MPH) vs Vehicle Density on a CT highway; x-axis: Density (vehicles per mile), 0 to 160]

Ref: http://lib.stat.cmu.edu/DASL/Datafiles/transformationdat.html and B.D. Greenshields and F.M. Weida, Statistics with Applications to Highway Traffic Analysis, Eno Foundation, 1978, 129-131 (DENS, MPH below).

Other common examples: exponential growth and decay. LOG10 transformations are also commonly used when the range of the response or predictor variable spans many orders of magnitude (e.g. per capita GNP, population size, geographic area).

Other Inference in Regression - Average responses or prediction of new observations at a particular value of X

X values in the dataset: $X_1, \ldots, X_n$
Denote new value of X: $X_{n+1}$

Prediction of the mean response or new response at this X value:

$$\hat{y}_{n+1} = b_0 + b_1 x_{n+1}$$

SE of this prediction [AVG]:

$$SE(\hat{y}_{n+1}) = s\sqrt{\frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}}}$$

Confidence Interval for the Mean Response:

$$\hat{y}_{n+1} \pm t_{\alpha/2,\,n-2}\; s\sqrt{\frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}}}$$

Observation: As $X_{n+1}$ gets farther from $\bar{x}$, the SE of the prediction increases - an extrapolation penalty. <See sketch>

Prediction Interval for a New Response

Both the uncertainty in the location of the MEAN RESPONSE and the variability associated with an individual value, given the mean response, must be considered:

$$\hat{y}_{n+1} \pm t_{\alpha/2,\,n-2}\; s\sqrt{1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}}}$$

Comment: SAS Proc GLM
options clm (mean response CI) and cli (prediction intervals).

From Manatee SAS output:

         Dep Var  Predicted  Std Error                                                          Std Error   Student
    Obs manatees      Value  Mean Predict     95% CL Mean        95% CL Predict     Residual     Residual  Residual
      1  13.0000    14.3827        1.9299  10.1779  18.5876     4.1604  24.6050      -1.3827        3.816    -0.362
      2  21.0000    16.0059        1.7974  12.0896  19.9222     5.8989  26.1130       4.9941        3.880     1.287
      3  24.0000    18.6280        1.5976  15.1472  22.1089     8.6816  28.5745       5.3720        3.967     1.354
      4  16.0000    20.7507        1.4528  17.5853  23.9161    10.9102  30.5911      -4.7507        4.022    -1.181
      5  24.0000    22.6236        1.3420  19.6997  25.5475    12.8582  32.3891       1.3764        4.060     0.339
      6  20.0000    22.4987        1.3488  19.5600  25.4375    12.7288  32.2687      -2.4987        4.058    -0.616
      7  15.0000    24.2468        1.2622  21.4968  26.9968    14.5320  33.9616      -9.2468        4.086    -2.263
      8  34.0000    28.3672        1.1482  25.8656  30.8689    18.7198  38.0147       5.6328        4.119     1.367

Suppose $X_{n+1} = 559$ (corresponds to the 8th observation):

25.87 < $E(Y_{n+1})$ < 30.87
18.72 < $Y_{n+1}$ < 38.01

Correlation and Coefficient of Determination - Measures of strength of Association

Slope Estimator:

$$b_1 = \frac{S_{XY}}{S_{XX}} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

Correlation Coefficient:

$$r_{YX} = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \,\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} = b_1\,\frac{S_X}{S_Y}$$

So $r_{YX}$ = Estimated slope TIMES (SD_X / SD_Y) = rescaled slope estimate.

Observations:
1. Pearson product-moment correlation (other types of correlation coefficients defined - e.g. Spearman's rho)
2. $-1 \le r_{YX} \le 1$
3. $r_{YX} = 0$ IMPLIES no LINEAR relationship
4. correlation coefficient tends to increase as range increases
5. test of population correlation coefficient = 0 given but not discussed since equivalent to the test of slopes

SKETCH: various scatterplots associated with r = -0.9, r = -0.3, r = 0, r = 0.3, r = 0.9

Coefficient of Determination (R-square)

$$R^2 = r_{YX}^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = \frac{SSTotal - SSResid}{SSTotal}$$

= proportionate reduction in prediction error when using YHAT instead of YBAR to predict y
= proportion of total variability accounted for/explained by the linear regression model

Comments:
* Coefficient of determination = $r_{YX}^2$ = (correlation coefficient)$^2$ for simple linear
regression - NOT for multiple regression.
* When people report a significant correlation coefficient of 0.40 between two variables X and Y, recognize that this means that 16% (0.4 x 0.4) of the variation in one variable is accounted for by its linear association with some other variable.

SAS Proc CORR can be used to determine the correlation between variables.

Example: Manatee deaths and boats registered

    Root MSE           4.27639    R-Square    0.8864
    Dependent Mean    29.42857    Adj R-Sq    0.8769
    Coeff Var         14.53141

$r^2$ = 0.8864, so approx. 89% of the variation in the number of manatees killed is explained by a linear relationship with the number of boats registered.
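
The relationships among the ANOVA table, the t-test of the slope, and R-square can be checked with a short data step in the same style as the programs in these notes. This is only a sketch: the data set name mycheck and the variable names are made up for illustration, and the numeric inputs are copied by hand from the Manatee listings. The printed Fobs, Rsq, and MSE should agree with the SAS output (93.61, 0.8864, 18.28749), and tsq should match Fobs up to rounding of b1 and its standard error.

```sas
/* Sketch only: re-derive F, t, and R-square from the Manatee output.  */
/* mycheck and the variable names are hypothetical; the input values   */
/* are typed in from the ANOVA and Parameter Estimates tables.         */
options ls=80;
data mycheck;
  SSReg   = 1711.97866;              /* Model SS                        */
  SSResid = 219.44991;               /* Error SS                        */
  n       = 14;                      /* number of observations          */
  b1      = 0.12486;                 /* slope estimate                  */
  SEb1    = 0.01290;                 /* Std Error of b1                 */

  SSTotal = SSReg + SSResid;         /* Corrected Total SS              */
  MSE     = SSResid / (n - 2);       /* s**2, the Error Mean Square     */
  Fobs    = (SSReg / 1) / MSE;       /* F statistic on 1 and n-2 df     */
  tobs    = b1 / SEb1;               /* t statistic for H0: beta1 = 0   */
  tsq     = tobs**2;                 /* compare with Fobs               */
  Rsq     = SSReg / SSTotal;         /* coefficient of determination    */
  PvalF   = 1 - probf(Fobs, 1, n - 2);            /* upper-tail F area  */
  PvalT   = 2*(1 - probt(abs(tobs), n - 2));      /* two-tailed t area  */
run;
proc print data=mycheck;
run;
```

In simple linear regression the F-test and the two-sided t-test of the slope are the same test, so PvalF and PvalT should be essentially equal; small differences here come only from using the rounded values of b1 and SE(b1).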