# Week 3 notes Stat 239

Statistics for the Biological and Physical Sciences
Diane Lovett

Week 3 notes! test soon!
These 4 pages of class notes were uploaded by an elite notetaker on Thursday, September 17, 2015. The notes belong to Stat 239 at St. Cloud State University, taught by Diane Lovett in Summer 2015.

Date Created: 09/17/15
Statistics for Biology, Sept 8, Sept 10
Notes by Brook Hoffman

## 95% Rule

- If the distribution of the data set is approximately normal, then 95% of the observations will fall within 2 STD Dev of the mean.
- Only works for large samples.
- Gives the central 95%, with 2.5% left over on each side, which excludes any extreme outliers.

## Z-Scores

- The z-score for an observation is the number of STD Dev that observation lies from the mean.
- It is unitless.
- $z = (x - \text{mean}) / \text{STD Dev}$, where $x$ = observation.
- Below the mean = negative z-score. Above the mean = positive z-score.
- If a distribution is normal:
  - a z-score of 2 is at (approximately) the 97.5th percentile
  - a z-score of -2 is at (approximately) the 2.5th percentile

## Correlation

- Pertains to the relationship between 2 quantitative variables.
- Measures the strength of the linear relationship: how closely a scatterplot follows a straight line.
- Also unitless.
- Values:
  - -1 = perfect negative line
  - 0 = no linear relationship
  - 1 = perfect positive line (essentially never occurs in real-life data)
- Negative correlation: as x increases, the y variable decreases.
- Positive correlation: x and y increase together.
- Notation: $r$ = sample correlation, $\rho$ = population correlation (Greek letter rho).
- Any set of pairs of numeric variables can be measured for correlation.
- Correlation by itself does not show that x explains y or that y explains x.
- Cannot conclude causation, NO MATTER WHAT.
- $r$ = correlation coefficient, range $[-1, 1]$.
- $r^2$ = coefficient of determination, range $[0, 1]$: the proportion of the variability in y that can be explained by the fit to a line with x.

## Least Squares Linear Regression (Line of Best Fit)

- Recall: a straight line is completely specified by knowing its slope and intercept.
- $y_{\text{predicted}} = \text{slope} \cdot x + \text{intercept}$
  - Slope: rate of change of y with x.
  - Intercept: only shifts the line up or down in the units of the data. It has no direct interpretation unless $x = 0$ is included in the data set.
- Theoretical model: $y = \text{slope} \cdot x + \text{intercept} + \text{random error term}$
  - Real data have errors, so we must account for them.
- Residuals
  - For a linear fit, the residual is $y_{\text{observed}} - y_{\text{predicted}}$ for a given x.
  - The least squares fit considers all possible intercepts and slopes and picks the pair for which the sum of the squared residuals, $\sum (y - \hat{y})^2$, is minimized.
  - $y$ = observed value, $\hat{y}$ = predicted value, $\hat{y} = \text{slope} \cdot x + \text{intercept}$
- General form of describing a slope: as x increases by one unit, y changes by the slope (in units of y).
- Interpretation of the linear relationship should not go beyond the observed range of x, because the linear relationship cannot be assumed to continue.
- Residuals close to 0 mean the points and the line match up fairly closely. (For a least squares line with an intercept, the average residual is always 0, so the average alone does not measure how good the fit is.)
- Assumption of the line of best fit: the errors (residuals) average about zero.
- Don't want to see (signs of poor fit):
  - A trend in the residuals, which may indicate a nonlinear relationship.
  - Large residuals (outliers), which pull the line toward themselves to compensate.
- Features of all linear regressions:
  - The mean point $(\bar{x}, \bar{y})$ is always a point on the predicted line.
  - Regression to the mean: $z_{\hat{y}} = r \cdot z_x$ (z-score of the predicted response = correlation × z-score of x).
  - Extreme values in one variable tend to get less extreme in the other variable.
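
To make the z-score and correlation definitions from these notes concrete, here is a minimal Python sketch. The data (`hours`, `growth`) and the helper names are made up for illustration and are not from the lecture. It assumes the sample standard deviation (dividing by n - 1) and computes r as the sum of the products of paired z-scores divided by n - 1.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    """z = (x - mean) / STD Dev for every observation."""
    m, s = mean(values), sample_std(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Sample correlation r: sum of paired z-score products over n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data, for illustration only.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
growth = [2.1, 3.9, 6.2, 7.8, 10.1]

z = z_scores(hours)
r = correlation(hours, growth)

print("z-scores:", [round(v, 3) for v in z])  # below mean -> negative, above -> positive
print("r:", round(r, 4))
print("r squared:", round(r ** 2, 4))
```

Because z-scores are unitless, r is unitless too: rescaling `hours` into minutes would leave r unchanged.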

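The least squares ideas in these notes (slope, intercept, residuals, the mean point, and regression to the mean) can be checked numerically. The sketch below uses the standard closed-form formulas for a one-variable least squares line. The data and the function name `least_squares_fit` are hypothetical and chosen only for illustration.

```python
import math

def least_squares_fit(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # forces the line through the mean point
    return slope, intercept

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept = least_squares_fit(xs, ys)
predicted = [slope * x + intercept for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]  # observed - predicted

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Coefficient of determination: share of the variability in y explained by the line.
sse = sum(e ** 2 for e in residuals)
sst = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1 - sse / sst

# Regression to the mean: slope = r * (s_y / s_x), so z_predicted = r * z_x.
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1))
s_y = math.sqrt(sst / (len(ys) - 1))
r = slope * s_x / s_y

print("slope:", round(slope, 4), "intercept:", round(intercept, 4))
print("sum of residuals:", round(sum(residuals), 10))
print("r:", round(r, 4), "r squared:", round(r_squared, 4))
print("line at mean x:", round(slope * x_bar + intercept, 4), "mean y:", y_bar)
```

The residuals sum to zero for any least squares line with an intercept, which is why a residual plot (looking for trends or large outliers) says more about fit quality than the average residual does.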
