# STATISTICAL METHODS I STAT 515

Date Created: 10/26/15
(Possibly incomplete) list of topics covered in Sections 10.1-10.2 and Chapter 11

## Sections 10.1-10.2: One-way Analysis of Variance

- The null and alternate hypotheses for one-way analysis of variance.
- The assumptions for the one-way ANOVA are the same as for a two-sample t-test:
  - the samples are random and independent (need to know how it was sampled);
  - the populations are normally distributed (make a separate q-q plot for each group);
  - the variances of the populations are the same (use side-by-side boxplots or calculate the standard deviations).
- How the ANOVA table is constructed for the one-way analysis of variance:
  - MS = SS/df;
  - MSB (MST) is based on the variance between the treatment averages;
  - MSW (MSE) is based on the variance within each group separately;
  - TMS is the estimated variance when $H_0$ is true.
- If the assumptions are met, we can use the MS to construct an F test.
- How the F is constructed from the MS, and what null and alternate hypotheses it tests.
- That we reject the null hypothesis when the F statistic is large.
- Using the formulas for the SS to construct the ANOVA table.
- Finding the mean squares given the sum of squares and degrees of freedom.
- How the degrees of freedom are found: for total and error (within) it is the sample size minus the number of parameters estimated; for the treatment (between) it is the number of treatments minus 1.
- For two samples, the one-way ANOVA and the two-sample t-test with equal variances give the same p-value, and $F = t^2$.
- That the one-way ANOVA can be written with a model equation $y_{ij} = \mu_i + \varepsilon_{ij}$.
- How we write the assumptions for a one-way ANOVA using the model equation (same as for regression, see page 523).

## Chapter 11: Linear Regression

- The regression model equation as given in the middle of page 508.
- The four assumptions for linear regression as given on page 523.
- Independent vs. dependent variable.
- Correlation vs. causality.
- Extrapolation.
- The estimates of $\beta_0$ and $\beta_1$ are gotten by minimizing the sum of the squared residuals.
- What is meant by regression to the mean.
- The analysis of variance table:
  - TSS is the total amount of error/variation in Y;
  - SSR = TSS - SSE is the amount of error/variation we explained by using the regression line;
  - MSE is the estimated variance $\sigma^2$ of the points around the regression line;
  - TMS = Var(Y) is the estimated variance $\sigma^2$ when $\beta_1 = 0$.
- Because we assumed the residuals are normal, we can use the MS to do an F test.
- Why must SSE and SSR each be less than TSS?
- Finding the mean squares given the sum of squares and degrees of freedom.
- How the degrees of freedom are found: for total and error it is the sample size minus the number of parameters estimated; for the regression it is the df for the total minus the df for the error.
- F = MSR/MSE.
- When we accept or reject $\beta_1 = 0$ based on the ANOVA table.
- Using the formulas for the SS to construct an ANOVA table.
- How the MS relate to the variances of the normal distributions in figures like 11.7 (page 524).
- The t-test and confidence interval for $\beta_1$ (section 11.5).
- The predicted value of y given x.
- The prediction interval for a single y at a fixed x value (section 11.8).
- The confidence interval for the mean of the y at a fixed x value (section 11.8).
- The range of values the correlation coefficient can take.
- The correlation as a single-number summary of regression: with sample size it determines the F statistic; $r^2$ (coefficient of determination) is the percent of variation/error explained by the model; and r is like the slope of the regression line after adjusting for the scale of y and x.
- What to look for in residual plots:
  - is there a pattern in the residual vs. predicted plot (violation of independence)?
  - is there a "funnel shape" in the residual vs. predicted plot (error variances are not constant)?
  - does the q-q plot look like a straight line (are the errors normal)?
- That taking the log of the dependent variable can solve the problem of the variance being larger for large values of the independent variable.
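
As a study aid for the one-way ANOVA topics, the following is a minimal Python sketch (standard library only) of how the table quantities SS, df, MS and F can be computed, and of the fact that F equals t squared when there are only two samples. The function names `one_way_anova` and `pooled_t` and the data values are made up for illustration; they are not from the course or the textbook.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for a list of samples."""
    all_values = [y for g in groups for y in g]
    n, k = len(all_values), len(groups)
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Between (treatment) SS: group averages compared with the grand average.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within (error) SS: each observation compared with its own group average.
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between = k - 1   # number of treatments minus 1
    df_within = n - k    # sample size minus number of group means estimated
    ms_between = ss_between / df_between   # MS = SS / df
    ms_within = ss_within / df_within
    return {
        "SSB": ss_between, "SSW": ss_within, "TSS": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within, "df_total": n - 1,
        "MSB": ms_between, "MSW": ms_within, "F": ms_between / ms_within,
    }

def pooled_t(a, b):
    """Two-sample t statistic assuming equal population variances."""
    na, nb = len(a), len(b)
    ss_a = sum((y - mean(a)) ** 2 for y in a)
    ss_b = sum((y - mean(b)) ** 2 for y in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

# Hypothetical data for two treatment groups.
group_a = [4.1, 5.0, 5.6, 4.8, 5.3]
group_b = [6.2, 5.9, 6.8, 7.1, 6.0]

table = one_way_anova([group_a, group_b])
t = pooled_t(group_a, group_b)
print("F =", table["F"], " t^2 =", t ** 2)
```

The two printed numbers should agree up to rounding error, which is the F = t² relationship for two samples. A p-value would additionally need the F distribution, which the standard library does not provide.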

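For the regression topics, a similar sketch (again standard-library Python, with a made-up function name `regression_anova` and made-up data) shows one way the least-squares estimates, the ANOVA table, F = MSR/MSE, r² and the t statistic for the slope fit together. It assumes simple linear regression with one x variable.

```python
from statistics import mean

def regression_anova(x, y):
    """Fit y = b0 + b1*x by least squares and return the ANOVA table quantities."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Estimates that minimize the sum of squared residuals.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total variation in y
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # left over after the line
    ssr = tss - sse                                         # explained by the line

    df_total = n - 1
    df_error = n - 2              # two parameters estimated: b0 and b1
    df_reg = df_total - df_error  # equals 1 for simple linear regression
    msr, mse = ssr / df_reg, sse / df_error

    return {
        "b0": b0, "b1": b1,
        "TSS": tss, "SSE": sse, "SSR": ssr,
        "df_total": df_total, "df_error": df_error, "df_reg": df_reg,
        "MSR": msr, "MSE": mse,
        "F": msr / mse,
        "r2": ssr / tss,                     # coefficient of determination
        "t_slope": b1 / (mse / sxx) ** 0.5,  # t statistic for testing slope = 0
    }

# Hypothetical data, roughly linear with a little noise.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

res = regression_anova(x, y)
print("F =", res["F"], " t^2 =", res["t_slope"] ** 2, " r^2 =", res["r2"])
```

Since SSE and SSR are both sums of squares that add up to TSS, neither can exceed TSS, and the F statistic matches the square of the slope's t statistic up to rounding error.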
