Applied Stats Chapter 2 Definitions
These 2-page class notes were uploaded by Leela Morris on Wednesday, September 30, 2015. The notes belong to Stat 1000 (Applied Statistical Methods) at University of Pittsburgh, taught by Dr. Kehui Chen in Fall 2015. Since upload, they have received 15 views.
Date Created: 09/30/15
Applied Statistical Methods Chapter 2 Definitions/Notes

Associations between Variables: Two variables measured on the same cases are associated if knowing the value of one of the variables tells you something that you would not otherwise know about the value of the other variable.
Response variable/dependent variable: the variable being observed (or measured); the outcome to be explained or predicted.
Explanatory variable/independent variable: the variable used to explain or predict the response; it is treated as not affected by the other variables.
Scatterplots: visualize the relationship between two quantitative variables.
What to look for in scatterplots:
  Form of association (linear, curved, etc.)
  Strength of association (amount of scatter)
  Direction of association (positive, negative)
  Special features (outliers)

Correlation coefficient
  Denoted by r, it measures the linear association between two quantitative variables.
  Correlation is always between -1 and 1.
  When all the points lie on a line with positive slope, r = 1.
  When all the points lie on a line with negative slope, r = -1.
  When r = 0 there is no linear association between the two variables (they may still have a nonlinear relationship).

Least-squares regression
  A straight line that describes the relationship between two quantitative variables.
  y is the response variable; x is the explanatory variable.
  y = a + bx (b = slope, a = intercept)

Coefficient of Determination
  R^2 = variance of predicted values / variance of observed values
  R^2 = 100%: all the data lie on the regression line.
  R^2 = 0%: the predictor has no explanatory effect on the response.

Extrapolation: using the regression line to predict responses for explanatory values outside the range of those used to construct the line.
Residual: the error made in using the regression line to predict, i.e. observed y minus predicted y.
Outlier: (in regression) a point with an unusually large residual.
Influential observation: a point with a high degree of influence on the regression line (removing it would change the line noticeably).
Residual plot: a scatterplot of the residuals against the explanatory variable.

Correlation and Regression
  Both describe linear relationships.
  Both are affected by outliers.
  Always plot the data before interpreting.
  Beware of extrapolation: use caution in predicting y when x is outside the range of observed x's.
  Beware of lurking variables: these have an important effect on the relationship among the variables in a study, but are not included in the study.
  Correlation does not imply causation.

Two Categorical Variables:
  The two-way table: summarizes the relationship between two categorical variables.
  Marginal distribution and conditional distribution.
  Simpson's Paradox: an association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group.

Association and Causation
  Association, however strong, does not imply causation.
  A properly conducted experiment may establish causation.
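
The correlation, least-squares, and coefficient-of-determination definitions in these notes can be checked numerically. The following is a minimal sketch, not part of the original notes: the data values and the helper names `correlation` and `least_squares` are hypothetical, chosen only for illustration. It computes r, fits y = a + bx, and forms R^2 as the variance of the predicted values over the variance of the observed values.

```python
from math import sqrt
from statistics import mean, variance


def correlation(xs, ys):
    """Correlation coefficient r between two quantitative variables."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx        # slope
    a = my - b * mx      # intercept
    return a, b


# Hypothetical data, made up for this illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares(xs, ys)
r = correlation(xs, ys)

predicted = [a + b * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]  # observed minus predicted

# R^2 as defined in the notes: variance of predicted / variance of observed.
r_squared = variance(predicted) / variance(ys)

print("intercept a:", a, "slope b:", b)
print("r:", r, "R^2:", r_squared)
print("residuals:", residuals)
```

For a straight-line fit with one explanatory variable, R^2 computed this way equals the square of r, and the residuals sum to zero up to rounding. Plotting `residuals` against `xs` would give the residual plot described in the notes.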
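
Simpson's Paradox is easier to see with numbers. The sketch below uses a small two-way table split by group; the counts, the group labels, and the treatment names "A" and "B" are all hypothetical and were made up so that the reversal appears. It compares the conditional success rates within each group against the rates after the groups are combined.

```python
# counts[group][treatment] = (successes, total)
# All counts are hypothetical, invented for this illustration.
counts = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

# Conditional distribution: success rate for each treatment within each group.
within = {
    group: {t: s / n for t, (s, n) in row.items()}
    for group, row in counts.items()
}

# Combine the groups into a single table and recompute the rates.
combined = {}
for t in ("A", "B"):
    successes = sum(counts[g][t][0] for g in counts)
    total = sum(counts[g][t][1] for g in counts)
    combined[t] = successes / total

for group, rates in within.items():
    print(group, rates)
print("combined", combined)
```

With these counts, A has the higher success rate inside each group, yet B has the higher rate once the groups are pooled. Here the group acts as a lurking variable: most A cases fall in the group with low success rates, and most B cases fall in the group with high success rates.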