Median is a better summary statistic

Chapter 3:

Bivariate Data: 2 variables

• Examine the relationship between these two variables

Scatterplot: displays quantitative bivariate data

Types of variables:

• Response (dependent, y) variable: measures the outcome of a study
• Explanatory (independent, x) variable: explains or influences changes in the response variable

Interpreting scatterplots

• Describe pattern of the relationship
• Form: linear, curved, clusters, no pattern
• Direction: positive, negative, no direction
• Strength: how closely the points fit the “form”; strong/weak
• Outliers: points that fall outside the overall pattern

Correlation coefficient (r): measure of the direction and strength of a linear relationship between two quantitative variables

• x̄ and s_x: mean and std. dev. for the explanatory variable
• ȳ and s_y: mean and std. dev. for the response variable
• Properties:
• r does not change if we interchange x and y
• r has no unit
• |r| measures the strength (between 0 and 1; closer to 1 is stronger)
• sign of r gives us the direction
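
The properties above can be checked numerically. The following is a minimal sketch that assumes the standard textbook formula r = (1/(n-1)) Σ ((x - x̄)/s_x)((y - ȳ)/s_y), since the formula itself does not appear in these notes. The function name `correlation` and the data (`hours`, `score`) are hypothetical and made up for illustration.

```python
from statistics import mean, stdev

def correlation(x, y):
    # Assumed formula: average product of standardized x and y values,
    # dividing by n - 1 (stdev is the sample standard deviation).
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

r_xy = correlation(hours, score)
r_yx = correlation(score, hours)                         # interchange x and y
r_minutes = correlation([h * 60 for h in hours], score)  # change the unit of x

print(r_xy, r_yx, r_minutes)
```

All three values agree up to rounding, which illustrates that r does not depend on which variable is called x and has no unit. The positive sign reflects the positive direction of the relationship.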

Lurking variable: a variable that is neither the explanatory nor the response variable, but may influence the relationship between the two.

Establishing Causation

• Association is strong / consistent
• Higher doses are associated with stronger responses

Chapter 7: Samples and Observational Studies

Strength:

1. r=-0.9 or r=0.9

They are equally strong: |r|=0.9 in both cases; only the direction differs.

2. r=-0.87 or r=0.85

-0.87 is stronger because |-0.87|=0.87 > 0.85

• r is not resistant to outliers
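
A small sketch of both points: strength is compared through |r|, and a single outlier can change r substantially. The helper `pearson_r` and the data are hypothetical, chosen only to make the effect easy to see.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    # Sample correlation: sum of cross-deviations over (n - 1) * s_x * s_y.
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / ((n - 1) * stdev(x) * stdev(y))

# Strength is judged by the absolute value, not the sign.
stronger = max([-0.87, 0.85], key=abs)

# Hypothetical data lying exactly on a line, so r is 1 up to rounding.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 12]
r_clean = pearson_r(x, y)

# Add one point far from the linear pattern.
r_outlier = pearson_r(x + [7], y + [0])

print(stronger, r_clean, r_outlier)
```

With this made-up data, one added point drops r from about 1 to about 0.25, which is what "not resistant" means in practice.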

Chapter 4: Relationships (Regression)

Least-squares regression line (LSRL): the line for which the sum of the squared vertical distances between the data points and the line is the smallest possible; the signed vertical distances (residuals) sum to 0.

• Notation:

• Slope: describes how much we expect y to change, on average, for every unit change in x.
• Intercept: the predicted value of y when x = 0; needed to describe the line mathematically, but not always meaningful in context
• For every unit change in the explanatory variable, the predicted response changes by the slope

r: correlation coefficient

• Coefficient of determination (r²): fraction (or percent) of the variance in y explained by the regression on x.
• Influential individual: an observation that markedly changes the regression line if removed; often an isolated point
• Residuals: observed value - predicted value

• Extrapolation: prediction outside the range of x values used to fit the line; such predictions are often unreliable
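
The regression quantities in this chapter can be sketched in a few lines. The notation is an assumption, because the notes do not show it: predicted value = intercept + slope × x, with slope = r · s_y / s_x and intercept = ȳ - slope · x̄ (the usual textbook formulas). The function `least_squares_line` and the data are hypothetical.

```python
from statistics import mean, stdev

def least_squares_line(x, y):
    # Assumed textbook formulas: slope = r * s_y / s_x,
    # intercept = y_bar - slope * x_bar.
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = cross / ((n - 1) * s_x * s_y)
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar
    return slope, intercept, r

# Hypothetical data: hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

slope, intercept, r = least_squares_line(x, y)

# Residual = observed value - predicted value.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# r squared compared with the fraction of variance explained directly.
ss_total = sum((yi - mean(y)) ** 2 for yi in y)
ss_resid = sum(e ** 2 for e in residuals)
r_squared = r ** 2
explained = 1 - ss_resid / ss_total

print(slope, intercept, sum(residuals), r_squared, explained)
```

For this made-up data the slope is about 6 and the intercept about 46, the residuals sum to 0 up to rounding, and r² matches the fraction of variance explained. Using the line at x = 20, far beyond the observed range of 1 to 5, would be extrapolation.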

