## MAT 221 Chapter 2 Notes

by: Niki Neidhart

MAT 221 - M200

Notes on Chapter 2, which covers relationships between variables and correlation as shown on graphs.
Date Created: 02/09/16
MAT 221 - Chapter 2: Looking at Data - Relationships

### 2.1 & 2.2: Relationships & Scatterplots

Scatterplot: a graph in which each axis represents one of the variables and the data are plotted as points

Three Aspects of a Relationship:

1. Direction: positive or negative
   - Positive: greater values of one variable tend to occur with greater values of the other variable (ex. house size and price)
   - Negative: greater values of one variable tend to occur with smaller values of the other variable (ex. weight of cars and fuel efficiency)
2. Form: linear, curved, clusters, or no pattern
3. Strength: how closely the points fit the form

No relationship: the variables are independent

Explanatory (independent) variable: the one that explains or influences changes in the other variable [x-axis]

Response (dependent) variable: the one that changes in response to the other variable [y-axis]

Outlier: a point that doesn’t follow the overall trend

### 2.3 Correlation

Correlation (coefficient) r: a numerical measure of the direction and strength of the relationship between 2 quantitative variables

Properties:

- The value of r ranges from -1 to 1
- The sign of r gives the direction of the relationship
- Closer to 1 or -1 is a strong relationship
- Closer to 0 is a weak relationship
- Very sensitive to outliers

How to calculate:

- For each case in the sample we have a pair of values (x, y)
- Suppose there are n cases: (x1, y1), (x2, y2), …, (xn, yn)

Image from Professor Xu’s online notes: https://blackboard.syr.edu/bbcswebdav/pid-3995343-dt-12064908_1/courses/35384.1162/Ch2Part2.pdf

- r has no unit of measure
- Correlation only describes linear relationships
- Not resistant to outliers: r is strongly affected by them

### 2.4 Least-Squares Regression

Regression line: a straight line that describes the relationship between the x and y variables

- The distinction between the explanatory and response variables is important

Which line “best fits”?
- The line needs to be as close to all of the points as possible

Residual: the vertical distance from a point to the line

Least-squares regression line: the unique line for which the sum of the squared vertical distances between the data points and the line is as small as possible

- A straight line is simply a picture of a relationship between two variables

Straight line: y = (slope) x + (y-intercept)

- The y-intercept is where the line crosses the y-axis
- The slope tells us which way and by how much the line is tilted

Finding the equation of the regression line:

1. Find the slope ($b_1$): $b_1 = r \cdot \frac{s_y}{s_x}$
   - $r$ = correlation coefficient
   - $s_x$ = SD of the x-values
   - $s_y$ = SD of the y-values
2. Find the y-intercept ($b_0$): $b_0 = \bar{y} - b_1 \bar{x}$, where $\bar{y}$ is the average of the y-values and $\bar{x}$ is the average of the x-values
3. The equation is: $y = b_1 x + b_0$
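The slope in step 1 needs a value for r, but the formula in section 2.3 was an image that is not reproduced in these notes. Assuming it is the standard sample correlation formula (an assumption, since the image itself is not available here), it would read:

$$
r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)
$$

Each term is a product of two standardized values, so the units cancel. That fits the note that r has no unit of measure.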

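The three steps for finding the regression line can be sketched in Python. This is only an illustration: the data are made up, the helper names (`mean`, `sample_sd`, `correlation`, `least_squares_line`) are hypothetical, and r is computed with the standard sample correlation formula, which is an assumption about what the course uses.

```python
import math


def mean(values):
    """Average of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Correlation r: average product of standardized x and y values,
    using n - 1 in the denominator (standard formula, assumed here)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


def least_squares_line(xs, ys):
    """Return (b1, b0) following the steps in the notes."""
    r = correlation(xs, ys)
    b1 = r * sample_sd(ys) / sample_sd(xs)   # step 1: slope
    b0 = mean(ys) - b1 * mean(xs)            # step 2: y-intercept
    return b1, b0


# Made-up data for illustration only (x = explanatory, y = response)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
b1, b0 = least_squares_line(xs, ys)

# Signed residuals: observed y minus the y predicted by the line
residuals = [y - (b1 * x + b0) for x, y in zip(xs, ys)]

print(f"r = {r:.3f}")
print(f"y = {b1:.2f}x + {b0:.2f}")   # step 3: the equation
```

For this made-up data the slope works out to 0.6 and the y-intercept to 2.2. The residuals are signed here (positive above the line, negative below), so their absolute values are the vertical distances described in the notes.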
