EXAM 3 Study Guide (Chapters 9 - 11)
This 6-page study guide was uploaded by Julia_K on Friday, April 8, 2016. It belongs to PSYCH-UA 10 - 001 at New York University, taught by Elizabeth A. Bauer in Spring 2016. Since its upload, it has received 156 views. For similar materials, see Statistics for the Behavioral Sciences in Psychology at New York University.
Statistics for the Behavioral Sciences
EXAM 3 Study Guide: Chapters 9 – 11

Chapter 9 – Linear Correlation

#1 rule – correlation does not imply causation!

Perfect Correlation – if there is a change in variable x, there will be a proportional change in variable y. Correlation is NOT about 2 variables having the exact same values – it's about 2 variables having the same position/location in their data sets. Z-scores show the location of scores in a distribution, so perfectly correlated variables will have the same z-scores.

Pearson's Correlation Coefficient (r) – a measure of linear correlation.
- r = +/- 1 when there is a perfect correlation.
- As r approaches 0, there is a lack of correlation – more error.
- Drawing circles, or "envelopes," around the points on a scatter plot can represent correlation. The more elongated the envelope, the stronger the relationship and the higher r is.
- Note: as sample size increases, r becomes more significant – a more accurate representation of the relationship between the 2 variables.

Problems:
1. Curvilinear relationships – the scatter plot follows a curve (for example, an inverted U) rather than a straight line. Since the relationship isn't linear, it can't be measured by Pearson's r.
2. Restricted/Truncated range – a reasonably strong r can weaken depending on the range you choose to look at in your distribution. To avoid this, make sure the sample is representative of the population! (ex: if you're measuring how years of education correlate with cost, then looking at only the 4 college years would yield different results than a measure of the full range of education.)
3. Bivariate outliers – an extreme combination of the two variables (e.g., super tall AND super wide). Outliers can change the shape of the envelope and weaken the correlation (reduce r).

Test-Retest reliability – the stability of a measure over time. Give the same test again after some time; if both sets of scores have the same locations, the test is reliable.

Split-half reliability – the internal consistency of a measure. Two halves of one measure must have a high correlation.
(ex: on an exam, a person's right/wrong answers on the odd-numbered questions should correlate with their right/wrong answers on the even-numbered questions.)

Inter-Rater reliability – the consistency of ratings among multiple raters (requires a shared understanding of the rating categories).

Criterion Validity – are predictions accurate? Is the test measuring what it's supposed to measure? To help answer this, r should be no less than 0.7.

The unbiased (sample) covariance formula for r:

    r = (ΣXY − N · X-bar · Y-bar) / [(N − 1) · s_X · s_Y]

- N = the # of PAIRS of scores
- The numerator of the r formula is the covariance – a measure of how the variables vary together / how they move with each other.

Correlation values:
- 0.1 – 0.29 = small correlation
- 0.3 – 0.49 = medium correlation
- 0.5 – 1 = strong correlation

Testing Pearson's r for significance:
1. Null: ρ = 0 (ρ, "rho," is a population parameter – a reflection of the population correlation).
2. Calculate r from the given information.
3. Compare to the critical r in table A.5 (here, df = N − 2).
4. Reject the null if the calculated r falls outside the critical r value.

As N increases, r becomes more tightly clustered around ρ and more significant – a more accurate reflection of the population correlation.

Independent Random Sampling – the pairs of scores must be independent of one another (make sure to have random selection and random assignment to conditions).

Bivariate Normal Distribution – working with 2 variables in a distribution – finding correlation.

Chapter 10 – Linear Regression

Z-scores can be used to predict one variable from the other:

    Zy' = r · Zx

(y is "prime" because it's the predicted score – you're predicting y using x)

Equation for the least-squares regression line:

    Y' = b(yx) · X + a(yx)

For example, suppose we want to see which weights (Y') best correspond to which heights (X) in a sample.
- Y' = the predicted weight
- b(yx) = the slope. The formula is b(yx) = r · (s_Y / s_X), where s = standard deviation.
- X = the value for height
- a(yx) = the y-intercept, i.e., the value of Y' when X = 0

If there is no relationship between x and y, the best predictor of the scores is the mean of the y scores (Y-bar). This means the slope will equal 0. But the stronger the relationship between x and y (i.e., the higher r), the better the predictions will be. The predictions form a regression line.

Residuals – the deviations of the actual y scores from the predicted y' points on the regression line. As r goes up, the scores get closer to the regression line (and thus farther from the mean line, where z = 0).

Variance of the estimate (aka residual variance) – measures how far the scores fall from the regression line. This quantifies the total amount of error in a prediction.

Unexplained variance – the error accounted for by other, unknown factors. If r = 0 and there is no correlation, the unexplained proportion of the variance is 1.

Explained variance – how much better at prediction the regression line is than the mean line.

Total variance – the differences between individual scores and the mean.

Coefficient of Determination (r^2) – the proportion of the total variance accounted for by the predictor variable; in other words, explaining the variability using the explained variance. Note: this doesn't imply causality – it's all about estimating how good our predictions are.

Coefficient of Non-Determination (k^2 = 1 − r^2) – the portion of the total variance NOT accounted for by the predictor variable. Note: r^2 + k^2 = 1.

Homoscedasticity – for every possible X value, the Y values have the same variance. This makes the scores more evenly spread around the regression line.

Chapter 11 – Matched T-Tests: Repeated Measures and Matched Pairs

Matched pairs take the correlation between 2 sets of scores into consideration. ("Sets" can come from 2 different-but-matched subjects, or from 1 subject tested twice.) The point is to decrease the variability between the 2 sets – this decreases error.
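The correlation and regression formulas from Chapters 9 and 10 above can be sketched in a few lines of Python. This is only an illustrative sketch: the height/weight data are made up, and the intercept is computed as a(yx) = Y-bar − b(yx) · X-bar, the standard least-squares result (not spelled out in the notes).

```python
# Illustrative sketch of the Chapter 9-10 formulas. The height/weight
# data below are made up for demonstration, not taken from the notes.

def mean(v):
    return sum(v) / len(v)

def sample_sd(v):
    # Unbiased (n - 1) sample standard deviation
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5

def pearson_r(x, y):
    # r = (sum(XY) - N * X-bar * Y-bar) / ((N - 1) * s_X * s_Y)
    n = len(x)  # N = the number of PAIRS of scores
    cov = (sum(a * b for a, b in zip(x, y)) - n * mean(x) * mean(y)) / (n - 1)
    return cov / (sample_sd(x) * sample_sd(y))

heights = [60, 62, 65, 68, 71]       # X
weights = [115, 120, 135, 150, 160]  # Y

r = pearson_r(heights, weights)
b = r * (sample_sd(weights) / sample_sd(heights))  # slope b(yx) = r * (s_Y / s_X)
a = mean(weights) - b * mean(heights)              # y-intercept (standard result)

print(round(r, 3))           # ~0.996: a strong correlation (> 0.5)
print(round(b * 66 + a, 1))  # predicted weight Y' for a height of 66
print(round(r ** 2, 3))      # coefficient of determination r^2
```

Note that the slope simplifies to cov / s_X^2, which is why b comes out the same whichever of the two equivalent forms you compute.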
Dependent-samples tests, repeated-measures tests, and paired tests all differ from independent t-tests because they take the correlation between the 2 sets of data into account.

If the 2 sets of scores in a sample are not correlated (r = 0), then the matched t equals the independent t value. Why? Subtracting the shared variability can't gain you any accuracy, because there was no correlation in the first place.

THE PROCESS:
- Let's say we use an independent t-test and end up failing to reject our null based on the results.
- BUT this conclusion can still be wrong – if there was a lot of variability between the individual subjects, it gets in the way of finding the true pattern and reaching accurate results.
- So we turn this into a matched t-test (a test pairing the two sets of scores).
- We start by turning the paired scores into difference scores (D): compute (score1 − score2) for each pair.
- The null: the mean of the difference scores is zero (μD = 0). The alternative: μD ≠ 0.
- Then find D-bar (the mean difference) – the sum of the difference scores divided by n.
- Then find s sub D-bar – the standard error of the mean difference (s_D / √n).
- The Direct Difference Formula takes the mean of the difference scores (D-bar) and divides it by the standard error of the mean difference (s sub D-bar): t = D-bar / s_(D-bar). That's the new calculated t.
- Then we find the critical t. NOTE: because we turned this into a one-sample test, we have to use df = n − 1.
- After that, we can do a new, more accurate comparison of the calculated t and the critical t.
- Finally, find the confidence interval for the difference of the 2 population means. APA style: "I am 95% confident that the interval between ____ and ____ contains the population mean difference in scores between ____ and ____."

Advantages and Disadvantages of a Matched T-Test:
Pros:
- It has more power!
- A matched design with a higher correlation yields a higher calculated t.
- We can subtract the unnecessary (extraneous) variance, so we are left with the variance we care about (treatment variance).
Cons:
- df goes down, which raises the critical value and makes the null harder to reject.

The two types of matched t-tests:
1. Repeated Measures (1 subject) – your go-to design. Measure the same person twice (repeated measures).
   - Simultaneous measurement – a random presentation of the conditions.
   - Successive measurement – the conditions are presented one after another. This causes the problem of order: what if the order affects the results?
     o Before/After designs
     o Counterbalancing – randomly varying the order of presentation gets rid of order problems, but it does NOT get rid of carryover effects (carrying the effects/impressions of one condition over into the next – only time between conditions can fix this).
   Pros – more power, because fewer subjects are needed!
   Cons – no control group.
2. Matched Pairs Design (2 subjects) – comparing 2 conditions on different yet similar subjects (e.g., twins). Has less power than repeated measures.
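The direct-difference steps from the matched t-test process above can be sketched as follows. The before/after scores are made-up illustrative data, not from the notes.

```python
# Sketch of the direct-difference (matched) t-test from Chapter 11.
# The before/after scores below are illustrative assumptions.

before = [85, 78, 92, 88, 75]
after  = [88, 80, 95, 90, 79]

# Step 1: difference scores D = (score1 - score2) for each pair
d = [b - a for b, a in zip(before, after)]

n = len(d)
d_bar = sum(d) / n                                         # mean difference D-bar
s_d = (sum((x - d_bar) ** 2 for x in d) / (n - 1)) ** 0.5  # SD of the D scores
se_d_bar = s_d / n ** 0.5                                  # standard error of D-bar

# Direct Difference Formula: t = D-bar / (standard error of D-bar)
t = d_bar / se_d_bar
print(round(t, 3))  # -7.483; compare |t| to the critical t with df = n - 1 = 4
```

Because every "after" score here exceeds its "before" partner by a similar amount, the difference scores have little spread, the standard error is small, and the calculated t is large, which is exactly the power advantage of matching described above.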