Psy 202 Test 2 Study Guide
This 7 page Study Guide was uploaded by Anna Ballard on Thursday October 6, 2016. The Study Guide belongs to Psy 202 at University of Mississippi taught by Mervin R Matthew in Fall 2016.
Date Created: 10/06/16
FA16 PSY 202 Section 1 Exam 2 Review Sheet

Chapter 6

I. CORRELATION
Comparison with regression
• Correlation Model – neither variable is dependent on the other; the two can be related, but that does not mean one causes the other
  - Correlation ≠ causation
  - Lowest possible value for a correlation = -1.00
    o Means there is a perfect negative correlation: as one variable increases, the other decreases
  - Highest possible value for a correlation = +1.00
    o Means there is a perfect positive correlation: as one variable increases, the other increases
  - A value of 0 means there is no linear correlation at all
    o The closer the value is to 0, whether that value is negative or positive, the weaker the correlation
• Regression Model – one variable is used to predict the other (prediction alone does not establish cause and effect)
  - Used for quasi-experimental designs
  - Uses independent and dependent variables

II. SCATTERPLOTS
How they're created
  - ONLY SCATTERPLOTS show all scores in a bivariate distribution
    o Y-axis and X-axis: plot each pair of scores on the graph, every single pair
    o This helps with interpreting correlation coefficients
  - Magnitude tells us how strong the correlation between the two variables is
    o The closer |r| is to 1, the stronger the correlation
What their shape tells us about the bivariate distribution
  - Shape tells us the strength of the linear association
    o Positive vs. negative correlation
    o If the scatterplot is not linear, the correlation coefficient underestimates the strength of the relationship

III. INTERPRETING THE CORRELATION COEFFICIENT (PEARSON'S R)
Other Considerations
  - Restrictions in range
  - Presence of outliers
  - Presence of subgroups
  - Causality
To use the Pearson product-moment correlation, our data must be:
  - Continuous and linearly related
  - Measured on interval or ratio scales

IV. ALTERNATIVE APPROACHES TO CORRELATION
Which approaches fall into each subcategory & characteristics of data for which each is used
  - Other product-moment correlation coefficients (Pearson's formula applied to special kinds of data)
    o Spearman Rank Correlation – both variables are ranks
    o Point-Biserial Correlation – one dichotomous variable and one continuous variable
    o The Phi Coefficient – both variables are dichotomous
  - Non-product-moment with quantitative and dichotomous variables
    o The Biserial Correlation Coefficient and the Tetrachoric Correlation
  - Non-product-moment for categorical values
    o Contingency Coefficient and Cramér's V

Chapter 7

I. STATISTICAL EQUATION OF A LINE
$\hat{Y} = b_0 + b_1 X$
• Y-hat –> predicted value of Y
• b0 –> Y-intercept
• b1 –> slope of the line

II. CHOOSING LINE OF BEST FIT
Difficulties
• We cannot know which line best summarizes a bivariate distribution just by looking at its scatterplot
  - Many lines can be found to summarize most scatterplots
  - Must find the line that does the best job of accounting for every score
Sum of squared error
• Estimation for all data points
  - Predict Y from X, then square each error of estimation and sum the squares
    o The sum of squared errors tells us which line best summarizes a scatterplot
    o We want the lowest sum, because this means the line fits the distribution with the least amount of error
• Errors of estimation (residuals)
  - The difference between the actual scores and the predicted scores
    o Vertical distance between Y and Y-hat
    o The amount of variance not shared
• Equations that minimize Sum of Squared Errors
  - Because we estimate two population parameters (the Y-intercept and the slope), we lose 2 degrees of freedom
    o Deviations are taken from the regression line, which is defined by 2 statistics (b0 and b1)
Proportion of variance accounted for
• This is how well X predicts Y
  - Aka how much variance in Y is shared with X
  - Calculated by dividing the sum of squares for regression by the sum of squares total; this equals $r^2$
Reversing the roles of the IV and DV
  - How the slope for predicting Y from X relates to the slope for predicting X from Y
    o r is the geometric mean of the two slopes: $|r| = \sqrt{b_{Y \cdot X} \, b_{X \cdot Y}}$
Standardized regression solution
  - How to use the z-score for X to predict the z-score for Y: $\hat{z}_Y = r_{XY} \, z_X$
    o Y-intercept: the line passes through the point where X = 0 and Y = 0 (in z-score units), so the Y-intercept always equals 0
    o Slope: the slope always equals $r_{XY}$
Standard error of the estimate
  - Treats error across the entire regression line as equal
    o This is why it is not a good predictor for the population
  - The sample regression line can also differ from the population line in Y-intercept and slope
Standard error of the prediction
• Confidence Interval – to predict the average value of Y at a particular X
• Prediction Interval – to predict the individual values of Y at a particular X
Cook's distance (D)
• Sometimes one point will pull the entire regression line towards it
  - If D is larger than 1, the point has a substantial influence
• Cook's distance tells us whether a point is pulling the regression line but NOT how it is pulling it

Chapter 8

I. THREE VIEWS OF PROBABILITY
Definitions of each
• Personal/Subjective View
  - Used most often
  - Humans are naturally terrible at probability, so this is not the one we should use for statistical inference
  - Biggest issue: not based on empirical probability
    o Based on how a person feels, which can change from moment to moment and person to person
    o Predictions one person makes can be very different from the predictions of someone else
• Classical/Logical View
  - How nature distributes events
• Empirical/Relative Frequency View: $P(A) = n_A / n$
  - nA –> number of observed outcomes satisfying condition A
  - n –> total number of observed outcomes
    o If you have a large enough sample, the probability you get from that sample should mimic that of the population from which the sample came
Why the personal/subjective view can't be used
• It is based on how a person feels
  - Means it can change from moment to moment and person to person
How the empirical/relative frequency view and classical/logical view are related
• As sample size increases, the empirical/relative frequency probability should start to converge with the classical/logical probability

II. DEFINITIONS
• Experiment – a probabilistic event about which you make predictions, then observe the outcome to see how solid your predictions were
• Event
  - Simple event – only one outcome can satisfy it
  - Compound event – multiple outcomes could satisfy it
• Sample space – the set of all possible outcomes
• Probability function – defines the probability that any individual event will occur

III. RULES FOR FINDING PROBABILITIES
Meanings of terms
• $P(A \cup B)$ –> probability of A or B
• $P(A \cap B)$ –> probability of A and B
When we simplify formulas
• Simplify the additive formula only when A and B are mutually exclusive
Additive/union rule
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
  - Subtract the probability of both events happening together so that nothing is counted twice
$P(A \cup B) = P(A) + P(B)$
  - If A and B are mutually exclusive, we do not have to subtract that probability
Multiplicative/intersection rule
$P(A \cap B) = P(A) \cdot P(B|A)$
  - Multiply the probability of A by the probability of B given A
    o $P(B|A)$ –> probability of B given A
$P(A \cap B) = P(A) \cdot P(B)$
  - If A and B are independent, $P(B|A) = P(B)$

IV. RULES FOR COUNTING
• Fundamental counting rule
• Permutations – ${}_nP_n = n!$
  - Number of permutations of n objects
• Combinations – ${}_nC_r = \frac{n!}{(n-r)! \, r!}$, the number of combinations of n objects taken r at a time (r ≤ n)
  - Combinations do not care about order
  - How they relate to permutations: each combination of r objects can be ordered in r! ways, so there are r! times as many permutations as combinations

Chapter 9

Probability distributions
• Definition
  - Probability distribution – tells us the probability that the random variable will take on each of its possible values
    o Classical/logical side of probability
• Random variables – variables whose values are determined by probability
• Sampling distributions – probability distribution of a statistic based on repeated samples
  - Empirical/relative frequency side of probability
  - If you have a large enough sample, the sampling distribution should resemble the probability distribution because the 2 should converge
    o Like why empirical/relative frequency converges with classical/logical when there is a large enough sample size
Bernoulli trials
• Five characteristics
  1) Only 2 possible outcomes
     o Success and failure
  2) p + q = 1.00
     o Probability of success –> p
     o Probability of failure –> q
  3) Specified number of trials
     o How many trials you are going to do
  4) Those trials have to be independent of each other
  5) A random variable (the number of successes across the trials)
The binomial expansion
• $P(Y = r) = {}_nC_r \, p^r q^{n-r}$
  - ${}_nC_r$ = # of combinations
    o ${}_4C_2$ –> 4 coins; 2 land on heads
  - r –> how many heads you wanted
  - ${}_4C_2 = \frac{4!}{(4-2)! \, 2!} = 6$
  - $P(Y = 2) = 6 \, (1/2)^2 (1/2)^2 = 6/16$
The binomial distribution
• Know how to read a Pascal's triangle
Expected value of a random variable
• Definition
  - Expected Value – measure of central tendency; the average of a random variable over an infinite set of scores
• How it's calculated
  - $E(Y) = \sum_j [Y_j \cdot P(Y_j)] = \mu_Y$
  - Multiply each possible value by its probability first, then sum the products
• Relationship to the population mean
  - The expected value equals the population mean, $\mu_Y$
Variance of a random variable
• Use expected value to get the variance of an infinite set of scores: $\sigma_Y^2 = E[(Y - \mu_Y)^2]$
Probability and area
• For the binomial distribution
• Why it can't be calculated the same way for continuous distributions
  - Must figure out the area under the curve instead

Chapter 10

Normal distributions
• Characteristics
  - Bell-curve – shape of distribution
    o Infinite number of possible values
    o Frequency polygon that never touches the X-axis
  - Kurtosis & Skewness both = 0
    o Helpful for lots of reasons
  - The 3 major measures of central tendency (mean, median, mode) are all equal
    o The distribution is symmetric: what goes on one side must go on the other
    o 50% above and below the median
  - Points of inflection at z = -1 and z = +1
    o 34.1% of the distribution lies between the mean and a point 1 standard deviation away
    o Mean of the unit-normal distribution always = 0 and standard deviation always = 1
• Reasons for importance
  - Many values in nature are already normally distributed, and certain distributions are pretty close approximations to the normal distribution
    o AKA even if a distribution is not exactly normal, the normal distribution can be used to approximate it
  - Other distributions can be connected to it by deriving info from the formula we use for the normal distribution
• Transforming to the unit-normal distribution
  - Compare to other normal distributions by converting a normal distribution to the unit-normal distribution: $z = (Y - \mu) / \sigma$
    o Mean of 0
    o Standard deviation of 1
    o Use the table of probabilities for the unit-normal distribution
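The transformation to the unit-normal distribution can be sketched in Python as follows. This is only a minimal illustration: the population mean (100), standard deviation (15), and raw score (115) are made-up values that do not come from the review sheet, and the standard library's `NormalDist` stands in for the printed table of probabilities.

```python
from statistics import NormalDist

def z_score(y, mu, sigma):
    """Convert a raw score Y to a z-score: z = (Y - mu) / sigma."""
    return (y - mu) / sigma

# Hypothetical population values, chosen only for illustration.
mu, sigma = 100.0, 15.0
y = 115.0

z = z_score(y, mu, sigma)

# The unit-normal distribution has mean 0 and standard deviation 1.
unit_normal = NormalDist(mu=0.0, sigma=1.0)

# Proportion of the distribution below this z-score (what the table gives).
area_below = unit_normal.cdf(z)

# Proportion between the mean (z = 0) and this z-score.
area_mean_to_z = area_below - 0.5

print(f"z = {z:.2f}, area below = {area_below:.4f}, mean to z = {area_mean_to_z:.4f}")
```

A raw score exactly one standard deviation above the mean converts to z = 1, and the area between the mean and that point is about 34.1%, matching the characteristic listed above.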
• Areas under the curve
  - Above
    o Convert the raw score into a z-score and find the area above it
  - Below
    o Convert the raw score into a z-score and find the area below it
  - Between
    o Convert both Y values into z-scores and subtract the area below the lower z-score from the area below the higher one
• Approximating the binomial distribution
  - Ex: coin flips – 20 flips –> n = 20
  - Want the percent of the distribution higher than 9
    o Convert 9 into a z-score after adjusting for continuity
    o Adjust because these are discrete variables
    o The more times you flip the coin, the more the distribution becomes normal (more trials –> more normal)

Sampling distribution of the mean
• Definition
  - Sampling distribution of the mean – the distribution of the means of repeated samples of raw scores
    o These averages create the sampling distribution, which comes out to be approximately normal
    o Based on repeated samples
• Purpose
• Central limit theorem
  - Tells us that no matter what the raw score distribution looks like, the sampling distribution of the mean will still come out approximately normal when the sample size is large enough
    o Means will still be equal: the grand mean always equals the mean of the raw scores of the population
    o As sample size gets bigger, the standard error of the mean gets smaller, which means sample means get closer to the grand mean
    o Hypothetically, if N were infinite, the standard error would = 0
  - Standard error of the mean – quantifies how precisely we know the true mean of a population
• Three possible explanations for extreme scores
  1) Sampling error (can't really avoid)
     - Took too many of one type in your raw scores just by chance
  2) Sampling bias
     - Sample differs from the population due to the way you drew the sample
     - Did not use simple random sampling (our preferred way)
  3) Sample comes from a different population
     - Worry about this one the most; key for inferential statistics
     - "Phantom distributions" around the one we think we get our sample from
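The central limit theorem points above can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the course material: the skewed population (an exponential distribution with mean 1 and standard deviation 1), the sample sizes, the number of repeated samples, and the random seed are all arbitrary choices.

```python
import random
import statistics

def sample_means(n, num_samples, rng):
    """Draw num_samples samples of size n from a skewed population
    (exponential, mean 1, SD 1) and return the mean of each sample."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

rng = random.Random(202)  # fixed seed (arbitrary) so the run is repeatable

results = {}
for n in (4, 16, 64):
    means = sample_means(n, 2000, rng)
    grand_mean = statistics.fmean(means)   # mean of the sample means
    observed_se = statistics.stdev(means)  # SD of the sample means
    results[n] = (grand_mean, observed_se)

for n, (grand_mean, observed_se) in results.items():
    theoretical_se = 1.0 / n ** 0.5        # sigma / sqrt(N), with sigma = 1
    print(f"N={n:>2}  grand mean={grand_mean:.3f}  "
          f"observed SE={observed_se:.3f}  sigma/sqrt(N)={theoretical_se:.3f}")
```

Even though the raw scores are strongly skewed, the grand mean stays close to the population mean of 1, and the spread of the sample means shrinks roughly in proportion to 1/sqrt(N) as the sample size grows.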