Psychological Statistics PSYC 2020
These class notes were uploaded by Dayne Reichert on Monday, November 2, 2015. They belong to PSYC 2020 at Georgia Institute of Technology - Main Campus, taught by Christopher Hertzog in Fall.
Date Created: 11/02/15
• A scale that represents different qualitative types or categories with numbers (for example, 1 = boys, 2 = girls) is a nominal scale.
• An interval scale is one in which a higher scale score represents a greater amount of the attribute than a lower scale score, and in which the scale is a linear transformation of the true attribute dimension.
• For a continuous RV, the probability of occurrence of any exact value of X may be regarded as 0.
• Consider a sample set of data: 0, 0, 0, 25, 50. The mean of these data is 15.
• If one is concerned about the sensitivity of a central tendency measure to outliers, the best descriptive index to use is the ____.
• Which of the following is a measure of dispersion? (a) harmonic mean (b) standard deviation (c) 80th percentile (d) a frequency histogram (e) none of the above
• The expected value of deviation scores $(X_i - M)$ is 0.
• The SD may be interpreted as the average deviation of a score from the mean.
• After transformation to a z score, any variable will have a normal distribution.
• The variance of the sampling distribution of the mean depends on ____.
• The expectation of the sample variance $s^2$ is ____.
• The standard error of the mean can be interpreted as $\sigma_M = \sigma / \sqrt{N}$.

Absolute Deviation
• |the data value − the mean|

Areas under percentiles of the Normal Distribution
[Figure: percentage of cases under portions of the normal curve, marked in standard deviations and percentile equivalents, with a 20 to 80 score scale and a 200 to 800 GRE/SAT scale]

Boxplots
• Made up of a box and 2 whiskers
• The box shows
  o The median
  o The upper and lower quartile
  o The limits within which the middle 50% of scores lie
• The whiskers show
  o The range of scores
  o The limits within which the top and bottom 25% of scores lie
[Figure: boxplots of hygiene on Day 1 of Download Festival by gender of concert goer (male, female), with the top 25%, middle 50%, and bottom 25% of scores labelled]

Classical Statistical Inference
Type I and Type II errors
• $\alpha$ = Type I error = p(Decide $H_1$ | $H_0$ true)
• $\beta$ = Type II error = p(Decide $H_0$ | $H_1$ true)
• $1 - \alpha$ = p(Decide $H_0$ |
$H_0$ is true): correct decision not to reject $H_0$
• $1 - \beta$ = p(Decide $H_1$ | $H_1$ is true): correct rejection of $H_0$

              True H0        True H1
Decide H0     1 − α          Type II (β)
Decide H1     Type I (α)     1 − β

One-tailed vs two-tailed tests: when and why
• Sometimes we are only interested in one direction of the alternative hypothesis
  o $H_0: \mu = k$
  o $H_1: \mu > k$
• This also corresponds to testing 2 competing inexact hypotheses
  o $H_0: \mu \le k$ versus
  o $H_1: \mu > k$
• We use 1-tailed tests when $H_1$ is a directional hypothesis; we only have a significant result if we conclude that the population mean > the value specified in the null hypothesis

1-Tailed Region of Rejection
• Here we select the critical $z_M$ so that .05 of the values are above the cutoff, instead of .025 for the 2-tailed test
• The standard score corresponding to cumulative p = .95 is 1.65
• Otherwise everything works the same
[Figure: "Don't reject" and "Reject" regions of the sampling distribution, with the cutoff at 1.65]
• Practical implication: greater chance of rejecting $H_0$ if $M > \mu_M$

2-Tailed Region of Rejection

Statistical power
• With a specific hypothesized alternative value for $\mu$, we can actually compute the probability of correctly rejecting $H_0$ in favor of $H_1$, $1 - \beta$, which is called the power of the statistical test

Illustrations of errors and power from overlapping normal sampling distributions
• Power illustration
  o Situation in which one forms a 1-tailed hypothesis
  o Do older adults with hearing aids perform better than those without hearing aids? Assumption that hearing aids could improve performance
[Figure: overlapping sampling distributions of M under $H_0$ and under $H_1$, with the critical $z_M$ marked]

Confidence intervals for population mean
• Given the normal sampling distribution over all samples of size N, the probability is 0.95 for the event that $-1.96\sigma_M \le M - \mu \le 1.96\sigma_M$
  o $M - 1.96\sigma_M \le \mu \le M + 1.96\sigma_M$
  o In general: $-z_{criterion}\,\sigma_M \le M - \mu \le z_{criterion}\,\sigma_M$
  o Where the criterion is the normal deviate corresponding to $1 - \alpha$, and $\alpha$ is the uncertainty to be tolerated for the confidence interval
• A general rule of thumb is: take a hypothesized value of $\mu$, call it k
• Compute an interval estimate for $\mu$ from sample data
• If the
hypothesized value lies outside the confidence interval, we can conclude with $1 - \alpha$ confidence that the hypothesized value of $\mu$ is incorrect
• We can reject the hypothesis that $\mu = k$ if k is not contained in the confidence interval

Degrees of freedom
• The concept of how many opportunities there are for data to vary independently
• Consider a deviation score $d = X - M$
• Because the deviations are taken from M, they must sum to 0 ($\sum d = 0$). This places a constraint on the deviation scores: once I know N − 1 of them, the last deviation MUST equal minus the sum of the others
• There are N − 1 opportunities for deviation scores to vary freely about the mean M
• Because $s^2$ is based on the sum of the squared deviation scores, where each deviation is computed from the sample mean M of the same data, there are N − 1 degrees of freedom associated with $s^2$ (or s)

Descriptive Statistics
• Central Tendency
  o Mean: arithmetic average of scores
  o Median: 50th percentile score; the middle score when scores are ordered. Appropriate for ordinal scales and above
  o Mode: most frequent score
• Dispersion
  o Range: difference between high and low scores
  o Interquartile range: difference between the 25th and 75th percentiles
  o Variance = (sum of squared deviations) / (N − 1): $s^2 = \dfrac{\sum (X_i - M)^2}{N - 1}$
  o SD = $\sqrt{\text{Variance}}$: $s = \sqrt{\dfrac{1}{N - 1}\sum (X_i - M)^2}$

Discrete and Continuous RV
• Discrete: if a RV can assume only a particular finite or countably infinite set of values, it is said to be a discrete RV
• Continuous: associates a probability distribution with a measured variable X in continuous form; $P(a < X < b)$ is defined for the interval $a < X < b$

Discrete and Continuous Probability Distributions

Expected Values of Sample Test Statistics (Mean, Variance)
• Expected values
  o Weighted average of the possible values of x, each weighted by the probability that x assumes that value
  o If X is continuous, then $E(X) = \int x\, f(x)\, dx$

Frequency Histograms
[Figure: frequency histogram]
• Floor and ceiling effects are a major problem for psychological scales
• Difficulty effects: if a test is too hard, floor effects; if a test is too easy, ceiling effects
• Skew
  o 0 if symmetric distribution
  o Negative if left-tailed
  o Positive if right-tailed
• Kurtosis
  o 0 if peaked as the normal distribution
  o Flatter (platykurtic): more information in tails; negative kurtosis
  o More peaked (leptokurtic): less information in tails, more in center of the distribution; positive kurtosis

Logistic Regression
• Used to predict an outcome variable that is categorical from one or more categorical or continuous predictor variables
• Used because having a categorical outcome variable violates the assumption of linearity in normal regression
• Odds and log odds
• Transformations on dependent variable
• Relation to multiple regression
  o With multiple predictors, logistic regression is just like multiple regression
• Interpretation of regression coefficients
• Tests of model fit and individual regression coefficients

Mediational Analysis
• Concept of mediator
• Tests of mediation in terms of multiple regression tests
• Causally spurious correlation
• Test of partial regression coefficients with complete mediation
• Partial mediation

Moderator analysis
• Tests of moderated regression using hierarchical regression
• Variable centering: What is it? Why do it?

Model Comparisons
• Compact Model
• Augmented Model

Multiple Regression
• Regression equation
  o Linear regression equation
  o $Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon_i$
  o There is a slope for each independent variable; the $\beta$'s are termed partial regression coefficients and are interpreted as the effect of each X on Y, controlling for the other X's
• Meaning of partial regression coefficient
  o The $\beta$ coefficients are standardized partial regression coefficients
• Tests of model $R^2$
  o How to compute F-test
  o Interpretation of F-test
• Tests of individual regression coefficients
• Confidence intervals for regression coefficients
• Power in multiple regression models
• Indices of predictor redundancy in multiple regression
• Model testing in multiple regression
• Order of entry of variables in equation
• Relation of order of entry to … partial …
• Tests for increments to $R^2$ (null hypothesis that increment = 0)
• Exploratory regression methods
  o Forward inclusion
  o Backwards elimination
  o Stepwise
regression
• Multicollinearity
  o Consequences of
  o Diagnostics of
• Regression diagnostics
  o Plots of residuals
  o Leverage and influence

The Normal Distribution
• 2 parameters
  o Mean $\mu$
  o Variance $\sigma^2$
  o $-\infty < \mu < \infty$
• $F(a)$: the cumulative probability of $X \le a$ under any probability distribution
  o $F(a) = p(X \le a)$
  o $p(a < X \le b) = F(b) - F(a)$

Normal Quantile Plots

Part or Semipartial Correlation
• Relation of both to Venn diagrams of variance accounted for

Partial Correlation

Pearson Product-Moment Correlation r
• Definition
• Relationship of r to simple regression coefficient (slope)
• Interpretation of regression in terms of r

Polynomial regression as curve fitting

Polynomial regression as hierarchical regression problem

Population Parameters
• $\mu$: the mean
• $\sigma^2$: variance
• M: sample mean, estimate of $\mu$
• $s^2$: sample variance, estimate of $\sigma^2$

Power in relation to Proportional Reduction of Error

Probability Distribution

Proportional Reduction of Error

Sampling Distribution of the Mean
• $E(M) = \mu$
• The variance of the sampling distribution of the mean is $\sigma_M^2 = \sigma^2 / N$
• Variance of error for M: $\sigma_M^2 = \sigma^2 / N$
• Standard error of the mean: $\sigma_M = \sigma / \sqrt{N}$

Standard Error of the Mean
• $\sigma_M = \sigma / \sqrt{N}$
• Estimated $\sigma_M = s / \sqrt{N}$
• Use of s is important in getting a sample-based estimate of the standard error of the mean

Standardization and Normal Deviates
• Common form of linear scaling
• $z_i = \dfrac{x_i - M}{S}$
• Where M is the sample mean and S is the sample SD
• The mean of a standardized variable is 0 and the standard deviation is 1

t-distribution: when used over normal distribution
• The problem is that t has more area under its tails than z

Statistical Control, concept of

Simple Regression
• Equation
• Predicted scores
• Proportional Reduction of Error in fitting a linear regression model
• Test of $R^2$ in simple regression
• Test of null hypothesis that slope = 0
• Expected value of numerator and denominator of F-statistic
• Test of null hypothesis that intercept = 0
• Confidence intervals for population slope
• Assumptions required for regression analysis

Sums of Squared Errors

Tests of … in …
• Order of entry of variables in moderated regression
• Decomposing the significant interaction: simple slopes

Test of Model with Mean only: relation to one-sample t-test for mean = k

Types of Scales: Nominal, Ordinal, Interval, Ratio
• Quantitative scales: ordinal, interval, ratio
• $f(o)$: functional state of observed entity mapped to variable
• $t(o)$: true but unobservable property of object
• $m(o)$: property of empirical scale or measure of object (assigned numbers)
• Nominal scale: categories, kinds (gender, major, religious affiliation)
• Ordinal scale: numbers can be used to define rank orders, but they do not convey relative difference in amount of underlying attribute
  o $m(o_i) \ne m(o_j)$ implies $t(o_i) \ne t(o_j)$
  o $m(o_i) > m(o_j)$ implies $t(o_i) > t(o_j)$
• Interval scale: ordinal properties apply, as well as the following: for any object o, m is an interval scale iff $t(o) = x$ implies $m(o) = ax + b$, where $a \ne 0$
  o The numbers assigned tell us about relative differences in the amount of underlying attribute, and this difference is equivalent across all levels of the interval scale (e.g., temperature)
• In any sample, the deviation of a score from M is $d = x - M$
  o $\sum d_i = 0$
• Variance as Residuals of a Model with a Constant

Likely sample problems (hand calculations from summary statistics; have the formulas ready)
• Compute mean, median, and mode from raw data
• Compute variance and standard deviation from raw data
• Use normal distribution to calculate region under curve (tables provided)
• Use normal distribution to compute 95% or 99% confidence interval for mean, given N, mean, and variance (or summary statistics that can be transformed to get the mean and variance)
• Compute one-sample t test given N, mean, and variance (or equivalent)
• Compute Pearson correlation from covariance and variances of two variables
• Compute partial and semipartial correlations from summary statistics
• Compute test of correlation coefficient = 0 (if provided r-to-z transformation table)
• Compute test of two independent correlations to each other (if provided r-to-z transformation table)
• Compute simple regression
equation from summary statistics (means, variances, and the covariance of x and y)
• Compute test of slope equal to 0 in simple regression
• Compute confidence interval of slope in simple regression
• Compute test of model $R^2$ in multiple regression if given total sums of squares, residual sums of squares, regression sums of squares, N, and P in needed combinations
• Compute standard error of a multiple regression coefficient, t test for a regression coefficient, and 95% confidence interval for the regression coefficient
• Compute tests of semipartial correlations as increments to $R^2$ in multiple regression equation, if given $R^2$ at each step along with N and number of variables in each step
• Compute tests of increments to $R^2$ if given $R^2$ at each step
• Compute test of interaction (moderated regression) as increment to $R^2$
• If given moderated regression equation and sample summary statistics, compute simple slopes and plot interaction

QUIZZES

STANDARD NORMAL DISTRIBUTION
1. What is the probability of observing a z score less than or equal to −1.57? (2 points)
   $p(z \le -1.57) = p(z \ge 1.57) = 0.05821$
2. What is the probability of observing a z score between the values of −1.50 and 0.50?
   $p(-1.50 < z < 0.50) = p(z < 0.50) - p(z \ge 1.50) = 0.69146 - 0.06681 = 0.62465$

CONFIDENCE INTERVALS
Professor Jobs completes an unemployment survey with a sample of 144 persons randomly drawn from the US workforce. The mean satisfaction rating with their job was 3.5, rated on a 1-10 Likert scale, with 1 being highly dissatisfied, 5 being slightly dissatisfied, 6 being slightly satisfied, and 10 being highly satisfied. The estimated sample variance is 6.
1. Compute a 90% confidence interval for the population mean of job satisfaction on this scale.
   $M \pm z_{criterion}\,\dfrac{s}{\sqrt{N}} = 3.5 \pm 1.64\,\dfrac{\sqrt{6}}{\sqrt{144}} = 3.5 \pm 1.64\,\dfrac{2.45}{12} = 3.5 \pm 1.64(0.204) = 3.5 \pm 0.335$
   90% CI: 3.165 to 3.835
2. Is Professor Jobs justified in stating that, on average, the US worker is not satisfied with his/her job, and why?
   Yes. The whole 90% CI of the mean (3.165 to 3.835) falls in the lower range of the scale. This end of the scale corresponds to workers disliking their jobs.
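The 90% interval in the confidence-interval quiz can be checked numerically. The sketch below is only an illustration: the function name `z_confidence_interval` is made up for this example, and it uses the exact normal quantile (about 1.645) instead of the rounded table value 1.64, so the bounds can differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, variance, n, confidence=0.90):
    """Return (lower, upper) for a normal-theory CI around a sample mean."""
    # Two-tailed criterion: the z value at cumulative probability 1 - alpha/2.
    z_criterion = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard error of the mean: s / sqrt(N).
    standard_error = sqrt(variance) / sqrt(n)
    margin = z_criterion * standard_error
    return mean - margin, mean + margin


# Quiz values: M = 3.5, sample variance = 6, N = 144.
lower, upper = z_confidence_interval(mean=3.5, variance=6, n=144)
print(f"90% CI: {lower:.3f} to {upper:.3f}")
```

Passing `confidence=0.95` or `confidence=0.99` gives the other intervals mentioned in the list of likely sample problems.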
REGRESSION
The ACME lottery company tracks the number of tickets sold and the estimated size of the jackpot over a one-year period. The mean jackpot size is 18.2 million USD. The mean number of tickets sold is 10.7 million. The following summary statistics are provided to regress the number of tickets sold (Y) on jackpot size (X): S tickets 8, S jackpot 1439, S jackpot,tickets …
1. What is the regression equation?
   $b_1 = \dfrac{s_{xy}}{s_x^2} = 0.028$
   $b_0 = \bar{Y} - b_1\bar{X} = 10.7 - 0.028 \times 18.2 = 10.7 - 0.5096 = 10.1904$
   $\hat{Y} = 10.1904 + 0.028X$
   Tickets = 10.1904 + 0.028 × Jackpot
2. Briefly interpret/explain the regression coefficient (slope) in terms of changes in the number of tickets sold per 1 million dollar increase in jackpot size.
   For every 1 million dollar increase in jackpot size, there is an associated 0.028 million increase in the number of tickets sold.
3. What is the predicted number of tickets sold when the jackpot is 60 million USD?
   Tickets = 10.1904 + 0.028 × 60 = 10.1904 + 1.68 = 11.8704 (million)

MULTIPLE REGRESSION
1. Complete the following table by computing the F tests for changes in $R^2$ in each step. For the critical value of F for each test, you should use either F(1, 50) = 4.03 or F(2, 50) = 3.18. What would the analyst conclude in each case?

Step  Variables    R²     df residual  F test (change in R²)  Decision
1     X1, X2       .074   54           2.16                   Fail to reject H0
2     Add X3, X4   .179   52           3.33                   Reject H0
3     Add X5       .200   51           1.34                   Fail to reject H0
4     Add X6       .275   50           5.17                   Reject H0

   Step 1: $F(2, 54) = \dfrac{.074/2}{.926/54} \approx 2.16 < F_{crit}(2, 50) = 3.18$
   Step 2: $F(2, 52) = \dfrac{(.179 - .074)/2}{.821/52} \approx 3.33 > F_{crit}(2, 50) = 3.18$
   Step 3: $F(1, 51) = \dfrac{(.200 - .179)/1}{.800/51} \approx 1.34 < F_{crit}(1, 50) = 4.03$
   Step 4: $F(1, 50) = \dfrac{(.275 - .200)/1}{.725/50} \approx 5.17 > F_{crit}(1, 50) = 4.03$
2. What is the F test for the overall regression equation (test that $\eta^2 = 0$) at Step 3?
   With 5 predictors and 51 residual degrees of freedom at Step 3: $F(5, 51) = \dfrac{.200/5}{.800/51} \approx 2.55$

HOMEWORKS

HOMEWORK 1
1. A researcher collects the following set of data: 12.5, 13.5, 11.3, 15.0, 16.7, 15.3, 15.9, 12.6, 11.6, 10.2. Show all work by hand.
   a. What is the mean of this sample?
      (10.2 + 11.3 + 11.6 + 12.5 + 12.6 + 13.5 + 15.0 + 15.3 + 15.9 + 16.7) / 10 = 13.46
   b. What is the median of this sample?
      (12.6 + 13.5) / 2 = 13.05
   c. What is the mode of this sample?
      No
mode, because each value occurs only once.
   d. Is the distribution of this sample skewed positively or negatively?
      This distribution is skewed positively.
   e. What is the variance of the sample (use n − 1 as the denominator)?

      X      X − M    (X − M)²
      10.2   −3.26    10.63
      11.3   −2.16     4.67
      11.6   −1.86     3.46
      12.5   −0.96     0.92
      12.6   −0.86     0.74
      13.5    0.04     0.002
      15.0    1.54     2.37
      15.3    1.84     3.39
      15.9    2.44     5.95
      16.7    3.24    10.50

      $\sum (X - M)^2 = 42.624$
      $VAR = \dfrac{42.624}{9} \approx 4.736$
   f. What is the standard deviation of the sample?
      $\sqrt{4.736} \approx 2.176$
2. a. Construct the stem and leaf plot of the following values: 35 61 58 64 74 89 87 71 56 61 49 64 78 74 80 60 81 87 70 51 70

      STEM  LEAF
      3     5
      4     9
      5     1 6 8
      6     0 1 1 4 4
      7     0 0 1 4 4 8
      8     0 1 7 7 9

   b. Identify the mean, median, and mode of the values above.
   c. Is the distribution of these values skewed positively or negatively?
3. Given data of the following types, state the scale of measurement that each type appears most clearly to represent (nominal, ordinal, interval, ratio).
   a. Nationality of an individual's father
   b. Memory ability as measured by the number of words recalled from an initially memorized list
   c. Reading ability of fifth grade children as shown by their test performance relative to a national norm group
   d. Hand pressure as applied to a flexible bulb that is on a dynamometer
   e. Social Security numbers
   f. Taking the arithmetic difference between two values
   g. Stating that one value represents a higher level of some property than does another value

HOMEWORK 3
1. An experimenter was interested in the possible linear relationship between the time spent per day in practicing a foreign language and the ability of the person to speak the language at the end of a 6-week period. A random sample of 12 students showed the results as follows (X = practice time, Y = foreign-language score):

Student  X     X mean  X dif   X dif²  Y    Y mean  Y dif    Y dif²     X dif × Y dif
1        0.30  1.096   −0.796  0.6333  115  105.25    9.75     95.0625   −7.759
2        0.30  1.096   −0.796  0.6333   83  105.25  −22.25    495.0625   17.707
3        0.30  1.096   −0.796  0.6333  110  105.25    4.75     22.5625   −3.780
4        0.50  1.096   −0.596  0.3550  107  105.25    1.75      3.0625   −1.043
5        0.50  1.096   −0.596  0.3550   89  105.25  −16.25    264.0625    9.682
6        0.50  1.096   −0.596  0.3550   77  105.25  −28.25    798.0625   16.832
7        1.50  1.096    0.404  0.1633   82  105.25  −23.25    540.5625   −9.397
8        1.50  1.096    0.404  0.1633   99  105.25   −6.25     39.0625   −2.526
9        1.50  1.096    0.404  0.1633  125  105.25   19.75    390.0625    7.982
10       2.00  1.096    0.904  0.8175  140  105.25   34.75   1207.5625   31.420
11       2.00  1.096    0.904  0.8175  127  105.25   21.75    473.0625   19.666
12       2.25  1.096    1.154  1.3321  109  105.25    3.75     14.0625    4.328
SUM                             6.42                          4342.25     83.11

   a. Compute Sum of Squares for X, Y, and XY.
      $SS_X = \sum (X - \bar{X})^2 = 6.42$
      $SS_Y = \sum (Y - \bar{Y})^2 = 4342.25$
      $SS_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = 83.11$
   b. Compute the correlation between x and y using the answers obtained in part a.
      $s_x = \sqrt{Var(X)} = \sqrt{0.584} = 0.764$
      $s_y = \sqrt{Var(Y)} = \sqrt{394.75} = 19.868$
      $Cov(X, Y) = \dfrac{SS_{XY}}{N - 1} = \dfrac{83.11}{12 - 1} = 7.555$
      $r = \dfrac{Cov(X, Y)}{s_x s_y} = \dfrac{7.555}{0.764 \times 19.868} \approx 0.498$

DATA = MODEL + ERROR
• $Y_i = \beta_0 + \varepsilon_i$
• $Y_i$ represents scores for i = 1 to N cases on variable Y
• $\varepsilon_i$ represents an error score for each individual case
• $\beta_0$ is a parameter value assumed to be constant over i
  o $b_0$ is an estimate of $\beta_0$

Central Tendency
• Mean: $\mu$
  o M = sample mean
• Median
• Mode

Confidence Interval
• $M \pm z_{criterion}\,\sigma_M$

Correlation
• $r = \dfrac{s_{12}}{s_1 s_2} = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$
  o $\sum x^2$, $\sum y^2$ and $\sum xy$ represent the SS of X, Y and XY respectively
  o 0.1 small, 0.3 medium, 0.5 large

Coefficient of Determination (PRE, $R^2$): variance in Y determined by X
• $R^2 = PRE = \dfrac{SSE_C - SSE_A}{SSE_C}$
• $\eta^2$: PRE of population
  o If $\eta^2 = 0$, then the population variance of the predicted scores and the population variance of the error scores have the same expected value
• Increments to $R^2$
  o For $Y = a + b_1 X_1 + e$: $R^2_{y.1} = r^2_{y1}$
  o With $X_1$ and $X_2$: $R^2_{y.12} = r^2_{y1} + r^2_{y(2.1)}$
  o With $X_1$, $X_2$ and $X_3$: $R^2_{y.123} = r^2_{y1} + r^2_{y(2.1)} + r^2_{y(3.21)}$

Covariation: extent to which X and Y vary together
• $s_{xy} = \dfrac{\sum (X_i - M_x)(Y_i - M_y)}{N - 1}$

Degrees of Freedom (nu, $\nu$, df)
• Typically N − k
  o N = total number of cases
  o k = total number of parameters

Dispersion
• Range
• Interquartile Range
  o Quartile 3 − Quartile 1
• Variance: $\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}$
  o $s^2$ = sample variance
  o $E(s^2) = \sigma^2$ when $s^2$ uses the N − 1 denominator
• Standard Deviation: $SD(x) = \sqrt{\dfrac{1}{N}\sum (x_i - M)^2}$
  o Square root of variance
  o S = sample standard deviation

F-Statistic
• $F = \dfrac{MS_{model}}{MS_{error}}$
  o MS = mean square = SS / df
• $F = \dfrac{R^2/(k - 1)}{(1 - R^2)/(N - k)}$
  o k = number of
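parameters
• $F(df_{regression}, df_{residual}) = \dfrac{\Delta R^2 / df_{regression}}{(1 - R^2) / df_{residual}}$
  o $df_{regression}$: number of parameters added
  o $df_{residual} = N - p$
• $F = t^2$

Kurtosis (peakedness)
• Platykurtic
  o Flat: more information in tails
  o Negative
• Leptokurtic
  o Peaked: more information in center
  o Positive

Linear Regression
• $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
• $\beta_0$ is an intercept (location parameter)
• $\beta_1$ is a slope, accounting for change in Y per change in X
• $b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \dfrac{s_{xy}}{s_x^2}$
• $b_0 = \bar{Y} - b_1 \bar{X}$

Mean Squared Error (MSE): estimate of residual variance
• $MSE_A = \dfrac{SSE_A}{n - PA}$
• $F = \dfrac{PRE / (PA - PC)}{(1 - PRE) / (n - PA)}$, where PA and PC are the numbers of parameters in the augmented and compact models

Normal Distribution Function
• $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$

Partial Correlation
• $r_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$

Power
• How to increase
  o Increase Type I error rate (criterion)
  o Reduce error variance: improve measurement, more control
  o Increase sample size
  o Reduce number of predictor variables

Residual
• $e_i$ (error) = actual − calculated

Semipartial Correlation
• $r_{1(2.3)} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{23}^2}}$

Skew (asymmetry)
• 0: symmetrical
• Negative: left-tailed
• Positive: right-tailed

Standard Error

Standardized Coefficient
• Standardized $\beta = b\,\dfrac{s_x}{s_y}$
  o b = slope

Sum of Squares
• $SS = \sum (X - \bar{X})^2$
  o $\mu$: population mean
  o $\bar{X}$ or M: sample mean

Sum of Squared Error
• $SSE = \sum (x_i - b_0)^2$

Sum of Squares Reduced
• $SSR = SSE(C) - SSE(A)$
• For the mean-only model: $SSR = n(B_0 - M)^2$
• $PRE = SSR / SSE(C)$

Tolerance
• $1 - R^2$
• Should be greater than 0.2

T-Statistic
• $t = \dfrac{M - \mu}{s_M}$
• $t = \dfrac{r_{xy}\sqrt{n - 2}}{\sqrt{1 - r_{xy}^2}}$

Variance Inflation Factor
• $\dfrac{1}{1 - R^2}$ (that is, 1 / Tolerance)
• Should be less than 10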
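The F tests for increments to $R^2$ in the multiple regression quiz can be reproduced with a few lines of Python. This is a sketch rather than course material: the helper name `f_for_r2_change` is invented for the example, the $R^2$ values, degrees of freedom and critical values are the ones given in the quiz table, and the model before Step 1 is assumed to contain no predictors ($R^2 = 0$).

```python
def f_for_r2_change(r2_new, r2_old, df_added, df_residual):
    """F statistic for the increment in R^2 when predictors are added."""
    numerator = (r2_new - r2_old) / df_added
    denominator = (1 - r2_new) / df_residual
    return numerator / denominator


# (label, R^2 after the step, predictors added, residual df, critical F from the quiz)
steps = [
    ("Step 1: X1, X2", 0.074, 2, 54, 3.18),
    ("Step 2: add X3, X4", 0.179, 2, 52, 3.18),
    ("Step 3: add X5", 0.200, 1, 51, 4.03),
    ("Step 4: add X6", 0.275, 1, 50, 4.03),
]

results = []
r2_previous = 0.0  # assumption: no predictors before Step 1
for label, r2, df_added, df_residual, f_critical in steps:
    f_value = f_for_r2_change(r2, r2_previous, df_added, df_residual)
    decision = "reject H0" if f_value > f_critical else "fail to reject H0"
    results.append((label, f_value, decision))
    print(f"{label}: F({df_added}, {df_residual}) = {f_value:.2f}, {decision}")
    r2_previous = r2
```

The same helper gives the overall test of a model against the mean-only model by passing `r2_old=0` and the total number of predictors as `df_added`.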
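The sums of squares and the correlation asked for in Homework 3 can be checked the same way. The sketch below assumes the practice times are 0.3, 0.5, 1.5, 2.0 and 2.25, which is the decimal placement implied by the X mean of 1.096 in the homework table; the variable names are illustrative only.

```python
from math import sqrt

# Homework 3 data: practice time (X) and foreign-language score (Y) for 12 students.
# Decimal placement of X is inferred from the X mean in the notes (an assumption).
practice = [0.3, 0.3, 0.3, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.0, 2.0, 2.25]
score = [115, 83, 110, 107, 89, 77, 82, 99, 125, 140, 127, 109]

n = len(practice)
mean_x = sum(practice) / n
mean_y = sum(score) / n

# Sums of squares and cross products of the deviation scores.
ss_x = sum((x - mean_x) ** 2 for x in practice)
ss_y = sum((y - mean_y) ** 2 for y in score)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(practice, score))

# Covariance uses N - 1; r can be computed directly from the sums of squares.
cov_xy = ss_xy / (n - 1)
r = ss_xy / sqrt(ss_x * ss_y)

print(f"SSX = {ss_x:.2f}, SSY = {ss_y:.2f}, SSXY = {ss_xy:.2f}")
print(f"Cov(X, Y) = {cov_xy:.3f}, r = {r:.3f}")
```

Computing r from the sums of squares gives the same value as dividing the covariance by the product of the two standard deviations, because the N − 1 terms cancel.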