Notes for 4/13/15-4/17/15 PSYC 3301
This 13-page set of class notes was uploaded by Rachel Marte on Sunday, April 19, 2015. The notes belong to PSYC 3301 (Introduction to Psychological Statistics) at the University of Houston, taught by Dr. Perks in Fall. Since its upload, it has received 127 views.
4/13/15: More about ANOVA

Assumptions
Assumptions of an independent-samples ANOVA:
- The observations within each sample must be independent.
- The populations from which the samples are selected must be normal.
- The populations from which the samples are selected must have equal variances (the homogeneity of variance assumption). Violating this assumption risks invalid test results.
Note: These are the same assumptions as those for an independent-samples t-test.

The F-Distribution
Recall that when running an ANOVA we calculate an F-ratio. This F_obt value, and the F_crit value we will find in a table, correspond to an F-distribution. Like the t-distribution, the F-distribution is really a family of distributions whose shape changes based on the df (both df_between and df_within). Unlike the t-distribution, the F-distribution never approaches normality. It is always right-skewed (positively skewed), so the test is always one-tailed, and the F-ratio cannot be negative.

[Figure 1 shows how the family of F-distributions changes shape as the df's change. Figure 2 shows why we always have a one-tailed hypothesis test: we accept H0 to the left of the critical value and reject H0 in the right tail only.]

F-Distribution Table
To find an F_crit value in the F-distribution table (Table B.4), you need to know the alpha, df_between, and df_within. Note that the F-distribution table does not ask for df_between or df_within by name. Instead, it asks for the degrees of freedom in the numerator and in the denominator. Recall the formula for the F-ratio:

F = MS_between / MS_within

"Between" always goes on top and "within" always goes on the bottom, so the degrees of freedom in the numerator is df_between and the degrees of freedom in the denominator is df_within. Also note that bolded entries are for alpha = .01 and non-bolded entries are for alpha = .05. We will almost always use alpha = .05 unless otherwise specified.

[Table B.4, The F Distribution: regular-type entries are critical values for the .05 level of significance; bold-type entries are for the .01 level.]
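The F-ratio and effect-size arithmetic described here is easy to check by hand. A minimal Python sketch, using the SS and df values from the worked ANOVA table example later in these notes (the function name is mine, not course notation):

```python
# Sketch: building a one-way ANOVA summary from SS and df values.
# SS/df inputs below come from the example table in these notes (SS 40/20, df 2/10).

def anova_summary(ss_between, ss_within, df_between, df_within):
    """Return MS values, the F-ratio, and eta squared for a one-way ANOVA."""
    ms_between = ss_between / df_between              # MS = SS / df
    ms_within = ss_within / df_within
    f_ratio = ms_between / ms_within                  # F = MS_between / MS_within
    eta_sq = ss_between / (ss_between + ss_within)    # eta^2 = SS_between / SS_total
    return ms_between, ms_within, f_ratio, eta_sq

ms_b, ms_w, f, eta2 = anova_summary(40, 20, 2, 10)
print(ms_b, ms_w, f, round(eta2, 3))  # 20.0 2.0 10.0 0.667
```

Note that SS_total = SS_between + SS_within (60 here), which is why eta squared uses the sum in the denominator.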
Example: You calculated F_obt = 5 with df_between = 3 and df_within = 15. Should you reject or retain the null? Looking in the table under alpha = .05 with 3 df in the numerator and 15 df in the denominator gives F_crit = 3.29. Since 3.29 < 5, F_crit < F_obt, so reject the null.

Effect Size
The effect size we use for ANOVA is usually called η² (eta squared). This follows the same concept as r² (proportion of variance explained). In other words, it computes the percentage of variance accounted for by the treatment conditions:

η² = SS_between / SS_total

Reporting Results
In addition to reporting ANOVA results in a table (which will be discussed below), treatment means and standard deviations are presented in text, table, or graph. The results are summarized in a similar fashion as the results of any other hypothesis test, including F, df, the p-value, and effect size (η²).

Example: F(3, 20) = 6.45, p < .01, η² = .492
- F indicates the F_obt value that you calculated (6.45).
- (3, 20) indicates the df's: 3 is the df_between and 20 is the df_within. Note that the df_between value comes first and the df_within value comes second.
- p < .01 indicates that the hypothesis test was significant at an alpha level of .01.
- η² is the effect size.
Remember to italicize statistical notation (F, p, etc.).

ANOVA results are always reported in a table in the following format:

Source               SS   df   MS   F
Between Treatments
Within Treatments
Total

Note: Recall
that there is no total MS, only an MS_between and an MS_within. F does not really belong in any of the categories (between, within, or total) since it is calculated for the overall hypothesis test, but we list it in the first category slot so that it is easy to find. F is the F_obt value. The SS_between and SS_within values can be added to get the SS_total value, and the same goes for the df values.

Example:

Source               SS   df   MS   F
Between Treatments   40    2   20   10
Within Treatments    20   10    2
Total                60   12

4/15/15: Post Hoc Tests and Complex ANOVA

Post Hoc Tests
Recall that ANOVA compares all individual mean differences simultaneously in one test, so although a significant F-ratio indicates that at least one difference in means is statistically significant, it does not indicate which means differ significantly from each other. If the results of an ANOVA are significant, we run post hoc tests. Post hoc tests are follow-up tests done to determine exactly which mean differences are significant and which are not. They compare two individual means at a time (pairwise comparison) in a similar fashion as t-tests, but unlike t-tests they have extra measures to reduce the risk of Type I error. Each individual comparison includes a built-in risk of a Type I error; this risk accumulates across the comparisons you do and is called the experimentwise alpha level. Increasing the number of hypothesis tests increases the total probability of a Type I error, but post hoc tests use special methods to try to control this experimentwise Type I error rate. We will not actually run any post hoc tests in this class, but you should be aware of the theory behind them and of one of the most common tests. The Scheffé test is one of the safest post hoc tests and uses an F-ratio to evaluate the significance of the difference between two treatment conditions:

F(A vs. B) = MS_between / MS_within

Notice that this is the same formula as that of the F-ratio, except that instead of combining all the treatment groups, it tests the difference between two specific
treatment groups (A vs. B). That means that the MS_between and MS_within values are calculated using the SS values of the two groups, not the SS values of all the groups. For example, if you had a certain Factor A with three levels (1, 2, and 3), you would need to run three Scheffé tests (for levels 1 and 2, levels 2 and 3, and levels 1 and 3) if the overall F-ratio is significant. Say that F_obt = 12.5 and p = .001, and you got the following results when you ran the Scheffé tests:

Levels being compared    F      p
1 & 2                    3     .5
2 & 3                   12     .001
1 & 3                    2.5   .5

You can tell that the first and third Scheffé tests were not significant, since they had p-values of .5, which is not significant by any stretch of the imagination. The difference that the overall ANOVA picked up is between levels 2 and 3 (p = .001). In this way we have located which differences are significant. Remember, you will not have to actually run post hoc tests in this class; just know how they work in theory.

Complex ANOVA Designs

Repeated-Measures ANOVA
Independent-measures ANOVA (what we have been doing) uses multiple participant samples to test the treatments, but participant samples may not be identical since different people are in each one (individual differences). Therefore, if the groups are different, we are not certain whether these differences are due to differences in the treatment or differences in the makeup of the participant groups. A repeated-measures design solves this problem by testing all the treatments using only one sample of participants: the same group of participants undergoes all the experimental conditions/treatments. In this case the null hypothesis would be that, in the population, there are no mean differences among the treatment groups. The alternative hypothesis would be that there are one or more mean differences among the treatment groups. The theoretical formula for the F-ratio would be:

F = (variance/differences between treatments, without individual differences) / (variance/differences expected with no treatment effect, without individual differences)

The
biggest change between independent-measures ANOVA and repeated-measures ANOVA is the addition of a process to mathematically remove the individual-differences variance component from the denominator of the F-ratio. The numerator of the F-ratio includes systematic differences caused by treatments; unsystematic differences caused by random factors are reduced because the same individuals are in all treatments. The denominator estimates the variance it is reasonable to expect from unsystematic factors; the effect of individual differences is removed mathematically, and only residual (error) variance remains.

Advantages of repeated-measures designs:
- Individual differences among participants do not influence outcomes.
- A smaller number of participants is needed to test all the treatments.

Disadvantages of repeated-measures designs:
- Some unknown factor other than the treatment may cause participants' scores to change.
- Practice or experience may affect scores independently of the actual treatment effect.
- Boredom, fatigue, or carryover effects may affect scores independently of the actual treatment effect.

Note: We will never actually calculate a repeated-measures ANOVA in this class, but you need to know the theory behind it.

Factorial ANOVA
The structure of a factorial ANOVA consists of three distinct tests:
- Main effect of Factor A
- Main effect of Factor B
- Interaction of A and B (interaction: the effect of one factor depends on the level or value of the other)

A separate F-test is conducted for each one, and the results of one test are independent from those of the other tests. The theoretical formula for the F-ratio in this case would be:

F = (variance/mean differences between treatments) / (variance/mean differences expected if there is no treatment effect)

An example of the three different tests:

Test     F     p
A        3    .8
B       12    .04
A x B   30    .001

Test A would test for the main effect of Factor A, and likewise for Test B. Test A x B tests for the interaction between Factors A and B. The F values for each test are
independent of each other. The main effect of Factor A is not significant, but the main effect of Factor B and the interaction between A and B are significant. To properly test the interaction you must test the main effects as well. However, once you have a significant interaction, you cannot reliably talk about the main effects; simply report that they exist and then focus on the importance and implications of the interaction. An interaction indicates a dependence of factors, where the effect of one factor depends on the level or value of another factor. Interactions are sometimes called non-additive effects because the main effects do not add together predictably. On a graph, an interaction is indicated by non-parallel lines (lines that cross, converge, or diverge).

[Graphs 1 and 2 plot mean errors against audience condition (no audience vs. audience) for low and high self-esteem groups. Graph 1 shows no interaction between self-esteem and the presence of an audience (parallel lines); Graph 2 shows an interaction (non-parallel lines).]

In other words, a person's self-esteem and the presence or absence of an audience work together to affect how many mean errors they make. The presence or absence of an audience does not affect the mean errors of a person with high self-esteem, but it does affect the mean errors of a person with low self-esteem. Interaction implies moderation. Moderation occurs when the relationship between an IV and a DV varies depending on the level of a third variable (the moderator). Tests of interactions are typically testing moderation hypotheses. Sometimes you can have an interaction with the same variable, in a test of non-linearity. This would be an interaction of A x A. In other words, the effect of an IV on the DV may differ based on the level of that IV. Example: Take the effects of drinking on social skills. At low-to-moderate drinking levels, your social
skills might increase (a positive relationship). However, at high drinking levels, your social skills are likely to decrease (a negative relationship). Graphing this relationship would give you an upside-down parabola.

4/17/15: Introduction to Correlation

Characteristics of Correlation
A correlation measures and describes the relationship between two variables. Relationships have three characteristics, all of which are independent of each other:
- Form: The most common form is linear, and that is the only type we will worry about in this class.
- Direction (positive or negative): Direction is indicated by the sign (+ or -) of the correlation coefficient r. If both variables move in the same direction (as one increases, the other increases), the direction is positive. If the variables move in different directions (as one increases, the other decreases), the direction is negative.
- Strength (or consistency):
  - Varies from 0 to ±1.
  - 0 means that there is no relationship, and ±1 is a perfect relationship.
  - An r of -1 is the same strength as an r of +1.
  - The closer r is to +1 or -1, the stronger the relationship.
  - A correlation coefficient cannot be greater/smaller than ±1.

[Graph 1 plots beer sales against temperature (in degrees F) and shows a positive relationship: as temperature increases, beer sales increase. Graph 2 plots coffee sales against temperature and shows a negative relationship: as temperature increases, coffee sales decrease.]

[Scatterplots A through D:] A shows a perfect negative correlation (r = -1). B has no linear trend, and the variables are uncorrelated (r = 0). C shows a strong positive correlation (about .8 or .9). D shows a weak negative correlation (about -.3 or -.4). Notice that the graphs where the data points are closer to a line show stronger correlations. The graphs where the data points are more scattered and less linear show weaker
correlations.

Pearson Correlation
The Pearson correlation is the most common type of correlation calculated. It measures the degree and direction of the linear relationship between two variables. Recall that a perfect linear relationship will have a correlation of +1 or -1. The Pearson correlation is a ratio comparing the covariability of X and Y (numerator) with the variability of X and Y separately (denominator). In other words, the theoretical definition of r is:

r = (covariability of X and Y) / (variability of X and Y separately)

Formula for calculating r:

r = SP / sqrt(SS_X * SS_Y), where SP = ΣXY - (ΣX)(ΣY)/n

Note: SP is the sum of products and is similar to SS (the sum of squared deviations); it measures the amount of covariability between two variables. (We will actually calculate correlations next week.)

Correlations are used for:
- Prediction
- Validity
- Reliability
- Theory verification

Correlation describes a relationship but does not demonstrate causation (correlation does not prove causation). Establishing causation requires a well-designed experiment in which one variable is manipulated and others are carefully controlled. Example: Height is positively correlated with intelligence, but height does not cause intelligence; as you get older, you grow taller and more intelligent (hopefully).

Coefficient of Determination
A correlation coefficient measures the degree of relationship on a scale from 0 to ±1, and it is easy to mistakenly interpret this decimal number as a percent or proportion. Correlation is not a proportion; it is only a decimal. However, the squared correlation may be interpreted as the proportion of shared variability. This squared correlation is called the coefficient of determination, or r² (yes, you do find r² by squaring the value you get for r). The coefficient of determination measures the proportion of variability in one variable that can be determined from the relationship with the other variable (shared variability). For example, if r² = .8, then the variables share 80% of their variability. Know the difference between r and
r²:
- r
  - Correlation coefficient
  - Measures the degree of relationship between two variables
- r²
  - Coefficient of determination
  - Measures the proportion of shared variability between two variables

Outliers
An outlier is an extremely deviant individual in the sample, characterized by a much larger or smaller score than all the others in the sample. Outliers are easily recognizable in scatterplots and produce a disproportionately large impact on the correlation coefficient.

[Graph 1 plots the original five data points (r = -.08); Graph 2 plots the same data with a sixth point, an outlier, included.]

Original Data                 Data with Outlier Included
Subject   X    Y              Subject   X    Y
A         1    3              A         1    3
B         3    5              B         3    5
C         6    4              C         6    4
D         4    1              D         4    1
E         5    2              E         5    2
                              F        14   12

Graphs 1 and 2 share the same first five points, while Graph 2 has an additional sixth point, which is an outlier. You can see that Graph 1 has a correlation of almost 0; there is basically no relationship between X and Y. However, Graph 2 shows a very strong positive relationship (r = .85). Looking at Graph 2 reveals that there still appears to be no relationship among the variables, but the single outlier has completely distorted the correlation coefficient. If you have an outlier, you must decide what to do with it. If you can find that it is an error (data was entered wrong, the participant lied, etc.), then you can delete it. If you cannot be sure that the outlier is an error, then you can do the analysis with and without the outlier so that you get two separate correlation coefficients. Whatever you decide to do with the outlier, you must explain it in your report of the results.

Hypothesis Testing with r
The Pearson correlation is usually computed for sample data but can be used to test hypotheses about the relationship in the population. The population correlation is symbolized by the Greek letter rho (ρ).

The hypotheses:
- Non-directional:
  - H0: ρ = 0
  - H1: ρ ≠ 0
- Directional:
  - H0: ρ ≤ 0 and H1: ρ > 0, OR
  - H0: ρ ≥ 0 and H1: ρ < 0

Note: You will almost always perform
directional hypothesis tests with the correlation.

You can test the significance in three different ways:
- Look it up in Table B.6 (Critical Values for the Pearson Correlation)
- t-test
- F-test

The sample correlation r is used to test the population ρ, where the degrees of freedom is n - 2. If you use the t table to find a critical value (using df = n - 2), then use the equation

t = r / sqrt((1 - r²) / (n - 2))

to get a t_obt value. However, the easiest way to test the hypotheses is to just look up the critical value in Table B.6.

[Table B.6, Critical Values for the Pearson Correlation: critical r values by df for one-tailed and two-tailed tests at various alpha levels.]

Decide whether you have a one-tailed (directional) or two-tailed (non-directional) test, pick your alpha level, and determine the df to find the r_crit value.

Example: You calculated r = .68 for a sample of n = 30. You did a one-tailed test with an alpha level of .05. Is r significant? Note that df = 28, because df = n - 2 = 30 - 2 = 28. The table gives r_crit = .306. Since r_crit = .306 < .68 = r_obt, reject H0: the variables are positively related.
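The outlier example above can be reproduced numerically. Here is a minimal Python sketch of the r = SP / sqrt(SS_X * SS_Y) formula (the function name is mine), using the six data points from the outlier tables:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation via r = SP / sqrt(SS_X * SS_Y), computational formulas."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n  # sum of products
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n                 # SS for X
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n                 # SS for Y
    return sp / math.sqrt(ss_x * ss_y)

x = [1, 3, 6, 4, 5]   # subjects A-E from the outlier example
y = [3, 5, 4, 1, 2]
print(round(pearson_r(x, y), 2))                # -0.08 (essentially no relationship)
print(round(pearson_r(x + [14], y + [12]), 2))  # 0.85 (one outlier distorts r)
```

A single added point moves r from about -.08 to about .85, which is exactly the distortion the notes warn about.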
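The t-test route for testing H0: ρ = 0 can be sketched the same way, using the conversion t = r / sqrt((1 - r²)/(n - 2)) from the notes and the r = .68, n = 30 example (the helper name is mine):

```python
import math

def r_to_t(r, n):
    """Convert a sample correlation r to a t statistic with df = n - 2."""
    df = n - 2
    return r / math.sqrt((1 - r ** 2) / df)

t_obt = r_to_t(0.68, 30)
print(round(t_obt, 2))  # 4.91, far past any one-tailed .05 critical t for df = 28
```

Either route (comparing r to r_crit = .306 or t_obt to the t table at df = 28) leads to the same decision: reject H0.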