PSY 3392: Exam 3 Review
This 11 page Study Guide was uploaded by Kimberly Notetaker on Friday March 25, 2016. The Study Guide belongs to PSY 3392 at University of Texas at Dallas taught by Noah Sasson in Summer 2015.
Date Created: 03/25/16
3392 Exam 3 Review

I. Null Hypothesis Testing (NHST) (ch. 12, p. 384-90; lecture)
  a. Terms to understand: null hypothesis (H0), alternative hypothesis (H1), Type I error, Type II error, alpha level, p-value (level of significance), effect size, power
    o Null Hypothesis (H0)
        The IV has no effect on the DV
        There is no difference between groups
        There is no association between variables
        Example: There is no difference between estimated and actual calories consumed
    o Alternative Hypothesis (H1): There is a difference
        The IV does have an effect on the DV
        There is a difference between groups
        There is an association between variables
        Example: There is a difference between estimated and actual calories consumed
    o P value: the probability of obtaining the obtained effect if the null hypothesis were true (it is not the probability that the null hypothesis is true); we want a small p value because it means our result would be unlikely if the null hypothesis were true
    o Alpha level: the threshold the p value needs to be under to reject the null hypothesis (.05)
    o Type I Error:
      » When you reject the null hypothesis but it is actually true; a false positive
      » In other words, you find statistically significant results to support your hypothesis (i.e., you reject the null hypothesis), but the effect isn’t real!
    o Type II Error:
      » You fail to reject the null hypothesis when it is actually false; a false negative
      » In other words, you fail to find statistical significance for a real effect!
        *usually caused by small sample size (the p value is highly sensitive to sample size)
    o Effect Size: An estimate of the size of an effect that is mostly independent of sample size (the p-value is very dependent on sample size)
    o Power = the probability that the null hypothesis will be correctly rejected when it is false; the ability to detect statistically significant effects
  b. Should know that NHST is based on probability (can support a hypothesis but can never prove or disprove one; there is always a possibility of error).
  c. Should be able to identify null and alternative hypotheses
  d.
What does it mean to reject the null hypothesis? It is likely that we have a true difference
  e. Should know what a p-value is: “The probability of obtaining the obtained effect if the null hypothesis were true” (don’t just memorize; understand what this means!). Be able to conclude whether you should reject or fail to reject a null hypothesis based upon a given p-value.
    o Two Possible Outcomes:
        p < .05
          » Reject the Null Hypothesis
          » Supports the alternative hypothesis (the study hypothesis)
        p > .05
          » Fail to reject the Null Hypothesis
          » Does not support the alternative hypothesis
    o Cannot prove either H0 or H1
  f. Should understand Type I and Type II error and how they relate to mistakenly rejecting/failing to reject the null hypothesis. Should understand the most common reason why each occurs.
    o Two types of error: Type I (thinking a man is pregnant) and Type II (thinking a pregnant woman isn’t)
    o Type I Error:
        When you reject the null hypothesis but it is actually true; a false positive
        In other words, you find statistically significant results to support your hypothesis (i.e., you reject the null hypothesis), but the effect isn’t real!
        Even the improbable (say p < .05) is still possible
        But most commonly occurs when running many analyses (a huge problem in psychology studies)
          o Can reduce Type I error by using a stricter alpha level (say, p < .01 instead of p < .05).
          o Current movement towards “pre-registration”.
        Why is Type I Error a problem? A “miracle drug” appears to work when it actually doesn’t
    o Type II Error:
        You fail to reject the null hypothesis when it is actually false; a false negative
        In other words, you fail to find statistical significance for a real effect!
        *usually caused by small sample size (the p value is highly sensitive to sample size)
        Why is Type II error a problem? There is an effect of the drug, but the sample is not big enough to detect it, so a drug that COULD be beneficial goes unused
  g. Should understand the concepts of effect size and why it is important.
What does the effect size tell you that the p value does not? Is the effect size affected by sample size? Is the p value? You do NOT need to be able to calculate effect sizes but should know about Cohen’s d and Pearson’s r.
    Effect Size
    o Very important!
    o NHST only tells us whether or not there was an effect, but tells us nothing about the magnitude of the effect
        Effect sizes do this
        Ex: Study finds new drug extends life for terminal cancer patients compared to usual treatment (p < .001)
          But extends it for how long? Years? Just several days?
          NHST just tells us the drug extends life, but not by how much.
        Reaching statistical significance (e.g., p < .05) tells you if there is an effect
        Effect size tells you the size of an effect
    o Why are Effect Sizes Important?
        Knowing if there is an effect (NHST) is not the only information you need to evaluate results. You also want to know the size of the effect!
        Effect sizes:
          Can help determine if a statistically significant effect is meaningful in applied terms
          Are useful in comparing effects across studies (meta-analysis)
          Can be used in pilot data to determine if it’s worthwhile to collect a larger sample
    Pearson’s r (the effect size of a correlation)
      r = the strength of the association between two variables
      What is considered a small vs. a large correlation differs by field.
      o In psychology:
          None: r = 0 to .10
          Small: r > .10
          Medium: r > .24
          Large: r > .37
    Cohen’s d (the effect size of the difference between groups)
      Determined by:
      o The size of the difference between the two groups
      o The amount of variability within the groups
      Small: d > .20
      Medium: d > .50
      Large: d > .80
  h. Power: more of it makes it more likely you will detect an effect. What are the best ways to increase power in your study?
    o Three things determine power:
        Significance level
          A smaller alpha level means less power
          While lower alpha levels decrease the likelihood of Type I error, they increase the likelihood of Type II error
        Effect size
          It’s easier to detect a large effect than a small one
        Sample size
          The most important determinant, and under your control
          The larger the sample size, the greater the likelihood of detecting a significant effect (because a larger sample is a better estimate of the population)
  i. Should be able to draw accurate conclusions based on the p-value, sample size (n), and effect size. You should also be able to use these indices to figure out if Type I or Type II error might be occurring.
    Practice:
      » Results: p = .09, d = .83, n = 20
      » Do you REJECT or FAIL TO REJECT the null hypothesis? Fail to Reject
      » What type of error might have occurred here? Type II Error; large effect size (likely that the null hypothesis isn’t true)
      » What is a likely reason that error may have occurred? Small sample

      » Results: p = .04, d = .001, n = 12,367
      » Do you REJECT or FAIL TO REJECT the null hypothesis? Reject
      » What type of error might have occurred here? Type I Error
      » What is a likely reason that error may have occurred? Very large sample: with n = 12,367, even a negligible effect (d = .001) can reach statistical significance
  j. Should know why replicating findings is important

II. Experimental methods/Independent Group Designs (ch. 6 lecture)
  a. An experiment is used to infer causality by using:
    i. Manipulation of the IV: the experimenter manipulates what level of the IV a participant receives
    ii. Control (all conditions held constant except the IV; why is this a must?)
      1. Control requires balanced samples: what does this mean and why is random assignment the best way to achieve it?
        o Random assignment means the groups are balanced on all other variables
  b. Be able to identify a one group pretest-posttest design and understand why it is not an experiment.
  c. What is required to make a causal inference? Why does an experiment make this possible?
    i.
“Co-variation”: performance on the DV changes for different levels of the IV (e.g., exam score differs for those who had caffeine compared to those who did not)
    ii. Time-order relationship: IV must precede DV
    iii. Elimination of plausible alternative explanations (this is a big one! It is only achieved through balanced groups, and it is the one that a pretest-posttest design fails to meet)
  d. Understand why experiments are the best way to determine causality. If control is sufficient and all causal requirements are met, then any differences between levels of the IV must be caused by the independent variable. Be able to tell if a finding is causal or correlational.
  e. What does it mean for an experiment to have high internal validity?
    » A study with high internal validity has no confounding variables.
  f. What are some threats to internal validity?
    i. Confounds and extraneous variables
    ii. Systematic differences between groups
      1. What are intact groups and why are they a problem?
    iii. Attrition
      1. Mechanical Loss (not too big a deal if infrequent and occurring randomly, i.e., non-systematically)
      2. Selective Subject Loss (this is a big deal); remember the fitness club example.
    iv. Participant and experimenter biases
      1. Can control by using a double-blind design
  g. What is an independent groups design? What are between groups and within groups differences?
    Independent Groups (“Between Subjects”) Designs
    o Independent Variable
        Multiple Groups
          Experimental (aka Treatment) Group(s)
          Control Group(s)
    o Compare differences between groups (the effect of the IV) on the DV while controlling for differences within groups (individual differences or error)
        Importance of balanced samples
    o Separate group of participants for each level of an IV
        Can have multiple groups, usually limited to 3-5 at most
    Between Groups vs.
Within Groups
      Independent Group Designs analyze between group effects
        » Compare Between Group and Within Group variability
      Repeated Measures Designs analyze within group effects
        » Eliminate within groups individual differences
        » Nonsystematic variance (i.e., individual differences and error: factors that are not under experimental control) is reduced
        » For example, it is now impossible to have brighter or more motivated people in one condition vs. the other because we are using the SAME people for both conditions.
        » But this introduces new problems, namely practice effects
  h. Why can’t random assignment always be used? Know why matched and natural group designs cannot use random assignment, and how the experimenter might try to balance the groups without random assignment. What are the best variables to match groups on in a matched group design?
    Matched Groups
      What do you do when you have a smaller sample and randomization won’t work well?
      o Instead of random assignment, the experimenter matches the groups to balance them
          Match the groups on variable(s) relevant to the study
          These include:
            1. The DV
              » Ex: weight or BMI when testing an exercise regimen
            2. A variable related to the DV
              » Ex: teaching problem solving will result in better success on a puzzle
            3. An important potential confound
    Natural Groups
      Naturally occurring IVs
      o E.g., gender, abuse history, divorce, etc.
      o IVs that you can’t ethically or practically manipulate
      Problem: not a “true” experiment
      o Natural groups may differ systematically
      o Technically, we can’t make causal statements when comparing natural groups
      o Technically, no IV manipulation by the experimenter = not an experimental design
  i. There are therefore three different types of independent groups designs. How are the groups determined in each?
    Random, Natural, Matched
    Random Groups
      Balance out individual differences between groups
      Assumes the distribution of within group differences (individual characteristics) is the same in each group
      Randomly assign participants to levels of the IV
      o Not alternating assignment; must be truly random

III. Repeated Measure Design (ch. 7 lecture)
  a. What is a repeated measures design? How does it differ from an independent group design?
    Repeated Measures Designs analyze within group effects
      » Each participant completes all levels of the IV of the experiment
          Levels in RMD are often called “conditions”; you are comparing results between conditions
      » Thus, participants serve as their own controls
          In independent group designs, each level is a different group, so we are constantly worried about the groups being balanced (randomization helps minimize this concern)
          In RMD, your levels are always perfectly balanced!
          Ex: it is now impossible to have smarter or more motivated people in one condition vs. the other because we are using the SAME people for both conditions.
          But this introduces new problems, namely practice effects.
  b. How are confounds that might occur in independent groups designs eliminated in a repeated measures design?
    Control through counterbalancing
    o Balances the order in which conditions are administered to control or average out practice effects. Changes in the DV therefore cannot be due to practice effects.
    o Counterbalancing is to RMD as randomization is to IGD: it maximizes internal validity
  c. Understand why practice effects are a unique problem for repeated measure designs. Also, what are anticipation effects? Know why anticipation effects are a problem with ABBA counterbalancing.
    o Anticipation effects: a type of practice effect in which the participant perceives the ABBA pattern and changes their responses accordingly
  d. What is the purpose of counterbalancing anyway? We must counterbalance!
    » Controls for practice effects by spreading them evenly across conditions
    » Two basic types: complete and incomplete designs
  e. What are some advantages and disadvantages of the repeated measures design?
    » Advantages (of RMD)
      o Requires fewer participants
          Each participant is administered every level of the IV, so separate groups are not needed
      o This means you have greater power:
          You get the equivalent of a larger sample with fewer participants
          Small differences between conditions become easier to detect because other extraneous variables are eliminated
      o Commonly used when participants rate different stimuli
          Good for assessing comparisons or preferences; Ex: desirability of car color; facial attractiveness
    » Disadvantages
      o Practice Effects (only with RMD)
          Confounds of individual differences between groups are eliminated (all individuals complete all IV levels), BUT:
          This introduces a new possible confound: practice effects
          Change in participants’ responses over time because of:
            1. Learning the task (performance improves); a literal practice effect, or
            2. Boredom or fatigue (performance worsens)
          If uncorrected, changes on the DV might be driven by practice effects and not the IV!
          Practice effects are the primary confound in RMD.
          o Just like unbalanced groups are the primary confound in IGD
          Control through counterbalancing
          o Balances the order in which conditions are administered to control or average out practice effects. Changes in the DV therefore cannot be due to practice effects.
          o Counterbalancing is to RMD as randomization is to IGD: it maximizes internal validity
  f. What is the difference between Complete and Incomplete counterbalancing? Why is ABBA a type of Complete counterbalancing? Why is “All Possible Orders” a type of Incomplete counterbalancing? Know how to figure out how many orders would be required to counterbalance with All Possible Orders (i.e., N!).
    o In a complete design, all conditions are administered to each participant several times, using different orders each time (using ABBA or Block Randomization)
    o Practice effects are controlled for every participant
        Happy vs. Sad vs. Angry faces example
    o In an incomplete design, all conditions are administered to a participant only once, in a single order
    o Thus, data for any single participant are still confounded by practice effects
    o Practice effects are controlled only for the full group
    ABBA Counterbalancing
    o Conditions are presented in one sequence and then in reverse
    o Ex: HH, SS, HS, SH, SH, HS, SS, HH
        So, this example is not ABBA, but rather ABCDDCBA
        The average position for each condition is equal
    o ABBA is not appropriate if…
        Practice effects are non-linear
        o Ex: Primacy and recency effects in memory
            Recall of emotion and non-emotion words: E, NE, NE, E
            E may be better recalled because of primacy and recency
    All Possible Orders: Best choice for an Incomplete Design… but only when you have few conditions
    o Each participant is randomly assigned to one of all possible orders
    o Problem: there are N! possible orders, so you must have at least N! participants
  g. What is differential transfer? Why is it problematic for repeated measures designs?
    Differential Transfer
    o A problem in some repeated measures designs
    o Performance on one condition is dependent on the condition that precedes it (i.e., effects carry over)
        Instructions (puzzle example)
        Interventions
    o If differential transfer is a possibility, don’t use repeated measures.
      » Use an independent groups design!
    o Test for differential transfer by comparing results of repeated measures and independent groups

IV. Journal Article
  a. A series of studies were reported in this article concerning factors that affect how accurate heterosexual women are at detecting whether a man is gay or straight. What factor in the first study improved their accuracy?
    The nearer women were to peak ovulation, the more accurate they were in judging men’s sexual orientation.
  b. How did study two clarify the findings from study one?
    Study 2 data were consistent with the hypothesis that women’s accuracy in judging men’s sexual orientation increases nearer peak ovulation because male sexual orientation is relevant to women’s reproductive success. However, Study 2 demonstrated that this accuracy did not extend to female targets.
  c. In the third study, what additional factor was determined to improve the accuracy of judging male sexual orientation? What type of research design was used in study 3a?
    Study 3 compared the judgments of women primed with a mating goal with the judgments of non-primed control subjects for both male and female targets.
    Research design in study 3a: experimental design
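Worked example for section III.f (All Possible Orders): the arithmetic behind "N! possible orders" can be checked with a short Python sketch. It lists every order for the three face conditions (Happy, Sad, Angry) and randomly hands those orders out to participants. The function names, the participant IDs, and the seed are made up for illustration, and the requirement that the number of participants be a multiple of N! is an assumption used here to keep the design balanced; none of this comes from the course materials.

```python
import math
import random
from itertools import permutations


def all_possible_orders(conditions):
    """Return every ordering of the conditions (N! orders for N conditions)."""
    return list(permutations(conditions))


def assign_orders(participants, conditions, seed=0):
    """Give each participant one order, using every order equally often.

    Hypothetical helper: assumes the number of participants is a
    multiple of N! so that each order is used the same number of times.
    """
    orders = all_possible_orders(conditions)
    if len(participants) % len(orders) != 0:
        raise ValueError("need a multiple of N! participants for a balanced design")
    # Repeat the full set of orders, then shuffle who receives which order.
    pool = orders * (len(participants) // len(orders))
    rng = random.Random(seed)
    rng.shuffle(pool)
    return dict(zip(participants, pool))


conditions = ["Happy", "Sad", "Angry"]
orders = all_possible_orders(conditions)
print(len(orders), "orders; N! =", math.factorial(len(conditions)))

participants = [f"P{i}" for i in range(1, 7)]  # made-up participant IDs
for person, order in assign_orders(participants, conditions).items():
    print(person, order)
```

With 4 conditions the count is already 4! = 24 orders, and with 5 it is 5! = 120, which is why All Possible Orders only suits designs with few conditions.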