PSY290 EXAM 3 notes R. Stuetzle
This 40 page Study Guide was uploaded by Eureka on Sunday January 17, 2016. The Study Guide belongs to PSY 290 at University of Miami taught by Rick Stuetzle in Spring2015. Since its upload, it has received 79 views.
Date Created: 01/17/16
Chapter 9
Woodworth:
The experimental method manipulates independent variables.
The correlational method “measures two or more characteristics of the same individual [and] computes the correlation of these characteristics.”
Woodworth treated these two research strategies as being of equal value.
Difference between the two disciplines, experimental and correlational psychology:
Correlational psychology is concerned with investigating the relationships between naturally occurring variables and with studying individual differences.
The experimental psychologist is not usually interested in individual differences, but rather in minimizing or controlling these differences in order to show that some stimulus factor influences every individual’s behavior in a predictable way to a measurable degree.
The correlationist observes variables and relates them; the experimentalist manipulates variables and observes the outcome.
The correlationist looks for ways in which people differ from each other; the experimentalist looks for general laws that apply to everyone.
Box 9.1 Origins: Galton’s studies of genius
Galton was a pioneer in the empirical study of intelligence, among the first to make a strong case that genius is inherited and not the result of one’s upbringing. Along the way, he invented correlations. He used scatterplots to show that mid-parent heights correlate with the heights of their adult children: tall parents tend to have tall children, and short parents tend to have short children. He also observed the “regression to the mean” phenomenon: the children’s heights tended to drift back to, or regress to, the mean for the population. In describing regression to the mean, Galton discovered the main features of a correlation/regression analysis.
Correlational research
Focuses on examining the relationships among variables
o Compared to experimental research, there is no manipulation
o Allows for the study of individual differences but does not allow for conclusions about causality
Correlation is NOT equal to causation
Two variables that are related to each other “in some fashion” are said to be correlated
Direction of Correlation
o Positive correlation (direct correlation): a high score on one variable is associated with a high score on a second variable
E.g., height is positively correlated with weight
E.g., study time is positively correlated with GPA
o Negative correlation (inverse correlation): a high score on one variable is associated with a low score on a second variable
E.g., # weeks of training is negatively correlated with race time
E.g., amount of time goofing off is negatively correlated with GPA
Strength of Correlation (Pearson’s r)
o The strength of a correlation is indicated by the size of a statistic called the coefficient of correlation, which ranges from -1.00 for a perfect negative correlation, through 0.00 for no relationship, to +1.00 for a perfect positive correlation.
o The most common coefficient is Pearson’s r, which is calculated for data measured on either an interval or a ratio scale.
o Like means and standard deviations, a coefficient of correlation is a descriptive statistic. The inferential analysis for correlations involves determining whether a particular correlation is significantly different from zero. That is, in correlational research, the null hypothesis (H0) is that the true value of r is 0 (i.e., no relationship exists); the alternative hypothesis (H1) is that r ≠ 0. Rejecting the null hypothesis means deciding that a significant relationship between the two variables exists.
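To make the descriptive and inferential steps above concrete, here is a minimal Python sketch. It computes Pearson's r from its definition and then the usual t statistic for testing H0: r = 0 (with n - 2 degrees of freedom). The study-hours and GPA numbers are hypothetical, made up only for illustration, and the function names are not from the course materials.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for paired scores (interval or ratio data)."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def t_for_r(r, n):
    """t statistic (df = n - 2) for testing H0: the true r is 0.

    Undefined when r is exactly +1 or -1.
    """
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical data: weekly study hours and GPA for six students.
study_hours = [2, 4, 5, 7, 9, 12]
gpa = [2.1, 2.6, 2.4, 3.0, 3.3, 3.6]

r = pearson_r(study_hours, gpa)
t = t_for_r(r, len(gpa))
print(f"r = {r:.2f}, t({len(gpa) - 2}) = {t:.2f}")
```

The t value would then be compared with the critical value from a t table for the chosen alpha level; if it exceeds the critical value, H0 is rejected.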
Using Scatterplots to Visually Represent Correlations:
A scatterplot provides a visual representation of the relationship shown by a correlation.
Each individual data point is plotted (rather than mean values, as in experimental designs):
          x   y
Greg      6   8
Heather   4   6
Sam       5   7
Will      3   5
As the strength of a correlation weakens, points on the scatterplot move further away from the diagonal lines of perfect correlation.
Perfect positive (9.1a) and perfect negative (9.1b) correlations produce points falling on a straight line, whereas a correlation of zero yields a scatterplot (9.1c) in which the points appear to be randomly distributed on the surface of the graph. Compared to those for relatively weak correlations (9.1d and 9.1e), the points bunch closer together for relatively strong ones (9.1f and 9.1g). Figure 9.2 shows how a scatterplot is created from a set of data.
Threats to the detection of linear relations between variables
Non-Linearity
o Pearson’s r describes the direction and magnitude of the linear association between variables. If the variables are associated in a non-linear fashion, Pearson’s r cannot identify the nature of the relationship.
o Example of a non-linear or curvilinear relationship in psychology? What would the scatterplot look like for such an association? Where would the line of best fit be in such a plot?
Yerkes-Dodson: the points on the scatterplot would fall consistently along a curved (inverted-U) line, but applying a linear correlational procedure would yield a Pearson’s r of zero or very close to it. You can see how this would happen: the left half of the curve is essentially a strong positive correlation; the right half is a strong negative correlation, which would cancel out the effect of the left side when calculating a Pearson’s r.
Restricted Range
o In a correlational study you want to maximize the range of scores on your measures
o Why?
Restricting the range of one (or both) of the measured variables weakens the correlation. If the range of SAT scores is restricted to those of 1200 or above, the correlation drops considerably.
Procedures exist for “correcting” correlations to account for the range restriction problem, but one must be aware that restricting the range has direct effects on the ability to make predictions. Highly selective schools using an “SAT = 1200” cutoff will certainly be getting a lot of good students, but their ability to predict grades from SAT scores will not be as great as in a school without such a cutoff point. The correlation between SAT and academic success will be higher at the less restrictive school than at the more restrictive school.
Similarly, GRE scores and GPA are not strongly related in graduate school, because admissions cutoffs mean that the people who get in have already been selected for high scores and good grades.
Being Aware of Outliers:
An outlier is a score that is dramatically different from the remaining scores in a data set. An outlier can seriously distort a correlation: including it could lead one to make a Type I error (you think there is a relationship, based on rejecting the null hypothesis, but there really is no relationship). Remove the outlier before calculating the correlation.
Interpreting the meaning of a correlation:
r values range from -1.00 to 1.00
r² = coefficient of determination
o Because it is a squared value, it will always be a positive number
o r² = portion of variability in one of the variables in the correlation that can be accounted for by variability in the second variable
o E.g., depression is correlated with GPA: r = .50, so r² = .25 = 25%. Interpretation: only a quarter (25%) of the variability in GPA scores can be associated with depression.
What does 1 - r² represent? Presumably, the remaining 75% would be related to other factors, such as the ones listed previously (study habits, etc.).
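The two threats discussed above (non-linearity and restricted range) and the r² interpretation can be illustrated with a short Python sketch. All numbers here are invented: the inverted-U scores are constructed by hand, and the SAT/GPA values are simulated from an assumed linear model plus noise, not real admissions data.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# 1. Non-linearity: a hand-built inverted U (Yerkes-Dodson style).
#    Performance rises with arousal up to the midpoint, then falls.
arousal = list(range(1, 10))
performance = [16 - (a - 5) ** 2 for a in arousal]
r_curve = pearson_r(arousal, performance)

# 2. Restricted range: simulated (not real) SAT scores and GPAs.
#    Assumed model: GPA depends linearly on SAT, plus random noise.
rng = random.Random(0)  # fixed seed so the run is repeatable
sat = [rng.gauss(1000, 200) for _ in range(2000)]
gpa = [2.5 + 0.0015 * (s - 1000) + rng.gauss(0, 0.4) for s in sat]
r_full = pearson_r(sat, gpa)

# Keep only "admitted" students at or above a 1200 cutoff.
selected = [(s, g) for s, g in zip(sat, gpa) if s >= 1200]
r_restricted = pearson_r([s for s, _ in selected],
                         [g for _, g in selected])

print(f"inverted U:       r = {r_curve:.2f}")
print(f"full SAT range:   r = {r_full:.2f}, r^2 = {r_full ** 2:.2f}")
print(f"SAT >= 1200 only: r = {r_restricted:.2f}, "
      f"r^2 = {r_restricted ** 2:.2f}")
```

The inverted-U data follow a perfectly orderly pattern, yet the positive left half and negative right half cancel in the linear statistic. In the simulation, the same underlying SAT-GPA relationship yields a weaker r once only the high scorers are kept.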
Making Predictions: Regression Analysis
Making predictions based on correlational research = regression analysis
If X and Y are strongly correlated, knowing the score on X allows you to predict the score on Y
Regression line: Y(prime) = a + bX
where a = the place where the line crosses the Y-axis (i.e., the Y-intercept)
b = the slope
The line is used for making the predictions and is called the line of best fit; it provides the best possible way of summarizing the points on the scatterplot. More precisely, it is the line for which the sum of the squared vertical distances between the points and the line is at a minimum (the least-squares criterion).
In order to predict with confidence, however, the correlation must be significantly different from zero. The higher the correlation, the closer the points on the scatterplot will be to the regression line and the more confident you can be in your predictions. That confidence can be expressed mathematically in the form of a confidence interval (typically 95% or 99%).
One final point about a regression analysis is both procedural and ethical. In general, predictions should be made only for people who fall within the range of scores on which the correlation is based. The equation should not be used to predict success for any future applicant who is not part of that population.
Interpreting Correlations:
Correlation does not equal causation
o A correlation between two variables does not allow you to conclude that one of the variables is causing the other to occur
o A good example of the type of research that is easily misinterpreted in the press: the implication of causality often arises because the news report uses the term link, as in, for example, “Researchers have established a link between baldness and heart disease.” Researchers found that coronary heart disease among these doctors was associated with vertex baldness.
Most press reports were careful to point out the link was not causal, but the uncritical reader, making more than should be made of the term link, might conclude that going bald is a direct cause of having a heart attack.
o How is this different from the experimental research we’ve discussed up until now (e.g., when an independent variable is manipulated)?
In an experimental study with a manipulated independent variable, we’ve already seen that cause-and-effect conclusions can be drawn with some degree of confidence. The independent variable of interest is manipulated and, if all else is held constant (i.e., no confounds), the results can be attributed directly to the independent variable. With correlational research, however, the all-else-held-constant feature is missing, and this lack of control makes it impossible to conclude anything about cause and effect from a simple correlation.
Directionality Problem:
If there is a correlation between two variables, A and B, it is possible that A is causing B to occur (A → B), but it also could be that B is causing A to occur (B → A). That the causal relation could occur in either direction is known as the directionality problem.
o There is a correlation between the amount of TV children watch and the frequency of aggressive behavior (a chicken-and-egg question)
Watching TV might cause aggressive behavior
BUT aggressive children might like to watch TV more than nonaggressive children
Based on a correlation alone, you can’t tell which of these is true; the most you can say is that there is an association between TV watching and aggression
o A study described in the New York Times in 2008 examined the research productivity and beer consumption of ornithologists in the Czech Republic.
They found a negative correlation: the more beer consumed by ornithologists, the less productive they were as scientists.
Directionality: drinking lots of beer might cause Czech ornithologists to fail in their publishing efforts (A → B), but it is also possible that failing to publish causes Czech ornithologists to drink more beer (B → A).
o Techniques for examining directionality?
Research psychologists are generally satisfied with attributing causality between A and B when they occur together with some regularity and when:
1. A precedes B in time,
2. A causing B makes sense in relation to some theory, and
3. Other explanations for their co-occurrence can be ruled out.
o For the TV and aggressiveness study, all we have is A and B occurring together and the fact that A causing B makes some sense from what is known about observational learning theory (Bandura’s Bobo doll study)
Cross-Lagged Panel Correlation:
With this procedure it is possible to increase one’s confidence about directionality. In essence, the procedure investigates correlations between variables at several points in time. Hence, it is a type of longitudinal design, adding the causal element of A preceding B.
o To talk about A causing B, you need to show:
1. A and B occur together,
2. A precedes B in time,
3. A causing B makes sense in relation to your theory,
4. Other explanations for co-occurrence can be ruled out
o Figure 9.8: Of special interest are the diagonal or cross-lagged correlations because they measure the relationships between the two main variables, but separated in time. If third-grade aggressiveness caused a later preference for watching violent TV (B → A), then we would expect a fair-sized correlation between aggressiveness at time 1 and preference at time 2; in fact, the correlation is virtually zero (+.01).
On the other hand, if an early preference for viewing violent TV programs led to a later pattern of aggressiveness (A → B), then the correlation between preference at time 1 and aggressiveness at time 2 should be substantial. As you can see, this correlation is +.31, not terribly large but significant. Based on this finding, the researchers concluded that an early preference for watching violent TV is at least partially the cause of later aggressiveness.
To think about directionality, focus on the diagonal or “cross-lagged” correlations.
Cross-lagged panel correlations must be interpreted cautiously, however. For one thing, if you examine the overall pattern of correlations in Figure 9.8, you will notice the correlation of +.31 may be partially accounted for by the correlations of +.21 and +.38. That is, rather than a direct path leading from third-grade preference to thirteenth-grade aggression, perhaps the path is an indirect result of the relationship between preference for violent TV and aggression in the third grade and between the two measures of aggression. A child scoring high on preference for violent TV in third grade might also be aggressive in third grade and still be aggressive (or even more so) in thirteenth grade. Alternatively, it could be that aggressiveness in third grade produced both (a) a preference for watching violent TV in third grade and (b) later aggressiveness. Thus, cross-lagged panel correlations help with the directionality dilemma, but problems of interpretation remain. More generally, interpretation difficulties take the form of the third variable problem.
Third Variables and Problems with Causality
o Correlational research does not attempt to control extraneous variables, BUT these extraneous variables could account for the association between variables 1 and 2: variable “3” causes both 1 and 2
The third possibility is that both A and B result from a third variable, C (C → A and B). For instance, perhaps the parents are violent people.
They cause their children to be violent by modeling aggressive behavior, which the children imitate, and they also cause their children to watch a lot of TV. The children might watch TV in order to lie low and avoid contact with parents who are always physically punishing them. Another third variable might be a lack of verbal fluency. Perhaps children are aggressive because they don’t argue effectively, and they also watch a lot of TV as a way of avoiding verbal contact with others.
o “Green peace” example (New York Times Magazine): countries where a substantial portion of the population plays golf are less belligerent than countries without golf. Third variable: economic prosperity, perhaps? Highly prosperous countries might be more likely to be peaceful and also have more time for leisure, including golf.
o Smoking causes cancer: a true experiment on humans would be unethical, so researchers relied on correlational studies, ran experiments with rats, and ruled out other factors (genetics, nuclear energy, aliens).
Can examine the influence of a third variable by looking at a partial correlation
o If the partial correlation is a lot smaller than the original correlation, the third variable is indeed accounting for the association between the variables
o If the partial correlation is about the same as the original correlation, the third variable doesn’t account for the association between the variables
o A partial correlation measures the relationship that remains between two variables (e.g., reading speed and reading comprehension) with a third variable (e.g., IQ) partialed out, or controlled.
Eron’s Study: The partial correlations range from +.25 to +.31, indicating that controlling for any one of the 12 factors left the correlation little changed from the original +.31. Even taking into account these other factors, the correlation between early preference for violent TV programs and later aggressiveness remained close to +.31.
The analysis strengthened their conclusion “that there is a probable causative influence of watching violent television programs in [the] early formative years on later aggression.”
Example with IQ, reading speed, and reading comprehension:
Suppose the correlation between reading speed and reading comprehension is high (+.55). Furthermore, you suspect that a third variable, IQ, might be producing this correlation; that is, high IQ might yield both rapid reading and strong comprehension. To complete a partial correlation, you would correlate (a) IQ and reading speed and (b) IQ and reading comprehension. In this case, the partial correlation turns out to be +.10. Thus, when IQ is statistically controlled (“partialed out”), the correlation between speed and comprehension virtually disappears, which means that IQ is indeed an important third variable making a major contribution to the original +.55 correlation between speed and comprehension.
The need for correlational research:
On practical grounds: some research is simply not possible as a pure experimental study.
o Studying gender differences in behavior, differences among age groups, or differences among personality types are major research areas in which it is not possible to randomly assign subjects to groups. Other correlational studies might involve data collected for other purposes, again eliminating random assignment.
On ethical grounds: some studies simply cannot be done as experiments with manipulated variables. We can appreciate the difficulty of recruiting human volunteers for this experiment. This is one reason why animals are used as subjects in experimental research investigating the relationship between brain and behavior. With humans, the studies are invariably correlational.
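Returning to the reading speed, comprehension, and IQ example above, here is a small Python sketch of the standard first-order partial correlation formula. Only the +.55 speed-comprehension correlation comes from these notes; the two IQ correlations (.70 and .72) are hypothetical values chosen for illustration, so the result only approximates the +.10 quoted in the example.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z partialed out.

    Standard first-order partial correlation formula:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

r_speed_comp = 0.55  # reading speed vs. comprehension (from the notes)
r_iq_speed = 0.70    # IQ vs. reading speed (hypothetical)
r_iq_comp = 0.72     # IQ vs. comprehension (hypothetical)

r_partial = partial_r(r_speed_comp, r_iq_speed, r_iq_comp)
print(f"original r = {r_speed_comp:.2f}, "
      f"partial r (IQ controlled) = {r_partial:.2f}")
```

With these assumed inputs the partial correlation is far smaller than the original correlation, which is the pattern that points to IQ as an important third variable. If the IQ correlations were near zero instead, the partial correlation would stay close to the original value.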
Varieties of Correlational Research:
Applications of Correlational Research: Correlations are especially prevalent in (a) research concerning the development of psychological tests, in which reliability and validity are assessed; (b) research in personality and abnormal psychology, two areas full of subject variables; and (c) twin studies, research in the Galton tradition that relates to the nature-nurture issue.
Correlations and Psychological Testing:
o A reliable and valid measure of intelligence will yield about the same IQ score on two occasions and be a true measure of intellectual ability and not a measure of something else
o The Kaufmans evaluated the reliability of their test (the K-ABC) in several ways.
o Split-half reliability: this involves dividing in half the items that make up a particular subtest (e.g., even-numbered versus odd-numbered items) and correlating the two halves. The correlation should be high if the test is reliable; someone scoring high on one half should score high on the other half as well.
o Test-retest reliability: the relationship between two separate administrations of the test. Again, these reliabilities should be high; a reliable test yields consistent results from one testing to another. For the K-ABC, both split-half and test-retest reliabilities were high (correlations in the vicinity of +.90).
o Evaluation of validity:
o Criterion validity: for an IQ test, criterion measures are often scores on tests relating to school performance because IQ tests are designed to predict how well someone will perform in school. Scores on a valid test should correlate positively with these school performance scores.
o Implied in this K-ABC example is the importance of using good (i.e., reliable and valid) tests.
o But ethical issues pertain to the reliability and validity of these tools. Box 9.2 considers some of them and describes the APA guidelines for the development and use of tests.
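The split-half procedure described above can be sketched in a few lines of Python. The response matrix is hypothetical (six test takers, eight items scored 1 = correct, 0 = incorrect), and the odd/even split is just one of the possible ways to halve a test.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical responses: one row per test taker, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

# Each person gets one total for odd-numbered items (1, 3, 5, 7)
# and one total for even-numbered items (2, 4, 6, 8).
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(odd_totals, even_totals)
print(f"odd-item totals:  {odd_totals}")
print(f"even-item totals: {even_totals}")
print(f"split-half correlation = {split_half_r:.2f}")
```

A high correlation between the two half-test totals is what a reliable test should produce. Test-retest reliability would use the same `pearson_r` call, with each person's scores from two separate administrations as the two lists.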
Developing a test takes more than making up a questionnaire that seems to make sense and administering it to friends. Yet that is the typical procedure for most of the pseudoscientific psychological tests found in popular magazines. These tests might appear to be scientific because they include a scoring key (“If you scored between 20 and 25, it means...”), but the scales are essentially meaningless because there’s never been any attempt to determine their reliability or validity.
Personality Research / Correlational research in personality and abnormal psychology (with Box 9.3)
◦ Used when individual differences in personality traits are investigated and when psychological disorders are differentiated.
◦ Correlations between different personality dimensions
Do individual differences in depression correlate with people’s attributions for failure?
Pessimistic explanatory style = blame self for failure, believe it is a reflection of general inadequacy, believe failure will be stable over time
Finding: a positive correlation (r = .56) between depression and pessimistic explanatory style
Therapy reduces depression, but the magnitude of the correlation stays the same during and after therapy
Interpret?
◦ Study: relationship between physical attractiveness and happiness
Diener hypothesized a positive correlation between physical attractiveness (PAtt) and “subjective well-being” (SWB), a term frequently used by researchers as an indicator of “happiness,” but pointed out the directionality problem and possible third variables (personality, extroversion).
The correlations between PAtt and SWB were highest when the students were rating themselves for attractiveness.
Box 9.3 The Achieving Society
The Achieving Society documents an extraordinarily ambitious attempt to extend the results of psychological research on achievement into the realm of historical explanation.
McClelland and his colleagues developed ways of measuring the achievement motive, completed countless studies on the correlates of achievement and the environments conducive to developing the need to achieve, and created a theory of achievement motivation (the drive to take on challenges and succeed).
One way of measuring the need for achievement, or nAch, is to use the Thematic Apperception Test (TAT), in which subjects look at ambiguous pictures and describe what they see in them.
McClelland’s classic research on achievement in society: he subjected children’s literature to the same kind of analysis given to TAT stories, and then took various measures of societal economic health and correlated the two. He found a positive correlation; as achievement themes increased, actual achievement increased.
Although his research solves the directionality problem in much the same way that a cross-lagged correlational study does (the 50-year lag), the most obvious problem is the usual one with correlational research: third variables. The relationship between children’s literature and later achievement is surely an intriguing one, but historical trends are immensely complicated and susceptible to countless factors.
The Nature-Nurture Issue
◦ Are certain characteristics more highly correlated among identical versus fraternal twins?
If yes, what does this suggest about the “source” of these characteristics? If no, what does this suggest about the “source” of these characteristics?
◦ E.g., the heritability of shyness
◦ BUT still need to be cautious in interpretation
Study: how heredity and environment interact to produce traits.
◦ Hereditary and environmental factors can be evaluated separately by comparing twins differing in genetic similarity (identical or monozygotic twins versus fraternal or dizygotic twins) and in the similarity of their home environments (twins reared together in the same home versus those separated and raised in different homes).
◦ Compared four groups: monozygotic twins, some reared together (MZT) and others apart (MZA), and dizygotic twins, also reared either together (DZT) or apart (DZA).
High scorers on the third factor, “constraint,” tend to be conventional, unwilling to take risks, and generally restrained and cautious.
Two methodological points here:
1. Sample size:
MZA = 44 pairs
DZA = 27 pairs
MZT = 217 pairs
DZT = 114 pairs
The relatively small number of pairs of twins reared apart (especially DZA) prompted the authors to add a degree of caution to their conclusions.
2. Pearson’s r should not be used here, because one of its requirements is that the pairs of scores that go into the calculations must be from the same individual. That is not the case in twin studies, where one score comes from one twin and the second score comes from the other twin. To deal with this problem, a different type of correlation, called an intraclass correlation, is calculated.
Combining correlational and experimental research:
Another common strategy for increasing confidence in causality is to do a correlational study, use it to create causal hypotheses, and then follow the correlational study with experimental studies.
The study: a relationship between loneliness and a tendency to anthropomorphize
1.
Correlational study: Twenty subjects completed a brief personality test for loneliness. Their tendency to anthropomorphize was assessed with a clever set of surveys. The researchers found a correlation of +.53 between loneliness and the tendency to agree with the anthropomorphic attributes. Subjects who scored high on loneliness were more likely to think Clocky had a mind of its own.
2. Experimental study: Social disconnection was directly manipulated (i.e., to create feelings of loneliness). Those in the disconnected condition reported a significantly higher level of anthropomorphic belief in the supernatural.
Multivariate Analysis:
A bivariate approach investigates the relationships between any two variables. A multivariate approach, on the other hand, examines the relationships among more than two variables (often many more than two).
Multiple Regression: examining relations among more than two variables
◦ A multiple regression study has one criterion variable and a minimum of two predictor variables. The analysis enables you to determine not just that these two or more variables combine to predict some criterion but also the relative strengths of the predictors.
◦ The advantage of a multiple regression analysis is that when the influences of several predictor variables are combined (especially if the predictors are not highly correlated with each other), prediction improves compared to the single regression case
◦ Multiple predictors of a single criterion variable
Do (1) SAT scores, (2) motivation, and (3) high school grades predict GPA at the end of your freshman year?
Do these variables combine to predict the outcome?
What is the relative strength or weighting of each predictor?
◦ Y = a + b1X1 + b2X2 + ... + bnXn
◦ R = multiple correlation coefficient = a correlation between the combined predictors and the criterion
◦ R² = multiple coefficient of determination = an index of the variation in the criterion variable that can be accounted for by the combined predictors.
◦ R and r tell you about the strength of a correlation, and both R² and r² tell you about the amount of shared variation.
Multiple Regression - Applications
◦ Mediation = Is the association between two variables accounted for by a third variable?
◦ E.g., rates of delinquency are twice as high in single-mother-headed households vs. two-parent households
◦ Correlation between Family Status and Delinquency
◦ vs.
◦ Partial correlation between Family Status and Delinquency controlling for SES
◦ If the magnitude of the partial correlation is reduced to non-significance, what will you conclude?
◦ Moderation = Is the association between two variables different depending on the level of a third variable?
◦ Is the magnitude/direction of the association/correlation between Delay and outcome different depending on the level of shyness/sociability?
Factor Analysis:
In this procedure, a large number of variables are measured and correlated with each other. It is then determined whether groups of these variables cluster together to form factors.
Example with six tests: a vocabulary test (VOC), a reading comprehension test (COM), an analogy test (ANA), a geometry test (GEO), a puzzle completion test (PUZ), and a rotated figures test (ROT). Pearson’s r’s could be calculated for all possible pairs of tests, yielding a correlation matrix.
In the matrix, the tests fall into two clusters, the three verbal tests (VOC, COM, ANA) and the three spatial tests (GEO, PUZ, ROT), and correlations between tests from one cluster and tests from the second cluster are essentially zero. This pattern suggests the tests are measuring two fundamentally different mental abilities or factors.
We could probably label them “verbal fluency” and “spatial skills.”
Factor analysis is a multivariate statistical tool that identifies factors from sets of intercorrelations among variables.
In multiple regression, two or more variables combine to predict some outcome; in factor analysis, the goal is to identify clusters of variables that correlate highly with each other; both are multivariate techniques involving the measurement of more than two variables.
Chapter 10
Applied Research:
Applied research is designed primarily to increase our knowledge about a particular real-world problem with an eye toward directly solving it, and is often conducted in clinics, social service agencies, jails, government agencies, and business settings.
Research Example 28: Applied Research
Cognitive interview: the interviewer tries to get the witness to mentally reinstate the context of the event witnessed.
The technique’s effectiveness was demonstrated in the controlled laboratory environment, in which typical subjects viewed a video of a simulated crime and were then randomly assigned to be interviewed using the cognitive interview or a standard interview.
Ex post facto design; result: interviewers trained to use the cognitive interview elicited more information from witnesses.
Applied Psychology in Historical Context:
Miles built what he called a “multiple chronograph” as a way of simultaneously testing the reaction time of seven football players (an offensive line).
It is a good example of an experimental psychologist using a basic laboratory tool (reaction time, in this case) to deal with a concrete problem: how to improve the efficiency of Stanford’s football team.
Design Problems in Applied Research:
Ethical dilemmas: A study conducted outside of the laboratory may create problems relating to informed consent and privacy. Also, proper debriefing is not always possible.
A tradeoff between internal and external validity: internal validity is low because the research takes place in the field, where control is reduced; external validity is high because the setting more closely resembles real-life situations and the problems addressed by applied research are everyday problems.
Problems unique to between-subjects designs: It is often impossible to use random assignment to form equivalent groups, which reduces internal validity through subject selection problems or interactions between selection and other threats such as maturation or history. When matching is used to achieve a degree of equivalence among groups of subjects, regression problems can occur, as will be elaborated in a few pages.
Problems unique to within-subjects designs: It is not always possible to counterbalance properly in applied studies using within-subjects factors. Hence, the studies may have uncontrolled order effects. Also, attrition can be a problem for studies that extend over a long period.
Box 10.1 Classic Studies: The Hollingworths, Applied Psychology, and Coca-Cola
Harry Hollingworth and his wife Leta collaborated on the design for the studies.
Several methodological controls were used: counterbalancing, a placebo control, and a double-blind procedure.
The findings were complex, considering the large number of tests used, the dosages employed, a fair amount of individual variation in performance, and the absence of sophisticated inferential statistical techniques.
In general, no adverse effects of caffeine were found, except that larger doses
Quasi-Experimental Designs:
When subjects cannot be assigned randomly, the design is called a quasi-experimental design.
Such designs do allow for a degree of control, they serve when ethical or practical problems make random assignment impossible, and they often produce results with clear benefits for people’s lives.
Quasi-experimental design = a design with less than complete control over the variables in a study, so causal conclusions cannot be drawn.
E.g., nonequivalent groups designs / P x E factorial designs / all correlational designs / single-factor ex post facto designs with two or more levels / ex post facto factorial designs; also nonequivalent control group designs, interrupted time series designs, and archival research.
Nonequivalent control group design:
The purpose is to evaluate the effectiveness of some treatment program. This design is used when random assignment is not possible, so the groups are not equivalent at the outset of the study. In the quasi-experimental nonequivalent control group design, the groups are not equal at the start of the study; in addition, they experience different events in the study itself. This is a built-in confound.
o Two ways in which groups are not the same
1. Groups are not equal at start of study (nonequivalent groups)
2. Groups experience different events in study (control group)
Following the scheme first outlined by Campbell and Stanley (1963):
Experimental group: O1 T O2
Nonequivalent control group: O1 O2
o O1 = pretest, T = treatment, O2 = posttest
What are the comparisons of interest? Change scores (the difference between O1 and O2).
o Study of effects of “flextime” on worker productivity
2 plants: one stays with the regular schedule, the other implements flextime.
Independent variable = whether or not flextime is present
Dependent variable = some measure of productivity
4 possible outcomes (see Fig. 10.3), with Pittsburgh’s plant as the experimental group and Cleveland’s plant as the nonequivalent control group.
Interpretation:
a. Pittsburgh’s productivity increased, but the same amount of change happened in Cleveland. Therefore, the Pittsburgh increase cannot be attributed to the program; it could have been due to several of the threats to internal validity.
History and maturation are good possibilities. Perhaps workers just showed improvement with increased experience on the job.
b. Productivity in Cleveland was high throughout the study, whereas productivity in Pittsburgh began at a very low level and then improved, apparently due to the flextime program. However, there are two dangers here. One is a ceiling effect: Cleveland’s productivity level was so high to begin with that no further improvement could possibly be shown. If an increase could be seen (i.e., if scores on the Y-axis could go higher), you might see two parallel lines, as in Figure 10.3a. The other is regression to the mean: perhaps at the start of the study Pittsburgh’s productivity was very low for some reason, and it then returned to normal.
c. Both groups start at the same level of productivity, but the group with the program (Pittsburgh) is the only one to improve. A problem can still exist: subject selection could interact with some other influence. That is, history, maturation, or some other factor could affect the workers in one plant but not those in the other. E.g., some historical event affects the Pittsburgh plant but not the Cleveland plant. FYI: a matching procedure can be used to minimize the selection effect.
d. The outcome provides strong support for program effectiveness. Here the treatment group begins below the control group yet surpasses the controls by the end of the study. Regression to the mean can be ruled out as causing the improvement for Pittsburgh because one would expect regression to raise the scores only to the level of the control group, not beyond it.
Of course, selection problems and interactions between selection and other factors are difficult to exclude completely, but this type of crossover effect is considered good evidence of program effectiveness.
Regression to the mean and matching
o A special threat to the internal validity of nonequivalent control group designs occurs when there is an attempt to reduce the nonequivalency of the groups through a form of matching.
o Example: Testing a program to see if it improves the reading skills of disadvantaged youth.
Experimental group: from the volunteers, select those most in need (i.e., those whose scores are, on average, very low).
Control group: volunteers from similar neighborhoods in other cities.
Matching variable: the initial reading skills score.
Result:
Experimental group: pre = 25, reading program, post = 25
Control group: pre = 25, post = 29
Interpretation: regression to the mean resulting from the matching procedure overwhelmed any possible treatment effect. The experimental group was formed from those with the greatest need for the program because their skills were so weak. To match them with the control group, you were forced to select children who scored much higher than the mean score for this population of poor readers. Therefore, on a posttest, many of these children will score lower (i.e., move back toward the mean of 17) simply due to regression to the mean. Let’s suppose the program truly was effective and would add an average of 4 points to the reading score. If the average regression effect was a loss of 4 points, the two effects would cancel each other out, and this would account for the apparent lack of change from pre- to posttest.
o What if the two populations sampled from (the nonequivalent groups) differ as populations on mean levels of the matching variable?
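The cancellation in the reading program example, and the question of what happens when the two populations differ, can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: each observed score is a true skill plus independent measurement noise of equal variance (so scores regress about halfway toward their population mean), the comparison population is given a hypothetical mean of 33, and the function name and all parameter values are invented for this example. Only the population mean of 17, the matched pretest score of 25, and the 4-point program effect come from the notes.

```python
import random
import statistics


def simulate_matched_group(pop_mean, program_effect, n=100_000, seed=0,
                           true_sd=6.0, noise_sd=6.0, low=24.0, high=26.0):
    """Return (mean pretest, mean posttest) for children matched near a pretest of 25.

    All parameter values are illustrative assumptions, not data from any study.
    """
    rng = random.Random(seed)
    pre_scores, post_scores = [], []
    for _ in range(n):
        true_skill = rng.gauss(pop_mean, true_sd)
        pretest = true_skill + rng.gauss(0.0, noise_sd)
        # Matching: keep only children whose pretest falls in the matching window.
        if low <= pretest <= high:
            posttest = true_skill + program_effect + rng.gauss(0.0, noise_sd)
            pre_scores.append(pretest)
            post_scores.append(posttest)
    return statistics.mean(pre_scores), statistics.mean(post_scores)


# Program group: poor readers (population mean 17) who receive a 4-point program.
program_pre, program_post = simulate_matched_group(17.0, 4.0, seed=1)
# Comparison group: hypothetical population mean of 33, no program.
control_pre, control_post = simulate_matched_group(33.0, 0.0, seed=2)

print(f"program group: pre = {program_pre:.1f}, post = {program_post:.1f}")
print(f"control group: pre = {control_pre:.1f}, post = {control_post:.1f}")
```

Under these assumptions the matched program group should stay at about 25 on both tests even though the program adds 4 points, while the matched comparison group should rise from about 25 to about 29 with no program at all, because each matched sample regresses toward its own population mean.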
Regression effects may selectively affect one group. Even if samples are matched, neither is representative of the population it was selected from. See Fig. 10.4 (regression effect).
o Head Start
Purpose? Attempt to give underprivileged preschool children a “head start” on school by teaching them school-related skills and getting their parents involved in the process.
Westinghouse evaluation project: the Westinghouse study documented what it called “fade-out effects”; early gains by children in Head Start programs seemed to fade away by the third grade.
Why? Because Head Start was well under way when the Westinghouse evaluation project began, children couldn’t be randomly assigned to treatment and control groups. Instead, the Westinghouse group selected a group of Head Start children and matched them for cognitive achievement with children who hadn’t been through the program. However, in order to match the groups on cognitive achievement, Head Start children selected for the study were those scoring well above the mean for their overall group, and control children were those scoring well below the mean for their group.
Misinterpretation of results: the Head Start group’s apparent failure to show improvement in the third grade was at least partially the result of a regression artifact caused by the matching procedure.
Research Example 29: A Nonequivalent Control Group Design (Smoll, Smith, Barnett, and Everett, 1993)
o Little League Study
Nonequivalent control group design
IV: whether coaches were given “coach effectiveness” training
Nonequivalent groups: coaches from two different leagues
DV: player self-esteem (preseason and postseason)
Result: boys with relatively low self-esteem at the start of the season showed a large increase when they were coached by someone from the training program, whereas boys with coaches from the control group showed the same level of self-esteem (the apparent decline was not significant).
If you consider that low-self-esteem boys might be the most easily discouraged by a bad experience in sport, this is an encouraging outcome.
Nonequivalent control group design: Example Langer & Rodin
o Nursing home residents
Two groups: one stressing self-regulation, choice, and responsibility; one stressing others’ responsibilities and roles
Not randomly assigned: why?
o Short-term effectiveness
o And longer-term effectiveness
Research Example 30: A Nonequivalent Control Group Design without Pretests
o A study to see if the experience of a traumatic event (an earthquake) would affect dream content in general and nightmares in particular
o The groups were nonequivalent to begin with (students from two states); in addition, one group had one type of experience (direct exposure to the earthquake), while the second group had a different experience (no direct exposure).
o The frequency of nightmares correlated significantly with how anxious participants reported they were during the time of the earthquake.
o The study has the interpretation problems that accompany quasi-experimental studies: lacking any pretest (pre-quake) information about nightmare frequency for their participants, the researchers couldn’t “rule out the possibility that California residents have more nightmares about earthquakes than do Arizona residents even when no earthquake has recently occurred.”
Interrupted Time Series Designs: no control group. If the researchers had taken measures for an extended period before and after the event expected to influence behavior, their study would have been called an interrupted time series design.
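Several measures on each side of an event matter because a single pre/post difference cannot separate the event’s effect from a trend that was already under way. The sketch below is a minimal illustration with invented numbers: it fits a straight line to the pre-event measures, projects that line forward, and reports how far the post-event measures fall from the projection. The straight-line projection and both function names are assumptions made for this example, not a method described in these notes.

```python
def naive_change(pre, post):
    """Simple pre/post difference: first post-event measure minus last pre-event measure."""
    return post[0] - pre[-1]


def trend_adjusted_change(pre, post):
    """Mean gap between post-event measures and the projected pre-event trend.

    Fits a least-squares line to the pre-event series (needs at least two points).
    """
    n = len(pre)
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(pre) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, pre))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    # Project the pre-event line onto the post-event time points.
    projected = [intercept + slope * (n + i) for i in range(len(post))]
    gaps = [observed - expected for observed, expected in zip(post, projected)]
    return sum(gaps) / len(gaps)


# Hypothetical series: a steady decline that started before the event...
declining_pre, declining_post = [30, 28, 26], [24, 22, 20]
# ...versus a flat baseline followed by a lasting drop.
steady_pre, steady_post = [30, 30, 30], [22, 22, 22]

for label, pre, post in [("general trend", declining_pre, declining_post),
                         ("steady baseline", steady_pre, steady_post)]:
    print(label, naive_change(pre, post), trend_adjusted_change(pre, post))
```

In the first series the simple pre/post difference shows a drop, but the trend-adjusted change is zero because the decline was already in progress; in the second series both measures show the same drop, which is the pattern that supports an effect of the event.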
Another type of quasi-experimental design: measures are taken for an extended period before and after an event occurs that is expected to influence behavior.
o O1 O2 O3 E O4 O5 O6
O = measures of behavior taken before and after E, which is the point at which some treatment program is introduced or some event (e.g., an earthquake) occurs
E = the interruption in the interrupted time series
o Of course, the number of measures taken before and after E will vary from study to study and is not limited to three each. It is also not necessary that the number of pre-interruption and post-interruption points be the same. As a general rule, the more data points, the better.
Allows for examination of the pattern of behavior over time with reference to the “event.”
The design allows the researcher to rule out alternative explanations of an apparent change from pre- to posttest.
o Example: Effects of an antismoking campaign on teenage smoking behaviors
Compare the interpretation from a pre/posttest design vs. the interpretation from an interrupted time series design.
There certainly is a reduction in smoking from pre- to posttest, but it’s hard to evaluate it in the absence of a control group (i.e., using a nonequivalent control group design). Yet, even without a control group, it might be possible to evaluate the campaign more systematically if not one but several measures were taken both before and after the program was put in place.
a. In this case, the reduction that looked so good in Figure 10.6 is shown to be nothing more than part of a general trend toward reduced smoking among adolescents. This demonstrates an important feature of interrupted time series designs: they can serve to rule out (i.e., falsify) alternative explanations of an apparent change from pre- to posttest.
b. Smoking behavior was fairly steady before the campaign and then dropped, but just briefly. In other words, if the antismoking program had any effect at all, it was short-lived.
c. The decrease after the program was part of another general trend, this time a periodic fluctuation between higher and lower levels of smoking.
d. Here the smoking behavior is at a steady and high rate before the program begins, drops after the antismoking program is put into effect, and remains low for some time afterward. The relatively steady baseline prior to the campaign enables the researcher to rule out regression effects.
Research Example 31: An Interrupted Time Series Design
Tested the effect of instituting an incentive plan in which workers were treated not as individuals but as members of small groups, each responsible for a production line. This study also illustrates how those conducting interrupted time series designs try to deal with potential threats to internal validity.
Ruling out threats to internal validity:
o History: the authors argued that history did not contribute to the change because they carefully examined as many events as they could in the period before and after the change and could find no reason to suspect unusual occurrences led to the jump in productivity. In fact, events that might be expected to hurt productivity (e.g., a recession in the automobile industry, which affected sales of iron castings) didn’t.
o Instrumentation: this could be a problem if the techniques for scoring and recording worker productivity changed over the years. They didn’t.
o Subject selection: although we normally think of subject selection as a potential confound only in studies with two or more nonequivalent groups, it can occur in a time series study if significant worker turnover occurred during the time of the new plan; the cohort of workers on site prior to the plan could be different in some important way from the group there after the plan went into effect. This didn’t happen in Wagner et al.’s study.
Variations on Basic Time Series Design
Inclusion of a control condition can help with interpretation:
O1 O2 O3 E O4 O5 O6
O1 O2 O3 O4 O5 O6
Switching replications
o Vary the time at which the “event” or treatment is put into place
o Ask yourself whether change can be directly tied to the event
O1 O2 O3 E O4 O5 O6 O7 O8
O1 O2 O3 O4 O5 O6 E O7 O8
Measure multiple dependent variables
o Select some for which you expect change and others for which you expect no change based on the event
o E.g., the “three strikes and you’re out” policy in California
o If you look at the curve for serious crimes right after the law was passed, it looks like there is a decline, especially when compared to the flat curve for the misdemeanors. If you look at the felony crime curve as a whole, however, it is clear that any reduction in serious crime is part of a trend occurring since around 1992, well before passage of the three strikes law.
Overall, the researchers concluded the three strikes law had no discernible effect on serious crime.
Program Evaluation
The origins of program evaluation: Box 10.2 Reforms as experiments
◦ Campbell: “We should be ready for an experimental approach to social reform, an approach in which we try out new programs designed to cure specific social problems, in which we learn whether or not these programs are effective, and in which we retain, imitate, modify, or discard them on the basis of apparent effectiveness.”
◦ Example: a study evaluating an effort to reduce speeding in Connecticut. Regression could be a problem, especially when bad things happened to grab attention (not in this case). By applying an interrupted time series design with a nonequivalent control (comparable states without a crackdown on speeding), Campbell concluded the crackdown probably did have some effect. Campbell (1969) recommended an attitude change that would shift emphasis from the importance of a particular program to acknowledging the importance of the problem.
Program evaluation is a specific type of applied research: applied research that attempts to assess the effectiveness and value of public policy.
◦ 4 major purposes
Needs Analysis = determine community and individual needs for programs. Needs analysis is a set of procedures for predicting whether a population of sufficient size exists that would benefit from the proposed program, whether the program could solve a clearly defined problem, and whether members of the population would actually use the program. Methods for estimating need: census data; surveys of available resources; surveys of potential users; key informants, focus groups, and community forums.
Formative Evaluation = assess whether the program is being run as planned and, if not, implement change. There’s no point in trying to evaluate the effectiveness of the program if people don’t even know about it.
One general function of the formative evaluation is to provide data on how the program is being used. It examines whether the program as described in the agency’s literature is the same as the program that is actually being implemented. A final part of a formative evaluation can be a pilot study: program implementation and some preliminary outcomes can be assessed on a small scale before extending the program.
Summative Evaluation = evaluate program outcomes. Summative evaluation and even the rationale for doing it call into question the very reasons for existence of the organizations involved. Formative evaluation, by contrast, simply responds to the question “How can we be better?” without strongly implying the question “How do [we] know [we] are any good at all?” The actual process of performing summative evaluations involves applying some of the techniques you already know about, especially quasi-experimental designs. One problem that sometimes confronts the program evaluator is how to interpret a failure to find significant differences between experimental and control groups; that is, the statistical decision is “fail to reject H0.” A finding of no difference has important implications for decision making for reasons having to do with cost, and this brings us to the final type of program evaluation act