# Psych Stats Psychology 240

This 21-page study guide was uploaded by cairo stanislaus on Thursday, December 31, 2015. It belongs to Psychology 240 at the University of Massachusetts, taught by Adrian Staub in Fall 2015.

Date Created: 12/31/15

Wednesday, November 25, 2015

PSYCH. 240 FINAL REVIEW

## Unit 1: The Structure of Data

- To make sense of psychology experiments, you need a set of procedures for organizing, analyzing, and interpreting your results: statistics.
- There are two branches of statistics:
  1. Descriptive: describes the data
  2. Inferential: draws conclusions about a population from the data
- Population: the whole group the information is drawn from
- Sample: a small subset of the population; it must be representative (ideally, every member of the population has an equal chance of being selected)
- Variable: a property of individuals that can take different values
- Two types of variables:
  1. Quantitative: numbers
     - Continuous: any numerical value
     - Discrete: only integers
  2. Categorical: categories rather than numbers
- Quantitative variables can be measured on 3 different scales: (1) ordinal (2) interval (3) ratio
  1. Ordinal: values are only interpreted in relative terms; they have no specific meaning, and the size of the difference between them is not meaningful. Ex.: "On a scale of 1-5, how much do you like me?" We don't know if a 4 is good or bad. A 4 is higher than a 2, but by how much? We don't know!
  2. Interval: the difference between two values is meaningful. Ex.: the difference between 100 and 90 degrees is the same as the difference between 80 and 70 degrees.
  3. Ratio: just like interval, but 0 has an actual meaning. Ex.: height, weight, enzymatic activity

Quiz Review:

In which of the following formulas is variable A a linear transformation of variable B? (Answer: A)

- a. A = 17 + 0.7B
- b. A = 17 + B²
- c. A = 17/B
- d. A = √B + 100

Which of the following best explains the use of statistics in psychology? (Answer: B)

- a. One branch of psychology is concerned with people's ability to do numerical computation; statistics helps us understand how people do this.
- b. Almost all psychology experiments involve the collection of numerical data; statistics is used to interpret this numerical information.
- c. Psychology experiments have complicated results; statistics helps us simplify these results for presentation to laypeople.

Which of the following statements about psychological research is accurate? (Answer: A)

- a. We make inferences about a population on the basis of data we collect from a sample.
- b. We make inferences about a sample on the basis of data we collect from a population.
- c. We use descriptive statistics to summarize data from a population, and then draw a sample from that population.
- d. First we make statistical inferences, and then we use descriptive statistics to summarize our data.

If you are told that moopiness is measured on a ratio scale, which of the following is true? (Answer: D)

- a. the difference in moopiness between a person with a moopiness score of 3 and a moopiness score of 2 is the same as the difference between a person with a moopiness score of 6 and a person with a moopiness score of 4
- b. a moopiness value of 0 has no real meaning
- c. a person who has a moopiness score of 6 is four times as moopy as a person who has a moopiness score of 2
- d. a person who has a moopiness score of 5 is twice as moopy as a person who has a moopiness score of 2.5

Which of the following is the best example of a discrete quantitative variable? (Answer: D)

- a. the interest rate on credit cards
- b. the size of the Federal deficit
- c. the temperature in Celsius
- d. the number of addresses you've had in your life

A researcher who studies the development of face perception tests 20 healthy infants from the surrounding community who are between 6 months and 9 months of age. She compares how long they attend to pictures of human faces, monkey faces, and faces made of colored plastic pieces. The population of interest in this study is best described as: (Answer: B)

- a. 20 human infants from the area surrounding the researcher's university
- b. human infants between 6 months and 9 months of age
- c. infant mammals
- d. humans

Which of the following is the best example of a categorical variable? (Answer: B)

- a. Survey respondents are asked how important the issue of climate change is to them, using a numbered rating scale that has "not at all important" and "very important" as its endpoints.
- b. In a study of relationships, couples are coded as dating, engaged, married, or separated.
- c. Participants in a study of romantic attraction are asked to rate the importance of looks, intelligence, sense of humor, and finances, each on a 1-to-5 scale.
- d. Survey respondents are asked to report how many years of education they have completed, as a whole number.

## Unit 2: The Distribution of a Variable

- Remember, in the last unit we saw that one branch of psych stats is descriptive statistics, which involves describing data from a sample.
- We'll start with categorical variables; descriptive stats is easier with these.
- The distribution of a variable tells us how often each of its values occurs.
- The distribution can be shown in a table. Ex.: column 1 lists each M&M color and column 2 lists how many of each color are in the bag.
- Another way is a bar graph, in which the bars are separated by spaces.
- A pie chart does not show numbers, so it is not used as often.
- A bar graph can also show the joint distribution of two categorical variables; in this case the bars within each group touch.
- Another way to show a joint distribution is a contingency table.
- Now, we'll work with quantitative (numerical) variables.
- A histogram is the most common type of graph for a quantitative variable. The bars touch, and the bin width can be adjusted.
- Histograms also have shapes:
  1. Unimodal: one hump. It can be symmetrical or skewed; if the tail is on the right, it's right skewed, and if the tail is on the left, it's left skewed.
  2. Bimodal: two humps
  3. Uniform: no hump at all
- Sometimes, there are outliers.
- There are also important measurements that describe a histogram:
  - The mean is the average, a measure of central tendency.
  - The standard deviation, s, is a measure of spread; the variance is s^2.
  - Two histograms with the same shape and spread have the same standard deviation, but their means may be the same or different.
  - The median is the midpoint of the distribution.
- When the distribution is symmetrical, the mean and the median will be similar.
- If the distribution is perfectly symmetrical, the mean and median will be identical.
- The less symmetrical the distribution is, the more the mean and the median will differ.
- The mean will be pulled in the direction of the tail.
- In a box plot, the median is displayed as a line.
- Q1 is the midpoint of all observations that fall below the median.
- Q3 is the midpoint of all observations that fall above the median.
- The IQR is the difference between Q3 and Q1 (Q3 - Q1).
- If there are outliers BELOW the box plot, the distribution is left skewed.
- If there are outliers ABOVE the box plot, the distribution is right skewed.

Quiz Review:

The first seven days of 2014 were very cold in Amherst, MA. The high temperatures (in degrees Fahrenheit) were 11, 19, 21, 23, 24, 28, 28. Which of the following statements is correct? (Answer: C)

- a. the mean temperature was 23 degrees
- b. the median temperature was 22 degrees
- c. the mean temperature was 22 degrees
- d. the median temperature was 28 degrees

If a distribution is right-skewed, which of the following is likely to be true? (Answer: A)

- a. the mean is greater than the median
- b. the mean is less than the median
- c. the mean is equal to the median

What is the relationship between the variance and the standard deviation of a distribution? (Answer: B)

- a. The standard deviation is twice the variance.
- b. The standard deviation is the square root of the variance.
- c. The standard deviation is one half of the variance.
- d. The standard deviation is equal to the variance squared.

## Unit 3: Foundations of Statistical Inference I

- Now, we'll do inferential stats: we have a set of data and want to draw inferences about a population.

Probabilities:

- What things have probabilities? Anything that can have various outcomes that we can't know in advance: a random process.
- The outcomes of random processes have probabilities.
- The probability of an outcome is the "long run" proportion of times it occurs: the proportion of times it would occur if we ran the process an infinite number of times.
- Proportion: p-hat = (number of times the outcome occurs) / (number of times we run the process)
- The Law of Large Numbers: as n increases (n = number of trials), p-hat gets closer to the true probability of the outcome, p.
- p-hat with a subscript N (p-hatN) is the proportion observed in that number of trials.
- A probability has a restricted range between 0 and 1 (with those two numbers included); the probabilities of all possible outcomes must sum to one.
- The probability of an outcome + the probability of not getting that outcome is always one.

Independence:

- Two outcomes are independent when one occurring does not affect the probability that the other will occur.
- Independent outcomes follow two rules: (1) the multiplication rule (2) the addition rule
- Multiplication rule: p(A and B) = p(A) x p(B)
- What's the probability of rolling two dice and getting a 6 on both? (1/6) x (1/6) = 0.028
- Addition rule: p(A or B) = 1 - p(neither A nor B) = 1 - (1 - p(A)) x (1 - p(B))
- What's the probability of rolling two dice and getting at least one six?
Answer: 1 - (5/6) x (5/6) ≈ 0.306

Curves:

- We use a probability density function, or pdf, to show probabilities when the outcomes are continuous rather than discrete.
- Because all of the probabilities sum to one, the area under the curve always equals one.

Types of distributions:

- (1) Discrete: binomial
- (2) Continuous: normal

Binomial distribution:

- The number of times an outcome happens in n trials of a random process, if the probability of the outcome is fixed
- The outcome we're interested in is the "success", even if it's not necessarily good.
- It gives the probability of the outcome occurring an exact number of times.
- Suppose you want to know the probability of getting exactly 5 kids with IEPs in a class of 20 students, where 0.1 is the probability of a "success" on each trial:
- `> dbinom(5, 20, 0.1)`
- The answer corresponds to the height of the bar at 5 on the graph of the distribution.

Normal distributions:

- A continuous probability distribution
- The graph has only one hump, in the middle of the range, and is symmetrical.
- The 68-95-99.7 rule: about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.
- The mean and standard deviation determine the location and spread of the curve: N(mu, sigma)
- mu = the mean and sigma = the standard deviation
- Percentiles: if I say that a woman is 67" tall (where +1 standard deviation falls), she is taller than 84% of all women; she is in the 84th percentile.

pnorm():

- This tells us the probability to the left of a given value.
- We need the value we're interested in, the mean of the distribution, and the sd of the distribution.
- Example: `> pnorm(66, 64.5, 2.5)`
- 64.5 is the mean and 2.5 is the standard deviation.
- In this case, we learn that a woman who is 66" tall is taller than 0.726 of all women; in other words, she's in approximately the 73rd percentile.
- To find the area to the right, we would do 1 - pnorm.

qnorm():

- This tells us what height corresponds to a given percentile.
- We give it the probability, the mean, and the sd.
- Example: find the height that corresponds to the 55th percentile
- `> qnorm(0.55, 64.5, 2.5)`

Z-score:

- How many standard deviations is this value away from the mean?
- Z = (x - mu) / sigma
- Percentiles can be read from the z-score if the variable is normally distributed.

Quiz questions:

Which of the following is an example of an outcome of a random process? (Answer: C)

- a. Number of letters in the English alphabet
- b. Number of days in a year
- c. Number of goals on a soccer field
- d. Number of inches of snowfall in a year

Which of the numbers below is NOT a possible value for the probability of an outcome? (Answer: D)

- a. .99
- b. 0
- c. 1
- d. -0.2

The Normal distribution is: (Answer: D)

- a. skewed
- b. typical
- c. uniform
- d. unimodal

Among the students in her class, Mary's course grade is in the 70th percentile. This means that: (Answer: C)

- a. Mary's course grade is lower than 70 percent of the students in her class.
- b. Mary's course grade lies somewhere in the middle 70 percent of the students in her class.
- c. Mary's course grade is higher than 70 percent of the students in her class.
- d. Mary's course grade is higher than 70 students in her class.

Which of the following variables would be distributed according to the binomial distribution? (Answer: C)

- a. The age of a student in a class of 30 students if the mean age is 19 years, 6 months.
- b. The number of pets owned by a family in the U.S.
- c. The number of grades of A in a class of 30 students if the probability that any student gets an A is .25.
- d. The time it takes a student to finish an exam if the mean time to finish is 45 minutes.

Two intro to biology classes are offered this semester. Suppose that the grades in class A are normally distributed with a mean of 79 and a standard deviation of 10, while the grades in class B are also normally distributed, but with a mean of 70 and a standard deviation of 5. Ingrid is in class A and got a grade of 95; Frank is in class B and got a grade of 86. Whose z-score is higher, Ingrid or Frank? (Answer: D)

- a. Their z-scores are equal
- b. Ingrid
- c. Not enough information
- d. Frank

Which best describes the Law of Large Numbers? (Answer: C)

- a. As the number of trials increases, the proportion of times that an event occurs increases.
- b. As the number of trials increases, the proportion of times that an outcome occurs moves further from the true probability of the outcome.
- c. As the number of trials increases, the proportion of times that an outcome occurs converges on the true probability of that outcome.
- d. As the number of trials increases, the probability of an event occurring increases.

## Unit 4: Foundations of Statistical Inference II

- In this unit, we'll build on sampling distributions.
- The sampling distribution of the sample mean: the probability distribution of the sample mean
- Population: mean = μ, variance = σ², and sd = σ
- Sample: mean = x̄, variance = s², and sd = s
- The mean of the sampling distribution of the sample mean equals the population mean; the distribution has a "roughly" normal shape and is less spread out than the population distribution itself.
- As n (the sample size) gets smaller, the graph gets more spread out.
- Population: N(μ, σ)
- Sampling distribution of the sample mean: N(μ, σ/√n)
- The standard error: σ/√n
- The central limit theorem: when the sample size is large enough, the sampling distribution of the sample mean will be N(μ, σ/√n), even if the population is not normally distributed.

Confidence Interval:

- For a 95% confidence interval OF A POPULATION MEAN, the critical value is 1.96: x̄ ± 1.96(σ/√n)
- Example: if σ = 10, n = 16, and the sample mean = 30, the 95% CI for the population mean is 30 ± 1.96(2.5), or 30 ± 4.9.
- As the sample size increases, the CI gets narrower; as the sample size decreases, the CI gets wider.

Critical value:

- The critical value is written z*.
- The value of z* depends on the confidence level of the CI.
- Remember, the standard error of the mean is only σ/√n.
- The margin of error: the standard error multiplied by z*

Different sample sizes:

- When the sample size is less than 50, we use t* instead of z* (and s in place of σ); the degrees of freedom are df = n - 1.
- The sampling distribution of the sample proportion will be normal if (1) np is at least 10 and (2) n(1-p) is at least 10.
- p is the probability of a success.
- If those two conditions are true, the sample proportion will follow N(p, √(p(1-p)/n)).
- Basically, the CI OF A POPULATION PROPORTION is p-hat ± z* √(p-hat(1 - p-hat)/n).
- Excluding the z* gives the standard error; including the z* gives the margin of error.

Remember the difference between the confidence interval for a population mean and the confidence interval for a population proportion.

Quiz Questions:

What is the formula for a confidence interval for the population mean, when n < 50? (Answer: D)

- a. x̄ ± z*(s/√n)
- b. x̄ ± t*(σ/√n)
- c. x̄ ± z*(σ/√n)
- d. x̄ ± t*(s/√n)

The Central Limit Theorem tells us that the sampling distribution of the sample mean will be Normal: (Answer: A)

- a. whenever the sample size is large
- b. only when the population distribution is Normal
- c. only when the population standard deviation is small
- d. whenever the sample size is small
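The confidence interval example from the Unit 4 notes (σ = 10, n = 16, sample mean = 30) can be checked numerically. The following is a minimal Python sketch, not part of the course materials: the function names `standard_error` and `z_confidence_interval` are made up for illustration, and `NormalDist().inv_cdf` from the standard library plays the role that `qnorm` plays in the notes.

```python
import math
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (low, high) for sample_mean +/- z* (sigma / sqrt(n))."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_star * standard_error(sigma, n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# Example from the notes: sigma = 10, n = 16, sample mean = 30.
low, high = z_confidence_interval(30, 10, 16)
print(round(low, 1), round(high, 1))  # 25.1 34.9
```

Calling `standard_error(10, 64)` gives a smaller value than `standard_error(10, 16)`, which is the "larger sample, narrower interval" point from the notes. This sketch covers only the z-based interval; the small-sample t-based interval needs a t critical value, which the Python standard library does not provide.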
The sampling distribution of the sample mean will tend to be more spread out when: (Answer: A)

- a. sample size is small
- b. the population is large
- c. the population sd is small
- d. none of the above

Compared to a 95% confidence interval, a 99% confidence interval will be: (Answer: A)

- a. wider
- b. narrower
- c. taller
- d. based on a larger sample

Which of the following symbols represents a population parameter? (Answer: A)

- a. μ
- b. s²
- c. r
- d. n

Suppose that in a sample of 50 college students, 45 are found to have smartphones. Why can't we use the Normal approximation to the binomial to calculate a confidence interval for the proportion of students in the population who have smartphones? (Answer: C)

- a. n*p is less than 10
- b. n*p is greater than 10
- c. n*(1-p) is less than 10
- d. n*(1-p) is greater than 10

Which of the following best defines the term "standard error"? (Answer: A)

- a. the standard deviation of a sample statistic
- b. the standard deviation of a population parameter
- c. the mean of a sample statistic
- d. the mean of a population parameter

When computing a confidence interval for the population mean from a small sample, what is the formula for calculating the degrees of freedom (df)? (Answer: A)

- a. n - 1
- b. √n - 1
- c. x - 1
- d. s - 1

## Unit 5: Concepts of Hypothesis Testing

- Now, we'll be testing a hypothesis about a population parameter.
- The null hypothesis (H0) says that nothing is happening: there is no effect or no difference.
- The alternative hypothesis (HA) says that something other than the norm is occurring.
- The alternative hypothesis can be either one-sided (< or >) or two-sided (not equal).
- The p-value is the probability of getting a sample statistic at least as extreme as the one we got, if the null hypothesis were true.
- If the p-value is less than the alpha-level (always 0.05 in this class), we reject the null hypothesis.
- If we reject the null hypothesis, we refer to our results as statistically significant.

Different errors:

- Type I error: the null is true but we reject it.
- The chance of this occurring is alpha = 0.05.
- Type II error: the null is false but the test does not reject it.
- The chance of this occurring is 1 - power.
- Power: the probability of rejecting the null when it is actually false
- As the sample size increases, power increases.
- Low power leads to Type II errors.
- Power depends on the true size of the effect ("how wrong" the null hypothesis actually is) and the sample size.

Quiz Questions:

Under which circumstance do we reject the null hypothesis? (Answer: B)

- a. If our p-value is greater than our α-level.
- b. If our p-value is less than our α-level.
- c. If our p-value is equal to our α-level.

Failing to reject the null hypothesis when the null hypothesis is actually false is known as a: (Answer: C)

- a. Type III error
- b. Type I error
- c. Type II error
- d. Correct decision

If a researcher finds sufficient evidence to reject a null hypothesis using an alpha-level of .05, then which of the following must be true? (Answer: A)

- a. There is sufficient evidence to reject the null hypothesis using an alpha-level of .10.
- b. There is sufficient evidence to reject the null hypothesis using an alpha-level of .01.
- c. The null hypothesis must be false.
- d. The alternative hypothesis must be false.

A researcher who performs a null hypothesis significance test is usually interested in: (Answer: D)

- a. rejecting the alternative hypothesis
- b. supporting both null and alternative hypotheses
- c. supporting the null hypothesis
- d. rejecting the null hypothesis

The statistical power of an experiment is affected by which of the following: (Answer: D)

- a. sample size
- b. how far from correct the null hypothesis actually is
- c. how influential the result of the experiment will be
- d. both a and b

A researcher performs two null hypothesis significance tests. One comes out at p = .04, the other at p = .06. The researcher decides that in the first case the null hypothesis should be rejected, but in the latter case, the null hypothesis is probably true. Which of the following statements about this situation is correct? (Answer: A)

- a. The traditional .05 criterion is an arbitrary cutoff, so the researcher should be wary of this conclusion.
- b. The researcher is correct; a null hypothesis is false if and only if the p value is less than .05.
- c. The researcher should probably run the second experiment again, until both p values are under .05.

A result obtained in an experiment is said to be "statistically significant" when: (Answer: C)

- a. the result has been replicated in multiple experiments.
- b. the result has some important real world implications.
- c. the probability of the result occurring by chance is less than our alpha-level.
- d. the result is consistent with our null hypothesis.

Suppose that on average, American households own 1 pet. You suspect that Massachusetts households might own more pets than the typical American household. Which of the following would be the null hypothesis? (Answer: A)

- a. H0: μMass = 1
- b. H0: μMass > 1
- c. HA: μMass = 1
- d. HA: μMass > 1

Rejecting the null hypothesis when the null hypothesis is actually true is known as a: (Answer: A)

- a. Type I error
- b. Correct decision
- c. Type III error
- d. Type II error

Which of the following statements would be a typical null hypothesis? (Answer: C)

- a. a defendant is guilty
- b. there is a relationship between income and height
- c. there is no difference between incomes of male and female doctors
- d. a given coin does not come up heads 50% of the time

## Unit 6: Tests for Means

Main example: Imagine that I know that on average, adults sleep for 7 hours per night. What I want to know is whether this is true for college students.

- Null hypothesis: college students, like others, sleep an average of 7 hours per night. H0: μ(college) = 7
- Alternative hypothesis: college students do not sleep an average of 7 hours per night. HA: μ(college) ≠ 7
- Psychologists usually use a two-sided alternative, even when they do have a more specific hypothesis about the direction in which the mean may differ from the null value.

Suppose we have 110 college students in our sample, and the mean hours of sleep they report is 7.42, with sd = 1.75.

- The sampling distribution of the sample mean will be N(μ, σ/√n).
- In this case, under the null hypothesis, the sampling distribution of the sample mean will be N(7, σ/√n); N(7, 1.75/√110) = N(7, 0.17)
- Since 7.42 is 0.42 above the null mean, we also consider the value 0.42 below it, which is 6.58, BECAUSE THE ALTERNATIVE IS TWO-SIDED.
- Now, we use the z test statistic, which expresses how far our sample mean is from the null hypothesis value:
- z = (x̄ - μ0) / (σ/√n)
- The numerator is the sample mean minus the mean under the null hypothesis.
- The denominator is the standard error of the mean.
- In this case, z = 2.52: the sample mean is 2.52 SEs above the mean under the null hypothesis.
- Using the p-value table, we see that p is between 0.01 and 0.02; we say p < 0.02.

Conclusion: Based on our study, we reject the null hypothesis that college students sleep an average of 7 hours per night (z = 2.52; p < 0.02 by two-tailed test). College students appear to sleep more than 7 hours per night.

- As the test statistic gets bigger, the p-value gets smaller.
- One-sided p-values are always half the two-sided p-values (when the sample mean falls in the predicted direction).
- A one-sided alternative will then give you a smaller p-value, but it may be regarded as "cheating" (making it way too easy to reject the null hypothesis) unless there is a very good reason to adopt a one-sided alternative.

Example no. 2: Assume that a sample is n = 70; the sample mean is 6.80 and the sd is 1.4. After the calculations, our z = -1.195.

- For a two-sided alternative, the sign of the z test statistic is ignored; instead, its absolute value is compared to the values in the p-value table.
- So, p > 0.20.

Conclusion: Based on our study, we are unable to reject the null hypothesis that college students sleep 7 hours per night on average (z = -1.195; p > 0.20 by two-tailed test).

- The difference between our sample mean and the null hypothesis value is not statistically significant.

Now, when our sample size is smaller (n < 50), we use t instead of z; we also replace σ with s.

Example no. 3: Instead of a 110-student sample, let's say we only have a sample of 20, with the mean being 7.42 and the sd being 1.75.

- t = (7.42 - 7) / (1.75/√20)
- t = 1.07
- Remember the degrees of freedom when the sample size is less than 50: 20 - 1 = 19
- 1.07 is smaller than any value in the df = 19 row of the table, so p > 0.20 by a two-tailed test, and p > 0.10 by a one-tailed test.
- When reporting, specify the degrees of freedom in parentheses.

Conclusion: Based on our study, we are unable to reject the null hypothesis that college students sleep 7 hours per night on average (t(19) = 1.07; p > 0.20 by a two-tailed test).

- Researchers in psychology almost always use the t-test rather than the z-test, even if the sample size is technically large enough to use the z-test.
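The three sleep examples above all use the same calculation: (sample mean - null mean) divided by the standard error. The sketch below is a hedged Python illustration rather than the course's own method; `one_sample_statistic` and `two_sided_p_from_z` are hypothetical helper names. The two-sided p-value is computed only for the z examples, since the Python standard library has no t distribution; for the t example the notes' table lookup (or R) is still needed.

```python
import math
from statistics import NormalDist


def one_sample_statistic(sample_mean, null_mean, sd, n):
    """(sample mean - null mean) / (sd / sqrt(n)); read as z or t."""
    return (sample_mean - null_mean) / (sd / math.sqrt(n))


def two_sided_p_from_z(z):
    """Two-sided p-value: area in both tails beyond |z| under N(0, 1)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


z1 = one_sample_statistic(7.42, 7, 1.75, 110)  # main example, n = 110
z2 = one_sample_statistic(6.80, 7, 1.4, 70)    # example no. 2, n = 70
t3 = one_sample_statistic(7.42, 7, 1.75, 20)   # example no. 3, df = 19

print(round(z1, 2), round(z2, 3), round(t3, 2))  # 2.52 -1.195 1.07

p1 = two_sided_p_from_z(z1)  # falls between 0.01 and 0.02, so p < 0.02
p2 = two_sided_p_from_z(z2)  # larger than 0.20, so we do not reject
```

The same sample mean and sd give a much smaller statistic with n = 20 than with n = 110, because the standard error in the denominator grows as n shrinks.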
- A two-sided test with an alpha-level of 0.05 will reject the null when the null value falls outside a 95% CI for the mean.
- This goes both ways, so if we reject the null hypothesis, the null value falls outside a 95% CI for the mean.
- We can always apply this rule: for a two-sided hypothesis test with any alpha-level of our choosing, a CI with a confidence level of 1 - alpha will not include the null hypothesis value when the test rejects the null.
- Example: if alpha is 0.10, the 90% CI won't include the null hypothesis value when the null is rejected.

Testing the difference between means: distinguishing two types of situations:

- Matched-pairs data: we have only one sample of individuals, but two observations of each individual (this design is more powerful: it is more likely to reveal that the null is false, if it actually is).
- Two-sample data: we have two distinct samples of individuals, but only one observation of each individual.

Example: "Was the mean grade in this class on HW2 significantly different from the mean grade on HW1?" Here, the two observations come from the same individuals, so they are not independent: matched pairs.

Example: "Was the mean in the class on HW1 significantly different from the mean grade on HW1 the last time I taught this class?" Here, different individuals are compared: two-sample.

Matched-pairs data exist:

- Any time we have before-and-after measurements on the same individual
- Any time we give the same individual two different tests
- Any time we test the same individual under two different experimental conditions

Example:

Question 1: Matched-pairs. Are college GPA scores different from high school GPA scores? (using 40 students)

- H0: μ(difference) = 0
- HA: μ(difference) ≠ 0

Step 1: make a difference score for each individual

Step 2: compute the mean and sd of the difference scores: mean = 0.6855, sd = 0.5757579

Step 3: plug into the t test formula: t = (0.6855 - 0) / (0.5758/√40), so t = 7.529494

Step 4: look at the p-value table: p < 0.001

Now, suppose we have two independent samples:

- We will do the same thing with a different t formula:
- t = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2)

Question 2: Two-sample. Do men and women differ in HSGPA? (HSGPA is the one and only observation per individual); we have 40 individuals: 22 men and 18 women.

- In this case, after all of the calculations, t = -0.658.
- For the degrees of freedom, we take the smaller of (n1 - 1) and (n2 - 1).
- In this case, the df would be 17 (18 women - 1).
- p-value = 0.5119

Quiz Questions:

Which of the following is a matched pairs experiment? (Answer: C)

- a. an experiment where there's a statistically significant difference
- b. an experiment that includes both control and experimental subjects
- c. an experiment where each subject undergoes two different treatments
- d. an experiment where subjects are randomly assigned to one of two treatments

A researcher is interested in the effectiveness of a new form of psychotherapy. Twelve clinic patients are randomly assigned to each of two groups, with one group receiving this new form of therapy, the other receiving traditional therapy. The therapies are evaluated by giving each patient a depression survey after treatment. What is the null hypothesis in this situation? (Answer: B)

- a. H0: μdiff = 0
- b. H0: μnew = μtraditional
- c. H0: μnew < .05
- d. H0: μnew = 0

A 95% Confidence Interval for the mean extends from -2 to 13. This means that: (Answer: D)

- a. We WOULD reject the null hypothesis that μ = 0, with a two-sided test and alpha = .10.
- b. We WOULD NOT reject the null hypothesis that μ = 0, with a two-sided test and alpha = .10.
- c. We WOULD reject the null hypothesis that μ = 0, with a two-sided test and alpha = .05.
- d. We WOULD NOT reject the null hypothesis that μ = 0, with a two-sided test and alpha = .05.

Why? 0 falls between -2 and 13, so the null is not rejected. A 95% CI corresponds to an alpha-level of 0.05, because 1 - 0.05 = 0.95.

How many degrees of freedom are there for a t-test, when you're testing a null hypothesis about a single population mean? (Answer: D)

- a. There are no degrees of freedom for a t-test
- b. complicated formula that cannot be calculated by hand
- c. (n1 + n2) - 2
- d. n - 1

A researcher performs a two sample t test, with the result p < .01 (two-tailed). Which of the following is the proper interpretation of this result? (Answer: C)

- a. The probability that there is no real difference between population means is less than .01.
- b. The probability that there is actually a difference between population means is less than .01.
- c. The probability of obtaining a difference between sample means as large as obtained in this experiment is less than .01, if in fact there is no difference between population means.

If statistical software is not available, which of the following is an acceptable method of calculating the df for a two-sample t test? (Answer: D)

- a. (n1 + n2) - 1
- b. n1 - n2
- c. n1 + n2
- d. whichever is smaller of n1 - 1 and n2 - 1

Difference scores are computed in order to perform which of the following? (Answer: A)

- a. a matched-pairs t test
- b. a two-sample t test
- c. a one-sample t test

A researcher performs a one-sample t test, and gets t(19) = -3.1. This means that: (Answer: B)

- a. the null hypothesis mean is equal to the sample mean plus 3.1.
- b. the sample mean is 3.1 standard errors below the null hypothesis mean.
- c. the sample mean is 3.1 times less likely than the null hypothesis mean.
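The two Unit 6 formulas for comparing means can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names are made up, and the group means and sds passed to `two_sample_t` are invented placeholder values, because the notes report only the group sizes (22 men, 18 women) and the resulting t. The matched-pairs call uses the mean and sd of the difference scores given in the notes.

```python
import math


def matched_pairs_t(mean_diff, sd_diff, n):
    """t for H0: mu(difference) = 0, with df = n - 1."""
    t = (mean_diff - 0) / (sd_diff / math.sqrt(n))
    return t, n - 1


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2).

    df uses the by-hand rule from the notes: the smaller of
    n1 - 1 and n2 - 1.
    """
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se, min(n1 - 1, n2 - 1)


# Question 1 from the notes: college GPA vs. high school GPA, 40 students.
t_paired, df_paired = matched_pairs_t(0.6855, 0.5757579, 40)
print(round(t_paired, 2), df_paired)  # 7.53 39

# Question 2 shape only: the means and sds here are HYPOTHETICAL values.
t_two, df_two = two_sample_t(3.40, 0.50, 22, 3.50, 0.45, 18)
print(df_two)  # 17
```

The matched-pairs result agrees with the notes' t = 7.529494 to two decimal places; the small gap presumably comes from the rounded mean and sd used as inputs here.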
17 Wednesday, November 25, 2015 Unit 7: A population proportion (not mean): • To determine if the conditions for the normal approximation are met, we need np to be at least 10 and for n(1-p) to be at least 10 For a population proportion, np0 must be at least 10 and n(1-p0) must be at least 10 • • There are two different ways to do a null hypothesis when using a population proportion: (1) using the binomial distribution to get an exact p-value (2) using the normal approximation to get a p-value (only if the conditions above are met) Chi-square test of goodness of ﬁt: The ﬁrst step is to compare the observed and expected frequency of each outcome • • We then compute the difference of the two for each row • The chi squared equation = E ((observed-expected)^2 / (expected)) • The E is the ((observed-expected)^2 / (expected)) of each row added together • The degrees of freedom is the number of categories minus 1 • The E and the degrees of freedom is what will be used to ﬁnd the p-value We can also use a chi-squared test to test the distribution in more than one population: Chi-square test of independence: • We can test if the variables are dependent or not • In this case, we need to calculate the expected count: ((row total x column total) / n) • Now, we can use the observed and expected count to ﬁgure out the ‘E’ value For a test of independence, the degrees of freedom = (number of rows -1) x (number of • columns -1) • The higher the chi-square number, the more that cell frequency departs from what is expected based on independence For both of these tests, speciﬁc conditions must be met: All expected counts have to be 1 or more • • No more than 20% of cells can have expected counts less than 5 • This means, if you have a 2x2 contingency table, no cells expected count can be less than 5 • If these assumptions are not met, the p-value you get from the test is incorrect • There is also o such thing as a ‘one-sided’ or ‘two-sided’ alternative hypothesis in the case of 
chi-square

Quiz Questions:

A researcher wants to use the normal approximation to the binomial distribution to test a null hypothesis about a population proportion. Which of the following states the conditions that must be met?
a. n*p0 is 10 or greater
b. n*p0 is 10 or less
c. n*(1-p0) is 10 or greater
d. both (a) and (c)
Answer: d

You are trying to determine if a six-sided die is fair, i.e., unbiased. If you throw the die 150 times, what are the expected frequencies for the purpose of a chi-square test?
a. 30, 30, 30, 30, 30, 30
b. 50, 50, 50
c. 25, 25, 25, 25, 25, 25
d. 75, 75
Answer: c

One way in which a chi-square test is different from a t-test is that:
a. you do not get a p-value as a result from a chi-square test
b. you have to consider the degrees of freedom when doing a chi-square test
c. there is no such thing as a 'one-sided' or 'two-sided' alternative hypothesis in the case of chi-square
Answer: c

Which of the following is NOT an assumption that must be met in order for a chi-square test to be valid?
a. All observed counts have to be 1 or more
b. All expected counts have to be 1 or more
c. No more than 20% of cells can have expected counts of less than 5
Answer: a

How many degrees of freedom are there for the chi-square test of goodness of fit?
a. Number of categories minus 1
b. There are no degrees of freedom for the chi-square test of goodness of fit
c. (# of rows-1)*(# of columns -1)
d. N-1
Answer: a

How many degrees of freedom are there for the chi-square test of independence?
a. There are no degrees of freedom for the chi-square test of independence
b. N-1
c. (# of rows-1)*(# of columns -1)
d. Number of categories minus 1
Answer: c

You know that in the past, 70% of students in an introductory calculus class were engineering majors, 20% were other science majors, and 10% were majors in another field. You want to know whether the distribution has changed significantly in this year's class.
Which of the following is appropriate in this situation?
a. chi-square test of independence
b. test for a population proportion, using the binomial distribution
c. chi-square test of goodness of fit
d. test for a population proportion, using the Normal approximation
Answer: c

Unit 8: Relationships between quantitative variables
This consists of graphing and comparing two quantitative variables
- They can have either a positive or a negative correlation (basically shown in the slope)
- Whatever variable is on the x-axis is the independent or explanatory variable
- Whatever variable is on the y-axis is the dependent or response variable

Linear graphs
- The points fall roughly along a straight line
- The change in the y-variable is about the same for any given change in the x-variable
- Graphs that aren't linear are considered non-linear or curvilinear

Numerical measurements
- Correlation coefficient (r): This measures the strength of the linear association between variables
- Its values range from -1 to +1
- The negative values go with a negative association, and the positive values go with a positive association
- Values near -1 or +1 mean there's a strong association, and values near 0 mean there's a weak association
- The correlation coefficient should not be used when the graph is non-linear and/or has outliers

Associating two variables:
- A direct causal relationship: hours studying (x) influences grades earned (y)
- An indirect causal relationship: hours studying (x) influences grades, which influences parents' approval (y)
- A common cause: There is some third variable that influences both

More about the linear relationship:
- The line that captures the relationship between the x and y variable is known as the regression line
- y = mx + b, where m is the slope and b is the y-intercept
- It is the best line, because it is the line that makes the sum of the squared vertical distances between the points and the line as small as possible
- Basically, for a point, (1) find the vertical
distance between that point and the line (2) square the distance (3) do this for every point (4) add up all of the values

R squared:
- This is the proportion of the variance in y that is 'explained by' its relationship with x
- Ex.: If r = -.88, the R squared is 0.77 or 77%; this means that 23% of the variance is 'unexplained'
- The regression line accounts for the explained variance; the unexplained variance is the scatter of the points around the line
- The regression line gives us a predicted y-value for every x-value; this predicted y-value is y-hat
- Sometimes, the actual y value will be above or below the predicted y-value; this is the '23%' that is 'unexplained'
- The unexplained part of each y value (the actual y minus the predicted y) is its residual
- When plotting the residuals, we want to see a random scatter around 0, for all values of our x variable
- If this doesn't happen, the relationship may not be linear
- Sometimes plotting the residuals can show heteroscedasticity: there is more variability in y for some parts of the range of the x variable
- We also can't extrapolate: the regression line should not be used to predict y for x values outside the range of the data

Others:
- Everything above was based on a sample; what about a population?
- A non-zero correlation or regression slope in the sample does not guarantee a non-zero value in the population; a t-test is used to test the null hypothesis that the population correlation (or slope) is 0
- For this t-test, the hypothesized value is 0
- This means that the t-test formula is: t = (statistic) / (estimated standard error of the statistic)
- The degrees of freedom is n-2
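The Unit 8 quantities can be worked through numerically. The following is a minimal sketch, assuming made-up data points and a hypothetical helper named `linear_fit`; it uses only the Python standard library. It computes the least-squares regression line, the correlation coefficient r, R squared, the residuals, and the t statistic for the slope with n - 2 degrees of freedom.

```python
import math


def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)                       # spread of x
    syy = sum((y - mean_y) ** 2 for y in ys)                       # spread of y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # minimizes the sum of squared vertical distances
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)       # correlation coefficient, between -1 and +1
    return slope, intercept, r


# Hypothetical data, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r = linear_fit(xs, ys)
r_squared = r ** 2                       # proportion of variance in y explained by x

# Predicted values (y-hat) and residuals (actual y minus predicted y).
y_hat = [slope * x + intercept for x in xs]
residuals = [y - yh for y, yh in zip(ys, y_hat)]

# t test of H0: population slope = 0, so t = statistic / estimated standard error.
n = len(xs)
df = n - 2
sse = sum(e ** 2 for e in residuals)                     # unexplained sum of squares
mean_x = sum(xs) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
se_slope = math.sqrt(sse / df) / math.sqrt(sxx)          # standard error of the slope
t = slope / se_slope

print(f"y-hat = {slope:.2f}x + {intercept:.2f}")
print(f"r = {r:.3f}, R squared = {r_squared:.2f}")
print(f"t({df}) = {t:.2f}")
```

Plotting `residuals` against `xs` is the residual plot described in the notes: a random scatter around 0 is what a linear relationship should produce.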
