This 37-page set of class notes was uploaded by Nate Dickstein on Saturday, January 24, 2015. The notes belong to PSYCH240 at the University of Massachusetts, taught by Jeffrey Starns in Fall. Since its upload, it has received 193 views. For similar materials see Statistics in Psychology in Psychology at the University of Massachusetts.
Class 1 (1/20/15)

About the class:
- Know the course objectives for each section to get a better understanding of what you should know for that section and its test.
- Bring a calculator; bring paper to labs.
- Up to 12 SONA credits (the syllabus has details).
- Round to 2 decimal points for everything throughout the semester.

- Scientific revolution: knowing that your ideas might be wrong and testing them against observations to find out if they are wrong.
- Statistics: mathematical tools to organize, analyze, and interpret data. We can never be certain about anything, so we try to quantify how uncertain we are: ways to estimate the probability that something is true, or to estimate a range of values for different quantities.

Class 2 (1/22/15)

- Do you understand the world you live in? To understand the world we need information. We have all this information, but the problem is understanding it. We look at large data sets and try to find out what we can learn from them. We can't just look at the raw data because there is too much information; this is when we use statistics to mathematically organize all the information.
- Two types of statistics:
  - Descriptive statistics: describe a data set.
  - Inferential statistics: use data to make general conclusions, i.e., conclusions that go beyond the data at hand.
- Terminology:
  - Variable: a condition, characteristic, or measure that can take on different values.
  - Value: one of the possible levels that a variable can take.
  - Score: the value of a variable observed for a particular data-collection unit.
  - Distribution: the collection of scores across a number of data-collection units.
- Types of variables (you have to know what kind of variable you are considering to know what kind of conclusions you can make):
  - Numeric (quantitative) variables: the values of the variable are numbers (e.g., height).
  - Nominal (categorical/qualitative) variables: the values of the variable are verbal labels (e.g., marital status). Sometimes people use numbers as labels for categorical variables; this is still
considered a categorical variable, since the values don't actually code a numerical quantity.
- Types of numerical variables:
  - Equal-interval: equal-sized changes in the variable represent equal-sized changes in what is being measured (e.g., temperature).
  - Ordinal: the values of the variable only indicate the rank of the score (e.g., finishing position in a race, rank in a class). Rating scales should probably be considered ordinal, but some researchers treat them as equal interval; watch out for this.
  - Continuous: a variable for which there are no gaps in the values it can take; there are an infinite number of possible values between any two scores (e.g., height, speed, response time).
  - Discrete: a variable for which there are gaps in the values it can take; some scores simply never occur (e.g., number of siblings, number of prior marriages).
  - Ratio: a type of equal-interval variable for which a value of zero truly indicates the complete absence of what is being measured (e.g., pounds of beef consumed in a year, bank account balance). 0 degrees Fahrenheit does not mean the absence of temperature; you can go lower than 0 degrees Fahrenheit or Celsius. Absolute zero, however, is as cold as it can get, so the Kelvin scale is a ratio scale. A variable must have a true zero point to determine what ratio is formed by two of its values (is one value half of another value? a third of the other value? twice the other value?). Example: on one September day the temperature is 50 degrees Fahrenheit in Albany, NY and 100 degrees Fahrenheit in Austin, TX. The newscaster says it is twice as hot in Austin as it is in Albany. This is technically incorrect because 0 degrees Fahrenheit is not really zero: you can go below 0.

Section 2
- Once we have data to evaluate, we need to organize the data in ways our minds can understand.
- There are three pieces of information you need to have a complete picture of a variable: the central tendency, the variability, and the shape of the distribution of scores.
- Central tendency: the goal is to give a
number that describes the typical score for the variable; this is called a measure of central tendency. The two most popular measures of central tendency are the mean and the median.
- Mean (average): the sum of the scores divided by the number of scores. M = ΣX / N, where M is the mean, Σ means "sum of," X is the scores, and N is the number of scores.
  - Example: scores 2, 4, 5, 0, 1, 6. ΣX = 18, N = 6, so M = 3.
  - Deviations from the mean (score − mean): 2 − 3 = −1; 4 − 3 = 1; 5 − 3 = 2; 0 − 3 = −3; 1 − 3 = −2; 6 − 3 = 3.
  - Properties of the mean: (1) the sum of all the deviations from the mean is zero; (2) the sum of the squared deviations from the mean is lower than the sum of the squared deviations from any other number.
- Median: the value at the middle of a sorted set of scores. For an odd number of scores, sort the scores and take the value of the middle one; for an even number of scores, sort the scores and take the average of the two middle numbers.
  - Example: scores 2, 4, 5, 6, 0, 1. Sorted: 0, 1, 2, 4, 5, 6. The two middle scores are 2 and 4, so the median is 3.

Class 3 (1/29/15)

- Central tendency (cont.): sometimes the median will have the same value as one or more of the actual scores, and sometimes it won't. There will usually be an equal number of scores above and below the median; the only exception is variables that have more than one score with the same value as the median. Example: 1, 2, 2, 2, 2, 2, 2, 2, 4, 6, 9 — the median is 2, so there is only one score below the median and three above it.
- Variability: now we want a single number to summarize how divergent the scores are from one another. If the scores all have very similar values, variability is low; if some scores have values that are very different from other scores, variability is high.
- Range: a very simple measure of variability; the difference between the highest and lowest scores. Instead of just looking at the range, we more frequently use statistics that are based on all of the scores, not just the highest and the lowest. One way to think about it is: how much do the scores deviate?
- Variance: approximately the average squared deviation from the mean. S² = Σ(X − M)² / (N − 1), where S² is the variance, Σ means "sum of," X is all of the scores,
M is the mean, and N is the number of scores. If we were taking the true average we would divide by N; we divide by N − 1 instead because this has some nice statistical properties.
  - Example: scores 2, 4, 5, 0, 1, 6 (N = 6, M = 3). Squared deviations (X − M)²: (−1)² = 1, 1² = 1, 2² = 4, (−3)² = 9, (−2)² = 4, 3² = 9.
- The variance is in squared units, so it's not always easy to interpret. We usually work with the standard deviation, which is simply the square root of the variance: S = √S². This lets us work with the original units, and you can interpret it as how much the scores deviate from the mean on average.
- Shape: how the distribution looks when graphed. Most researchers don't typically use summary statistics to describe the shape of a distribution; instead they mostly use graphs (histograms) to visualize the distribution shape.
- Histogram: a graph that shows the frequency, i.e., the number of scores in a data set that equal a given value or fall in a given range of values.
- Symmetrical distribution: if you draw a line in the middle of the range of scores, one side of the distribution looks very similar to the other side on a histogram (it doesn't have to match exactly).
- Positively skewed distribution: if you draw a line in the middle of the range of scores, scores will be concentrated on the left side of the line, with fewer scores (a tail) on the right side.
- Negatively skewed distribution: scores will be concentrated on the right side of the line, with fewer scores on the left side.
- Floor effects: many of the scores are concentrated near the lowest possible value for the variable; this often leads to positively skewed distributions.
- Ceiling effects: many of the scores are concentrated near the highest possible value for the variable; this often leads to negatively skewed distributions.
- Mean vs. median: the mean is a misleading measure of central tendency for distributions with extreme skew and/or outliers. The median is preferred for skewed variables since it is less sensitive to extreme scores. For a positively skewed distribution, the mean
will almost always be higher than the median; for a negatively skewed distribution, the mean will almost always be lower than the median; for a symmetrical distribution, the mean and median will have around the same value.
- Unimodal vs. bimodal:
  - Mode: technically, the one most common score; we also use it more generally to mean a high point in the distribution.
  - Unimodal distribution: only one peak in the distribution.
  - Bimodal distribution: two peaks in the distribution with a clear dip in between. Bimodal distributions often arise when there is a mixture of different types of scores in the data set, like men and women, or students and professors.
  - Uniform distribution: a distribution with an approximately equal number of scores across the entire range of values.
- Quantiles: useful when it isn't convenient to show a whole graph but you still want to convey all three qualities of a distribution (central tendency, variability, and shape); chocolate-chip example given in class.
  - Quantiles: a set of cutoff scores that divides a distribution into N regions with an equal proportion of scores in each region.
  - Quartiles: quantiles that divide the distribution into 4 regions, each with 25% of the scores.
  - Deciles: quantiles that divide the distribution into 10 regions, each with 10% of the scores.
  - Percentiles: quantiles that divide the distribution into 100 regions, each with 1% of the scores.
  - The middle quartile is the median, a good measure of central tendency. The distance between the 1st and 3rd quartiles is a good measure of variability called the interquartile range; you compute it by subtracting the 1st quartile from the 3rd quartile, and higher values indicate more variable distributions. The interquartile range is the range in which half of the scores fall around the center of the distribution.

Class 4 (2/3/15)

Section 3
- Proportion: the number of elements with a certain characteristic divided by the total number of elements.
- Percentage: the proportion times 100.
- Z-scores: a Z-score tells you the number of
standard deviations that a score falls above or below the mean. Even if a score is far from the mean in raw units, it still might be a typical score, depending on how much scores typically deviate from the mean. Z = (X − M) / S, where Z is the Z-score, X is the raw score, M is the mean, and S is the standard deviation.
- Example: the height of American females has a mean of 64 inches and a standard deviation of 3 inches, so M = 64 and S = 3. What is the Z-score for a woman 70 inches tall? X = 70, so Z = (70 − 64) / 3 = 2: the woman is two standard deviations taller than the average.
- Regardless of the original distribution, the Z-score distribution will have a mean of zero (M = 0), a standard deviation of 1 (S = 1), and the same shape as the original distribution. Z = 0 means the score equals the mean, so the score is very typical. Higher Z's indicate more atypically high scores; lower Z's indicate more atypically low scores.
- You can also get back to a raw score from a Z-score: X = Z·S + M. Example: US females have an average height of 64 inches with a standard deviation of 3. Sandra has a Z-score of −0.6. How tall is she? X = (−0.6)(3) + 64 = 62.2 inches.

Section 4
- Correlation and prediction: how do you take one variable and use it to predict another variable, or determine how it's related to another variable?
- Scatterplot: a graph in which the x-axis is the score on one variable and the y-axis is the score on the other variable; each point on the plot is one data-collection unit. To define the relationship between the two variables mathematically, we'll fit a line that shows how one variable tends to change across changes in the other variable; this is called the least-squares regression line.
- The regression line is defined by the linear equation Ŷ = a + bX, where Ŷ is the predicted score for variable Y, X is the value of the score on variable X, a is the y-intercept (or regression constant), the predicted Y value when X = 0, and b is the slope (or regression coefficient), the predicted change in Y for every one-unit change in X.
- Y is the criterion variable (what you try to predict); X is the predictor variable (what you use to predict Y). Example: Y is free-throw percentage and X is height, with Ŷ = 172.60 − 1.29X.
- Slope: you can calculate the
slope given two X values and their corresponding Ŷ values: b = (Ŷ₂ − Ŷ₁) / (X₂ − X₁).

Class 5 (2/5/15)

- Y-intercept: the predicted Y value when X = 0. The Y-intercept won't make sense if 0 is not a possible X value. For instance, in the free-throw example this value suggests that players who are 0 inches tall will make over 100% of their free throws, which is impossible, of course. In this case, just think of the intercept value as something you need to move your line up or down to line it up with the data. More generally, fitting a regression function will rarely produce good predictions for the criterion variable outside of the range of data that you have for the predictor variable. For example, say I take the height of kids from their 6th to 10th birthdays and find a linear relationship such that predicted height increases by two inches per year. It would not be reasonable to predict that the kids will grow another 120 inches (10 feet) between their 10th and 70th birthdays. Having data in the older age range would allow us to make a better prediction for height on their 70th birthday.
- Regression line: why is it called "least squares"? The least-squares regression line minimizes the sum of squared deviations between the predicted and observed Y values; that is, no other line will make better predictions for Y based on X.
- Strength of the relationship: we need a way to express how strongly the two variables are related. The slope does not give this strength by itself; slope is affected both by the strength of the relationship AND the relative scale of the two variables (like using centimeters vs. inches).
- A common measure of relationship strength is called the proportion of variance accounted for, r². This statistic measures how much better we can predict Y using information about X versus NOT using information about X. You won't have to calculate this from scores, but you will need to understand what it is. r² is not affected by scale, so we can use it as a pure measure of relationship strength. r² = 0 means there is no relationship
between the two variables: knowing about one variable does not help you predict the other variable at all. r² = 1 means there is a perfect relationship between the variables: knowing about one variable allows you to perfectly predict the value of the other variable every time.
- Another way to measure the strength of a relationship is with correlation. A correlation is a measure of the degree and direction of a relationship between two variables. For a correlation, we do a regular regression except that we convert both variables to Z-scores first, because Z-scores put everything on the same scale. So a correlation is the slope you get from a regression on Z-scores.
- Interpreting r: the sign (+ or −) indicates the direction of the relationship. + means the Y variable increases as the X variable increases; − means the Y variable decreases as the X variable increases. The absolute value indicates the strength of the correlation: 0 is the weakest correlation, and 1 or −1 is the strongest. You can get r² by just squaring the correlation coefficient r.
- Linearity of the relationship: the correlation coefficient assumes a linear relationship between the two variables. If the variables are related by any other type of function, then the correlation coefficient might be low even though the variables are strongly related. You should always look at scatterplots to make sure the relationship between the variables is linear; if it isn't, then don't use linear regression or correlation.
- Non-linear relationship: a relationship between variables that approximately follows a pattern that is not a straight line.
- Causality: how do we figure out what's causing what in a relationship between two variables? There are 3 possible directions of causality: (1) X could be causing Y; (2) Y could be causing X; (3) another variable could be causing both X and Y.
- Correlation does not imply causation, BUT correlational evidence cannot all be dismissed automatically; you have to evaluate each case to figure out which pattern of causation is
most likely.
- Establishing causality: what do we do when we want to make causal statements?
  1. Run an experiment. Instead of just measuring two variables and trying to predict one with the other, we directly manipulate an independent variable and measure its effect on a dependent variable. All variables except the independent variable are either held constant across conditions or randomly assigned to conditions to prevent a systematic effect, so any changes in the dependent variable can be uniquely attributed to the independent variable.
  2. Get more information on the time course of the relationship and on possible outside variables that might produce the relationship. If you can rule out plausible outside variables and establish that changes in X precede changes in Y, then you have stronger evidence that X causes Y.
  3. Establish how one variable causes the other. We can develop theories of the mechanism by which one variable changes another and test these theories in controlled experiments. With knowledge of the underlying mechanisms, we can make stronger inferences about relationships we find in the real world.

Section 5
- Probability:
  - Empirical probability: the proportion of times that an outcome occurs in some number of observed attempts.
  - Theoretical probability (objective definition): the empirical probability we would get from an infinite number of attempts.
  - Probabilities can be expressed either as proportions or percentages: you can either say a probability is .75 or 75%.
  - Theoretical probability (subjective definition): the extent to which you should expect or believe something. On days with a 95% chance of rain, you should strongly expect that you will need an umbrella. Subjective probabilities are a great way to represent our degree of certainty or uncertainty. If you know absolutely nothing, every possible outcome is equally likely as far as you know; if you know everything there is to know, then you know for sure that one outcome will happen (100% probability) and the others won't
(0% probability).
- One outcome we are often interested in is whether or not our beliefs will turn out to be true; subjective probabilities can represent how confident we are that we are right. Subjective probabilities represent our ignorance: making probability statements does not necessarily mean that the world is inherently unpredictable. Even if events are actually deterministic, we have to make probabilistic statements about them because we never know all of the determining factors.

Class 6 (2/10/15)

- Being scientific means expressing an appropriate level of confidence in our beliefs. If strong evidence is not available, the scientific thing to do is acknowledge our uncertainty.
- Combining probabilities:
  - Addition rule: the probability of getting any one of multiple mutually exclusive outcomes is the sum of the individual probabilities of each outcome. The probability of getting a 1 on a dice roll is 1/6; the probability of getting a 2 is 1/6; the probability of getting either a 1 or a 2 is 2/6.
  - Multiplication rule: the probability of simultaneously observing multiple independent outcomes is the product of the probabilities of each individual outcome. Always express probabilities as proportions when applying this rule.
  - You believe a lot of things, and you can't be completely certain that any of them are true. The multiplication rule shows that even if you strongly believe in all the things you believe, the chances are that at least one of them is wrong.
- Percentage change: the difference between the new and old value of a variable, divided by the old value, times 100 to make it a percentage. PC = 100 × (NV − OV) / OV, where PC is the percent change, OV is the old value, and NV is the new value. If the value comes out positive, it is a percentage increase; if it comes out negative, it is a percentage decrease. Percent change can be over 100%. When you have the old value and the percentage change, you can figure out the new value: NV = OV + (PC/100)×OV for an increase, and NV = OV − (PC/100)×OV for a decrease.
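A quick sketch of these formulas in Python (the function names are my own, not from the course; the dice probabilities restate the addition- and multiplication-rule examples above):

```python
def percent_change(old, new):
    """Percent change from old to new: PC = 100 * (NV - OV) / OV."""
    return 100 * (new - old) / old

def new_value(old, pc):
    """Recover the new value from the old value and a percent change.
    A negative pc handles the decrease case automatically."""
    return old + (pc / 100) * old

# Dice examples for the probability rules:
p_1 = 1 / 6                  # P(roll a 1)
p_2 = 1 / 6                  # P(roll a 2)
p_1_or_2 = p_1 + p_2         # addition rule (mutually exclusive): 2/6
p_two_sixes = (1/6) * (1/6)  # multiplication rule (independent rolls): 1/36

print(percent_change(4, 6))  # 4 -> 6 is a 50.0% increase
print(new_value(200, -25))   # a 25% decrease from 200 gives 150.0
```

Note that `percent_change(4, 6)` returning 50 illustrates exactly the smoking-program trap discussed in these notes: a 2-percentage-point change reported as a 50% increase.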
When you only have information on the percentage change and not the absolute change, you should be suspicious:
- If you don't know the original value or the new value, then you can't figure out how big the absolute change in the variable was.
- The same percentage change can represent different absolute changes depending on the original value.
- This is especially tricky when the original variable itself is expressed in percentages: a percent increase/decrease means something very different than an increase/decrease of percentage points. If 4% of people quit smoking without a therapy program and 6% quit with the therapy program, you can either say that there was a 50% increase in quitting or that the rate of quitting increased by 2 percentage points. Guess which one advertisers will choose.
- Marginal probability: the overall probability of an event in a population; the marginal probability of A is denoted p(A).
- Conditional probability: the probability of an event just for the subset of the population that has a certain characteristic; the conditional probability of A given B is p(A|B).
- Contingency table: a table that has all the combinations of 2 or more variables and tells you how many of the scores in a population are at each intersection of those variables. Contingency tables are a good way to demonstrate the difference between marginal and conditional probabilities: they show the number of scores at each combination of the levels of multiple variables. You get marginal probabilities by dividing the column or row totals by the grand total. You get conditional probabilities by dividing a cell total by a column or row total; the condition tells you which column or row total to use, and the requested probability tells you which cell to use. Conditional probabilities can be different depending on which variable you use as the condition: p(A|B) does not have to equal p(B|A).

Class 7 (2/11/15)

- Normal distribution: we can define idealized distributions that follow a particular mathematical function; these are called formal
distributions.
- One famous formal distribution is the normal (or Gaussian) distribution, a special distribution that is seen for many natural variables.
- Probability density: when it is high, there are a lot of scores near that value of the variable; when it is low, there are few scores near that value.
- The normal curve is symmetrical, bell-shaped, and unimodal.
- Because the normal curve is defined mathematically, we can also mathematically work out the proportion of scores in a range of values. For example, what proportion of people have IQs between 120 and 140?
- Normal curve: when a variable is normally distributed, fixed proportions of scores fall between the standard-deviation intervals (−3, −2, −1, 0, 1, 2, 3 standard deviations away from the mean). [A figure of the normal curve with these proportions appeared here; the familiar values are about 34% between the mean and 1 SD, about 14% between 1 and 2 SDs, and about 2% between 2 and 3 SDs, on each side.]
- Example: the length of adult eastern diamondback rattlesnakes is normally distributed with a mean of 4.5 feet and a standard deviation of 1 foot. About what proportion of adults are between 5.5 and 6.5 feet? First convert the scores to Z-scores: Z = (5.5 − 4.5)/1 = 1 and Z = (6.5 − 4.5)/1 = 2, so we need the proportion between 1 and 2 standard deviations above the mean. Next we apply our knowledge about how many scores fall between standard deviations in a normal distribution: about 14% of the scores fall in our target region, so the answer is 14%.
- Samples and populations:
  - Population: all possible scores for a variable.
  - Sample: the smaller set of scores actually available to a researcher.
  - Statistic: an index computed from sample data; written with regular letters (sample mean M, sample standard deviation S).
  - Parameter: a characteristic of a population as a whole; written with Greek letters (population mean μ, population standard deviation σ).
- Law of large numbers: sample estimates will tend to converge to population values as sample size increases. The law of large numbers also says that the estimated (empirical) probability will tend to converge to the true (theoretical) probability as sample size increases. Populations don't have to be infinite.
- Samples should be
selected from a population completely at random to avoid sample bias. If samples are not random, sample statistics give biased estimates of population values; that is, the sample value tends to come out consistently lower or consistently higher than the population value.
- About Test 1: a multiple-choice section, a computational section, and a short-answer section. The possible short-answer questions are up on Moodle: there are 4 of them, and only 2 will be on the test, chosen at random.

Class 8 (2/19/15)

- Test 1: 10 multiple choice (30 points), 10 computational (50 points), 2 short answer (20 points). Bring a calculator; formulas will be up on the screen. Use objectives 1-5 as a study guide; possible short-answer questions are up on Moodle.

Class 9 (2/26/15)

- Hypothesis testing: the dog problem.
  - You work for an eccentric trillionaire, Mr. Burns, and he wants to get a new set of guard dogs. You get a purchase order for a bunch of dogs; it's a mixture of bulldogs and German shepherds. Mr. Burns does not want the bulldogs, but all you have on the purchase order are the dogs' names and weights. You tell Mr. Burns that you'll go through the list dog by dog, and you'll only order a dog if you have evidence that it is NOT a bulldog. The evidence will be each dog's weight: German shepherds weigh more than bulldogs on average, so if a given dog is very heavy, that is evidence that it is not one of the bulldogs. You say that you'll pick a weight so heavy that it will be unlikely that any dog above that weight is just a fat bulldog. In fact, you say you'll make sure that there is no more than a 5% chance that a bulldog would be above your cutoff weight.
  - For each dog there are 2 hypotheses: hypothesis 1 is that the dog is a bulldog, and hypothesis 2 is that the dog is not a bulldog. Mr. Burns will let you order when you have evidence for hypothesis 2, and you have assured him that the evidence you use will be specific. Evidence for a hypothesis is specific if the evidence would be unlikely to be observed if the hypothesis were false. Now you need a way to make sure your evidence is as specific as you
promised Mr. Burns. You can do this by carefully setting your standard for "probably too heavy for a bulldog."
- NHST (null hypothesis significance testing): we use this technique when we want to provide evidence against a null hypothesis and in favor of an alternative hypothesis. The null and alternative hypotheses are defined in reference to a comparison population. Follow these 5 steps every time you do a significance test:
  - Step 1: State the null and alternative hypotheses. The comparison population is a population of scores with a known distribution (bulldog weights); the new sample that we are testing may or may not belong to this population. The null hypothesis states that the sample you are testing belongs to the comparison population. The alternative hypothesis states that the sample you are testing does not belong to the comparison population. The two competing hypotheses are mutually exclusive: if the null is false, then the alternative has to be true, and if the alternative is false, then the null has to be true. The alternative hypothesis is usually the one we want to support.
  - Step 2: Specify the characteristics of the comparison distribution: know the shape, mean, standard deviation, and degrees of freedom.
  - Step 3: Determine the critical values. State whether the test is one- or two-tailed (directional or nondirectional). In a one-tailed test, we are only looking for evidence against the null hypothesis in one direction, so there is only one critical value. In a two-tailed test, we are looking for evidence against the null hypothesis in either direction, so there are two critical values; two-tailed tests are nondirectional. State the alpha value: the probability of concluding that we have good evidence for the alternative hypothesis when it is actually false. Researchers typically use alphas of .05 or .01. Report the critical values. Once we have an alpha value, we use it to figure out where to place the cutoffs (the critical values) that we will use to separate significant results from nonsignificant results. Each test
we run will have one of two results: the test will be significant or it will be nonsignificant. Significant means that you obtained evidence for the alternative hypothesis that is specific enough for your desired alpha value; nonsignificant means that you failed to do this. We will always set the critical values so that a proportion of scores equal to our alpha value is more extreme than the critical values in the comparison distribution. Any result more extreme than the critical values is deemed significant, so this makes sure that our chance of getting a significant test when the alternative hypothesis is false equals alpha.
  - Step 4: Calculate the sample score on the comparison distribution.
  - Step 5: State your conclusions: state whether the test result was significant or nonsignificant, and explain what this means for the particular research scenario that was given.
- What can we conclude from NHST? If the sample is more extreme than the critical values, then you have some evidence for the alternative hypothesis; if not, you don't have useful evidence about either hypothesis, and the results are inconclusive. Alpha is NOT the probability that the alternative hypothesis is wrong if you get a significant result; it is the probability that you will get a significant result if the alternative hypothesis is wrong. These do not have to be the same, and usually won't be.

Class 10 (3/3/15)

- "Specific" means: if the hypothesis is false, then the evidence is unlikely to be observed.
- A significant result means you got a result that probably would not have occurred if the null hypothesis were true.
- The comparison distribution shows us what we can expect under the null hypothesis.

Section 7: Central Limit Theorem
- Distributions of means: to figure out if the evidence for the alternative is specific enough to meet our standards, we will have to know how typical a sample mean is if the null hypothesis is true. Theoretically, this involves pulling samples of N scores out of the distribution again and again,
getting the mean of each sample, and forming a distribution with all of those means. N is the sample size, μM is the mean of the distribution of means, and σM is the standard deviation of the distribution of means, also called the standard error of the mean.
- Shape: usually normal.
- Mean: the distribution of means has the same mean as the distribution of scores: μM = μ.
- Variability: the variance of the distribution of means is the variance of the distribution of scores divided by the sample size; the standard deviation of the distribution of means is the standard deviation of the scores divided by the square root of the sample size.
- Central limit theorem: μM = μ, σ²M = σ²/N, and σM = σ/√N, where σ²M is the variance of the distribution of means, σ² is the variance of the distribution of scores, σM is the standard deviation of the distribution of means (aka the standard error of the mean), σ is the standard deviation of the distribution of scores, and N is the sample size.
- The sampling distribution gets less variable as sample size increases.
- The distribution of means will be normal if the distribution of scores is normal, regardless of the sample size. If the distribution of scores is any other shape besides normal, then the distribution of means will still be very close to normal if the sample size is 30 or greater; the shape can be far from normal for smaller sample sizes.
- Why does the distribution of means become less variable with higher N? To get a sample mean that is far away from the population mean, you must draw scores that are consistently far from the mean in one direction, and this becomes less likely with larger sample sizes. If you take a big sample, extreme scores from a skewed distribution of scores will be moderated by less extreme scores in your sample.
- The larger your sample size, the more likely you are to find specific evidence for the alternative hypothesis if it is true. Larger sample sizes produce less variable comparison distributions, and having a less variable distribution to compare to makes it easier to figure out that your sample is atypical.

Class 11 (3/5/15)
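The claim that the standard error shrinks as σ/√N can be checked with a small simulation (my own sketch, not from the lecture; the population parameters are made up for illustration):

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulation is reproducible

def sd_of_sample_means(pop_mean, pop_sd, n, reps=20000):
    """Draw `reps` samples of size n from a normal population and
    return the standard deviation of the resulting sample means,
    i.e., an empirical estimate of the standard error of the mean."""
    means = [
        statistics.fmean(random.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.pstdev(means)

# With sigma = 12 and N = 16, the CLT predicts sigma_M = 12 / sqrt(16) = 3.
observed = sd_of_sample_means(pop_mean=100, pop_sd=12, n=16)
print(round(observed, 2))  # lands close to the predicted value of 3
```

Rerunning with a larger `n` shows the distribution of means tightening further, which is the "less variable with higher N" point above.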
Class 12 (3/10/15)

- Single-sample t-test
- Example: a developmental psychologist runs an experiment in which 9-month-olds watch several small plays in which one puppet helps a bunny climb up a slope (the helper) while another puppet pushes the bunny back down the slope (the blocker). After watching each play, the babies are given the choice of playing with the helper or the blocker. If the play had no influence on their choice, they would pick the helper 50% of the time. A sample of 25 babies picked the helper 61% of the time on average, with a standard deviation of 20%. With an alpha of .01, did the play have any effect on the babies' choices?
- For the single-sample t-test, we test the null hypothesis that a sample came from a population with a given mean. The alternative hypothesis states that the sample did NOT come from a population with the hypothesized population mean. Sometimes the alternative hypothesis is directional and specifies that the sample came from a population with a higher or lower mean; sometimes the alternative is nondirectional and just says a different mean, without saying higher or lower.
- If the sample mean is far enough from the population mean specified in the null hypothesis, then we can take this as evidence against the null and for the alternative.
- If we don't know the population standard deviation, how can we figure out when we have evidence for the alternative hypothesis? (AKA, how can we know if our sample is typical?)
- One solution is to use the sample scores to estimate the population standard deviation, using the formulas that we know already from section 2:
  s² = Σ(X − M)² / (N − 1)
  s = √s²
- Then we can use the sample standard deviation to estimate the standard deviation of the distribution of means:
  sM = s/√N
- Now we can estimate how typical/atypical our sample mean is for a population with the mean specified in the null hypothesis:
  t = (M − μM) / sM
  where M = mean of the sample, μM = population mean according to the null hypothesis, and sM = standard error
of the mean, estimated from the sample.
- This is interpreted the same way as a z score: t values above/below zero indicate that the sample mean is higher/lower than we would expect based on the null hypothesis, and t values farther from zero indicate that the sample mean is more surprising under the null hypothesis.
- If the sample t value is far enough from zero, we can take this as evidence against the null and for the alternative.
- t-distributions
- t-distributions are theoretically constructed like this: randomly sample from a population of scores that follow a normal distribution with a mean equal to the value specified in the null hypothesis, and compute a t value using the mean and standard deviation of each sample. After you do this a bunch of times, the t values will follow a t-distribution.
- We can use the t-distribution as a comparison distribution showing what results we are likely to get if the null is true.
- We don't need to know the population standard deviation: you get the same t-distribution regardless of how variable the scores are in the population.
- t-distributions are unimodal, bell shaped, symmetrical, and shaped like a normal distribution but with fatter tails. t-distributions come closer to a normal distribution as sample size increases.
- The shape of the t-distribution depends on the degrees of freedom in your sample. Degrees of freedom: the number of scores that are free to vary in the calculation of a statistic. For the single-sample t-test, the degrees of freedom are N − 1.
- Where did our degree of freedom go? To estimate the population standard deviation, we need to know all of the scores and the mean that they are varying from:
  s² = Σ(X − M)² / (N − 1)
  s = √s²
- If we knew the mean of the distribution that these scores came from, then the number of varying elements in these formulas would be the number of scores we have. But we DON'T actually know the mean of the distribution that our scores came from. We just have one hypothesis that says it is a certain value and another hypothesis that says it is another value.
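The single-sample t computation for the helper/blocker example above can be sketched end to end (the numbers come from the notes; the comparison against a critical value is left out because that comes from a t table):

```python
# Single-sample t-test for the helper/blocker example:
# null hypothesis mu = 50(%), sample of N = 25 babies with M = 61(%), s = 20(%).
import math

mu = 50.0   # population mean under the null hypothesis
M = 61.0    # sample mean
s = 20.0    # sample standard deviation (estimate of the population SD)
N = 25      # sample size

s_M = s / math.sqrt(N)   # estimated standard error of the mean
t = (M - mu) / s_M       # t statistic
df = N - 1               # degrees of freedom

print(f"t({df}) = {t:.2f}")   # t(24) = 2.75
```

The resulting t would then be compared against the two-tailed critical value for alpha = .01 with df = 24 from a t table.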
To have a mean to use in the formula, we have to use the mean of our sample scores. This doesn't actually add any information: if you already know all the scores in the sample, me telling you the mean of the sample doesn't tell you anything new, because you could have figured it out for yourself using the scores. So if we considered the variability estimate to be based on N degrees of freedom, we would be pretending that we have more information than we actually do.
- We actually have N − 1 degrees of freedom, because we used up one degree of freedom by using the sample scores to estimate the mean.
- Think of that 1 degree of freedom as the penalty that we have to pay for not knowing the true mean and having to estimate it.
- Knowing the t-distribution lets us figure out where to place the critical values to achieve the desired alpha value.
- You can get the critical values from a table or a computer program like R.
- If you are doing a 1-tailed test, then you need a critical value that puts a proportion of t scores equal to alpha in the tail. This is the first three columns in your table. [figure: t-distribution with one tail shaded, t values from −6 to 6]
- Choose the column that corresponds to the requested alpha value and go to the row with the correct degrees of freedom (df). If the df is not on the table, go to the closest one.
- The critical values get closer to zero as the df increases, because the t-distributions are losing their fat tails.
- The table gives you the absolute value of the cutoff, and you have to figure out the sign: use positive to test for an increase and negative to test for a decrease.
- If you are doing a 2-tailed test, then you have to divide alpha evenly between the high and low tails. This is the second three columns in your table. [figure: t-distribution with both tails shaded, t values from −6 to 6]
- Assumptions of the single-sample t-test
- The test assumes that scores are normally distributed at the population level. If the scores do not follow a normal
distribution, then the distribution of t values under the null hypothesis might not actually follow a t-distribution. The test is usually very robust to non-normality as long as the deviation from normal isn't really extreme, such as a very pronounced skew.
- There are many bad methodological practices that invalidate the results of a hypothesis test. You'll cover these in Psych 241, but I want to highlight one that might not be mentioned there: you have to plan the sample size before the study. If you decide to add subjects later to try to make a result significant, then you increase the chance that you will incorrectly claim to have evidence for the alternative hypothesis.
- If you come across a t-test in a journal, you'll probably see something like this: "Results indicated that people knew the names of significantly more guitar players than bass players, t(18) = 4.11, p < .05." The value in parentheses is the degrees of freedom; the value after the equals sign is the t value for the test. (We'll talk later about what that p means.)
- Dependent-samples t-test
- The next two tests that we will learn are used to gain evidence that two samples came from different populations.
- For these tests, the null hypothesis is that the two samples came from identical populations, so they have the same population mean.
- Which test we use depends on whether the two samples are dependent or independent.
- Psych experiments can be divided into between- and within-subjects designs.
- In a within-subjects design, every participant contributes data to all of the conditions/samples.
- Within-subjects designs should always be analyzed with a dependent-samples test. The scores in the two samples are related because they come from the same people (e.g., a subject with a high score in one condition will usually have a high score in the other condition too).
- Between-subjects design: each subject contributes data to one and only one of the conditions.
- Between-subjects designs are usually analyzed with an
independent-samples t-test, but you need to watch out for cases that can create dependencies between the two groups even if they are separate people.
- Test for differences: if knowing a score in one sample could not in any way help you guess the value of a score in the other sample, then use the independent-samples test. This condition will always hold if subjects are sampled as individuals and randomly assigned to a condition. If knowing a score in one sample could help you guess the value of a score in the other sample, then use the dependent-samples test. This condition will hold if subjects are sampled as couples or groups.
- Dependent t-test: the dependent-samples t-test is just a special case of the single-sample t-test. To conduct the test, you first create difference scores by subtracting each score in one sample from its corresponding score in the other sample. After this, you do a single-sample t-test to see if the difference scores could have come from a population with a mean of zero. The null hypothesis is that the two samples come from identical populations, which is equivalent to saying that the population mean of the difference scores is zero.

Class 12 (3/12/15)

- If the low-population-mean condition is subtracted from the high-population-mean condition, then the alternative hypothesis is that the difference scores have a mean above zero. If the high-population-mean condition is subtracted from the low-population-mean condition, then the alternative hypothesis is that the difference scores have a mean below zero.
- Confidence Intervals
- To make a confidence interval, we conceptually run a hypothesis test for EVERY POSSIBLE hypothesized population mean. The confidence interval summarizes which hypothesized population means produce significant versus nonsignificant hypothesis tests.
- Any potential population mean inside the interval has a nonsignificant result; any potential population mean outside the interval has a significant result. So if a potential population mean is outside of the interval, that
means that our sample provides specific evidence against that value as the true population mean.
- 95% confidence intervals use an alpha of .05 to classify results as significant versus nonsignificant; 99% confidence intervals use an alpha of .01.
- You always use a 2-tailed critical value for confidence intervals: we want to rule out population means that seem too high OR too low for our sample.
- If the hypothesized population mean is really close to the sample mean, then you will get a nonsignificant test; that is, the test will not provide evidence against the null hypothesis.
- To construct a confidence interval, we find the cutoffs below and above the sample mean where hypothesized population means switch from being nonsignificant to significant.
- Confidence interval formulas:
  LCL = M − tCV × sM
  UCL = M + tCV × sM
  where LCL = lower confidence limit, UCL = upper confidence limit, M = sample mean, tCV = absolute value of the 2-tailed critical value for the desired alpha level, and sM = standard error of the mean estimated from the sample.
- Confidence interval on a difference mean: for a dependent-samples design, use the exact same formulas to get a confidence interval on the difference between conditions; M and sM are for the difference scores in this case.
- Reporting a confidence interval on the difference-score mean lets you test hypotheses for differences other than zero. Again, anything inside the interval would give a nonsignificant result if it were tested as the true difference between the two conditions in the population; anything outside the interval would give a significant result.
- When the hypothesized mean is actually the true population mean, the probability of a significant result is alpha. So the confidence interval will fail to include the true population mean with probability alpha, and will include the true population mean with probability 1 − alpha.
- The intervals are probabilistic, not the population mean. Each new sample has a new interval based on the mean and standard deviation of that sample.
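The confidence-interval formulas above, applied to difference scores from a dependent-samples design, can be sketched as follows. The paired scores are made up for illustration, and the critical value tCV = 2.262 (2-tailed, alpha = .05, df = 9) is the standard table value for this df.

```python
# 95% confidence interval on a difference-score mean
# (dependent-samples design). The paired scores below are made up.
import math
import statistics

cond1 = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
cond2 = [10, 13, 11, 12, 12, 14, 11, 13, 12, 12]

# Difference scores: condition 1 minus condition 2 for each subject.
diffs = [a - b for a, b in zip(cond1, cond2)]

N = len(diffs)
M = statistics.fmean(diffs)    # mean difference
s = statistics.stdev(diffs)    # sample SD (N - 1 in the denominator)
s_M = s / math.sqrt(N)         # estimated standard error

t_cv = 2.262   # 2-tailed critical value for alpha = .05, df = N - 1 = 9

LCL = M - t_cv * s_M   # lower confidence limit
UCL = M + t_cv * s_M   # upper confidence limit

print(f"95% CI on the mean difference: [{LCL:.2f}, {UCL:.2f}]")   # [0.99, 2.01]
```

Since zero falls outside the interval, a dependent-samples t-test on these (made-up) scores would come out significant at alpha = .05.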
The population mean does not change. A proportion of intervals equal to 1 − alpha will include the population mean, and the rest will not.
- 99% confidence intervals are wider than 95% intervals. The more stringent your criterion for calling something significant, the farther your hypothesized population mean has to be from your sample mean to get a significant result. This also makes intuitive sense: if you guess a bigger range, then you have a better chance of being correct.

Class 13 (3/24/15)

- sM tells us how much sample means tend to vary from the population mean. A low sM means that every sample mean comes out relatively close to the true population mean; a high sM means that some sample means can come out relatively far from the population mean.
- We use t scores to measure whether a sample mean is unexpectedly far from a population mean. t scores far from zero mean the sample mean is farther from the population mean than we would usually expect to see. We get t by seeing how far the sample mean is from the population mean and dividing this by how much sample means tend to vary from this value (sM).
- A critical value is a cutoff we set for how unexpected a sample mean has to be for us to say we have evidence against a given population mean.
- Confidence intervals are narrower for larger samples.
- p values should be interpreted like alphas, except that p values use the actual result that you obtained as the standard for significance. So a p value is what alpha would be if the critical values were set exactly at your result: p values are the proportion of scores that fall outside of the critical value area on both sides, added together.

Class 14 (3/26/15)

Class 15 (4/2/15)

- Independent-samples t-test
- Scores in the two samples are independent.
- The null hypothesis is that the two samples come from populations with the same mean; the alternative hypothesis is that the two samples come from populations with different means.
- Even if the two samples come from the same population, the sample means
won't be exactly the same; the two means are randomly selected samples.
- We need to know how big the difference in the sample means has to be to count as good evidence that the population means are different.
- To figure out if we have evidence that the population means are different, we need to know what the distribution of differences between means looks like if the population means are actually the same: (1) take two random samples from the same population, (2) get the mean of each sample, (3) subtract one mean from the other; repeat 1-3 a huge number of times and make a distribution with all the differences that you get.
- Central tendency: mean 1 will be higher than mean 2 on a random half of the samples and vice versa on the other half, so the mean of the distribution of differences between means (μDIFF) is zero. μDIFF is a population mean: it's what the mean would be if you had the difference between means for every possible pair of samples from the population.
- Shape: will be normal if the samples are taken from a normal distribution of scores. If the sizes for both samples are above 30, then the distribution of differences between means will be very close to normal regardless of the shape of the distribution of scores.
- Variability: here you have to find the population standard deviation of the distribution of differences between means (σDIFF):
  σ²DIFF = σ²/N1 + σ²/N2
  where σ² = population variance of the scores, N1 = sample size for sample 1, and N2 = sample size for sample 2. We use variance for this equation because in this case it is easier. The more variable the scores are, the more variable the distribution of differences between means.
- For all the problems that we do, we won't actually know the population variability for the scores; we will have to estimate it using the scores in our two samples. You will be given a variability estimate from both samples that was calculated with the formulas that you learned in section 2. The estimate from each sample has N − 1 degrees of freedom:
  df1 = N1 − 1, df2 = N2 − 1
  where df1/df2 = degrees of freedom for sample 1 or 2, and N1/N2 = sample size for sample 1 or 2.
- If the null
hypothesis is true, then both estimates are estimating the same population variance, so we combine ("pool") them to get one better estimate. The combined estimate reflects the degrees of freedom in both estimates:
  dfTOT = df1 + df2
  where dfTOT = total degrees of freedom across both samples.
- To combine the two variance estimates, we take their weighted average:
  s²POOLED = (df1/dfTOT) × s²1 + (df2/dfTOT) × s²2
  where s²POOLED = pooled variance estimate, df1/df2 = degrees of freedom for sample 1 or 2, dfTOT = total degrees of freedom, and s²1/s²2 = variance estimate for sample 1 or 2.
- If there are an equal number of scores in both samples (N1 = N2), then the pooled estimate is just a simple average of the two individual estimates. With unequal N, the larger sample has more influence on the overall estimate; we want the pooled estimate to come out closer to the larger sample because bigger sample sizes give better estimates.
- After you find s²POOLED, you can use it to estimate the population standard deviation of the distribution of differences between means:
  sDIFF = √(s²POOLED/N1 + s²POOLED/N2)
  where sDIFF = estimated population standard deviation of the distribution of differences between means, s²POOLED = estimated population variance of the scores, N1 = sample size for sample 1, and N2 = sample size for sample 2.
- Now we are ready to get a measure of how unexpected the difference between the means of our two samples would be if the null hypothesis were true. Because we had to use variance estimates, this will be a t score:
  t = (M1 − M2 − μDIFF) / sDIFF
  where M1 = mean of sample 1, M2 = mean of sample 2, μDIFF = population mean of the distribution of differences between means if the null hypothesis is true (this will be zero for all the problems you will do), and sDIFF = estimated standard deviation of the distribution of differences between means, assuming that the null hypothesis is true.
- The t value tells us how much the difference between means that we observed (M1 − M2) deviates from the difference that we would expect if the null were true (μDIFF), relative to how much sample differences tend to vary from the expected difference if the null is true (sDIFF).
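The pooled-variance steps above can be sketched end to end. The sample summaries (means, variance estimates, sizes) are made-up numbers for illustration:

```python
# Independent-samples t-test from summary statistics, following the
# pooled-variance formulas in the notes. The sample summaries are made up.
import math

M1, s1_sq, N1 = 105.0, 220.0, 16   # sample 1: mean, variance estimate, size
M2, s2_sq, N2 = 98.0, 180.0, 25    # sample 2: mean, variance estimate, size

df1, df2 = N1 - 1, N2 - 1
df_tot = df1 + df2                  # total degrees of freedom

# Weighted average of the two variance estimates (weights = df shares).
s_sq_pooled = (df1 / df_tot) * s1_sq + (df2 / df_tot) * s2_sq

# Estimated SD of the distribution of differences between means.
s_diff = math.sqrt(s_sq_pooled / N1 + s_sq_pooled / N2)

mu_diff = 0.0                       # expected difference under the null
t = (M1 - M2 - mu_diff) / s_diff

print(f"t({df_tot}) = {t:.2f}")     # t(39) = 1.56
```

The resulting t is then compared against a t-distribution with dfTOT degrees of freedom, exactly as the notes describe below.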
- A t of 0 means that your sample is something that would be totally expected if the null hypothesis were true, but for all we know it might also be consistent with the alternative; it does not support either hypothesis. A t below zero means the mean of sample 2 was a lot higher than the mean of sample 1; a t above zero means the mean of sample 1 was a lot higher than the mean of sample 2.
- Is our t value unexpected under the null hypothesis? To determine that, we need a comparison distribution for t values. The comparison distribution tells us that t values are common where the distribution is high and rare where it is low. The comparison distribution is a t-distribution with dfTOT degrees of freedom.
- If the alternative hypothesis is directional, you have to figure out whether you are looking for a positive or negative difference between means. In this test you get to decide which condition is considered sample 1 and which is sample 2. For a directional test, make sample 1 the condition that is hypothesized to have a higher mean (which is not necessarily the one with the higher sample mean). If you do this, then you will always use a positive critical value for a directional test.

Class 16 (4/7/15)

- If the problem gives you two sample means (M) and two sample standard deviations (s), then you need to do an independent-samples t-test.
- If the problem only gives you one sample mean (M) and one sample standard deviation (s), then you need to do either a single-sample t-test or a dependent-samples t-test: if the problem says something about difference scores, use a dependent-samples t-test; if not, it's a single-sample t-test.
- If the scores are unrelated in any way, then you use an independent-samples t-test; if there is any potential link between the scores across the two samples, then you use a dependent-samples t-test.
- Which is better, a within- or a between-subjects experimental design? If the alternative hypothesis is true, you are more
likely to find a significant result in a within-subjects design than a between-subjects design. This is because the within-subjects design removes subject-level variability: the comparison distribution is less variable, so it is easier to tell that a score away from the mean probably wouldn't have been sampled from it. If the null hypothesis is true, you are equally likely to find a significant result in a within-subjects design and a between-subjects design.
- Section 12
- Banning NHSTs: the practice of using null hypothesis significance testing in science is as controversial as it is ubiquitous. NHSTs are a tool that works well for one very limited task: checking to see if evidence for an alternative hypothesis is specific enough to meet some preset standard of evidence. There are many questions that NHSTs just cannot address.

Class 17 (4/9/15)

- Bayesian Statistics
- The goal of Bayesian statistics is to estimate the probability that a claim or hypothesis is true. The Bayesian approach provides a method for updating the probability that a hypothesis is true when new observations are considered.
- If someone makes a claim and gives you some evidence to support it, what do you need to determine if they are probably right? Three things:
  1. Plausibility of the claim: the probability that the claim/hypothesis would be true given what you know before you get the new evidence.
  2. Sensitivity of the evidence: the probability that the new evidence would be observed if the claim were true.
  3. Specificity of the evidence: the probability that the new evidence would be observed if the claim were false.
- NHSTs only make sure that evidence is specific; that is why they cannot tell you the probability that a hypothesis is true or false. You have to know two other things that a hypothesis test doesn't tell you. Bayesian statistics attempts to consider all of the factors relevant to the truth of a hypothesis.
- Claims are more likely to be true if they are more plausible, supported by more sensitive evidence, and supported by more
specific evidence. These factors can override each other: for example, very strong evidence can show that even implausible claims are almost certainly true. There is an equation for all of this, Bayes' theorem:
  p(H|D) = [p(H) × p(D|H)] / [p(H) × p(D|H) + p(~H) × p(D|~H)]
  where H = hypothesis, D = data, p() = probability, | = "given", and ~ negates something or indicates that it is false.
- p(H|D): the probability that the hypothesis is true given the observed data. (In the example used in class, H = the selected unit contains something valuable, in other words it is a winner, and D = the observation that the unit is climate controlled.)
- Prior probability, p(H): the overall probability that the hypothesis is true given what you know already, before new evidence is offered for or against the hypothesis.
- Likelihood, p(D|H): the probability that the data would be observed given that the hypothesis is true.
- Numerator: the overall chance that the hypothesis is true AND you will observe the data.
- Denominator: the probability of the data, i.e., the overall probability of observing the data regardless of whether or not the hypothesis is true. The second term, p(~H) × p(D|~H), is the probability that the hypothesis is false and you will observe the data; it's exactly the same as the first term, but H (hypothesis) is replaced with ~H (not hypothesis).
- The whole equation is saying: out of all the times you observe evidence like this, how many times is the hypothesis true?
- The prior probability that the hypothesis is false will always be 1 minus the prior probability that the hypothesis is true: p(~H) = 1 − p(H). Problems will only give you p(H), and you can always figure p(~H) for yourself.
- Plausible means that p(H) is high and p(~H) is low.
- The prior probability is also based on evidence, just evidence that you already knew about when you encountered the claim. If you have no relevant background knowledge, the prior probability is .5: for all you know, the claim is equally likely to be true or false.
- Sensitive evidence means that p(D|H) is high: the evidence offered is something you would expect to observe if the hypothesis was really true.
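Bayes' theorem above translates directly into code. The prior and likelihoods below are made-up numbers for illustration; the likelihood ratio p(D|H)/p(D|~H), which the notes discuss next, is computed as well.

```python
# Bayes' theorem as written above: posterior probability that a
# hypothesis is true given new data. The numbers are made up.
def posterior(p_h: float, p_d_given_h: float, p_d_given_not_h: float) -> float:
    """p(H|D) = p(H)p(D|H) / [p(H)p(D|H) + p(~H)p(D|~H)]."""
    p_not_h = 1.0 - p_h                     # p(~H) = 1 - p(H)
    numerator = p_h * p_d_given_h           # H is true AND D is observed
    denominator = numerator + p_not_h * p_d_given_not_h   # overall p(D)
    return numerator / denominator

# Even priors (.5), sensitive evidence (.9), fairly specific evidence (.2):
p = posterior(0.5, 0.9, 0.2)
print(round(p, 3))   # 0.818

# Likelihood ratio: how strongly the evidence favors the hypothesis.
lr = 0.9 / 0.2       # p(D|H) / p(D|~H) = 4.5
```

Even with an indifferent prior of .5, evidence that is 4.5 times more likely under the hypothesis than under its negation pushes the posterior up to about .82.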
- Specific evidence means that p(D|~H) is low: the evidence offered is something you would NOT expect to observe if the hypothesis was really false.
- What is good evidence? A good way to measure strength of evidence is with a likelihood ratio (LR): the probability of observing the evidence if the hypothesis were true, divided by the probability if the hypothesis were false:
  LR = p(D|H) / p(D|~H)
- A likelihood ratio of 1 means that the evidence is irrelevant to the hypothesis: the evidence is equally likely to be observed regardless of whether the hypothesis is true or false.
- A likelihood ratio above 1 means the evidence favors the hypothesis, with higher values meaning stronger evidence. A value of 2 means the evidence is twice as likely to be observed if the hypothesis is true, 10 means 10 times more likely, etc.
- A likelihood ratio below 1 means the evidence goes against the hypothesis, with lower values meaning stronger evidence against. A value of 1/2 means the evidence is twice as likely to be observed if the hypothesis is false, 1/10 means 10 times more likely, etc.

Class 18 (4/14/15)

- NHSTs only ever consider the probability that data would be observed assuming a particular hypothesis is true; Bayesian statistics define the probability that the hypothesis itself is true given what you know.
- NHSTs define the quality of evidence only in terms of its specificity; Bayesian statistics define the quality of evidence in terms of the relationship between its sensitivity and its specificity. Even very specific evidence can be useless for supporting a claim if sensitivity is low.
- NHSTs funnel conclusions into two categories, significant or nonsignificant; Bayesian statistics quantify the degree of support for a hypothesis on a continuous scale: the probability that a hypothesis is true can be anything from 0 to 100%.

Class 19 (4/16/15)

- At what point do beliefs cross over to being facts? Never.
- If you allow for some degree of uncertainty in your beliefs,
they can change in two ways:
  1. You can become aware of new evidence that you did not know about before. The new data can change p(D|H) and p(D|~H); both of these can change p(H|D).
  2. You can become aware of new alternative explanations for the evidence that you did not know about before. The new explanation can change p(H), p(~H), and p(D|~H); this can change p(H|D).
- Section 13
- Interpreting NHSTs: Bayes' theorem can help us make the best conclusion possible given the limited information available from an NHST.
- The four possible outcomes of a hypothesis test:

                                    Significant          Nonsignificant
  Null hyp. true / alt. hyp. false  Type 1 (alpha) error No conclusion
  Alt. hyp. true / null hyp. false  Detected evidence    Type 2 (beta) error

- Type 1 (alpha) error: claiming to have evidence for the alternative hypothesis when it is false.
- Type 2 (beta) error: failing to find evidence supporting the alternative hypothesis when it is true. This is a missed opportunity but not actually a false conclusion; we don't conclude anything from a nonsignificant result.
- No conclusion: failing to find evidence for the alternative hypothesis when it is false. This is the best we can hope for when the null is true, because NHSTs can't find evidence for the null.
- Detected evidence: claiming to have evidence for the alternative hypothesis when it is actually true.
- We want to figure out the probability that each hypothesis is true (null and alternative).
- What three things do we need to know to do that?
  1. The prior probability that the null versus the alternative hypothesis is true.
  2. The probability of the test outcome if the null hypothesis is true.
  3. The probability of the test outcome if the alternative hypothesis is true.
- Power: the probability of observing a significant result if the alternative hypothesis is true. To define power, we must define the results we expect if the alternative hypothesis is true.
- Effect size
- Cohen's d: a standardized measure of effect size that can be used for many different variables. In the context of an independent-samples t-test, it is the distance between the means of sample 1 and sample 2, divided by the pooled standard deviation of the scores in each sample.
- There are rough standards for small, medium, and large effect sizes in terms of Cohen's d: small = .2, medium = .5, large = .8.
- Factors affecting power:
  - Effect size: power is higher for variables that produce larger effect sizes.
  - Sample size: power is higher for larger sample sizes.
  - Experimental design: power is higher for within-subjects designs than between-subjects designs.
  - 1-tailed tests are more powerful than 2-tailed tests, but only if the effect goes in the direction you think it goes.
  - Alpha: power is lower for lower alpha values.
- A reasonable middle-of-the-road power estimate for psychology is .6. This is the approximate power to detect a medium effect in a between-subjects design with N = 40 per group and alpha = .05, 2-tailed. Decrease your power estimate for smaller sample sizes and increase it for larger; decrease it for experiments trying to detect smaller effects and increase it for larger effects; decrease it for lower alpha values; increase it for within-subjects designs.
- After we define power, we can specify the probability of every possible outcome for a significance test:

                                    Significant   Nonsignificant
  Null hyp. true / alt. hyp. false  Alpha         1 − Alpha
  Alt. hyp. true / null hyp. false  Power         1 − Power

- Prior probability: if you can't think of any good reasons why one hypothesis is more plausible than another, then you can start with even priors: p(Alt) = p(Null) = .5.
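One way to put the prior probability, power, and alpha together is to treat "significant result" as the data D in Bayes' theorem, using the outcome probabilities from the table above. This sketch anticipates how the notes combine the three pieces; the specific numbers are illustrative assumptions.

```python
# Probability that the alternative hypothesis is true given a
# significant result, combining prior probability, power, and alpha.
def p_alt_given_significant(prior_alt: float, power: float, alpha: float) -> float:
    """p(Alt|Sig) = p(Alt)p(Sig|Alt) / [p(Alt)p(Sig|Alt) + p(Null)p(Sig|Null)]."""
    prior_null = 1.0 - prior_alt
    return (prior_alt * power) / (prior_alt * power + prior_null * alpha)

# Even priors, the middle-of-the-road power estimate of .6, alpha = .05:
p_even = p_alt_given_significant(0.5, 0.6, 0.05)
print(round(p_even, 3))        # 0.923

# A surprising claim (low prior) is less believable even when significant:
p_surprising = p_alt_given_significant(0.1, 0.6, 0.05)
print(round(p_surprising, 3))  # 0.571
```

Note how the same significant result supports the alternative much less strongly when the prior is low, which is exactly the point the notes make about surprising results.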
- Researchers are often motivated to find surprising results. In this case it is reasonable to set the prior probability that the alternative hypothesis is true below .5: p(Alt) < .5 < p(Null). The more surprising the result, the further below .5 p(Alt) should go.

Class 20 (4/21/15)

- Bayes' theorem is just dividing (the chance the hypothesis is true and the chance you get the data) by (the total chance you get the data).
- We want to figure out the probability that each hypothesis is true (null and alternative): how do we put alpha, power, and prior probability together?

Class 21 (4/28/15)

- What affects conclusions? Lowering alpha increases the specificity of evidence for the alternative hypothesis (p(Sig|Null) decreases), but it also decreases the sensitivity (p(Sig|Alt) decreases). The change in specificity is the more important one; that is why significant results give better support for the alternative hypothesis with lower alpha values.
- Chi-squared tests
- A chi-squared test is a type of NHST. It is used to try to find evidence that two qualitative variables are related; specifically, it is used to find evidence that the proportions of scores in the response categories of one variable are different based on the value of the other variable.
- The standard way to display a relationship between two qualitative variables is with a contingency table. A contingency table shows the number of scores at each combination of the levels of multiple variables.
- For example, imagine that we sample 3000 people before flu season and randomly assign them to either get a flu vaccine, start taking vitamins, or do nothing (control). We track them over the flu season and score whether or not each person got the flu:

                   Control   Vaccine   Vitamin
  Didn't get flu
  Got flu

- If there is a relationship between the two variables at the population level, then the proportions of people who do and do not get the flu will be different for the control, vaccine, and vitamin groups, for example:
                   Control   Vaccine   Vitamin
  Didn't get flu     .90       .95       .90
  Got flu            .10       .05       .10

- If there is no relationship between the two variables at the population level, then the proportions of people who do and do not get the flu will be the same for the control, vaccine, and vitamin groups:

                   Control   Vaccine   Vitamin
  Didn't get flu     .90       .90       .90
  Got flu            .10       .10       .10

- Null hypothesis: the two variables are independent at the population level. Alternative hypothesis: the two variables are related at the population level.
- Of course, we cannot just look at our sample and assume that it perfectly reflects the population; random sampling is going to move the sample numbers around some. We might see something like this with our sample of 3000 people. Is this evidence that the two variables (group and outcome) are related?

  Observed         Control   Vaccine   Vitamin
  Didn't get flu     900       970       892
  Got flu            100        30       108

- To answer this, we define the results we would expect to see if the variables were independent, and we measure how much the actual results deviate from this expectation. The expected frequencies are generated by getting as close as we can to the observed frequencies without allowing for any relationship between the variables:

  Expected         Control   Vaccine   Vitamin
  Didn't get flu    920.67    920.67    920.67
  Got flu            79.33     79.33     79.33

- We measure how much the observed and expected frequencies deviate from one another with a statistic called chi-squared:
  χ² = Σ (O − E)² / E
  where O = observed frequency and E = expected frequency.
- If we take lots of random samples from a population with independent variables and compute χ² for each one, we get a χ² distribution.
- If our sample χ² is atypically high relative to the distribution we expect for variables with no relationship, then we get a significant result: our sample provides evidence that the variables are related in the population.
- If our sample χ² is fairly typical
for the distribution we expect for variables with no relationship, then we get a nonsignificant result: our sample failed to provide evidence that the variables are related in the population.
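The chi-squared computation for the flu example can be sketched as follows. The observed counts come from the notes; the expected counts are built from the row and column totals, which is exactly the "no relationship allowed" construction described above.

```python
# Chi-squared statistic for the flu contingency table in the notes.
observed = {
    "control": {"no_flu": 900, "flu": 100},
    "vaccine": {"no_flu": 970, "flu": 30},
    "vitamin": {"no_flu": 892, "flu": 108},
}

n_total = sum(sum(group.values()) for group in observed.values())   # 3000

# Row totals across groups (how many people did / did not get the flu).
row_totals = {"no_flu": 0, "flu": 0}
for group in observed.values():
    for outcome, count in group.items():
        row_totals[outcome] += count

chi_sq = 0.0
for group in observed.values():
    group_total = sum(group.values())      # 1000 people per group here
    for outcome, o in group.items():
        # Expected count under independence: row total * group's share of N.
        e = row_totals[outcome] * group_total / n_total
        chi_sq += (o - e) ** 2 / e         # sum of (O - E)^2 / E

print(round(chi_sq, 2))   # 50.42
```

The expected counts reproduce the 920.67 / 79.33 values from the notes, and the resulting χ² would then be compared against a χ² distribution to decide significance.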