This 10-page study guide was uploaded by Amanda Huang on Friday, October 9, 2015. The study guide belongs to 31 at Tufts University, taught by Lara Sloboda in Fall 2014. Since its upload, it has received 21 views. For similar materials, see Introductory Statistics for the Behavioral Sciences in Psychology at Tufts University.
Date Created: 10/09/15
Provides                                          Nominal  Ordinal  Interval  Ratio
Counts (aka frequency of distribution)            Yes      Yes      Yes       Yes
Mode                                              Yes      Yes      Yes       Yes
Median                                            No       Yes      Yes       Yes
The order of values is known                      No       Yes      Yes       Yes
Can quantify the difference between each value    No       No       Yes       Yes
Can add or subtract values                        No       No       Yes       Yes
Can multiply and divide values                    No       No       No        Yes
Has a true zero                                   No       No       No        Yes

Scales from strongest to weakest:
- Ratio: has an absolute zero
- Interval: distance is meaningful
- Ordinal: attributes can be ordered
- Nominal: attributes are only named (weakest)

Introduction to Statistics
- Statistics: techniques used to summarize data in order to answer questions
- Variables: characteristics that are measured
- Population: a large group of cases a researcher is interested in studying
  - HOWEVER, it is difficult to study everyone in this category, so researchers study a subset of the population: a sample
  - A value that characterizes a sample is a statistic (M); a value that characterizes a population is a parameter (μ)
- Descriptive statistic: a statement about a set of cases that reduces the data to some meaningful value
- Inferential statistic: uses a sample of cases to draw a conclusion about the larger population

Frequency distribution: a count of how often the values of a variable occur in a set of data
- Ungrouped: how often each value of a variable occurs in a data set
- Grouped: counts for adjacent groups of values (intervals) of the variable; used when there is a large number of values

Discrete (no in-between values): BAR GRAPH, vs. Continuous (could have fractional values): HISTOGRAM or FREQUENCY POLYGON

EXPERIMENTAL DESIGNS
- Tables
  - Collate the data collected during research
  - Present descriptive and inferential statistics
  - Stand alone from the text; text should accompany the table
  - Steps to making tables
    - Summarize the data
    - Present exact values
    - Means, standard deviations, probabilities
    - Inferential statistics: t test, ANOVA results
- Graphs: an effective method of analyzing and representing scientific data
  - Describe the relationship of 2 variables (a histogram shows only one)
  - Types of graph
    - Scatter plots: relationship between two scale variables
      - Organize your data so that each participant has one score for each variable
      - Identify the IV and DV
      - Label the abscissa (x-axis) with the IV
      - Label the ordinate (y-axis) with the DV
      - Relationships could be linear or nonlinear
    - Line graphs
      - Line of best fit in a scatterplot: shows linear and nonlinear trends
      - Time series (time plot): one of the variables is time; the dots connect in a line, NOT a best-fit line
    - Bar graphs: best when one variable (independent, on the x-axis) is nominal and the other (dependent, on the y-axis) is scale
    - Histogram: used when you are given data for only one variable
      - Put all the data in order
      - Make a table to tabulate the frequencies
      - Put the data into a bar graph (HISTOGRAM) or a frequency polygon
    - Frequency polygon: frequency is on the y-axis, values are on the x-axis
      - Add an interval to either side of the data so the polygon flows
      - Use the midpoints
    - Pictorial graphs: not the best to use
    - Pie charts: a graph in the shape of a circle with a slice for each category; each slice represents a proportion
    - Box-and-whisker charts: 5-number summaries (min, lower quartile, median, upper quartile, max) as well as outliers
      - Show differences in populations
      - Denote dispersion and skewness
      - [Figure: box-and-whisker plot labeling the lower quartile, median, upper quartile, upper extreme, and whiskers; outliers lie more than 1.5 × IQR (interquartile range) beyond the box]
      - To make one:
        - Find the median of the data set (Q2)
        - Find the lower and upper quartiles (Q1 & Q3): this is the box
        - Find the min and max scores: these are the whiskers
    - Pareto chart: a bar graph ordered by y-axis value; the largest y-axis value is on the left, the smallest on the right
      - A few vital factors cause most of the problems; helps you see the 80/20 rule
      - 80/20 rule in relationships (social circle of friends): estimate the amount of time and energy you invest and compare it to the amount of stress and satisfaction
  - Scatter plot or line graph?
    - Do you want relationships or predictors?
    - Scatter plots show relationships between data points; a line could be helpful to show the relationship
    - Line graphs show a predictor line (line of best fit) that can predict y from x, or the DV from the IV
  - GUIDELINES
    - Title
    - Label axes
    - No abbreviations
    - Meaningful terms
    - Units of measurement
    - Axes go to zero
    - Shades of gray
    - No chart junk (distracting appearance of vibration and movement, grid lines)

Central Tendency: the descriptive statistic or parameter that best represents the center of a data set; the particular value that all the other data seem to be gathering around
- Shape of a distribution: modality (peaks), skewness (how symmetric), kurtosis (how peaked or flat)
- 3 measures: mean, median, mode
  - Mean: arithmetic average of a group of scores
    - The problem with means is that they are highly affected by outliers
    - Deviation score: X − M
  - Median: the actual middle of a group of scores; splits the data in 2
    - If there are large outliers, use the median
  - Mode: the most common score

Measurement scale     Best measure of middle
Nominal               Mode
Ordinal               Median (maybe mode)
Scale, symmetrical    Mean
Scale, skewed         Median (or mode)

[Figure: positively skewed distribution]

Variability: a numerical way of describing how much spread there is in a distribution (range, IQR, variance, SD)
- Range: the simplest measure of variability to calculate
  - Limits of the range: it includes the outliers, and patterns within the range are unknown
- Interquartile range (midspread; the middle 50%)
  - IQR = Q3 − Q1; removes the effect of outliers
- Variance: the mean squared deviation score
  - Average of the squared deviations from the mean
  - When something varies, it must deviate from some standard, usually the mean
  - Depending on its size, you can get a sense of the dispersion from the mean

Number      Use         Mean   Variance           SD
Statistic   Sample      M      SD², s², or MS     SD or s
Parameter   Population  μ      σ²                 σ

Variance of a whole population:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

Variance of a sample (with the N − 1 correction):
$$s^2 = \frac{\sum (X - M)^2}{N - 1}$$

- The larger the sample gets, the less difference the correction makes
- SD: the square root of the variance
  - The typical amount that each score varies from the mean
  - OFTEN more meaningful than the variance, as the variance can be really large
  - The larger the standard deviation, the less clustered the scores are around the mean
- Understanding SDs: how tightly are the scores clustered around the mean?
  - Within 1 SD of the mean: about 68% of the data
  - Within 2 SDs: about 95%
  - Within 3 SDs: about 99.7%
  - In a normal distribution, the mean, median, and mode are the same value
- Population data have more variability than sample data, so the sample formula includes the N − 1 correction, which moves the estimate closer to the population value

Z SCORES: Standardizing scores within a distribution

Characteristics of the normal probability distribution
- The distribution is symmetric and is often illustrated as a bell-shaped curve
- Two parameters, μ (mean) and σ (standard deviation), determine the location and shape of the distribution
  - The highest point on the normal curve is at the mean, which is also the median and mode
- The standard deviation determines the width of the curve: larger values result in wider, flatter curves
- The total area under the curve is 1: 0.5 to the left of the mean and 0.5 to the right
- Probabilities for the normal random variable are given by areas under the curve
  - The curve goes from negative infinity to positive infinity
- The probability that X is greater than a equals the area under the normal curve bounded by a and plus infinity
- Examples of normal distributions: height and weight, annual rainfall in inches, test scores

Standardization and the standard normal
- Standardization: converts individual scores to standard scores for which we know the mean, SD, and percentiles
- Standard normal: a normal distribution with a mean of 0 and an SD of 1
  - Use the standard normal to calculate z-scores: the number of SDs a score is away from the mean
- Take a normal curve and any random variable that could be any value along the x-axis
  - Score below the mean: negative z-score
  - Score above the mean: positive z-score
- Z-table: a table with the cumulative probability for each z-score

Z-scores allow, through standardization, scores from different distributions to be compared. Standardization is the process of changing scores from any normal distribution to the standard normal distribution: from any mean and SD to a mean of 0 and an SD of 1.
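The standardization described above can be sketched in a few lines of Python. This is only an illustration: the exam scores, means, and SDs are made-up values, the function names are hypothetical, and the formula z = (X − M) / SD is the standard definition rather than something spelled out in these notes.

```python
import math


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def normal_cdf(z):
    """Cumulative probability below z under the standard normal curve (what a z-table lists)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return normal_cdf(k) - normal_cdf(-k)


# Hypothetical scores from two exams with different means and SDs.
z_exam_a = z_score(82, mean=70, sd=8)
z_exam_b = z_score(650, mean=500, sd=100)

# Both scores sit the same number of SDs above their own mean, so they are comparable.
print(z_exam_a, z_exam_b)

# Percentile rank of the first score.
print(round(normal_cdf(z_exam_a) * 100, 1))

# Areas within 1, 2, and 3 SDs of the mean.
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The two hypothetical scores look very different on their raw scales but standardize to the same z-score, which is the point of converting to a common scale before comparing.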
Z-scores give us the probabilities associated with ranges of scores in a distribution, give us percentile rankings of scores within a distribution, and allow us to easily compare two scores from two different distributions that otherwise may not be comparable.

Why use z-scores? Z-scores give the area under the curve and a means of finding the probability for any range of values of a random variable.

[Figure: normal curve marking the areas within 1, 2, and 3 standard deviations of the mean]

Inferential statistics: use sample data to make general assumptions about the population
- Make decisions based on probability
- Process of inferential statistics: (1) create a hypothesis, (2) conduct research, (3) analyze the data, (4) make a decision
- Experimental design
  - Control group: the level of the IV that does not receive the treatment of interest
  - Experimental group: receives the treatment or variable of interest
- Stating hypotheses
  - Null hypothesis (H0): that which you are not trying to find; no difference between groups
  - Alternative hypothesis (H1 or Ha): there is a difference (in a specific direction or not)
- Reject or fail to reject
  - REJECT: "I reject the idea that there is no mean difference between groups"
    - Must find evidence to reject H0 beyond the burden of proof
    - Do not accept H1
  - FAIL TO REJECT: "I fail to reject the idea that there is no mean difference between groups"
- Type I and Type II errors
  - Type I: rejecting the null hypothesis when in fact the null is true (alpha error, usually 5%)
  - Type II: failing to reject the null hypothesis when the null hypothesis is false (beta error, 20%)

                                                        Defendant innocent   Defendant guilty
Reject presumption of innocence (guilty)                Type I error         Correct
Fail to reject presumption of innocence (not guilty)    Correct              Type II error

CORRELATIONAL DESIGNS
- Variables are measured (observed) as 2 numerical values as they naturally occur
  - Find relationships between variables BUT NOT cause or effect
- CORRELATION IS NOT CAUSATION
  - Sometimes a third variable (a confounding variable) causes BOTH X and Y
  - Think of 3 possible reasons for a correlation: X causes Y; Y causes X; Q causes both X and Y
- Quantifiable
  - Correlation coefficient: a statistic that quantifies a relation between 2 scale variables
    - Can be + or −; the sign shows the direction of the relationship
    - ALWAYS falls between −1.0 and +1.0
    - The magnitude of the coefficient, NOT its sign, indicates how large (strong) the relationship is
    - Both −1.0 and +1.0 are perfect correlations
  - Positive: as X increases, Y increases
    - A relationship between 2 variables such that participants with high scores on one variable tend to have high scores on the other variable, and vice versa
  - Negative: as X increases, Y decreases
    - An association between 2 variables in which participants with high scores on one variable tend to have low scores on the other variable
  - The sign DOES NOT indicate whether or not there is a relationship; it indicates the DIRECTION of the relationship
  - If there is no linear relationship, the correlation would be zero, but this does NOT mean that no relationship exists, just that no linear relationship exists
- Learn how to quantify a relation when the data are linearly related
  - Ex.: a scatter plot of the data is best described by a straight line (this may not always be obvious)

Pearson correlation coefficient: the most popular of several correlation coefficients; summarizes the strength of the linear relationship between two variables
- A statistic that quantifies a linear relation between 2 scale variables
- Used when the overall relationship indicates a straight line
- Can be used as a descriptive statistic: the value of r describes the relationship between 2 variables
- Can be used as an inferential statistic: you can conduct a hypothesis test to see whether r differs from zero (no relationship)

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

r = (degree to which X and Y vary together) / (degree to which X and Y vary separately)

Ex.: Husbands and wives are asked independently to give their ideal number of children in a family (n = 6, ΣX = 12, ΣY = 14, ΣXY = 32, ΣX² = 28, ΣY² = 42).

$$r = \frac{6(32) - (12)(14)}{\sqrt{[6(28) - 144][6(42) - 196]}} = \frac{192 - 168}{\sqrt{(168 - 144)(252 - 196)}} = \frac{24}{\sqrt{(24)(56)}} = \frac{24}{36.66} = 0.65$$

CORRELATION AS AN INFERENTIAL STATISTIC
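As a rough check of this idea, the sketch below recomputes r for the husband-and-wife example from its sums (n = 6, ΣX = 12, ΣY = 14, ΣXY = 32, ΣX² = 28, ΣY² = 42) and compares it with a critical value at df = n − 2 = 4. The function name is hypothetical, and the critical value 0.811 is an assumption taken from a standard table of critical r values (two-tailed, alpha = .05, df = 4); the notes themselves do not state which table or alpha level was used.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Computational formula for Pearson's r using only the column sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Sums from the husband-and-wife example.
n = 6
r = pearson_r_from_sums(n, sum_x=12, sum_y=14, sum_xy=32, sum_x2=28, sum_y2=42)

df = n - 2  # degrees of freedom for a correlation

# Assumed critical value: two-tailed, alpha = .05, df = 4 (standard table).
R_CRITICAL = 0.811

is_significant = abs(r) > R_CRITICAL

print(round(r, 2))       # 0.65
print(df)                # 4
print(is_significant)    # False: observed r does not exceed the critical r
print(round(r ** 2, 2))  # 0.43, the proportion of shared variation
```

With so few pairs, even r = 0.65 falls short of the critical value, which matches the conclusion in the notes that the relationship is not significant.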
- df (degrees of freedom): the number of scores that are free to vary; for a correlation, 2 scores make up each data point
  - Pearson r with df = n − 2 = 4, where n is the number of data points (pairs of x and y)
  - Is your correlation value rare enough that it exceeds, in either direction, some critical value?
    - Small data sets need a larger correlation value to be more than chance alone
  - Our observed r does not exceed the critical r seen in the table; therefore it is not a rare or significant relationship
- When there are very few scores, an outlier can drastically affect the correlation
- Weak correlation: much less of the variation in one variable is accounted for by the relationship between the 2 variables
- Strong correlation: much more of the variation in one variable is accounted for by the relationship between the 2 variables
- r² corresponds to the % of variation in one variable that is associated with (predicted from, explained by) the relationship between the 2 variables; NOT CAUSAL
  - Ex.: the correlation between height and weight is 0.75, so r² = 0.56
    - Interpretation: 56% of the variation in height is explained by or predicted from the relationship between height and weight
    - The other 44% is not associated with weight; it is due to some other factors
  - Ex.: the correlation between performance on 2 exams is 0, so r² = 0
    - Interpretation: no variation in performance on the 2nd exam is predicted on the basis of performance on the 1st exam
    - Performance on the 2 exams is totally unrelated

TYPES OF REGRESSION MODELS
- 1 explanatory variable: simple (linear and nonlinear)
- 2 or more explanatory variables: multiple (linear and nonlinear)
- Prediction vs. relation: with data we can look at relationships in scatterplots
  - Regression equation: yields the best-fit line to the data, which is called the least-squares prediction line or the least-squares regression line
    - Can now make precise predictions about the value of one variable (y) from the value of the other variable (x)
    - ALWAYS predicting y from x, since you are minimizing the variation along the y-axis
    - Positive correlation: scores above the mean (positive z-scores) on one factor predict scores above the mean on the other factor
    - Negative correlation: scores above the mean (positive z-scores) on one factor predict scores below the mean (negative z-scores) on the other factor

$$Y' = a + bX$$

Can be used for finding predicted values of Y for exact values of X, where
- Y' is the predicted value
- b = slope of the regression line: $b = r \left( \frac{s_Y}{s_X} \right)$
- a = y-intercept (the value of Y' when X = 0): $a = M_Y - b M_X$
- s = standard deviation

Only use regression when an examination of your scatterplot suggests it is OK to proceed.

Ex.: 7 families; X = number of drivers, Y = number of cars (table columns: Family, X, X², Y, Y², XY)
- Sums: ΣX = 20, ΣX² = 72, ΣY = 16, ΣY² = 42, ΣXY = 54
- M_X = 2.857, s_X = 1.574; M_Y = 2.286, s_Y = 0.951

$$r = \frac{7(54) - (20)(16)}{\sqrt{[7(72) - 400][7(42) - 256]}} = \frac{378 - 320}{\sqrt{(104)(38)}} = \frac{58}{(10.198)(6.164)} = 0.92$$

$$b = r \left( \frac{s_Y}{s_X} \right) = 0.92 \left( \frac{0.951}{1.574} \right) = 0.56$$

$$a = M_Y - b M_X = 2.286 - 0.56(2.857) = 0.69$$

$$Y' = a + bX = 0.69 + 0.56X$$
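The drivers-and-cars regression can be reproduced from its column sums alone. The sketch below is a minimal check, not the course's own method of calculation: the helper names are hypothetical, the SDs are computed as sample SDs (n − 1 in the denominator), and the 3-driver family used for the prediction is an invented input.

```python
import math


def mean_and_sample_sd(n, total, total_sq):
    """Mean and sample SD (n - 1 denominator) from a sum and a sum of squares."""
    mean = total / n
    variance = (total_sq - total ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)


# Column sums from the 7-family example (X = drivers, Y = cars).
n = 7
sum_x, sum_x2 = 20, 72
sum_y, sum_y2 = 16, 42
sum_xy = 54

mean_x, sd_x = mean_and_sample_sd(n, sum_x, sum_x2)
mean_y, sd_y = mean_and_sample_sd(n, sum_y, sum_y2)

# Pearson r from the computational formula.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

slope = r * sd_y / sd_x               # b = r * (s_Y / s_X)
intercept = mean_y - slope * mean_x   # a = M_Y - b * M_X


def predict_cars(drivers):
    """Predicted number of cars, Y' = a + bX, for a given number of drivers."""
    return intercept + slope * drivers


print(round(mean_x, 3), round(sd_x, 3))    # 2.857 1.574
print(round(mean_y, 3), round(sd_y, 3))    # 2.286 0.951
print(round(r, 2))                         # 0.92
print(round(slope, 2), round(intercept, 2))  # 0.56 0.69

# Hypothetical family with 3 drivers.
print(round(predict_cars(3), 2))
```

Carrying full precision through the calculation gives the same rounded slope and intercept as the hand calculation, so the prediction line Y' = 0.69 + 0.56X holds up.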