Statistics for the Behavioral Sciences Week 2 Lecture Notes (2/2 and 2/4)
These 6 pages of class notes were uploaded by Julia_K on Saturday, February 6, 2016. The notes belong to PSYCH-UA 10 - 001 at New York University, taught by Elizabeth A. Bauer in Spring 2016.
Date Created: 02/06/16
Course: Statistics for the Behavioral Sciences
Professor Elizabeth Bauer
Lectures Three (2/2) and Four (2/4)

Lecture 3: Measures of Central Tendency and Variability
Feb. 2, 2016

Central Tendency: the average for a group; the point in the middle of the distribution.

3 things to know about a distribution:
1. Center: knowing only the center is not enough, because it shows the average but not the variability among the scores.
2. Spread
3. Shape

Center (Measures of Central Tendency)
Looks at the mean, median, and mode of a group.
1. Mean
   Formula for population mean: $\mu = \frac{\sum X_i}{N}$
   Formula for sample mean: $\bar{X} = \frac{\sum X_i}{n}$
   Formula for weighted mean: $\bar{X}_w = \frac{\sum n_j \bar{X}_j}{\sum n_j}$ (use this when combining groups that don't have equal sample sizes; like any mean, it is affected by outliers).
2. Mode
   a. Not very reliable.
   b. Helps to distinguish multimodal from unimodal distributions.
   c. Multimodal: more than 1 mode (can be due to overlooked factors among variables).
   d. Can be used with any measurement scale.
3. Median
   a. The middle score.
   b. The 50th percentile (2nd quartile of a distribution).
   c. Can be used with undeterminable scores and open-ended categories (ex: kids ages 10-20).
   d. The median CAN be found for ordinal data if the values are placed in some meaningful order (smallest to largest, etc.).
   e. Resistant to outliers [depends only on the middle one or two scores].

Measures of Dispersion from the Central Tendency
1. Range: [highest score - lowest score]
   a. NOT useful in higher mathematics.
   b. Affected by outliers and movements in the distribution.
2. SIQR (semi-interquartile range)
   a. Based on percentiles (used in conjunction with the median; Q2 = median).
   b. SIQR is half of the interquartile range: IQR = Q3 - Q1, so SIQR = (Q3 - Q1)/2.
   c. Shows the distance of a typical score from the median (but doesn't account for all the scores in the data). It averages two distances from the median: the distance down to Q1 and the distance up to Q3.
   d. Resistant to outliers (scores could be dispersed farther away without affecting the SIQR. This isn't good because those scores would be left unaccounted for).
3. The Mean Deviation
   a. The signed deviations from the mean always add up to 0 (ex: if you have 3 scores, their 3 deviations sum to 0), so they can't be averaged directly.
   b. The average of the absolute values of the deviation scores. (Takes the distance of every score from the mean and averages those distances.) In simpler terms: the average of how far the scores are from the mean.
   c. Unlike the SIQR, it takes into account all the scores, not just those near the center.
   d. Formula (has absolute value b/c we're measuring distance): $MD = \frac{\sum |X_i - \mu|}{N}$
4. The Variance
   a. Sum of Squares (SS): squaring the deviations from the mean, and then adding these squares together. $SS = \sum (X_i - \mu)^2$
   b. Dividing the SS by N (total # of scores) yields the population variance ($\sigma^2$). It is also known as the Mean Square, or MS, because it's the mean of the squared deviations from the mean. $\sigma^2 = MS = \frac{SS}{N} = \frac{\sum (X_i - \mu)^2}{N}$

The Standard Deviation:
   i. How spread out the data is.
   ii. Taking the square ROOT of the variance gives you the population standard deviation, symbolized by $\sigma$: $\sigma = \sqrt{\frac{SS}{N}}$
   iii. Not a good choice when the data contain a few extreme scores (in that case use the Mean Deviation or SIQR).
   iv. Squaring adds more weight to large deviations, so extreme scores have a huge effect on the variance.
   v. Used in advanced statistical procedures.

The Variance of a Sample:
   i. $s^2$ = sample variance.
   ii. When you're given a sample of scores from a larger population, and you want to use that sample to estimate the variance of the entire population, use this formula: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$ (Use "n - 1" so you don't underestimate the population variance. If infinitely many sample variances were calculated this way, their average would equal the population variance. This makes it an unbiased sample variance.)
   iii. When you're finding the standard deviation of a sample, take the square root of the unbiased sample variance: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$. This formula is therefore called the Unbiased Sample Standard Deviation.
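The measures above can be checked by hand on a small data set. The following is a minimal sketch in Python, not part of the lecture: the scores are made up for illustration, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is only one of several quartile conventions (a textbook may define Q1 and Q3 slightly differently).

```python
import statistics

# Hypothetical scores, invented for illustration only
scores = [2, 4, 4, 5, 7, 9, 11]
N = len(scores)

# Measures of central tendency
mean = sum(scores) / N                # 6.0
median = statistics.median(scores)    # middle score
mode = statistics.mode(scores)        # most frequent score

# Range: highest score minus lowest score
value_range = max(scores) - min(scores)

# SIQR: half the distance between Q1 and Q3
# (assumption: "inclusive" quartile convention)
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
siqr = (q3 - q1) / 2

# Deviations from the mean; the signed deviations sum to 0
deviations = [x - mean for x in scores]

# Mean deviation: average of the absolute deviations
mean_deviation = sum(abs(d) for d in deviations) / N

# Sum of squares, then variance and standard deviation
ss = sum(d ** 2 for d in deviations)  # 60.0
pop_variance = ss / N                 # SS / N
pop_sd = pop_variance ** 0.5
sample_variance = ss / (N - 1)        # SS / (n - 1) = 10.0
sample_sd = sample_variance ** 0.5

print("mean, median, mode:", mean, median, mode)
print("range, SIQR, mean deviation:", value_range, siqr, mean_deviation)
print("population variance and SD:", pop_variance, pop_sd)
print("sample variance and SD:", sample_variance, sample_sd)
```

Dividing the same SS by n - 1 instead of N always gives the larger value, which is how the sample formula avoids underestimating the population variance.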
This chart lists the definitional formulas used when working between population and sample variances/standard deviations:
   Population variance: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
   Population standard deviation: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
   Sample variance: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
   Sample standard deviation: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
You can see that the variances are in squared units, whereas the standard deviations are their square roots.

Degrees of Freedom
# of scores that are free to vary (n - 1). Shows the pieces of info you have about variability.
Why is it n - 1? Here's an example: let's say the number of scores is 3. Since the deviations of these 3 scores ALWAYS add up to 0, if you know 2 of those 3 deviations, then the 3rd one is determined. But those 2 are free to vary. Therefore, the number of scores that are free to vary is n - 1 (3 - 1 = 2 scores are free to vary).

Skewed Distributions:
If the majority of the scores are grouped on one side of the scale, you call that a skewed distribution. Two types:
1. Positive: the limit is at the bottom of the scale (floor effect), so the scores pile up near the low end and the tail points toward the high scores.
2. Negative: the limit is at the top of the scale (ceiling effect), so the hump of the score group sits near the high end and the tail points toward the low scores.
The median is most reliable when looking at pos/neg skewed distributions because it better represents the majority of the population.

Central Tendency of a Skewed Distribution:
In a symmetrical, unimodal distribution, the mean, median, and mode are equal. When the graph is skewed, the mean is pulled in the direction of the skew (adding/removing extreme scores affects the mean). The median, however, is more robust: it shifts when you add/remove scores, but only slightly, so it's not greatly affected by the skewing. (Also, the median typically stays between the mean and the mode.) Look at the graphs on pg. 71 for more info.

Variability of a Skewed Distribution:
The SIQR is usually best and is used in conjunction with the median (outliers don't affect it).

Box and Whisker Plots:
Visual of a distribution. Created using resistant statistics (not affected by outliers). The hinges are the top and bottom sides of the box, and the horizontal line in the box is the median. The whiskers extend outward from the hinges, and their relative lengths show whether the skew is positive or negative. The inner fences set a limit on how far the whiskers can reach; each whisker ends at the most extreme score inside its fence, which is called an adjacent value.

Dealing With Outliers:
Trimming: delete some scores from the top and bottom of the distribution; the mean you then calculate is called a trimmed mean (can't use it in higher-level stats).
Winsorizing: replace a certain percentage of the extreme scores with the most extreme scores you're willing to accept (scores that won't drastically impact the statistics).
Data transformations: they're fair; can only do this if the goal is to make the distribution symmetric, or when dealing with outliers.

Lecture 4: Standardized Scores and Normal Distribution
Feb. 4, 2016

IQ scores are normally distributed with a mean of 100 and a standard dev. of 15. An IQ of 130 is 2 standard dev. above the mean.

Finding the Z Score:
John has 3 midterm grades (Psychology, Mathematics, Geology). A standard score (z-score) says how many standard deviations above/below the mean your score is. Positive = above and negative = below. It is a measure of location in relation to other people's scores, and it allows you to compare scores from different distributions.
Formula: $z = \frac{X - \mu}{\sigma}$
Psychology: John scored ½ a standard dev. above the mean (z = 0.5).
Mathematics: the z-score is 0 (the mean and the score are the same).
Geology: John scored ¾ of a standard dev. below the mean (z = -0.75).

Using the Z Score to find the Raw Score:
$X = z\sigma + \mu$

Properties of z-scores:
1. The mean of a complete set of z-scores is 0.
2. The standard deviation is 1.
3. Going from raw scores to z-scores does NOT change the shape of the distribution. Why? Because converting to a z-score is a linear transformation (adding, subtracting, multiplying, or dividing by a constant). Doing this to all your scores does NOT change the shape of the distribution.
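As a quick check of the z-score formula and the first two properties, here is a minimal Python sketch. The function names `z_score` and `raw_score` and the list of scores are made up for illustration; only the IQ numbers (mean 100, standard dev. 15, score 130) come from the notes.

```python
import statistics

def z_score(x, mu, sigma):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Invert the z-score formula to recover the raw score."""
    return z * sigma + mu

# IQ example from the notes: mean 100, standard dev. 15
iq_z = z_score(130, 100, 15)        # 2.0
iq_back = raw_score(iq_z, 100, 15)  # 130.0

# Hypothetical complete set of scores, treated as a whole population
scores = [2, 4, 4, 5, 7, 9, 11]
mu = statistics.fmean(scores)
sigma = statistics.pstdev(scores)   # population SD (divides SS by N)

# Convert every score to a z-score
zs = [z_score(x, mu, sigma) for x in scores]

# Properties 1 and 2: mean 0 and SD 1, up to floating-point rounding
z_mean = statistics.fmean(zs)
z_sd = statistics.pstdev(zs)

print("IQ 130 as a z-score:", iq_z)
print("mean and SD of the z-scores:", round(z_mean, 10), round(z_sd, 10))
```

The population standard deviation is used here because the properties are stated for a complete set of scores; with the n - 1 formula the z-scores would have an SD slightly below 1.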
Finding the T Score (given the z-score or raw score, but wanting to express it in a new distribution with a chosen mean/standard dev.):
$T = z \cdot (\text{new standard dev.}) + (\text{new mean})$

Normal Distribution:
All normal distributions have a basic bell shape and are symmetric.
They can have different means or standard deviations, but since they have the same shape, a z-score will fall in the same relative location in different distributions.
From z-scores we can visualize probabilities on the graph: we can specify the proportion of the distribution that is above/below a certain score. The area under the curve is the probability of a particular event.
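The rescaling formula and the idea of area as probability can be sketched in Python with the standard library's `statistics.NormalDist`. This is only an illustration under stated assumptions: the helper name `rescale` is made up, and the new mean of 50 and new standard dev. of 10 are a common T-score convention that the notes do not state. The IQ distribution (mean 100, standard dev. 15) is from the notes.

```python
from statistics import NormalDist

def rescale(z, new_mean, new_sd):
    """Express a z-score in a new distribution: z * (new SD) + (new mean)."""
    return z * new_sd + new_mean

# Assumption: T-score scale with mean 50 and SD 10 (not given in the notes)
t_score = rescale(0.5, new_mean=50, new_sd=10)   # 55.0

# IQ distribution from the notes
iq = NormalDist(mu=100, sigma=15)

# Proportion of the distribution below and above an IQ of 130
below_130 = iq.cdf(130)
above_130 = 1 - below_130

# Proportion within 1 standard dev. of the mean (85 to 115)
within_1_sd = iq.cdf(115) - iq.cdf(85)

# Same relative location: IQ 130 is z = 2, so the area below it matches
# the area below z = 2 in the standard normal distribution
standard = NormalDist()                          # mean 0, SD 1
below_z2 = standard.cdf(2.0)

print("T score for z = 0.5:", t_score)
print("proportion below / above IQ 130:", round(below_130, 4), round(above_130, 4))
print("proportion between 85 and 115:", round(within_1_sd, 4))
```

Because the area below IQ 130 equals the area below z = 2, one table of standard normal areas is enough for any normal distribution once scores are converted to z-scores.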