Statistics for the Behavioral Sciences (Week 1 Notes)
These 5-page class notes were uploaded by Julia_K on Saturday, January 30, 2016. They belong to PSYCH-UA 10 - 001 at New York University, taught by Elizabeth A. Bauer in Spring 2016. Since their upload, they have received 311 views.
Date Created: 01/30/16
Course: Statistics for the Behavioral Sciences
Professor Elizabeth Bauer
Lectures One (1/26) and Two (1/28)

Lecture 1: Introduction to Psychological Statistics
Jan. 26, 2016

Main question of statistics: How do we accurately count/measure a variable?
Rosenthal and Jacobson operationally defined this problem; that is, they made it testable. In 1968, Rosenthal and Jacobson studied the Pygmalion Effect: they wanted to see whether higher expectations of students would actually lead to higher IQs. The students labeled as “Bloomers” had a 16.5-point increase in IQ, whereas the others had a 7-point increase.
The basic problem: Was this increase due to the labeling or due to chance? (Chance means that random variability alone could produce a difference in the results.)

Descriptive vs Inferential Statistics
Descriptive stats describe the given data.
Inferential stats take the descriptive data from a sample and use it to draw conclusions about a bigger population. (Rosenthal and Jacobson’s experiment exemplified this.)
*Note: Some problems with statistics
1. Group averages don’t always measure individual progress.
2. Variability plays a role. (Ex: an average score of 85 can come from scores between 70 and 100 or from scores between 80 and 90.) Each range implies something different about the average score.

Population vs Sample
Population parameters
A population is the entire group you are interested in.
Parameters are numbers calculated/measured from a population. (Denoted by Greek letters.)
Sample statistics
A sample is a subset of a population (it should be random and representative of the population).
Samples of convenience are samples that simplify the experimental process but aren’t always generalizable.
Statistics are numbers calculated from a sample.
*Researchers use sample statistics to estimate population parameters (start small and work their way up).

Scales of Measurement (Organizing Data):
1. Nominal
a. Categorical or qualitative data
b. Values are names (dogs, cats, political party, religious affiliation, etc.)
c. Commonly displayed on bar graphs
2. Ordinal
a. Values have a numerical/specific order
b. *Spaces between ordinals may NOT be the same [ex: on an anxiety scale test, the difference between 8 and 9 is not necessarily the same as the difference between 4 and 5]
c. Value examples: class rankings, small/medium/large
3. Interval
a. Values have a numerical/specific order
b. The intervals between adjacent values are equal (unlike ordinal data)
c. Value examples: degrees F or C
4. Ratio
a. Order matters and intervals are equal, and the scale also has a true zero point: a value of 0 means the complete absence of the characteristic. This is what allows values to be compared as ratios.
Comparing ratio to interval: take Celsius and Kelvin, for instance. Both have a value of 0 degrees. But Celsius can’t be a ratio measurement because 0 degrees C does not indicate an absence of heat; it only marks the freezing point of water. For this reason, you can’t say that 40 degrees C is twice as hot as 20 degrees C, even though 40 degrees is hotter than 20. Kelvin is different: 0 Kelvin actually indicates an absence of heat, so it can be used as a ratio scale.

Continuous vs Discrete Scale
1. Continuous: infinite number of possible values; no gaps between adjacent values (ex: time, weight, height)
2. Discrete: generally means there’s a finite number of values; there are gaps between adjacent values (ex: number of students in a classroom)

Independent vs Dependent Variables
1. Independent: the variable being manipulated in an experiment
2. Dependent: the variable that is measured to assess whether the independent variable makes a difference

Parametric vs Nonparametric Statistics
1. Parametric: rely on assumptions about population parameters and the shape of the distribution (ex: calculations based on deviations from a mean)
2. Nonparametric: do not rely on assumptions about population parameters or the shape of the distribution

Experimental vs Correlational Research
1. Experimental: the researcher manipulates an independent variable to test a hypothesis.
2. Correlational: the data come from predetermined groups or characteristics that the researcher cannot assign (ex: someone’s year in college, date of birth, etc.). Correlation does not imply causation, so you can’t directly attribute a result to a predetermined fact (ex: concluding that Group A is less technologically adept because those participants were born before 1965).

Summation Notation
$\sum_{i=1}^{N} X_i$
N = sample size (number of participants/people being studied)
X = the variable
i = the index of the given score we’re talking about (i = 1 is the first score)

Ex) X: 3, 2, 5 and Y: 4, 1, 6
$\sum X = 3 + 2 + 5 = 10$
$\sum X^2 = (3)^2 + (2)^2 + (5)^2 = 38$
$(\sum X)^2 = (10)^2 = 100$

Summation rule 1A: $\sum (X_i + Y_i) = \sum X_i + \sum Y_i$
Summation rule 1B: $\sum (X_i - Y_i) = \sum X_i - \sum Y_i$
Summation rule 2: $\sum C = NC$, where C is a constant
Summation rule 3: $\sum C X_i = C \sum X_i$
Summation rule 4: $\sum (X_i Y_i) \neq (\sum X_i)(\sum Y_i)$
Instead, $\sum (X_i Y_i) = 3(4) + 2(1) + 5(6) = 44$, whereas $(\sum X_i)(\sum Y_i) = (10)(11) = 110$.

Lecture 2: Frequency Tables, Graphs, and Distributions
Jan. 28, 2016

Q: How would researchers conveniently organize large amounts of numerical data?
A: Methods of distribution, including the Simple Frequency Distribution, Graphing, the Frequency Polygon, the Grouped Frequency Distribution, and the Stem and Leaf Plot
Scenario: Suppose a researcher sends out a stress questionnaire asking 151 students to rate their stress level on a 0-10 scale. With so many students and different stress scores, how can we find the frequency of each score?

1. Simple Frequency Distribution (SFD)
a. Shows the frequency of each score.
b. Has 2 columns: the left column is the array (scores listed from highest to lowest) and the right column is the frequency of each score (how many students picked each score).
c. *Always list all scores on the array, even if they have a frequency of 0*
d. Once everything is listed, the sum of the frequencies should equal the total number of students (N). So, $\sum f = N$.
e. The SFD can help identify the Cumulative Frequency (cf), or the number of scores at or below a particular value. Ex) The cf of score 1 = the frequency of that score (3 people) + the frequency of the scores below it (0 people). So, cf(1) = 3.
Note: To see how many people scored below a given score, find that score in the table, go straight across to the cf column, and then go down one row to the next lower score. That cf shows how many people picked a lower score. The same approach works for the “cumulative relative frequency” mentioned later.
f. The frequency can be converted to the Relative Frequency (rf), which expresses it as a proportion of the whole (ex: .08 of the group chose a score of 7). So, rf = f/N.
g. The cf helps identify the Cumulative Relative Frequency (crf), which shows what fraction of the scores are at or below a particular value. So, crf = cf/N.
h. Finally, we can find the Cumulative Percentage (cpf) by doing: crf x 100. This gives the Percentile Rank of a score: the % of scores at or below a particular value.
Note: To find this on a chart, look at a given score, then go straight across to the number listed in the cpf column.

2. Graphs
a. Bar graph: the height of the bar indicates frequency of occurrence.
b. Bar graphs are used for discrete or categorical data; the bars don’t touch, which shows that the categories are separate.
c. Bar graphs can be used for nominal data.
d. Histograms: the bars touch (this implies continuity, as with a variable such as height). Each bar spans the real limits of a value.

3. Frequency Polygon
a. Connects the tops of the bars of the histogram.
b. On a graph: draw a dot at the top center of each bar of a histogram, and then connect those dots.
c. Cumulative Frequency Polygon (Ogive Shape)
Cumulative frequency always slopes up or stays constant. It can’t go down because it counts the scores that are equal to or below the given value.
The x-axis shows the upper real limits of the values.
What are upper real limits? When you graph the value 5, for example, you want to make sure you’re covering all the scores that count as that value, so your upper real limit becomes 5.5.
d. Cumulative Percentage Polygon: used to estimate where a percentile falls on the graph when it lies between 2 values.

4. Grouped Frequency Distribution
a. Works similarly to the SFD method.
b. This method is used when the scores span too wide a range to list individually, so you group them into intervals/classes. With this method, specific individual scores are lost.
c. Apparent limits (ex: 20) vs real limits (ex: 20.5)
d. Interval width (i) = how wide each group is. i = upper real limit - lower real limit of the interval
e. # of intervals = (range of the scores)/(interval width)
f. A histogram is used to graph this.

5. Stem and Leaf Plot
a. These plots don’t lose the individual score data; rather, they plot all the scores out. Scores are grouped by their first digit, so 12, 13, and 14 would all be grouped on one line.
b. Left side = stem (first digit of score)
c. Right side = leaves (all digits following the first digit)
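
The summation rules from Lecture 1 can be checked numerically. The following is a minimal Python sketch using the X and Y scores from the notes' own example (X: 3, 2, 5 and Y: 4, 1, 6). The constant C = 2 and all variable names are assumptions chosen only for illustration; they are not from the lecture.

```python
# Scores from the Lecture 1 summation example.
X = [3, 2, 5]
Y = [4, 1, 6]
N = len(X)   # sample size
C = 2        # arbitrary constant, assumed here for illustration only

sum_x = sum(X)                              # sum of X
sum_x_squared = sum(x ** 2 for x in X)      # sum of squared scores
squared_sum_x = sum(X) ** 2                 # square of the summed scores
sum_xy = sum(x * y for x, y in zip(X, Y))   # sum of cross products
product_of_sums = sum(X) * sum(Y)           # product of the two sums

# Rule 1A: the sum of (X + Y) equals the sum of X plus the sum of Y.
rule_1a = sum(x + y for x, y in zip(X, Y)) == sum(X) + sum(Y)
# Rule 1B: the sum of (X - Y) equals the sum of X minus the sum of Y.
rule_1b = sum(x - y for x, y in zip(X, Y)) == sum(X) - sum(Y)
# Rule 2: summing a constant N times gives N * C.
rule_2 = sum(C for _ in range(N)) == N * C
# Rule 3: a constant multiplier can be moved outside the summation.
rule_3 = sum(C * x for x in X) == C * sum(X)
# Rule 4: the sum of products is NOT the product of the sums.
rule_4 = sum_xy != product_of_sums

print(sum_x, sum_x_squared, squared_sum_x, sum_xy, product_of_sums)
print(rule_1a, rule_1b, rule_2, rule_3, rule_4)
```

A check on three scores only illustrates the rules; it does not prove them in general.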
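
The columns of the frequency distribution from Lecture 2 (f, cf, rf, crf, cpf) can be sketched in Python as follows. The scores below are hypothetical data made up for illustration, not the 151 questionnaire responses from the lecture, and `frequency_table` is an assumed helper name rather than anything from the course.

```python
from collections import Counter


def frequency_table(scores, low, high):
    """Build rows with f, cf, rf, crf, and cpf for every score from low to high.

    Rows are returned from the highest score to the lowest, matching the
    array layout described in the notes. Scores with a frequency of 0 are
    still listed.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cf = 0
    # Accumulate from the lowest score upward so cf counts scores at or below.
    for score in range(low, high + 1):
        f = counts.get(score, 0)
        cf += f
        rows.append({
            "score": score,
            "f": f,                  # frequency
            "cf": cf,                # cumulative frequency
            "rf": f / n,             # relative frequency
            "crf": cf / n,           # cumulative relative frequency
            "cpf": 100 * cf / n,     # cumulative percentage
        })
    rows.reverse()  # list the highest score first
    return rows


# Hypothetical stress ratings on a 0-5 scale (made-up data).
scores = [1, 1, 1, 2, 4, 4, 5, 5, 5, 5]
table = frequency_table(scores, 0, 5)

for row in table:
    print(row["score"], row["f"], row["cf"], row["rf"], row["crf"], row["cpf"])
```

In this made-up data set, three people chose a score of 1 and nobody chose 0, so the cf of score 1 is 3, which mirrors the cf(1) = 3 example in the notes.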