Week 1, January 24th, 2018
Theory of Science
● Only way to get to the truth. Methodology.
● Example: Aristotle’s book of problems: hair!
Characteristics of Science
● Body of knowledge.
● Understanding of natural world.
● Inductive by forming principles from observations.
● Deductive that principles are tested with new observations.
○ The more precise, the better!
● Science can be simple or complex.
● Experience helps with understanding, but is not necessary.
● Medicine is not a science, because its primary goal is treatment.
● Math can prove things. Science cannot.
● Premises (AKA “axioms” or “assumptions”) → logically valid conclusions.
● Example: 1) All men are mortal. 2) Socrates is a man. 3) Therefore, Socrates is mortal.
● Advantage: if premises are true and reasoning is sound, conclusions are certain.
● Disadvantage: nothing new can be learned! You are basing conclusions on what you already know.
● Independent observations → general rules/laws/principles.
● Advantage: new info!
● Disadvantage: that info could be wrong.
○ Example: 2, 4, 6, 8. What is the rule?
■ Could be, “every even number”. But what about “every rational
number”? Or “every number below 100”?
○ Example: the Aztecs sacrificed someone nightly so the sun god would bring back the sun. We know the sun comes back regardless of whether a sacrifice happens (because of Earth’s rotation).
● Black swan effect: one falsifying observation can overturn an entire conclusion.
○ Knowledge gained by induction is inherently uncertain! It must be tested under new conditions. That something hasn’t happened doesn’t mean it can’t.
Week 2, January 29th, 2018
Probability Theory
● Stats is based on it!
● Deriving things from first principles.
● Example: the Sally Clark story. Two of her children died shortly after birth. A British pediatrician testified against her because the chance of two children dying from SIDS is supposedly 1 in 73 million.
○ Should she be prosecuted?
What is probability?
● “Weight of empirical evidence”. There is technically no consensus on what it is specifically.
● Objective (Frequentist): relative frequency in the long run.
● Subjective (Bayesian): degree of belief.
● Likelihood/plausibility of an event happening.
● Quantified as a number, 0 to 1.
○ Probability space for mutually exclusive events.
○ Example: you can be either pregnant or not. No in between.
● In-class activity: Sophia and med school example.
● Premise 1) If A is true, then B is true. (If it is raining, the street is wet.)
Premise 2) A is true. (It is raining.)
Conclusion) B is true. (The street is wet.)
● False! There could be an awning: premise 1 may not hold in the real world.
● Strict deductive reasoning alone is of limited use in practice.
● Plausible reasoning makes logic useful (if fallible).
● If A implies B: when B is observed to be true, A becomes more plausible; when B is false, A is falsified.
● If events are mutually exclusive:
→ p(A or B) = p(A∪B) = p(A) + p(B)
→ ∪ = union!
● If not mutually exclusive:
→ [Venn diagram: the union is the highlighted region.]
→ P(A or B) = p(A) + p(B) - p(A and B)
→ Subtracting the intersection avoids double counting.
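Both addition rules can be sketched in a few lines. The card-drawing numbers below are a hypothetical example, not from the lecture:

```python
# Hypothetical example: drawing one card from a standard 52-card deck.
# Mutually exclusive: p(king or queen) = p(king) + p(queen).
p_king, p_queen = 4/52, 4/52
p_king_or_queen = p_king + p_queen  # 8/52

# Not mutually exclusive: p(king or heart) overlaps at the king of hearts,
# so subtract the intersection to avoid double counting.
p_heart = 13/52
p_king_and_heart = 1/52
p_king_or_heart = p_king + p_heart - p_king_and_heart  # 16/52

print(round(p_king_or_queen, 4))  # 0.1538
print(round(p_king_or_heart, 4))  # 0.3077
```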
● If events are independent of each other:
→ p (A and B) = p(A∩B) = p(A) x p(B)
→ ∩ = intersection/joint!
→ Example: there is a 50% chance of at least one 6.5+ earthquake in 100 years. What is the chance of at least one in 200 years?
→ Treat the two 100-year spans as independent: p(no quake in 200 years) = 0.5 × 0.5 = 0.25.
→ So there is a 75% chance of at least one 6.5+ earthquake in 200 years!
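The earthquake example works through the complement. A minimal sketch, assuming the two centuries are independent:

```python
# Earthquake example: p(at least one M6.5+ quake in 100 years) = 0.5.
# Work with the complement, assuming the two 100-year spans are independent.
p_none_100 = 1 - 0.5            # no quake in one 100-year span
p_none_200 = p_none_100 ** 2    # no quake in either span (independence)
p_at_least_one_200 = 1 - p_none_200
print(p_at_least_one_200)  # 0.75
```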
● Example: 25% chance at a good GPA and a good MCAT.
42% chance at a good GPA.
10% chance at a good MCAT.
What is the probability of having a great MCAT if you have a great GPA?
Remember: these are not independent!
p(A ⋂ B) = p(A) * p(B | A)
Rearranged: p(B | A) = p(A ⋂ B) / p(A)
Solving the example (A = good GPA, B = good MCAT):
p(MCAT | GPA) = p(GPA ⋂ MCAT) / p(GPA)
p(A ⋂ B) = 0.25
p(A) = 0.42
p(B | A) = 0.25 / 0.42 ≈ 0.6
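The conditional-probability calculation in the GPA/MCAT example is just a division:

```python
# GPA/MCAT example from the notes (not independent events).
p_gpa_and_mcat = 0.25   # p(A and B): good GPA AND good MCAT
p_gpa = 0.42            # p(A): good GPA
# Conditional probability: p(B | A) = p(A and B) / p(A)
p_mcat_given_gpa = p_gpa_and_mcat / p_gpa
print(round(p_mcat_given_gpa, 2))  # 0.6
```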
Mutually Exclusive vs. Conditional vs. Independent Events
● Mutually Exclusive example: Tom and Sabrina break up and never want to see each other again. If Sabrina goes to a party, what is the probability of Tom going to that party?
→ 0! He won’t go!
● Independent Events example: Tom and Sabrina break up but remain friends. If Sabrina goes to a party, probability of Tom going to the party?
→ Both events keep their own probabilities. Neither affects the other.
● Example: gender and depression. [Venn diagrams compare the independent case with the not-independent case; p(D) = shaded region.]
Week 2, January 31st, 2018
Descriptive Statistics I: single variable.
Definition: describe essential characteristic of data set with few parameters.
2 parameters: central tendency and dispersion.
Measures of central tendency: a # that represents an entire distribution.
→ Mean, median, mode.
Mean: finding “true value” in presence of noise.
● Properties: sum of deviations from mean = 0. Sum of squared deviations = minimal.
● Disadvantage: outliers affect the mean.
● Median is more robust.
Ergodicity: instead of measuring one person many times, measure many people and combine by calc. average.
● Using the group average to stand in for an individual only works if 1) everyone is identical and 2) everyone remains the same over time.
Dispersion Measures: how certain a number is.
● How much variance around expected value one should expect.
● Remember the property of the mean: summing simple deviations won’t work since they add to zero. Solution?
○ Standard deviation: square the deviations before summing. Like the mean, it is sensitive to outliers.
● Mean absolute deviation: more robust. Akin to the median.
Example: 40, 60, 60, 80.
Mean = 60, so the absolute deviations are 20, 0, 0, 20, summing to 40.
Divided by the number of measurements: 40/4 = 10 (the mean absolute deviation).
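The example above can be checked in a few lines, contrasting the squared-deviation and absolute-deviation solutions:

```python
# Example from the notes: measurements 40, 60, 60, 80.
data = [40, 60, 60, 80]
n = len(data)
mean = sum(data) / n                                  # 60.0

# Simple deviations sum to zero, so either square them (variance/SD)...
variance = sum((x - mean) ** 2 for x in data) / n     # 200.0
sd = variance ** 0.5                                  # ~14.14

# ...or take absolute values (mean absolute deviation, more robust):
mad = sum(abs(x - mean) for x in data) / n            # 40/4 = 10.0
print(mean, mad)  # 60.0 10.0
```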
Week 3, February 5th, 2018
Descriptive Statistics II: Correlation
Science is always about relationships.
● I.e, relationship between smoking and cancer.
● Parameters summarize distribution of measurements.
● Parameters as a correlation summarize relationship between measurements.
● Peg board (Galton board) example: values pile up around the mean; extreme values are very rare.
● In psych, this distribution is used for IQ, normed to a mean of 100.
Deriving Correlation: start with distance around the mean, then calculate the standard deviation of the distribution!
Standard deviation: s = sqrt( Σ(x − x̄)² / n )
Standard deviation → variance (square both sides): s² = Σ(x − x̄)² / n
Variance → covariance (instead of multiplying the deviation by itself, multiply it by the other variable’s deviation): cov(X,Y) = Σ(x − x̄)(y − ȳ) / n
Properties of covariance:
● If both X and Y are above the mean, the contribution to covariance is positive.
● If both X and Y are below the mean, the contribution is also positive.
● If one is below and the other is above the mean, the contribution is negative.
Covariance → correlation: r = cov(X,Y) / (s_X × s_Y)
● Correlation = normalized covariance.
● Normalized by the product of the respective SDs.
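The derivation above can be followed step by step in code. The data here are made up for illustration:

```python
# Sketch: covariance and Pearson correlation from the definitions,
# using small made-up data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
sx = (sum((xi - mx) ** 2 for xi in x) / n) ** 0.5
sy = (sum((yi - my) ** 2 for yi in y) / n) ** 0.5

r = cov / (sx * sy)   # correlation = covariance normalized by the SDs
print(round(r, 3))    # 0.853
```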
Properties of correlation:
● # between -1 and 1.
● Quantifies strength of a linear relationship between 2 variables.
● Correlation of 0 does NOT mean no relationship!
● Most psych relationships between 0.3 and 0.4!
The last graph (a sine curve) has zero linear correlation, but the variables are still clearly related!
Interpreting Correlation Strength: aleatory considerations.
● Dice example!
Pearson’s Product-Moment Correlation
● Only linear.
● Based on means and standard dev.
Spearman’s Rank Correlation:
● Based on ranks instead of the mean, in analogy to the median.
● Can be nonlinear relationships, as long as it is monotonic!!
○ Monotonic = always one direction. Either always rising or always falling.
Algorithm: 1) transform values to ranks, 2) calculate the product-moment correlation between the ranks, or 3) use the shortcut formula:
ρ = 1 − (6 Σd²) / (n(n² − 1)), where d = rank difference.
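The rank-then-correlate algorithm can be sketched on made-up monotonic data (y = x³, nonlinear but always rising):

```python
# Sketch of the Spearman algorithm from the notes, on made-up data.
def ranks(values):
    """Rank values from 1..n (no ties in this toy example)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

x = [1, 2, 3, 4, 5]
y = [xi ** 3 for xi in x]          # nonlinear but monotonic
rx, ry = ranks(x), ranks(y)

# Step 3 formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = rank difference
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(rho)  # 1.0: ranks agree perfectly even though y is not linear in x
```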
Week 3, February 7th, 2018
Measurement theory: data result from measurements, and not all measurements are equal.
Scales of Measurement: reduces natural phenomenon to a number.
● #s have properties that can be interpreted mathematically.
● #s can be converted to other #s via math.
● Measurement scales.
Nominal Scale: names as labels. No magnitude.
● Only property that can be interpreted is “same” and “different”.
○ Example: “students” and “teachers”. License plates.
● Valid operation: counting.
● Valid summary statistic: mode.
Ordinal Scale: magnitude matters. Adding order.
● Higher inherits properties of lower.
● Other than # and order, not much can be interpreted.
○ We know something is bigger but not by how much.
● Most common in psychology.
○ Likert scales.
● Valid operation: ordering.
● Valid summary statistic: median.
Interval Scale: adds distance. Two points will have equal unit size.
● Mean is interpretable.
● CANNOT do ratios because no zero.
● Rare in psychology.
○ Example: IQ.
○ IQ of 85 (from 100) is the same distance as 115 (from 100).
○ We cannot say an IQ of 100 is twice as smart as IQ of 50!
● Valid operation: adding.
● Valid summary statistics: mean.
Ratio Scale: the highest measurement scale.
● Ratio becomes meaningful because there is a meaningful zero.
○ Example: reaction time.
● All overt behavior is measured on a ratio scale: reaction times, firing rates of neurons, most scales in the physical sciences. Ratio scales are less common for psychological constructs.
● Zero is necessary but not sufficient.
○ For example, on nominal scale, can have the value of “0”.
■ Example: 0 = female. 1 = male.
Operationalization- fundamental problem of science!
● How is theoretical construct measured?
○ Example: Denmark is ranked the happiest country, but how is happiness measured?
● A theory links measurement to construct to meaning.
○ Example. Temperature measured by column of mercury.
● How can you do this in psychology?
Week 4, February 12th, 2018
Test Theory I: objectivity and reliability.
We assess measurements by three criteria: objectivity, reliability, validity.
→ All are ways of using correlation.
Objectivity
● Not an issue in most sciences… except psych.
● Judgement call → bias!
○ Example: how do we measure babies if they can’t speak?
● The person being assessed (giving data) is influenced by the person recording it (the experimenter).
○ Example: afraid of being judged, so they will lie.
● How to assess objectivity?
○ Have 2+ people measure data from the same population with the same instrument.
● If the result depends systematically on who logs the data rather than on the data itself, it is not objective.
● Why does it matter?
○ Science wants to be objective, not subjective.
○ Lack of objectivity undermines science’s whole purpose!
Reliability
● How consistent are the measurements?
○ If you repeated a measurement, how close would it be to the first measurement?
● Measured value = true value + measurement error
● Reliability = true value / (true value + measurement error)
● How to assess?
○ Generally want 0.8 or higher!
● Several types of reliability:
○ 1. Test-Retest: consistency over time.
■ In psych, usually measured in six month increments.
■ Problem: learning and practice effects, and duration of interval.
○ 2. Alternate-Forms Reliability: consistency across particular instantiations of the test.
■ Problem: “parallel” tests are still not the same.
■ Example: In a lecture hall, you miss your exam date so you take a make-up exam. However, the make-up exam is not the same as the original exam.
○ 3. Split Half Reliability: consistency within the same test.
■ Correlation between the two halves of the test.
■ Usually split odd/even.
○ 4. Interrater Reliability: consistency across raters if the rater is part of the instrument.
■ Two psychologists measure the same person. How consistent are the results between the two experimenters?
● MBTI (Myers Briggs)
○ Personality typology.
○ E/I, S/N, T/F, J/P..
■ Example) INTP, ENFJ…
○ Often used in companies to build teams!
■ Pro: focuses on personality.
■ Cons: unreliable!
○ Why is it unreliable?
■ MBTI assumes traits are bipolar and bimodal: introversion (blue) and extraversion (red) as two separate peaks with no overlap. Actual personality data do not look like that; most people fall in the middle of a single distribution.
■ Low in test-retest reliability.
● Internal consistency: within one test, questions are measuring the same construct.
○ Example) don’t want an IQ question in a personality test.
○ To measure, use Cronbach’s alpha: pairwise between items.
■ Close to 0.9 is desirable, but not correlation of 1! That implies it is the same question but in different words.
○ Example) NEO-PI
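Cronbach’s alpha can be sketched from its standard variance-based formula; the item scores below are hypothetical, not from the NEO-PI:

```python
# A minimal Cronbach's alpha sketch on made-up item scores.
# alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

scores = [  # 4 respondents x 3 items (hypothetical data)
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
k = len(scores[0])
item_vars = [variance([row[i] for row in scores]) for i in range(k)]
total_var = variance([sum(row) for row in scores])
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # 0.97
```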
Week 4, February 14th, 2018
Test Theory II: validity.
Validity: is the test measuring what it’s supposed to?
● To assess, use correlation.
● Unlike reliability, there is no cutoff. Even a correlation of 0.2 can establish some validity: better than 0!
1. Face Validity: does it look valid on the surface?
● Not actual validity. Only appearance of it.
● Example: neither the LSAT nor the MCAT predicts professional performance perfectly, but both have face validity.
2. Content Validity: do items on test assess content that matters?
● Consults experts.
● Example: Hare Psychopathy Checklist
3. Construct Validity: how good the test is at capturing the construct.
● The most important!
● Correlating scores on one test of a construct with that of another.
● Example: you want to create a new construct of “grit” and decide to measure its relation to IQ and conscientiousness.
4. Criterion Validity
● Important in applied psych.
● 1. Present performance (concurrent validity)
○ Example: field sobriety test.
○ Can help determine who is drunk; however, some people are more functional than others.
2. Future performance (predictive validity)
Example: SAT predicting college GPA.
5. “Practical” Validity
Week 5, February 21st, 2018
Scientific prediction problem: you want to know something in a relationship.
→ Could be in the future or the present.
→ i.e., wanting to know which students are likely to become depressed, for early intervention.
→ Need to know a predictor.
→ The method is called regression; the predictors are called regressors.
● Introduced by Galton: did a study on height of children in relation to height of their parents and noticed:
○ Height is regressing to the mean. Tall kids have tall parents, but kids are not as tall as their parents.
● Height = nature + nurture + luck.
○ The higher the luck factor, more likely to regress to mean.
● Regression is only avoided if measurements are perfect and completely determined by the predictor.
○ A perfect regression line on the graph is 45 degrees.
● Central in economics and logic of science.
○ IV changing DV.
○ Used in sciences you can’t do experiments in:
■ I.e, can’t change national income, can’t put someone in jail to see how they function.
For the regression line, you want to minimize the residuals (the vertical distances from the data points to the line).
● Minimize sum of squared deviations.
● Think of a parabola opening up: if you drop marbles into it (like a bowl), the marble will fall to the lowest point.
● Not robust to outliers.
Regression relies on correlation!
● Regression Line: conceptualized as a 2D average.
○ Average value of Y at all values X.
○ Increase 1 SD of X, Y increases r SDs, on average.
○ r = correlation!
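The slope/correlation relationship above (increase X by 1 SD and Y increases r SDs) can be sketched directly; the data are made up:

```python
# Sketch: the regression slope equals r * (sd_y / sd_x), and the line
# passes through the point of means. Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sx = (sum((xi - mx) ** 2 for xi in x) / n) ** 0.5
sy = (sum((yi - my) ** 2 for yi in y) / n) ** 0.5
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
r = cov / (sx * sy)

slope = r * sy / sx          # increase x by 1 SD -> y increases r SDs
intercept = my - slope * mx  # line goes through (mean x, mean y)
print(round(slope, 2), round(intercept, 2))  # 0.8 1.8
```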
GLM: general linear model
Y = β0 + β1X1 +ε
β = regression parameter(s), or weights
X = IV (predictor)
Y = DV (predicted)
● Measurement = true value + error
○ I.e, someone takes a midterm. Their score is based on:
■ Their skill in subject ← true value!! competence
■ Mood ← error
■ Lucky guessing ← error
● The lower the correlation between regressor and outcome, and the lower the reliability of the measurements, the stronger the regression to the mean effect.
● “True value” overcome by “error values” which are not consistent between measurements.
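The “measurement = true value + error” idea can be simulated to show regression to the mean; all numbers below are made up:

```python
# Simulate two test scores per person: true value plus independent error.
# People selected for extreme first scores score closer to the mean the
# second time, because the error is not consistent between measurements.
import random
random.seed(0)

people = [random.gauss(100, 10) for _ in range(10000)]   # true values
test1 = [t + random.gauss(0, 10) for t in people]        # score 1 = true + error
test2 = [t + random.gauss(0, 10) for t in people]        # score 2 = true + new error

# Take the top 5% on test 1 and compare their averages on both tests.
cutoff = sorted(test1)[int(0.95 * len(test1))]
top = [i for i, s in enumerate(test1) if s >= cutoff]
avg1 = sum(test1[i] for i in top) / len(top)
avg2 = sum(test2[i] for i in top) / len(top)
print(avg1 > avg2)  # True: the extreme group drifts back toward the mean
```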
Real Life Examples:
● Praise and criticism
○ i.e., your child is really acting out, so you scold them and they stop acting out. Why?
○ Likely, the child would have stopped acting out on their own: they get tired / their tantrum has peaked and can only diminish.
○ Patients only apply for therapy at their worst. Likely, they will slowly get better on their own over time.
● Only based on correlation.
○ Many relationships are not causal.
○ Correlation does not equal causation. There might be another factor affecting the outcome.
● Must control as many confounding variables as possible!
○ This is what multiple regression does.
Week 6, February 26th, 2018
Multiple Regression
→ More than one predictor!
→ In social sciences, always have to consider multiple predictors!
● Conducting an experiment
○ Unethical at times: can’t assign random people to smoke
○ Or, can’t actually conduct an experiment: unable to change income tax in US for an experiment
○ Instead, use multiple regression!
Experiment: there is a set definition.
● “Randomly assign people to different experimental conditions”.
● A large # of randomly assigned participants “controls” confounds.
○ Thus, causality!
Fundamental issue with regression…
● There is always a lot going on! Can’t possibly account for all factors an experiment can, and may be hidden confounding variables that researchers are unaware of.
○ For example, PROJECT STAR.
○ Small vs. large classroom sizes on academic performance.
○ Possible confounds: teachers can choose the size of class they want to teach, richer communities can afford small classes, large lectures may use PPTs while small ones are verbal...
● Natural experiment: an observational study where the situation appears to have random assignment.
○ Example: can’t put people in prison, but can observe those who are.
Multiple Regression Model
What does β stand for?
Beta Weights: reflect degrees of correlation in a regression.
● Like a slope.
Measurements = Y
Average measurement = Ȳ
Predictions = Ŷ
Residuals = Ŷ − Y
SS_residual = Σ(Ŷ − Y)²
SS_explained = Σ(Ŷ − Ȳ)²
SS_total = SS_explained + SS_residual
*** Assumption there is a linear relationship!
Assessing Regression Models
R = multiple correlation: the correlation between the regression model’s predictions and the actual values obtained.
R² = proportion of variance that can be accounted for.
1 − R² = proportion of variance that cannot be accounted for (confounds, errors!)
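The sums-of-squares decomposition gives R² directly. The data and fitted line below are made up for illustration (predictions assumed from a least-squares fit):

```python
# Sketch: computing R^2 from the sums of squares.
y     = [2, 4, 5, 4, 6]
y_hat = [2.6, 3.4, 4.2, 5.0, 5.8]   # predictions from y = 1.8 + 0.8x, x = 1..5
y_bar = sum(y) / len(y)

ss_residual  = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))
ss_explained = sum((yhi - y_bar) ** 2 for yhi in y_hat)
ss_total     = ss_explained + ss_residual

r2 = ss_explained / ss_total  # proportion of variance accounted for
print(round(r2, 3))  # 0.727
```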
Partial Correlation: tease apart what is going on in non-experimental, relational datasets.
● Example: correlation between education and income, controlling for intelligence.
● Bivariate correlation between X and Y: r_xy
○ Controlling for z: r_xy·z
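Partial correlation can be sketched with the standard formula for three variables (the formula is not given in the notes, so it is an assumption here; the numbers are hypothetical):

```python
# Partial correlation from bivariate correlations (standard formula):
# r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
import math

def partial_corr(r_xy, r_xz, r_yz):
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical: education-income correlation 0.5, both correlate 0.6 with
# intelligence; controlling for intelligence weakens the relationship.
print(round(partial_corr(0.5, 0.6, 0.6), 3))  # 0.219
```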
Week 6, February 28th, 2018
Logit Regression Functions!
● Used for binary outcomes.
○ Binary = one or other. Only two possible answers.
● A straight regression line on a binary outcome would look like the graph to the right.
○ Not very accurate!
○ So, we need a nonlinear model that links the predictor to the outcome either occurring or not occurring.
Logs/Logit Function: links values of the predictor variable to the probability of the outcome.
● The parameters are estimated from data since we don’t actually know what they are.
● Logit: natural log of the odds, with e as the base: logₑ x = ln x.
● The logit graph maps probabilities to log-odds, which is backwards for our purpose, so we take the inverse (the logistic function), whose graph is S-shaped.
This looks like it can fit the binary graph, right?
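The inverse-logit (logistic) function can be sketched in a few lines; the intercept, slope, and the hours-studied scenario are made-up assumptions:

```python
# Sketch of the logit link: the logistic (inverse-logit) function maps any
# predictor value to a probability between 0 and 1, giving the S-shaped
# curve that can fit a binary outcome. Coefficients here are made up.
import math

def inverse_logit(x):
    return 1 / (1 + math.exp(-x))

b0, b1 = -4.0, 0.08            # hypothetical intercept and slope
for hours in [0, 50, 100]:     # e.g. hours studied -> p(pass)
    p = inverse_logit(b0 + b1 * hours)
    print(hours, round(p, 3))
# prints: 0 0.018 / 50 0.5 / 100 0.982
```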