Week 2

Random Sample: If Y1, Y2, …, Yn are independent random variables with a common pdf f(y; θ) determined by one parameter θ, then {Y1, Y2, …, Yn} is said to be a random sample from that pdf.

• Y ∼ (μ, σ^2): we can take a finite sample from the population and then estimate μ and σ^2

• Use the sample average to estimate the mean!

o Sample average Y-bar = (ΣYi)/n is an estimator for μ!

An estimator is unbiased if its probability distribution has an expected value equal to the parameter of interest. Estimates still vary from sample to sample (Y-bar 1 ≠ Y-bar 2 in general); unbiasedness only says they are right on average.

E[Y-bar] = μ

E[(ΣYi)/n] = (1/n)*E[ΣYi] = (1/n)*ΣE[Yi] = (n/n)*E[Y] = μ
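A quick way to check the unbiasedness result is by simulation. This is a minimal sketch, assuming a normal population with made-up values for the mean, standard deviation and sample size (none of these numbers come from the notes): individual sample averages differ, but their average over many samples lands close to the population mean.

```python
import random
import statistics

def sample_mean(n, mu, sigma, rng):
    """Draw n independent values from a Normal(mu, sigma) population and average them."""
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))

rng = random.Random(0)          # fixed seed so the run is repeatable
mu, sigma, n = 6.0, 2.0, 25     # hypothetical population values and sample size

# Each sample gives a different Y-bar ...
y_bars = [sample_mean(n, mu, sigma, rng) for _ in range(20_000)]

# ... but the average of many Y-bars sits close to mu.
mean_of_y_bars = statistics.fmean(y_bars)
print(y_bars[0], y_bars[1])     # two different estimates
print(mean_of_y_bars)           # close to 6
```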

An estimator W of θ is a function of the sample: W = h(Y1, …, Yn)

unbiased: E[W] = θ

upward bias: E[W] > θ

s^2 = Σ(Yi - Y-bar)^2 / (n - 1)
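Why divide by n - 1 rather than n? A small simulation makes the bias visible. This is a sketch under assumed values (normal population, true variance 4, samples of size 5, all hypothetical): the n - 1 version averages out near the true variance, while dividing by n comes out too low.

```python
import random
import statistics

rng = random.Random(1)
mu, sigma, n = 0.0, 2.0, 5      # hypothetical values; true variance is sigma^2 = 4
trials = 40_000

s2_values = []        # divides by n - 1 (the s^2 in the notes)
naive_values = []     # divides by n
for _ in range(trials):
    ys = [rng.gauss(mu, sigma) for _ in range(n)]
    y_bar = sum(ys) / n
    ss = sum((y - y_bar) ** 2 for y in ys)   # sum of squared deviations
    s2_values.append(ss / (n - 1))
    naive_values.append(ss / n)

avg_s2 = statistics.fmean(s2_values)
avg_naive = statistics.fmean(naive_values)
print(avg_s2)      # close to 4
print(avg_naive)   # close to 4 * (n - 1) / n = 3.2, so biased downward
```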

Central Limit Theorem

X-bar is approximately ∼ N(μ, σ^2/n) when n is large
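The Central Limit Theorem can also be checked numerically. The sketch below assumes an exponential population with mean 1 and variance 1 (chosen only because it is clearly not bell-shaped; it is not from the notes) and samples of size 50. The sample averages should center on the population mean, have variance near σ^2/n, and fall within 1.96 standard errors about 95% of the time.

```python
import random
import statistics

rng = random.Random(2)
n = 50                      # hypothetical sample size
pop_mean, pop_var = 1.0, 1.0   # exponential with rate 1 has mean 1 and variance 1

# 20,000 sample averages, each from n independent exponential draws.
x_bars = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(20_000)
]

mean_x_bar = statistics.fmean(x_bars)
var_x_bar = statistics.pvariance(x_bars)

# Share of X-bars within 1.96 standard errors of the population mean;
# a normal distribution would put about 0.95 there.
se = (pop_var / n) ** 0.5
share = sum(abs(x - pop_mean) <= 1.96 * se for x in x_bars) / len(x_bars)

print(mean_x_bar)   # close to 1
print(var_x_bar)    # close to 1/50 = 0.02
print(share)        # roughly 0.95
```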

Always start hypothesis tests the same way

Do college students drink a lot?

Ho: μ = 6

Ha: μ > 6 (one-tailed)

Ha: μ ≠ 6 (two-tailed)

Ha: μ < 6 (one-tailed)

0 = Fail to Reject

Type Errors

Type 1: Reject the null hypothesis when it is in fact true

• Ex. Someone is found guilty when he/she did not commit the crime -> Consequence: an innocent person is sent to jail

o Ho: not guilty, Ha: guilty

• Ex. Conclude the wage increased when in reality the wage stayed the same -> Consequence: money wasted on the policy

Type 2: Fail to reject the null when it is actually false.

• Ex. Someone is found not guilty when he/she did commit the crime -> Consequence: a murderer walks free

• Ex. Conclude the wage did not increase when in reality it did -> Consequence: could have improved people’s lives.

Using the Central Limit Theorem we can standardize x-bar to get a statistic that follows a t distribution with n - 1 degrees of freedom: t = (x-bar - μ)/(s/sqrt(n))
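The standardized statistic and the Type 1 error definition can be tied together with a simulation. This is a sketch under assumptions: the data are drawn from a normal population whose mean really equals the null value (13, with standard deviation 2 and n = 9, all hypothetical choices), and 1.860 is the standard 5% one-tailed t table value for 8 degrees of freedom. Because the null is true by construction, every rejection is a Type 1 error, and the rejection rate should be near 5%.

```python
import random
import statistics

def t_stat(sample, mu0):
    """t = (x-bar - mu0) / (s / sqrt(n)), with s computed using the n - 1 divisor."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    return (x_bar - mu0) / (s / n ** 0.5)

rng = random.Random(3)
mu0, sigma, n = 13.0, 2.0, 9    # hypothetical setup; the null value is the true mean
critical = 1.860                # 5% one-tailed t critical value for df = 8 (table value)
trials = 20_000

# Count how often we reject Ho: mu = mu0 in favor of Ha: mu > mu0.
rejections = sum(
    t_stat([rng.gauss(mu0, sigma) for _ in range(n)], mu0) > critical
    for _ in range(trials)
)
type1_rate = rejections / trials
print(type1_rate)   # about 0.05
```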

*Study: standardizing with the Central Limit Theorem and setting up hypothesis tests

Wage example

X-bar = 14

μ = 13

s^2 = 4

n = 9

df = n - 1 = 8

Ho: μ = 13

Ha: μ > 13

t = (14 - 13)/(2/3) = 1.5 (t-value); critical value = 1.86 (one-tailed, 5%, df = 8). Since 1.5 < 1.86: Fail to reject
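The arithmetic of the wage example can be reproduced in a few lines. This is a minimal sketch: the function name is made up for illustration, and the critical value 1.86 is simply copied from the notes rather than computed. With these numbers the t-value is smaller than the critical value, so the one-tailed test does not reject the null.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s2, n):
    """t = (x-bar - mu0) / (s / sqrt(n)), where s is the square root of s^2."""
    return (x_bar - mu0) / (sqrt(s2) / sqrt(n))

t = t_statistic(x_bar=14, mu0=13, s2=4, n=9)
critical = 1.86                # one-tailed critical value for df = 8, taken from the notes
reject = t > critical          # Ha: mu > 13, so reject only for large positive t
print(round(t, 2), reject)     # 1.5 False
```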

Ex. 2

X-bar = 13.8

μ = 13

s^2 = 4

n = 100

Ho: μ = 13

Ha: μ ≠ 13

t = (13.8 - 13)/(2/10) = 4; two-tailed critical values = -1.98 and 1.98

Reject the null (two-tailed critical value = 1.98, and 4 > 1.98)

Confidence Interval

95% of the time the confidence interval will contain the true μ: (x-bar - t* × s/sqrt(n), x-bar + t* × s/sqrt(n)), where t* is the two-tailed critical value
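As an illustration, the interval can be computed for the Ex. 2 numbers (x-bar = 13.8, s^2 = 4, n = 100) with t* = 1.98 as used in the notes. This is a sketch; the function name is made up for illustration. The interval leaves out 13, which agrees with rejecting Ho: μ = 13 in the two-tailed test.

```python
from math import sqrt

def confidence_interval(x_bar, s2, n, t_star):
    """Return (lower, upper) = x-bar -/+ t* * s / sqrt(n)."""
    margin = t_star * sqrt(s2) / sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = confidence_interval(x_bar=13.8, s2=4, n=100, t_star=1.98)
print(round(low, 3), round(high, 3))   # 13.404 14.196
print(low <= 13 <= high)               # False: 13 is outside the interval
```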

________________________________________________________ _

Econometrics is slightly different from statistics because it is a quantitative social science.

We’re often looking for estimates.

• Estimate: your best guess of a population parameter given your data

• Causal Effect: A relationship between two things (or events) whereby one causes the other to happen.

o Ex. People who go to college earn more money, but that doesn’t mean going to college gets you more money. Maybe really smart people go to college, and that’s what drives the relationship

• Counterfactual: the opposite of what really happened

o Ex. College vs. no college. You can’t know what happens in both scenarios because you can’t be in 2 states at once

Data

• Observational: data that consists of information collected about people/firms from the real world

o Problems: don’t get the counterfactual, don’t know if they are random variables, incorrect causality (TED talk)

▪ Married men live longer than unmarried men, but that is not necessarily because of marriage. Men who are healthy & have a higher life expectancy get married more often than men with lower life expectancies

▪ Kids who sleep with the lights on tend to be short sighted. Short sighted parents like to leave lights on, and short sightedness is genetic, so their kids tend to be short sighted because of genetics, not because of the lights

Lesson: CORRELATION DOES NOT IMPLY CAUSATION

• Causality: If A then B.

• Correlation: If A then sometimes, maybe, B.

• Confounders: when we observe a relationship between A and B, but in actuality it is a third variable that causes both

o Ex. A -> B (observed) but in reality C (confounder) -> A & B

▪ Ex. Genetics in nearsighted example above
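A confounder is easy to reproduce in a simulation. In this sketch (all variables and numbers are made up for illustration) C drives both A and B, and A never enters the equation for B, yet A and B come out clearly correlated.

```python
import random

def correlation(xs, ys):
    """Pearson correlation, written out so no extra libraries are needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(4)
c = [rng.gauss(0, 1) for _ in range(10_000)]   # confounder C
a = [ci + rng.gauss(0, 1) for ci in c]         # A depends on C only
b = [ci + rng.gauss(0, 1) for ci in c]         # B depends on C only, never on A

corr_ab = correlation(a, b)
print(corr_ab)   # clearly positive (around 0.5) even though A has no effect on B
```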

• Reverse Causality: We observe a relationship between A and B, conclude A causes B, but in reality B causes A

o Ex. Marriage -> longer expected life; in reality it’s the opposite (longer expected life -> marriage)

• Simultaneity: We observe a relationship between A and B and conclude A causes B, but in reality A partly causes B and B partly causes A.

o Ex. Spending on police officers and crime (negative relationship) => doesn’t necessarily mean that spending more on cops will decrease crime, because crime also affects how much gets spent on police

• Experiment: randomized control trials

o Ex. Tennessee STAR Experiment – students were randomly assigned to smaller or regular-size classes to see if class size affects test scores

• Drawbacks

o Cost!!!! It’s very expensive

▪ You have to pay to implement the study and sometimes you have to pay participants

o Ethical concerns!

▪ Ex. You can go to college but you other guys can’t

o This leaves us with Observational Data. ☹ Yes it’s good but it’s not the best
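To see why randomized experiments are the benchmark even though cost and ethics often rule them out, compare a simple difference in means under self-selection and under random assignment. This is a sketch with an invented data-generating process: "ability" raises both the chance of treatment and the outcome, and the true treatment effect is set to 2. None of these numbers come from the notes.

```python
import random
import statistics

rng = random.Random(5)
TRUE_EFFECT = 2.0   # hypothetical causal effect of treatment on the outcome

def outcome(ability, treated):
    """Outcome depends on ability, on treatment, and on random noise."""
    return 10 + 3 * ability + TRUE_EFFECT * treated + rng.gauss(0, 1)

def diff_in_means(rows):
    """Average outcome of the treated minus average outcome of the untreated."""
    treated = [y for d, y in rows if d]
    control = [y for d, y in rows if not d]
    return statistics.fmean(treated) - statistics.fmean(control)

abilities = [rng.gauss(0, 1) for _ in range(20_000)]

# Observational data: higher-ability people choose treatment more often.
observational = []
for ab in abilities:
    d = ab + rng.gauss(0, 1) > 0
    observational.append((d, outcome(ab, d)))

# Experiment: a coin flip decides treatment, so ability is balanced across groups.
experimental = []
for ab in abilities:
    d = rng.random() < 0.5
    experimental.append((d, outcome(ab, d)))

obs_gap = diff_in_means(observational)
exp_gap = diff_in_means(experimental)
print(obs_gap)   # well above 2: the gap mixes the effect with ability
print(exp_gap)   # close to 2: randomization isolates the effect
```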

• Macro or finance use time series data

• Labor issues use panel data or repeated cross sections

• Time series: Observations correspond to different time periods but for the same individual.

o You’re comparing things to themselves (past vs. now)

▪ Ex. Stock data, GDP

• Cross Sectional: Observations correspond to different individuals for the same time period.

o Ex. Census

• Repeated Cross Sectional: Series of cross sections appended to each other, different individuals

• Panel: Time series for several individuals, repeated individuals

*Majority of what you do in econometrics is regression. Know how to do it well!

Regression: the best fitting line (relationship) that can be made on a graph

• Things to consider:

o How do we allow for things other than x to affect y?

o What functional relationship is there between x and y?

o How do we make sure we capture the ceteris paribus (able to control everything else) effect of x on y?

Yi = βo + β1x1 + ui (ui is the error term: everything other than x that affects y)

Yi = βo + β1x1 + β2x2^2 + ui
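For the one-regressor model, the "best fitting line" can be computed directly. This is a sketch under an assumption the notes do not state explicitly: that "best fitting" means ordinary least squares, i.e. the line minimizing the sum of squared residuals. The true parameters (1.0 and 0.5) and the simulated data are made up for illustration.

```python
import random

def ols_simple(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals for y = b0 + b1*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept: the line passes through (x-bar, y-bar)
    return b0, b1

rng = random.Random(6)
beta0, beta1 = 1.0, 0.5         # hypothetical true parameters
xs = [rng.uniform(0, 10) for _ in range(5_000)]
# The gauss draw plays the role of the error term in the regression equation.
ys = [beta0 + beta1 * x + rng.gauss(0, 1) for x in xs]

b0_hat, b1_hat = ols_simple(xs, ys)
print(b0_hat, b1_hat)   # close to 1.0 and 0.5
```

The estimates recover the slope here only because the simulated error is unrelated to x; if something in the error also moved x (a confounder), the fitted slope would not be the ceteris paribus effect.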