# 381 Class Note for STAT 462 at PSU

These 12 pages of class notes were uploaded by an elite notetaker on Friday, February 6, 2015. They belong to a course at Pennsylvania State University taught in the fall. Since upload, they have received 17 views.

Date Created: 02/06/15

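This course is about REGRESSION ANALYSIS:

- Constructing quantitative descriptions of the statistical association between $y$ (response variable) and $x$ (predictor or explanatory variable) on the sample data.
- Introducing models to interpret estimates and inferences on the parameters of these descriptions in relation to the underlying population.

[Figure: scatter plot of log length ratio (vertical axis) against log large inst ratio (horizontal axis)]

MULTIPLE regression: when we consider more than one predictor variable.

F. Chiaromonte

## Fitting a line through bivariate sample data, as a descriptor

Least squares fit: find the line that minimizes the sum of squared vertical distances from the sample points.

Generic equation of a line:

$$y = \beta_0 + \beta_1 x$$

Objective function:

$$\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n} \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)^2$$

Normal equations (derivatives of the objective function):

$$\sum_{i=1}^{n} y_i = n\,\beta_0 + \beta_1 \sum_{i=1}^{n} x_i, \qquad \sum_{i=1}^{n} x_i y_i = \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2$$

Solution (unique):

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}$$

Fitted value and residual for sample point $i = 1, \dots, n$:

$$\hat{y}_i = b_0 + b_1 x_i, \qquad e_i = y_i - \hat{y}_i$$

Geometric properties of the least squares line:

- $\sum_{i=1}^{n} e_i = 0$
- $\sum_{i=1}^{n} e_i^2$ is minimal
- $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{y}_i$
- $\sum_{i=1}^{n} x_i e_i = 0$
- $\sum_{i=1}^{n} \hat{y}_i e_i = 0$
- $\bar{y} = b_0 + b_1 \bar{x}$

[Figure: log length ratio against log large inst ratio, with the fitted line]

## Simple linear regression MODEL

Assume the sample data is generated as

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

where

- $x_i$, $i = 1, \dots, n$, are fixed (or we condition on them);
- $\varepsilon_i$, $i = 1, \dots, n$, are random errors such that
  - $E(\varepsilon_i) = 0 \;\; \forall i$ (no systematic component),
  - $\operatorname{var}(\varepsilon_i) = \sigma^2 \;\; \forall i$ (constant variance),
  - $\operatorname{cor}(\varepsilon_i, \varepsilon_j) = 0 \;\; \forall i \neq j$ (uncorrelated).

The values of $y$ given various values of $x$ scatter about a line, with constant variance and no correlations among the departures from the line carried by different observations. This is quite simplistic, but very useful in many applications.

Note: the distribution of the errors is unspecified for now.

If we assumed a bell-shaped distribution for the errors (which we will do later), here is how the population picture would look:

[Figure: population picture with bell-shaped error distributions centered on the regression line; vertical axis: Hours (Y), horizontal axis: Number of Bids Prepared (X)]

$$E(y_i) = \beta_0 + \beta_1 x_i, \qquad \operatorname{Var}(y_i) = \sigma^2, \qquad \operatorname{Cor}(y_i, y_j) = 0$$

Interpretation of parameters:

- $\beta_1$ (slope): change in the mean of $y$ when $x$ changes by one unit.
- $\beta_0$ (intercept): if $x = 0$ is a meaningful value, the mean of $y$ when $x = 0$.
- $\sigma^2$ (error variance): scatter of $y$ about the regression line.

## Gauss-Markov theorem

Under the assumptions of our simple linear regression model, the slope and intercept of the least squares line are (point) estimates of the population slope and intercept, with the following very important properties.

GAUSS-MARKOV THEOREM: Under the conditions of the simple linear regression model,

- $b_1$ and $b_0$ are unbiased estimates for $\beta_1$ and $\beta_0$: $E(b_1) = \beta_1$, $E(b_0) = \beta_0$;
- they are the most accurate (smallest MSE, i.e. variance) among all unbiased estimates that can be computed as linear functions of the response values.

Linearity:

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \sum_{i=1}^{n} k_i y_i, \qquad k_i = \frac{x_i - \bar{x}}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$$

$$b_0 = \bar{y} - b_1 \bar{x} = \sum_{i=1}^{n} \Bigl(\frac{1}{n} - k_i \bar{x}\Bigr) y_i = \sum_{i=1}^{n} \tilde{k}_i y_i$$

## Example: human and chicken DNA

Simple linear regression for $y$ = log length ratio between human and chicken DNA, on $x$ = log large insertion ratio, as sampled on $n = 100$ genome windows.

Estimates from least squares:

- Line parameters: intercept 0.19210, slope 0.21777
- Error variance: 0.033 (on 98 degrees of freedom)
- Mean responses: $\hat{y}_1 = 0.41$, $\hat{y}_2 = 0.63$ (would you trust this?)

[Figure: log length ratio against log large inst ratio, with the fitted line]

## Example: skin melanoma mortality

Simple linear regression for $y$ = mortality rate due to malignant skin melanoma per 10 million people, on $x$ = latitude, as sampled on 49 US states.

| | State | LAT | MORT |
|---|---|---|---|
| 1 | Alabama | 33.0 | 219 |
| 2 | Arizona | 34.5 | 160 |
| 3 | Arkansas | 35.0 | 170 |
| 4 | California | 37.5 | 182 |
| 5 | Colorado | 39.0 | 149 |
| ... | ... | ... | ... |
| 49 | Wyoming | 43.0 | 134 |

Estimates from least squares:

- Line parameters: intercept 389.19, slope −5.977
- Error variance: 365.57 (on 47 degrees of freedom)
- Mean responses: $\hat{y}_{30} = 209.88$, $\hat{y}_{40} = 150.11$ (would you trust this?)

[Figure: Mortality (vertical axis) against Latitude (horizontal axis, 30 to 50), with the fitted line]

## Maximum likelihood estimation under normality

As before, $x_i$, $i = 1, \dots, n$, are fixed (or we condition on them). Assume the error distribution

$$\varepsilon_i, \; i = 1, \dots, n, \quad \text{iid } N(0, \sigma^2)$$

Then $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$, independent for $i = 1, \dots, n$.

Likelihood function:

$$L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl(-\frac{\bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)^2}{2\sigma^2}\Bigr) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Bigl(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2\Bigr)$$

Objective function:

$$\max_{\beta_0,\,\beta_1,\,\sigma^2} L(\beta_0, \beta_1, \sigma^2)$$

Solution (unique):

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = b_1, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = b_0$$

These are the same as from the least squares fit. For the variance,

$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (b_0 + b_1 x_i)\bigr)^2 = \frac{n-2}{n}\, s^2$$

where $s^2$ is the error variance estimate on $n - 2$ degrees of freedom.

## Some remarks

- A strong statistical association between $y$ and $x$ does not automatically imply causation (e.g. a functional relationship of linear form). $x$ can proxy the real causing variable, perhaps in a spurious fashion.
- In observational studies $x$ (likewise $y$) is not controlled by the researchers: we condition on the observed values of $x$.
- In experimental studies $x$ is controlled: we can consider the values of $x$ as fixed (although assignment of $x$ levels to units may be arranged at random in an experimental design). Experimental design facilitates causality assessment.
- Extrapolating a statistical association (e.g. a regression line) outside the range of the data on which it was fitted is dangerous: we don't really know how the association would be shaped where we didn't get to look. With an experimental design we can make sure we cover the range that is of interest.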
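The least squares formulas, the geometric properties of the residuals, and the relation between $s^2$ and the maximum likelihood variance estimate can be checked numerically. What follows is a minimal sketch in plain Python, not the notes' own code. It assumes the six states listed in the melanoma example (with latitudes read as 33.0, 34.5, and so on, an assumption about decimal points lost in extraction), and the function name `least_squares_line` is hypothetical.

```python
from statistics import fmean


def least_squares_line(x, y):
    """Return (b0, b1) minimizing sum((y_i - (b0 + b1 * x_i)) ** 2)."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept: line passes through (x_bar, y_bar)
    return b0, b1


# Assumption: only the six states listed in the notes, not all 49.
lat = [33.0, 34.5, 35.0, 37.5, 39.0, 43.0]
mort = [219.0, 160.0, 170.0, 182.0, 149.0, 134.0]

b0, b1 = least_squares_line(lat, mort)
fitted = [b0 + b1 * xi for xi in lat]
resid = [yi - fi for yi, fi in zip(mort, fitted)]

# Geometric properties: each of these sums should be zero up to rounding.
sum_e = sum(resid)
sum_xe = sum(xi * ei for xi, ei in zip(lat, resid))
sum_fe = sum(fi * ei for fi, ei in zip(fitted, resid))

# Error variance estimates: s^2 uses n - 2 degrees of freedom,
# the maximum likelihood estimate divides by n instead.
n = len(lat)
sse = sum(ei ** 2 for ei in resid)
s2 = sse / (n - 2)
sigma2_mle = sse / n

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"sum e = {sum_e:.2e}, sum x*e = {sum_xe:.2e}, sum yhat*e = {sum_fe:.2e}")
print(f"s^2 = {s2:.3f}, MLE sigma^2 = {sigma2_mle:.3f}")
```

Because only six of the 49 states are used, the estimates printed here will differ from the full-sample values quoted in the example; the point is only that the residual sums vanish and that the maximum likelihood variance equals $(n-2)/n$ times $s^2$.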
