Terminology and Simple Linear Regression Model
These five pages of class notes were uploaded by Andre Sõstar on Friday, August 28, 2015. They belong to MH3510 (Regression Analysis, Applied Mathematics) at Nanyang Technological University, Summer 2015.
MH3510 Regression Analysis
Nanyang Technological University
Instructor: Pan Guangming
Notes: Andre Sostar, 23/08/2015

1 Terminology

Definition 1. The variable which is of our primary interest is called the response variable (output variable, outputs, $Y$-variables, or dependent variable), whereas the remaining variables are called predictor variables (input variables, inputs, $X$-variables, regressors, or independent variables).

Example. Response: price. Predictors: size, building age, facilities, location.

Definition 2. Quantitative variables can be measured in numerical form, e.g. age, income, time, temperature, etc.

Definition 3. Qualitative variables are not numerical in nature, e.g. gender, categorized age, education level, type of crime committed, style of cuisine served in a restaurant, etc.

2 Simple linear regression

Simple linear regression is used to explain the relationship between two variables using a straight line. In our case we are interested in the relation between the response variable and the predictor variable.

2.1 Formal statement of the model

Definition 1. A simple linear regression model, where there is only one predictor variable and the regression function is linear, can be stated as follows:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i,$$
where $Y_i$ is the value of the response variable in the $i$-th trial, $\beta_0$ and $\beta_1$ are parameters called regression coefficients, $X_i$ is a known constant, namely the value of the predictor variable in the $i$-th trial, and $\varepsilon_i$ is a random error.

Example 1. A model relating a person's wage to observed education and other unobserved factors is
$$\text{Wage} = \beta_0 + \beta_1 X + \varepsilon,$$
where $Y$ is the amount of money earned per hour, $X$ is years of education, and $\varepsilon$ captures everything else that makes a person more valuable (previous experience, honors, etc.).

2.2 Method of least squares

To find "good" estimators of the regression parameters $\beta_0$ and $\beta_1$, we use the method of least squares. For the observations $(X_i, Y_i)$, $i = 1, \dots, n$, the method of least squares considers the deviation of $Y_i$ from its expected value,
$$Y_i - (\beta_0 + \beta_1 X_i).$$

Before stating the definition, we look at two important facts that are extremely useful.

Fact 1. If $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$, then
$$\sum_{i=1}^n (x_i - \bar{x}) = 0. \qquad (1)$$

Fact 2.
$$\sum_{i=1}^n (x_i - \bar{x})^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2. \qquad (2)$$

Definition 2. The least squares method requires that we consider the sum of the $n$ squared deviations, denoted $Q$:
$$Q = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2.$$

Using calculus, we can find the values of $\beta_0$ and $\beta_1$ that minimize $Q$. Differentiating,
$$\frac{\partial Q}{\partial \beta_0} = -2 \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i),$$
$$\frac{\partial Q}{\partial \beta_1} = -2 \sum_{i=1}^n X_i (Y_i - \beta_0 - \beta_1 X_i).$$

Setting these equal to 0 and simplifying gives us the normal equations:
$$\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i) = 0 \;\Rightarrow\; \sum_{i=1}^n Y_i = n\beta_0 + \beta_1 \sum_{i=1}^n X_i,$$
$$\sum_{i=1}^n X_i (Y_i - \beta_0 - \beta_1 X_i) = 0 \;\Rightarrow\; \sum_{i=1}^n X_i Y_i = \beta_0 \sum_{i=1}^n X_i + \beta_1 \sum_{i=1}^n X_i^2.$$

The LSEs of $\beta_0$ and $\beta_1$ are found by solving the normal equations for $\beta_0$, $\beta_1$:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \quad \text{and} \quad \hat{\beta}_1 = \frac{S_{XY}}{S_{XX}},$$
where the sample mean is $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$ and the sums of squares are defined as
$$S_{XX} = \sum_{i=1}^n (x_i - \bar{x})^2, \qquad S_{XY} = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}),$$
thus for $\hat{\beta}_1$ we can use
$$\hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{S_{XX}}.$$

A fitted regression line of the form
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i, \quad \text{or} \quad \hat{y}_i - \bar{y} = \hat{\beta}_1 (x_i - \bar{x}),$$
represents the mathematical regression equation for the data. It is used to illustrate the relationship between a predictor variable ($x$ scale) and a response variable ($y$ scale).

Theorem 1 (Gauss-Markov Theorem). Among all estimates that are linear combinations of $y_1, \dots, y_n$ and unbiased, the LSE has the smallest variance.

2.3 Properties of the fitted regression line

1. The sum of the LS residuals is zero, i.e. $\sum_{i=1}^n e_i = 0$.
2. The sum of the observed values equals the sum of the fitted values: $\sum_{i=1}^n Y_i = \sum_{i=1}^n \hat{y}_i$.
3. The sample covariance between the fitted values and the LS residuals is zero: $\sum_{i=1}^n \hat{y}_i e_i = 0$.
4. The regression line always goes through the point $(\bar{x}, \bar{y})$.

2.4 Estimation of the error term's variance $\sigma^2$

We need to calculate a sum of squared deviations, but must recognize that the $Y_i$ come from different probability distributions with different means that depend upon the level $X_i$. Thus the deviation of an observation $Y_i$ must be calculated around its own estimated mean. Hence the deviations are the residuals
$$e_i = Y_i - \hat{Y}_i,$$
and the appropriate sum of squares, denoted SSE (error sum of squares), is
$$SSE = \sum_{i=1}^n (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^n e_i^2.$$
The sum of squares SSE has $n - 2$ degrees of freedom,
hence
$$s^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^n (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum_{i=1}^n e_i^2}{n-2}.$$
$s^2$ is an unbiased estimate of $\sigma^2$; that is, $E(s^2) = \sigma^2$.
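As a sanity check, the closed-form least squares estimates and the fitted-line properties above can be implemented directly. The sketch below uses NumPy with made-up education/wage numbers (the data are hypothetical, chosen only to echo the wage example; they are not from the notes):

```python
import numpy as np

# Hypothetical data: years of education (x) vs. hourly wage (y).
x = np.array([8.0, 10.0, 12.0, 12.0, 14.0, 16.0, 16.0, 18.0])
y = np.array([9.5, 11.0, 13.2, 12.8, 15.1, 17.0, 16.4, 19.3])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# Sums of squares S_XX and S_XY from the derivation above.
S_xx = np.sum((x - x_bar) ** 2)
S_xy = np.sum((x - x_bar) * (y - y_bar))

# Least squares estimates: slope = S_XY / S_XX, intercept = y_bar - slope * x_bar.
beta1_hat = S_xy / S_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Fitted values and residuals.
y_hat = beta0_hat + beta1_hat * x
e = y - y_hat

# Error variance estimate: s^2 = SSE / (n - 2).
sse = np.sum(e ** 2)
s2 = sse / (n - 2)

print(beta0_hat, beta1_hat, s2)
```

The residuals computed this way sum to (numerically) zero, the sums of observed and fitted values agree, and the line passes through $(\bar{x}, \bar{y})$, matching the listed properties; the same slope and intercept are returned by `np.polyfit(x, y, 1)`.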
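The unbiasedness claim $E(s^2) = \sigma^2$ can also be checked empirically. This is a minimal Monte Carlo sketch, assuming hypothetical true parameters $\beta_0 = 2$, $\beta_1 = 0.5$, $\sigma = 1.5$ (none of these values come from the notes): simulate many datasets from the model, refit each, and average the resulting $s^2$ values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters for the simulation.
beta0, beta1, sigma = 2.0, 0.5, 1.5
x = np.linspace(0.0, 10.0, 20)
n = len(x)

s2_values = []
for _ in range(20000):
    # Generate data from the model Y_i = beta0 + beta1 * X_i + eps_i.
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)

    # Closed-form least squares fit.
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()

    # s^2 = SSE / (n - 2) for this replication.
    e = y - (b0 + b1 * x)
    s2_values.append(np.sum(e ** 2) / (n - 2))

# Averaged over many replications, s^2 should be close to sigma^2 = 2.25.
print(np.mean(s2_values))
```

Note the division by $n - 2$ rather than $n$: dividing by $n$ would systematically underestimate $\sigma^2$, because two degrees of freedom are spent estimating $\beta_0$ and $\beta_1$.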