ADV APPL LIN MODELS BIOST 570
These 21 pages of class notes were uploaded by Ramona Leannon on Wednesday, September 9, 2015. The notes belong to BIOST 570 at the University of Washington, taught by Staff in Fall.
Smoothing
Thomas Lumley, BIOST 570, 2005 (Davison §10.7)

Regression splines

Recall the problem of fitting a straight line with a breakpoint at a prespecified value t. This is a linear regression with predictors max(x, t) and min(x, t), where the coefficients are the slopes of the two line segments. Alternatively, use x and min(x - t, 0) as predictors; then the coefficient of x is the slope of one segment and the coefficient of min(x - t, 0) is the difference in slopes.

This is the simplest case of regression splines: piecewise polynomials of degree d with d - 1 continuous derivatives at the breakpoints. Arguably, using indicator variables for ranges of x gives d = 0.

Example: CHS data, risk of CHD increasing with blood pressure. Linear fit:

  > a
  Call:  glm(formula = CHD ~ SYSBP, family = binomial, data = chs)

  Coefficients:
  (Intercept)        SYSBP
    -1.824782     0.003149

  Degrees of Freedom: 499 Total (i.e. Null);  498 Residual
  Null Deviance:     500.4
  Residual Deviance: 500    AIC: 504

Example: 120 is borderline between normal and prehypertension, so put a breakpoint there:

  > summary(b)
  Call:
  glm(formula = CHD ~ pmin(SYSBP, 120) + pmax(SYSBP, 120),
      family = binomial, data = chs)

  Deviance Residuals:
      Min       1Q   Median       3Q      Max
  -0.6931  -0.6872  -0.6808  -0.5663   2.1448

  Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
  (Intercept)      -5.286022   3.145163  -1.681   0.0928 .
  pmin(SYSBP, 120)  0.034257   0.028569   1.199   0.2305
  pmax(SYSBP, 120) -0.001072   0.006297  -0.170   0.8648
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  (Dispersion parameter for binomial family taken to be 1)

      Null deviance: 500.40  on 499  degrees of freedom
  Residual deviance: 498.65  on 497  degrees of freedom
  AIC: 504.65

  Number of Fisher Scoring iterations: 4

Example: reparametrize the model so that one coefficient measures the change in slope at the breakpoint, giving a test for linearity:

  > summary(d)
  Call:
  glm(formula = CHD ~ pmin(SYSBP - 120, 0) + SYSBP,
      family = binomial, data = chs)

  Deviance Residuals:
      Min       1Q   Median       3Q      Max
  -0.6931  -0.6872  -0.6808  -0.5663   2.1448

  Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
  (Intercept)          -1.175165   0.914904  -1.284    0.199
  pmin(SYSBP - 120, 0)  0.035329   0.031581   1.119    0.263
  SYSBP                -0.001072   0.006297  -0.170    0.865

  (Dispersion parameter for binomial family taken to be 1)

      Null deviance: 500.40  on 499  degrees of freedom
  Residual deviance: 498.65  on 497  degrees of freedom
  AIC: 504.65

  Number of Fisher Scoring iterations: 4
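The two parametrizations are linear reparametrizations of each other, so they give identical fitted values. A quick check in R (with simulated data, not the CHS data) confirms this, and shows that the pmin(x - t, 0) coefficient equals the difference of the two segment slopes:

```r
# Simulated broken-stick data (hypothetical; not the CHS data),
# with a breakpoint at t = 120.
set.seed(1)
x <- runif(200, 80, 200)
y <- 0.03 * pmin(x, 120) + 0.001 * pmax(x - 120, 0) + rnorm(200, sd = 0.1)

# Parametrization 1: the coefficients are the slopes of the two segments.
fit1 <- lm(y ~ pmin(x, 120) + pmax(x, 120))
# Parametrization 2: one slope plus the change in slope at the breakpoint.
fit2 <- lm(y ~ pmin(x - 120, 0) + x)

# Same column space, so identical fitted values:
all.equal(fitted(fit1), fitted(fit2))
# The pmin(x - 120, 0) coefficient is the difference of the segment slopes:
all.equal(unname(coef(fit2)[2]),
          unname(coef(fit1)[2] - coef(fit1)[3]))
```

Since the two fits are identical as curves, testing the pmin(x - 120, 0) coefficient against zero in the second parametrization is exactly the test for a change of slope.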
Spline bases

One general basis set of predictors for a regression spline of degree d with breakpoints ("knots") t_1, t_2, ... is

  1, x, x^2, ..., x^d, max(0, x - t_1)^d, max(0, x - t_2)^d, ...

- the spline can exactly reproduce any polynomial of degree d or lower
- all basis functions have d - 1 continuous derivatives everywhere and d continuous derivatives except at the knots, so any linear combination is also C^(d-1) everywhere and piecewise C^d
- the basis at a point x depends only on the cutpoints
- the basis functions are highly collinear, so the fit will be numerically unstable

We actually use reparametrizations that do not have the collinearity problem. The standard B-spline basis, produced by bs() in R and S-PLUS, has nicer numerical properties, including local support: only d + 1 of the predictors are non-zero at any x.

For splines with d > 1 the coefficients are not readily interpretable; use termplot() to get a picture of the curve.

Uses of splines

- When nonlinearity is of interest, splines will fit it. If you need a simple functional form, e.g. for extrapolation, you can look at the spline fit and choose one.
- Regression spline modelling of confounders is a simple way to avoid expending effort on choosing a functional form and investigating linearity.
- Linear splines are useful for testing nonlinearity in a way that makes sense, at least to medical researchers.

Quadratic splines fit better than linear splines. Cubic splines are visually smoother but don't give any better fit. In the absence of any better choice, put the knots at quintiles.

Local polynomials

A regression spline is still described by a finite set of parameters, which makes inference easier. A more flexible class of smoothers is local polynomials. Rather than having a fixed set of k polynomials for k disjoint intervals, a local polynomial smoother uses a different interval for every x value. The resulting curve is more flexible and does not depend on prespecified knots, though in practice the difference is usually small.

Local linear regression

A local linear regression smoother estimates E[Y | X = x0] by a weighted regression, giving more weight to points near x0. Define weights by

  w_i(x0; h) ∝ K((x_i - x0) / h)

where K is some kernel function, typically a symmetric probability density function, and h is the bandwidth. Examples:

- K(z) = φ(z), the Normal density
- K(z) = (1 - |z|^3)^3 for |z| < 1, the tricube function used in loess
- K(z) = 1 - z^2 for |z| < 1, the Epanechnikov kernel
- K(z) = 1{|z| < 1}, the uniform or rectangular kernel

The choice of kernel is much less important than the choice of bandwidth, but choosing a smoother kernel will produce a visually smoother result. Bandwidth is standardized across kernels by declaring the bandwidths of two kernels equal when their standard deviations are equal. There are three standard conventions: so that the standard Normal has bandwidth 1, so that the rectangular kernel has bandwidth 1, or so that the rectangular kernel has bandwidth 2.

Once the kernel and bandwidth are chosen and we have the weights, we estimate a(x0) and b(x0) by linear regression with weights w_i(x0), and estimate E[Y | X = x0] by the fitted value â(x0) + b̂(x0) x0. We need a separate linear regression for each point, so this is relatively slow.

Local linear vs local constant

The simplest local regression estimator uses just an intercept in the local regression model:

  Ê[Y | X = x0] = Σ_i y_i w_i(x0) / Σ_i w_i(x0)

This local constant estimator performs badly near the edges of the data, because it must average in data from only one direction. Suppose Y is related linearly to x:

  x <- 1:10
  y <- 1:10
  plot(x, y)
  lines(ksmooth(x, y, "normal", bandwidth = 5))

[Figure: the local constant fit flattens near x = 1 and x = 10, bending away from the straight line.]

Using a linear or higher-order polynomial removes this problem, since the local model can extrapolate towards the edge of the data:

  x <- 1:10
  y <- 1:10
  plot(x, y)
  lines(ksmooth(x, y, "normal", bandwidth = 5))
  lines(lowess(x, y, iter = 0), col = "purple", lty = 2, lwd = 2)

[Figure: the local linear (lowess) fit tracks the straight line all the way to the edges.]
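The two estimators can be written directly in a few lines; a sketch (with hypothetical function names, not production code) makes the edge behaviour concrete: on exactly linear data the local linear fit reproduces the line, while the local constant fit is pulled inward at the boundaries:

```r
# Local constant (kernel-weighted average) and local linear smoothers
# with a Normal kernel; minimal sketches, not optimized implementations.
loc_const <- function(x, y, x0, h) {
  sapply(x0, function(p) {
    w <- dnorm((x - p) / h)
    sum(w * y) / sum(w)                      # weighted average near p
  })
}
loc_linear <- function(x, y, x0, h) {
  sapply(x0, function(p) {
    w <- dnorm((x - p) / h)
    coef(lm(y ~ I(x - p), weights = w))[1]   # local intercept = fit at p
  })
}

x <- 1:10
y <- 1:10                                    # Y exactly linear in x
loc_linear(x, y, x, h = 2)                   # reproduces y exactly
loc_const(x, y, x, h = 2)                    # biased towards the centre at the edges
```

Because weighted least squares fits exactly linear data exactly, whatever the (positive) weights, the local linear estimator has no edge bias here; the local constant estimator at x = 1 must average in points that all lie above it.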
Bandwidth choice

Davison (pp. 529-30) shows how to calculate the asymptotic bias and variance of a regression smoother. The argument goes by considering a Taylor expansion: a local linear smoother will fit the linear part of the expansion, leaving the quadratic and higher-order terms as bias. The bias over a window of width h is O(h^2). Since there are roughly nh points contributing to the fit, the variance is of order O(1/(nh)). So we need nh → ∞ and h → 0 to get consistency, and the asymptotically optimal bandwidth will have variance and squared bias of the same order:

  (h^2)^2 ≍ 1/(nh),  i.e.  h ∝ n^(-1/5)

The same analysis can be done for a local constant regression smoother. There the bias is O(h^2) in the middle of the data but O(h) near the edges; the variance is still O(1/(nh)). A locally constant smoother thus needs a smaller bandwidth near the edge of the data, and even then will have higher error at the edges.

Since the bias is not asymptotically smaller than the standard error, we cannot readily get confidence intervals for E[Y | x]. The smoother is, however, a linear function of Y, since it is a set of fitted regression values and these are linear functions of Y. Write the smoother as a functional s_h(Y), and write s_h(E[Y | X = x]) for the smoother applied to the true mean. This incorporates the bias, so smoothing gives an asymptotically unbiased estimate of s_h(E[Y | X = x]), and we can get confidence intervals for it from the confidence intervals in the individual linear regressions.

Cross-validation

The asymptotic analysis only shows us that the bandwidth should be n^(-1/5) times an unknown constant. A more practical approach to accurate smoothing is leave-one-out cross-validation: fit the smoother to all but one data point, compute the error at that data point, repeat for all data points and for a range of h, and choose the best h. For many practical purposes the "Goldilocks" method of bandwidth choice, trying values until one looks just right, is sufficient.
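The leave-one-out procedure is easy to sketch for a local linear smoother (illustrative code with simulated data and hypothetical helper names, not an efficient implementation):

```r
# Leave-one-out cross-validation for the bandwidth of a local linear
# smoother with a Normal kernel.
cv_score <- function(x, y, h) {
  err <- sapply(seq_along(x), function(i) {
    w <- dnorm((x[-i] - x[i]) / h)
    fit <- lm(y[-i] ~ I(x[-i] - x[i]), weights = w)
    y[i] - coef(fit)[1]        # prediction error at the held-out point
  })
  mean(err^2)
}

set.seed(2)
x <- sort(runif(100))
y <- sin(2 * pi * x) + rnorm(100, sd = 0.2)

h_grid <- c(0.02, 0.05, 0.1, 0.2, 0.5)
scores <- sapply(h_grid, function(h) cv_score(x, y, h))
h_grid[which.min(scores)]      # the bandwidth with the smallest CV error
```

The CV curve is typically U-shaped: very small h gives high-variance predictions, very large h (here 0.5, which smooths away the sine wave) gives high-bias ones, and the minimum sits in between.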