
# Nonlinear Statistical Models for Univariate and Multivariate Response ST 762


These 57 pages of class notes were uploaded by Jordane Kemmer on Thursday, October 15, 2015. They belong to ST 762 at North Carolina State University, taught by Peter Bloomfield in Fall.


Date Created: 10/15/15

## The Folklore Theorem

- Return to the general mean-variance specification

$$E(Y_j \mid x_j) = f(x_j, \beta), \qquad \operatorname{var}(Y_j \mid x_j) = \sigma^2 g^2(\beta, \theta, x_j).$$

- Consider the iterative GLS scheme:
  - Preliminary estimators $\hat\beta^{(0)}$ and $\hat\theta^{(0)}$.
  - Update $\hat\theta$ and $\hat\sigma$ by solving

    $$\sum_{j=1}^{n} \left[ \frac{\{Y_j - f(x_j, \hat\beta)\}^2}{\sigma^2 g^2(\hat\beta, \theta, x_j)} - 1 \right] \begin{pmatrix} 1 \\ \nu_\theta(\hat\beta, \theta, x_j) \end{pmatrix} = 0,$$

    where $\nu_\theta(\beta, \theta, x) = \partial \log g(\beta, \theta, x) / \partial\theta$.
  - Then update $\hat\beta$ by solving

    $$\sum_{j=1}^{n} g^{-2}(\hat\beta, \hat\theta, x_j) \{Y_j - f(x_j, \beta)\} f_\beta(x_j, \beta) = 0.$$

  - Note that the argument of $g$ is held at $\hat\beta$ in this update.
  - Iterate the last two steps $C$ times, possibly to convergence ($C = \infty$).

### The theorem

- Suppose that $n^{1/2}(\hat\beta^{(0)} - \beta_0) = O_p(1)$ and $n^{1/2}(\hat\theta - \theta_0) = O_p(1)$, i.e. $\hat\beta^{(0)}$ and $\hat\theta$ are $\sqrt{n}$-consistent.
- Then, under suitable regularity conditions,

$$n^{1/2}(\hat\beta^{(C)} - \beta_0) \xrightarrow{d} N(0, \sigma_0^2 \Sigma_{WLS}).$$

- Here

$$\Sigma_{WLS} = \lim_{n \to \infty} (n^{-1} X^T W X)^{-1},$$

where

$$X = \begin{pmatrix} f_\beta(x_1, \beta_0)^T \\ f_\beta(x_2, \beta_0)^T \\ \vdots \\ f_\beta(x_n, \beta_0)^T \end{pmatrix}, \qquad W = \operatorname{diag}(w_1, w_2, \dots, w_n), \qquad w_j = \frac{1}{g^2(\beta_0, \theta_0, x_j)}.$$

### Notes

- The result is true for any $C$:
  - $C = 1$ for a one-step estimator;
  - $C = \infty$ for a converged iterated estimator.
- This is the same large-sample distribution as for the WLS estimator with optimal weights.
- That is, to this order of approximation, using estimated weights gives the same sampling distribution as if the weights were known.
- But we have assumed that the variance function $g$ is correctly specified.
- The folklore result implies that

$$\hat\beta \mathrel{\dot\sim} N\left(\beta_0,\; \sigma_0^2 \{X(\beta_0)^T W(\beta_0, \theta_0) X(\beta_0)\}^{-1}\right).$$

- We use this by plugging in $\hat\beta$, $\hat\theta$ and

$$\hat\sigma^2 = \frac{1}{n - p} \sum_{j=1}^{n} \frac{\{Y_j - f(x_j, \hat\beta)\}^2}{g^2(\hat\beta, \hat\theta, x_j)}.$$

- This is the approximation used by SAS's `proc nlin` and the R function `nls`.

## Subject Specific Modeling

- Example: Theophylline concentration-time profiles.

```r
par(mfrow = c(2, 2))
for (subj in c(1, 6, 10, 12))
  plot(Theoph[Theoph$Subject == subj, c("Time", "conc")])
```

- Pharmacokinetics suggests the subject-specific model

$$E(Y_{ij} \mid z_{ij}, \beta_i) = \frac{D_i \beta_{3i}}{\beta_{2i}(\beta_{3i} - \beta_{1i})} \left( e^{-\beta_{1i} t_{ij}} - e^{-\beta_{3i} t_{ij}} \right).$$

- Here:
  - $z_{ij}$ contains $D_i$ = dose for the $i$th subject and $t_{ij}$ = time of the $j$th measurement for the $i$th subject; $D_i$ is the same for all measurements, but is included in $z_{ij}$ instead of $a_i$ for convenience;
  - the vector $\beta_i = (\beta_{1i}, \beta_{2i}, \beta_{3i})^T$ consists of parameters specific to subject $i$.
- This may be written

$$E(Y_{ij} \mid z_{ij}, \beta_i) = f(z_{ij}, \beta_i).$$

- Because $\beta_i$ is associated with the randomly selected $i$th subject, it is a random variable, and the model for that subject is conditional on the value of $\beta_i$.
- We also need some assumptions about how $\beta_i$ varies from subject to subject.
- We might assume $\beta_i \sim N(\beta, D)$, or equivalently $\beta_i = \beta + b_i$, where $b_i \sim N(0, D)$.
- Here $\beta$ is the mean parameter vector across the population of subjects and $D$ quantifies the subject-to-subject variation.
- If the subjects were in two groups, e.g. smokers vs non-smokers, the mean might depend on the group to which the subject belongs:

$$\beta_i = \delta_i \beta^{(1)} + (1 - \delta_i) \beta^{(0)} + b_i,$$

where $\delta_i$ is an indicator variable for smokers.

- More generally, $\beta_i = A_i \beta + b_i$, where $A_i$ is a subject-specific design matrix and $\beta$ is the vector of all relevant parameters.
- For instance, for the theophylline data, $\beta_i = A_i \beta + b_i$ with $A_i$ a matrix whose entries are 0, 1, $w_i$ and $c_i$.
- Here $w_i$ = $i$th subject's body weight and $c_i$ = $i$th subject's creatinine clearance rate are components of $a_i$.
- The $\beta$'s have pharmacokinetic interpretation.

### Within-individual variation

- Need to specify $\operatorname{var}(Y_i \mid z_i, \beta_i)$.
- For examples like the theophylline data, we might assume

$$\operatorname{var}(Y_{ij} \mid z_{ij}, \beta_i) = \sigma_P^2 + \sigma_M^2 g^2(\beta_i, \theta, z_{ij}),$$

where:
  - $\sigma_P^2$ represents natural variation in the true concentration;
  - $\sigma_M^2 g^2(\beta_i, \theta, z_{ij})$ represents measurement error.
- To complete the specification of $\operatorname{var}(Y_i \mid z_i, \beta_i)$, we also need covariances, or alternatively correlations.
- We might assume that the biological variations have some correlation matrix $\Gamma(\alpha, z_i)$ and that measurement errors are uncorrelated.
- Then

$$\operatorname{var}(Y_i \mid z_i, \beta_i) = \sigma_P^2 \Gamma(\alpha, z_i) + \sigma_M^2 W_i^{-1}(\beta_i, \theta, z_i),$$

where

$$W_i(\beta_i, \theta, z_i) = \operatorname{diag}\left\{ \frac{1}{g^2(\beta_i, \theta, z_{i1})}, \frac{1}{g^2(\beta_i, \theta, z_{i2})}, \dots, \frac{1}{g^2(\beta_i, \theta, z_{i n_i})} \right\}.$$

### A general form

- Stage 1: Individual model.
  - With $\beta_i = A_i \beta + b_i$ and $a_i$ the among-subject covariates on which $A_i$ depends, we may also write

    $$E(Y_i \mid z_i, \beta_i) = E(Y_i \mid x_i, b_i) = f_i(x_i, \beta, b_i).$$

  - For the variance,

    $$\operatorname{var}(Y_i \mid z_i, \beta_i) = R_i(\beta_i, \gamma, z_i) = R_i(\beta, \gamma, x_i, b_i).$$

- Stage 2: Population model.

  $$\beta_i = A_i \beta + b_i,$$

  where $b_1, b_2, \dots, b_m$ are usually assumed to be iid, independent of $x_i$, with $E(b_i) = 0$, $\operatorname{var}(b_i) = D$.

### Marginal model

- The two-stage model implies that

$$E(Y_i \mid x_i) = E\{E(Y_i \mid x_i, b_i) \mid x_i\} = E\{f_i(x_i, \beta, b_i) \mid x_i\}.$$

- If $f_i$ is linear in $b_i$, this is just $f_i(x_i, \beta, 0)$.
- That is, for a subpopulation of subjects all with the same covariates $x_i$, the mean response across the population satisfies the same model as the mean response for a single subject with the population-average values of the parameters.
- For a nonlinear model, this is not generally the case, and typically $E(Y_i \mid x_i)$ cannot be written in closed form.
- Similarly, $\operatorname{var}(Y_i \mid x_i)$ is complicated.

### Population-averaged (marginal) modeling

- In some problems, interest focuses on the dependence of $E(Y_i \mid x_i)$ on $x_i$, not on a subject-specific model.
- E.g. the Six Cities study:
  - $Y_{ij}$ = indicator of wheezing in the $i$th child at the $j$th observation time $t_{ij}$;
  - $\delta_{ij}$ = indicator of mother's smoking at that time;
  - other covariates: location, gender.
- We could use a logistic regression model:

$$E(Y_{ij} \mid x_i) = \frac{\exp(\beta_0 + \beta_1 t_{ij} + \beta_2 \delta_{ij} + a_i^T \beta_3)}{1 + \exp(\beta_0 + \beta_1 t_{ij} + \beta_2 \delta_{ij} + a_i^T \beta_3)}.$$

- Since $Y_{ij}$ is Bernoulli,

$$\operatorname{var}(Y_{ij} \mid x_i) = E(Y_{ij} \mid x_i)\{1 - E(Y_{ij} \mid x_i)\},$$

and we need only to specify correlations.

### A general form

- For the mean:

$$E(Y_i \mid x_i) = f_i(x_i, \beta).$$

- For the variances and covariances:

$$\operatorname{var}(Y_i \mid x_i) = V_i(\beta, \xi, x_i).$$

- Note: $f_i$ and $\beta$ do not have the same interpretation as in a subject-specific model.

### Subject-Specific versus Population-Averaged models

- Parameters often have a more direct subject-matter interpretation in the Subject-Specific (SS) approach.
- Fitting and making inferences about Population-Averaged (PA) models are direct extensions of univariate methods, but much harder for SS models.

### Specifying Variance-Covariance Structure

- Variances may be suggested by the nature of the response, or by exploration, as in the univariate case.
- Variance-covariance structure is usually completed by specifying correlations.
- Examples:
  - unstructured;
  - compound symmetry;
  - $m$-dependent;
  - AR(1);
  - exponential (simplifies to AR(1) for equally spaced times);
  - Gaussian.

### Notes

- For many types of response, the correlation matrix need only be nonnegative definite.
- For others, notably binary (Bernoulli) responses, the means may impose constraints on the correlations.
- We often ignore these constraints, and may in fact use a working variance-covariance structure that is infeasible for the type of response.

## Generalized Linear Model

- Certain nonlinear models with a specific structure arise from using linear modeling with a parent distribution in the exponential family.
- If the linear part is replaced by a more general nonlinear specification, the result is a special case of our general mean-variance specification

$$E(Y \mid x) = f(x, \beta), \qquad \operatorname{var}(Y \mid x) = \sigma^2 g^2(\beta, \theta, x).$$

- Estimation may also be carried out using the GLS estimation equations.

### The Scaled Exponential Family

- $Y$ has a scaled exponential family distribution if its density or probability mass function is of the form

$$f(y; \xi, \sigma) = \exp\left\{ \frac{y\xi - b(\xi)}{\sigma^2} + c(y, \sigma) \right\}.$$

- $\xi$ is the canonical parameter and $\sigma$ is the scale parameter.
- If $\sigma^2$ is known, this is the usual one-parameter exponential family with canonical parameter $\xi$.
- If $\sigma^2$ is unknown, it may or may not be the usual two-parameter exponential family.
- Moments:

$$E(Y) = b'(\xi), \qquad \operatorname{var}(Y) = \sigma^2 b''(\xi).$$

- If $E(Y) = \mu = b'(\xi)$, then $\xi = (b')^{-1}(\mu)$.
- The function $(b')^{-1}(\cdot)$ is called the canonical link function, because it links the canonical parameter $\xi$ to the mean $\mu$.
- Also

$$\operatorname{var}(Y) = \sigma^2 b''\{(b')^{-1}(\mu)\} = \sigma^2 g^2(\mu),$$

so the variance depends on the mean in a specific way.

- Examples of the scaled exponential family:

| Distribution | $b(\xi)$ | $\xi(\mu)$ | $g^2(\mu)$ |
|---|---|---|---|
| Normal | $\xi^2/2$ | $\mu$ | $1$ |
| Poisson | $\exp(\xi)$ | $\log \mu$ | $\mu$ |
| Gamma | $-\log(-\xi)$ | $-1/\mu$ | $\mu^2$ |
| Inverse Gaussian | $-(-2\xi)^{1/2}$ | $-1/(2\mu^2)$ | $\mu^3$ |
| Binomial | $\log(1 + e^{\xi})$ | $\log\{\mu/(1 - \mu)\}$ | $\mu(1 - \mu)$ |

### Sufficiency

- If $Y_1, Y_2, \dots, Y_n$ is a random sample from a member of this family, the log-likelihood is

$$\log L = \sum_{j=1}^{n} \left\{ \frac{Y_j \xi - b(\xi)}{\sigma^2} + c(Y_j, \sigma) \right\} = \frac{1}{\sigma^2} \left\{ \xi \sum_{j=1}^{n} Y_j - n b(\xi) \right\} + \sum_{j=1}^{n} c(Y_j, \sigma),$$

so if $\sigma^2$ is known, $\sum Y_j$ is sufficient for $\xi$.

- Also, if $Y_1, Y_2, \dots, Y_n$ are independent, but in the distribution of $Y_j$, $\xi$ is replaced by $\xi_j = x_j^T \beta$, the log-likelihood is

$$\log L = \frac{1}{\sigma^2} \left\{ \beta^T \sum_{j=1}^{n} x_j Y_j - \sum_{j=1}^{n} b(x_j^T \beta) \right\} + \sum_{j=1}^{n} c(Y_j, \sigma),$$

so now $\sum x_j Y_j$ is sufficient for $\beta$.

- But note that

$$E(Y_j \mid x_j) = b'(\xi_j) = b'(x_j^T \beta),$$

so this is a conventional linear model only if $b'(\xi) = \xi$, i.e. for the normal distribution.

- Otherwise, it is a generalized linear model.
- Note that $b'$ is determined by the distribution. We can replace it by a different function,

$$E(Y_j \mid x_j) = f(x_j^T \beta),$$

and it is still called a generalized linear model.

- Because the link $f^{-1}$ is no longer the canonical link, we lose sufficiency (not a big deal).
- R and SAS support fitting these models with the link function chosen from a list.

### Example: Six Cities Wheezing data

- Response: child wheezes at age 9 (0 or 1).
- Predictor: mother's smoking status (0 = none, 1 = moderate, 2 = heavy).
- Possible covariate: community (Portage or Kingston).
- Model: $Y_j \sim \text{Bernoulli}(\mu_j)$.
- Canonical link:

$$\log\left( \frac{\mu_j}{1 - \mu_j} \right) = x_j^T \beta, \qquad \text{or} \qquad E(Y_j \mid x_j) = \mu_j = \frac{\exp(x_j^T \beta)}{1 + \exp(x_j^T \beta)}.$$

- This is logistic regression.
- Alternative link: the probit function, $\mu_j = \Phi(x_j^T \beta)$.

### Generalized Nonlinear Model

- We may want a more general specification for the conditional mean:

$$E(Y_j \mid x_j) = f(x_j, \beta).$$

- This is consistent with the scaled exponential family if $\xi_j$ satisfies $b'(\xi_j) = f(x_j, \beta)$.
- The mean-variance relationship is still determined by the distribution:

$$\operatorname{var}(Y_j \mid x_j) = \sigma^2 g^2\{E(Y_j \mid x_j)\} = \sigma^2 g^2\{f(x_j, \beta)\}.$$

## Specifying the Variance Function

- Recall: in the linear model, the studentized residuals are

$$b_j = \frac{r_j}{\hat\sigma (1 - h_{jj})^{1/2}},$$

where $r_j$ is the $j$th residual $Y_j - x_j^T \hat\beta$.

- Here $h_{jj}$, the $j$th diagonal entry in the hat matrix $H$, measures the leverage of the $j$th observation.
- If, still in the linear model, we have a variance function $\operatorname{var}(Y \mid x) = \sigma^2 g^2(\beta, \theta, x)$ with scalar $\theta$ and $g(\beta, 0, x) = 1$, we can show that

$$\operatorname{var}(b_j) \approx 1 + 2\theta\, \nu_\theta(\beta, 0, x_j)(1 - h_{jj}).$$

- This suggests that the graph of $b_j^2$ plotted against $\nu_\theta(\hat\beta, 0, x_j)(1 - h_{jj})$ should have slope $2\theta$.
- The graph gives a diagnostic plot for non-constant versus constant variance, unaffected by design effects (leverage).
- In the nonlinear case, use a linear approximation

$$r \approx (I_n - H)(Y - f),$$

where $H$ is the approximate hat matrix $H = X(X^T X)^{-1} X^T$ and $X$ is the gradient matrix with rows $f_\beta(x_1, \beta)^T, \dots, f_\beta(x_n, \beta)^T$.

- We can use the same approach as for the linear model, with approximate leverage values $h_{jj}(\hat\beta)$.

### Did it Work?

- Based on the original OLS fit, we decide on a tentative form for $g(\beta, \theta, x_j)$ and refit, e.g. by GLS-PL.
- Define weighted residuals

$$r_{\text{weighted},j} = \frac{Y_j - f(x_j, \hat\beta)}{g(\hat\beta, \hat\theta, x_j)}$$

and standardized weighted residuals $r_{\text{weighted},j} / \hat\sigma$.

- We can use these to explore mis-specification.

### What about leverage?

- As a vector, $r_{\text{weighted}} = W^{1/2}(Y - f)$, evaluated at the estimates, where

$$W^{1/2} = \operatorname{diag}\left\{ \frac{1}{g(\hat\beta, \hat\theta, x_1)}, \frac{1}{g(\hat\beta, \hat\theta, x_2)}, \dots, \frac{1}{g(\hat\beta, \hat\theta, x_n)} \right\}.$$

- Write $Y_w = W^{1/2} Y$, $X_w = W^{1/2} X$, $f_w = W^{1/2} f$.
- Then another linear approximation gives

$$r_{\text{weighted}} = Y_w - f_w(\hat\beta) \approx (I_n - H_w)(Y_w - f_w).$$

- Here $H_w$ is again an approximate hat matrix, $H_w = X_w (X_w^T X_w)^{-1} X_w^T$.
- Studentized weighted residuals are therefore

$$b_{\text{weighted},j} = \frac{r_{\text{weighted},j}}{\hat\sigma (1 - h_{w,jj})^{1/2}}.$$

## REML Estimation of Variance

- The above treatment of leverage effects on the variances of residuals can be used to modify variance estimation.
- Recall the PL estimating equations for $\sigma$ and $\theta$:

$$\sum_{j=1}^{n} \left[ \frac{\{Y_j - f(x_j, \hat\beta)\}^2}{\sigma^2 g^2(\hat\beta, \theta, x_j)} - 1 \right] \begin{pmatrix} 1 \\ \nu_\theta(\hat\beta, \theta, x_j) \end{pmatrix} = \sum_{j=1}^{n} \left( \frac{r_{\text{weighted},j}^2}{\sigma^2} - 1 \right) \begin{pmatrix} 1 \\ \nu_\theta(\hat\beta, \theta, x_j) \end{pmatrix} = 0.$$

- Motivation: when evaluated with the true values of all parameters,

$$E\left( \frac{r_{\text{weighted},j}^2}{\sigma^2} \right) = 1.$$

- But when estimated parameters are used,

$$E\left( \frac{r_{\text{weighted},j}^2}{\sigma^2} \right) \approx 1 - h_{w,jj}.$$

- Modified estimating equations:

$$\sum_{j=1}^{n} \left\{ \frac{r_{\text{weighted},j}^2}{\sigma^2} - (1 - h_{w,jj}) \right\} \begin{pmatrix} 1 \\ \nu_\theta(\hat\beta, \theta, x_j) \end{pmatrix} = 0.$$

- Solution for $\sigma^2$ (using $\sum_j h_{w,jj} = p$):

$$\hat\sigma^2 = \frac{1}{n - p} \sum_{j=1}^{n} r_{\text{weighted},j}^2.$$

- That is, the modified PL equations yield the bias-adjusted estimator of $\sigma^2$.
- Recall that the original PL estimating equations arise from maximizing, for fixed $\hat\beta$, the normal log-likelihood

$$PL(\hat\beta, \theta, \sigma) = -n \log \sigma - \sum_{j=1}^{n} \log g(\hat\beta, \theta, x_j) - \frac{1}{2} \sum_{j=1}^{n} \frac{\{Y_j - f(x_j, \hat\beta)\}^2}{\sigma^2 g^2(\hat\beta, \theta, x_j)}.$$

- The modified equations arise in the same way from maximizing

$$PL(\hat\beta, \theta, \sigma) + p \log \sigma - \frac{1}{2} \log \det\{X(\hat\beta)^T W(\hat\beta, \theta) X(\hat\beta)\}$$

(not obvious: an exercise for the reader).

- Terminology:
  - The notes suggest the extra terms as a penalty term that imposes a restriction on the solution; hence REML = REstricted ML.
  - Modified likelihoods of this form were first suggested based on the marginal distribution of residuals from a linear fit; hence REML = REsidual ML.
  - In some cases (linear mixed model, generalized linear model), they may be derived from the conditional distribution of the observations, conditioned on the sufficient statistic for $\beta$. This is a partial or restricted likelihood.

## Multivariate Responses

- In the general mean-variance specification

$$E(Y_j \mid x_j) = f(x_j, \beta), \qquad \operatorname{var}(Y_j \mid x_j) = \sigma^2 g^2(\beta, \theta, x_j),$$

we have assumed that $Y_1, Y_2, \dots, Y_n$ are conditionally independent, conditioning on $x_1, x_2, \dots, x_n$.

- In many situations, this assumption may fail:
  - clusters of observations, such as pups born to mother rats;
  - serial correlation in repeated measurements on each experimental unit.
- If we ignore dependence:
  - parameter estimates are generally inefficient;
  - standard errors are generally wrong, hence inferences (confidence intervals, hypothesis tests) do not have nominal properties (coverage probability, size);
  - the statistical framework is inappropriate for scientific objectives.
- Inefficiency may not be important; invalidity is always important; but relevance to the science is paramount.
- We limit discussion to situations where groups of observations may unambiguously be assumed to be independent:
  - $m$ response vectors $Y_i = (Y_{i1}, Y_{i2}, \dots, Y_{i n_i})^T$, $i = 1, 2, \dots, m$;
  - $n_i$ observations on subject $i$.

### Covariates

- Within-individual covariates:
  - describe conditions under which $Y_{ij}$ was observed;
  - needed even if inference were restricted to individual $i$;
  - e.g. $t_{ij}$, the time of the $j$th observation on individual $i$.
- Among-individual covariates:
  - same value for all observations on individual $i$;
  - e.g. treatment assigned to this individual, or individual characteristics such as gender.

### Covariate notation

- Within-individual covariate vector: $z_{ij}$.
- Stacked: $z_i = (z_{i1}^T, z_{i2}^T, \dots, z_{i n_i}^T)^T$.
- Among-individual covariate vector: $a_i$.
- Combined: $x_i = (z_i^T, a_i^T)^T$.

### Sources of Dependence

- Dependence simply means that

$$f_i(y_i \mid x_i) \neq \prod_{j=1}^{n_i} f_{ij}(y_{ij} \mid x_i).$$

- This is very general, hence difficult to specify.
- It is helpful to distinguish:
  - individual-level sources;
  - population-level sources.

### Individual-Level Sources of Dependence

- For example, suppose we model repeated measurements of a subject's blood pressure using a within-individual linear regression

$$Y_{ij} = \beta_{0i} + \beta_{1i} t_{ij} + e_{ij},$$

where $t_{ij}$ is the time of the $j$th measurement on the $i$th subject.

- The linear trend $\beta_{0i} + \beta_{1i} t$ represents the mean response for that subject.
- The subject's actual blood pressure at time $t$ on the day of testing is, say,

$$\beta_{0i} + \beta_{1i} t + e_{P,i}(t),$$

where $e_{P,i}(t)$ is random variation around that mean response, perhaps a stationary stochastic process.

- If measurement error $e_{M,ij}$ is non-negligible, then

$$e_{ij} = e_{P,i}(t_{ij}) + e_{M,ij}.$$

- Then

$$\operatorname{var}(Y_{ij}) = \operatorname{var}(e_{P,ij}) + \operatorname{var}(e_{M,ij}),$$

and for $j \neq j'$,

$$\operatorname{cov}(Y_{ij}, Y_{ij'}) = \operatorname{cov}(e_{P,ij}, e_{P,ij'}).$$

- We would need to specify a model for these variances and covariances in order to make inferences about $\beta_{0i}$ and $\beta_{1i}$.
- Here the frame of inference is the individual subject.

### Population-Level Sources of Dependence

- If the subjects are themselves a random sample from some population, then the parameters

$$\beta_i = \begin{pmatrix} \beta_{0i} \\ \beta_{1i} \end{pmatrix}$$

associated with the $i$th subject are a random sample from the corresponding population of parameter vectors.

- We shall be interested in the mean and dispersion in this population, which describe the average across subjects and the variation among subjects.
- Here the frame of inference is the population of subjects.
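As a rough illustration of the two levels of dependence in the blood pressure example, the sketch below builds the covariance matrix of one subject's measurements twice: conditional on that subject's own intercept and slope (individual level), and marginally over the population of subjects (population level). Several ingredients are assumptions made only for this illustration and do not come from the notes: the exponential correlation exp(-|t - t'| / rho) for the within-subject process, the numerical values of the variances and of the matrix `D`, and the helper names `within_cov` and `marginal_cov`.

```python
import math


def within_cov(times, var_p, var_m, rho):
    """Individual-level covariance of one subject's measurements.

    Entry (j, k) is var_p * exp(-|t_j - t_k| / rho), plus var_m on the
    diagonal for uncorrelated measurement error. The exponential
    correlation is an assumed choice, not one prescribed by the notes.
    """
    n = len(times)
    return [
        [
            var_p * math.exp(-abs(times[j] - times[k]) / rho)
            + (var_m if j == k else 0.0)
            for k in range(n)
        ]
        for j in range(n)
    ]


def marginal_cov(times, D, var_p, var_m, rho):
    """Population-level covariance: Z D Z^T plus the within-subject part.

    Row j of Z is (1, t_j); D is the 2x2 covariance matrix of the
    subject-specific (intercept, slope) pair across the population.
    """
    R = within_cov(times, var_p, var_m, rho)
    n = len(times)
    V = [[0.0] * n for _ in range(n)]
    for j in range(n):
        zj = (1.0, times[j])
        for k in range(n):
            zk = (1.0, times[k])
            between = sum(
                zj[a] * D[a][b] * zk[b] for a in range(2) for b in range(2)
            )
            V[j][k] = between + R[j][k]
    return V


# Hypothetical values, chosen only to make the arithmetic easy to follow.
times = [0.0, 1.0, 2.0, 4.0]
D = [[4.0, 0.5], [0.5, 0.25]]
R = within_cov(times, var_p=2.0, var_m=1.0, rho=1.5)
V = marginal_cov(times, D, var_p=2.0, var_m=1.0, rho=1.5)
print(R[0][0], V[0][0])
```

With these inputs the first diagonal entry of `R` is 2 + 1 = 3 (process variance plus measurement error variance), and the matching entry of `V` adds the intercept variance 4 to give 7. Because the model is linear in the random effects, the marginal covariance has this simple additive form; for a nonlinear subject-specific model it generally would not.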
