# Applied Time Series Analysis ST 730

These 73 pages of class notes for ST 730 (Applied Time Series Analysis) at North Carolina State University, taught by Staff in Fall, were uploaded by Jordane Kemmer on Thursday, October 15, 2015. For similar materials see /class/223978/st-730-north-carolina-state-university in Statistics at North Carolina State University.

## Cointegration: general discussion

Definitions. A time series that requires d differences to make it stationary is said to be "integrated of order d." If the d-th difference has p autoregressive and q moving average terms, the differenced series is said to be ARMA(p,q) and the original integrated series to be ARIMA(p,d,q). Two series X_t and Y_t that are integrated of order d may, through linear combination, produce a series aX_t + bY_t which is stationary or integrated of order smaller than d, in which case we say that X_t and Y_t are cointegrated, and we refer to (a, b) as the cointegrating vector. Granger and Weiss discuss this concept and terminology.

An example. If X_t and Y_t are wages in two similar industries, we may find that both are unit root processes. We may, however, reason that by virtue of the similar skills and easy transfer between the two industries, the difference X_t - Y_t cannot vary too far from 0 and thus certainly should not be a unit root process. The cointegrating vector is specified by our theory to be (1, -1), or (-1, 1), or (c, -c), all of which are equivalent. The test for cointegration here consists of simply testing the original series for unit roots (not rejecting the unit root null), then testing the X_t - Y_t series (rejecting the unit root null). We just use the standard Dickey-Fuller (DF) tables for all these tests. The reason we can use the DF tables is that the cointegrating vector was specified by our theory, not estimated from the data.

Numerical examples: Y_t = A Y_{t-1} + E_t.

(1) Bivariate stationary:

    Y_{1t} = 1.2 Y_{1,t-1} - 0.3 Y_{2,t-1} + e_{1t}
    Y_{2t} = 0.4 Y_{1,t-1} + 0.5 Y_{2,t-1} + e_{2t},    E_t ~ N(0, S)

    |A - LI| = L^2 - 1.7 L + 0.72 = (L - 0.9)(L - 0.8)

Both roots are less than 1 in magnitude: this is stationary (note: the form of the E_t distribution has no impact).

(2) Bivariate nonstationary:

    Y_{1t} = 1.2 Y_{1,t-1} - 0.5 Y_{2,t-1} + e_{1t}
    Y_{2t} = 0.2 Y_{1,t-1} + 0.5 Y_{2,t-1} + e_{2t}

    |A - LI| = L^2 - 1.7 L + 0.7 = (L - 1)(L - 0.7)

This is a unit root process. Using the spectral decomposition (eigenvalues and eigenvectors) of A, we have A = T Lam T^{-1} with Lam = diag(1, 0.7), so that with Z_t = T^{-1} Y_t,

    T^{-1} Y_t = (T^{-1} A T) T^{-1} Y_{t-1} + T^{-1} E_t,   i.e.   Z_t = Lam Z_{t-1} + eta_t.

Components of the Z_t vector:

    Z_{1t} = Z_{1,t-1} + eta_{1t}        ("common trend": unit root)
    Z_{2t} = 0.7 Z_{2,t-1} + eta_{2t}    (stationary root)

Writing Y_t = T Z_t,

    Y_{1t} = w_1 Z_{1t} + w_2 Z_{2t}
    Y_{2t} = w_3 Z_{1t} + w_4 Z_{2t},

so Y_{1t} and Y_{2t} share the "common trend" Z_{1t}. Since Z_t = T^{-1} Y_t, the last row of T^{-1} is the cointegrating vector. Notice that A is not symmetric and T^{-1} is not equal to T'.

## Engle-Granger method

This is one of the earliest and easiest to understand treatments of cointegration. In the notation above, Y_{1t} = w_1 Z_{1t} + w_2 Z_{2t} and Y_{2t} = w_3 Z_{1t} + w_4 Z_{2t}, where Z_{1t} is a unit root process, so that n^{-2} Sum Z_{1t}^2 = O_p(1), while Z_{2t} is stationary, so that n^{-1} Sum Z_{2t}^2 = O_p(1) and Sum Z_{1t} Z_{2t} = O_p(n). So if we regress Y_{1t} on Y_{2t}, our regression coefficient is

    Sum Y_{1t} Y_{2t} / Sum Y_{2t}^2 = [ w_1 w_3 Sum Z_{1t}^2 + O_p(n) ] / [ w_3^2 Sum Z_{1t}^2 + O_p(n) ]  ->  w_1 / w_3,

and our residual series is thus approximately

    Y_{1t} - (w_1/w_3) Y_{2t} = (w_2 - w_1 w_4 / w_3) Z_{2t},

a stationary series. Thus a simple regression of Y_{1t} on Y_{2t} gives an estimate of the cointegrating vector, and a test for cointegration is just a test that the residuals are stationary. Let the residuals be r_t; regress r_t - r_{t-1} on r_{t-1} (and possibly some lagged differences). Can we compare to our DF tables? Engle and Granger argue that one cannot do so. The null hypothesis is that there is no cointegration; thus the bivariate series has 2 unit roots and no linear combination is stationary. We have, in a sense, looked through all possible linear combinations of Y_{1t} and Y_{2t}, finding the one that varies least (least squares) and hence the one that looks most stationary. It is as though we had computed unit root tests for all possible linear combinations, then selected the one most likely to reject. We are thus in the area of order statistics: if you report the minimum heights from samples of 10 men each, the distribution of these minima will not be the same as the distribution of heights of individual men; nor will the distribution of unit root tests from these "best" linear combinations be the same as the distribution you would get for a pre-specified linear combination. Engle and Granger provide adjusted critical values. Below is a table comparing their (EG) tables to our DF tables for n = 100. EG used an augmented regression with 4 lagged differences and an intercept to calculate a t statistic tau, so keep in mind that part of the discrepancy is due to finite sample effects of the asymptotically negligible lagged differences.
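The two-step procedure just described can be sketched in a small simulation. This is illustrative code, not part of the notes: the series, sample size, and parameter values are invented.

```python
# Engle-Granger two-step sketch on simulated data (all settings illustrative).
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Common trend z1 (unit root) and stationary component z2 (AR(1), root 0.7)
z1 = np.cumsum(rng.standard_normal(n))
z2 = np.zeros(n)
eta = rng.standard_normal(n)
for t in range(1, n):
    z2[t] = 0.7 * z2[t - 1] + eta[t]

# Both observed series load on the common trend: cointegrated, vector (1, -1)
y1 = z1 + z2
y2 = z1 - z2

# Step 1: regress y1 on y2; the slope estimates w1/w3 = 1
b = (y1 @ y2) / (y2 @ y2)
resid = y1 - b * y2            # approximately 2 * z2, a stationary series

# Step 2: DF-type regression of the residual change on the lagged residual.
# The slope estimates rho - 1 = -0.3; its t statistic must be compared with
# Engle-Granger (not ordinary DF) critical values, since b was estimated.
dr, rlag = np.diff(resid), resid[:-1]
rho_minus_1 = (rlag @ dr) / (rlag @ rlag)
```

The point of the order-statistics discussion above is exactly that the Step 2 statistic, computed from an estimated combination, needs the EG tables rather than the DF tables.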
    Prob. of smaller tau:   0.01    0.05    0.10
    DF:                    -3.51   -2.89   -2.58

Example. P_t = cash price on delivery date, Texas steers; F_t = futures price (source: Ken Mathews, NCSU Ag. Econ.). Data are bimonthly, Feb. '76 through Dec. '86 (60 obs.).

(1) Test the individual series for integration:

    del P_t = 7.6 - 0.117 P_{t-1} + Sum_{i=1}^{5} theta_i del P_{t-i},   tau_DF = -2.20
    del F_t = 7.7 - 0.120 F_{t-1} + Sum_{i=1}^{5} phi_i del F_{t-i},     tau_DF = -2.23

Each series is integrated (cannot reject at 10%).

(2) Regress F_t on P_t: fitted F_t = 5.861 + 0.9899 P_t; for the residual R_t,

    del R_t = 0.110 - 0.9392 R_{t-1},   tau_EG = -7.428.

Thus, with a bit of rounding, F_t - 1.00 P_t is stationary.

The Engle-Granger method requires the specification of one series as the dependent variable in the bivariate regression. Fountis and Dickey (Annals of Statistics) study distributions for the multivariate system Y_t = A Y_{t-1} + E_t, i.e. del Y_t = -(I - A) Y_{t-1} + E_t. We show that if the true series has one unit root, then the root of the least squares estimated matrix (I - A hat) that is closest to 0 has, after multiplication by n, the same limit distribution as in the standard DF tables, and we suggest the use of the eigenvectors of (I - A hat) to estimate the cointegrating vector. The only test we can do with this is the null of one unit root versus the alternative of stationarity. Johansen's test, discussed later, extends this in a very nice way. Our result also holds for higher dimension models, but requires the extraction of roots of the estimated characteristic polynomial. For the Texas steer futures data, the regression (with 3 lagged differences) gives an estimated (I - A hat) whose eigenanalysis indicates that 0.69 P_t - 0.72 F_t is stationary. This is about 0.7 times the difference, so the two methods agree that P_t - F_t is stationary (as is any multiple of it).

## Johansen's method

This method is similar to that just illustrated, but has the advantage of being able to test for any number of unit roots. The method can be described as the application of standard multivariate calculations in the context of a vector autoregression (VAR).
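The eigenvector idea can be checked numerically on the bivariate nonstationary example given earlier (A with eigenvalues 1 and 0.7). This sketch uses the known A rather than a least squares estimate:

```python
# Cointegrating vector from the eigenanalysis of A (Fountis-Dickey style check,
# using the known A of the earlier bivariate nonstationary example).
import numpy as np

A = np.array([[1.2, -0.5], [0.2, 0.5]])
eigvals, T = np.linalg.eig(A)           # columns of T are right eigenvectors
order = np.argsort(eigvals)[::-1]       # put the unit root first
eigvals, T = eigvals[order], T[:, order]

Tinv = np.linalg.inv(T)
beta = Tinv[1]        # last row of T^{-1}: the cointegrating vector

# beta' Y_t evolves with the stationary root: beta' A = 0.7 * beta'
check = beta @ A
```

Here the cointegrating vector comes out proportional to (1, -2.5), and applying A to it reproduces it scaled by the stationary root 0.7.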
The test statistics are those found in any multivariate text. Johansen's idea, as in univariate unit root tests, is to get the right distribution for these standard calculated statistics: the statistics are standard; their distributions are not.

We start with just a lag one model with mean 0 (no intercept):

    del Y_t = Pi Y_{t-1} + E_t,

where Y_t is a p-dimensional column vector, as is E_t; assume E(E_t E_t') = Lam. Let r = rank(Pi) and write Pi = alpha beta' with alpha and beta of dimension p x r:

    r = 0      =>  all linear combinations nonstationary
    r = p      =>  all linear combinations stationary
    0 < r < p  =>  cointegration.

Note: for any Pi there are infinitely many (alpha, beta) such that Pi = alpha beta', because alpha beta' = (alpha T)(beta T'^{-1})' for any invertible T, so we do not test hypotheses about alpha and beta, only about the rank r.

Now define sums of squares and cross products of del Y_t and Y_{t-1}; for example,

    S00 = n^{-1} Sum del Y_t del Y_t',   S01 = n^{-1} Sum del Y_t Y_{t-1}',   S10 = S01',   S11 = n^{-1} Sum Y_{t-1} Y_{t-1}'.

Now write down the likelihood conditional on Y_0 = 0:

    L(Pi, Lam) = (2 pi)^{-np/2} |Lam|^{-n/2} exp{ -(1/2) Sum_t (del Y_t - Pi Y_{t-1})' Lam^{-1} (del Y_t - Pi Y_{t-1}) }.

If Pi is assumed to be full rank (r = p), then the likelihood is maximized at the usual estimate, the least squares regression estimate:

    Pi hat = S01 S11^{-1},   Lam hat = n^{-1} Sum (del Y_t - Pi hat Y_{t-1})(del Y_t - Pi hat Y_{t-1})' = S00 - S01 S11^{-1} S10.

H0: there are r stationary, linearly independent linear combinations of Y_t, and thus p - r unit root linear combinations; that is, r "cointegrating vectors" and p - r "common trends":

    H0: Pi = alpha beta' with alpha (p x r) and beta (p x r).

So far we have the unrestricted estimates (Pi hat, Lam hat) and can evaluate the likelihood there. The principle of likelihood ratio requires that we maximize the likelihood for Pi = alpha beta' and compare to the unrestricted maximum. That is, we now want to maximize

    |Lam|^{-n/2} exp{ -(1/2) Sum_t (del Y_t - alpha beta' Y_{t-1})' Lam^{-1} (del Y_t - alpha beta' Y_{t-1}) }.

Step 1: For any given beta we can compute beta' Y_{t-1} and find the corresponding alpha hat by regression in the model del Y_t = alpha (beta' Y_{t-1}) + E_t, and this is simply

    alpha hat(beta) = ( Sum del Y_t Y_{t-1}' beta ) ( Sum beta' Y_{t-1} Y_{t-1}' beta )^{-1} = S01 beta (beta' S11 beta)^{-1}.

Step 2: Search over beta for the maximum. To do this, plug alpha hat(beta) into the likelihood function, which now becomes a function of beta and Lam. Recall from general regression that the exponent can be written as -(1/2) trace{ Lam^{-1} X'X }, where in our case X is the residual matrix with t-th row (del Y_t - alpha hat(beta) beta' Y_{t-1})', and
by our usual maximum likelihood arguments we will, for any given beta, estimate Lam by Lam hat(beta) = n^{-1} X'X, so that

    max L = (2 pi)^{-np/2} |Lam hat(beta)|^{-n/2} e^{-np/2}.

Our goal now is to maximize L, which we would do by minimizing |Lam hat(beta)|.

Step 2(a): Minimize |Lam hat(beta)| = |S00 - S01 beta (beta' S11 beta)^{-1} beta' S10|. Recall that for a 2 x 2 matrix,

    [a b; c d]^{-1} = (ad - bc)^{-1} [d -b; -c a],

and similarly for the determinant of a partitioned matrix:

    | [ S00, S01 beta; beta' S10, beta' S11 beta ] |
        = |beta' S11 beta| |S00 - S01 beta (beta' S11 beta)^{-1} beta' S10|
        = |S00| |beta' S11 beta - beta' S10 S00^{-1} S01 beta|,

so our problem now becomes

    min over beta of   |S00| |beta' (S11 - S10 S00^{-1} S01) beta| / |beta' S11 beta|.

Recall the Cholesky root: S11 positive definite and symmetric => S11 = U'U with U upper triangular (SAS PROC IML: U = ROOT(S11)). Let Z = U beta, i.e. beta = U^{-1} Z; then beta' S11 beta = Z'Z and

    beta' (S11 - S10 S00^{-1} S01) beta = Z' ( I - U'^{-1} S10 S00^{-1} S01 U^{-1} ) Z.

Note: we have seen that Pi = alpha beta' allows a lot of flexibility in choosing the columns of beta; corresponding adjustments in alpha preserve Pi. We choose beta' S11 beta = Z'Z = I.

Fact: the minimizing Z has as its columns eigenvectors of a symmetric matrix, so we can get it in SAS:

    Z' ( U'^{-1} S10 S00^{-1} S01 U^{-1} ) Z = diagonal matrix.

1. Cholesky on S11: S11 = U'U.
2. Form U'^{-1} S10 S00^{-1} S01 U^{-1}.
3. Eigenvectors C_i, eigenvalues lambda_1 > lambda_2 > ... > lambda_p.
4. beta hat = U^{-1} Z.
5. Get alpha hat by regressing del Y_t on beta hat' Y_{t-1}.
6. Pi hat = alpha hat beta hat'.

Note: the eigenvalues are called "squared canonical correlations" between del Y_t and Y_{t-1}; PROC CANCORR will compute these for you.

Testing: maximize L unconditionally; maximize under H0; now look at the likelihood ratio test. Summary:

1. Choose beta to minimize |Lam hat(beta)|.
2. U is invertible, so any beta is expressible as beta = U^{-1} Z for some choice of Z.
3. The length of each beta vector is arbitrary, so we can specify Z'Z = I.
4. Pick Z to minimize |Z' (I - U'^{-1} S10 S00^{-1} S01 U^{-1}) Z|; thus Z is the matrix whose columns are those associated with the smallest (1 - lambda_i), that is, the largest "squared canonical correlations" lambda_i.

For H0: r = r0 versus H1: r > r0, the maximized likelihoods give

    LRT = max_{H0} L / max L = [ Prod_{i=r0+1}^{p} (1 - lambda_i) ]^{n/2}.

Now, in standard likelihood ratio testing, we often take the log of the likelihood ratio. The reason for this is that it often leads to a chi-square limit distribution. There is no hope of that happening in this nonstandard case, but Johansen still follows that tradition, suggesting that we reject when Prod_{i=r0+1}^{p} (1 - lambda_i)^{n/2} is small.
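The eigenvalue recipe (steps 1-3 above) can be sketched in numpy. The data-generating process and sample size here are invented for illustration; in practice one would use a tested package implementation:

```python
# Sketch of Johansen's eigenvalue steps on simulated lag-1 data.
import numpy as np

rng = np.random.default_rng(1)
n = 400
z1 = np.cumsum(rng.standard_normal(n))      # common trend
z2 = np.zeros(n)                             # stationary component, root 0.5
e = rng.standard_normal(n)
for t in range(1, n):
    z2[t] = 0.5 * z2[t - 1] + e[t]
Y = np.column_stack([z1 + z2, z1 - z2])      # p = 2, true r = 1

dY, Ylag = np.diff(Y, axis=0), Y[:-1]
nobs = len(dY)
S00 = dY.T @ dY / nobs
S01 = dY.T @ Ylag / nobs
S11 = Ylag.T @ Ylag / nobs

# Cholesky S11 = U'U, then eigenvalues of U'^{-1} S10 S00^{-1} S01 U^{-1}
# are the squared canonical correlations.
U = np.linalg.cholesky(S11).T
M = np.linalg.solve(U.T, S01.T) @ np.linalg.solve(S00, S01) @ np.linalg.inv(U)
lam = np.sort(np.linalg.eigvalsh((M + M.T) / 2))[::-1]

trace_stat = -nobs * np.sum(np.log1p(-lam))  # trace test of H0: r = 0
```

The eigenvalues land in [0, 1] as squared correlations must, and with a genuinely cointegrated system the trace statistic is large.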
That is, we will reject when

    -n Sum_{i=r0+1}^{p} ln(1 - lambda_i)

is large, where lambda_{r0+1} > lambda_{r0+2} > ... > lambda_p are the p - r0 smallest squared canonical correlations. This is Johansen's "trace" test. To keep things straight:

1. You use the smallest squared canonical correlations, thus making (1 - lambda) large (nearer 1) and hence making -n Sum ln(1 - lambda) small; i.e., you select the lambdas that best protect H0. In a later article Johansen notes that you may have better power if you opt to test H0: r = r0 versus H1: r = r0 + 1, and thus use the largest, lambda_{r0+1}, of the smallest lambdas. This is Johansen's "maximal eigenvalue test."

2. Under H0 or H1 you have at least r0 cointegrating vectors and hence at most p - r0 "common trends." Therefore rejection of the null hypothesis means you have found yet another cointegrating vector.

3. The interpretation of a cointegrating vector is that you have found a linear combination of your vector components that cannot vary too far from 0, i.e. you have a "law" that cannot be too badly violated. A departure from this relationship would be called an "error," and if we start at any point and forecast into the future with this model, the forecasts will eventually satisfy the relationship. Therefore this kind of model is referred to as an "error correction" model.

Example:

    Z_{1t} = Z_{1,t-1} + e_{1t}          (common trend)
    Z_{2t} = 0.8 Z_{2,t-1} + e_{2t}      (stationary)

We observe

    Y_{1t} = Z_{1t} + 3 Z_{2t}
    Y_{2t} = Z_{1t} - 2 Z_{2t}.

Notice that Y_{1t} - Y_{2t} = 5 Z_{2t} is stationary, so we are saying that Y_{1t} can't wander too far from Y_{2t}, and yet both Y's are nonstationary. They are wandering around, but wandering around together, you might say. Now in practice we would just observe the Y's. Notice that, with T = [1 3; 1 -2],

    Y_t = T Z_t = T [1 0; 0 0.8] T^{-1} Y_{t-1} + noise, exactly
        = [0.88 0.12; 0.08 0.92] Y_{t-1} + noise.

Now suppose Y_{1t} = 1 and Y_{2t} = 2. These are not very close to each other and thus are in violation of the equilibrium condition Y_{1t} = Y_{2t}. In the absence of future shocks, does the model indicate that this "error" will "correct" itself? The next period we forecast

    [0.88 0.12; 0.08 0.92] (1, 2)' = (1.12, 1.92)',

whose components are closer together, and continuing to iterate,

    [0.88 0.12; 0.08 0.92]^50 (1, 2)'  is approximately  (1.6, 1.6)',

so the forecasts converge to a common level and the "error" corrects.

Let us take this example a step further. Modeling the changes in the series as a function of the lagged levels, we have

    del Y_t = ( [0.88 0.12; 0.08 0.92] - I ) Y_{t-1} + noise
            = [-0.12 0.12; 0.08 -0.08] Y_{t-1} + noise
            = (-0.12, 0.08)' (1, -1) Y_{t-1} + noise,

so we see that the discrepancy from equilibrium, (1, -1) Y_{t-1} = Y_{1,t-1} - Y_{2,t-1}, which in our case is -1.0, is computed; then 0.12 times this is subtracted from Y_1 and 0.08 times this is added to Y_2. The "speed of adjustment" is thus faster in Y_1, and we end up farther from the original Y_1 than from the original Y_2. Also, although the model implies (assuming 0 initial conditions) that E(Y_1) = E(Y_2) = 0, there is nothing drawing the series back toward their theoretical means 0. Shocks to this series have a permanent effect on the levels of the series, but only a temporary effect on the relationship (equality, in this model) between the series components.

Remaining to do:

Q1: What are the critical values for the test?
Q2: What if we have more than 1 lag?
Q3: What if we include intercepts, trends, etc.?

Q1: Usually the likelihood ratio test has a limit chi-square distribution. For example, a regression F test is such that 4F converges in law to chi-square(4); also F_{1,n} = t^2 converging to Z^2 is a special case. Now we have seen that the t statistic tau has a nonstandard limit distribution expressible as a functional of Brownian motion B(t) on [0,1]. We found

    n(rho hat - 1)  =>  [ Int_0^1 B(t) dB(t) ] / [ Int_0^1 B^2(t) dt ],
    tau             =>  [ Int_0^1 B(t) dB(t) ] / [ Int_0^1 B^2(t) dt ]^{1/2},

and we might thus expect Johansen's test statistic to converge to a multivariate analogue of this expression. Indeed, Johansen proved that his likelihood ratio trace test converges to a variable that can be expressed as a functional of a vector valued Brownian motion B(t) with independent components ("channels"), here with the error term having variance matrix sigma^2 I:

    LRT  =>  trace{ [ Int_0^1 B dB' ]' [ Int_0^1 B B' dt ]^{-1} [ Int_0^1 B dB' ] }.

For a Brownian motion of dimension m = 1, 2, 3, 4, 5, Johansen (Table 1, page 239) computes the distribution of the LRT by Monte Carlo.
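The error correction dynamics above can be iterated numerically. This is a small check of the example (no shocks added), starting the matrix from (Y1, Y2) = (1, 2):

```python
# Iterating the error correction example Y_t = A Y_{t-1} with no shocks.
import numpy as np

A = np.array([[0.88, 0.12], [0.08, 0.92]])
y = np.array([1.0, 2.0])        # violates the equilibrium Y1 = Y2

gaps = []
for _ in range(50):
    gaps.append(y[0] - y[1])
    y = A @ y

# The "error" Y1 - Y2 shrinks by the factor 0.8 each period, since
# (1, -1) A = 0.8 * (1, -1); the forecasts converge to a common level.
```

The common limit is 1.6 here: the unit-root left eigenvector (0.4, 0.6) applied to (1, 2) is preserved by the iteration, while the stationary direction dies out at rate 0.8.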
Empirically, Johansen notes that these percentiles are quite close to the percentiles of c chi-square(f), where f = 2 m^2 and c = 0.85 - 0.58/f. We will not repeat Johansen's development of the limit distribution of the LRT, but will simply note that it is what would be expected by one familiar with the usual limit results for F statistics and with the nonstandard limit distributions that arise with unit root processes. As one might expect, the m = 1 case can be derived from the tau distribution.

Q2: What happens in higher order processes?

    Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + ... + A_k Y_{t-k} + E_t

can be rewritten in the form

    del Y_t = -(I - A_1 - A_2 - ... - A_k) Y_{t-1} + B_1 del Y_{t-1} + B_2 del Y_{t-2} + ... + B_{k-1} del Y_{t-k+1} + E_t.

The characteristic equation is

    | I m^k - A_1 m^{k-1} - A_2 m^{k-2} - ... - A_k | = 0.

Now if m = 1 is a root of this, we have |I - A_1 - A_2 - ... - A_k| = 0. The number of unit roots in the system is the number of 0 roots of the matrix I - A_1 - A_2 - ... - A_k, and the rank of this matrix is the number of cointegrating vectors. (I am assuming a vector autoregression in which each component has at most one unit root, i.e. differencing makes each component stationary.) While this parameterization most closely resembles the usual unit root testing framework, Johansen chooses an equivalent way to parameterize the model, placing the lagged levels at lag k instead of lag 1. His model is written

    del Y_t = Gamma_1 del Y_{t-1} + ... + Gamma_{k-1} del Y_{t-k+1} - (I - A_1 - A_2 - ... - A_k) Y_{t-k} + E_t,

and he is checking the rank of Pi = -(I - A_1 - A_2 - ... - A_k).

Recall: regression can be done in 3 steps. Suppose we are regressing Y on X_1, X_2, ..., X_k and we are interested in the coefficient matrix of X_k. We can get that coefficient matrix in 3 steps:

Step 1: Regress Y on X_1, X_2, ..., X_{k-1}; keep the residuals R_Y.
Step 2: Regress X_k on X_1, X_2, ..., X_{k-1}; keep the residuals R_k.
Step 3: Regress R_Y on R_k; this gives the same coefficient matrix as in the full regression.

So Johansen does the following in higher order models:

Step 1: Regress del Y_t on del Y_{t-1}, del Y_{t-2}, ..., del Y_{t-k+1}; residuals R_del.
Step 2: Regress Y_{t-k} on del Y_{t-1}, del Y_{t-2}, ..., del Y_{t-k+1}; residuals R_k.
Step 3: Compute the squared canonical correlations between R_del and R_k.
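The three-step regression fact is easy to verify numerically. The data here are arbitrary illustrative draws; any regression setup works:

```python
# Partialling-out ("three-step") regression: the coefficients of X2 from the
# full regression of Y on [X1, X2] equal the coefficients from regressing
# (Y residualized on X1) on (X2 residualized on X1).
import numpy as np

rng = np.random.default_rng(2)
n = 200
X1 = rng.standard_normal((n, 3))
X2 = rng.standard_normal((n, 2))
Y = X1 @ np.array([1.0, -2.0, 0.5]) + X2 @ np.array([3.0, 1.5]) + rng.standard_normal(n)

def ols(y, X):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Full regression: the last two coefficients belong to X2
b_full = ols(Y, np.column_stack([X1, X2]))[-2:]

# Steps 1 and 2: residualize Y and X2 on X1; Step 3: regress residual on residual
Ry = Y - X1 @ ols(Y, X1)
Rk = X2 - X1 @ np.linalg.lstsq(X1, X2, rcond=None)[0]
b_fw = ols(Ry, Rk)
```

This algebraic identity is exactly what lets Johansen sweep out the lagged differences first and then work with a lag-1 type eigenvalue problem.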
The idea is very nice: Johansen maximizes the likelihood for any Pi with respect to the other parameters by performing steps 1 and 2. Having done this, he is back to a lag 1 type of problem. By analogy, if in ordinary unit root testing the null hypothesis were true, you could estimate the autoregressive coefficients consistently by regressing the first difference on the lagged first differences. Having these estimates theta hat_1, ..., theta hat_p, you could compute a "filtered" version of Y_t, namely Z_t = Y_t - theta hat_1 Y_{t-1} - ... - theta hat_p Y_{t-p}; under the null hypothesis Z_t is (approximately) a random walk, so you could regress Z_t on Z_{t-1} and in large samples compare the results to our DF unit root tables, both for the coefficient and the t statistic. Note that all of Johansen's statistics are multivariate analogues of tau; there is nothing like the "normalized bias" test n(rho hat - 1).

Example: Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + A_3 Y_{t-3} + E_t, Y_t = (Y_{1t}, Y_{2t}, Y_{3t}, Y_{4t})', 100 observations, with squared canonical correlations 0.46, 0.08, 0.02, 0.01.

For r0 = 0, p = 4: use the 4 - 0 = 4 smallest canonical correlations:

    LRT = -100 [ ln(0.99) + ln(0.98) + ln(0.92) + ln(0.54) ] = 72.98.

Johansen's Table 1 gives the critical value for m = 4 (H0 implies m = 4 common trends): the critical value is 41.2. Reject H0: there is at least 1 cointegrating vector.

For r0 = 1, p = 4: use the 4 - 1 = 3 smallest canonical correlations:

    LRT = -100 [ ln(0.99) + ln(0.98) + ln(0.92) ] = 11.36;

look in Johansen's table under m = 3 common trends: the critical value is 23.8. Do not reject H0: there are no more cointegrating vectors.

Q3: Johansen later wrote a paper addressing the intercept case. The trend issue is a bit tricky, as is true in the univariate case. Even the intercept case has some interesting features in terms of how intercepts in the model for the observed series translate into the underlying canonical series (common trends).

An additional note: we are used to taking -2 ln(LR) in maximum likelihood estimation and likelihood ratio testing. We do this because under certain regularity conditions it produces test statistics with standard limiting distributions (chi-square, usually). In this case we do not have the required regularity conditions for the test of r = the number of cointegrating vectors (otherwise Johansen would not have needed to do any new tabulations).
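The arithmetic of the example can be replicated directly. The lambda values are those implied by the (1 - lambda) factors in the logs above:

```python
# Trace-test arithmetic from the example: p = 4, n = 100, squared canonical
# correlations 0.46, 0.08, 0.02, 0.01 (so 1 - lambda = .54, .92, .98, .99).
import numpy as np

n = 100
lam = np.array([0.46, 0.08, 0.02, 0.01])    # sorted, largest first

# H0: r0 = 0 -> use all 4 smallest lambdas; 72.98 exceeds 41.2, so reject
stat_r0 = -n * np.sum(np.log(1 - lam))
# H0: r0 = 1 -> drop the largest, keep the 3 smallest; 11.36 < 23.8, do not reject
stat_r1 = -n * np.sum(np.log(1 - lam[1:]))
```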
Because this is the case, Johansen could have opted to use other functions of the eigenvalues, such as n Sum lambda_i, in place of his -n Sum ln(1 - lambda_i) statistic. We show, for the case of a univariate series (just one lambda), that both of these statistics converge to the same distribution, and it is the distribution of tau^2. First note that for the series Y_t = Y_{t-1} + e_t we have del Y_t = Y_t - Y_{t-1} = e_t, so in the regression of del Y_t on Y_{t-1} the MSE is approximately n^{-1} Sum e_t^2 = S00, the standard error is sqrt(MSE / Sum Y_{t-1}^2), and the coefficient is S01/S11, so that

    tau^2 = (S01/S11)^2 (Sum Y_{t-1}^2) / MSE  is approximately  n S01^2 / (S00 S11).

Now the "Cholesky root" of the univariate quantity S11 is of course just sqrt(S11), and Johansen looks at the (1 x 1) matrix

    S11^{-1/2} S10 S00^{-1} S01 S11^{-1/2} = S01^2 / (S00 S11),

calling its eigenvalue lambda. We see immediately that n lambda is just tau^2, so we already know its distribution; the multivariate cases are analogous. This also shows that lambda = O_p(1/n), so if we expand Johansen's statistic using the Taylor series for ln(1 - lambda) around lambda = 0 we have

    -n ln(1 - lambda) = n lambda + n O_p(lambda^2) = n lambda + O_p(1/n),

proving, as claimed, that these two statistics have the same limit distribution (of course there are some extra details needed for the multivariate case). Notice that for a single lambda these two statistics are monotone transforms of each other, so even in finite samples (provided we had the right distributions) they would give exactly equivalent tests. For the more interesting multivariate case they are the same only in the limit.

Coint 1: D. A. Dickey, April 1998

## Notes on Cointegration and Unit Roots

Models, initial conditions, and estimators (throughout, data can be generated by Y_t = Y_{t-1} + e_t):

    Model: Y_t = rho Y_{t-1} + e_t, rho = 1, Y_0 = 0.
        Estimator rho hat: regress Y_t on Y_{t-1}, no intercept.
    Model: Y_t - mu = rho (Y_{t-1} - mu) + e_t, rho = 1, Y_0 = mu.
        Estimator rho hat_mu: regress Y_t on 1, Y_{t-1} (or Y_t - Ybar on Y_{t-1} - Ybar).
    Model: Y_t - a - b t = rho (Y_{t-1} - a - b(t-1)) + e_t, rho = 1, Y_0 = a.
        Estimator rho hat_tau: regress Y_t on 1, t, Y_{t-1}.

If rho = 1, then

    n(rho hat - 1) = [ n^{-1} Sum_{t=2}^{n} Y_{t-1} e_t ] / [ n^{-2} Sum_{t=2}^{n} Y_{t-1}^2 ].

Numerator: with Y_t = e_1 + e_2 + ... + e_t, the sum Sum_{t=2}^{n} Y_{t-1} e_t collects all products e_i e_j with i < j <= n (the sub-diagonal elements of the array of all products e_i e_j), so

    Sum_{t=2}^{n} Y_{t-1} e_t = (1/2) [ (Sum_{t=1}^{n} e_t)^2 - Sum_{t=1}^{n} e_t^2 ] = (1/2) ( Y_n^2 - Sum e_t^2 ).

With T_n = Y_n / (sigma sqrt(n)) converging to N(0,1), the numerator satisfies n^{-1} Sum Y_{t-1} e_t => (sigma^2 / 2)(Gamma^2 - 1), where Gamma ~ N(0,1).
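The sub-diagonal identity for the numerator is exact and can be verified on any simulated random walk:

```python
# Check of the numerator identity: for a random walk with Y_0 = 0,
# sum_t Y_{t-1} e_t = sum of sub-diagonal products e_i e_j (i < j)
#                   = (Y_n^2 - sum e_t^2) / 2.
import numpy as np

rng = np.random.default_rng(3)
e = rng.standard_normal(1000)
Y = np.cumsum(e)

lhs = np.sum(Y[:-1] * e[1:])             # sum_{t=2}^{n} Y_{t-1} e_t
rhs = (Y[-1] ** 2 - np.sum(e ** 2)) / 2
```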
Denominator: write n^{-2} Sum_{t=2}^{n} Y_{t-1}^2 = n^{-2} E' A E, where E = (e_1, e_2, ..., e_{n-1})' and A has (i,j) element n - max(i,j). For n = 6,

    A = [ 5 4 3 2 1
          4 4 3 2 1
          3 3 3 2 1
          2 2 2 2 1
          1 1 1 1 1 ],

whose inverse is the tridiagonal matrix

    A^{-1} = [  1 -1  0  0  0
               -1  2 -1  0  0
                0 -1  2 -1  0
                0  0 -1  2 -1
                0  0  0 -1  2 ].

Rutherford (1946, J. Royal Soc. Edinburgh) gives the eigenvalues of such tridiagonal matrices, and since the eigenvalues of A are the reciprocals of those of A^{-1}, the i-th eigenvalue of the (n-1) x (n-1) matrix A is

    lambda_i = 1 / [ 4 sin^2( (2i - 1) pi / (2(2n - 1)) ) ],   i = 1, ..., n-1,

with corresponding orthonormal eigenvector whose t-th element is proportional to cos((2t - 1) theta_i), theta_i = (2i - 1) pi / (2(2n - 1)). Stacking these eigenvectors side by side gives a symmetric orthonormal matrix X (so X' = X^{-1}). Writing E = X Z with Z = X'E ~ N(0, sigma^2 I), we get

    n^{-2} E' A E = n^{-2} Sum_i lambda_i Z_i^2,

and since sin(theta) is approximately theta for small theta,

    n^{-2} lambda_i -> gamma_i^2 = 4 / ((2i - 1)^2 pi^2),

so the denominator converges in distribution to sigma^2 F, where

    F = Sum_{i=1}^{infinity} gamma_i^2 Z_i^2,   gamma_i = 2 / ((2i - 1) pi),

with Z_i iid N(0,1). A similar but more involved calculation (the trigonometric sums needed can be found in Jolley, Summation of Series, Dover Press) expresses the numerator in terms of the same Z_i's:

    Gamma = Sum_{i=1}^{infinity} sqrt(2) (-1)^{i+1} gamma_i Z_i.

We thus see that

    n(rho hat - 1) => (1/2)(Gamma^2 - 1) / F   and   tau = (rho hat - 1)/s.e. => (1/2)(Gamma^2 - 1) / sqrt(F),

where s.e. is the usual regression standard error. (In the Wiener process language below, Gamma = W(1) and F = Int_0^1 W^2(t) dt.)

Verify the unit root eigenvalues and vectors for n = 6:

    options ls=72;
    proc iml;  reset fuzz=0.0001 spaces=3;
    An = {5 4 3 2 1, 4 4 3 2 1, 3 3 3 2 1, 2 2 2 2 1, 1 1 1 1 1};
    inan = inv(An);
    eval = eigval(An);  evec = eigvec(An);
    print An[format=5.3] inan[format=5.3];
    cval = shape(0,5,1);  pi = 4*atan(1);  cvec = shape(0,5,5);
    do i = 1 to 5;
       theta = (2*i-1)*pi/22;
       cval[i,1] = 1/(2*sin(theta))**2;
       do t = 1 to 5;
          cvec[t,i] = 2/sqrt(11)*cos((2*t-1)*theta);
       end;
    end;
    print cval eval;
    print evec[format=5.3] cvec[format=5.3];

The output confirms the formulas. CVAL matches EVAL:

    CVAL = EVAL = (12.344, 1.449, 0.583, 0.353, 0.272)',

and CVEC matches EVEC (columns agree up to sign):

    CVEC = [ 0.597  0.549  0.456  0.326  0.170
             0.549  0.170 -0.326 -0.597 -0.456
             0.456 -0.326 -0.549  0.170  0.597
             0.326 -0.597  0.170  0.456 -0.549
             0.170 -0.456  0.597 -0.549  0.326 ].

## Alternative to the quadratic form decomposition

- D[0,1]: all functions f on [0,1] such that f(t) is right continuous and has left limits. Examples: CDFs; step functions such as the partial sum processes below.

- Wiener process: on a probability space (Omega, A, P), for each omega we get a function W(t) such that W(u_1), W(u_2) - W(u_1), ..., W(u_k) - W(u_{k-1}) are normal, independent, with means 0 and variances u_1, u_2 - u_1, ..., u_k - u_{k-1}, for 0 <= u_1 < u_2 < ... < u_k <= 1.

- Donsker's theorem: let e_t be iid (0, sigma^2) and S_n(t) = sigma^{-1} n^{-1/2} Sum_{i=1}^{[nt]} e_i. Then S_n => W (see Fuller, Theorem 5.3.5, for weaker martingale-type assumptions).

- Example: S_n(1) = sigma^{-1} n^{-1/2} Sum_{i=1}^{n} e_i => W(1) ~ N(0,1). This is the usual CLT.

- Continuous mapping theorem (a slight variation): let S_n(t) be a sequence of random functions on D[0,1] and S(t) one such function. Let f_1, ..., f_m be real valued continuous functions on the real line. Define, for i = 1, 2, ..., m and k >= 0,

      Z_{in}(S_n) = Int_0^1 t^k f_i(S_n(t)) dt,   Z_i(S) = Int_0^1 t^k f_i(S(t)) dt.

  If S_n => S then, jointly, (S_n(1), Z_{1n}(S_n), ..., Z_{mn}(S_n)) => (S(1), Z_1(S), ..., Z_m(S)).

- Example: e_t iid (0, sigma^2), Y_t = Sum_{i=1}^{t} e_i a random walk, S_n(t) = sigma^{-1} n^{-1/2} Y_{[nt]}. Going from Y to S_n:
  1. rescales t so that it runs from 0 to 1 rather than 1 to n;
  2. rescales Y so that Y_n becomes S_n(1), which has variance 1;
  3. sets S_n(t), for t in the interval [j/n, (j+1)/n), equal to S_n(j/n) = sigma^{-1} n^{-1/2} Sum_{i=1}^{j} e_i.
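The same verification can be done in numpy (an analogue of the IML program above, assuming numpy is available):

```python
# Numpy analogue of the IML check: eigenvalues of the 5x5 matrix A
# (the n = 6 case) against the closed form 1 / (4 sin^2((2i-1) pi / 22)).
import numpy as np

m = 5
a = np.arange(m, 0, -1)
An = np.minimum.outer(a, a).astype(float)   # [[5,4,3,2,1],[4,4,3,2,1],...]

eval_num = np.sort(np.linalg.eigvalsh(An))[::-1]
i = np.arange(1, m + 1)
theta = (2 * i - 1) * np.pi / (2 * (2 * m + 1))
eval_formula = 1.0 / (4 * np.sin(theta) ** 2)
```

The numerical eigenvalues match the trigonometric formula, and the inverse of A is the tridiagonal matrix displayed above.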
Thus, for example,

    sigma^{-2} n^{-2} Sum_{t=1}^{n} Y_t^2 = Int_0^1 S_n^2(t) dt => Int_0^1 W^2(t) dt,
    sigma^{-1} n^{-3/2} Sum_{t=1}^{n} Y_t => Int_0^1 W(t) dt.

Note that 1/n corresponds to dt, as one might expect: Int_0^1 S_n^2(t) dt = sigma^{-2} n^{-2} Sum Y_t^2 because S_n(t) is a step function with value sigma^{-1} n^{-1/2} Y_j over an interval of width 1/n. Conclusion: Int_0^1 W^2(t) dt must be a random variable, with Int_0^1 W^2(t) dt = F = Sum gamma_i^2 Z_i^2.

Unit root test (zero mean assumed): Y_t = Y_{t-1} + e_t with Y_0 = 0. The regression gives

    n(rho hat - 1) = [ n^{-1} Sum Y_{t-1} e_t ] / [ n^{-2} Sum Y_{t-1}^2 ] => [ (W^2(1) - 1)/2 ] / Int_0^1 W^2(t) dt.

Notice that Donsker's theorem implies this is the same distribution for any iid sequence e_t, so if you can compute it for normal e_t you have it for any iid sequence. Also,

    MSE = (n - 2)^{-1} Sum (Y_t - rho hat Y_{t-1})^2 = (n - 2)^{-1} [ Sum e_t^2 - (rho hat - 1)^2 Sum Y_{t-1}^2 ],

and we have (rho hat - 1)^2 Sum Y_{t-1}^2 = O_p(n^{-2}) O_p(n^2) = O_p(1), so MSE = n^{-1} Sum e_t^2 + O_p(n^{-1}) -> sigma^2 in probability.

Studentized statistic (zero mean assumed): because MSE -> sigma^2,

    tau = (rho hat - 1) / sqrt( MSE / Sum Y_{t-1}^2 ) => [ (W^2(1) - 1)/2 ] / [ Int_0^1 W^2(t) dt ]^{1/2}.

Now suppose our regression has an intercept. Let

    Gamma = W(1),   Wbar = Int_0^1 W(t) dt,   F = Int_0^1 W^2(t) dt;

then

    n(rho hat_mu - 1) => [ (Gamma^2 - 1)/2 - Gamma Wbar ] / [ F - Wbar^2 ],
    tau_mu            => [ (Gamma^2 - 1)/2 - Gamma Wbar ] / [ F - Wbar^2 ]^{1/2}.

Finally, for a regression with a linear trend, we define in addition V = Int_0^1 t W(t) dt, and analogous (more complicated) expressions in Gamma, Wbar, F, V give the limits of n(rho hat_tau - 1) and tau_tau. The limit random variables can be expressed jointly in terms of a single sequence of iid N(0,1) variates Z_i (see Fuller, Thms. 10.1.3 and 10.1.6, where F is G, Wbar is H, and V is K), e.g.

    F = Sum gamma_i^2 Z_i^2,   Gamma = Sum sqrt(2) (-1)^{i+1} gamma_i Z_i,   gamma_i = 2 / ((2i - 1) pi).

Letting W(t) represent a Wiener process on [0,1] (Brownian motion), we can represent the limits as, for example,

    n(rho hat - 1) => [ Int_0^1 W dW ] / [ Int_0^1 W^2 dt ].

This representation was first suggested by J. S. White in the Annals of Mathematical Statistics and has been rigorously proved and popularized by P. C. B. Phillips of Yale and C. Z. Wei of U. Maryland and their students. Although the expression contains an integral, it is a random variable and must somehow be simulated or further developed to get limit distributions.
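The nonstandard behavior of tau is easy to see by simulation. With illustrative, invented settings (4000 replications of random walks of length 100), about 5% of the tau values fall below -1.95, the DF 5% point, rather than below the normal -1.645:

```python
# Monte Carlo sketch of the Dickey-Fuller tau statistic (no intercept).
import numpy as np

rng = np.random.default_rng(4)
reps, n = 4000, 100
e = rng.standard_normal((reps, n))
Y = np.cumsum(e, axis=1)

Ylag, dY = Y[:, :-1], np.diff(Y, axis=1)
syy = np.sum(Ylag * Ylag, axis=1)
rho1 = np.sum(Ylag * dY, axis=1) / syy           # rho_hat - 1, per replication
resid = dY - rho1[:, None] * Ylag
mse = np.sum(resid * resid, axis=1) / (n - 2)
tau = rho1 / np.sqrt(mse / syy)

frac = np.mean(tau < -1.95)                      # should be near 0.05
```

The negative skew of the tau distribution (its mean is well below 0) is also visible in the simulated draws.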
Our approach (Dickey and Fuller) is to approximate the infinite weighted sum of Z_i's with a finite sum and simulate from that.

## Higher order processes

1. Note: in the lag 1 model, regress Y_t - Y_{t-1} on Y_{t-1} to get rho hat - 1 and tau right off the printout. This is the most convenient form of regression for extension to higher order.

2. Regress del Y_t on Y_{t-1}, del Y_{t-1}, del Y_{t-2}, ..., del Y_{t-p}, where del Y_t = Y_t - Y_{t-1}, to test H0: ARIMA(p,1,0) versus H1: ARIMA(p+1,0,0).

Example: Y_t = (alpha + rho) Y_{t-1} - alpha rho Y_{t-2} + e_t, with autoregressive roots rho and alpha, is the same as

    del Y_t = -(1 - alpha)(1 - rho) Y_{t-1} + alpha rho del Y_{t-1} + e_t.

Note: the coefficient of Y_{t-1} estimates -(1 - alpha)(1 - rho) = (1 - alpha)(rho - 1), a multiple of rho - 1. The limit distribution (of n times this coefficient, when rho = 1) will be (1 - alpha) times that of n(rho hat - 1), but the standard error will have the same multiplier, so tau will be unaffected in the limit; the same holds for the mean and trend models. We will show this.

Using standard matrix regression symbols, our regression gives b hat - b = (X'X)^{-1} X'E, and letting D_n = diag(n, n^{1/2}) we normalize this as

    D_n (b hat - b) = ( D_n^{-1} X'X D_n^{-1} )^{-1} ( D_n^{-1} X'E ),

where, for one lagged difference (regressor columns Y_{t-1} and del Y_{t-1}),

    D_n^{-1} X'X D_n^{-1} = [ n^{-2} Sum Y_{t-1}^2          n^{-3/2} Sum Y_{t-1} del Y_{t-1}
                              n^{-3/2} Sum del Y_{t-1} Y_{t-1}   n^{-1} Sum (del Y_{t-1})^2 ].

Claim:

    D_n^{-1} X'X D_n^{-1} => [ sigma^2 F / (1 - alpha)^2    0
                               0                            gamma(0) ],

where gamma(0) = Var(del Y_t). Notice that this limit is a diagonal matrix; with more lagged differences, a block diagonal limit structure would be obtained.

Proof: to show this limit result, note that under H0 (rho = 1):

item (a): n^{-1} Sum (del Y_{t-1})^2 -> gamma(0) in probability, by the law of large numbers for the stationary series del Y_t;

item (b): E[ ( Sum Y_{t-1} del Y_{t-1} )^2 ] = Sum_s Sum_t E( Y_{s-1} del Y_{s-1} Y_{t-1} del Y_{t-1} ); expanding via covariances, terms of the form E(Y_j del Y_k) = E( (del Y_1 + ... + del Y_j) del Y_k ) are O(1) because Sum_h |gamma(h)| < infinity, and summing the leading term over s and t gives O(n^2). Hence Sum Y_{t-1} del Y_{t-1} = O_p(n) and n^{-3/2} Sum Y_{t-1} del Y_{t-1} = O_p(n^{-1/2}) -> 0;

item (c): with del Y_t = alpha del Y_{t-1} + e_t and S_t = e_1 + ... + e_t, summing the difference equation gives Y_t - alpha Y_{t-1} = S_t + O_p(1).
Squaring and summing both sides (the cross-product terms are O_p(n) by item (b)), we have

    (1 - alpha)^2 n^{-2} Sum Y_{t-1}^2 = n^{-2} Sum S_{t-1}^2 + O_p(n^{-1}) => sigma^2 F,

which establishes the claim.

Having studied X'X, we turn to X'E:

    D_n^{-1} X'E = ( n^{-1} Sum Y_{t-1} e_t ,  n^{-1/2} Sum del Y_{t-1} e_t )',

where the second element is the (normalized) numerator of the AR(1) regression coefficient for the differenced data. Thus by standard stationary arguments (e.g., Brown's martingale central limit theorem) it converges to N(0, sigma^2 gamma(0)). Because D_n^{-1} X'X D_n^{-1} converges to a block diagonal matrix with the appropriate lower right block, we see that the coefficients on the lagged differences will have the same limit normal distribution when we regress differences on a lagged level and lagged differences as when we regress on just the lagged differences (as would be appropriate if we knew we had a unit root). Thus using F or t tests to determine an appropriate order is asymptotically justified when our data has a unit root and we regress the first difference on the lagged level and lagged differences.

Now the first element of D_n^{-1} X'E is n^{-1} Sum Y_{t-1} e_t, and what we have previously studied would be written n^{-1} Sum S_{t-1} e_t in the model we are now studying. We have, from our previous calculations,

    n^{-1} Sum Y_{t-1} e_t = (1 - alpha)^{-1} n^{-1} Sum S_{t-1} e_t + o_p(1) => (1 - alpha)^{-1} sigma^2 (Gamma^2 - 1)/2.

Combining this with item (c), we find that c hat, the coefficient on Y_{t-1}, satisfies

    n c hat => [ (1 - alpha)^{-1} sigma^2 (Gamma^2 - 1)/2 ] / [ (1 - alpha)^{-2} sigma^2 F ] = (1 - alpha)(Gamma^2 - 1) / (2F),

as expected. This means we cannot compare the raw coefficient statistic n c hat to our tables without some adjustment: we could divide n c hat by 1 - alpha hat, which we just showed to be consistent, then compare to our tables. The error mean square MSE from our regression is a consistent estimate of sigma^2, and the standard error printed by the computer is asymptotically equivalent to sqrt( MSE / Sum Y_{t-1}^2 ) because X'X is (asymptotically) block diagonal. From the results above,

    n x s.e.(c hat) = sqrt( MSE / (n^{-2} Sum Y_{t-1}^2) ) => (1 - alpha) / sqrt(F),

so dividing n c hat by n times the standard error produces the t test from the computer, and we see that this has the same limit distribution as before:

    tau = c hat / s.e.(c hat) => (Gamma^2 - 1) / (2 sqrt(F)).
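The adjustment just described can be illustrated by simulation (settings invented): for del Y_t = 0.5 del Y_{t-1} + e_t, a unit root AR(2) with alpha = 0.5, the lagged-difference coefficient consistently estimates alpha, and the raw coefficient statistic n c hat must be divided by 1 - alpha hat before comparison with the tables, while tau needs no adjustment:

```python
# ADF-style regression for Y with del Y_t = 0.5 del Y_{t-1} + e_t:
# regress del Y_t on (Y_{t-1}, del Y_{t-1}).
import numpy as np

rng = np.random.default_rng(5)
n = 3000
e = rng.standard_normal(n)
dY = np.zeros(n)
for t in range(1, n):
    dY[t] = 0.5 * dY[t - 1] + e[t]
Y = np.cumsum(dY)

X = np.column_stack([Y[1:-1], dY[1:-1]])    # Y_{t-1}, del Y_{t-1}
y = dY[2:]                                  # del Y_t
c_hat, alpha_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# alpha_hat is near 0.5; n * c_hat / (1 - alpha_hat) is the adjusted
# normalized-bias statistic comparable to the DF tables.
adjusted = (n * c_hat) / (1 - alpha_hat)
```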
This means that we may use our tau tables for the t test without any further adjustment. Adding more lagged differences, a mean, or a trend term to the model poses no additional theoretical problem. The bank of tests (coefficients and t tests for regressions possibly containing a mean or trend) has come to be known collectively as the "Dickey-Fuller test," and when lagged differences are included, as we are now discussing, the tests are called "augmented DF tests." Note the appearance in the denominators of several of our expressions of the term 1 - alpha: clearly our results will not hold in the presence of a second unit root. See Fuller, Chapter 10, for further discussion. Also, alternative estimators have been studied in the case of unit roots: Gonzalez-Farias (NCSU PhD thesis) develops the exact-MLE-based test, Dickey, Hasza and Fuller (JASA, 1984) develop the symmetric test, and Park and Fuller (JTSA, 1995, and ISU PhD thesis) develop the weighted symmetric estimator and test for unit root problems. Pantula, Gonzalez-Farias and Fuller (JBES, 1994) study power and size properties of these and other tests. There is considerable power improvement in these versus the original OLS-based tests, with the best ones appearing to be the exact MLE followed closely by the weighted symmetric, which is much easier to compute. The limit distributions are not the same as the DF tables; tables of critical values are given in Fuller's book for all of these.
we refer to (a, b) as the cointegrating vector. Granger and Weiss discuss this concept and terminology.

An example. If X_t and Y_t are wages in two similar industries, we may find that both are unit root processes. We may, however, reason that by virtue of the similar skills and easy transfer between the two industries, the difference X_t - Y_t cannot vary too far from 0 and thus certainly should not be a unit root process. The cointegrating vector is specified by our theory to be (1, -1), or (-1, 1), or (c, -c), all of which are equivalent. The test for cointegration here consists of simply testing the original series for unit roots (not rejecting the unit root null), then testing the X_t - Y_t series (and rejecting the unit root null). We just use the standard DF tables for all these tests. The reason we can use these DF tables is that the cointegrating vector was specified by our theory, not estimated from the data.

Numerical examples: Y_t = A Y_{t-1} + E_t.

1. Bivariate stationary:

    Y_{1t} = 1.2 Y_{1,t-1} - 0.3 Y_{2,t-1} + e_{1t}
    Y_{2t} = 0.4 Y_{1,t-1} + 0.5 Y_{2,t-1} + e_{2t},    E_t ~ N(0, Σ).

Here |A - λI| = (1.2 - λ)(0.5 - λ) + (0.3)(0.4) = λ² - 1.7λ + 0.72 = (λ - 0.9)(λ - 0.8), so the roots are 0.9 and 0.8 and the process is stationary. (Note: the form of the E_t distribution has no impact on this.)

2. Bivariate nonstationary:

    Y_{1t} = 1.2 Y_{1,t-1} - 0.5 Y_{2,t-1} + e_{1t}
    Y_{2t} = 0.2 Y_{1,t-1} + 0.5 Y_{2,t-1} + e_{2t}.

Here |A - λI| = λ² - 1.7λ + 0.7 = (λ - 1)(λ - 0.7), so the roots are 1 and 0.7 and this is a unit root process. Using the spectral decomposition (eigenvalues and eigenvectors) of A we can write A = T Λ T^{-1} with Λ = diag(1, 0.7). Multiplying the model through by T^{-1} and letting Z_t = T^{-1} Y_t gives Z_t = Λ Z_{t-1} + T^{-1} E_t = Λ Z_{t-1} + η_t, so the components of the Z_t vector satisfy

    Z_{1t} = Z_{1,t-1} + η_{1t}        ("common trend": unit root)
    Z_{2t} = 0.7 Z_{2,t-1} + η_{2t}    (stationary root).

Writing Y_t = T Z_t, say Y_{1t} = w_1 Z_{1t} + w_2 Z_{2t} and Y_{2t} = w_3 Z_{1t} + w_4 Z_{2t}, both observed series share the "common trend" Z_{1t}. Since Z_t = T^{-1} Y_t, the last row of T^{-1} recovers the stationary component Z_{2t}, so the last row of T^{-1} is the cointegrating vector. Notice that A is not symmetric, so T^{-1} need not equal T′.

Engle-Granger method. This is one of the earliest and easiest to understand treatments of cointegration. Write Y_{1t} = w_1 Z_{1t} + w_2 Z_{2t} and Y_{2t} = w_3 Z_{1t} + w_4 Z_{2t}, where the stationary component Z_{2t} is O_p(1) and the common trend Z_{1t} is O_p(n^{1/2}). If we regress Y_{2t} on Y_{1t}, the common trend dominates, so the regression coefficient converges to w_3/w_1 and the residual series is approximately (w_4 - w_2 w_3 / w_1) Z_{2t}, a stationary series.
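As a quick numeric illustration of the Engle-Granger idea (a sketch in Python/NumPy; the weights w1,...,w4 are made-up values, not numbers from these notes), regressing one cointegrated series on the other recovers w3/w1 and leaves a stationary residual:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
# common trend (random walk) and stationary AR(1) component
z1 = np.cumsum(rng.normal(size=n))
z2 = np.zeros(n)
for t in range(1, n):
    z2[t] = 0.7 * z2[t - 1] + rng.normal()

w1, w2, w3, w4 = 1.0, 1.0, 0.4, 1.0   # hypothetical weights
y1 = w1 * z1 + w2 * z2
y2 = w3 * z1 + w4 * z2

# OLS slope of y2 on y1 (no intercept, as in the mean-zero model)
slope = (y1 @ y2) / (y1 @ y1)
resid = y2 - slope * y1

# slope is superconsistent for w3/w1 = 0.4; the residual behaves like
# (w4 - w2*w3/w1) * z2, a stationary AR(1) with coefficient near 0.7
rho = (resid[:-1] @ resid[1:]) / (resid[:-1] @ resid[:-1])
print(slope, rho)
```

The slope estimate converges at rate n rather than the usual n^{1/2}, which is why the common trend "dominates" the regression.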
Thus a simple regression of Y_{2t} on Y_{1t} gives an estimate of the cointegrating vector, and a test for cointegration is just a test that the residuals are stationary. Let the residuals be r_t. Regress r_t - r_{t-1} on r_{t-1} (and possibly some lagged differences). Can we compare to our DF tables? Engle and Granger argue that one cannot do so. The null hypothesis is that there is no cointegration; thus the bivariate series has 2 unit roots and no linear combination is stationary. We have, in a sense, looked through all possible linear combinations of Y_{1t} and Y_{2t}, finding the one that varies least (least squares) and hence the one that looks most stationary. It is as though we had computed unit root tests for all possible linear combinations, then selected the one most likely to reject. We are thus in the area of order statistics: if you report the minimum heights from samples of 10 men each, the distribution of these minima will not be the same as the distribution of heights of individual men; nor will the distribution of unit root tests from these "best" linear combinations be the same as the distribution you would get for a pre-specified linear combination. Engle and Granger provide adjusted critical values. Here is a table comparing their (EG) tables to our DF tables for n = 100. EG used an augmented regression with 4 lagged differences and an intercept to calculate a t statistic τ, so keep in mind that part of the discrepancy is due to finite sample effects of the asymptotically negligible lagged differences.

    Prob of smaller τ     .01     .05     .10
    EG                  -3.77   -3.17   -2.84
    DF                  -3.51   -2.89   -2.58

Example. P_t = cash price on delivery date, Texas steers; F_t = futures price (source: Ken Mathews, NCSU Ag Econ). Data are bimonthly, Feb 76 through Dec 86 (60 observations).

1. Test the individual series for integration:

    ∇P_t = 7.6 - 0.117 P_{t-1} + Σ β_i ∇P_{t-i},    τ_DF = -2.20
    ∇F_t = 7.7 - 0.120 F_{t-1} + Σ β_i ∇F_{t-i},    τ_DF = -2.28.

Each series appears to be integrated: we cannot reject the unit root null at the 10% level.
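The Step 1 unit-root regressions are ordinary least squares fits; a minimal (non-augmented) version of the τ statistic can be sketched as follows. This illustrates only the mechanics on simulated series, not the steers data, and the critical values still come from the DF/EG tables:

```python
import numpy as np

def df_tau(y):
    """OLS of  dy_t = c + (rho - 1) * y_{t-1} + e_t ; return tau = (rho-1)/se."""
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(1)
walk = np.cumsum(rng.normal(size=500))        # unit root: tau is only mildly negative
ar5 = np.zeros(500)
for t in range(1, 500):
    ar5[t] = 0.5 * ar5[t - 1] + rng.normal()  # stationary: tau is strongly negative

tau_walk, tau_ar = df_tau(walk), df_tau(ar5)
print(tau_walk, tau_ar)
```

Comparing τ to the usual normal or t quantiles would be wrong; the tabled DF percentiles (and, for estimated cointegrating vectors, the EG percentiles) are what apply.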
2. Regress F_t on P_t. The fitted line is F̂_t = 5.861 + 0.9899 P_t. Let R_t be the residual series and test it for a unit root:

    ∇R_t = 0.110 - 0.9392 R_{t-1} + ...,    τ_EG = -7.428.

Comparing -7.428 to the EG critical values above, we reject the no-cointegration null. Thus, with a bit of rounding, F_t - 1.00 P_t is stationary.

The Engle-Granger method requires the specification of one series as the dependent variable in the bivariate regression. Fountis and Dickey (Annals of Statistics) study distributions for the multivariate system Y_t = A Y_{t-1} + E_t, written as ∇Y_t = -(I - A) Y_{t-1} + E_t. We show that if the true series has one unit root, then the root of the least squares estimated matrix (I - Â) that is closest to 0 has, after multiplication by n, the same limit distribution as in the standard DF tables, and we suggest the use of the eigenvectors of (I - Â) to estimate the cointegrating vector. The only test we can do with this is the null of one unit root versus the alternative of stationarity; Johansen's test, discussed later, extends this in a very nice way. Our result also holds for higher dimensional models but requires the extraction of roots of the estimated characteristic polynomial. For the Texas steer futures data, fitting the bivariate system (with 3 lagged differences) and extracting the eigenvector of the estimated matrix associated with the root nearest zero indicates that -0.69 P_t + 0.72 F_t is stationary. This is about 0.7 times the difference, so the two methods agree that P_t - F_t is stationary, as is any multiple of it.

Johansen's Method. This method is similar to that just illustrated, but has the advantage of being able to test for any number of unit roots. The method can be described as the application of standard multivariate calculations in the context of a vector autoregression (VAR); the test statistics are those found in any multivariate text. Johansen's idea, as in univariate unit root tests, is to get the right distribution for these standard calculated statistics: the statistics are standard, their distributions are not. We start with just a lag one model with mean 0 and no intercept,

    ∇Y_t = Π Y_{t-1} + E_t,

where Y_t is a p-dimensional column vector, as is E_t, and we assume E(E_t E_t′) = Λ. Let r = rank(Π): r = 0 means all linear combinations of Y_t are nonstationary; r = p means all linear combinations are stationary; 0 < r < p means cointegration.
Note that for any Π there are infinitely many pairs (α, β) such that Π = αβ′, because αβ′ = (αT)(T^{-1}β′) for any nonsingular r × r matrix T; so we do not test hypotheses about α and β, only about the rank r. Now define sums of squares and cross products from ∇Y_t and Y_{t-1}:

    S_00 = (1/n) Σ_t ∇Y_t ∇Y_t′,   S_11 = (1/n) Σ_t Y_{t-1} Y_{t-1}′,
    S_01 = (1/n) Σ_t ∇Y_t Y_{t-1}′,   S_10 = S_01′.

Now write down the likelihood, conditional on Y_0:

    L(Π, Λ) = (2π)^{-np/2} |Λ|^{-n/2} exp( -(1/2) Σ_{t=1}^n (∇Y_t - Π Y_{t-1})′ Λ^{-1} (∇Y_t - Π Y_{t-1}) ).

If Π is assumed to be full rank (r = p), the likelihood is maximized at the usual least squares regression estimate

    Π̂ = ( Σ_t ∇Y_t Y_{t-1}′ )( Σ_t Y_{t-1} Y_{t-1}′ )^{-1} = S_01 S_11^{-1},

and Λ̂ = (1/n) Σ_t (∇Y_t - Π̂Y_{t-1})(∇Y_t - Π̂Y_{t-1})′ = S_00 - S_01 S_11^{-1} S_10.

H_0: there are r stationary linear combinations of Y_t (linearly independent), and thus p - r unit root linear combinations; i.e., H_0: r "cointegrating vectors" and p - r "common trends"; i.e., H_0: Π = αβ′ with α of dimension p × r and β of dimension p × r. So far we have the unrestricted estimate Π̂ and can evaluate the likelihood there. The principle of likelihood ratio requires that we maximize the likelihood under Π = αβ′ and compare to the unrestricted maximum. That is, we now want to maximize

    (2π)^{-np/2} |Λ|^{-n/2} exp( -(1/2) Σ_t (∇Y_t - αβ′Y_{t-1})′ Λ^{-1} (∇Y_t - αβ′Y_{t-1}) ).

Step 1: For any given β we can compute β′Y_{t-1} and find the corresponding α̂ by regression in the model ∇Y_t = α(β′Y_{t-1}) + E_t, and this is simply

    α̂(β) = ( Σ_t ∇Y_t Y_{t-1}′ β )( β′ Σ_t Y_{t-1}Y_{t-1}′ β )^{-1} = S_01 β (β′ S_11 β)^{-1}.

Step 2: Search over β for the maximum. To do this, plug α̂(β) into the likelihood, which now becomes a function of β and Λ. Recall from general regression that the exponent can be written as a trace, exp( -(1/2) trace( Λ^{-1} E′E ) ), where E is the residual matrix; by our usual maximum likelihood arguments we estimate Λ, for any given β, by Λ̂(β) = (1/n) E′E, so that the maximized likelihood is proportional to |Λ̂(β)|^{-n/2} exp(-np/2). Our goal is thus to maximize the likelihood by minimizing |Λ̂(β)|.
becomes We W Mb lSool ijn l Sn l Recall Cholesky Root 11 psd and symmetric 9 11 U U U upper triangular SAS PROC IML UROOTSl l l5 UllI U 1 510 5amp1501U71U5 l U U l Note We have seen that H 7 048 allows a lot of exibility in choosing the columns of Corresponding adjustments in a will preserve H We choose 8 U U I LetZUB 8U 1Z Fact Ci i1th column on is eigenvector of symmetric matrix so can get it in SAS Z I 7 U 1 510 5amp1501U 1Z diagonal matrix 1 Cholesky on 11 17371510 507015011771 3 EIGENVECTORS Q EIGENVALUES A1 gt AZ gt A3 gt gt AP Coint 18 4 U42 I A A A 5 Get a by regressing Wt 011 YH a sll rlm39sm Note eigenvalues are called quotsquared canonical correlationsquot between VY and YEA PROC CANCOR will compute these for you Testin Maximized unconditionally Maximized 5 under H0 Now look at likelihood ratio test Summary 1 Choose 3 to minimize 2 U invertible so any 3 is expressible as 3 U lZ for some choice of Z 3 mth of 3 vector is arbitrary so we can specify Z Z I 4 Pick Z to minimize lZ U 1SOO 7 01 1510U 1Z and thus Z would be picked as the matrix whose columns are those associated with the smallest lAi that is the largest quotsquared canonical correlationsquot A H0 r 3 r0 H1 r gt re p nZ A n 110 p LRT mquo E Ah 2 11 liAi nZ mquoUH1 I Mop2 1910 Al 1115 Now in standard likelihood ratio testing we often take the log of the likelihood ratio The reason for this is that it often leads to a Chisquare limit distribution There is no hope of that happening in this nonstandard case but Johansen still follows that tradition 0 suggesting that we reject when Z ln1 7 M is small that is we will reject when ir01 p 7 n E ln1 7 Ai 1s large where AIOH gt Are gt AIOH gt gtp iro1 are the pro smallest squared canonical correlations This is Johansen s quottracequot test To keep things straight Coint 19 1 You use the smallest squared canonical correlations thus making lA large nearer l and hence making n E lnl small ie you select the A that best protect H0 In a later article Johansen 
In a later article, Johansen notes that you may have better power if you opt to test H_0: r = r_0 versus H_1: r = r_0 + 1, and thus use only the largest, λ̂_{r_0+1}, of the smallest λ̂'s. This is Johansen's "maximal eigenvalue" test.

2. Under H_0 (or H_1) you have at least r_0 cointegrating vectors, and hence at most p - r_0 "common trends". Therefore rejection of the null hypothesis means you have found yet another cointegrating vector.

3. The interpretation of a cointegrating vector is that you have found a linear combination of your vector components that cannot vary too far from 0, i.e., you have a "law" that cannot be too badly violated. A departure from this relationship would be called an "error", and if we start at any point and forecast into the future with this model, the forecasts will eventually satisfy the relationship. Therefore this kind of model is referred to as an "error correction" model.

Example:

    Z_{1t} = Z_{1,t-1} + e_{1t}          (random walk)
    Z_{2t} = 0.8 Z_{2,t-1} + e_{2t}      (stationary).

We observe Y_{1t} = Z_{1t} + 0.9 Z_{2t} and Y_{2t} = Z_{1t} - 0.6 Z_{2t}. Notice that Y_{1t} - Y_{2t} = 1.5 Z_{2t} is stationary, so we are saying that Y_{1t} can't wander too far from Y_{2t}, and yet both Y's are nonstationary. They are wandering around, but wandering around together, you might say. Now, in practice we would just observe the Y's. Notice that

    Y_t = [1  0.9; 1  -0.6] Z_t,

so

    Y_t = [1  0.9; 1  -0.6] [1  0; 0  0.8] [1  0.9; 1  -0.6]^{-1} Y_{t-1} + noise,

or exactly

    Y_t = [0.88  0.12; 0.08  0.92] Y_{t-1} + noise.

Now suppose Y_{1t} = 12 and Y_{2t} = 2. These are not very close to each other, and thus are in violation of the equilibrium condition Y_{1t} ≈ Y_{2t}. In the absence of future shocks, does the model indicate that this "error" will "correct" itself?
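A quick numeric check of the error correction (a Python sketch using the transition matrix from this example):

```python
import numpy as np

A = np.array([[0.88, 0.12],
              [0.08, 0.92]])   # transition matrix from the example
y = np.array([12.0, 2.0])      # starting point violating Y1 = Y2

spreads = []
for _ in range(100):           # forecast ahead with no future shocks
    y = A @ y
    spreads.append(y[0] - y[1])

print(y)            # both components approach 6
print(spreads[:2])  # the gap of 10 shrinks: 8.0, then 6.4
```

The spread Y1 - Y2 = 1.5 Z2 decays geometrically at rate 0.8 (the stationary eigenvalue), while the common level settles at 0.4 Y1 + 0.6 Y2 = 0.4(12) + 0.6(2) = 6, the coordinate along the unit-root direction (the first row of T^{-1}).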
702 1 7 1 YEA noise so we see that the discrepancy from equilibrium 1 7 1 YEA which in our case is 10 is computed then 12 times this is subtracted from Y1 and 08 times this is added to Y2 The quotspeed of adjustmentquot is thus faster in Y1 and we end up farther from the original Y1 than from the original Y2 Also although the model implies assuming 0 initial condition that EY1 EY2 0 there is nothing drawing the series back toward their theoretical means 0 Shocks to this series have a permanent effect on the levels of the series but only a temporary effect on the relationship equality in this model between the series components Remaining to do Q1 What are the critical values for the test Q2 What if we have more than 1 lag Q3 What if we include intercepts trends etc Q1 Usually the likelihood ratio test has a limit Chisquare distribution For example a regression F test is such that 4F 7 xi Also F t2 7 Z2 is a special case Now we have seen that the t statistic T has a nonstandard limit distribution expressible as a functional of Brownian Motion Bt on 01 We found T WMWW1 1 WMm KWMm and we might thus expect Johansen39s test statistic to converge to a multivariate analogue of this expression Indeed Johansen proved that his likelihood ratio trace test converges to a variable that can be expressed as a functional of a vector valued Brownian Motion with independent components channels as follows ie the error term has variance matrix 102 Coint 21 LRT gtrace folBtdBt folBtB tdt 1fBtdBt For a Brownian motion of dimension m l2345 Table 1 page 239 of Johansen he computes the distribution of the LRT by Monte Carlo Empirically he notes that these percentiles are quite close to the percentiles of cxi where f2m2 and c8558f We will not repeat Johansen39s development of the limit distribution of the LRT but will simply note that it is what would be expected by one familiar with the usual limit results for F statistics and with the nonstandard limit distributions that arise with unit 
As one might expect, the m = 1 case can be derived from the τ distribution.

Q2. What happens in higher order processes?

    Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + A_3 Y_{t-3} + ... + A_k Y_{t-k} + E_t

can be rewritten as

    ∇Y_t = -(I - A_1 - A_2 - ... - A_k) Y_{t-1} - (A_2 + A_3 + ... + A_k)∇Y_{t-1} - (A_3 + ... + A_k)∇Y_{t-2} - ... - A_k ∇Y_{t-k+1} + E_t,

which has the form

    ∇Y_t = -(I - A_1 - A_2 - ... - A_k) Y_{t-1} + B_1 ∇Y_{t-1} + B_2 ∇Y_{t-2} + ... + B_{k-1} ∇Y_{t-k+1} + E_t.

The characteristic equation is |I m^k - A_1 m^{k-1} - A_2 m^{k-2} - ... - A_k| = 0. Now if m = 1 is a root of this, we have |I - A_1 - A_2 - ... - A_k| = 0. The number of unit roots in the system is the number of 0 roots of the matrix (I - A_1 - A_2 - ... - A_k), and the rank of this matrix is the number of cointegrating vectors. (I am assuming a vector autoregression in which each component has at most one unit root, i.e., differencing makes each component stationary.) While this parameterization most closely resembles the usual unit root testing framework, Johansen chooses an equivalent way to parameterize the model, placing the lagged levels at lag k instead of lag 1. His model is written

    ∇Y_t = -(I - A_1)∇Y_{t-1} - ... - (I - A_1 - A_2 - ... - A_{k-1})∇Y_{t-k+1} - (I - A_1 - A_2 - ... - A_k) Y_{t-k} + E_t,

and he is checking the rank of Π = I - A_1 - A_2 - ... - A_k.

Recall: regression can be done in 3 steps. Suppose we are regressing Y on X_1, X_2, ..., X_{k-1}, X_k and we are interested in the coefficient matrix of X_k. We can get that coefficient matrix in 3 steps:

Step 1: Regress Y on X_1, X_2, ..., X_{k-1}; keep the residuals R_Y.
Step 2: Regress X_k on X_1, X_2, ..., X_{k-1}; keep the residuals R_k.
Step 3: Regress R_Y on R_k; this gives the same coefficient matrix as in the full regression.

So Johansen does the following in higher order models:

Step 1: Regress ∇Y_t on ∇Y_{t-1}, ∇Y_{t-2}, ..., ∇Y_{t-k+1}; residuals R_V.
Step 2: Regress Y_{t-k} on ∇Y_{t-1}, ∇Y_{t-2}, ..., ∇Y_{t-k+1}; residuals R_k.
Step 3: Compute the squared canonical correlations between R_V and R_k.

The idea is very nice: Johansen maximizes the likelihood for any Π_k with respect to the other parameters by performing steps 1 and 2. Having done this, he is back to a lag 1 type of problem. By analogy: if, in ordinary unit root testing, the null hypothesis were true, you could estimate the autoregressive coefficients consistently by regressing the first difference on the lagged first differences.
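The three-step partial regression recipe above is the Frisch-Waugh result, which is easy to verify numerically (a Python sketch with made-up regressors):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
X1 = rng.normal(size=(n, 2))                         # the "other" regressors
Xk = rng.normal(size=(n, 1)) + X1 @ [[0.5], [0.2]]   # correlated with X1
Y = X1 @ [[1.0], [-1.0]] + 2.0 * Xk + rng.normal(size=(n, 1))

def ols(Y, X):
    return np.linalg.lstsq(X, Y, rcond=None)[0]

# full regression: coefficient on Xk
full = ols(Y, np.hstack([X1, Xk]))[-1, 0]

# three-step version
Ry = Y - X1 @ ols(Y, X1)        # step 1: residuals of Y on X1
Rk = Xk - X1 @ ols(Xk, X1)      # step 2: residuals of Xk on X1
step3 = ols(Ry, Rk)[0, 0]       # step 3: regress Ry on Rk

print(full, step3)              # identical up to rounding
```

This identity is exactly what lets Johansen sweep out the lagged differences first and then treat the problem as a lag 1 canonical correlation computation.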
Having these estimates α̂_i, you could compute a "filtered" version of Y_t, namely Z_t = Y_t - α̂_1 Y_{t-1} - ... - α̂_{k-1} Y_{t-k+1}. Under the null hypothesis, Z_t = Y_t - α_1 Y_{t-1} - ... - α_{k-1} Y_{t-k+1} is a random walk, so you could regress Z_t on Z_{t-1} and, in large samples, compare the results to our DF unit root tables, both for the coefficient and the t statistic. Note that all of Johansen's statistics are multivariate analogues of τ; there is nothing like the "normalized bias" test n(ρ̂ - 1).

Example. Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + A_3 Y_{t-3} + E_t, with Y_t = (Y_{1t}, Y_{2t}, Y_{3t}, Y_{4t})′ and 100 observations.

Step 1: Regress ∇Y_t on ∇Y_{t-1}, ∇Y_{t-2}.
Step 2: Regress Y_{t-3} on ∇Y_{t-1}, ∇Y_{t-2}.
Step 3: Squared canonical correlations: 0.010, 0.020, 0.080, 0.460.

Test H_0: r = 0 vs H_1: r > 0. With r_0 = 0 and p = 4, use the 4 - 0 = 4 smallest canonical correlations:

    LRT = -100 [ ln(0.99) + ln(0.98) + ln(0.92) + ln(0.54) ] = 72.98.

Johansen's Table 1 gives the critical value for m = 4 (H_0 implies m = 4 common trends): the critical value is 41.2. Reject H_0: there is at least 1 cointegrating vector.

Test H_0: r = 1 vs H_1: r > 1. With r_0 = 1 and p = 4, use the 4 - 1 = 3 smallest canonical correlations:

    LRT = -100 [ ln(0.99) + ln(0.98) + ln(0.92) ] = 11.36.

Look in Johansen's table under m = 3 common trends: the critical value is 23.8. Do not reject H_0: there are no more cointegrating vectors.

Q3. Johansen later wrote a paper addressing the intercept case. The trend issue is a bit tricky, as is true in the univariate case. Even the intercept case has some interesting features in terms of how intercepts in the model for the observed series translate into the underlying canonical series (common trends).

An additional note. We are used to taking -2 ln(LR) in maximum likelihood estimation and likelihood ratio testing. We do this because, under certain regularity conditions, it produces test statistics with standard limiting distributions (Chi-square, usually). In this case we do not have the required regularity conditions for the test of r (the number of cointegrating vectors); otherwise Johansen would not have needed to do any new tabulations. Because this is the case, Johansen could have opted to use other functions of the eigenvalues, such as n Σ λ̂_i, in place of his -n Σ ln(1 - λ̂_i) statistic.
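The trace-test arithmetic in the worked example is easy to reproduce (a Python sketch; the canonical correlations are the ones given above):

```python
import math

def trace_stat(n, lams, r0):
    """Johansen trace statistic -n * sum(ln(1 - lam_i)) over the p - r0
    smallest squared canonical correlations (lams sorted ascending)."""
    p = len(lams)
    return -n * sum(math.log(1.0 - lam) for lam in lams[: p - r0])

lams = [0.010, 0.020, 0.080, 0.460]        # smallest to largest
t0 = trace_stat(100, lams, 0)   # 72.98: reject r = 0 (critical value 41.2)
t1 = trace_stat(100, lams, 1)   # 11.36: do not reject r = 1 (critical value 23.8)
print(round(t0, 2), round(t1, 2))
```

The sequence of tests stops at the first non-rejection, giving r̂ = 1 cointegrating vector here.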
We show, for the case of a univariate series (just one λ), that both of these statistics converge to the same distribution, and it is the distribution of τ². First note that for the series Y_t = Y_{t-1} + e_t we have ∇Y_t = Y_t - Y_{t-1} = e_t, so our regression MSE is Σ e_t²/n = S_00, our standard error is [ S_00 / (n S_11) ]^{1/2}, and ρ̂ - 1 = S_01/S_11, so

    τ = (ρ̂ - 1) / [ S_00/(n S_11) ]^{1/2} = n^{1/2} S_01 / (S_00 S_11)^{1/2}.

Now the "Cholesky root" of the univariate quantity S_11 is of course just S_11^{1/2}, and Johansen looks at the scalar

    S_11^{-1/2} S_10 S_00^{-1} S_01 S_11^{-1/2} = S_01² / (S_00 S_11),

calling it λ̂. We see immediately that n λ̂ is just τ², so we already know its distribution; the multivariate cases are analogous. This also shows that λ̂ = O_p(1/n), so if we expand Johansen's statistic using the Taylor series of ln(1 - x) around x = 0, we have

    -n ln(1 - λ̂) = n λ̂ + n·O_p(1/n²) = n λ̂ + O_p(1/n),

proving, as claimed, that these two statistics have the same limit distribution (of course there are some extra details needed for the multivariate case). Notice that for a single λ̂ these two statistics are monotone transforms of each other, so even in finite samples (provided we had the right distributions) they would give exactly equivalent tests. For the more interesting multivariate case they are the same only in the limit.

GENERALIZED MODELS WITH APPLICATIONS
FALL 2004
R. Adam Hoppes
Department of Statistics, North Carolina State University

Contents
1 Introduction and Motivation
  1.1 Examples: Securities and Commodities
2 Univariate ARCH Processes
  2.1 ARCH(1) Model
  2.2 ARCH(p) Model
3 Univariate GARCH Processes
  3.1 GARCH(p,q) Model
  3.2 Extended GARCH(p,q) Models
4 References

ST 730: CONDITIONAL HETEROSKEDASTIC MODELS (R. A. Hoppes)

1 Introduction and Motivation

For the most part, an introductory course in time series focuses on analyzing the conditional mean behaviour of a process, i.e., the first moment,

    μ = E(y_t) = ∫ y f_y(y) dy.

During this semester a gallimaufry of techniques has enabled us to model the characteristics of these "special" types of processes. Recent problems in finance have motivated the study of modelling the conditional variance of a process, or the second central moment,

    σ² = E[(y_t - μ)²] = ∫ (y - μ)² f_y(y) dy.

This concept is applicable in many risk management applications, such as options pricing and value-at-risk estimation. However, the most pervasive role of modelling volatility is illustrated in Example 1.1.

1.1 Examples: Securities and Commodities

Example 1.1 (Amazon Series). The Amazon series (Brocklebank & Dickey, 2003) represents daily stock prices from May 16, 1997 to May 25, 1999. The following scenario is indicative for analyzing security prices within the ARMA modelling framework.

- Plot the x_t series. It appears to have a nonconstant variance. The natural log transformation is well known as a variance stabilizer; hence we analyze the ln(x_t) series. [Figure 1.1: Amazon and ln(Amazon) closing prices.]

- Plot the autocorrelation function (ACF) and partial autocorrelation function (PACF); see Figure 1.2. As the lag length h increases, the estimated autocorrelations for x_t and ln(x_t) decay slowly. Hence first differences are taken to correct for nonstationarity in the mean: y_t = ln(x_t) - ln(x_{t-1}). [Figure 1.2: Amazon series x_t autocorrelations.]

- After we address nonconstant variance and nonstationarity, the y_t series resembles a white noise process (see Figure 1.3). This is rather unfortunate. Any suggestions? [Figure 1.3: first-differenced log Amazon series y_t and its autocorrelations.]
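The transformation pipeline in this example (log, then first difference) can be sketched as follows (Python; a simulated geometric-random-walk price path stands in for the Amazon data, and the acf helper is an ad hoc sample autocorrelation):

```python
import numpy as np

rng = np.random.default_rng(0)
# simulated "prices": the log price follows a random walk with drift
log_p = np.cumsum(0.001 + 0.02 * rng.normal(size=1000))
prices = 100 * np.exp(log_p)

y = np.diff(np.log(prices))     # y_t = ln(x_t) - ln(x_{t-1}), the log returns

def acf(x, h):
    x = x - x.mean()
    return (x[:-h] @ x[h:]) / (x @ x)

r_level = acf(np.log(prices), 1)   # near 1: log prices are highly persistent
r_diff = acf(y, 1)                 # near 0: log returns resemble white noise
print(r_level, r_diff)
```

The slow ACF decay of the level series and the near-zero autocorrelation of the differenced series are exactly the patterns described in the bullets above.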
Hence our final model for y_t is an ARIMA(0,1,0):

    ln(x_t) = ln(x_{t-1}) + e_t.

This is well known as the random walk model, or stock market model. Random walk theory merely states that future price movements cannot be predicted from past price movements alone. For example, the change in the stock price from time t to t+1 is unpredictable with past information. What is the statistician, econometrician, or financier to do?

The log differences of the Amazon series resembled white noise in Example 1.1. By an analogous ARMA model building process, the IBM series (Example 1.2) follows a similar random walk process. In light of this rather unsettling plight, the y_t series has other types of structure present.

Example 1.2 (IBM Series). The IBM series represents daily stock returns from February 2, 1984 to December 31, 1991 (Zivot and Wang, 2003).

- Firstly, the distribution of y_t has heavier tails than a normal distribution. Kurtosis, the normalized fourth central moment of a distribution, is defined as κ = μ_4/σ⁴ and measures the degree of peakedness in a distribution. The standard normal distribution has a kurtosis of κ_{N(0,1)} = 3. In the literature, "leptokurtic" is often used to describe distributions that are peaked and have fat tails.

    Sample moments:  mean 0.0001348,  std 0.01443,  skewness 2.004,  kurtosis 38.27

- Secondly, the changes in y_t tend to be clustered. This may be easier to visualize in a graph of the squared y_t series, and even easier to see in Example 1.3. Hence dependence in the variability, or volatility, of the observed values is present. [Figure 1.4: IBM series y_t and y_t².]
[Figure 1.5: IBM series y_t and y_t² autocorrelations.]

- And finally, the y_t² series is correlated and nonnegative.

Example 1.3 (Copper Series). In this example the concept of volatility clustering is visually represented more clearly. The copper series represents the cash settlement of copper prices, in US dollars, in the spot market on the London Metal Exchange from January 3, 1989 to October 31, 2002. [Figure 1.6: copper series y_t and y_t².]

4. Non-normal AR(1) model for y_t² with χ²(1) errors. Likewise, we could express the ARCH(1) model y_t = σ_t ε_t, σ_t² = α_0 + α_1 y_{t-1}², as

    y_t² = σ_t² ε_t² = α_0 + α_1 y_{t-1}² + v_t,    where v_t = σ_t² (ε_t² - 1).

Since ε_t ~ iid N(0,1), ε_t² ~ iid χ²(1). As a result, (ε_t² - 1) is a χ²(1) random variable shifted to have mean zero, so y_t² follows a (non-normal) AR(1) model.

5. Examine the behaviour of y_t unconditionally. Using the law of iterated expectations, E(y_t) = E[E(y_t | y_{t-1})], and the variance computing formula V(y_t) = E[V(y_t | y_{t-1})] + V[E(y_t | y_{t-1})], the following ARCH(1) properties are examined:

    E(y_t) = E[E(y_t | y_{t-1})] = E(0) = 0,
    V(y_t) = E[V(y_t | y_{t-1})] + V[E(y_t | y_{t-1})]
           = E(α_0 + α_1 y_{t-1}²) + 0 = α_0 + α_1 V(y_{t-1}).

By stationarity V(y_t) = V(y_{t-1}), so V(y_t) = α_0/(1 - α_1). Because the variance of y_t must be positive, α_0 > 0, whereas the support of α_1 is restricted to [0, 1); typically this constraint is stated as 0 ≤ α_1 < 1.

6. Higher order moments of y_t. In some applications, assumptions on higher moments of y_t are necessary; this is critical in extreme value theory (EVT) settings such as stress testing. In particular, we require the fourth moment to be finite, E(y_t⁴) < ∞. It can be shown that

    E(y_t⁴) = 3 α_0² (1 + α_1) / [ (1 - α_1)² (1 - 3α_1²) ],

which is finite provided 3α_1² < 1. Combining this with the previous constraint gives 0 ≤ α_1 < 1/√3. The kurtosis of y_t is

    κ = 3 (1 - α_1²) / (1 - 3α_1²) > 3,

so the ARCH(1) model has heavier tails than the normal distribution.
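These unconditional moments are easy to check by simulation (a Python sketch; α_0 = 0.2 and α_1 = 0.5 are arbitrary values satisfying the constraints, so V(y) = 0.4 and the population kurtosis is 3(1 - 0.25)/(1 - 0.75) = 9):

```python
import numpy as np

a0, a1 = 0.2, 0.5                # a1 < 1/sqrt(3), so the kurtosis is finite
rng = np.random.default_rng(3)
n = 200_000
y = np.zeros(n)
for t in range(1, n):
    sig2 = a0 + a1 * y[t - 1] ** 2       # conditional variance
    y[t] = np.sqrt(sig2) * rng.normal()  # y_t = sigma_t * eps_t

var = y.var()
kurt = np.mean(y ** 4) / var ** 2
print(var)    # near a0 / (1 - a1) = 0.4
print(kurt)   # well above 3 (population value 9)
```

Note the sample kurtosis converges slowly for ARCH processes (its own sampling variability depends on even higher moments), so it typically lands near but not exactly at the population value.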
Weaknesses of the ARCH model:

- The model treats positive and negative returns similarly; empirical evidence suggests the variability of asset returns differs between gains and losses in financial markets.
- Strong restrictions are placed on the α_i coefficients.
- Oftentimes the model overestimates volatility, because it responds slowly to large isolated shocks in the asset return series.

Model Building.

1. Build a model for the observed time series in order to remove any serial correlation in the data. With respect to financial assets, this typically involves taking logarithms and first differences; define this result to be y_t.
2. Examine the squared series to check for conditional heteroskedasticity. In other words, plot the ACF and PACF of y_t². (What would you expect to see if conditional heteroskedasticity is present?) For the theoreticians, two tests for detecting ARCH effects are available: (1) the Ljung-Box test, and (2) the Lagrange Multiplier test proposed by Engle (1982), for which under the null hypothesis no ARCH effects are present: α_1 = ... = α_p = 0.
3. Find the appropriate order of the ARCH model, obtain parameter estimates via maximum likelihood estimation, perform diagnostic tests and plots, and use the final model to forecast.

Example 2.1 (ARCH(2) Simulation). In the following example, 500 observations from an ARCH(2) process are simulated in S-Plus in three different ways. In the first simulation, the starting values of the y_t series are randomly generated from the derived unconditional distribution.

    # 1. Generate an ARCH(2) series of length 500 (theory)
    set.seed(7)
    n <- 502
    e <- rnorm(n)
    e <- rts(e)
    y <- double(n)                  # initialize the y vector to size n
    a <- c(0.05, 0.50, 0.35)        # alpha coefficient vector
    # starting values outside of the loop, from the unconditional distribution
    y[1:2] <- rnorm(1, sd = sqrt(a[1]/(1.0 - a[2] - a[3])))
    for(i in 3:n)                   # generate the ARCH(2) process
        y[i] <- e[i] * sqrt(a[1] + a[2]*y[i-1]^2 + a[3]*y[i-2]^2)
    y <- rts(y[3:502])              # drop the first two elements of y

In the second simulation, drop the first 100 observations from the series. Why is this method of simulation consistent? Recall the AR(p) simulation from the beginning of the semester.
    # 2. Generate an ARCH(2) series (practical)
    set.seed(816)
    ee <- rnorm(n + 100)
    yy <- double(n + 100)
    for(i in 3:(n + 100))
        yy[i] <- ee[i] * sqrt(a[1] + a[2]*yy[i-1]^2 + a[3]*yy[i-2]^2)
    yy <- rts(yy[103:(n + 100)])    # drop the first 100 observations

And finally, let's just use the simulate.garch function available in the S+FinMetrics module:

    # 3. Simulate an ARCH(2) series of length 500
    module(finmetrics)
    ysim <- simulate.garch(model = list(a.value = 0.05, arch = c(0.50, 0.35)),
                           n = 500, n.start = 100, sigma = F)
    names(ysim)

Plot the error series e produced in simulation 1, as well as the ARCH(2) processes in simulations 1 and 2. Can you visualize the conditional heteroskedasticity? Can you visualize an autoregressive structure? In particular, what properties are you looking for in these graphs? [Figure 2.7: white noise, ARCH(2) simulation 1, and simulation 2.]

Compare and contrast the ACF and PACF plots for the pairs of series (e_t, y_t) and (e_t², y_t²). Can you picture the serial correlation between the plots? [Figure 2.8: ACF and PACF for e_t, y_t, e_t², y_t².] The next plot contains the simulated values of y_t in addition to the simulated σ_t values. [Figure 2.9: ARCH(2) series generated from simulate.garch, y_t and σ_t.]

Example 2.2 (Example 2.1 continued). Analysis for the simulated series in the previous example. Plotting the series along with the ACF and PACF for y_t and y_t² is always a good starting point. In addition, perform the Ljung-Box and Lagrange Multiplier tests to detect conditional heteroskedasticity effects.
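Engle's LM test is just an auxiliary regression of y_t² on its own lags: under H_0 (no ARCH effects), T·R² from that regression is approximately χ²_p. A sketch of the mechanics (Python/NumPy on simulated series; this is not the S-Plus test function itself):

```python
import numpy as np

def arch_lm(y, p):
    """Engle's LM statistic: T * R^2 from regressing y_t^2 on p lags of y^2."""
    y2 = y ** 2
    Y = y2[p:]
    X = np.column_stack([np.ones(len(Y))] +
                        [y2[p - j:-j] for j in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    r2 = 1 - resid.var() / Y.var()
    return len(Y) * r2

rng = np.random.default_rng(5)
n = 2000
white = rng.normal(size=n)
arch = np.zeros(n)
for t in range(1, n):
    arch[t] = rng.normal() * np.sqrt(0.05 + 0.3 * arch[t - 1] ** 2)

lm_white = arch_lm(white, 5)   # small: compare to chi-square(5), 95% point 11.07
lm_arch = arch_lm(arch, 5)     # large: clear evidence of ARCH effects
print(lm_white, lm_arch)
```

Rejecting at step 2 of the model-building recipe is what justifies moving on to step 3 and fitting an ARCH/GARCH order.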
These tests can be performed by calling the autocorTest(y, lag = p) and archTest(y, lag = p) functions in S-Plus, respectively.

    Null Hypothesis: no ARCH effects
    Test Stat 238.8776   p.value 0.0000

[Figure 2.10: the y_t series, with ACF and PACF of y_t and y_t².] In Figure 2.10, the y_t series mostly resembles white noise, except between time 125 and time 175; the ACF and PACF plots are difficult to interpret (discuss the plots). The y_t² series exhibits a choppy exponential decay in the ACF plot, whereas the PACF displays groups of significant spikes. To start, fit an ARCH(1) model, in S-Plus via

    y.splus <- garch(y ~ 1, ~ garch(1, 0))

or in R via

    yr <- garch(y, order = c(0, 1))

Discuss the output and plots.

GARCH(1,0):

    > summary(y.mod)
    Call: garch(formula.mean = y ~ 1, formula.var = ~ garch(1, 0))
    Mean Equation: y ~ 1
    Conditional Variance Equation: ~ garch(1, 0)
    Conditional Distribution: gaussian

    Estimated Coefficients:
               Value     Std Error   t Value   Pr(>|t|)
    A          0.1145    0.006328    ...       ...
    ARCH(1)    1.0082    0.085263    11.83     0.00

    AIC(2) = 849.9204
    BIC(2) = 858.3496

    Normality Test:
    Jarque-Bera ...   Shapiro-Wilk 0.9758 (P-value 0.003718)

    Ljung-Box test for standardized residuals:
    Statistic 18.58   P-value 0.09911   Chi^2 df 12

    Ljung-Box test for squared standardized residuals:
    Statistic 104.6   P-value 1.11e-016   Chi^2 df 12

    Lagrange multiplier test:
    TR^2 87.08   P-value 1.812e-013   F-stat 9.635   P-value 0.0001024

[Figure 2.11: residuals and standardized residuals from the GARCH(1,0) model.] The diagnostics indicate the ARCH(1) fit is inadequate: α̂_1 exceeds 1, the squared standardized residuals are still autocorrelated, and the LM test still detects ARCH effects. So fit a GARCH(2,0):

    Call: garch(formula.mean = y ~ 1, formula.var = ~ garch(2, 0))
    Mean Equation: y ~ 1
    Conditional Variance Equation: ~ garch(2, 0)
    Conditional Distribution: gaussian

    Estimated Coefficients:
               Value      Std Error   t Value   Pr(>|t|)
    A          0.04977    0.009297    5.354     6.5886e-008
    ARCH(1)    0.55543    0.089053    6.237     4.7716e-010
    ARCH(2)    0.43562    0.094299    4.620     2.452e-006
    AIC(3) = 720.5209
    BIC(3) = 733.1647

    Normality Test:
    Jarque-Bera 0.5854 (P-value 0.7463)   Shapiro-Wilk 0.9856 (P-value 0.5714)

    Ljung-Box test for standardized residuals:
    Statistic ...   P-value 0.3761   Chi^2 df 12

    Ljung-Box test for squared standardized residuals:
    Statistic 4.91   Chi^2 df 12

    Lagrange multiplier test:
    TR^2 6.374   P-value 0.8961   F-stat 0.5871   P-value 0.9304

This time all diagnostics are clean: the ARCH(2) fit is adequate, as it should be, since the data were simulated from an ARCH(2) process. [Figure 2.12: residuals and standardized residuals from the GARCH(2,0) model.]

3 Univariate GARCH Processes

A generalized version of the ARCH model, or GARCH, was developed by Bollerslev (1986). The GARCH model estimates the time-varying volatility σ_t based not only on lagged values of the y_t series, but also on lagged values of σ_t². Thus the conditional variance can be modelled within the class of ARMA models. In practice, a large number of parameters is often required to obtain a good model fit when estimating ARCH(p) models; the GARCH(p,q) model allows a more parsimonious representation of the underlying process.

3.1 GARCH(p,q) Model

The GARCH(p,q) model is

    y_t = σ_t ε_t,
    σ_t² = α_0 + α_1 y_{t-1}² + α_2 y_{t-2}² + ... + α_p y_{t-p}² + β_1 σ_{t-1}² + β_2 σ_{t-2}² + ... + β_q σ_{t-q}²
         = α_0 + Σ_{i=1}^p α_i y_{t-i}² + Σ_{j=1}^q β_j σ_{t-j}²,

where ε_t ~ iid(0,1) and the parameters satisfy the constraints α_0 > 0, α_i ≥ 0 for all i = 1,...,p, β_j ≥ 0 for all j = 1,...,q, and Σ_{i=1}^{max(p,q)} (α_i + β_i) < 1.

3.2 Extended GARCH(p,q) Models

IGARCH: The integrated GARCH models are unit-root GARCH models.

GARCH-M: The GARCH-in-mean model incorporates the conditional variance into the mean, so that an additional layer is added to the original GARCH(p,q) model. For example,

    y_t = μ + c σ_t² + w_t,                                                    (3.11)
    w_t = σ_t ε_t,                                                             (3.12)
    σ_t² = α_0 + α_1 w_{t-1}² + ... + α_p w_{t-p}² + β_1 σ_{t-1}² + ... + β_q σ_{t-q}².   (3.13)

CHARMA: The conditional heteroskedastic ARMA model of Tsay (1987) applies a random-effects structure to produce conditional heteroskedasticity.
The exponential GARCH, proposed by Nelson (1991), allows for asymmetric effects between positive and negative asset returns.

TGARCH: The threshold GARCH model incorporates a binary (threshold) effect on the conditional variance.

PGARCH: The power GARCH allows powers σ_t^d of the volatility, where d is a positive integer, to be modelled.

4 References

Bilder, C.R. (2002). Time Series Analysis. Course notes for Statistics 5053, Oklahoma State University, Stillwater, Oklahoma.

Bollerslev, T. (1986). "Generalized autoregressive conditional heteroskedasticity," Journal of Econometrics 31, 307-327.

Brocklebank, J.C. and Dickey, D.A. (2003). SAS for Forecasting Time Series, Second Edition. Cary, NC: SAS Institute.

Engle, R.F. (1982). "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflations," Econometrica 50, 987-1007.

Nelson, D.B. (1991). "Conditional heteroskedasticity in asset returns: A new approach," Econometrica 59, 347-370.

Shumway, R.H. and Stoffer, D.S. (2000). Time Series Analysis and Its Applications. New York: Springer-Verlag.

Tebbs, J.M. (2004). Statistical Theory. Course notes for Statistics 771, Kansas State University, Manhattan, Kansas.

Tsay, R.S. (1987). "Conditional heteroscedastic time series models," Journal of the American Statistical Association 82, 590-604.

Tsay, R.S. (2002). Analysis of Financial Time Series. New York: John Wiley and Sons.

Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S, Fourth Edition. New York: Springer-Verlag.

Zivot, E. and Wang, J. (2003). Modeling Financial Time Series with S-PLUS. Seattle, WA: Insightful.

© 2004 R. Adam Hoppes

ARMAX Models

• Vector (multivariate) regression: output vector Y_t = (y_{t,1}, y_{t,2}, …, y_{t,k})′ and input vector Z_t = (z_{t,1}, z_{t,2}, …, z_{t,r})′.

• Regression equation:
  y_{t,i} = β_{i,1} z_{t,1} + β_{i,2} z_{t,2} + ⋯ + β_{i,r} z_{t,r} + w_{t,i},
  or in vector form, Y_t = B Z_t + w_t.

• Here w_t is multivariate white noise: E(w_t) = 0 and cov(w_s, w_t) = Σ_w if s = t, 0 otherwise.

• Given observations for t = 1, 2, …, n, the least squares estimator of B (also the maximum likelihood estimator when w_t is Gaussian white noise) is
  B̂ = (Σ_{t=1}^{n} Y_t Z_t′)(Σ_{t=1}^{n} Z_t Z_t′)^{−1}.

• ML estimate of Σ_w (replace n with n − r for the unbiased version):
  Σ̂_w = (1/n) Σ_{t=1}^{n} (Y_t − B̂ Z_t)(Y_t − B̂ Z_t)′.
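As a sanity check, the estimators above can be run directly on simulated data. The following is a minimal NumPy sketch, not from the notes: the dimensions, the coefficient matrix `B_true`, and the noise scale are all invented for illustration.

```python
import numpy as np

# Verify B_hat = (sum_t Y_t Z_t')(sum_t Z_t Z_t')^{-1} on simulated data.
rng = np.random.default_rng(0)
n, k, r = 2000, 2, 3                     # sample size, output dim k, input dim r
B_true = np.array([[1.0, 0.5, -0.2],
                   [0.3, -1.0, 0.8]])    # k x r coefficient matrix (invented)
Z = rng.normal(size=(n, r))              # rows are the inputs z_t'
W = 0.1 * rng.normal(size=(n, k))        # multivariate white noise w_t
Y = Z @ B_true.T + W                     # Y_t = B Z_t + w_t, stacked in rows

# Least-squares (Gaussian ML) estimator of B
B_hat = (Y.T @ Z) @ np.linalg.inv(Z.T @ Z)

# ML estimate of Sigma_w; divide by n - r instead for the unbiased version
resid = Y - Z @ B_hat.T
Sigma_w_hat = resid.T @ resid / n
```

With n = 2000 and noise standard deviation 0.1, B̂ lands very close to the true B, and Σ̂_w is close to 0.01·I.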
• Information criteria. Akaike:
  AIC = ln|Σ̂_w| + 2(kr + k(k + 1)/2)/n;
  Schwarz:
  SIC = ln|Σ̂_w| + ln(n)(kr + k(k + 1)/2)/n;
  and a bias-corrected AIC, which replaces 2/n by a small-sample correction (the version printed in Shumway & Stoffer is incorrect).

Vector Autoregression

• E.g., VAR(1): x_t = c + Φ x_{t−1} + w_t.

• Here Φ is a k × k coefficient matrix and w_t is Gaussian multivariate white noise.

• This resembles the vector regression equation with
  Y_t = x_t,  B = (c, Φ),  Z_t = (1, x′_{t−1})′.

• Observe x_0, x_1, …, x_n and condition on x_0: the maximum conditional likelihood estimators of B and Σ_w are the same as for ordinary vector regression.

• VAR(p) is similar, but we must condition on the first p observations. The full likelihood = conditional likelihood × likelihood derived from the marginal distribution of the first p observations, and it is difficult to use.

Example: 1-year, 5-year, and 10-year weekly interest rates

• Data from http://research.stlouisfed.org/fred2/series/WGS1YR etc.:

  a <- read.csv("WGS1YR.csv");  WGS1YR  <- ts(a[, 2])
  a <- read.csv("WGS5YR.csv");  WGS5YR  <- ts(a[, 2])
  a <- read.csv("WGS10YR.csv"); WGS10YR <- ts(a[, 2])
  a <- cbind(WGS1YR, WGS5YR, WGS10YR)
  plot(a)
  plot(diff(a))

• Use the dse package to fit VAR(1) and VAR(2) models to the differences:

  library(dse)
  b  <- TSdata(output = diff(a))
  b1 <- estVARXls(b, max.lag = 1)
  cat("VAR(1)\n print method\n");   print(b1)
  cat("\n summary method\n");       print(summary(b1))
  b2 <- estVARXls(b, max.lag = 2)
  cat("\nVAR(2)\n print method\n"); print(b2)
  cat("\n summary method\n");       print(summary(b2))

• VAR(1) output: the print method reports the estimated operator Â(L), a 3 × 3 polynomial matrix of degree 1 in the lag operator L, with neg. log likelihood −718.8785. The summary method adds: sample length 2448; input dimension 0, output dimension 3; order A 1, order B 0; 9 actual parameters; nonzero constants, trend not estimated; RMSE 0.2005654, 0.1713752, 0.1563661 for the differenced WGS1YR, WGS5YR, and WGS10YR series.

• VAR(2) output: the print method reports Â(L), now a 3 × 3 polynomial matrix of degree 2 in L, with neg. log likelihood −741.4944.
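The conditional least-squares computation that estVARXls performs for a VAR(1) can be sketched from scratch: condition on x_0 and regress x_t on z_t = (1, x′_{t−1})′. A minimal NumPy sketch on simulated data; the constant `c`, coefficient matrix `Phi`, and sample size are invented, and this is not the interest-rate fit above.

```python
import numpy as np

# Simulate a stationary VAR(1), x_t = c + Phi x_{t-1} + w_t, then recover
# (c, Phi) by conditional least squares, as in the vector regression setup.
rng = np.random.default_rng(1)
k, n = 3, 5000
c = np.array([0.5, -0.2, 0.1])           # constant term (invented)
Phi = np.array([[0.5, 0.1, 0.0],
                [0.0, 0.4, 0.2],
                [0.1, 0.0, 0.3]])        # eigenvalues inside the unit circle
x = np.zeros((n, k))
for t in range(1, n):
    x[t] = c + Phi @ x[t - 1] + rng.normal(size=k)

# Condition on x_0 and regress x_t on z_t = (1, x_{t-1}')'
Z = np.column_stack([np.ones(n - 1), x[:-1]])    # (n-1) x (k+1)
Y = x[1:]
B_hat = (Y.T @ Z) @ np.linalg.inv(Z.T @ Z)       # B = (c, Phi)
c_hat, Phi_hat = B_hat[:, 0], B_hat[:, 1:]
```

With 5000 observations the estimates are close to the true values; the same regression with z_t = (1, x′_{t−1}, x′_{t−2})′ gives the VAR(2) fit.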
• VAR(2) summary method: sample length 2448; input dimension 0, output dimension 3; order A 2, order B 0; 18 actual parameters; nonzero constants, trend not estimated; RMSE 0.1910442, 0.1666275, 0.1534016.

• AIC is smaller (more negative) for VAR(2), but SIC is smaller for VAR(1).

• For VAR(1),

  Φ̂₁ =
    0.3288773    0.08581201   0.13693811
    0.06575108   0.1534516    0.08875425
    0.004959931  0.04152504   0.2406055

• The largest off-diagonal elements are the (1,3) and (2,3) entries, suggesting that changes in the 10-year rate are followed one week later by changes in the same direction in the 1-year and 5-year rates.

The Periodogram

• Recall the discrete Fourier transform
  d(ω_j) = n^{−1/2} Σ_{t=1}^{n} x_t e^{−2πi ω_j t},  j = 0, 1, …, n − 1,

• and the periodogram
  I(ω_j) = |d(ω_j)|²,  j = 0, 1, …, n − 1,

• where the ω_j = j/n are the Fourier frequencies.

Sine and Cosine Transforms

• For j = 0, 1, …, n − 1,
  d(ω_j) = n^{−1/2} Σ_{t=1}^{n} x_t e^{−2πi ω_j t}
         = n^{−1/2} Σ_{t=1}^{n} x_t [cos(2πω_j t) − i sin(2πω_j t)]
         = d_c(ω_j) − i d_s(ω_j),

• where d_c(ω_j) = n^{−1/2} Σ_{t=1}^{n} x_t cos(2πω_j t) and d_s(ω_j) = n^{−1/2} Σ_{t=1}^{n} x_t sin(2πω_j t) are the cosine transform and sine transform, respectively, of x_1, x_2, …, x_n.

Sampling Distributions

• For convenience, suppose that n is odd: n = 2m + 1.

• White noise: by the orthogonality properties of sines and cosines, d_c(ω_1), d_s(ω_1), d_c(ω_2), d_s(ω_2), …, d_c(ω_m), d_s(ω_m) have zero mean and variance σ_w²/2, and are uncorrelated.

• Gaussian white noise: they are furthermore independent, N(0, σ_w²/2).

• General stationary Gaussian process: d_c(ω_1), d_s(ω_1), …, d_c(ω_m), d_s(ω_m) have zero mean and are approximately uncorrelated, with
  var[d_c(ω_j)] ≈ var[d_s(ω_j)] ≈ f_x(ω_j)/2,
  where f_x(ω_j) is the spectral density function.

• If x_t is Gaussian,
  2 I_x(ω_j)/f_x(ω_j) = 2[d_c(ω_j)² + d_s(ω_j)²]/f_x(ω_j) ~ χ²₂ approximately,
  and I_x(ω_1), I_x(ω_2), …, I_x(ω_m) are approximately independent.

Spectral ANOVA

• For odd n = 2m + 1, the inverse transform can be written
  x_t = x̄ + (2/√n) Σ_{j=1}^{m} [d_c(ω_j) cos(2πω_j t) + d_s(ω_j) sin(2πω_j t)].

• Square and sum over t; the orthogonality of sines and cosines implies that
  Σ_{t=1}^{n} (x_t − x̄)² = 2 Σ_{j=1}^{m} [d_c(ω_j)² + d_s(ω_j)²] = 2 Σ_{j=1}^{m} I(ω_j).

• ANOVA table:

  Source   df       SS                MS
  ω_1      2        2 I(ω_1)          I(ω_1)
  ω_2      2        2 I(ω_2)          I(ω_2)
  ⋮        ⋮        ⋮                 ⋮
  ω_m      2        2 I(ω_m)          I(ω_m)
  Total    n − 1    Σ_t (x_t − x̄)²
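The decomposition Σ_t (x_t − x̄)² = 2 Σ_{j=1}^m I(ω_j) can be verified numerically. A small sketch using the FFT; the simulated series and seed are arbitrary.

```python
import numpy as np

# Numerical check of the spectral ANOVA identity for odd n = 2m + 1:
# the total (mean-corrected) sum of squares equals twice the sum of the
# periodogram over the Fourier frequencies j = 1, ..., m.
rng = np.random.default_rng(2)
n = 201                                  # odd, so n = 2m + 1 with m = 100
m = (n - 1) // 2
x = rng.normal(size=n) + np.sin(2 * np.pi * 0.1 * np.arange(n))

d = np.fft.fft(x) / np.sqrt(n)           # d(omega_j), omega_j = j/n
I = np.abs(d) ** 2                       # periodogram I(omega_j)
ss_total = np.sum((x - x.mean()) ** 2)
ss_freq = 2 * np.sum(I[1 : m + 1])       # 2 * sum_{j=1}^m I(omega_j)
```

The two sums agree to floating-point precision; this is Parseval's relation combined with the symmetry I(ω_{n−j}) = I(ω_j) and the fact that I(ω_0) = n x̄².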
Hypothesis Testing

• Consider the model
  x_t = A cos(2πω_j t) + w_t.

• Hypotheses:
  H₀: A = 0 ⇒ x_t = w_t, white noise;
  H₁: A > 0, white noise plus a sine wave.

• Note: no autocorrelation in either case.

• Two cases:
  – ω_j known: use
    F_j = (m − 1) I(ω_j) / Σ_{k≠j} I(ω_k),
    which is F_{2, 2m−2} under H₀.
  – ω_j unknown: use max(F_1, F_2, …, F_m), or equivalently max_j I(ω_j), j = 1, 2, …, m.

• With Ī = m^{−1} Σ_j I(ω_j),
  P(max_j I(ω_j)/Ī > g) ≈ 1 − [1 − exp(−g)]^m ≈ m exp(−g).

Example: the Southern Oscillation Index

• Using SAS: proc spectra (program and output).

• Using R:

  par(mfcol = c(2, 1))
  # Use fft to calculate the periodogram directly; note that frequencies
  # are expressed in cycles per year, and the periodogram values are
  # similarly scaled by 12
  freq <- 12 * (0:(length(soi) - 1)) / length(soi)
  plotit <- freq > 0 & freq < 6
  soifft <- fft(soi) / sqrt(length(soi))
  plot(freq[plotit], Mod(soifft[plotit])^2 / 12, type = "l")
  # Use spectrum; override some defaults to make it match
  spectrum(soi, log = "no", fast = FALSE, taper = 0, detrend = FALSE)
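A parallel sketch in Python, for readers not using R: the periodogram is computed the same way via the FFT, with frequency in cycles per year for monthly data. The series here is simulated (an annual cycle plus noise) as a stand-in, since the SOI data itself is not included.

```python
import numpy as np

# Scaled periodogram for a monthly series, frequency in cycles per year,
# mirroring the R snippet above. The signal and seed are invented.
rng = np.random.default_rng(3)
n = 480                                   # 40 years of monthly data
t = np.arange(n)
x = np.cos(2 * np.pi * t / 12) + 0.5 * rng.normal(size=n)

freq = 12 * np.arange(n) / n              # frequency in cycles per year
P = np.abs(np.fft.fft(x) / np.sqrt(n)) ** 2 / 12   # scaled periodogram
keep = (freq > 0) & (freq < 6)            # plot range used in the notes

peak = freq[keep][np.argmax(P[keep])]     # should sit at the annual cycle
```

The periodogram spikes at 1 cycle per year, the frequency of the injected cosine; for the real SOI series the same computation shows the annual cycle plus a lower-frequency El Niño component.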
