EC 821: Time Series Econometrics
EC821 Time Series Econometrics, Spring 2003: Notes, Section 4

Nonstationary univariate time series

Let us now consider the definition of an integrated series. A series which is stationary after being differenced once is said to be integrated of order one, denoted I(1). In general, a series which is stationary after being differenced d times is said to be integrated of order d, denoted I(d). A series which is stationary without differencing is said to be I(0). This definition assumes that d is an integer; it is possible to extend the definition to the case of fractional values of d. Note that this does not imply that all nonstationary series are I(1), or more generally I(d) with d ≥ 1. It may not be possible to difference a nonstationary series to stationarity (see Leybourne et al., JBES, 1996). Also, we will find that a series with fractional d, 0.5 ≤ d < 1, is nonstationary even though the value of d falls short of unity.

A series which is I(1) is said to have a unit root, and a series which is I(d), d a positive integer, is said to have d unit roots. This may be seen for I(1) by writing the AR(1) model in lag operator notation:

(1 − φ₁L) y_t = μ + ε_t

When φ₁ = 1, the root of the polynomial is unity, and the notion that y is I(1) corresponds to the presence of the unit root in its AR representation. In a higher-order model there may be multiple unit roots; for instance, in the AR(2) model

(1 − φ₁L − φ₂L²) y_t = μ + ε_t

the lag polynomial may be factored into (1 − a₁L)(1 − a₂L), where a₁ + a₂ = φ₁ and a₁a₂ = −φ₂, and the roots of the quadratic polynomial will be the inverses of the a coefficients. If φ₁ = 2 and φ₂ = −1, there will be two unit roots in the AR representation, since the polynomial factors into (1 − L)(1 − L), and the series must be differenced twice to yield a stationary process. An AR(2) model could also generate an I(1) series: e.g., if φ₁ = 1.25 and φ₂ = −0.25, the polynomial factors into (1 − L)(1 − 0.25L), and the once-differenced process will be stationary. For the general AR(p) representation, if one unit root is present, the AR polynomial can be factored into (1 − L) and a (p−1)th-order polynomial with stable roots.
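The factorizations just described are easy to check numerically. A minimal sketch (plain Python; the helper names and tolerance are my own choices) finds the roots of the AR(2) polynomial 1 − φ₁z − φ₂z² and counts those on the unit circle:

```python
import math

def ar2_roots(phi1, phi2):
    """Real roots z of the AR(2) lag polynomial 1 - phi1*z - phi2*z**2 = 0.

    Assumes phi2 != 0 (a genuine second-order polynomial) and real roots.
    """
    # Rewrite as -phi2*z**2 - phi1*z + 1 = 0 and apply the quadratic formula.
    a, b, c = -phi2, -phi1, 1.0
    disc = b * b - 4 * a * c
    assert disc >= 0, "complex roots: stationary cyclical case"
    r1 = (-b + math.sqrt(disc)) / (2 * a)
    r2 = (-b - math.sqrt(disc)) / (2 * a)
    return sorted([r1, r2])

def n_unit_roots(phi1, phi2, tol=1e-9):
    """Number of roots lying (numerically) on the unit circle."""
    return sum(abs(abs(r) - 1.0) < tol for r in ar2_roots(phi1, phi2))
```

For φ₁ = 1.25, φ₂ = −0.25 this gives roots 1 and 4 (one unit root plus a stable root, so one differencing suffices); for φ₁ = 2, φ₂ = −1 both roots are unity, matching the text's examples.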
It could be the case that there are multiple unit roots in the AR(p). This gives rise to the ARIMA(p, d, q) model: a process which has a standard ARMA(p, q) representation after being differenced d times to achieve stationarity.

Difference stationary vs. trend stationary series

A trend stationary series is one in which shocks have transitory effects. The series will not be covariance stationary, but its second moments may satisfy the conditions for covariance stationarity (CS); it only fails to be CS due to its time-varying mean. If the variation in the mean can be adequately explained by a linear, polynomial, or logarithmic trend, then the detrended series is CS, and shocks to the series will be of a transitory nature. In contrast, a series possessing one or more unit roots will only be stationary after differencing, and shocks to its level will have a permanent effect on the series. It may also possess a trend: that is, it may be a random walk with drift.

The simplest trend stationary model is

y_t = μ + βt + ε_t

where ε_t is assumed to be white noise; y is clearly trend stationary without differencing, since its stochastic properties are entirely determined by those of ε_t. Adding an AR(1) component, the model becomes

y_t = μ + βt + φ₁y_{t−1} + ε_t

This model includes several interesting special cases. For instance, when β = 0, μ ≠ 0 and φ₁ = 1, the model is that of a random walk with drift, and is I(1). If, on the other hand, μ = 0 as well, then a pure random walk model results. Alternatively, if β ≠ 0, μ ≠ 0 and |φ₁| < 1, the model is of a deterministic trend with a stationary AR(1) component: that is, a trend stationary series. In the latter case, how may we distinguish this model from that of the random walk with drift? The only difference between them lies in the parameter φ₁ and our ability to distinguish that estimated parameter from unity. The two series may mimic each other: that is, for a finite sample of data, the characterization of the time series as trend stationary may be equally plausible to its identification as a unit root process. But the distinction is crucial, in that the appropriate transformation to apply to the series will depend upon our ability to distinguish the two models.
If a series is DS, for instance, then detrending the series (which in reality contains a unit root) will not remove its random walk properties; at best, it will merely change a random walk with drift into a pure random walk. Treating the resulting series as if it were now covariance stationary will be misleading. On the other hand, if a series is TS, then the appropriate transformation to CS is detrending, and differencing the series is not warranted; since differencing may be considered an approximation to applying the filter (1 − φ₁L), that approximation will be worse the farther φ₁ is from unity. Therefore it is essential that we have a methodology to distinguish the TS from the DS series on empirical grounds.

We should also note that considering linear models of time series with either trend or unit root is not restrictive. Many economic and financial time series, for instance that for gross domestic product (GDP), appear to be better characterised by an exponential trend (constant percentage growth) than by a linear trend (constant growth in the level). For this reason, many models of economic and financial time series are applied to the logarithms of the original series. If we consider y = log(x), then a trend stationary model of y is actually a constant-percentage-growth model of the underlying x. Likewise, making use of the approximation that, for small changes, the first difference of the logarithm of a variable is approximately the same as its percentage change, we often use Δlog(x) to construct a percentage change series. If y = log(x) possesses a unit root, so that we should be taking its first difference Δy, we are then arguing that the percentage changes of the underlying series x are stationary.
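The log-difference approximation invoked above is easy to verify directly. A small sketch (plain Python; the function names and sample figures are mine) compares Δlog(x) with the exact percentage change for a small and a large movement:

```python
import math

def pct_change(x_prev, x_now):
    """Exact proportional change (x_t - x_{t-1}) / x_{t-1}."""
    return (x_now - x_prev) / x_prev

def log_diff(x_prev, x_now):
    """First difference of the logarithm: log(x_t) - log(x_{t-1})."""
    return math.log(x_now) - math.log(x_prev)

# For small changes the two nearly coincide: a 2% rise
small = (pct_change(100.0, 102.0), log_diff(100.0, 102.0))
# For large changes they diverge noticeably: a 50% rise
large = (pct_change(100.0, 150.0), log_diff(100.0, 150.0))
```

For the 2% move, Δlog differs from the exact change by about 0.0002; for the 50% move the gap is roughly 0.09, which is why the approximation is reserved for small changes.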
How do trend stationary and unit root representations of a time series differ in practice? For a trend stationary series, an optimal τ-period-ahead forecast may be constructed by merely adding βτ to the series' current value. For the unit root process

Δy_t = δ + θ(L) ε_t

the series will be expected to change by δ units per period, so that the accumulated drift after τ periods will just be δτ. The difference in these forecasts, if repeated period by period, lies in the intercept terms. For the trend stationary process, the intercept α and slope β define a forecast trajectory to which forecasts will revert. For the unit root process, the intercept will change with each period's shock to the series: the forecast adds the drift δ to whatever value the y process has attained. In terms of interval estimates, the forecast errors for a trend stationary process will converge to a fixed value as τ increases, while the forecast errors from a unit root process increase without bound, given that the variance of the unit root process increases linearly with the forecast horizon. To develop this methodology, we now consider testing for a unit root.

EC821 Time Series Econometrics, Spring 2003: Notes, Section 2

Stationary and nonstationary random variables

Many economic time series are not plausibly characterized as processes with a constant mean: expenditure, income and price series typically display a tendency to increase over time. Such series are described as nonstationary. To discuss the concept of stationarity, we must consider not only the first two moments of the series (the mean and variance) but also the autocovariance function (or autocorrelation function) of a single series, and the cross-covariance (or cross-correlation) function of a pair of time series. Recall that the covariance of X_t and Y_t is defined as

Cov(X_t, Y_t) = E[(X_t − μ_X)(Y_t − μ_Y)]

This definition is only valid for time series with constant means. In that context we may also define Cov(X_t, Y_{t−τ}) as the cross-covariance of those series with a τ-period lag: that is, we are considering the covariance between X at time t and Y τ periods prior (or τ periods hence, since the covariance function is symmetric). Special cases of these definitions occur when X and Y are the same series.
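The cross-covariance just defined can be computed directly from data. A small sketch (plain Python; the function name is mine, and constant means are assumed, as the definition requires):

```python
def cross_covariance(x, y, tau):
    """Sample Cov(X_t, Y_{t-tau}) for tau >= 0: X at time t against
    Y tau periods prior, using deviations from the sample means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((x[t] - mx) * (y[t - tau] - my) for t in range(tau, n)) / n
```

With tau = 0 this reduces to the ordinary contemporaneous sample covariance.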
We then consider the autocovariance function γ_k of the series X, whose zero-order element is merely the variance of the series, with the other elements referring to the covariance between the series' values at one point in time versus another point in time. For stationary series (those with a constant mean and variance), both of these concepts may be transformed from covariances into correlations with appropriate scaling. You may recall that the simple correlation coefficient between X and Y is merely the covariance scaled by the product of the standard deviations of the two variables. This implies that the autocorrelation function will have unity as its zero-order term (being the variance of X scaled by the square of the standard deviation of X: that is, the variance), while the elements of the γ_k sequence scaled by the variance give the autocorrelations of the series, usually denoted ρ_k. Like the autocovariance function, the autocorrelation function is symmetric, with the kth autocorrelation reflecting the relation between X_t and X_{t−k} (or X_{t+k}).

With these building blocks in hand, we may consider the concept of covariance stationarity, or second-order stationarity. For a stochastic process to be covariance stationary, three conditions must be satisfied: the process must have a constant mean μ, a constant variance σ², and its autocovariance function must not be a function of time. That is,

Cov(X_t, X_s) = Cov(X_{t+j}, X_{s+j})

so that translating the calculation of the autocovariance function along the time axis does not affect its value: the process measured at two different points in time, e.g. t and s, has an autocovariance depending only on the temporal displacement k = t − s. Thus we may speak of the kth-order autocovariance γ_k without further reference to time. We may note that covariance stationarity is itself a weak form of strict stationarity, which would require that the entire distribution of the stochastic process is independent of the measure of time. For a random variable distributed according to the Normal distribution, covariance stationarity implies strict stationarity, since its distribution only depends on first and second moments.
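The scaling from autocovariances to autocorrelations described above can be written out directly. A small sketch (plain Python; function names mine, constant-mean series assumed):

```python
def autocovariance(x, k):
    """Sample autocovariance gamma_k at lag k >= 0."""
    n = len(x)
    m = sum(x) / n
    return sum((x[t] - m) * (x[t - k] - m) for t in range(k, n)) / n

def autocorrelation(x, k):
    # gamma_k scaled by gamma_0 (the variance), so rho_0 = 1 by construction
    return autocovariance(x, k) / autocovariance(x, 0)
```

By construction the zero-order autocorrelation is exactly one, mirroring the unity zero-order term noted in the text.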
In general, strict stationarity is a more restrictive condition, and as it is difficult to test, covariance stationarity will often suffice in applied work.

We may easily determine that a process such as a random walk, or a random walk with drift, cannot be covariance stationary. Such a process might be

X_t = μ + X_{t−1} + ε_t

where ε_t is a zero-mean random process with a constant variance σ² and independent increments (and thus zero autocovariances). We may rewrite the process as

X_t = tμ + X_0 + Σ_{s=1..t} ε_s

where X_0 is the fixed initial condition. The expectation of X_j, for j = 1, ..., τ, will be X_0 + μ, X_0 + 2μ, ..., X_0 + τμ; thus the process has a continuously changing mean, as given by the drift. Likewise, the variance of the process over 1, 2, ..., τ observations will be σ², 2σ², ..., τσ²: that is, the variance increases linearly with time. This process is clearly nonstationary, as it fails the first two conditions defining a covariance stationary process.

Definitions

A stochastic process is a sequence of random variables X_i, i = 1, 2, .... If the index is taken as representing time, then the stochastic process is a time series. The fundamental problem in time series analysis is that we observe the realization of the stochastic process only once. There are annual data, for example, on the U.S. inflation rate for 1946-1995: 50 real values. But this is only one possible outcome of the underlying stochastic process for the inflation rate over that period. If we could have observed history many times over, we could assemble many samples, each containing a possibly different string of 50 real numbers, and take their average for each year. This would be the ensemble mean of the series: the average across the states of nature at any given calendar time. In reality, we can only observe one such history. If the distribution for the inflation rate remains unchanged (essentially the concept of stationarity), then the particular sequence, or time series, that we observe can be considered as 50 different draws from the same distribution.
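The changing mean and linearly growing variance of the random walk with drift can be checked against an ensemble of simulated histories, in the spirit of the ensemble mean just described. A sketch (plain Python; seed, path count and parameter values are arbitrary choices of mine):

```python
import random
import statistics

random.seed(7)
MU, SIGMA, X0, T, PATHS = 0.5, 1.0, 10.0, 30, 5000

# Simulate many independent histories of X_t = mu + X_{t-1} + eps_t
# and record the terminal value of each path.
finals = []
for _ in range(PATHS):
    x = X0
    for _ in range(T):
        x += MU + random.gauss(0.0, SIGMA)
    finals.append(x)

ens_mean = statistics.fmean(finals)     # theory: X0 + T*mu = 25
ens_var = statistics.pvariance(finals)  # theory: T*sigma^2 = 30
```

Across the ensemble, the mean drifts to X_0 + Tμ and the variance grows to Tσ², illustrating why the process fails both the constant-mean and constant-variance conditions.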
And if the process is not too persistent—if it possesses the property of ergodicity—then each element of the sequence will bear some information, and the time average over the elements of the single realization we have will be consistent for the infeasible ensemble mean. A stationary process is ergodic if it is asymptotically independent: that is, if two random variables positioned far apart in the sequence are almost independently distributed. This is true, for instance, for the AR(1) process

z_t = c + ρz_{t−1} + ε_t

with ε a white noise process and |ρ| < 1.

Most aggregate time series, such as GDP, are not stationary because they exhibit time trends. In some cases, financial series are argued to be nonstationary on the grounds of nonconstant variances. Many time series with a trend can be reduced to stationary processes: for instance, a series from which a linear trend has been removed, rendering it stationary, is said to be trend stationary (TS). Alternatively, a series may be differenced; if the series is nonstationary but its difference is stationary, then the process is said to be difference stationary (DS). As we shall see, much of the concern over unit roots in economics and finance is related to this distinction between TS and DS processes.

A stochastic process is said to be white noise if it satisfies three properties: (a) E[X_t] = 0 for all t; (b) the variance of X_t is constant and time-independent, so that the process is said to be homoskedastic; and (c) all autocovariances γ_k, |k| ≥ 1, equal zero. This process is covariance stationary and ergodic, but note that not all covariance stationary processes are white noise. A closely related concept: independently and identically distributed random variables, often labelled i.i.d. The elements of a stochastic process are said to be i.i.d. if they possess three properties: (a) E[X_t] = μ, a constant (not necessarily equal to zero) for all t; (b) the variance of X_t is constant and time-independent, σ²_X, for all t; and (c) X_t is distributed independently of X_s for all t ≠ s.
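The ergodicity property described above—the time average over one long realization converging to the ensemble mean—can be illustrated for the AR(1) process, whose unconditional mean is c/(1 − ρ). A sketch (plain Python; seed, burn-in and parameters are arbitrary choices of mine):

```python
import random

random.seed(1)
C, RHO, N, BURN = 1.0, 0.8, 200_000, 1_000

# One long realization of the ergodic AR(1): z_t = c + rho*z_{t-1} + e_t
z, total = 0.0, 0.0
for t in range(N + BURN):
    z = C + RHO * z + random.gauss(0.0, 1.0)
    if t >= BURN:            # discard the burn-in so the start value is forgotten
        total += z

time_avg = total / N         # should approach c/(1 - rho) = 5
```

A single long sample path suffices here precisely because the AR(1) with |ρ| < 1 is ergodic; for a nonstationary process such as the random walk, the time average would not settle down.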
The latter is a stronger condition than the equivalent condition in the definition of white noise, since independence implies zero autocorrelation but not vice versa. However, if we add the assumption that X is distributed Normal, an assumption of zero autocorrelations is sufficient to imply independence. An i.i.d. sequence is stationary.

A random walk process X_t = X_{t−1} + ε_t combines a white noise error sequence with the level of X. Since ΔX_t = ε_t, the first difference of X is a white noise process, and stationary; it should be clear that the level process X_t is not stationary: its conditional mean wanders with every shock, its variance grows without bound as T → ∞, and its autocorrelations are nonzero and die out very slowly.

A stochastic process is a martingale if it satisfies

E[X_{t+1} | Ω_t] = X_t

where Ω_t is the information set containing, at minimum, the past history of the process. Since that conditional expectation evaluates to X_t for the random walk process, it is a martingale, given that E[ε_{t+1} | Ω_t] = 0 for the white noise innovation. The difference ε_{t+1} = X_{t+1} − X_t is a martingale difference sequence, which generally only requires uncorrelated, rather than the more strictly independent, increments.

An example of a martingale difference process often used in analysing asset returns is Engle's autoregressive conditional heteroskedastic (ARCH) process. A process is said to be an ARCH process of order 1, or ARCH(1), if it can be written as

y_t = √(c + αy²_{t−1}) · ε_t

where ε_t is i.i.d. with zero mean and unit variance. Since the increments to the process are the i.i.d. elements of ε, E[y_t] conditioned on its own past history is zero, since that conditional expectation involves conditioning ε upon the past history of y. Likewise, E[y²_t] conditioned on the past history of the y process is merely c + αy²_{t−1}. So the conditional second moment of the process—which is the conditional variance, since the conditional mean is zero—is a function of the history of the process. The y process is strictly stationary and ergodic if |α| < 1.
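An ARCH(1) process is straightforward to simulate; for |α| < 1 its unconditional variance is c/(1 − α), which a long simulation can check, and its draws have (unconditional) mean zero. A sketch (plain Python; seed, sample size and parameters are arbitrary choices of mine):

```python
import math
import random
import statistics

random.seed(3)
C, ALPHA, N = 1.0, 0.5, 200_000

# ARCH(1): y_t = sqrt(c + alpha * y_{t-1}^2) * eps_t,  eps_t ~ iid N(0, 1)
y_prev, ys = 0.0, []
for _ in range(N):
    y = math.sqrt(C + ALPHA * y_prev * y_prev) * random.gauss(0.0, 1.0)
    ys.append(y)
    y_prev = y

uncond_var = statistics.pvariance(ys)   # should approach c/(1 - alpha) = 2
```

Plotting such a series would show the volatility clustering the text describes: quiet stretches punctuated by bursts of large absolute values, even though the returns themselves are serially uncorrelated.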
If y_t is stationary, then the unconditional second moment may be readily calculated as E[y²_t] = c/(1 − α). As we shall see, if α > 0 this model captures a characteristic of asset returns known as volatility clustering, in which large values (in absolute terms) are followed by large values.

EC821 Time Series Econometrics, Spring 2003: Notes, Section 10, Part 3

Christopher F. Baum
Department of Economics, Boston College
baum@bc.edu
January 8, 2003

1. Annotated bibliography of papers studying long-range dependence

The following list includes only those papers which I have authored or coauthored on the subject. Beyond the somewhat dated extensive bibliography in Baillie (1996), a search of the EconLit database for "long memory", "fractional integration" and "ARFIMA" will turn up many more recently published papers using these methods in economics and finance. All BC Economics Working Papers are accessible in Acrobat (PDF) format from the department homepage or my personal homepage on the department's website.

- Boston College Economics Working Paper (BC WP) 492: John Barkoulas (University of Tennessee) and Christopher F. Baum, "Dynamics of intra-EMS interest rate linkages", rev. 02/2001.

Abstract: A number of previous studies have questioned the dominant role of Germany within the EMS. These conclusions are often based on empirical findings that interest rates of member countries of the EMS are not affected by German interest rates, even in the long run. In this study we establish evidence to the contrary by demonstrating that intra-EMS interest rate differentials vis-à-vis Germany exhibit mean-reverting behavior characterized by long-memory dynamics. Fractional error correction model estimates suggest the presence of short-run intra-EMS monetary policy interdependencies, but they validate the German Dominance Hypothesis in the long run.

- BC WP 472: Basma Bekdache (Wayne State University) and Christopher F. Baum, "A re-evaluation of empirical tests of the Fisher hypothesis", 09/2000.

Abstract: This paper shows that the recent literature that tests for a long-run Fisher relationship using cointegration analysis is seriously flawed.
Cointegration analysis assumes that the variables in question are I(1), or I(d) with the same d. Using monthly postwar U.S. data from 1959-1997, we show that this is not the case for nominal interest rates and inflation. While we cannot reject the hypothesis that nominal interest rates have a unit root, we find that inflation is a long-memory process. A direct test for the equality of the fractional differencing parameter for both series decisively rejects the hypothesis that the series share the same order of integration.

- BC WP 396: John T. Barkoulas (University of Tennessee), Christopher F. Baum and Atreya Chakraborty (Charles River Associates), "Waves and Persistence in Merger and Acquisition Activity", rev. 12/1999; published, Economics Letters, 2001, 70, 237-243.

Abstract: Does merger and acquisition (M&A) activity occur in waves; that is, are there oscillations between low and high levels of M&A activity? The answer to this question is important in developing univariate as well as structural models of explaining and forecasting the stochastic behavior of M&A activity. There is evidence to suggest that aggregate U.S. time series data on M&A activity exhibit wave behavior, which has been modeled by fitting either a two-state Markov switching-regime model or a sine-wave model to the data. This study provides an alternative characterization of the temporal patterns in M&A as a nonlinear process with strongly persistent, or long-memory, dynamics. The apparent level changes, or partial cycles of differing magnitudes, in the aggregate M&A time series are consistent with an underlying data-generating process exhibiting long memory. Time- and frequency-domain estimation methods are applied to a long M&A time series constructed by Town (1992), covering approximately a century of merger activity in the U.S. economy. We find significant evidence of long-term cyclical behavior, nonperiodic in nature, in the M&A time series, even after accounting for potential shifts in the mean level of the series.
A shock to M&A activity exhibits significant persistence, as it is damped at a very slow hyperbolic rate, but it eventually dissipates. We provide both theoretical and empirical rationales for the presence of fractional dynamics with long-memory features in M&A activity. Theoretically, long-term dependence may be due to persistent differences in firm valuation between stockholders and nonstockholders following an economic disturbance, as suggested by Gort (1969). Empirically, long-memory dynamics in M&A activity may reflect the statistical properties of fundamental factors underlying its behavior, as several of the proposed determinants of M&A activity have been shown to exhibit strong persistence.

- BC WP 380: Christopher F. Baum, John T. Barkoulas and Mustafa Caglayan (University of Liverpool), "Long memory or structural breaks: Can either explain nonstationary real exchange rates under the current float?", rev. 01/99; published, Journal of International Financial Markets, Institutions and Money, 1999, 9, 359-376.

Abstract: This paper considers two potential rationales for the apparent absence of mean reversion in real exchange rates in the post-Bretton Woods era. We allow for (i) fractional integration and (ii) a double mean shift in the real exchange rate process. These methods, applied to CPI-based rates for 17 countries and WPI-based rates for 12 countries, demonstrate that the unit root hypothesis is robust against both fractional alternatives and structural breaks. This evidence suggests rejection of the doctrine of absolute long-run purchasing power parity during the post-Bretton Woods era.

- BC WP 377: John T. Barkoulas, Christopher F. Baum, Mustafa Caglayan and Atreya Chakraborty, "Persistent Dependence in Foreign Exchange Rates: A Reexamination", rev. 04/2000; forthcoming 2002 in Global Financial Markets: Issues and Policies.

Abstract: We test for stochastic long-memory behavior in the returns series of currency rates for eighteen industrial countries using a semiparametric fractional estimation method.
A sensitivity analysis is also carried out to analyze the temporal stability of the long-memory parameter. Contrary to the findings of some previous studies alluding to the presence of long memory in major currency rates, our evidence provides wide support for the martingale model, and therefore for foreign exchange market efficiency, for our broader sample of foreign currency rates. Any inference of long-range dependence is fragile, especially for the major currency rates. However, long-memory dynamics are found in a small number of secondary (nonmajor) currency rates.

- BC WP 361: John Barkoulas and Christopher F. Baum, "Long Memory and Forecasting in Euroyen Deposit Rates", 2/97; published, Financial Engineering and the Japanese Markets, 1997, 4(2), 189-201.

Abstract: We test for long memory in 3- and 6-month daily returns series on Eurocurrency deposits denominated in Japanese yen (Euroyen). The fractional differencing parameter is estimated using the spectral regression method. The conflicting evidence obtained from the application of tests against a unit root, as well as tests against stationarity, provides the motivation for testing for fractional roots. Significant evidence of positive long-range dependence is found in the Euroyen returns series. The estimated fractional models result in dramatic out-of-sample forecasting improvements over longer horizons compared to benchmark linear models, thus providing strong evidence against the martingale model.

- BC WP 356: John T. Barkoulas, Christopher F. Baum and Nickolaos Travlos, "Long Memory in the Greek Stock Market", 12/96; published, Applied Financial Economics, 2000, 10(2), 177-184.

Abstract: We test for stochastic long memory in the Greek stock market, an emerging capital market. The fractional differencing parameter is estimated using the spectral regression method. Contrary to findings for major capital markets, significant and robust evidence of positive long-term persistence is found in the Greek stock market.
As compared to benchmark linear models, the estimated fractional models provide improved out-of-sample forecasting accuracy for the Greek stock returns series over longer forecasting horizons.

- BC WP 349: John T. Barkoulas, Christopher F. Baum and Gurkan S. Oguz (Tufts University), "Stochastic Long Memory in Traded Goods Prices", 10/96; published, Applied Economics Letters, 1998, 5(2), 135-138.

Abstract: Using spectral regression and exact maximum likelihood methods, we test for long-memory dynamics in the traded goods prices of the G7 countries, as measured in their import and export price indices. Significant and robust evidence of fractional dynamics with long-memory features is found in both import and export price inflation rates.

- BC WP 334: John Barkoulas and Christopher F. Baum, "Fractional Dynamics in Japanese Financial Time Series", rev. 7/97; published, Pacific-Basin Finance Journal, 6(1-2), 115-124.

Abstract: Using the spectral regression and Gaussian semiparametric methods of estimating the long-memory parameter, we test for fractional dynamic behavior in a number of important Japanese financial time series: spot exchange rates, forward exchange rates, stock prices, currency forward premia, Euroyen deposit rates and the Euroyen term premium. Stochastic long memory is established as a feature of the currency forward premia, Euroyen deposit rate and Euroyen term premium series. The martingale model cannot be rejected for the spot, forward and stock price series.

- BC WP 333: Christopher F. Baum, John Barkoulas and Mustafa Caglayan, "Persistence in International Inflation Rates", rev. 04/98; published, Southern Economic Journal, 65(4), 1999, 900-913.

Abstract: We test for fractional dynamics in CPI-based inflation rates for twenty-seven countries and WPI-based inflation rates for twenty-two countries. The fractional differencing parameter is estimated using semiparametric and approximate maximum likelihood methods. Significant evidence of fractional dynamics with long-memory features is found in both CPI- and WPI-based inflation rates, for industrial as well as developing countries.
Implications of the findings are considered, and sources of long memory are hypothesized.

- BC WP 321: John Barkoulas, Christopher F. Baum and Mustafa Caglayan, "Fractional Monetary Dynamics", revised 01/98; published, Applied Economics, 1999, 31, 1393-1400.

Abstract: We test for fractional dynamics in U.S. monetary series, their various formulations and components, and velocity series. Using the spectral regression method, we find evidence of a fractional exponent in the differencing process of the monetary series (both simple-sum and Divisia indices) and in their components—with the exception of demand deposits, savings deposits, overnight repurchase agreements and term repurchase agreements—and in the monetary base and money multipliers. No evidence of fractional behavior is found in the velocity series. Granger's (1980) aggregation hypothesis is evaluated, and implications of the presence of fractional monetary dynamics are drawn.

- BC WP 317: John Barkoulas and Christopher F. Baum, "Fractional Differencing Modeling and Forecasting of Eurocurrency Deposit Rates", rev. 10/96; published, Journal of Financial Research, Fall 1997, 20(3), 355-372.

Abstract: We investigate the low-frequency properties of three- and six-month rates for Eurocurrency deposits denominated in eight major currencies, with specific emphasis on fractional dynamics. Using the fractional integration testing procedure suggested by Geweke and Porter-Hudak (1983), we find that several of the Eurocurrency deposit rates are fractionally integrated processes with long memory. These findings have important implications for econometric modeling, forecasting and cointegration testing of Eurocurrency rates.

- BC WP 315: John Barkoulas, Christopher F. Baum and Gurkan S. Oguz, "Fractional Cointegration Analysis of Long-Term International Interest Rates", rev. 10/96; published, International Journal of Finance, 1997, 9(2), 586-606.

Abstract: DeGennaro, Kunkel and Lee (1994) studied the long-run dynamics of a system of long-term interest rates of five industrialized countries by means of sophisticated cointegration methods. They found little evidence in support of the cointegration hypothesis, thus concluding that a separate set of fundamentals drives the dynamics of each of the individual long-term interest rate series. In this study we extend their analysis by exploring the possibility of very slow mean-reverting dynamics (fractional cointegration) in the system of the five long-term interest rates. We use the GPH test as our testing methodology for fractional integration and cointegration. Through rigorous investigation of the full system of the five long-term interest rate series and its various subsystems, we provide evidence that the error correction term follows a fractionally integrated process with long memory: that is, it is mean reverting, though not covariance stationary. Despite significant persistence in the short run, a shock to the system of long-term interest rates eventually dissipates, so that an equilibrium relationship prevails in the long run.

- BC WP 314: John Barkoulas and Christopher F. Baum, "Long-Term Dependence in Stock Returns", 4/96; published, Economics Letters, 1996, 53(3), 253-259.

Abstract: This paper investigates the presence of fractal dynamics in stock returns. We improve upon the existing literature in two ways: (i) instead of rescaled-range analysis, we use the more efficient semi-nonparametric procedure suggested by Geweke and Porter-Hudak (GPH, 1983); and (ii) to ensure robustness, we apply the GPH test to a variety of aggregate and sectoral stock indices and individual companies' stock returns series, at both daily and monthly frequencies. Our results indicate that fractal structure is not exhibited by stock indices, but it may characterize the behavior of some individual stock returns series.

References

[1] Baillie, R. (1996), "Long Memory Processes and Fractional Integration in Econometrics", Journal of Econometrics, 73, 5-59.

EC821 Time Series Econometrics, Spring 2003: Notes, Section 3
ARMA models

We have spoken of a univariate white noise process ε_t: a zero-mean, covariance stationary process with no serial correlation:

E[ε_t] = 0,  E[ε²_t] = σ² < ∞,  E[ε_t ε_{t−j}] = 0 for all j ≠ 0

A very important class of covariance stationary processes, called linear processes, can be created by taking a moving average of a white noise process. Let us consider that the white noise process is defined for all integers t = 0, ±1, ±2, ..., so that we may assume that the process started in the distant past and its mean and autocovariance function have stabilized to time-invariant constants.

The simplest linear process that exhibits serial correlation is a finite-order moving average process. y_t is said to be a qth-order moving average process, MA(q), if it can be written as a weighted average of the current and most recent q values of a white noise process:

y_t = μ + θ₀ε_t + θ₁ε_{t−1} + ⋯ + θ_q ε_{t−q},  θ₀ = 1    (1)

A moving average process is covariance stationary, with mean μ. It is easy to show that the jth-order autocovariance is

γ_j = σ² Σ_{k=0..q−j} θ_{j+k} θ_k  for j = 0, 1, ..., q

and zero for j > q, with σ² denoting the variance of the ε white noise process. Thus if q = 1, the zero-order autocovariance is γ₀ = (1 + θ₁²)σ², the first-order autocovariance is γ₁ = θ₁σ², and the second- and all higher-order autocovariances are zero. The entire autocovariance function is described by q + 1 parameters: θ₁, ..., θ_q and σ². The autocorrelation function depends only upon the q parameters in the θ sequence.

We may also consider an MA(∞) process, in which the effects of past shocks do not abruptly dissipate after q periods. Thus we might consider that the finite set of terms in (1) could be replaced by an infinite set of terms, Σ_{j=0..∞} ψ_j ε_{t−j}. For this to make sense, it must be the case that this sum of an infinite set of random variables is well defined: that is, that the partial sum Σ_{j=0..n} ψ_j ε_{t−j} converges to a finite random variable as n → ∞. A condition under which this convergence will occur is that of absolute summability: Σ_{j=0..∞} |ψ_j| < ∞. Intuitively, this requires that the effects of past shocks, represented by the terms ψ_j ε_{t−j}, eventually die away.
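The MA(q) autocovariance formula above can be coded directly. A small sketch (plain Python; the function name and convention of passing θ₀ = 1 explicitly are mine):

```python
def ma_autocov(theta, sigma2, j):
    """Autocovariance gamma_j of an MA(q) process with coefficients
    theta = [theta_0, theta_1, ..., theta_q] (theta_0 = 1) and innovation
    variance sigma2: gamma_j = sigma2 * sum_{k=0}^{q-j} theta_{j+k}*theta_k,
    and zero for j > q."""
    q = len(theta) - 1
    if j > q:
        return 0.0
    return sigma2 * sum(theta[j + k] * theta[k] for k in range(q - j + 1))
```

For an MA(1) with θ₁ = 0.5 and σ² = 2, this reproduces the text's special case: γ₀ = (1 + θ₁²)σ² = 2.5, γ₁ = θ₁σ² = 1, and γ_j = 0 beyond the first lag.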
If the sequence ψ_j is absolutely summable, then the infinite-order MA process, MA(∞), for y_t converges in mean square, and the process is covariance stationary with mean μ and autocovariances γ_j = σ² Σ_{k=0}^{∞} ψ_k ψ_{k+j}. The autocovariances will themselves be absolutely summable. If the ε process is i.i.d., then the y process is strictly stationary and ergodic.

We may now define a linear filter. Let x_t be a covariance stationary process and h_j a sequence of real numbers that is absolutely summable. Then the infinite sum y_t = Σ_{j=0}^{∞} h_j x_{t−j} converges in mean square, such that the y process is covariance stationary. If the autocovariances of the x process are absolutely summable, then so are the autocovariances of the y process. The operation of taking a weighted average of (possibly infinitely many) successive values of the x process is called filtering, and we can write it using lag operator notation, with a filter being represented by a lag polynomial α(L) = α₀ + α₁L + α₂L² + …. The object y_t = α(L)x_t is a well-defined random variable forming a covariance stationary process if the sequence α_j is absolutely summable and the input process x_t is covariance stationary.

Properties of linear filters. For a given sequence of real numbers α₀, α₁, …, define a filter as the lag polynomial α(L) = α₀ + α₁L + α₂L² + …, which may be applied to an input process x_t to yield α(L)x_t = Σ_{j=0}^{∞} α_j x_{t−j}. The filter could be finite, such that α_j = 0 for j > p, which then defines a pth-order lag polynomial. When applied to an input process, this finite filter creates a weighted average of the current and p most recent values of the process. If the input process is covariance stationary and the sequence α_j is absolutely summable, the filtered series will be a covariance stationary process.

Let α_j and β_j be two arbitrary sequences of real numbers, and define the convolution of these sequences, δ_j, as

δ₀ = α₀β₀
δ₁ = α₀β₁ + α₁β₀
δ₂ = α₀β₂ + α₁β₁ + α₂β₀
…
δ_j = Σ_{k=0}^{j} α_k β_{j−k}.    (2)

Product: the convolution may be viewed as the product of two filters, δ(L) = α(L)β(L), which may be
computed with the same techniques as the product of two ordinary polynomials, e.g. (1 + α₁L)(1 + β₁L) = 1 + (α₁ + β₁)L + α₁β₁L². If each of these sequences is absolutely summable and the input series is covariance stationary, then α(L)[β(L)x_t] = δ(L)x_t is a well-defined random variable, also covariance stationary, and the sequence δ_j will also be absolutely summable.

Inverse: if δ(L) = 1, so that α(L)β(L) = 1, we may say that β(L) is the inverse of α(L), denoted α(L)^{−1}. As long as α₀ ≠ 0, the inverse of α(L) may be defined for any arbitrary sequence α_j by successively solving the equations in (2). For instance, (1 − L)^{−1} = 1 + L + L² + L³ + …, which is not absolutely summable.

To work with ARMA processes, we must calculate the inverse of a finite-order, pth-degree lag polynomial Φ(L) = 1 − φ₁L − φ₂L² − … − φ_pL^p. Since Φ's leading coefficient is 1 ≠ 0, the inverse polynomial Ψ(L) = Φ(L)^{−1} may be defined. We may determine the coefficients of the inverse polynomial using the convolution formula in (2): beyond the first p coefficients, the coefficients of Ψ(L) may be solved recursively. The coefficients in the sequence ψ_j will eventually decline if the stability condition on the Φ(L) polynomial (relating to its eigenvalues) holds, and those coefficients will then be absolutely summable.

As an example, consider the lag polynomial 1 − φL. The root of the associated polynomial 1 − φz = 0 is φ^{−1}, and the stability condition requires that |φ^{−1}| > 1, or alternatively that |φ| < 1. The inverse of the filter, given stability, is 1 + φL + φ²L² + φ³L³ + …, an infinite sequence, but one which is bounded and absolutely summable.

The AR(1) process. A first-order autoregressive process, AR(1), satisfies the stochastic difference equation

y_t = c + φ y_{t−1} + ε_t,  i.e.  (1 − φL)y_t = c + ε_t,

where the ε process is white noise. For φ ≠ 1 we may rewrite this equation as

(1 − φL)(y_t − μ) = ε_t,  μ = c/(1 − φ).

If y_t is covariance stationary, μ is the mean of the process. We seek a covariance stationary solution to this stochastic difference equation. If |φ| < 1, we may use the inverse given above: applying (1 − φL)^{−1} to both sides gives

y_t = μ + (1 − φL)^{−1} ε_t = μ + Σ_{j=0}^{∞} φ^j ε_{t−j}.
In this case y_t has the moving average representation shown here: an infinite-order moving average process, or MA(∞). The stability condition that allows us to form the inverse is known as the stationarity condition for the autoregressive lag polynomial, which states that shocks to the system will be damped.

What about the case where that condition is violated and |φ| > 1? Then there is a forward solution, in which the unique covariance stationary process y_t is an infinite-order moving average of future values of ε,

y_t = μ − Σ_{j=1}^{∞} φ^{−j} ε_{t+j},

and this infinite sum is well defined, since the sequence φ^{−j} is absolutely summable when |φ| > 1.

Last, let us consider the case where there is no covariance stationary solution, either in terms of past values of ε or of future values of ε: the case of a unit root, where φ = 1. In this case the j-period change in y becomes

y_t − y_{t−j} = c·j + ε_t + ε_{t−1} + … + ε_{t−j+1},

which we call a random walk with drift.

The AR(p) process. All that we have derived here for the AR(1) process may equally well be expressed in terms of the pth-order autoregressive process, which satisfies the pth-order stochastic difference equation Φ(L)y_t = c + ε_t, with Φ(L) = 1 − φ₁L − φ₂L² − … − φ_pL^p. If the process is covariance stationary, then y_t has a mean of μ = c·Φ(1)^{−1} and an MA(∞) representation y_t = μ + Ψ(L)ε_t, with Ψ(L) = ψ₀ + ψ₁L + ψ₂L² + …, where Ψ(L) = Φ(L)^{−1}.

The ARMA(p, q) process. We may combine finite-order AR(p) and MA(q) processes to yield an ARMA(p, q) process:

y_t = c + φ₁y_{t−1} + φ₂y_{t−2} + … + φ_p y_{t−p} + ε_t + θ₁ε_{t−1} + θ₂ε_{t−2} + … + θ_q ε_{t−q},

or Φ(L)y_t = c + Θ(L)ε_t. If Φ(1) ≠ 0, we may set μ = c·Φ(1)^{−1} and write the model in the deviation-from-mean form Φ(L)(y_t − μ) = Θ(L)ε_t, which is still a pth-order stochastic difference equation, but with a serially correlated forcing process Θ(L)ε_t rather than the white noise process ε_t. If the equation satisfies the stationarity conditions, then its unique covariance stationary solution has the MA(∞) representation y_t = μ + Ψ(L)ε_t, Ψ(L) = Φ(L)^{−1}Θ(L). The mean of the ARMA(p, q) process is again c·Φ(1)^{−1}. If there are common roots in the two polynomials, they could be factorized out to represent the ARMA(p, q) process as a lower-order process, ARMA(p − 1, q − 1).
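The MA(∞) representation of a stationary AR(1) can be verified by truncation: the weights are ψ_j = φ^j, and σ² Σ ψ_j² should approach the familiar AR(1) variance σ²/(1 − φ²). A sketch, assuming |φ| < 1 (names are mine):

```python
def ar1_psi(phi, n):
    """First n MA(infinity) weights of a stationary AR(1): psi_j = phi**j."""
    return [phi ** j for j in range(n)]

phi, sigma2 = 0.8, 1.0
psi = ar1_psi(phi, 200)
# truncated variance sigma^2 * sum(psi_j^2) vs the exact sigma^2 / (1 - phi^2)
var_trunc = sigma2 * sum(p * p for p in psi)
var_exact = sigma2 / (1 - phi ** 2)
```

With 200 terms and φ = 0.8 the truncation error is negligible, illustrating why absolute summability makes the infinite sum well behaved.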
We generally assume that there are no common roots in the AR and MA polynomials. If the Θ(L) polynomial satisfies the stability condition, then the ARMA process is said to be invertible, and that stability condition is often termed the invertibility condition. If an ARMA process is invertible, then we may also express it in an infinite-order AR representation:

Θ(L)^{−1}Φ(L) y_t = c·Θ(1)^{−1} + ε_t.

Note that a process can be invertible yet not be stationary, and vice versa; both of its lag polynomials have to satisfy their respective stability conditions if the process is to be termed stationary and invertible. If the process is both stationary and invertible, then the finite-order ARMA(p, q) process possesses both an AR(∞) and an MA(∞) representation.

The autocovariance generating function. The entire set of autocovariances of a covariance stationary process y_t may be summarized by the autocovariance generating function

g_y(z) = Σ_{j=−∞}^{∞} γ_j z^j,

with z a complex scalar. Note that for |z| = 1 (the unit circle) this implies that

g_y(1) = Σ_{j=−∞}^{∞} γ_j = γ₀ + 2 Σ_{j=1}^{∞} γ_j,

which will represent a convergent sum, given that the autocovariances are absolutely summable. If we transform this function g_y by dividing by 2π and setting z to a complex number on the unit circle, z = e^{−iω} = cos ω − i sin ω, we have defined the spectral density function, or power spectrum, of y_t:

s_y(ω) = (2π)^{−1} g_y(e^{−iω}) = (2π)^{−1} g_y(cos ω − i sin ω),

where ω represents the frequency (the inverse of the period) of the cyclical component in y. The time series may be represented in the frequency domain as the sum of an infinite number of sines and cosines, or in the time domain as a process with an infinite number of autocovariances. The two representations are equivalent, and we can represent y in the frequency domain as the Fourier transform of the time-series data. We will discuss the frequency-domain representation at greater length when we discuss fractionally integrated processes; see Hamilton, Chapter 6.

Extension to
vector processes. A vector white noise process ε_t is a jointly covariance stationary process satisfying

E[ε_t] = 0,  E[ε_t ε_t′] = Ω (positive definite),  E[ε_t ε_{t−j}′] = 0 for j ≠ 0.

Since Ω need not be diagonal, there may be contemporaneous covariance among the elements of ε_t. Perfect correlation is ruled out, since Ω must be positive definite. A vector MA(∞) process may be expressed as

y_t = μ + Σ_{j=0}^{∞} Ψ_j ε_{t−j}.

This implies that if Γ_j is the jth-order autocovariance matrix, E[(y_t − μ)(y_{t−j} − μ)′], the autocovariance function may be written as

Γ_j = Σ_{k=0}^{∞} Ψ_{j+k} Ω Ψ_k′.

Multivariate filters may also be written as H(L) = H₀ + H₁L + H₂L² + …, with vector y_t = H(L)x_t. Products of filters may now be computed with linear algebra: for instance, the product of two filters A(L) and B(L) can be written in terms of sums and products of their coefficient matrices. We may also consider the stability conditions in terms of lag matrix polynomials. For instance, if we have Φ(L) = I − Φ₁L − Φ₂L² − … − Φ_pL^p, where the Φ_j are r × r matrices, we may write the stability condition for this polynomial as

|I_r − Φ₁z − Φ₂z² − … − Φ_p z^p| = 0,

all of the roots of which must lie outside the unit circle. For instance, if we consider a two-variable form of a first-order lag polynomial, Φ(L) = I₂ − Φ₁L, with coefficient matrix

Φ₁ = ( φ₁₁ φ₁₂ ; φ₂₁ φ₂₂ ),

the determinantal equation becomes

1 − (φ₁₁ + φ₂₂) z + (φ₁₁φ₂₂ − φ₁₂φ₂₁) z² = 0.

The pth-order vector autoregressive process, VAR(p), will be the solution to the vector stochastic difference equation

Φ(L)(y_t − μ) = ε_t,  μ = Φ(1)^{−1} c.

The VAR(p) is itself a special case of a multivariate ARMA model, the VARMA(p, q), where each of the variables in the system is considered to follow an ARMA process of order (p, q). The autocovariance generating function for a vector covariance stationary process y_t may be written as

G_y(z) = Σ_{j=−∞}^{∞} Γ_j z^j = Γ₀ + Σ_{j=1}^{∞} (Γ_j z^j + Γ_j′ z^{−j}),

which leads to the definition of the spectrum of the vector process as

s_y(ω) = (2π)^{−1} G_y(e^{−iω}).

Special cases of this for vector processes are:

VWN: G_y(z) = Ω
VMA(∞): G_y(z) = Ψ(z) Ω Ψ(z^{−1})′
VAR(p): G_y(z) = Φ(z)^{−1} Ω [Φ(z^{−1})′]^{−1},

which thus allows us to define the autocovariances of these processes at all lags.

Estimating
autoregressions, VARs, and ARMA processes. Pure AR models may be consistently estimated with OLS, under the assumption that the error process is i.i.d. How should the order of an AR process be chosen? A sensible criterion is the general-to-specific sequential t rule, which starts by estimating a model with p_max lags, where that parameter is selected so as to overfit the process. If the highest-order lag term is significant at some prespecified level, then p_max should be increased. Otherwise the autoregression is reestimated, dropping the highest-order lag, until the highest-order lag is significant. Since the t test is consistent, this rule will never underfit the model; it is biased toward overfitting the model, and thus may be criticized for its lack of parsimony. It should be noted that all of the models compared should be fit over a common sample, to prevent the inclusion of additional data points from modifying the judgement.

A second approach is to apply the Akaike information criterion (AIC) or the Schwarz information criterion (SIC), also known as the Bayesian information criterion (BIC). In general terms, these criteria seek the model which minimizes

log(SSR_j / n) + (j + 1) C_n / n,

where n is the sample size, SSR_j is the sum of squared residuals (the least squares criterion) for the jth-order autoregression, and C_n is set to 2 for the AIC and log(n) for the SIC (BIC). Either of the information criteria strikes a balance between a better fit, in the first term, and the parsimony of the model, in the second term. Ng and Perron suggest, as above, the use of a fixed sample period to compare models with these criteria. Just as with the general-to-specific rule, the AIC has a positive probability of overfitting the model. In contrast, the BIC is a consistent estimator of j.

VARs may equally easily be estimated by single-equation OLS. Since the right-hand side of each equation of the VAR contains the same set of regressors, there is no gain in applying a system estimator, conditional on i.i.d. errors. Likewise, there is a multivariate generalization of the AIC and BIC that may be applied to search for the appropriate lag length of the VAR, presuming that the same number of lags is used in each equation.
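The information criterion above is simple to apply once each candidate AR(j) has been fit over a common sample. A sketch in Python (the SSR values below are made up purely for illustration):

```python
from math import log

def info_crit(ssr, j, n, bic=False):
    """log(SSR_j / n) + (j + 1) * C_n / n, with C_n = 2 (AIC) or log(n) (BIC)."""
    c_n = log(n) if bic else 2.0
    return log(ssr / n) + (j + 1) * c_n / n

# hypothetical SSRs from AR(1)..AR(4), each fit over the same n = 100 observations
ssr = {1: 52.0, 2: 47.5, 3: 47.1, 4: 47.0}
n = 100
best_aic = min(ssr, key=lambda j: info_crit(ssr[j], j, n))
best_bic = min(ssr, key=lambda j: info_crit(ssr[j], j, n, bic=True))
```

With these illustrative SSRs, the fit improvement from lag 2 to lags 3 and 4 is too small to justify the extra parameters, so both criteria settle on the AR(2); with a larger penalty C_n = log(n), the BIC is generally the more parsimonious of the two.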
The estimation of ARMA(p, q) models is more challenging, in that the model may be written as

y_t = z_t′β + η_t,  η_t = ε_t + θ₁ε_{t−1} + … + θ_q ε_{t−q},
z_t = (1, y_{t−1}, y_{t−2}, …, y_{t−p})′,  β = (c, φ₁, φ₂, …, φ_p)′,

which poses several problems. The errors are serially correlated, since they are MA(q); and since the lagged dependent variables included in z_t are correlated, by construction, with the lags of ε included in the error term, the regressors z_t are not orthogonal to the error term η_t. The second problem could be solved by using suitably lagged values of the dependent variable, at lags higher than q, as instruments. But the efficient estimation of ARMA(p, q) processes requires a different modelling strategy, relying on the maximum likelihood principle to estimate Box-Jenkins models.

For instance, consider the MA(1) process. If we condition on initial values for the ε's, this becomes straightforward. The model is

y_t = μ + ε_t + θε_{t−1},  ε_t ~ N(0, σ²).

Let ϑ = (μ, θ, σ²)′ denote the population parameters of interest. If ε_{t−1} were known with certainty, then y_t | ε_{t−1} ~ N(μ + θε_{t−1}, σ²), and we could write

f(y_t | ε_{t−1}; ϑ) = (2πσ²)^{−1/2} exp[ −(y_t − μ − θε_{t−1})² / (2σ²) ].

If we knew for certain that ε₀ = 0, we would know the value of ε₁ with certainty as well, and could in turn calculate the value of ε₂ conditional on ε₁, and so on. The entire sequence of ε values can then be recursively calculated from ε_t = y_t − μ − θε_{t−1}, given knowledge of μ and θ. Thus the conditional log-likelihood may be written as

log L(ϑ) = −(T/2) log 2π − (T/2) log σ² − Σ_{t=1}^{T} ε_t² / (2σ²)

for a particular value of ϑ. Numerical optimization may then be used to vary the elements of ϑ and search for a maximum of the log-likelihood function. No analytical solution exists, even for the MA(1), but the solution technique should be reliable. This approach conditions the estimates on the specification that ε₀ = 0; other approaches, which may be found in more sophisticated software, may use other techniques for generating the initial conditions for optimization.
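The recursion for the ε's and the conditional log-likelihood can be coded directly. A minimal sketch for the MA(1), imposing ε₀ = 0 as in the text (the function name is mine); in practice this function would be handed to a numerical optimizer over (μ, θ, σ²):

```python
from math import log, pi

def ma1_condloglik(y, mu, theta, sigma2):
    """Conditional log-likelihood of y_t = mu + e_t + theta*e_{t-1},
    e_t ~ N(0, sigma2), conditioning on e_0 = 0.
    Residuals are built recursively: e_t = y_t - mu - theta*e_{t-1}."""
    e_prev, ll = 0.0, 0.0
    for yt in y:
        e = yt - mu - theta * e_prev
        ll += -0.5 * log(2 * pi * sigma2) - e * e / (2 * sigma2)
        e_prev = e
    return ll

ll = ma1_condloglik([1.0, -0.5, 0.2], mu=0.0, theta=0.5, sigma2=1.0)
```

As the notes caution, candidate values with |θ| > 1 make the imposed ε₀ = 0 accumulate rather than die out, so such solutions should not be trusted.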
If θ is substantially less than unity in absolute value, the effect of imposing ε₀ = 0 will quickly die out, and the conditional likelihood will be a good approximation to the unconditional likelihood for a reasonably large sample size. In contrast, if |θ| > 1, the consequences of imposing ε₀ = 0 accumulate over time. If numerical optimization yields a value in this region, the results must be discarded; an appropriate starting value for renewed optimization in that case would be the inverse, θ^{−1}.

More generally, an MA(q) model will require the specification of q starting values for presample elements of the ε vector. Just as above, we may generate conditional likelihood estimates on the assumption that ε₀ = ε_{−1} = … = ε_{−q+1} = 0. The conditional log-likelihood is only useful if all values of z for which 1 + θ₁z + θ₂z² + … + θ_q z^q = 0 lie outside the unit circle.

Numerical optimization to estimate the parameters of a general ARMA(p, q) process requires two sets of assumed starting values: the q presample values of the ε sequence and p presample values of the y process. Of course, one may have the prior values of y: that is, you may start the estimation at least p periods after the start of the data. Box and Jenkins, in their classic text, recommended setting the presample ε's to zero but setting the presample y's to their actual values. The same caution applies with respect to stability of the resulting estimates: if the MA polynomial does not satisfy the stability condition (i.e., invertibility), the estimates should not be trusted.

EC821 Time Series Econometrics, Spring 2003: Notes, Section 11

ARCH: Modelling volatility

Consider the AR(p) model

y_t = c + Σ_{i=1}^{p} φ_i y_{t−i} + u_t,

with u_t i.i.d. Covariance stationarity requires that the AR polynomial have roots outside the unit circle. The optimal linear forecast of the level of y_t, given covariance stationarity, is

E[y_t | y_{t−1}, y_{t−2}, …] = c + Σ_{i=1}^{p} φ_i y_{t−i},

so that the conditional mean changes as the process evolves. However, given covariance stationarity, the unconditional mean is constant: E[y_t] = c (1 − φ₁ − … − φ_p)^{−1}.
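The distinction between the conditional and unconditional mean can be illustrated in a few lines (function names are mine):

```python
def ar_forecast(c, phi, y_hist):
    """One-step conditional-mean forecast of an AR(p):
    E[y_t | y_{t-1}, ...] = c + phi_1 y_{t-1} + ... + phi_p y_{t-p}.
    y_hist holds the most recent observations, newest first."""
    return c + sum(p * y for p, y in zip(phi, y_hist))

def ar_uncond_mean(c, phi):
    """Unconditional mean c / (1 - phi_1 - ... - phi_p), assuming stationarity."""
    return c / (1 - sum(phi))

f = ar_forecast(1.0, [0.5, 0.2], [2.0, 1.0])   # 1 + 0.5*2 + 0.2*1 = 2.2
m = ar_uncond_mean(1.0, [0.5, 0.2])            # 1 / 0.3, about 3.33
```

The forecast moves with the most recent observations, while the unconditional mean is a fixed number; the next section asks the analogous question for the variance.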
What if we wanted to forecast the variance of the series, rather than its mean? We consider u_t as a process with a fixed unconditional variance σ², but the conditional variance of y_t could change over time. If it followed a systematic pattern, we might have something like

u_t² = ζ + Σ_{i=1}^{m} α_i u_{t−i}² + w_t,

where w_t is an i.i.d. error process. This law of motion implies that

E[u_t² | u_{t−1}², u_{t−2}², …] = ζ + Σ_{i=1}^{m} α_i u_{t−i}²,

which is then the mth-order model of Autoregressive Conditional Heteroskedasticity, ARCH(m), as proposed by Engle (1982). This conditional expectation must be non-negative for all realizations of the u_t process; a necessary condition is that ζ > 0 and α_i ≥ 0 for all i. For u_t itself to be covariance stationary, we further require that all the roots of the α polynomial lie outside the unit circle; if the α_i are all non-negative, this condition may be written as Σ_i α_i < 1.

An alternative way to write such a model is in the form u_t = √h_t · v_t, where v_t is distributed with mean zero and unit variance. If we then write

h_t = ζ + Σ_{i=1}^{m} α_i u_{t−i}²,

the conditional expectation of u_t² gives us the same expression in terms of ζ and the α terms.

This ARCH model may then be used to augment a regression equation, as a way of modelling the conditional variance of that equation's errors. The presence of ARCH effects, detected for instance by the Lagrange multiplier test for ARCH (archlm in Stata), does not invalidate the use of OLS to estimate the equation; but if there are systematic movements in the conditional variance, we might want to be able to model them jointly with the level of the series. The mean equation and the ARCH equation for the conditional variance may be jointly estimated in a maximum likelihood context. Various solutions have been proposed to deal with the non-negativity constraints, which are quite difficult to impose in an ML estimation procedure. Non-Gaussian distributions may also be used to cope with the stylized fact of excess kurtosis in asset returns; it may be desirable to allow for this in the model.
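The representation u_t = √h_t·v_t gives a direct way to simulate an ARCH(m) process. A sketch in Python (the parameter values are arbitrary illustrations; presample shocks are set to zero):

```python
import random

def simulate_arch(zeta, alphas, n, seed=42):
    """Simulate u_t = sqrt(h_t) * v_t with h_t = zeta + sum_i alpha_i u_{t-i}^2,
    v_t iid N(0,1). Requires zeta > 0, alpha_i >= 0 and sum(alpha_i) < 1."""
    assert zeta > 0 and all(a >= 0 for a in alphas) and sum(alphas) < 1
    rng = random.Random(seed)
    m = len(alphas)
    u = [0.0] * m                       # presample shocks set to zero
    for _ in range(n):
        h = zeta + sum(a * u[-1 - i] ** 2 for i, a in enumerate(alphas))
        u.append(h ** 0.5 * rng.gauss(0.0, 1.0))
    return u[m:]

# for ARCH(1), the unconditional variance is zeta / (1 - alpha_1) = 0.2/0.5 = 0.4
u = simulate_arch(0.2, [0.5], 20000)
var_hat = sum(x * x for x in u) / len(u)   # should be near 0.4
```

The simulated u_t are serially uncorrelated, but their squares are autocorrelated: exactly the volatility-clustering pattern ARCH is designed to capture.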
ARCH models have often been fit using a t distribution, where an additional parameter, the degrees of freedom, is estimated in the process. An even more general solution relies upon the Generalized Error Distribution (GED), which encompasses both the normal and t distributions as special cases, allowing for both excess and less-than-normal kurtosis.

The GARCH model of Bollerslev. The primary extension of the ARCH(m) methodology of Engle is the Generalized ARCH, or GARCH(r, m), model of Bollerslev (1986). The extension from ARCH to GARCH considers

h_t = ζ + π(L) u_t²,

where π(L) is an infinite-order lag polynomial. Under appropriate conditions, we can rewrite this as a rational lag in two finite-order polynomials in the lag operator,

π(L) = α(L) / (1 − δ(L)),

where the roots of 1 − δ(L) are outside the unit circle. This gives rise to the GARCH(r, m) model

h_t = κ + Σ_{i=1}^{r} δ_i h_{t−i} + Σ_{j=1}^{m} α_j u_{t−j}²,

where κ = (1 − Σ_i δ_i) ζ. This is an ARMA(p, r) process for the squared errors, where the jth AR coefficient is (δ_j + α_j) and the jth MA coefficient is (−δ_j), with p = max(r, m). The non-negativity requirement is that all parameters in this process are non-negative, with κ > 0. The process is covariance stationary if Σ_i δ_i + Σ_j α_j < 1.

Just as a low-order ARMA(p, q) process will often fit as well as a high-order AR(p), a low-order GARCH(r, m) will often suffice to capture the dynamics of the conditional variance as well as an ARCH(m) for large m. The ability to specify a more parsimonious model, especially given the non-negativity constraints on the maximum likelihood problem, is attractive.

An interesting special case is that of IGARCH, or integrated GARCH, where Σ_i δ_i + Σ_j α_j = 1, or cannot be distinguished from 1. This causes the unconditional variance of u_t to be infinite, so that neither u_t nor u_t² is covariance stationary. The issue is essentially that of a unit root in the ARMA process for u_t², and it is often encountered in practice.

The GARCH-in-mean model. A very useful variation on the GARCH model is GARCH-in-mean, a specification where the conditional variance itself enters the mean equation. For assets, we might expect higher return and higher risk to be positively correlated, and thus a positive ARCH-in-mean term would be expected.
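Given data and parameters, the GARCH(1,1) special case, h_t = κ + δ₁h_{t−1} + α₁u²_{t−1}, is a one-line recursion. A sketch (names are mine; starting h at the unconditional variance is a common, but not the only, choice):

```python
def garch11_variance(u, kappa, alpha, delta):
    """Conditional variances h_t = kappa + delta*h_{t-1} + alpha*u_{t-1}^2.
    Covariance stationarity requires alpha + delta < 1; the unconditional
    variance kappa / (1 - alpha - delta) is used to initialize h."""
    assert alpha + delta < 1
    h = [kappa / (1 - alpha - delta)]
    for t in range(1, len(u)):
        h.append(kappa + delta * h[-1] + alpha * u[t - 1] ** 2)
    return h

# h[0] = 0.1/(1 - 0.9) = 1; a large shock (u = -2) raises next period's variance
h = garch11_variance([1.0, -2.0, 0.5], kappa=0.1, alpha=0.1, delta=0.8)
```

As α + δ approaches 1 the initialization blows up, which is the IGARCH boundary described above: the unconditional variance ceases to exist.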
A similar rationale would apply if we confront the stylized fact that countries with higher levels of inflation are often observed to have higher variances of the inflation process. An example of this model is provided by Engle et al. (1987).

Alternative GARCH specifications. A huge literature on alternative GARCH specifications exists; many of these models are preprogrammed in Stata's arch command, and references for their analytical derivation are given in the Stata manual. One of particular interest is Nelson's (1991) exponential GARCH, or EGARCH. He proposed

log h_t = ζ + Σ_{j=1}^{∞} π_j ( |v_{t−j}| − E|v_{t−j}| + θ v_{t−j} ),

which is then parameterized as a rational lag of two finite-order polynomials, just as in Bollerslev's GARCH. Advantages of the EGARCH specification include the positive nature of h_t regardless of the estimated parameters, and the asymmetric nature of the impact of innovations: with θ ≠ 0, a positive shock will have a different effect on volatility than will a negative shock, mirroring findings in equity market research about the impact of "bad news" and "good news" on market volatility. Nelson's model is only one of several extensions of GARCH that allow for asymmetry or consider nonlinearities in the process generating the conditional variance: for instance, the threshold ARCH model of Zakoian (1990) and the Glosten et al. model (1993).

The ARCH and GARCH models have also been extended in a multivariate context, although considering more than two variables is quite difficult, as the number of parameters to be estimated grows very rapidly. Useful surveys of the literature, although now somewhat dated, are provided by Bollerslev et al. (1992, 1994).

References

Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307-327.

Bollerslev, T., Chou, R. and K. Kroner, 1992. ARCH modeling in finance. Journal of Econometrics 52, 5-59.

Bollerslev, T., Engle, R. and D. Nelson, 1994. ARCH models. In Handbook of
Econometrics, Vol. IV, ed. R. Engle and D. McFadden. Elsevier.

Engle, R., 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 50, 987-1008.

Engle, R., Lilien, D. and R. Robins, 1987. Estimating time varying risk premia in the term structure: the ARCH-M model. Econometrica 55, 391-407.

Glosten, L., Jagannathan, R. and D. Runkle, 1993. On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779-1801.

Zakoian, J., 1990. Threshold heteroskedastic models. Unpublished manuscript, CREST, INSEE.

EC821 Time Series Econometrics, Spring 2003: Notes, Section 10, Part 2
Christopher F. Baum, Department of Economics, Boston College, baum@bc.edu
January 8, 2003

1 Fractionally integrated timeseries and ARFIMA modelling¹

The model of an autoregressive fractionally integrated moving average process of a timeseries of order (p, d, q), denoted by ARFIMA(p, d, q), with mean μ, may be written using operator notation as

Φ(L)(1 − L)^d (y_t − μ) = Θ(L)ε_t,  ε_t ~ i.i.d.(0, σ_ε²),    (1.1)

where L is the backward-shift operator, Φ(L) = 1 − φ₁L − … − φ_pL^p, Θ(L) = 1 + θ₁L + … + θ_qL^q, and (1 − L)^d is the fractional differencing operator defined by

(1 − L)^d = Σ_{k=0}^{∞} Γ(k − d) L^k / [ Γ(−d) Γ(k + 1) ],    (1.2)

with Γ(·) denoting the gamma (generalized factorial) function. The parameter d is allowed to assume any real value. The arbitrary restriction of d to integer values gives rise to the standard autoregressive integrated moving average (ARIMA) model. The stochastic process y_t is both stationary and invertible if all roots of Φ(L) and Θ(L) lie outside the unit circle and |d| < 0.5. The process is nonstationary for d ≥ 0.5, as it possesses infinite variance; see Granger and Joyeux (1980).

Assuming that d ∈ (0, 0.5), Hosking (1981) showed that the autocorrelation function ρ(·) of an ARFIMA process is proportional to k^{2d−1} as k → ∞. Consequently, the autocorrelations of the ARFIMA process decay hyperbolically to zero as k → ∞, in contrast to the faster, geometric decay of a stationary ARMA process. For d ∈ (0, 0.5), Σ_{j=−n}^{n} |ρ_j| diverges as n → ∞, and the ARFIMA process is said to exhibit long memory, or long-range positive dependence.

¹This presentation of ARFIMA modelling draws heavily from Baum and Wiggins (2000).
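The weights in the expansion (1.2) need not be computed from gamma functions directly: writing (1 − L)^d = Σ_k π_k L^k, the coefficient ratio Γ(k − d)/Γ(k − 1 − d) = (k − 1 − d) gives the recursion π₀ = 1, π_k = π_{k−1}(k − 1 − d)/k. A sketch (the function name is mine):

```python
def frac_diff_weights(d, n_terms):
    """Coefficients pi_k of (1 - L)^d = sum_k pi_k L^k, via the recursion
    pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k."""
    w = [1.0]
    for k in range(1, n_terms):
        w.append(w[-1] * (k - 1 - d) / k)
    return w

w_d1 = frac_diff_weights(1.0, 4)    # integer d = 1 recovers (1 - L): 1, -1, 0, 0
w_frac = frac_diff_weights(0.4, 5)  # fractional d: slowly decaying weights
```

For fractional d the weights never truncate; their hyperbolic decay is the time-domain counterpart of the slowly decaying autocorrelations described above.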
The process is said to exhibit intermediate memory (anti-persistence), or long-range negative dependence, for d ∈ (−0.5, 0). The process exhibits short memory for d = 0, corresponding to stationary and invertible ARMA modeling. For d ∈ [0.5, 1) the process is mean reverting, even though it is not covariance stationary, as there is no long-run impact of an innovation on future values of the process.

If a series exhibits long memory, it is neither stationary, I(0), nor a unit root, I(1), process; it is an I(d) process, with d a real number. A series exhibiting long memory, or persistence, has an autocorrelation function that damps hyperbolically, more slowly than the geometric damping exhibited by short-memory ARMA processes. Thus it may be predictable at long horizons. An excellent survey of long-memory models, which originated in hydrology and have been widely applied in economics and finance, is given by Baillie (1996).

1.1 Approaches to estimation of the ARFIMA model

There are two approaches to the estimation of an ARFIMA(p, d, q) model: exact maximum likelihood estimation, as proposed by Sowell (1992), and semiparametric approaches. Sowell's approach requires specification of the p and q values and estimation of the full ARFIMA model conditional on those choices. This involves all the attendant difficulties of choosing an appropriate ARMA specification, as well as a formidable computational task for each combination of p and q to be evaluated. We first describe semiparametric methods, in which we assume that the short memory or ARMA components of the timeseries are relatively unimportant, so that the long memory parameter d may be estimated without fully specifying the data generating process.

1.2 Semiparametric estimators for I(d) series

1.2.1 The Lo Modified Rescaled Range estimator²

lomodrs performs Lo's (1991) modified rescaled range (R/S, "range over standard deviation") test for long-range dependence of a time series. The
classical R/S statistic, devised by Hurst (1951) and Mandelbrot (1972), is the range of the partial sums of deviations of a timeseries from its mean, rescaled by its standard deviation. For a sample of n values {x₁, x₂, …, x_n},

Q_n = (1/s_n) [ Max_{1≤k≤n} Σ_{j=1}^{k} (x_j − x̄_n) − Min_{1≤k≤n} Σ_{j=1}^{k} (x_j − x̄_n) ],

where s_n is the maximum likelihood estimator of the standard deviation of x. The first bracketed term is the maximum of the partial sums of the first k deviations of x_j from the full-sample mean, which is nonnegative. The second bracketed term is the corresponding minimum, which is nonpositive. The difference of these two quantities is thus nonnegative, so that Q_n > 0. Empirical studies have demonstrated that the R/S statistic has the ability to detect long-range dependence in the data.

Like many other estimators of long-range dependence, though, the R/S statistic has been shown to be excessively sensitive to short-range dependence, or short-memory, features of the data. Lo (1991) shows that a sizable AR(1) component in the data generating process will seriously bias the R/S statistic. He modifies the R/S statistic to account for the effect of short-range dependence by applying a Newey-West correction (using a Bartlett window) to derive a consistent estimate of the long-range variance of the timeseries. For maxlag > 0, the denominator of the statistic is computed as the Newey-West estimate of the long-run variance of the series. If maxlag is set to zero, the test performed is the classical Hurst-Mandelbrot rescaled-range statistic. Critical values for the test are taken from Lo (1991), Table II.

Inference from the modified R/S test for long-range dependence is complementary to that derived from other tests for long memory, or fractional integration, in a timeseries, such as kpss, gphudak, modlpr and roblpr.

²This discussion is drawn from Baum and Room (2000).
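The classical statistic is straightforward to compute. A sketch using the MLE (divide-by-n) standard deviation, as in the formula above; lomodrs additionally applies the rescaling and Newey-West correction described in the text:

```python
from math import sqrt

def rescaled_range(x):
    """Classical Hurst-Mandelbrot R/S statistic:
    (max_k partial-sum deviation - min_k partial-sum deviation) / s_n."""
    n = len(x)
    mean = sum(x) / n
    s_n = sqrt(sum((v - mean) ** 2 for v in x) / n)   # MLE standard deviation
    partial, cum = [], 0.0
    for v in x:
        cum += v - mean
        partial.append(cum)
    return (max(partial) - min(partial)) / s_n

rs = rescaled_range([1.0, 2.0, 3.0, 4.0])   # range 2.0 over s_n = sqrt(1.25)
```

Note that the partial sum at k = n is zero by construction, so the maximum is always ≥ 0 and the minimum ≤ 0, exactly as the text observes.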
1.2.2 The Geweke and Porter-Hudak log-periodogram regression estimator

gphudak performs the Geweke and Porter-Hudak (GPH, 1983) semiparametric log-periodogram regression, often described as the GPH test, for long memory (fractional integration) in a timeseries. The GPH method uses nonparametric methods (a spectral regression estimator) to evaluate d without explicit specification of the short-memory (ARMA) parameters of the series. The series is usually differenced, so that the resulting d estimate will fall in the [−0.5, 0.5] interval.

Geweke and Porter-Hudak (1983) proposed a semiparametric procedure to obtain an estimate of the memory parameter d of a fractionally integrated process X_t in a model of the form

(1 − L)^d X_t = ε_t,    (1.3)

where ε_t is stationary with zero mean and continuous spectral density f_ε(λ) > 0. The estimate d̂ is obtained from the application of ordinary least squares to

log I_x(λ_s) = ĉ − d̂ log|1 − e^{iλ_s}|² + residual,    (1.4)

computed over the fundamental frequencies {λ_s = 2πs/n, s = 1, …, m}, m < n. We define ω_x(λ_s) = (2πn)^{−1/2} Σ_{t=1}^{n} X_t e^{itλ_s} as the discrete Fourier transform (dft) of the timeseries X_t, and I_x(λ_s) = ω_x(λ_s)ω_x*(λ_s) as the periodogram; the regressor is log|1 − e^{iλ_s}|². Ordinary least squares on (1.4) yields the estimate d̂.    (1.5)

Various authors have proposed methods for the choice of m, the number of Fourier frequencies included in the regression. The regression slope estimate is an estimate of the slope of the series' power spectrum in the vicinity of the zero frequency; if too few ordinates are included, the slope is calculated from a small sample. If too many are included, medium- and high-frequency components of the spectrum will contaminate the estimate. A choice of √T, i.e. power = 0.5, is often employed. To evaluate the robustness of the GPH estimate, a range of power values from 0.40 to 0.75 is commonly calculated as well. Two estimates of the d coefficient's standard error are commonly employed: the regression standard error, giving rise to a standard t test, and an asymptotic standard error, based upon the theoretical variance π²/6 of the log periodogram. The statistic based upon that standard error has a standard normal distribution under the null.
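The GPH regression can be sketched with a naive O(n·m) dft (no FFT), regressing log I(λ_s) on log|1 − e^{iλ_s}|² and taking d̂ as minus the OLS slope. This is an illustration of the method, not the gphudak implementation:

```python
import random
from math import pi, log
from cmath import exp as cexp

def gph_estimate(x, power=0.5):
    """Log-periodogram (GPH-style) estimate of d: regress log I(lambda_s)
    on log|1 - exp(i*lambda_s)|^2 over s = 1..m, with m ~ n**power."""
    n = len(x)
    m = max(2, int(round(n ** power)))
    xs, ys = [], []
    for s in range(1, m + 1):
        lam = 2 * pi * s / n
        w = sum(xt * cexp(1j * lam * t) for t, xt in enumerate(x, start=1))
        I = abs(w) ** 2 / (2 * pi * n)                 # periodogram ordinate
        xs.append(log(abs(1 - cexp(1j * lam)) ** 2))   # regressor
        ys.append(log(I))
    xbar, ybar = sum(xs) / m, sum(ys) / m
    slope = (sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
             / sum((a - xbar) ** 2 for a in xs))
    return -slope

# white noise is I(0): the estimate should be near zero, up to sampling noise
rng = random.Random(12345)
d_hat = gph_estimate([rng.gauss(0.0, 1.0) for _ in range(512)])
```

With power = 0.5 only about √n ordinates enter the regression, so the sampling noise in d̂ is substantial; this is the bias/variance trade-off in the choice of m discussed above.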
1.2.3 The Phillips Modified GPH log-periodogram regression estimator

modlpr computes a modified form of the GPH estimate of the long-memory parameter d of a timeseries, proposed by Phillips (1999a, 1999b). Phillips (1999a) points out that the prior literature on this semiparametric approach does not address the case of d = 1, or a unit root, in (1.3), despite the broad interest in determining whether a series exhibits unit-root or long-memory behavior; his work shows that the d̂ estimate of (1.5) is inconsistent when d > 1, with d̂ exhibiting asymptotic bias toward unity. This weakness of the GPH estimator is solved by Phillips' modified log-periodogram regression estimator, in which the dependent variable is modified to reflect the distribution of d under the null hypothesis that d = 1. The estimator gives rise to a test statistic for d = 1 which is a standard normal variate under the null. Phillips suggests that deterministic trends should be removed from the series before application of the estimator. Accordingly, the routine will automatically remove a linear trend from the series; this may be suppressed with the notrend option. The comments above regarding power apply equally to modlpr.

Phillips' (1999b) modification of the GPH estimator is based on an exact representation of the dft in the unit-root case. The modification expresses the dft ω_x(λ_s) in terms of ω_u(λ_s) and a correction term involving the final observation X_n, and defines the modified dft as

v_x(λ_s) = ω_x(λ_s) + e^{iλ_s} X_n / [ (1 − e^{iλ_s}) √(2πn) ],

with associated periodogram ordinates I_v(λ_s) = v_x(λ_s) v_x*(λ_s) (1999b, p. 9). He notes that both v_x(λ_s) and thus I_v(λ_s) are observable functions of the data. The log-periodogram regression is now the regression of log I_v(λ_s) on a_s = log|1 − e^{iλ_s}|. Defining ā = m^{−1} Σ_{s=1}^{m} a_s, the modified estimate of the long-memory parameter is −1/2 times the OLS slope of this regression:

d̂ = −(1/2) Σ_{s=1}^{m} (a_s − ā) log I_v(λ_s) / Σ_{s=1}^{m} (a_s − ā)².    (1.6)

Phillips proves that, with appropriate assumptions on the distribution of u_t, the distribution of d̂ follows

√m (d̂ − d) →_d N(0, π²/24),    (1.7)

so that d̂ has the same limiting distribution at d = 1 as does the GPH estimator in the stationary case, and d̂ is consistent for values of d around unity. A semiparametric test statistic for a unit root against a fractional alternative is then
based upon the statistic 1999a p10 w 1 Tr with critical values from the standard normal distribution This test is consistent against both 1 lt 1 and d gt 1 fractional alternatives 2d 18 124 Robinson s Log Periodogram Regression estimator roblpr computes the Robinson 1995 multivariate semiparametric estimate of the long memory fractional integration parameters 19 of a set of G timeseries yg g 1G with G 2 1 When applied to a set of timeseries the 19 parameter for each series is estimated from a single log periodogram regression which allows the intercept and slope to differ for each series One of the innovations of Robinson s estimator is that it is not restricted to using a small fraction of the ordinates of the empirical periodogram of the series that is the reasonable values of power need not exclude a sizable fraction of the original sample size The estimator also allows for the removal of one or more initial ordinates and for the averaging of the periodogram over adjacent frequencies The rationales for using non default values of either of these options are presented in Robinson 1995 Robinson 1995 proposes an alternative log periodogram regression estima tor which he claims provides modestly superior asymptotic e iciency to 010 010 being the Geweke and Porter Hudak estimator 1995 p1052 Robin son s formulation of the log periodogram regression also allows for the formula tion of a multivariate model providing justi cation for tests that different time series share a common differencing parameter Normality of the underlying time series is assumed but Robinson claims that other conditions underlying his derivation are milder than those conjectured by GPH 6 We present here Robinson s multivariate formulation which applies to a sin gle time series as well Let Xt represent a G dimensional vector with 9 element th g 1 G Assume that Xt has a spectral density matrix fquot7T ei f A 1A with g h element denoted as fgh A The 9 diagonal element fgg A is the power spectral 
density of X_{gt}. For 0 < C_g < ∞ and −1/2 < d_g < 1/2, assume that f_{gg}(λ) ∼ C_g λ^{−2d_g} as λ → 0+, for g = 1, …, G. The periodogram of X_{gt} is then denoted as

  I_g(λ) = (2πn)^{−1} | Σ_{t=1}^{n} X_{gt} e^{itλ} |²,  g = 1, …, G   (19)

Without averaging the periodogram over adjacent frequencies, nor omission of l initial frequencies from the regression, we may define Y_{gk} = log I_g(λ_k). The least squares estimates of c = (c_1, …, c_G)′ and d = (d_1, …, d_G)′ are given by

  [c̃, d̃]′ = vec(Y′Z (Z′Z)^{−1})

where Z = (Z_1, …, Z_m)′, Z_k = (1, −2 log λ_k)′, Y = (Y_1, …, Y_G), and Y_g = (Y_{g,1}, …, Y_{g,m})′, for the m periodogram ordinates. Standard errors for d̃_g, and for a test of the restriction that two or more of the d_g are equal, may be derived from the estimated covariance matrix of the least squares coefficients. The standard errors for the estimated parameters are derived from a pooled estimate of the variance in the multivariate case, so that their interval estimates differ from those of their univariate counterparts. Modifications to this derivation when the frequency-averaging (j) or omission-of-initial-frequencies (l) options are selected may be found in Robinson (1995).

13 Maximum likelihood estimators of ARFIMA models

The theory and implementation of Sowell's exact maximum likelihood estimator of the ARFIMA(p,d,q) model, using Ox, is described in Doornik and Ooms (1999).

14 Applications

Examples of the application of the lomodrs and classical rescaled-range estimators: data from Terence Mills' Econometric Analysis of Financial Time Series on returns from the annual S&P 500 index of stock prices, 1871-1997, are analyzed.

  . use http://fmwww.bc.edu/ec-p/data/Mills2d/sp500a.dta

  . lomodrs sp500ar

  Lo Modified R/S test for sp500ar
  Critical values for H0: sp500ar is not long-range dependent
  90%: [0.861, 1.747]
  95%: [0.809, 1.862]
  99%: [0.721, 2.098]
  Test statistic:  .780838   (1 lags via Andrews criterion)   N = 124

  . lomodrs sp500ar, max(0)

  Hurst-Mandelbrot Classical R/S test for sp500ar
  Critical values for H0: sp500ar is not long-range dependent
  90%: [0.861, 1.747]
  95%: [0.809, 1.862]
  99%: [0.721, 2.098]
  Test statistic:  .799079   N = 124

  . lomodrs sp500ar if tin(1946,)

  Lo Modified R/S test for sp500ar
  Critical values for H0: sp500ar
is not long-range dependent
  90%: [0.861, 1.747]
  95%: [0.809, 1.862]
  99%: [0.721, 2.098]
  Test statistic:  1.08705   (0 lags via Andrews criterion)   N = 50

For the full sample, the null of stationarity may be rejected at 95% using either the Lo modified R/S statistic or the classic Hurst-Mandelbrot statistic. For the postwar data, the null may not be rejected at any level of significance. Long-range dependence, if present in this series, seems to be contributed by pre-World War II behavior of the stock price series.

Examples of the gphudak, modlpr and roblpr estimators: data from Terence Mills' Econometric Analysis of Financial Time Series on UK FTA All Share stock returns (ftaret) and dividends (ftadiv) are analyzed.

  . use http://fmwww.bc.edu/ec-p/data/Mills2d/fta.dta
  . tsset
  time variable:  month, 1965m1 to 1995m12

  . gphudak ftaret, power(0.5 0.6 0.7)

  GPH estimate of fractional differencing parameter
  Power  Ords     Est d    StdErr  t(H0: d=0)  P>|t|  Asy.StdErr  z(H0: d=0)  P>|z|
   .50     20    .00204   .160313       .0127  0.990     .187454       .0109  0.991
   .60     35   .228244   .145891      1.5645  0.128     .130206      1.7529  0.080
   .70     64   .141861   .089922      1.5776  0.120     .091267      1.5544  0.120

  . modlpr ftaret, power(0.5 0.55(0.05)0.8)

  Modified LPR estimate of fractional differencing parameter
  Power  Ords      Est d    Std Err  t(H0: d=0)  P>|t|  z(H0: d=1)  P>|z|
   .50     19   .0231191    .139872      0.1653  0.870     -6.6401  0.000
   .55     25   .2519889   .1629533      1.5464  0.135     -5.8322  0.000
   .60     34   .2450011   .1359888      1.8016  0.080     -6.8650  0.000
   .65     46   .1024504   .1071614      0.9560  0.344     -9.4928  0.000
   .70     63   .1601207   .0854082      1.8748  0.065    -10.3954  0.000
   .75     84   .1749659     .08113      2.1566  0.034    -11.7915  0.000
   .80    113   .0969439   .0676039      1.4340  0.154    -14.9696  0.000

  . roblpr ftaret

  Robinson estimates of fractional differencing parameter
  Power  Ords      Est d    Std Err        t  P>|t|
   .90    205   .1253645   .0446745   2.8062  0.005

  . roblpr ftap ftadiv

  Robinson estimates of fractional differencing parameters
  Power: .90   Ords: 205
  Variable |     Est d    Std Err        t  P>|t|
  ftap     |  .8698092   .0163302  53.2640  0.000
  ftadiv   |  .8717427   .0163302  53.3824  0.000
  Test for equality of d coefficients:  F(1,406) = .00701   Prob > F = 0.9333

  . constraint define 1 ftap = ftadiv
  . roblpr ftap ftadiv ftaret, c(1)

  Robinson estimates of fractional
differencing parameters
  Power: .90   Ords: 205
  Variable |     Est d    Std Err        t  P>|t|
  ftap     |  .8707759   .0205143  42.4473  0.000
  ftadiv   |  .8707759   .0205143  42.4473  0.000
  ftaret   |  .1253645   .0290116   4.3212  0.000
  Test for equality of d coefficients:  F(1,610) = 44.011   Prob > F = 0.0000

The GPH test applied to the stock returns series generates estimates of the long memory parameter that cannot reject the null at the ten percent level using the t-test. Phillips' modified LPR applied to this series finds that d = 1 can be rejected for all powers tested, while d = 0 (stationarity) may be rejected at the ten percent level for powers 0.6, 0.7, and 0.75. Robinson's estimate for the returns series alone is quite precise. Robinson's multivariate test applied to the price and dividends series finds that each series has d > 0. The test that they share the same d cannot be rejected. Accordingly, the test is applied to all three series, subject to the constraint that the price and dividends series have a common d, yielding a more precise estimate of the difference in d parameters between those series and the stock returns series.

References

[1] Andrews, D. 1991. Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica 59, 817-858.
[2] Baillie, R. 1996. Long Memory Processes and Fractional Integration in Econometrics. Journal of Econometrics 73, 5-59.
[3] Doornik, Jurgen A. and Marius Ooms. 1999. A package for estimating, forecasting and simulating Arfima models: Arfima package 1.0 for Ox. Available from the course homepage.
[4] Baum, Christopher F. and Tairi Room. 2000. The modified rescaled range test for long memory. Help file for Stata module lomodrs, available from SSC-IDEAS at http://ideas.repec.org.
[5] Baum, Christopher F. and Vince Wiggins. 2000. Tests for long memory in a time series. Stata Technical Bulletin 57. Available from the course homepage.
[6] Geweke, J. and Porter-Hudak, S. 1983. The Estimation and Application of Long Memory Time Series Models. Journal of Time Series Analysis 4, 221-238.
[7] Granger, C.W.J. and R. Joyeux. 1980. An
introduction to long memory time series models and fractional differencing. Journal of Time Series Analysis 1, 15-39.
[8] Hosking, J.R.M. 1981. Fractional Differencing. Biometrika 68, 165-176.
[9] Hurst, H. 1951. Long-Term Storage Capacity of Reservoirs. Transactions of the American Society of Civil Engineers 116, 770-799.
[10] Lo, Andrew W. 1991. Long-Term Memory in Stock Market Prices. Econometrica 59, 1279-1313.
[11] Mandelbrot, B. 1972. Statistical Methodology for Non-Periodic Cycles: From the Covariance to R/S Analysis. Annals of Economic and Social Measurement 1, 259-290.
[12] Phillips, Peter C.B. 1999a. Discrete Fourier Transforms of Fractional Processes. Unpublished working paper No. 1243, Cowles Foundation for Research in Economics, Yale University. http://cowles.econ.yale.edu/P/cd/d12a/d1243.pdf
[13] Phillips, Peter C.B. 1999b. Unit Root Log Periodogram Regression. Unpublished working paper No. 1244, Cowles Foundation for Research in Economics, Yale University. http://cowles.econ.yale.edu/P/cd/d12a/d1244.pdf
[14] Robinson, P.M. 1995. Log-Periodogram Regression of Time Series with Long Range Dependence. Annals of Statistics 23(3), 1048-1072.
[15] Sowell, F. 1992. Maximum likelihood estimation of stationary univariate fractionally integrated time series models. Journal of Econometrics 53, 165-188.

EC821 Time Series Econometrics, Spring 2003: Notes, Section 5

Unit root tests

Given the distinction between trend-stationary and unit root processes, it would seem to be very important to be able to determine whether a particular time series (which, for instance, generally increases in value) is being driven by some underlying trend, or whether its evolution reflects a unit root in its data generating process. Those who study macroeconomic phenomena will want to know whether economic recessions have permanent consequences for the level of future GDP, as they would if GDP exhibits a unit root, or whether they are merely deviations from a trend rate of growth: temporary downturns that will be offset by the following recovery. Those who are concerned with the stock market
want to know whether stock prices really do follow a random walk, i.e., exhibit unit root behavior, rather than some complicated combination of trends and cycles. If stock prices' behavior reflects a unit root, then "technical analysis" or "charting" is no more useful than astrology. On the other hand, if there are no unit roots in stock prices, all of the effort applied by stock analysts to studying the behavior of these series should have a reward. This concern has given rise to a battery of unit root tests: statistical procedures designed to render a verdict as to whether a given sample of time series data appears to imply the presence of a unit root in that time series, or whether the series may be considered stationary. In terms of our prior terminology, we are trying to discern whether the series exhibits I(1) (unit root) or I(0) (stationary) behavior.

It turns out that this is a fairly difficult problem from a statistical perspective. It might appear sufficient to merely estimate an equation such as y_t = φ₁ y_{t−1} + ε_t, modified to the form

  Δy_t = γ y_{t−1} + ε_t   (1)

using the available sample of data, and test the null hypothesis that φ₁ = 1 (i.e., γ = φ₁ − 1 = 0). For various reasons, that turns out to be problematic, in the sense that the distribution of the test statistic is nonstandard under that null. The t-test for γ = 0 does not have a t distribution under the null hypothesis; even as T → ∞, the distribution of this t-statistic will not be N(0,1). Under the alternative hypothesis the test statistic is well behaved, but under the null (the point of interest) it follows the Dickey-Fuller distribution rather than the Normal or t. The critical points of the D-F distribution, as established by simulation, are considerably larger in absolute value than those of the equivalent t: whereas a value of −1.645 would be on the borderline of rejection at the 95% level for a one-tailed t-test, the D-F critical value would be approximately −1.95 for T = 1000. Of course, the model (1) may not be the appropriate special case of the autoregressive distributed lag model; we may want to allow for
an additional term, which would become a constant term in a stable autoregressive process, or a drift term in a random walk process. Otherwise, we are specifying a stable autoregressive process with a zero mean under the alternative hypothesis, which may not be sensible unless the series has already been demeaned. With that modification, we would test

  Δy_t = μ + γ y_{t−1} + ε_t   (2)

which would then allow both a test for a unit root (γ = 0) and a joint test for a white noise process (an F-test for γ = 0 and μ = 0). Note that the critical values for the t-test are not the same as those that would be used in (1); for instance, the D-F critical value for T = 1000 in this test is −2.864. One must also note that this model would not be appropriate if there was an obvious trend in the series, since the model under the alternative has no mechanism to generate such a trend (as the RW-with-drift model does under the null). The most general form of the standard D-F test allows for both a constant in the relationship and a deterministic trend:

  Δy_t = μ + γ y_{t−1} + β t + ε_t   (3)

Such a model will allow for both a nonzero mean for y (μ ≠ 0) and trending behavior (β ≠ 0) under the alternative hypothesis, where γ < 0. The most likely null hypothesis is then that of a RW with drift, so that under H0, γ = 0 and β = 0 (no deterministic trend). This null could be rejected for three reasons: (a) there could be no unit root, but a deterministic trend; (b) there could be a unit root, but with a deterministic trend; or (c) there might be no unit root nor deterministic trend. The most general alternative is (a), for which an F-test is required, since two restrictions on the parameter vector are implied under the null. The F-statistic is calculated in the normal way, but the distribution is again nonstandard, and tabulated values for the D-F F-distribution must be consulted. More commonly, we consider a t-test on γ; once again, the critical values are specific to model (3). For instance, the D-F critical value for T = 1000 in this test is −3.408: larger yet in absolute value than the
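The nonstandard null distribution of this t statistic is easy to see by simulation. A minimal sketch in Python (a hypothetical illustration, not part of the course's Stata toolkit) simulates the no-constant case of model (1) under the random walk null and reports the 5% quantile of the t statistics, which sits near the Dickey-Fuller value of about −1.95 rather than the Normal value of −1.645:

```python
import numpy as np

# Monte Carlo: t statistic for gamma = 0 in  Delta y_t = gamma * y_{t-1} + e_t
# when the data are generated as a pure random walk (the null hypothesis).
rng = np.random.default_rng(0)
T, reps = 250, 2000
tstats = np.empty(reps)
for r in range(reps):
    y = np.cumsum(rng.standard_normal(T + 1))    # random walk of length T+1
    dy, ylag = np.diff(y), y[:-1]
    g = (ylag @ dy) / (ylag @ ylag)              # OLS slope, no constant
    u = dy - g * ylag
    se = np.sqrt((u @ u) / (len(dy) - 1) / (ylag @ ylag))
    tstats[r] = g / se
q05 = np.quantile(tstats, 0.05)                  # near -1.95, not -1.645
```

With a few thousand replications the simulated 5% quantile reproduces the tabulated D-F critical value to within Monte Carlo error.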
critical values in the constant-only model, which in turn exceed those for the original white noise model.

Any of the forms of this test presume the existence of white noise errors in the regression. If that is implausible, the test will lose significant power. To cope with this issue, any of the Dickey-Fuller tests in practice are usually employed as the "augmented Dickey-Fuller" (ADF) test, in which a number of lags of the dependent variable are added to the regression to whiten the errors:

  Δy_t = μ + γ y_{t−1} + θ₁ Δy_{t−1} + θ₂ Δy_{t−2} + ⋯ + ε_t   (4)

In this formulation, we consider an AR(p) model as the baseline model, rather than the AR(1) model of the simple Dickey-Fuller framework. The choice of appropriate lag length is likely to depend on the frequency of the data; a general-to-specific strategy (analogous to the Ng-Perron sequential t procedure) or an information criterion may also be used. We will discuss the use of a modified AIC below.

Phillips-Perron tests

The augmentation of the original D-F regression with lags of the dependent variable is motivated by the need to generate i.i.d. errors in that model, since an OLS estimator of the covariance matrix is being employed. An alternative strategy for allowing for errors that are not i.i.d. is that of Phillips (1987) and Phillips and Perron (1988), known as the Phillips-Perron (PP) unit root test. The PP test deals with potential serial correlation in the errors by employing a correction factor that estimates the long-run variance of the error process with a variant of the Newey-West formula. Like the ADF test, use of the PP test requires specification of a lag order; in the latter case, the lag order designates the number of lags to be included in the long-run variance estimate. The PP test allows for dependence among disturbances of either AR or MA form, but has been shown to exhibit serious size distortions in the presence of negative autocorrelations. In principle, the PP tests should be more powerful than the ADF alternative. The same critical values are used
for the ADF and PP tests.

The DF-GLS test

Conventional unit root tests are known to lose power dramatically against stationary alternatives with a low-order MA process: a characterization that fits well to a number of macroeconomic time series. Consequently, these original tests have been largely supplanted in many researchers' toolkits by improved alternatives. Along the lines of the ADF test, a more powerful variant is the DF-GLS test proposed by Elliott, Rothenberg and Stock (ERS, 1996), described in Baum (2000, 2001) and implemented in Stata as the command dfgls. (This command is not built in to Stata version 7, but can be readily installed using the command ssc install dfgls; it is already installed in the version of Stata running on fmrisc.bc.edu, and is available as part of official Stata in version 8.) dfgls performs the ERS efficient test for an autoregressive unit root. This test is similar to an augmented Dickey-Fuller t-test, as performed by dfuller, but has the best overall performance in terms of small-sample size and power, dominating the ordinary Dickey-Fuller test. The dfgls test "has substantially improved power when an unknown mean or trend is present" (ERS, p.813).

dfgls applies a generalized least squares (GLS) detrending (demeaning) step to the varname:

  y*_t = y_t − δ̂′ z_t

For detrending, z_t = (1, t)′ and δ̂ is calculated by regressing (y₁, (1 − ᾱL)y₂, …, (1 − ᾱL)y_T) onto (z₁, (1 − ᾱL)z₂, …, (1 − ᾱL)z_T), where ᾱ = 1 + c̄/T with c̄ = −13.5, and L is the lag operator. For demeaning, z_t = (1)′ and the same regression is run with c̄ = −7.0. The values of c̄ are chosen so that "the test achieves the power envelope against stationary alternatives (is asymptotically MPI (most powerful invariant)) at 50 percent power" (Stock, 1994, p.2769, emphasis added). The augmented Dickey-Fuller regression is then computed using the y*_t series:

  Δy*_t = α + γ t + ρ y*_{t−1} + Σ_{i=1}^{m} δᵢ Δy*_{t−i} + ε_t

where m = maxlag. The notrend option suppresses the time trend in this regression. Approximate critical values for the GLS-detrended test are taken from ERS, Table 1 (p.825). Approximate critical values for
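The GLS demeaning step just described is simple to reproduce. A minimal sketch in Python for the constant-only case with c̄ = −7 (the function name is hypothetical and this is not the dfgls code itself):

```python
import numpy as np

def gls_demean(y, cbar=-7.0):
    """ERS-style quasi-difference demeaning (constant-only case, cbar = -7)."""
    T = len(y)
    a = 1 + cbar / T                                     # local-to-unity alpha-bar
    yq = np.concatenate(([y[0]], y[1:] - a * y[:-1]))    # (1 - a L) y, first obs kept
    zq = np.concatenate(([1.0], np.full(T - 1, 1 - a)))  # same filter applied to z_t = 1
    delta = (zq @ yq) / (zq @ zq)                        # OLS of quasi-differenced y on z
    return y - delta                                     # locally demeaned series y*
```

A constant series is demeaned exactly to zero, as the algebra implies; the ADF regression would then be run on the returned y* series.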
the GLS-demeaned test are identical to those applicable to the no-constant, no-trend Dickey-Fuller test, and are computed using the dfuller code.

The dfgls routine includes a very powerful lag selection criterion: the modified AIC (MAIC) criterion proposed by Ng and Perron (2001). They have established that use of this MAIC criterion may provide "huge size improvements" (2001, abstract) in the dfgls test. The lag order indicated by the criterion is printed on dfgls output, and may be used to select the test statistic from which inference is to be drawn. It should be noted that all of the lag length criteria employed by dfgls (the sequential t-test of Ng and Perron (1995), the SC, and the MAIC) are calculated, for various lags, by holding the sample size fixed at that defined for the longest lag. These criteria cannot be meaningfully compared over lag lengths if the underlying sample is altered to use all available observations. That said, if the optimal lag length (by whatever criterion) is found to be much less than that picked by the Schwert criterion, it would be advisable to rerun the test with the maxlag option specifying that optimal lag length, especially when using samples of modest size.

The KPSS test

An alternative test is that proposed by Kwiatkowski et al. (1992): the so-called KPSS test, which has a null hypothesis of stationarity, that is, H0: y ∼ I(0). It is also described in Baum (2000) and implemented in Stata as the command kpss. (This command is not built in to Stata, but can be readily installed in any version of Stata with access to the Web by using the ssc install kpss command; it is already installed in the version of Stata running on fmrisc.bc.edu.) kpss performs the Kwiatkowski-Phillips-Schmidt-Shin (KPSS, 1992) test for stationarity of a time series. The test may be conducted under the null of either trend stationarity (the default) or level stationarity. Inference from this test is complementary to that derived from tests based on the Dickey-Fuller distribution, such as dfgls, dfuller,
and pperron. The KPSS test is often used in conjunction with those tests to investigate the possibility that a series is fractionally integrated: that is, neither I(1) nor I(0); see Lee and Schmidt (1996).

The series is detrended (demeaned) by regressing y on z_t = (1, t) (z_t = (1)), yielding residuals e_t. Let the partial sum series of e_t be S_t. Then the zero-order KPSS statistic is

  k₀ = T^{−2} Σ_{t=1}^{T} S_t² / (T^{−1} Σ_{t=1}^{T} e_t²)

For maxlag > 0, the denominator is computed as the Newey-West estimate of the long-run variance of the series; see [R] newey. Approximate critical values for the KPSS test are taken from KPSS (1992).

The kpss routine includes two options recommended by the work of Hobijn et al. (1998). An automatic bandwidth selection routine has been added, rendering it unnecessary to evaluate a range of test statistics for various lags. An option to weight the empirical autocovariance function by the Quadratic Spectral kernel, rather than the Bartlett kernel employed by KPSS, has also been introduced. These options may be used separately or in conjunction. It is in conjunction that Hobijn et al. found the greatest improvement in the test: "Our Monte Carlo simulations show that the best small sample results of the test in case the process exhibits a high degree of persistence are obtained using both the automatic bandwidth selection procedure and the Quadratic Spectral kernel" (1998, p.14).

The qs option specifies that the autocovariance function is to be weighted by the Quadratic Spectral kernel, rather than the Bartlett kernel. Andrews (1991) and Newey and West (1994) "indicate that it yields more accurate estimates of [the long-run variance] than other kernels in finite samples" (Hobijn et al., 1998, p.6). The auto option specifies that the automatic bandwidth selection procedure proposed by Newey and West (1994), as described by Hobijn et al. (1998, p.7), is used to determine maxlag, in two stages. First, the "a priori nonstochastic bandwidth parameter" n_T is chosen as a function of the sample size and the specified kernel. The autocovariance function of the estimated residuals is
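The zero-order statistic and its Bartlett-weighted extension can be sketched in a few lines. A minimal Python illustration of the level-stationarity case (a hypothetical helper, not the Stata kpss routine):

```python
import numpy as np

def kpss_level(y, lags=0):
    """KPSS statistic under level stationarity, Bartlett-weighted for lags > 0."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    e = y - y.mean()              # residuals from regression on a constant
    S = np.cumsum(e)              # partial-sum process S_t
    s2 = e @ e / T                # zero-order denominator
    for l in range(1, lags + 1):  # Newey-West / Bartlett long-run variance terms
        s2 += 2 * (1 - l / (lags + 1)) * (e[l:] @ e[:-l]) / T
    return (S @ S) / (T ** 2 * s2)
```

For a stationary series the statistic stays small (the 95% critical value for level stationarity is 0.463), while for a random walk it grows with the sample size, so the null of stationarity is rejected.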
calculated, and used to generate γ̂ as a function of sums of autocorrelations. The maxlag to be used in computing the long-run variance, m̂_T, is then calculated as min(T, int(γ̂ T^q)), where q = 1/3 for the Bartlett kernel and q = 1/5 for the Quadratic Spectral kernel.

The Leybourne-McCabe test

Like the KPSS test, the test proposed by Leybourne and McCabe (LMc, 1994, 1999) has a null of stationarity and a unit root alternative hypothesis. The difference lies in the specification: the LMc test, like the ADF test, is parametric. It posits a null of ARIMA(p, 0, 0) (that is, AR(p) with deterministic trend), with an alternative of ARIMA(p, 1, 1). The LMc test is more cumbersome, as it requires estimation of this nonlinear model under the null hypothesis, but it has been shown to be more powerful than the KPSS test. Just as the PP test is a semiparametric alternative to the ADF (that is, the PP test uses a Newey-West long-run variance estimate to deal with dependence in the error process, rather than an explicit AR(p) specification), the KPSS test may be considered a semiparametric alternative to the LMc test. "The tests differ in how they take account of autocorrelation in y_t under H0" (1994, p.160): the LMc test accounts for autocorrelation in a parametric manner, by including lagged terms in the first difference of the series, as does the augmented Dickey-Fuller test. In contrast, the KPSS test modifies the test nonparametrically, in a manner similar to that in which the Phillips-Perron test is a nonparametric adjustment of the simple DF test (op. cit.). The authors find that once y_t does not resemble white noise, "the size of the KPSS test is likely to be quite badly approximated by its asymptotic distribution", even when the lag length l is relatively high (1994, p.161). The critical values for the LMc test are identical to those used for KPSS.

Combining inference from I(1) and I(0) tests

The two families of unit root tests may be used in conjunction to establish the nature of the data generating process for a given time series, and in particular to signal the presence
of fractional integration in the series. If inference from the DF-GLS test rejects its null hypothesis of unit root behavior (or nonstationarity), while the KPSS test also rejects its null, then we might conclude that both I(1) and I(0) are rejected by the data. That sets the stage for an alternative explanation of the time series' behavior: that of fractional integration, or long-range dependence, in which the series may be characterized as I(d), 0 < d < 1: neither I(0) nor I(1).

Seasonal unit root tests

The implicit assumption in applying unit root tests to data which have been deseasonalized is that the adjustment method does not affect inference on the stationarity of the time series. However, several authors have called that assumption into question in the case where the SA data have been generated by a moving average filter. The most popular SA technique is the US Census Bureau's X-11 seasonal adjustment program, which passes the data through a sequence of moving average filters. Monte Carlo simulations show that the power of standard unit root tests applied to SA data generated in this manner is reduced, so that the null of nonstationarity is not rejected frequently enough. Although one could deal with this issue by testing NSA data, they are not always available. One can modify the standard AR model that gives rise to the unit root test to

  y_t = φ_{1s} y_{t−s} + ε_t,  i.e., (1 − φ_{1s} L^s) y_t = ε_t

where s = 4 for quarterly data, s = 12 for monthly data, etc. If φ_{1s} = 1, then there is a unit root at the s-th seasonal frequency, and the s-th difference of y_t will remove it. Although this model could be directly tested with the D-F methodology, it is likely to be too simple for most applications, since it restricts the dynamics of y to depend only on seasonal differences. A natural extension of this model would be

  φ(L) Φ_s(L^s) y_t = ε_t   (5)

where φ(L) is a standard autoregressive polynomial and Φ_s(L^s) = 1 − φ_{1s} L^s − ⋯ − φ_{Ts} L^{Ts} is the seasonal polynomial. One could have unit roots in either, both, or neither of the polynomials, and if they exist in both of the polynomials, both
an ordinary difference and a seasonal difference would have to be applied to render the resulting series stationary. The test for a unit root in this context is that of Hylleberg et al. (HEGY, 1990). For quarterly data, the test may be implemented within Stata via the routine hegy4 of Baum and Sperling (findit hegy4). In this representation, the unit root at the quarterly frequency can be written as

  1 − L⁴ = (1 − L)(1 + L + L² + L³) = (1 − L)(1 + L)(1 + L²) = (1 − L)(1 + L)(1 − iL)(1 + iL)

and the composite polynomial in (5) can be expanded about its roots, with a remainder term in (1 − L⁴), yielding the estimating equation

  Δ₄ y_t = π₁ y_{1,t−1} + π₂ y_{2,t−1} + π₃ y_{3,t−2} + π₄ y_{3,t−1} + ε_t

where

  y_{1t} = (1 + L + L² + L³) y_t
  y_{2t} = −(1 − L + L² − L³) y_t
  y_{3t} = −(1 − L²) y_t

and this regression may be run as the equivalent of the D-F regression. A test of π₁ = 0 corresponds to a test of the standard unit root hypothesis, against the alternative of stationarity. A test of π₂ = 0 allows for the semi-annual root of −1, versus the stationary alternative. Seasonal unit roots at the quarterly frequency correspond to π₃ = π₄ = 0. Thus there will be no seasonal unit roots in the series if π₂ ≠ 0 and either π₃ ≠ 0 or π₄ ≠ 0, corresponding to a rejection of the null that π₂ = 0 and of the joint null that π₃ = π₄ = 0.

The Stata routine hegy4 performs the test for seasonal unit roots by estimating the four roots of the time series representation, and presents estimates of these roots as π₁ … π₄. A joint test for π₃ = π₄ = 0 is also presented. Critical values are those appropriate for T = 100, taken from HEGY Table 1. Joint tests for π₂ = π₃ = π₄ = 0 and π₁ = π₂ = π₃ = π₄ = 0, with critical values, are those presented by Ghysels et al. (1994). Critical values for the case of multiplicative seasonality (see below) are from Tables 1a-c in Smith and Taylor (1998). Critical values are linearly interpolated for sample sizes in the ranges 48-100 and 100-200.

Like the standard D-F test, it may be necessary to augment the HEGY test with additional lags of the dependent variable. The option lags(numlist) specifies the lag orders to be used in
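The transformed series entering the HEGY auxiliary regression are simple lag-polynomial filters of the data, and can be constructed directly. A sketch in Python (a hypothetical helper for illustration, not the hegy4 routine):

```python
import numpy as np

def hegy_transforms(y):
    """Auxiliary series for the quarterly HEGY regression."""
    y = np.asarray(y, dtype=float)
    y1 = y[3:] + y[2:-1] + y[1:-2] + y[:-3]     # (1 + L + L^2 + L^3) y_t
    y2 = -(y[3:] - y[2:-1] + y[1:-2] - y[:-3])  # -(1 - L + L^2 - L^3) y_t
    y3 = -(y[2:] - y[:-2])                      # -(1 - L^2) y_t
    d4 = y[4:] - y[:-4]                         # seasonal difference (1 - L^4) y_t
    return y1, y2, y3, d4
```

After aligning the lags, d4 would be regressed on lagged values of y1, y2 and y3 (plus any deterministic terms) to obtain the π estimates.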
augmenting the model with lags of the fourth difference of the time series; its default is zero. If sequential lags are specified (starting with 1), hegy4 automatically conducts a sequential t-test to determine the optimal lag length and the optimal lags to be included in the auxiliary regression. It may also be desirable to include deterministic terms, such as a constant, trend, and (in this case) seasonal dummy variables, in the model. In the hegy4 routine, the option det() may take on the values none, const, seas, trend, strend, or mult, specifying the process to be tested. The default, as suggested by HEGY and Ghysels et al. (1994), is seas, indicating that a set of 3 seasonal dummies plus a constant are to be included in the regression. none specifies that no deterministic variables are to be included; const specifies only a constant; trend specifies that a trend is to be included along with a constant term; strend specifies that a trend is to be included along with seasonal dummies and a constant term; mult specifies that seasonal intercepts (the case of multiplicative seasonality recommended by Smith and Taylor (1998)) are to be included along with seasonal dummies and a constant term. The HEGY model may also be defined for monthly data, although the algebra in that case is rather menacing; a Stata routine for that purpose is under development.

Testing for unit roots with structural breaks

A well-known problem in the unit root literature is the potential for a series which exhibits structural shifts to fail to reject the unit root null. In the simplest case, a series which undergoes a mean shift is not covariance stationary, but could be made so if regressed on a dummy that identified the shift period (zero before, one after). Early work along these lines was that of Perron (1989) and Perron and Vogelsang (1992). If there is a known break in a sample of T observations at point T_b, we may consider three extensions of the random walk with drift model, where y is considered to be in logs and no further dynamics are present:

  y_t = μ +
δ₁ DVTB_t + y_{t−1} + ε_t
  y_t = μ + δ₂ DVU_t + y_{t−1} + ε_t
  y_t = μ + δ₁ DVTB_t + δ₂ DVU_t + y_{t−1} + ε_t

where DVTB_t = 1 in the period T_b + 1 (zero otherwise), and DVU_t = 1 for t > T_b (zero otherwise). The first model considers a level shift (jump) at time T_b + 1; the second model considers a change in the growth rate of the series, effective at time T_b + 1; and the third model considers the possibility that both occur. The DVTB_t dummy is an impulse dummy, picking out the single period of the shift, while DVU_t is a shift dummy, which changes the underlying slope of the stochastic trend. The alternative hypotheses to each of these models are, respectively,

  y_t = μ + β t + δ₂ DVU_t + ε_t
  y_t = μ + β t + δ₃ DVT*_t + ε_t
  y_t = μ + β t + δ₂ DVU_t + δ₃ DVT_t + ε_t

where DVT*_t and DVT_t both equal zero if t ≤ T_b; DVT*_t = t − T_b if t > T_b; and DVT_t = t if t > T_b. Since there is drift in the first and third models under the null hypothesis, the alternative includes a deterministic trend. In the first model, the null is a unit root with level change; under the alternative, we have a trend-stationary series with a change in the intercept to μ + δ₂. In the second model, the null hypothesis is a unit root with a change in the drift, and the alternative is a trend-stationary series with a change in the slope to β + δ₃. In the third model, the null is a unit root series with change in both level and drift, and the alternative is a trend-stationary series with changes in the intercept and slope. In practice, further dynamics may be necessary to whiten the residuals. Perron suggests the use of two procedures for modeling this process, depending on whether adjustment following the break is assumed to be instantaneous or gradual. The former is known as the AO (additive outlier) case, where there is a single effect at the breakpoint. Alternatively, the IO (innovational outlier) model may be applied, which allows for a gradual adjustment of the series following the break.

The original derivation of these tests was performed conditional on a known breakpoint. More realistically, we may not know when, or even if, such a breakpoint
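The dummy variables defined above are straightforward to construct. A sketch in Python (a hypothetical helper; observation indexing follows the t = 1, …, T convention of the notes):

```python
import numpy as np

def break_dummies(T, Tb):
    """Impulse (DVTB), shift (DVU) and broken-trend (DVT*) dummies for a break at Tb."""
    t = np.arange(1, T + 1)                   # t = 1, ..., T
    DVTB = (t == Tb + 1).astype(float)        # one only in period Tb + 1
    DVU = (t > Tb).astype(float)              # one for all t > Tb
    DVTstar = np.where(t > Tb, t - Tb, 0.0)   # t - Tb after the break, else zero
    return DVTB, DVU, DVTstar
```

These regressors would then be included in the null or alternative models above (or their AO/IO augmented versions) before estimation.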
exists. If the tests are performed conditional on an a priori breakpoint, they may not have maximal power. More recent derivations of unit root tests in the presence of structural change have focused on unknown breakpoints, and in some cases on multiple structural breaks, and the methods needed to consistently detect them. For instance, Perron (1997) presents a procedure for locating the single breakpoint with highest likelihood by considering all possible breakpoints in the interior of the sample, selecting that point which maximizes the absolute value of the t-statistic for the structural change term. That article presents asymptotic critical values for the unit root test statistic in the presence of a single breakpoint.

A paper using this methodology to examine the impact of structural breaks on unit root testing is Baum et al. (1999), which considers the possibility that common findings of nonstationarity in real exchange rates may be an artifact of structural breaks in the series. In the end, they conclude that unit roots are present even when structural breaks and the potential for fractional integration are accounted for:

  "The unit-root test statistics forthcoming from the AO and IO models will account for one-time level shifts, which might otherwise be identified as departures from stationarity. However, the behavior of real exchange rate series over our sample period may not be adequately characterized by a single shift. As Lothian (1998) has noted, US dollar-based real exchange rates appear to have exhibited two shifts in mean over the 1980-1987 period, approximately reverting to their pre-1980 level after 1987. In these circumstances, allowing for a single level shift will not suffice."

The Perron-Vogelsang methodology has been extended to double mean shifts by Clemente et al. (1998), who demonstrate that a two-dimensional grid search for breakpoints T_b1 and T_b2 may be used for either the AO or IO models, and provide critical values for the tests. In this context, the AO model involves the estimation of
  y_t = μ + δ₁ DU1_t + δ₂ DU2_t + ỹ_t   (6)

and subsequently searching for the minimal t-ratio for the hypothesis α = 1 in the model

  ỹ_t = Σ_{i=0}^{k} ω₁ᵢ DTB1_{t−i} + Σ_{i=0}^{k} ω₂ᵢ DTB2_{t−i} + α ỹ_{t−1} + Σ_{i=1}^{k} θᵢ Δỹ_{t−i} + ε_t   (7)

for t = k + 2, …, T. For the IO model, the modified equation to be estimated becomes

  y_t = μ + δ₁ DU1_t + δ₂ DU2_t + ϑ₁ DTB1_t + ϑ₂ DTB2_t + α y_{t−1} + Σ_{i=1}^{k} θᵢ Δy_{t−i} + ε_t   (8)

for t = k + 2, …, T, with a search for the minimal t-ratio for the hypothesis α = 1. (These tests customarily are applied to a trimmed sample; we trimmed 5% of the sample from each end when searching for the breakpoints.)

Code to estimate unit root tests allowing for one or two structural breaks in either an AO or IO context is available for Stata as the routines clemao1, clemao2, clemio1 and clemio2. As it has not yet been documented, it is not in the SSC archive, but it is available from the instructor on request.

References

[1] Andrews, D.W.K. 1991. Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica 59, 817-858.
[2] Baum, Christopher F. 2000. Tests for stationarity of a time series. Stata Technical Bulletin 57. Available from the course homepage.
[3] Baum, Christopher F., John T. Barkoulas and Mustafa Caglayan. 1999. Long memory or structural breaks: Can either explain nonstationary real exchange rates under the current float? Journal of International Financial Markets, Institutions and Money 9, 359-376. Available as BC EC WP 380.
[4] Baum, Christopher F. and Vince Wiggins. 2001. Tests for stationarity of a time series: Update. Stata Technical Bulletin 58. Available from the course homepage.
[5] Clemente, J., Montañés, A. and M. Reyes. 1998. Testing for a unit root in variables with a double change in the mean. Economics Letters 59, 175-182.
[6] Elliott, Graham, Rothenberg, Thomas J. and James H. Stock. 1996. Efficient Tests for an Autoregressive Unit Root. Econometrica 64(4), 813-836.
[7] Ghysels, E., Lee, H.S. and J. Noh. 1994. Testing for Unit Roots in Seasonal Time Series: Some Theoretical Extensions and a Monte Carlo Investigation. Journal of Econometrics 62, 415-442.
[8] Hylleberg, S.,
Engle, R.F., Granger, C.W.J. and B.S. Yoo (1990), "Seasonal Integration and Cointegration," Journal of Econometrics, 44, 215–238.
9. Hobijn, Bart, Franses, Philip Hans and Marius Ooms (1998), "Generalizations of the KPSS-test for Stationarity," Econometric Institute Report 9802/A, Econometric Institute, Erasmus University Rotterdam. http://www.eur.nl/few/ei/papers
10. Lee, D. and P. Schmidt (1996), "On the power of the KPSS test of stationarity against fractionally-integrated alternatives," Journal of Econometrics, 73, 285–302.
11. Leybourne, S.J. and B.P.M. McCabe (1994), "A consistent test for a unit root," Journal of Business and Economic Statistics, 12, 157–166.
12. Leybourne, S.J. and B.P.M. McCabe (1999), "Modified stationarity tests with data-dependent model-selection rules," Journal of Business and Economic Statistics, 17, 264–270.
13. Lothian, J. (1998), "Some new stylized facts of floating exchange rates," Journal of International Money and Finance, 17, 29–39.
14. Newey, W.K. and K.D. West (1994), "Automatic Lag Selection in Covariance Matrix Estimation," Review of Economic Studies, 61, 631–653.
15. Ng, Serena and Pierre Perron (1995), "Unit root tests in ARMA models with data-dependent methods for the selection of the truncation lag," Journal of the American Statistical Association, 90, 268–281.
16. Ng, Serena and Pierre Perron (2001), "Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power," Econometrica, 69, 1519–1554.
17. Perron, Pierre (1989), "The Great Crash, the Oil Price Shock and the Unit Root Hypothesis," Econometrica, 57, 1361–1401.
18. Perron, P. (1990), "Testing for a unit root in a time series with a changing mean," Journal of Business and Economic Statistics, 8:2, 153–162.
19. Perron, Pierre (1997), "Further Evidence on Breaking Trend Functions in Macroeconomic Variables," Journal of Econometrics, 80, 355–385.
20. Perron, P. and T. Vogelsang (1992), "Nonstationarity and level shifts with an application to purchasing power parity," Journal of Business and Economic Statistics, 10:3, 301–320.
21. Smith, R.J. and A.M.R. Taylor (1998), "Additional Critical Values and Asymptotic Representations for Seasonal Unit Root Tests," Journal of Econometrics, 85,
269–288.
22. Stock, James H. (1994), "Unit Roots, Structural Breaks and Trends," Chapter 46 in Handbook of Econometrics, Volume IV, R.F. Engle and D.L. McFadden, eds. Amsterdam: Elsevier.

EC821 Time Series Econometrics, Spring 2003

Notes, Section 6: Cointegration

Consider the system

X_t = X_{t−1} + ε_t
Y_t = α + β X_t + ω_t

with ε_t and ω_t each i.i.d. standard Normal variates. There are no dynamics in the DGP, and we assume there is no contemporaneous correlation between the two error processes. In this case OLS will find unbiased and consistent estimates of α and β, despite the fact that the variance of X is not bounded. X and Y are said to be cointegrated, or to possess a cointegrating relationship. Y will be I(1), since it is composed of an I(1) variable plus a stationary error.

On the other hand, consider two independent random walks X and Y, and the regression of Y upon X. In that regression the true slope coefficient is zero, since there is no relationship whatsoever between these two integrated processes. But the regression, run either way, will yield a nonzero estimate of the slope coefficient, and the significance of that coefficient will not diminish with sample size. Indeed, the probability of rejecting the true null will increase with sample size: in a simple Monte Carlo experiment, the rejection rate for the t-statistic is 72% in samples of size 50, but 80% for T = 100 and 91% for T = 500, whereas the rejection rate should be 10%. Furthermore, the R² of a regression of unrelated nonstationary variables becomes a random variable, and the likelihood of finding a sizable R² in that context is quite high. Demonstrably, OLS does not yield consistent estimates of the true slope parameter in this instance: the case of a "spurious regression," as defined by Granger and Newbold (1974). A theoretical explanation of this phenomenon was presented by Phillips (1986). The problem of spurious regressions appears with I(1) variables, so the determination of unit root processes is essential. Furthermore, the problem will not arise if the series
are cointegrated, so determining whether cointegration exists is important as well.

What is cointegration? The notion that two or more nonstationary processes may be following the same stochastic trend, or may share an underlying common factor. Although X and Y are both I(1), a linear combination of those nonstationary variables may exist which is itself I(0). The coefficients in that linear combination form the cointegrating vector, which will have one element normalized to unity, since the CI vector is defined only up to a factor of proportionality. Additionally, the vector may include a constant to allow for unequal means of the two variables, so

Y_t = φ₁ + φ₂ X_t + ẑ_t

and the notion is that ẑ will be a stationary process in the presence of cointegration. The concept may be extended to higher orders of integration and to more than two variables. If X and Y are each I(2), then a linear combination of them might be I(1), or even I(0). We can speak of series as being cointegrated CI(d, b), where d is the common order of integration of the variables and b is the reduction in the order of integration of the cointegrating combination. Thus the case above would be CI(1, 1), with two I(1) variables forming a combination one order lower: I(0). A linear combination of I(1) variables is not a spurious regression if it is stationary.

The Engle–Granger approach

The original approach to testing for cointegration is that of Engle and Granger (1987). In this classic paper they demonstrate that if one regresses an I(1) variable upon another I(1) variable, in what is termed a balanced regression (one in which all variables share the same order of integration), the residuals from that regression may then be subjected to a unit root test. The null hypothesis in this case is that of non-cointegration: that is, failing to reject a unit root in the errors in a Dickey–Fuller-style test will yield the conclusion of non-CI, whereas rejection in favor of stationarity in the error process will be evidence of a cointegrating relationship among the variables. Stock (1987)
has shown that the OLS estimates in this regression have the desirable property of superconsistency: that is, they are not only consistent estimates of the underlying parameters of the DGP, but they converge on the population values more quickly than OLS estimates in the context of stationary regressors. The ADF-type test applied in this instance will not contain a constant term, since the OLS residuals will be mean zero with a constant included in the CI regression. The critical values in this case are not the same as those of the standard D-F distribution, since the time series being tested is a generated series. They are larger (negative) values than those provided for the D-F distribution and, like D-F critical values, are obtained by simulation. Note also that the regression could be run with either variable on the left-hand side: since it is not a structural relationship, if a CI vector exists it may be renormalized on the other variable. If the R² in the CI regression is low, however (less than 0.8), then the inference may differ depending upon normalization.

How would one operate with more than two variables? One may still form a cointegrating vector among three or more I(1) series and estimate the CI regression. As in the two-variable case, the estimates of the CI coefficients will be superconsistent. However, with more than two variables the weakness of the E-G approach emerges: the test can determine whether a CI vector exists that yields stationary errors, but it will generate only one of the possibly multiple CI vectors that could exist in this setting. When there are three variables, for instance, they could all be driven by the same common factor (stochastic trend), or their behavior could reflect two common factors, coinciding with the existence of two CI vectors. The E-G approach is not capable of finding more than one CI vector; other approaches, such as Johansen and Juselius' ML approach, can do so. Of course, it could be that there are zero common factors underlying these three variables'
dynamics, in which case the E-G approach will correctly reflect the absence of CI among them.

Cointegration and the error correction model

A simple error correction model (ECM) may be written as

ΔY_t = κ + θ₁ ΔX_t + θ₂ ẑ_{t−1} + ε_t

In this formulation, also known as a partial adjustment scheme, there is adjustment of Y toward a target Y*, which depends on the lagged disequilibrium. Imagine that there is a constant ratio in equilibrium between consumption and income (in logs, c and y), so that c_t = k + y_t. Then a measure of disequilibrium may be written as

ẑ_t = c_t − k − y_t

An error correction scheme might be written as

Δc_t = θ₁ Δy_t + θ₂ ẑ_{t−1} + ε_t

where consumers react to last period's disequilibrium by revising their consumption. Substituting in, we have

Δc_t = θ₀ + θ₁ Δy_t + θ₂ (c − y)_{t−1} + ε_t

where the parenthesized quantity is the error correction term. Consumption will change if income changes, or if there was a disequilibrium in the relationship last period. Note that the proportionality factor k has been subsumed in the constant term of this relationship; we could instead leave it in the error correction term as a coefficient on y. The coefficient θ₂ has limits −1 ≤ θ₂ < 0, since for stability one should not overadjust to the disequilibrium (beyond closing the entire gap this period), nor fail to adjust at all (a coefficient of zero, let alone a positive coefficient, would drive c away from its equilibrium relationship). The ECM contains both the short-run mechanism by which consumption adjusts to current changes in income and the long-run adjustment to equilibrium. The relative importance of short-run and long-run fluctuations in consumption will be governed by the relative magnitudes of the θ₁ and θ₂ coefficients. Incorporating the equilibrium coefficient in the relationship, we have

Δc_t = θ₀ + θ₁ Δy_t + θ₂ (c_{t−1} − φ₂ y_{t−1}) + ε_t    (1)

corresponding to the levels (or static) relationship

c_t = φ₁ + φ₂ y_t + ẑ_t    (2)

If these variables are both I(1), this relationship may be superconsistently estimated by OLS, and the residuals
tested for non-cointegration. If the null of non-CI is rejected, then the relationship (1) may be estimated by OLS, replacing the unknown ẑ_{t−1} with the lagged residuals from (2). If c_t and y_t are both I(1) and cointegrated CI(1, 1), and neither has a trend in the mean, then by the Granger Representation Theorem (Granger and Weiss, 1983) there will always exist an error correction representation of the form

Δc_t = lagged(Δc_t, Δy_t) + δ_c ẑ_{t−1} + ω₁t    (3)
Δy_t = lagged(Δc_t, Δy_t) + δ_y ẑ_{t−1} + ω₂t

where ẑ_t = c_t − φ₁ − φ₂ y_t, ω₁t = ψ₁(L) ε₁t and ω₂t = ψ₂(L) ε₂t, with the ε sequences white noise. It must be the case that |δ_c| + |δ_y| ≠ 0: that is, the lagged disequilibrium term must appear in at least one of the equations (in general it will appear in both). Note that this model is essentially a VAR, augmented by the error correction term. The transformation of the dynamic model into this form illustrates that a regular VAR in the differences of these variables would be misspecified, in that it would omit the error correction term. That term is required to fully specify the dynamics of the model; in its absence, the reversion of these variables to a long-run equilibrium is not modeled. These equations are balanced, in that if the levels variables are I(1), their differences are I(0). If the variables are CI(1, 1), then the error correction term will be I(0), and all terms in (3) are I(0). If these two variables are I(1) and CI(1, 1), then knowledge of the one variable helps to forecast the other, at least in one direction. Having established cointegration as a long-run property of the data, it is natural to think of an ECM as an appropriate way of capturing the dynamic adjustments of these variables to the long run.

Multiple cointegrating relationships

Consider a set of time series variables. If they are each trend stationary, a VAR may be employed to estimate their joint evolution and interdependence. If one or more of the variables in the VAR are nonstationary, it would be inappropriate to estimate the VAR in levels. However, it might also be inappropriate to estimate a VAR in differences, even in
the case where all variables in the VAR possess unit roots. If there are cointegrating relationships among the level variables, the proper representation will be the error correction model (ECM), and the VAR in differences may be seen to be a misspecified version of that model, excluding as it does the error correction term. The VAR in differences is also uninformative about the long-run behavior of the series, since it only expresses their short-run paths of adjustment, without any link to the long-run equilibrium relationships among the variables. The error correction model explicitly provides that link, capturing the short-run adjustment toward the long-run equilibria.

When there are more than two variables in the set, there may be multiple cointegrating relationships among them. In a two-variable system, the variables either form a cointegrating combination or they do not. In a three-variable system, there may be zero, one or two cointegrating vectors. If zero, then these are three independent random walks. If there are two CI vectors, the Engle–Granger procedure will locate one of them, but it is incapable of identifying the multiplicity, or of estimating a second relationship. Likewise for higher-order systems: in a system of order k, there may be up to k − 1 CI vectors defining long-run equilibria among the variables.

The most common methodology employed to evaluate multiple cointegrating relationships is that of Johansen (1988) and Johansen and Juselius (1990), which is based on the estimation of a pth-order VAR in the k variables. The VAR in the k-vector y is

y_t = Π₁ y_{t−1} + Π₂ y_{t−2} + … + Π_p y_{t−p} + Φ D_t + ε_t    (4)

where D_t is a d-vector of deterministic terms, such as a constant, trend and seasonal dummies. The VAR may be reparameterized into an ECM:

Δy_t = Π y_{t−1} + Γ₁ Δy_{t−1} + Γ₂ Δy_{t−2} + … + Γ_{p−1} Δy_{t−p+1} + Φ D_t + ε_t    (5)

No assumption is made about the rank of Π. In the decomposition Π = αβ′, α and β are k × k matrices. We seek to determine whether any columns of β (that is, rows of β′) are statistically indistinguishable from zero vectors. The existence
of r cointegrating vectors reduces the rank of Π by k − r: that is, if there are r cointegrating relationships among the variables, then there will be r nonzero eigenvalues in the dynamic system and k − r zero eigenvalues. The decomposition will then relate Π = αβ′, where α and β are both k × r matrices. If the CI rank is full, that is, r = k, then the VAR is stationary in the levels. If the rank is zero, then there is no implied long run, and the VAR may be safely reformulated in first differences.

The Johansen methodology provides inference on the number of nonzero eigenvalues (or CI relationships) by setting up an eigenvalue problem derived from the levels and differences of the k variables. The eigenvalues are ordered from largest to smallest. The space spanned by the r largest eigenvalues is the r-dimensional cointegrating space. If r = 1, β is k × 1, and is the eigenvector corresponding to the largest eigenvalue. If r = 2, β is k × 2: the first column is as before, and the second column is the eigenvector corresponding to the second-largest eigenvalue. Two statistics are defined in Johansen's work to determine the CI rank. First, the trace statistic

λ_trace = −T Σ_{i=r+1}^{k} ln(1 − λ̂ᵢ)    (6)

which allows for the test of H₀: rank(Π) = r against the alternative that rank(Π) = k. A large value of the trace statistic is evidence against H₀: that is, with r = 1, a value of the trace statistic greater than the appropriate critical value allows us to reject r = 1 in favor of r > 1. The test may then be repeated for r = 2, and so on. Alternatively, the λ_max statistic may be used:

λ_max = −T ln(1 − λ̂_{r+1})    (7)

This test allows for the comparison of a CI rank of r against the alternative of a CI rank of r + 1. This test also may then be repeated for larger values of r, until one fails to reject the null hypothesis. The distribution of both statistics is nonstandard, and depends on nuisance parameters in D_t. Critical values have been tabulated by Johansen and Osterwald-Lenum, and are reproduced in the textbook. Research by Reimers and by Cheung and Lai (1993)
have identified small-sample biases in the tabulated values of these test statistics, and they recommend applying a small-sample adjustment. Extensions of the Johansen methodology include tests of various restrictions on the CI vectors: either zero (exclusion) restrictions, indicating that certain variables should not appear in certain of the equilibrium relationships, or restrictions on parameter values, such as those forthcoming from theory (e.g., purchasing power parity not only specifies a long-run relationship, but indicates that the coefficients in the CI combination should each be unity in absolute value).

References

1. Cheung, Y.-W. and K.S. Lai (1993), "Finite-sample sizes of Johansen's likelihood ratio tests for cointegration," Oxford Bulletin of Economics and Statistics, 55, 313–328.
2. Engle, R. and C.W.J. Granger (1987), "Cointegration and error correction: Representation, estimation and testing," Econometrica, 55, 251–276.
3. Granger, C.W.J. and P. Newbold (1974), "Spurious regressions in econometrics," Journal of Econometrics, 2, 111–120.
4. Granger, C.W.J. and A. Weiss (1983), "Time-series analysis of error correction models," in S. Karlin, T. Amemiya and L. Goodman, eds., Studies in Econometrics, Time Series and Multivariate Statistics. Academic Press.
5. Johansen, S. (1988), "Statistical analysis of cointegration vectors," Journal of Economic Dynamics and Control, 12, 231–254.
6. Johansen, S. and K. Juselius (1990), "Maximum likelihood estimation and inference on cointegration, with applications to the demand for money," Oxford Bulletin of Economics and Statistics, 52:2, 169–210.
7. Phillips, P.C.B. (1986), "Understanding spurious regressions in econometrics," Journal of Econometrics, 33, 311–340.
8. Stock, James (1987), "Asymptotic properties of least squares estimators of cointegrating vectors," Econometrica, 55, 1035–1056.
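Appendix: illustrative simulations

The spurious-regression Monte Carlo described in Section 6 (rejection rates of 72%, 80% and 91% for T = 50, 100 and 500) can be reproduced in a few lines. The sketch below, in Python with NumPy, is illustrative rather than a replication of the original experiment: the seed, the number of replications and the use of the two-sided 10% Normal critical value 1.645 are assumptions, so the exact rates will differ somewhat from those quoted in the notes.

```python
import numpy as np

def spurious_rejection_rate(T, reps=500, seed=42):
    """Regress one pure random walk on another, reps times, and record
    how often a nominal 10% two-sided t-test rejects the true null
    that the slope is zero."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x = np.cumsum(rng.standard_normal(T))   # two independent random walks
        y = np.cumsum(rng.standard_normal(T))
        X = np.column_stack([np.ones(T), x])
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        e = y - X @ b
        s2 = (e @ e) / (T - 2)
        se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
        if abs(b[1] / se) > 1.645:              # 10% critical value under N(0,1)
            rejections += 1
    return rejections / reps

for T in (50, 100, 500):
    print(T, spurious_rejection_rate(T))
```

If the t-statistic were well behaved, the rate would be near 0.10; instead it is several times larger and grows with T, exactly the pattern reported above.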
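The Engle–Granger two-step procedure can likewise be sketched directly. In the fragment below (again Python/NumPy), the DGP follows the system at the start of Section 6 with illustrative values α = 1, β = 2; a no-constant Dickey–Fuller regression is run on the CI-regression residuals, and the 5% critical value of about −3.34 for the bivariate case is an assumed approximation to the tabulated Engle–Granger values (the standard D-F table does not apply to a generated series).

```python
import numpy as np

rng = np.random.default_rng(12345)
T = 500
# DGP from the notes: X is a pure random walk, Y = alpha + beta*X + omega
# (alpha = 1, beta = 2 are illustrative choices, not values from the notes)
x = np.cumsum(rng.standard_normal(T))
y = 1.0 + 2.0 * x + rng.standard_normal(T)

# Step 1: the cointegrating regression, estimated by OLS
X = np.column_stack([np.ones(T), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
z = y - X @ coef                     # residuals = estimated disequilibrium

# Step 2: Dickey-Fuller regression on the residuals, with no constant
# (the residuals are mean zero by construction)
dz, zlag = np.diff(z), z[:-1]
rho = (zlag @ dz) / (zlag @ zlag)
resid = dz - rho * zlag
se = np.sqrt((resid @ resid) / (len(dz) - 1) / (zlag @ zlag))
df_stat = rho / se

print("slope estimate:", coef[1])    # superconsistent: very close to beta
print("DF statistic:  ", df_stat)
```

With cointegrated data the slope estimate sits extremely close to β and the D-F statistic falls far below the critical value, rejecting non-CI; replacing y with an independent random walk typically leaves the statistic above it.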
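Finally, the Johansen eigenvalue problem behind the trace statistic (6) can be sketched for the simplest case: a VAR(1) with a constant, so that no Γ terms appear in the ECM (5). This is a bare-bones illustration under an assumed bivariate DGP with one cointegrating relation; the 5% critical value of roughly 15.4 for the first trace test is quoted from the Osterwald-Lenum tables as an assumption, and a production implementation would treat lag length and deterministic terms with far more care.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 500
# Assumed bivariate DGP with one CI relation: w is a common random walk,
# and both observed series load on it, so y1 - 2*y2 is stationary
w = np.cumsum(rng.standard_normal(T + 1))
y = np.column_stack([w[1:] + rng.standard_normal(T),
                     0.5 * w[1:] + rng.standard_normal(T)])

dy = np.diff(y, axis=0)              # Delta y_t
ylag = y[:-1]                        # y_{t-1}

# Demean both blocks, concentrating out the constant term in the ECM
R0 = dy - dy.mean(axis=0)
R1 = ylag - ylag.mean(axis=0)

n = len(R0)
S00 = R0.T @ R0 / n
S11 = R1.T @ R1 / n
S01 = R0.T @ R1 / n

# Eigenvalues of S11^{-1} S10 S00^{-1} S01, ordered largest to smallest
M = np.linalg.solve(S11, S01.T) @ np.linalg.solve(S00, S01)
lam = np.sort(np.linalg.eigvals(M).real)[::-1]

# Trace statistics: -T * sum_{i=r+1}^{k} ln(1 - lambda_i), for r = 0, 1, ...
trace = [-n * np.sum(np.log(1 - lam[r:])) for r in range(len(lam))]
print("eigenvalues:", lam)
print("trace stats:", trace)         # trace[0] tests r = 0; trace[1] tests r <= 1
```

Here trace[0] (testing r = 0 against r = k) should reject decisively, while trace[1] (testing r ≤ 1) should be small, consistent with a single cointegrating vector in the assumed DGP.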