Class notes (35 pages) uploaded to StudySoup by Alison Lockman on Friday, October 23, 2015 (University of Iowa).

Economics 8-205: Applied Econometrics II
Lecture Notes, Winter term 1998
Part III: Models and their applications
© 1998 by John Geweke. This document may be freely reproduced for educational and research purposes, provided that (i) this copyright notice is included with each copy, (ii) no changes are made in the document, and (iii) copies are not sold but retained for individual use or distributed free of charge.

15. Censored linear regression model

Motivation. Suppose that household $i$ has preferences over a specific good $x$ and all other goods $z$ given by $U(x,z) = \log(a_i + x) + b \log z$. Only nonnegative amounts of $x$ and $z$ may be consumed. Clearly $z$ is always consumed, and if a positive amount of $x$ is consumed then the current-period first-order condition $U_x/U_z = p_x/p_z$ is equivalent to
$$p_x(a_i + x) = p_z z / b \iff x = p_z z/(b\,p_x) - a_i.$$
If $a_i \sim \text{IID } N(\mu, \tau^2)$ across households, then
$$x_i = \max[\,p_z z_i/(b\,p_x) - a_i,\ 0\,].$$

The model. The observables are $X = (x_1,\dots,x_T)'$ and $y = (y_1,\dots,y_T)'$. The relation of interest is
$$y_t = \max(\beta' x_t + \varepsilon_t,\ 0),\quad t = 1,\dots,T,\qquad \varepsilon_t \mid X \sim \text{IID } N(0, h^{-1}).$$
Complete the model with the two independent prior distributions $\beta \sim N(\underline\beta, \underline H^{-1})$ and $\underline s^2 h \sim \chi^2(\underline\nu)$.

Data augmentation approach. Define $\tilde y_t = \beta' x_t + \varepsilon_t$ and denote $\tilde y = (\tilde y_1,\dots,\tilde y_T)'$. If $\tilde y_t > 0$ then $\tilde y_t = y_t$ is observed; if $y_t = 0$ then $\tilde y_t \le 0$ is unobserved. Note that
$$(15.1)\quad p(\beta, h, \tilde y \mid y, X) \propto p(y \mid \tilde y)\, p(\beta)\, p(h)\, p(\tilde y \mid \beta, h, X).$$
Consider a Gibbs sampler with the blocks $\beta$, $h$ and $\tilde y$. From (15.1), the conditional distribution of neither $\beta$ nor $h$ involves $y$. Furthermore, $p(\beta, h \mid \tilde y, X)$ in this posterior distribution is exactly the same as $p(\beta, h \mid y, X)$ in the posterior distribution of the normal linear regression model discussed in Section 2, if we simply change $y$ to $\tilde y$. Hence the only new development required for a Gibbs sampler with the blocks $\beta$, $h$ and $\tilde y$ is the implementation of drawings from
$$(15.2)\quad p(\tilde y \mid \beta, h, y, X) \propto p(y \mid \tilde y)\, p(\tilde y \mid \beta, h, X) = \prod_{t=1}^T \delta[y_t, \max(\tilde y_t, 0)]\ h^{1/2} \exp[-\tfrac12 h (\tilde y_t - \beta' x_t)^2],$$
where $\delta(\cdot,\cdot)$ is the point indicator function: $\delta(x,y) = 1$ if $x = y$, $\delta(x,y) = 0$ if $x \ne y$. From (15.2) the $\tilde y_t$ are conditionally independent, with
$$p(\tilde y_t \mid \beta, h, y, X) \propto \delta[y_t, \max(\tilde y_t, 0)] \exp[-\tfrac12 h (\tilde y_t - \beta' x_t)^2].$$
If $y_t > 0$ then $p(\tilde y_t \mid \cdot) = 0$ unless $\tilde y_t = y_t$, so the conditional distribution is degenerate at the point $\tilde y_t = y_t$. If $y_t = 0$ then $p(\tilde y_t \mid \cdot) \propto \exp[-\tfrac12 h (\tilde y_t - \beta' x_t)^2]\, \chi_{(-\infty,0]}(\tilde y_t)$, so the conditional distribution is $\tilde y_t \sim N(\beta' x_t, h^{-1})$ subject to the restriction $\tilde y_t \le 0$.

Thus a Gibbs sampling algorithm in the censored linear regression model proceeds as follows. Given $\beta$ and $h$, successively draw $\tilde y_t \sim N(\beta' x_t, h^{-1})$ truncated to $\tilde y_t \le 0$ for all observations $t$ for which $y_t = 0$, and set $\tilde y_t = y_t$ if $y_t > 0$; set the vector $\tilde y = (\tilde y_1,\dots,\tilde y_T)'$. Then draw $\beta \sim N(\bar\beta, \bar H^{-1})$, with $\bar H$ defined in (2.8) and $\bar\beta$ defined in (2.9), except that $\tilde y$ replaces $y$ in defining $b = (X'X)^{-1} X' \tilde y$. Finally, draw $h$ as indicated in (2.11), but with $\tilde y$ replacing $y$.
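As a concrete illustration of the algorithm just described, the sketch below runs the three-block Gibbs sampler on artificial censored data. It is my own minimal sketch, not from the notes: the simulated data, the particular prior settings, and the use of numpy/scipy are illustrative assumptions.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

# Simulate censored data: y_t = max(beta'x_t + eps_t, 0), eps_t ~ N(0, 1/h)
T, beta_true, h_true = 500, np.array([1.0, -0.5]), 4.0
X = np.column_stack([np.ones(T), rng.normal(size=T)])
ystar = X @ beta_true + rng.normal(scale=h_true**-0.5, size=T)
y = np.maximum(ystar, 0.0)

# Hypothetical priors: beta ~ N(b0, H0^{-1}),  s2 * h ~ chi^2(nu)
b0, H0 = np.zeros(2), np.eye(2) * 0.01
s2, nu = 1.0, 2.0

beta, h = np.zeros(2), 1.0
draws = []
for it in range(200):
    # Block 1: latent ytilde -- equal to y where y>0, truncated normal where y=0
    mu, sd = X @ beta, h**-0.5
    ytil = y.copy()
    cens = y == 0.0
    # truncnorm takes bounds in standard-deviation units around loc
    ytil[cens] = truncnorm.rvs(-np.inf, (0.0 - mu[cens]) / sd,
                               loc=mu[cens], scale=sd, random_state=rng)
    # Block 2: beta | h, ytilde -- conjugate normal linear regression update
    Hbar = H0 + h * X.T @ X
    bbar = np.linalg.solve(Hbar, H0 @ b0 + h * X.T @ ytil)
    beta = bbar + np.linalg.solve(np.linalg.cholesky(Hbar).T, rng.normal(size=2))
    # Block 3: h | beta, ytilde -- (s2 + SSR) h ~ chi^2(nu + T)
    resid = ytil - X @ beta
    h = rng.chisquare(nu + T) / (s2 + resid @ resid)
    if it >= 100:
        draws.append(beta.copy())

post_mean = np.mean(draws, axis=0)
print(post_mean)  # should be in the vicinity of beta_true = (1.0, -0.5)
```

With the latent $\tilde y$ in hand, blocks 2 and 3 are exactly the conjugate updates of the uncensored normal linear model, which is the point of the data augmentation.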
16. Probit model

Motivation. Suppose that a set of individuals $i = 1,\dots,n$ each allocate income $y_i$ between two goods $x$ and $z$. The good $x$ is continuously divisible, but $z$ can only be 0 or 1; for example, $z$ might represent a decision to enlist in the military or not. Individual $i$'s utility function is $U_i(x,z) = z(a_i + bx) + (1-z)(c_i + dx)$, and he/she consumes out of income $y_i$. Of course $a_i$, $b$, $c_i$ and $d$ are identified only up to a common scale factor. If $z = 0$ then $x = y_i/p_x$ and $U_i(x,z) = c_i + d\,p_x^{-1} y_i$. If $z = 1$ then $x = (y_i - p_z)/p_x$ and $U_i(x,z) = a_i + b\,p_x^{-1}(y_i - p_z)$. Individual $i$ chooses $z = 1$ if
$$a_i - c_i + b\,p_x^{-1}(y_i - p_z) - d\,p_x^{-1} y_i > 0.$$
Now suppose we observe $(p_x, p_z, y_i, z_i)$, $i = 1,\dots,n$. We do not know $(a_i, c_i)$, but are willing to take $a_i - c_i \sim \text{IID } N(\mu, \tau^2)$ as representative of the distribution of these parameters across households. Then we can write
$$(16.1)\quad \tilde y_i = b\,p_x^{-1}(y_i - p_z) - d\,p_x^{-1} y_i + \varepsilon_i,$$
where $\varepsilon_i = a_i - c_i \sim N(\mu, \tau^2)$. The variable $\tilde y_i$ is unobserved but is related to $z_i$ by $z_i = \chi_{(0,\infty)}(\tilde y_i)$. The right side of (16.1) is defined only up to an arbitrary positive scaling factor; consequently we may take $\tau^2 = 1$.

Inference in the probit model. The observables are $X = (x_1,\dots,x_T)'$ and $d = (d_1,\dots,d_T)'$, $d_t = 0$ or 1. The relationships of interest are
$$\tilde y_t = \beta' x_t + \varepsilon_t,\qquad \varepsilon_t \mid X \sim \text{IID } N(0, 1),\qquad d_t = \chi_{(0,\infty)}(\tilde y_t).$$
Complete the model with the prior distribution $\beta \sim N(\underline\beta, \underline H^{-1})$.

Data augmentation approach. Let $\tilde y = (\tilde y_1,\dots,\tilde y_T)'$. Note that
$$(16.2)\quad p(d, \tilde y \mid \beta, X) = p(\tilde y \mid \beta, X)\, p(d \mid \tilde y) = (2\pi)^{-T/2} \exp[-\tfrac12 (\tilde y - X\beta)'(\tilde y - X\beta)] \prod_{t=1}^T \big[d_t\, \chi_{(0,\infty)}(\tilde y_t) + (1 - d_t)\, \chi_{(-\infty,0]}(\tilde y_t)\big]$$
$$(16.3)\qquad = \prod_{t=1}^T (2\pi)^{-1/2} \exp[-\tfrac12 (\tilde y_t - \beta' x_t)^2]\, \big[d_t\, \chi_{(0,\infty)}(\tilde y_t) + (1 - d_t)\, \chi_{(-\infty,0]}(\tilde y_t)\big].$$
From (16.3) we have
$$(16.4)\quad p(d \mid \beta, X) = \int p(\tilde y \mid \beta, X)\, p(d \mid \tilde y)\, d\tilde y = \prod_{t=1}^T \Phi(\beta' x_t)^{d_t}\, [1 - \Phi(\beta' x_t)]^{1 - d_t}.$$
The joint posterior density for $\beta$ and $\tilde y$ is $p(\beta, \tilde y \mid d, X) \propto p(\beta)\, p(d, \tilde y \mid \beta, X)$. Taking $p(d, \tilde y \mid \beta, X)$ in the form (16.2), we see that $\beta \mid (\tilde y, d, X) \sim N(\bar\beta, \bar H^{-1})$, where $\bar H = \underline H + X'X$ and $\bar\beta = \bar H^{-1}(\underline H\, \underline\beta + X' \tilde y)$. Taking $p(d, \tilde y \mid \beta, X)$ in the form (16.3), we have that the $\tilde y_t$ are conditionally independent with $\tilde y_t \mid (\beta, d_t, x_t) \sim N(\beta' x_t, 1)$, subject to $\tilde y_t \ge 0$ if $d_t = 1$ and $\tilde y_t < 0$ if $d_t = 0$.

Expression (16.4) provides a computationally efficient basis for evaluating the data density corresponding to any $\beta$, since it has $\tilde y$ analytically marginalized. This evaluation can be used with the generic approach described in Section 14 for computing marginal likelihoods.
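The two conditional distributions just derived give a two-block Gibbs sampler for the probit model. A minimal sketch on simulated data follows; the data-generating values, prior settings, and numpy/scipy usage are my own illustrative assumptions, not part of the notes.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(1)

# Simulated probit data: d_t = 1{beta'x_t + eps_t > 0}, eps_t ~ N(0,1)
T, beta_true = 400, np.array([0.5, 1.0])
X = np.column_stack([np.ones(T), rng.normal(size=T)])
d = (X @ beta_true + rng.normal(size=T) > 0).astype(float)

b0, H0 = np.zeros(2), np.eye(2) * 0.01   # hypothetical prior beta ~ N(b0, H0^{-1})
Hbar = H0 + X.T @ X                      # posterior precision is fixed (variance = 1)
L = np.linalg.cholesky(Hbar)

beta, draws = np.zeros(2), []
for it in range(300):
    mu = X @ beta
    # ytilde_t ~ N(beta'x_t, 1), truncated to (0,inf) if d_t=1, (-inf,0] if d_t=0;
    # truncnorm bounds are expressed in standardized units around loc
    lo = np.where(d == 1, -mu, -np.inf)
    hi = np.where(d == 1, np.inf, -mu)
    ytil = truncnorm.rvs(lo, hi, loc=mu, scale=1.0, random_state=rng)
    # beta | ytilde ~ N(bbar, Hbar^{-1})
    bbar = np.linalg.solve(Hbar, H0 @ b0 + X.T @ ytil)
    beta = bbar + np.linalg.solve(L.T, rng.normal(size=2))
    if it >= 150:
        draws.append(beta.copy())

post_mean = np.mean(draws, axis=0)
print(post_mean)  # should be in the vicinity of beta_true = (0.5, 1.0)
```

Because the latent-variable variance is fixed at one, the posterior precision of $\beta$ does not change across iterations, so its Cholesky factor can be computed once.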
17. Constrained linear regression model

In this section we will look at three different kinds of constraints that can arise within the normal linear regression model first discussed in Section 2. In each case we begin with the complete model set forth there:
$$(17.1)\quad y = X\beta + \varepsilon,\qquad \varepsilon \mid X \sim N(0, h^{-1} I_T),\quad X\ (T \times k),$$
$$(17.2)\quad \beta \sim N(\underline\beta, \underline H^{-1}),\qquad \underline s^2 h \sim \chi^2(\underline\nu).$$

Linear inequality constraints. Motivating example: suppose we observe $T$ firms, indexed by $t$, each producing output $y_t$ from inputs $x_{t1},\dots,x_{tn}$ using a Cobb-Douglas production technology,
$$\log y_t = \log \beta_{t0} + \sum_{j=1}^n \beta_j \log x_{tj},\qquad \beta_j > 0,\ j = 1,\dots,n.$$
Here $\log \beta_{t0}$ varies across firms due to different fixed factors for each firm. Denote $X = [\iota\ \ (\log x_{tj})]$, $y = (\log y_1,\dots,\log y_T)'$, and assume $\varepsilon_t = \log \beta_{t0} - E(\log \beta_{t0}) \sim \text{IID } N(0, \sigma^2)$. Thus (17.1) is satisfied, but the prior distribution (17.2) must be modified to reflect the inequality constraints $\beta_j \ge 0$.

The normal linear model with linear inequality constraints is (17.1) and (17.2) subject to
$$(17.3)\quad a < D\beta < \omega,$$
where $D$ ($k \times k$) is a nonsingular matrix and $-\infty \le a_j < \omega_j \le \infty$, $j = 1,\dots,k$. Thus the model may have from one to $k$ linear combinations of coefficients that are constrained, and each of these linear combinations may be bounded above, below, or both above and below. If we define $\gamma = D\beta$, $X^* = X D^{-1}$, $\underline\gamma = D\underline\beta$, $\underline H_\gamma = (D^{-1})' \underline H D^{-1}$, then (17.1) through (17.3) may be expressed
$$(17.4)\quad y = X^* \gamma + \varepsilon,\qquad \varepsilon \mid X^* \sim N(0, h^{-1} I_T),$$
$$(17.5)\quad \gamma \sim N(\underline\gamma, \underline H_\gamma^{-1})\ \text{subject to}$$
$$(17.6)\quad a < \gamma < \omega.$$
Since (17.4) and (17.5) are formally identical to (17.1) and (17.2), which in turn are just a restatement of (2.1) through (2.3), conditional posterior distributions for $\gamma$ and $h$ are given by (2.10) and (2.11), subject to (17.6):
$$(17.7)\quad \gamma \mid (h, y, X^*) \sim N(\bar\gamma, \bar H_\gamma^{-1})\ \text{subject to } a < \gamma < \omega,$$
$$(17.8)\quad \big[\underline s^2 + (y - X^*\gamma)'(y - X^*\gamma)\big]\, h \mid (\gamma, y, X^*) \sim \chi^2(\underline\nu + T),$$
where $\bar H_\gamma = \underline H_\gamma + h X^{*\prime} X^*$ and $\bar\gamma = \bar H_\gamma^{-1}(\underline H_\gamma \underline\gamma + h X^{*\prime} y)$. Clearly (17.6) has no impact on (17.8), but it does affect (17.7). From (17.6) and (17.7) it is apparent that the conditional posterior distribution of a single coefficient $\gamma_j$ is univariate normal, subject to the inequality constraint $a_j < \gamma_j < \omega_j$. To describe this univariate normal distribution, let $\bar H_\gamma = [\bar h_{ij}]$. From some standard results in multivariate normal distribution theory (see the short appendix at the end of this section), the precision of this univariate normal distribution is $\bar h_{jj}$ and the mean is $\bar\gamma_j - \bar h_{jj}^{-1} \sum_{i \ne j} \bar h_{ji} (\gamma_i - \bar\gamma_i)$, so that
$$\gamma_j \mid (\gamma_{i\,(i \ne j)}, h, y, X^*) \sim N\Big(\bar\gamma_j - \bar h_{jj}^{-1} \textstyle\sum_{i \ne j} \bar h_{ji} (\gamma_i - \bar\gamma_i),\ \bar h_{jj}^{-1}\Big)\ \text{subject to } a_j < \gamma_j < \omega_j.$$
This provides a $(k+1)$-block Gibbs sampler for the posterior distribution. From the sufficient condition given in Section 12, it is immediate that this Gibbs sampler is ergodic.
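The precision-form conditional moments used in this one-coefficient-at-a-time Gibbs sampler can be checked numerically against the partitioned-covariance formulas of the appendix at the end of this section. The sketch below is my own; the function name and test matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Conditional of coordinate j of N(gbar, Hbar^{-1}) given the others:
# precision = Hbar[j,j],
# mean      = gbar[j] - (1/Hbar[j,j]) * sum_{i != j} Hbar[j,i] * (g[i] - gbar[i])
def cond_moments(j, g, gbar, Hbar):
    hjj = Hbar[j, j]
    others = np.delete(np.arange(len(g)), j)
    mean = gbar[j] - (Hbar[j, others] @ (g[others] - gbar[others])) / hjj
    return mean, hjj

# Check against the standard partitioned-covariance conditional formulas
k = 4
A = rng.normal(size=(k, k))
Hbar = A @ A.T + k * np.eye(k)        # a positive definite precision matrix
Sigma = np.linalg.inv(Hbar)
gbar = rng.normal(size=k)
g = rng.normal(size=k)

j = 2
m1, p1 = cond_moments(j, g, gbar, Hbar)
others = np.delete(np.arange(k), j)
S_oo = Sigma[np.ix_(others, others)]
m2 = gbar[j] + Sigma[j, others] @ np.linalg.solve(S_oo, g[others] - gbar[others])
v2 = Sigma[j, j] - Sigma[j, others] @ np.linalg.solve(S_oo, Sigma[others, j])
print(np.allclose(m1, m2), np.allclose(1.0 / p1, v2))  # → True True
```

In the Gibbs sampler itself, one would draw $\gamma_j$ from a normal with these moments truncated to $(a_j, \omega_j)$; the precision form is convenient precisely because $\bar H_\gamma$ is already available.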
Mixed equality and linear inequality constraints. Motivating example: in the Cobb-Douglas production function example just given, suppose we think that $\log \beta_{t0}$ may be related to some other observable characteristics of the firm. If the firm is a farm, for instance, these might include a measure of land quality, the educational attainment of the farm manager, and an indicator of whether or not the manager is also the owner. Denoting these observable characteristics $w_{t1},\dots,w_{tp}$, this leads to the extended model
$$\log y_t = \sum_{j=1}^n \beta_j \log x_{tj} + \sum_{j=1}^p \beta_{n+j} w_{tj} + \varepsilon_t,$$
where $\varepsilon_t$ reflects other unmeasured influences on $\log \beta_{t0}$. We are not sure whether each observable characteristic really influences $\log \beta_{t0}$, given all the rest, but we are certain that if the influence is nonzero, it is positive. Furthermore, our priors on $\beta_{n+1},\dots,\beta_{n+p}$ are independent of each other and of the prior distribution for the other coefficients. Thus for $j = n+1,\dots,n+p$ we have the $p$ independent prior distributions
$$P(\beta_j = 0) = \underline p_j,\qquad p(\beta_j \mid \beta_j \ne 0) \propto \exp[-\tfrac12 \underline h_j (\beta_j - \underline\beta_j)^2]\, \chi_{(0,\infty)}(\beta_j).$$
Prior information is thus represented by independent censored normal distributions.

The normal linear model with mixed equality and linear inequality constraints is (17.1), with independent prior distributions for each of the coefficients and the precision parameter $h$. The prior distribution for $h$ remains $\underline s^2 h \sim \chi^2(\underline\nu)$. Each coefficient $\beta_j$ is zero with probability $\underline p_j$; conditional on $\beta_j \ne 0$, the prior distribution of $\beta_j$ is $N(\underline\beta_j, \underline h_j^{-1})$, possibly truncated to the interval $(a_j, \omega_j)$.
This properly normalized prior density, conditional on $\beta_j \ne 0$, is
$$p(\beta_j \mid \beta_j \ne 0) = (2\pi)^{-1/2}\, \underline h_j^{1/2}\, \big[\Phi\big(\underline h_j^{1/2}(\omega_j - \underline\beta_j)\big) - \Phi\big(\underline h_j^{1/2}(a_j - \underline\beta_j)\big)\big]^{-1} \exp[-\tfrac12 \underline h_j (\beta_j - \underline\beta_j)^2]\, \chi_{(a_j,\omega_j)}(\beta_j),$$
where $\Phi$ is the cdf of the standard normal distribution, $0 < h < \infty$, and $-\infty \le a_j < \omega_j \le \infty$; the full prior cdf of $\beta_j$ mixes a point mass at 0 (with cdf $H(x) = 0$ if $x < 0$, $H(x) = 1$ if $x \ge 0$) with this truncated normal component. Notice that this model is not strictly a generalization of the previous model, because it requires the coefficients to be independent in the prior distribution. Some weakening of this restriction is not difficult, but the notation becomes much more cumbersome.

This model embodies a problem that arises often in applied work, known as selection of regressors: the investigator is uncertain which of a possibly quite long list of regressors really belongs in a model. While this problem could be formally set up as a model combination problem, the number of combinations of regressors can be quite great, and the procedures described one week ago are then impractical.

The posterior distribution in this model can again be attacked using a Gibbs sampler completely blocked on all coefficients in the model. To do this it is necessary to work out the conditional distribution of each coefficient in the posterior distribution. This distribution follows from the simplified model
$$z_t = \beta_j x_{tj} + u_t,\qquad u_t \sim \text{IID } N(0, h^{-1}),\quad t = 1,\dots,T,$$
where $z_t = y_t - \sum_{i \ne j} \beta_i x_{ti}$. The likelihood function kernel is $\exp[-\tfrac12 h \sum_{t=1}^T (z_t - \beta_j x_{tj})^2]$. Conditional on $\beta_j = 0$, the value of the kernel is
$$(17.9)\quad \exp[-\tfrac12 h \textstyle\sum_{t=1}^T z_t^2].$$
Conditional on $\beta_j \ne 0$, define $S_j = \sum_t x_{tj}^2$, $b_j = \sum_t x_{tj} z_t / S_j$, $\bar K_j = \underline h_j + h S_j$ and $\bar\beta_j = \bar K_j^{-1}(\underline h_j \underline\beta_j + h S_j b_j)$. Completing the square, the product of the likelihood kernel and the prior kernel is
$$(17.10)\quad \exp\big[\tfrac12 (\bar K_j \bar\beta_j^2 - \underline h_j \underline\beta_j^2)\big]\, \exp[-\tfrac12 h \textstyle\sum_t z_t^2]\, \exp[-\tfrac12 \bar K_j (\beta_j - \bar\beta_j)^2]\, \chi_{(a_j,\omega_j)}(\beta_j).$$
If the normal prior distribution for $\beta_j$ is not truncated (i.e., $a_j = -\infty$, $\omega_j = \infty$), then conditional on $\beta_{i\,(i \ne j)}$, $h$ and $\beta_j \ne 0$, $\beta_j \sim N(\bar\beta_j, \bar K_j^{-1})$ — the standard result for a conjugate normal prior.
If the normal prior distribution is truncated, the conditional distribution is $\beta_j \sim N(\bar\beta_j, \bar K_j^{-1})$ truncated to the interval $(a_j, \omega_j)$:
$$(17.11)\quad \beta_j \sim N(\bar\beta_j, \bar K_j^{-1})\ \text{subject to } \beta_j \in (a_j, \omega_j).$$
To remove the conditioning on $\beta_j = 0$ or $\beta_j \ne 0$, it is necessary to integrate (17.10) over $\beta_j$ (including the prior normalization constant) and compare the result with (17.9). The integration yields the conditional Bayes factor in favor of $\beta_j = 0$ versus $\beta_j \ne 0$,
$$(17.12)\quad BF_j = \big(\bar K_j / \underline h_j\big)^{1/2} \exp\big[-\tfrac12 (\bar K_j \bar\beta_j^2 - \underline h_j \underline\beta_j^2)\big]\, \frac{\Phi\big(\underline h_j^{1/2}(\omega_j - \underline\beta_j)\big) - \Phi\big(\underline h_j^{1/2}(a_j - \underline\beta_j)\big)}{\Phi\big(\bar K_j^{1/2}(\omega_j - \bar\beta_j)\big) - \Phi\big(\bar K_j^{1/2}(a_j - \bar\beta_j)\big)}.$$
To draw $\beta_j$ from its conditional distribution, the conditional posterior probability that $\beta_j = 0$ is computed from the conditional Bayes factor (17.12),
$$P(\beta_j = 0 \mid \cdot) = \frac{\underline p_j\, BF_j}{\underline p_j\, BF_j + (1 - \underline p_j)}.$$
Based on a comparison of this probability with a drawing from the uniform distribution on (0,1), the choice $\beta_j = 0$ or $\beta_j \ne 0$ is made. If $\beta_j \ne 0$, then $\beta_j$ is drawn from (17.11).

Nonlinear inequality constraints. Motivating example: consider the production function
$$y_t = f(x_t) + \varepsilon_t = a + \alpha' x_t + \tfrac12 x_t' B x_t + \varepsilon_t,\quad t = 1,\dots,T,\qquad \varepsilon_t \sim \text{IID } N(0, h^{-1}),$$
where $B$ is symmetric, and the constraints $\partial f/\partial x \ge 0\ \forall\, x: 0 \le x \le c$ (monotonicity) and $\partial^2 f/\partial x\, \partial x' = B$ negative semidefinite (concavity) are to be imposed. For given $\alpha$ and $B$ this can easily be checked by (i) computing the eigenvalues of $B$, and (ii) noting that $\partial f/\partial x = \alpha + Bx \ge 0\ \forall\, x: 0 \le x \le c$ if and only if
$$(17.13)\quad \alpha_i + \sum_{j=1}^n \delta(b_{ij} < 0)\, b_{ij} c_j > 0,\quad i = 1,\dots,n.$$
Complete the model with the prior (17.2), subject to these inequality constraints. There are several ways one could proceed. Begin with a three-block Gibbs sampler in $\alpha$, $B$ and $h$. The parameters in $\alpha$ can be drawn either using acceptance sampling to enforce (17.13); or drawing the $\alpha_i$ one at a time, making use of the algorithm for the normal linear model with linear inequality constraints described earlier in this section; or using a Metropolis-within-Gibbs step, drawing $\alpha$ from the appropriate unconstrained conditional normal distribution and using the candidate only if (17.13) is satisfied. The alternatives for drawing $B$ conditional on $\alpha$ and $h$ are the same: acceptance sampling, drawing the $b_{ij}$ one at a time, or setting up a Metropolis-within-Gibbs step.

A short appendix on conditional normal distributions.
Suppose
$$z = \begin{pmatrix} x \\ y \end{pmatrix} \sim N\left[\begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix},\ \Sigma = \begin{pmatrix} \Sigma_{xx} & \sigma_{xy} \\ \sigma_{yx} & \sigma_{yy} \end{pmatrix}\right],$$
where $y$ is scalar ($1 \times 1$). Using standard results for inversion of a partitioned matrix, let $H = \Sigma^{-1}$; then
$$h_{yy} = (\sigma_{yy} - \sigma_{yx} \Sigma_{xx}^{-1} \sigma_{xy})^{-1},\qquad h_{yx} = -h_{yy}\, \sigma_{yx} \Sigma_{xx}^{-1}.$$
From standard results for population regression,
$$E(y \mid x) = \mu_y + \sigma_{yx} \Sigma_{xx}^{-1} (x - \mu_x) = \mu_y - h_{yy}^{-1} h_{yx} (x - \mu_x),\qquad \operatorname{var}(y \mid x) = \sigma_{yy} - \sigma_{yx} \Sigma_{xx}^{-1} \sigma_{xy} = h_{yy}^{-1}.$$

18. Markov chain models

Motivation. Often economic agents or entities can be characterized as being in one of a small number of possible states. For example, an individual might be employed, unemployed, or out of the labor force; or an individual might be married or not married. If the probability of being in a particular state in a period depends only on the state occupied in the previous period, the model is a first order finite state Markov chain model. While we are interested in these models in part for their own sake, we are also interested in them because they frequently arise as important constituents of more complicated models. Furthermore, the probability of transition can easily be made to depend on more than just the state occupied in the previous period.

The model. There are $m$ possible states of the world that are occupied by agents. For any agent, let $s_t$ indicate the state occupied at time $t$; $s_t$ is an integer between 1 and $m$ inclusive. The first order finite state Markov chain model specifies
$$P(s_t \mid s_{t-1}, s_{t-2}, \dots, s_{t-j}) = P(s_t \mid s_{t-1})\ \forall\, j \ge 2,$$
and denotes $P(s_t = j \mid s_{t-1} = i) = p_{ij}$. Let $\pi_{tj}$, $j = 1,\dots,m$, denote the probabilities of being in the various states at time $t$, corresponding to some initial distribution of probabilities $\pi_{0j}$, $j = 1,\dots,m$, in period 0. Let $\pi_t' = (\pi_{t1},\dots,\pi_{tm})$ and $P = [p_{ij}]$. In this notation the specification of the model is
$$\pi_{tj} = \sum_{i=1}^m \pi_{t-1,i}\, p_{ij} \iff \pi_t' = \pi_{t-1}' P.$$
From this formulation it is clear that the transition from period $t-j$ to period $t$ is given by $\pi_t' = \pi_{t-j}' P^j$.

The eigenvalues and eigenvectors of the transition matrix $P$ are important for the properties of the model. Denote the eigenvalues by $\lambda_1,\dots,\lambda_m$, ordered so that $|\lambda_1| \ge \dots \ge |\lambda_m|$. Let the diagonalization of $P$ be $P = C \Lambda C^{-1}$: the columns of $C$ are right eigenvectors of $P$, and the rows of $C^{-1}$ are left eigenvectors of $P$.
The eigenvalues of $P$ cannot exceed 1 in modulus, because $0 \le \operatorname{tr}(P^t) \le m$ and $\operatorname{tr}(P^t) = \sum_{i=1}^m \lambda_i^t$. But since $\sum_{j=1}^m p_{ij} = 1\ \forall\, i = 1,\dots,m$, $P e_m = e_m$ with $e_m = (1,\dots,1)'$: $e_m$ is a right eigenvector corresponding to an eigenvalue of 1, and it is convenient to take $\lambda_1 = 1$.

A probability distribution over the $m$ states, $\bar\pi$, is an invariant distribution if it satisfies $\bar\pi' = \bar\pi' P$. The vector $\bar\pi$ must be a left eigenvector of $P$ corresponding to $\lambda_1 = 1$. If $|\lambda_1| > |\lambda_2|$, then this invariant distribution is unique.

Suppose $|\lambda_2| = 1$. If $\lambda_2 = 1$, then the Markov chain is reducible, with invariant states depending on the initial distribution. The simplest example is $P = I_2$, and the simplest nontrivial example is
$$P = \begin{pmatrix} 1 - p_{12} & p_{12} & 0 \\ p_{21} & 1 - p_{21} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
If $\lambda_2 = -1$, or $|\lambda_2| = 1$ and $\lambda_2$ is complex, then the chain is periodic. Examples include
$$P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\qquad P = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}.$$
If $|\lambda_2| < 1$, then this eigenvalue provides an upper bound on the rate of convergence to the invariant distribution, as indicated in the following result.

Result. If $|\lambda_2| < 1$, then for any $r$ with $|\lambda_2| < r < 1$, $\lim_{t \to \infty} r^{-t}(\pi_t - \bar\pi) = 0$.

Proof. Recall $\lambda_1 = 1$, and let $\Lambda^* = \operatorname{diag}(0, \lambda_2, \dots, \lambda_m)$. Since $\pi_t' = \pi_{t-1}' P = \dots = \pi_0' P^t$ and $\bar\pi' = \bar\pi' P$,
$$\pi_t' - \bar\pi' = \pi_0' P^t - \bar\pi' = \pi_0' C \Lambda^t C^{-1} - \bar\pi' = \pi_0' C \Lambda^{*t} C^{-1}.$$
The last equality obtains because the first column of $C$ is proportional to $e_m$ and $\bar\pi'$ is proportional to the first row of $C^{-1}$. Then
$$r^{-t}(\pi_t' - \bar\pi') = \pi_0' C K^t C^{-1},\qquad K = \operatorname{diag}(0, \lambda_2/r, \dots, \lambda_m/r).$$
Since $\lim_{t \to \infty} K^t = 0$, $\lim_{t \to \infty} r^{-t}(\pi_t - \bar\pi) = 0$.

Notice that if all $\lambda_j$ with $|\lambda_j| = |\lambda_2|$ are real and positive, then
$$\lim_{t \to \infty} |\lambda_2|^{-t}(\pi_t' - \bar\pi') = \pi_0' C D C^{-1} = v',$$
where $D$ is a diagonal matrix with $d_{11} = 0$, $d_{22} = 1$, and $d_{jj} = 1$ or $0$ for $j = 3,\dots,m$ according as $\lambda_j = |\lambda_2|$ or $|\lambda_j| < |\lambda_2|$. For any $h$, $\lim_{t \to \infty} |\lambda_2|^{-t}(\pi_{t+h}' - \bar\pi') = |\lambda_2|^h v'$. If $h = -\log 2 / \log |\lambda_2|$, then $|\lambda_2|^h = 1/2$. This value of $h$ is known as the half-life of the Markov chain. The definition still applies for second-largest roots that are negative or complex, but in that case the limit does not exist as it has been taken here, and the result is in terms of amplitudes of oscillations about the invariant distribution.
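The invariant distribution and half-life just defined are easy to compute from the eigenstructure of $P$. A minimal sketch (the particular transition matrix is my own illustrative assumption):

```python
import numpy as np

# Invariant distribution: the normalized left eigenvector of P for eigenvalue 1;
# half-life of convergence: h = -log(2) / log|lambda_2|
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

lam, V = np.linalg.eig(P.T)            # right eigenvectors of P' = left of P
order = np.argsort(-np.abs(lam))       # sort by modulus: lam[0] should be 1
lam, V = lam[order], V[:, order]

pibar = np.real(V[:, 0])
pibar = pibar / pibar.sum()            # normalize to a probability vector

lam2 = np.abs(lam[1])
half_life = -np.log(2) / np.log(lam2)

print(pibar, half_life)
```

Verifying invariance amounts to checking $\bar\pi' P = \bar\pi'$, which holds by construction up to floating-point error.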
Other functions of interest. While the entries $p_{ij}$ of $P$ uniquely characterize the model, they are not as directly related to the implied dynamics as some functions of these parameters. The invariant vector $\bar\pi$ and the convergence bound $|\lambda_2|$ are examples. There are also many measures of mobility between states, including the expected length of stay in state $i$, $(1 - p_{ii})^{-1}$, and the overall measure of mobility $[m - \operatorname{tr}(P)]/(m - 1)$; for further discussion and properties of these measures see Geweke, Marshall and Zarkin (1986).

Inference in Markov chain models. Suppose we observe individuals $i = 1,\dots,n$ in each of $t = 1,\dots,T$ successive time periods, and let $s_{it}$ denote the state occupied by individual $i$ at time $t$. The probability of the observed sample is
$$p(s_{11},\dots,s_{nT} \mid \pi_1, P) = \prod_{i=1}^n \Big[\pi_{1, s_{i1}} \prod_{t=2}^T p_{s_{i,t-1}, s_{it}}\Big].$$
Now let $n_j = \sum_{i=1}^n \delta(s_{i1}, j)$, the number of individuals in state $j$ at time $t = 1$, and let $n_{jk} = \sum_{i=1}^n \sum_{t=2}^T \delta(s_{i,t-1}, j)\, \delta(s_{it}, k)$, the number of individuals in state $j$ one period and state $k$ the next period. These are sufficient statistics, and the likelihood function is
$$(18.1)\quad p(s_{11},\dots,s_{nT} \mid \pi_1, P) = \prod_{j=1}^m \pi_{1j}^{n_j} \prod_{j=1}^m \prod_{k=1}^m p_{jk}^{n_{jk}}.$$
Observe that the likelihood function (18.1) factors into $m + 1$ components: one for $\pi_1 = (\pi_{11},\dots,\pi_{1m})$ and one for each row $(p_{j1},\dots,p_{jm})$ of $P$, with all parameters strictly positive, $\sum_j \pi_{1j} = 1$ and $\sum_k p_{jk} = 1$ for each $j$. The generic form of these components is
$$(18.2)\quad \prod_{i=1}^m x_i^{n_i},\qquad x_i > 0,\ \sum_{i=1}^m x_i = 1.$$
As a function of the $n_i$, (18.2) is a multinomial probability function. As a function of the $x_i$, it is a kernel of the multivariate beta distribution, which in standard notation has density
$$(18.3)\quad p(x_1,\dots,x_m \mid a_1,\dots,a_m) = \frac{\Gamma\big(\sum_{i=1}^m a_i\big)}{\prod_{i=1}^m \Gamma(a_i)} \prod_{i=1}^m x_i^{a_i - 1};$$
the parameters $a_1,\dots,a_m$ must all be positive, and the density is defined on the unit simplex $\{(x_1,\dots,x_m): x_i > 0,\ i = 1,\dots,m,\ \sum_{i=1}^m x_i = 1\}$. The multivariate beta distribution is also known as the Dirichlet distribution.

In developing prior and posterior distributions for the parameters of the model, it is useful to distinguish between two cases. In the first case, priors for $\pi_1$ and the rows $(p_{j1},\dots,p_{jm})$ are independent; in particular, there is no assumption that the period-1 distribution is the invariant distribution. The conjugate prior distribution then consists of $m + 1$ independent multivariate beta distributions, with respective densities
$$p(\pi_{11},\dots,\pi_{1m}) = \frac{\Gamma\big(\sum_j a_{j0}\big)}{\prod_j \Gamma(a_{j0})} \prod_{j=1}^m \pi_{1j}^{a_{j0} - 1},\qquad \pi_{1j} > 0,\ \sum_j \pi_{1j} = 1,$$
and, for $j = 1,\dots,m$,
$$(18.4)\quad p(p_{j1},\dots,p_{jm}) = \frac{\Gamma\big(\sum_k a_{jk}\big)}{\prod_k \Gamma(a_{jk})} \prod_{k=1}^m p_{jk}^{a_{jk} - 1},\qquad p_{jk} > 0,\ \sum_k p_{jk} = 1.$$
The interpretation of the conjugate prior distribution in terms of notional data is straightforward: the notional sample corresponding to this distribution consists of $a_{j0} - 1$ individuals observed in state $j$ in period 1, and $a_{jk} - 1$ transitions from state $j$ to state $k$ observed in the notional sample. The posterior distribution is then the product of $m + 1$ independent components. The posterior density kernel in standard form is
$$\prod_{j=1}^m \pi_{1j}^{a_{j0} + n_j - 1} \prod_{j=1}^m \prod_{k=1}^m p_{jk}^{a_{jk} + n_{jk} - 1},$$
defined on the same support as the prior. The posterior distribution of $(\pi_{11},\dots,\pi_{1m})$ is multivariate beta with parameters
$$(18.5)\quad a_{j0} + n_j\quad (j = 1,\dots,m),$$
and for each $j = 1,\dots,m$ the posterior distribution of $(p_{j1},\dots,p_{jm})$ is multivariate beta with parameters $a_{jk} + n_{jk}$ $(k = 1,\dots,m)$. Drawing from a multivariate beta distribution (18.3) is straightforward, as described by Devroye (1986, 593–596): construct the independent random variables $d_i \sim \chi^2(2 a_i)$, $i = 1,\dots,m$, and then take $x_i = d_i / \sum_{j=1}^m d_j$, $i = 1,\dots,m$.

Since the posterior density kernel is the product of $m + 1$ independent multivariate beta kernels, the marginal likelihood is easy to evaluate analytically. Comparing (18.3) with the posterior kernel, it is
$$\frac{\Gamma\big(\sum_j a_{j0}\big)}{\Gamma\big(\sum_j (a_{j0} + n_j)\big)} \frac{\prod_j \Gamma(a_{j0} + n_j)}{\prod_j \Gamma(a_{j0})} \cdot \prod_{j=1}^m \left[\frac{\Gamma\big(\sum_k a_{jk}\big)}{\Gamma\big(\sum_k (a_{jk} + n_{jk})\big)} \frac{\prod_k \Gamma(a_{jk} + n_{jk})}{\prod_k \Gamma(a_{jk})}\right].$$
If only the $p_{ij}$ are of interest, as will be the case if only the asymptotic properties of the Markov chain are of interest, then the components of the posterior distribution in $(\pi_{11},\dots,\pi_{1m})$ can be ignored completely.
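Devroye's chi-square construction for multivariate beta (Dirichlet) draws is a one-liner in practice. The sketch below draws one row of $P$ from its posterior; the prior parameters and transition counts are my own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Draw from Dirichlet(a_1,...,a_m) via the construction in the text:
# d_i ~ chi^2(2 a_i) independently, then x_i = d_i / sum_j d_j
def dirichlet_draw(a, rng):
    d = rng.chisquare(2.0 * np.asarray(a, dtype=float))
    return d / d.sum()

# Posterior for one row of P: Dirichlet(a_j1 + n_j1, ..., a_jm + n_jm)
a_prior = np.array([1.0, 1.0, 1.0])    # notional transition counts plus one
n_counts = np.array([40, 15, 5])       # observed transitions out of state j
draws = np.array([dirichlet_draw(a_prior + n_counts, rng) for _ in range(5000)])

post_mean = draws.mean(axis=0)
exact = (a_prior + n_counts) / (a_prior + n_counts).sum()
print(post_mean, exact)  # simulation mean should approach the exact Dirichlet mean
```

Each draw lies on the unit simplex by construction, and the simulation mean converges to the analytical Dirichlet mean $(a_{jk} + n_{jk}) / \sum_k (a_{jk} + n_{jk})$.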
Now consider a second case, in which it is assumed that the invariant distribution pertains for all periods, i.e., $\pi_t = (\bar\pi_1,\dots,\bar\pi_m) = \bar\pi$ for all $t = 1,\dots,T$. This assumption is equivalent to the restriction that $(\pi_{11},\dots,\pi_{1m})$ is a left eigenvector of $P$ corresponding to the eigenvalue $\lambda_1 = 1$. If we retain (18.4) as the prior distribution for $P$, then the posterior density kernel in standard form is
$$(18.6)\quad \prod_{j=1}^m \bar\pi_j^{n_j} \prod_{j=1}^m \prod_{k=1}^m p_{jk}^{a_{jk} + n_{jk} - 1},$$
where $(\bar\pi_1,\dots,\bar\pi_m)$ is the left eigenvector of $P$, normalized so that $\sum_j \bar\pi_j = 1$, corresponding to the eigenvalue $\lambda_1 = 1$. The kernel (18.6) is close to a product of the kernels of independent multivariate beta distributions. This suggests that we take the product of multivariate beta distributions with pdf
$$(18.7)\quad \prod_{j=1}^m \frac{\Gamma\big(\sum_k (a_{jk} + n_{jk})\big)}{\prod_k \Gamma(a_{jk} + n_{jk})} \prod_{k=1}^m p_{jk}^{a_{jk} + n_{jk} - 1}$$
as an importance sampling distribution. Dividing (18.6) by (18.7), the corresponding weight function (using the prior and importance sampling densities, not merely kernels, and the data distribution function) is
$$\prod_{j=1}^m \bar\pi_j^{n_j} \cdot \prod_{j=1}^m \frac{\Gamma\big(\sum_k a_{jk}\big) \prod_k \Gamma(a_{jk} + n_{jk})}{\prod_k \Gamma(a_{jk})\, \Gamma\big(\sum_k (a_{jk} + n_{jk})\big)}.$$
As shown in Section 14, the average of this expression over all iterations is a simulation-consistent approximation of the marginal likelihood in the case when stationarity is imposed.

19A. Normal seemingly unrelated regressions model

Motivation. The seemingly unrelated regressions (SUR) model, developed originally in Zellner (1962), is perhaps the most widely used econometric model after linear regression. The reason is that it provides a simple and useful representation of systems of demand equations that arise from the neoclassical static theories of producer and consumer behavior. Two widely applied examples from producer theory illustrate this point.

If the $m \times 1$ vector $w$ denotes factor prices for a producer producing output $y$ with cost function $c(w, y)$, then from Shephard's Lemma the corresponding $m \times 1$ vector of factor demands is $x(w, y) = \partial c(w, y)/\partial w$. Given a functional form for $c(w, y)$, factor demands can be derived explicitly. The generalized Leontief (or Diewert) cost function is
$$c(w, y) = y \sum_{i=1}^m \sum_{j=1}^m b_{ij} w_i^{1/2} w_j^{1/2},\qquad b_{ij} = b_{ji}.$$
Then, defining $\varepsilon = (\varepsilon_1,\dots,\varepsilon_m)' \sim N(0, \Sigma)$,
$$x_i(w, y)/y = \sum_{j=1}^m b_{ij} (w_j/w_i)^{1/2} + \varepsilon_i,\qquad i = 1,\dots,m.$$
Notice that there are $m$ equations, each of which individually satisfies the data density specification of the normal linear model, but that most parameters appear in two equations, and the disturbance terms in the different equations are allowed to be correlated.

The translog cost function (Varian 1984, Section 4.4) is
$$\log c(w, y) = \alpha_0 + \sum_{i=1}^m \alpha_i \log w_i + \tfrac12 \sum_{i=1}^m \sum_{j=1}^m b_{ij} \log w_i \log w_j + \log y,$$
where $\sum_{i=1}^m \alpha_i = 1$, $b_{ij} = b_{ji}\ \forall\, i, j$, and $\sum_{j=1}^m b_{ij} = 0$, with $\varepsilon = (\varepsilon_1,\dots,\varepsilon_m)' \sim N(0, \Sigma)$. Since
$$\partial \log c(w, y)/\partial \log w_i = [\partial c(w, y)/\partial w_i]\, w_i / c(w, y),$$
the cost share of the $i$-th factor is
$$\frac{w_i\, x_i(w, y)}{c(w, y)} = \alpha_i + \sum_{j=1}^m b_{ij} \log w_j + \varepsilon_i,\qquad i = 1,\dots,m.$$
Once again there are $m$ equations, each of which individually satisfies the data density specification of the normal linear model; most parameters appear in two equations, and the disturbance terms in the different equations are allowed to be correlated.
Model and notation. The relations of interest are given by $y_j = X_j \beta_j + \varepsilon_j$, $j = 1,\dots,m$, where $y_j$ is $T \times 1$, $X_j$ is $T \times k_j$, $\beta_j$ is $k_j \times 1$ and $\varepsilon_j$ is $T \times 1$. Let $y = (y_1',\dots,y_m')'$ and $\varepsilon = (\varepsilon_1',\dots,\varepsilon_m')'$. A special case of the seemingly unrelated regressions model is
$$(19.1)\quad y_j = X_j \beta_j + \varepsilon_j,\quad j = 1,\dots,m,\qquad Z = \begin{pmatrix} X_1 & & 0 \\ & \ddots & \\ 0 & & X_m \end{pmatrix},\quad \beta = (\beta_1',\dots,\beta_m')',\quad k = \sum_{j=1}^m k_j.$$
To take another special case, for the generalized Leontief cost function with two inputs the equations are $x_{1t}/y_t = b_{11} + b_{12}(w_{2t}/w_{1t})^{1/2} + \varepsilon_{1t}$ and $x_{2t}/y_t = b_{12}(w_{1t}/w_{2t})^{1/2} + b_{22} + \varepsilon_{2t}$, so that $Z$ is $2T \times 3$ with rows $[1,\ (w_{2t}/w_{1t})^{1/2},\ 0]$ in the first block and $[0,\ (w_{1t}/w_{2t})^{1/2},\ 1]$ in the second, and $\beta = (b_{11}, b_{12}, b_{22})'$. In general the model may be written
$$(19.2)\quad y = Z\beta + \varepsilon.$$
The distributional assumption is
$$(19.3)\quad \varepsilon \mid Z \sim N(0, H^{-1} \otimes I_T).$$
The $m \times m$ matrix $H$ is the precision matrix of the disturbance vector $\varepsilon$.

Data density. From (19.2) and (19.3),
$$(19.4)\quad p(y \mid \beta, H, Z) = (2\pi)^{-Tm/2} |H|^{T/2} \exp[-\tfrac12 (y - Z\beta)'(H \otimes I_T)(y - Z\beta)].$$
Since $(y - Z\beta)'(H \otimes I_T)(y - Z\beta) = \sum_{i=1}^m \sum_{j=1}^m h_{ij} (y_i - Z_i \beta)'(y_j - Z_j \beta) = \operatorname{tr}(SH)$, where $S = [s_{ij}]$ with $s_{ij} = (y_i - Z_i \beta)'(y_j - Z_j \beta)$ and $Z_i$ denotes the $i$-th $T$-row block of $Z$, an alternative expression for the pdf is
$$(19.5)\quad p(y \mid \beta, H, Z) = (2\pi)^{-Tm/2} |H|^{T/2} \exp[-\tfrac12 \operatorname{tr}(SH)].$$
Define $\hat\beta = [Z'(H \otimes I_T) Z]^{-1} Z'(H \otimes I_T) y$, and observe
$$(y - Z\beta)'(H \otimes I_T)(y - Z\beta) = (y - Z\hat\beta)'(H \otimes I_T)(y - Z\hat\beta) + (\beta - \hat\beta)' Z'(H \otimes I_T) Z (\beta - \hat\beta).$$
Hence (19.4) may also be expressed
$$(19.6)\quad p(y \mid \beta, H, Z) = (2\pi)^{-Tm/2} |H|^{T/2} \exp[-\tfrac12 (y - Z\hat\beta)'(H \otimes I_T)(y - Z\hat\beta)] \exp[-\tfrac12 (\beta - \hat\beta)' Z'(H \otimes I_T) Z (\beta - \hat\beta)].$$

Priors. The prior distribution of $\beta$ is
$$(19.7)\quad \beta \sim N(\underline\beta, \underline H_\beta^{-1}),\qquad p(\beta) = (2\pi)^{-k/2} |\underline H_\beta|^{1/2} \exp[-\tfrac12 (\beta - \underline\beta)' \underline H_\beta (\beta - \underline\beta)].$$
To assign a conditionally conjugate prior distribution to $H$, it is necessary to introduce the Wishart distribution. The motivation of this distribution is as follows. Suppose $y_t \sim \text{IID } N_m(0, \Sigma)$, $t = 1,\dots,T$ ($y_t$ of dimension $m \times 1$); what is the distribution of $A = \sum_{t=1}^T y_t y_t'$? The answer is that $A$ has pdf
$$p(A) = \Big[2^{Tm/2} \pi^{m(m-1)/4} \prod_{i=1}^m \Gamma\big(\tfrac{T+1-i}{2}\big)\Big]^{-1} |\Sigma|^{-T/2} |A|^{(T-m-1)/2} \exp[-\tfrac12 \operatorname{tr}(\Sigma^{-1} A)].$$
For a full derivation, see Anderson (1984, Section 7.2). The corresponding distribution is known as the Wishart distribution with matrix parameter $\Sigma$ and $T$ degrees of freedom; it is denoted $W(\Sigma, T)$. In this distribution, $E(A) = T\Sigma$. The Wishart distribution is the multivariate generalization of the chi-square distribution.
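The sums-of-squares motivation of the Wishart distribution can be checked by simulation: the Monte Carlo mean of $A = \sum_t y_t y_t'$ should approach $T\Sigma$, the Wishart mean. The sketch below is my own; the dimensions, the particular $\Sigma$, and the use of numpy/scipy are illustrative assumptions.

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(4)

# If y_t ~ iid N_m(0, Sigma), t = 1..T, then A = sum_t y_t y_t' ~ W(Sigma, T), E[A] = T*Sigma
m, T = 3, 8
B = rng.normal(size=(m, m))
Sigma = B @ B.T + m * np.eye(m)       # a positive definite covariance matrix

# Monte Carlo mean of the sums-of-squares construction
A_draws = []
for _ in range(2000):
    Y = rng.multivariate_normal(np.zeros(m), Sigma, size=T)
    A_draws.append(Y.T @ Y)
mc_mean = np.mean(A_draws, axis=0)

# Direct draws from scipy's Wishart with the same parameters
W_draws = wishart.rvs(df=T, scale=Sigma, size=2000, random_state=rng)
w_mean = W_draws.mean(axis=0)

rel_err_mc = np.max(np.abs(mc_mean - T * Sigma)) / np.max(T * Sigma)
rel_err_w = np.max(np.abs(w_mean - T * Sigma)) / np.max(T * Sigma)
print(rel_err_mc, rel_err_w)  # both should be small
```

In the SUR application, `wishart.rvs` is exactly what is needed to draw the precision matrix $H$ from its (conditionally conjugate) Wishart prior or posterior.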
If in our application we take the prior distribution of the precision matrix $H$ to be independent of $\beta$, with $H \sim W(\underline S^{-1}, \underline\nu)$, then
$$(19.8)\quad p(H) = \Big[2^{\underline\nu m/2} \pi^{m(m-1)/4} \prod_{i=1}^m \Gamma\big(\tfrac{\underline\nu + 1 - i}{2}\big)\Big]^{-1} |\underline S|^{\underline\nu/2} |H|^{(\underline\nu - m - 1)/2} \exp[-\tfrac12 \operatorname{tr}(\underline S H)].$$
Comparing (19.8) with (19.5), note that the Wishart distribution for $H$ is the conditionally conjugate prior. In the notional sample there are $\underline\nu$ observations, and $\underline S$ is the sums-of-squares-and-cross-products matrix for the notional sample. The prior mean of $H$ is $\underline\nu\, \underline S^{-1}$. In this sense this prior distribution is the multivariate generalization of the chi-square prior distribution (2.3) for $h$ in the linear model.

Conditional distributions. From (19.6) and (19.7), the posterior density kernel for $\beta$ is
$$\exp[-\tfrac12 (\beta - \hat\beta)' Z'(H \otimes I_T) Z (\beta - \hat\beta)] \exp[-\tfrac12 (\beta - \underline\beta)' \underline H_\beta (\beta - \underline\beta)].$$
Hence $\beta \mid (H, y, Z) \sim N(\bar\beta, \bar H^{-1})$, where
$$\bar H = \underline H_\beta + Z'(H \otimes I_T) Z,\qquad \bar\beta = \bar H^{-1}\big[\underline H_\beta\, \underline\beta + Z'(H \otimes I_T) y\big].$$
In the special case (19.1) these expressions become
$$Z'(H \otimes I_T) Z = \begin{pmatrix} h_{11} X_1' X_1 & \cdots & h_{1m} X_1' X_m \\ \vdots & & \vdots \\ h_{m1} X_m' X_1 & \cdots & h_{mm} X_m' X_m \end{pmatrix},\qquad Z'(H \otimes I_T) y = \begin{pmatrix} X_1' \sum_{j=1}^m h_{1j} y_j \\ \vdots \\ X_m' \sum_{j=1}^m h_{mj} y_j \end{pmatrix}.$$
From (19.5) and (19.8), the posterior kernel for $H$ is
$$|H|^{(\underline\nu + T - m - 1)/2} \exp[-\tfrac12 \operatorname{tr}((\underline S + S) H)].$$
Hence $H \mid (\beta, y, Z) \sim W\big[(\underline S + S)^{-1},\ \underline\nu + T\big]$.
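These two conditional distributions give the two-block Gibbs sampler for the SUR model. A minimal sketch for two equations follows; the simulated data, prior settings, and use of numpy/scipy are my own illustrative assumptions, not part of the notes.

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(5)

# Two-equation SUR with block-diagonal Z = diag(X1, X2):
# beta | H ~ N(bbar, Hbar^{-1}),   H | beta ~ W((S0 + S)^{-1}, nu0 + T)
T, k = 300, 2
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
Sigma_true = np.array([[1.0, 0.6], [0.6, 1.0]])
E = rng.multivariate_normal(np.zeros(2), Sigma_true, size=T)
y1 = X1 @ beta_true[:2] + E[:, 0]
y2 = X2 @ beta_true[2:] + E[:, 1]

y = np.concatenate([y1, y2])
Z = np.block([[X1, np.zeros((T, k))], [np.zeros((T, k)), X2]])

b0, H0 = np.zeros(2 * k), np.eye(2 * k) * 0.01   # hypothetical beta prior
S0, nu0 = np.eye(2), 3                           # hypothetical H ~ W(S0^{-1}, nu0)

H, draws = np.eye(2), []
for it in range(300):
    # beta | H: precision Hbar = H0 + Z'(H x I_T)Z
    HkI = np.kron(H, np.eye(T))
    Hbar = H0 + Z.T @ HkI @ Z
    bbar = np.linalg.solve(Hbar, H0 @ b0 + Z.T @ HkI @ y)
    Lc = np.linalg.cholesky(Hbar)
    beta = bbar + np.linalg.solve(Lc.T, rng.normal(size=2 * k))
    # H | beta: Wishart with scale (S0 + S)^{-1}, S the residual cross-product matrix
    R = np.column_stack([y1 - X1 @ beta[:2], y2 - X2 @ beta[2:]])
    S = R.T @ R
    H = wishart.rvs(df=nu0 + T, scale=np.linalg.inv(S0 + S), random_state=rng)
    if it >= 150:
        draws.append(beta.copy())

post_mean = np.mean(draws, axis=0)
print(post_mean)  # should be in the vicinity of beta_true
```

In practice one would exploit the block structure of $Z'(H \otimes I_T)Z$ shown above rather than forming the Kronecker product explicitly; the dense version is kept here for transparency.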
19B. Normal linear simultaneous equation model

Motivation. Consider a very simple market equilibrium model:
$$(19B.1a)\quad p_t = \gamma_1 q_t + \beta_1' x_{t1} + \varepsilon_{t1}\quad \text{(inverse demand)},$$
$$(19B.1b)\quad q_t = \gamma_2 p_t + \beta_2' x_{t2} + \varepsilon_{t2}\quad \text{(supply)}.$$
Price and quantity at observation $t$ are denoted $p_t$ and $q_t$, respectively. The vector $x_{t1}$ includes variables affecting demand but not supply (e.g., income and prices of substitutes and complements). The vector $x_{t2}$ includes variables affecting supply but not demand (e.g., prices of inputs). The vectors $x_{t1}$ and $x_{t2}$ may also have variables in common (e.g., an intercept term). The disturbances $\varepsilon_{t1}$ and $\varepsilon_{t2}$ may be correlated. A data set $\{p_t, q_t\}_{t=1}^T$ will show a scatter of prices and quantities. Notice that if covariates were not included in the model, so that
$$(19B.2a)\quad p_t = \beta_1 + \gamma_1 q_t + \varepsilon_{t1},$$
$$(19B.2b)\quad q_t = \beta_2 + \gamma_2 p_t + \varepsilon_{t2},$$
then there would be no basis for calling the first equation demand and the second supply: not only do the two equations look alike, but so does any linear combination of the two equations. The model (19B.2) is said to be underidentified. This is not a problem with (19B.1): the only linear combination of (19B.1a) and (19B.1b) that excludes $x_{t2}$ and is normalized with a coefficient of 1 on $p_t$ is (19B.1a) itself. Similar remarks apply to (19B.1b). The model (19B.1) is identified.

Model and notation. The canonical model is
$$(19B.3)\quad y_t' \Gamma = x_t' B + \varepsilon_t',\qquad \varepsilon_t \mid (x_1,\dots,x_T) \sim \text{IID } N(0, H^{-1}),$$
where $y_t$ is $L \times 1$, $\Gamma$ is $L \times L$, $x_t$ is $k \times 1$ and $B$ is $k \times L$; the model consists of $L$ equations in $L$ endogenous variables $y_t$ and $k$ exogenous variables $x_t$. Equations (19B.3) are called structural equations, because the parameters in them correspond to the tastes and technology from which these behavioral relations are derived. The system is normalized by $\gamma_{jj} = 1$, $j = 1,\dots,L$, and with this normalization one may write
$$(19B.4)\quad y_t' = y_t'(I_L - \Gamma) + x_t' B + \varepsilon_t' = z_t' \Delta + \varepsilon_t',\quad t = 1,\dots,T,$$
in which $z_t' = (y_t', x_t')$ and $\Delta = [(I_L - \Gamma)',\ B']'$ is $(L + k) \times L$, with the $j$-th diagonal entry of $I_L - \Gamma$ equal to zero for $j = 1,\dots,L$. In addition, there are typically linear restrictions on the coefficients in $\Gamma$ and $B$: certain variables may not enter certain equations (e.g., $\gamma_{ij} = 0$ for some $(i,j)$ pairs), and there may be more general linear restrictions that can extend across equations.¹ To handle these restrictions compactly, express (19B.4) in the alternative notation
$$(19B.5)\quad y = \tilde Z \alpha + \varepsilon,\qquad \varepsilon \mid (x_1,\dots,x_T) \sim N(0, H^{-1} \otimes I_T).$$
The arrangement of $\tilde Z$ will reflect linear equality restrictions, just as it did for the seemingly unrelated regressions model; indeed, that model is a special case of this model.

Data density. From (19B.3),
$$(19B.6)\quad y_t' = x_t' B \Gamma^{-1} + v_t',\quad t = 1,\dots,T,$$
where, conditional on $x_1,\dots,x_T$, $v_t = (\Gamma')^{-1} \varepsilon_t \sim \text{IID } N[0, (\Gamma H \Gamma')^{-1}]$. Equations (19B.6) are the reduced form of the structural equations (19B.3). Thus we have the $T$ independent distributions $y_t \mid (x_1,\dots,x_T) \sim N[(\Gamma^{-1})' B' x_t,\ (\Gamma H \Gamma')^{-1}]$, and the joint data density is
$$p(y_1,\dots,y_T \mid x_1,\dots,x_T) = (2\pi)^{-TL/2} |\Gamma H \Gamma'|^{T/2} \exp\Big[-\tfrac12 \sum_{t=1}^T (y_t' \Gamma - x_t' B)\, H\, (y_t' \Gamma - x_t' B)'\Big].$$
Reorganizing the notation using (19B.5), and noting $|\Gamma H \Gamma'|^{T/2} = \|\Gamma\|^T |H|^{T/2}$, where $\|\Gamma\|$ denotes the absolute value of the determinant of $\Gamma$, the last expression becomes
$$(2\pi)^{-TL/2}\, \|\Gamma\|^T\, |H|^{T/2} \exp[-\tfrac12 (y - \tilde Z \alpha)'(H \otimes I_T)(y - \tilde Z \alpha)].$$
This kernel is exactly the same as (19.4), with the important exception of the term $\|\Gamma\|^T$.
Prior distribution. We retain the same prior distribution as in the seemingly unrelated regressions model: $\alpha \sim N(\underline\alpha, \underline H_\alpha^{-1})$ and, independently, $H \sim W(\underline S^{-1}, \underline\nu)$.

Full information with Gibbs sampling. In the full model presented, analytical expressions for $\|\Gamma\|^T$ involve higher powers and combinations of the $\gamma_{ij}$; these are typically quite unwieldy. If we attempt to apply the two-block Gibbs sampler that was convenient for the normal seemingly unrelated regressions model, sampling $\alpha$ is quite difficult, since elements of $\alpha$ enter $\|\Gamma\|^T$. However, it is easy to utilize a Metropolis-within-Gibbs step at this point. At iteration $m+1$, first draw $H^{(m+1)}$ from the appropriate Wishart distribution. Then draw a candidate $\alpha^* \sim N(\bar\alpha, \bar H^{-1})$, where $\bar H = \underline H_\alpha + \tilde Z'(H \otimes I_T)\tilde Z$ and $\bar\alpha = \bar H^{-1}[\underline H_\alpha\, \underline\alpha + \tilde Z'(H \otimes I_T) y]$. Then set $\alpha^{(m+1)} = \alpha^*$ with probability
$$\min\left\{\frac{p(y \mid \alpha^*, H^{(m+1)}, \tilde Z)\, p(\alpha^*)\, /\, q(\alpha^*)}{p(y \mid \alpha^{(m)}, H^{(m+1)}, \tilde Z)\, p(\alpha^{(m)})\, /\, q(\alpha^{(m)})},\ 1\right\} = \min\left\{\frac{\|\Gamma(\alpha^*)\|^T}{\|\Gamma(\alpha^{(m)})\|^T},\ 1\right\},$$
where $q(\cdot)$ is the $N(\bar\alpha, \bar H^{-1})$ candidate density: the Gaussian factors cancel, leaving only the ratio of the Jacobian terms. Thus the actual modification of the seemingly unrelated regressions model Gibbs sampling algorithm is minor.

Limited information with Gibbs sampling. In an approach known as limited information, the investigator fully specifies the right-hand side of one equation in the system (19B.3) — without loss of generality the first — but for the remaining $L - 1$ indicates only the reduced form relation between the predetermined variables $x_t$ and the endogenous variables $y_{t2},\dots,y_{tL}$. Thus
$$\Gamma = \begin{pmatrix} 1 & 0 \\ -\gamma & I_{L-1} \end{pmatrix},\qquad B = [\,\beta_1\ \ \Pi\,],$$
where $\gamma$ and $\beta_1$ may be (and generally are) subject to further linear restrictions, but $\Pi$ is not. Since $\|\Gamma\| = 1$ in this situation, the likelihood function for the limited information normal linear simultaneous equation model is exactly the same as the normal seemingly unrelated regressions model likelihood function. The Gibbs sampling algorithm for the latter can therefore be applied directly, with no further modification, in this case.

¹There is a long and substantial literature on the kinds of restrictions that are necessary and sufficient to uniquely identify all of the equations in the system, i.e., to prevent any one equation from being confused with linear combinations of the other equations. On the interaction between identification and prior distributions, see Dreze (1972).

20. Serial correlation
Motivation. Suppose that in the linear regression model discussed in Section 2, the regressors and dependent variable are time series, each measured at a point in time. If in continuous time these variables move smoothly, without jumps, then as the sampling interval becomes shorter and shorter the assumption that the disturbances are mutually independent becomes untenable, since each is a linear function of the dependent variable and regressors. Here we take up a simple modification of this model that weakens the assumption of independence, replacing it with the assumption that the disturbance follows an autoregressive process.

Model. The model and procedures are those of Chib and Greenberg (1996):
$$y_t = \beta' x_t + \varepsilon_t,$$
$$(20.1)\quad \varepsilon_t = \sum_{s=1}^p \phi_s \varepsilon_{t-s} + u_t,\qquad u_t \sim \text{IID } N(0, h^{-1}),$$
where $\varepsilon_t$ is stationary, i.e., the joint distribution of any set $(\varepsilon_t, \varepsilon_{t-s_1}, \dots, \varepsilon_{t-s_m})$ depends on $(s_1,\dots,s_m)$ but not $t$. We shall continue to denote $y = (y_1,\dots,y_T)'$ and $X = (x_1,\dots,x_T)'$.

Motivated by (20.1), for $t = p+1,\dots,T$ define $y_t^* = y_t - \sum_{s=1}^p \phi_s y_{t-s}$ and $x_t^* = x_t - \sum_{s=1}^p \phi_s x_{t-s}$. Then
$$y_t^* - \beta' x_t^* = u_t \sim \text{IID } N(0, h^{-1}),\quad t = p+1,\dots,T.$$
Letting $y_p$ denote the first $p$ elements of $y$, and $X_p$ the first $p$ rows of $X$, we have $y_p - X_p \beta \sim N[0, h^{-1} V_p(\phi)]$, where $V_p(\phi)$ is a $p \times p$ matrix to be derived. The transformation from $y$ to $(y_p', y_{p+1}^*, \dots, y_T^*)'$ is one-to-one, and the Jacobian of the transformation is one. Hence the data density, conditional on covariates, is
$$(20.2)\quad p(y \mid \beta, h, \phi, X) = (2\pi)^{-T/2} h^{T/2} |V_p(\phi)|^{-1/2} \exp[-\tfrac12 h (y_p - X_p \beta)' V_p(\phi)^{-1} (y_p - X_p \beta)] \exp[-\tfrac12 h \textstyle\sum_{t=p+1}^T (y_t^* - \beta' x_t^*)^2].$$
The matrix $h^{-1} V_p(\phi)$ is a $p \times p$ Toeplitz matrix: its $(i,j)$ entry is $\operatorname{cov}(\varepsilon_{t-i}, \varepsilon_{t-j}) = v_{|i-j|}$, which depends only on $|i - j|$. These elements are deterministic functions of $\phi$ and $h$, which we may derive as follows. From (20.1),
$$(20.3)\quad v_j = \operatorname{cov}(\varepsilon_t, \varepsilon_{t-j}) = \sum_{s=1}^p \phi_s \operatorname{cov}(\varepsilon_{t-s}, \varepsilon_{t-j}) + \operatorname{cov}(u_t, \varepsilon_{t-j}) = \sum_{s=1}^p \phi_s v_{j-s} + \delta(j, 0)\, h^{-1}.$$
Evaluating (20.3) for $j = 1,\dots,p$ leads to the $p$ Yule-Walker equations
$$(20.4)\quad \begin{pmatrix} v_0 & v_1 & \cdots & v_{p-1} \\ v_1 & v_0 & \cdots & v_{p-2} \\ \vdots & & & \vdots \\ v_{p-1} & v_{p-2} & \cdots & v_0 \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{pmatrix} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_p \end{pmatrix},$$
and evaluating (20.3) for $j = 0$ yields
$$(20.5)\quad v_0 = \sum_{s=1}^p \phi_s v_s + h^{-1}.$$
If the $p \times p$ matrix in (20.4) is positive definite, then (20.4) provides a solution for $\phi_1,\dots,\phi_p$ given $v_0,\dots,v_p$; equation (20.5) then provides $h^{-1}$, but it is not obvious that $h^{-1}$ will be strictly positive.
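Going the other way — from $(\phi, h)$ to $v_0,\dots,v_p$ — the $p+1$ equations (20.3)–(20.5) form a linear system that is easy to solve numerically. The sketch below is my own (the function name and test values are illustrative); it checks the AR(1) case against the familiar closed form $v_0 = h^{-1}/(1 - \phi_1^2)$, $v_1 = \phi_1 v_0$.

```python
import numpy as np

# Solve v_j = sum_s phi_s v_{|j-s|} + (1/h) * 1{j=0},  j = 0,...,p,
# for the autocovariances v_0,...,v_p given AR coefficients phi and precision h
def autocovariances(phi, h=1.0):
    p = len(phi)
    A = np.zeros((p + 1, p + 1))
    b = np.zeros(p + 1)
    for j in range(p + 1):
        A[j, j] += 1.0                        # coefficient of v_j
        for s in range(1, p + 1):
            A[j, abs(j - s)] -= phi[s - 1]    # coefficient of v_{|j-s|}
        if j == 0:
            b[j] = 1.0 / h                    # innovation variance enters only at j=0
    return np.linalg.solve(A, b)

# AR(1) check against the closed form
phi1, h = 0.7, 2.0
v = autocovariances([phi1], h)
v0_exact = 1.0 / (h * (1.0 - phi1**2))
print(v, [v0_exact, phi1 * v0_exact])
```

Setting $h = 1$ in this mapping yields exactly the matrix $V_p(\phi)$ appearing in (20.2); further lags follow from the recursion $v_j = \sum_s \phi_s v_{j-s}$ for $j > p$.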
be shown (Anderson 1971) that if the p×p matrix in (20.4) is positive definite, then h⁻¹ = v_0 − Σ_{s=1}^p φ_s v_s is positive and all of the roots of the characteristic polynomial

(20.6) φ(z) = 1 − Σ_{s=1}^p φ_s z^s

have modulus strictly greater than one. Conversely, given φ_1, …, φ_p and h, the p+1 equations (20.4) and (20.5) provide unique solutions for v_0, …, v_p, and the p×p matrix in (20.4) is positive definite, if h is positive and all of the roots of φ(z) have modulus strictly greater than one. It is clear that in the latter mapping from φ_1, …, φ_p and h, (v_0, …, v_p) is homogeneous of degree one in h⁻¹. Taking h = 1 provides the function V_p(φ) in (20.2). Additional values v_j (j > p) may be computed through the recursion from (20.3), v_j = Σ_{s=1}^p φ_s v_{j−s}, and any variance matrix var(ε_t, …, ε_{t−q}) formed in this way will be positive definite.

Inference. Independent prior distributions for β and h are β ~ N(β̲, H̲_β⁻¹), with kernel p(β) ∝ exp{−(β − β̲)'H̲_β(β − β̲)/2}, and s̲²h ~ χ²(ν̲), with kernel p(h) ∝ h^{ν̲/2−1} exp(−s̲²h/2). The prior distribution for φ is φ ~ N(φ̲, H̲_φ⁻¹), with kernel p(φ) ∝ exp{−(φ − φ̲)'H̲_φ(φ − φ̲)/2}, subject to φ ∈ S_p, where S_p = {φ : 1 − Σ_{s=1}^p φ_s z^s ≠ 0 ∀ z with |z| ≤ 1} is the stationarity region.

Gibbs sampling approach. First consider the conditional distribution of β and h. Let V_p(φ)⁻¹ = A_p'A_p, where A_p is a lower triangular matrix with positive diagonal elements. This decomposition is unique. Then the posterior kernel in β and h is

(20.7) h^{(ν̲+T)/2−1} exp{−(h/2)[s̲² + Σ_{t=p+1}^T (y*_t − β'x*_t)² + (ỹ_p − X̃_p β)'(ỹ_p − X̃_p β)]} exp{−(β − β̲)'H̲_β(β − β̲)/2}.

In this expression ỹ_p = A_p y_p and X̃_p = A_p X_p; let y* = (ỹ_p', y*_{p+1}, …, y*_T)' and define X* correspondingly. As a function of β, (20.7) is in standard form for the posterior kernel of the coefficients of a normal linear regression model with a normal prior. The corresponding conditional distribution of β is normal with precision H̄_β = H̲_β + hX*'X* and mean β̄ = H̄_β⁻¹(H̲_β β̲ + hX*'y*). The kernel of (20.7) in h implies the conditional distribution

[s̲² + (y* − X*β)'(y* − X*β)]h ~ χ²(ν̲ + T).

The posterior kernel in φ is the product of

(20.8) exp{−(h/2)Σ_{t=p+1}^T (ε_t − Σ_{s=1}^p φ_s ε_{t−s})²} exp{−(φ − φ̲)'H̲_φ(φ − φ̲)/2}

and

(20.9) ‖V_p(φ)‖^{−1/2} exp{−(h/2)(y_p − X_p β)'V_p(φ)⁻¹(y_p − X_p β)} χ_{S_p}(φ),

where ε_t = y_t − β'x_t, t = 1, …, T. Expression (20.8) is in standard form for the posterior kernel of the coefficients of a normal linear regression model with a normal prior. The distribution corresponding to the kernel of (20.8) in φ is normal with precision H̄_φ = H̲_φ + hE'E and
mean φ̄ = H̄_φ⁻¹(H̲_φ φ̲ + hE'ε*), where ε* = (ε_{p+1}, …, ε_T)' and the (T−p)×p matrix E has typical entry ε_{s−j} in the row corresponding to s and column j, s = p+1, …, T. In a Metropolis-within-Gibbs step we draw a candidate φ* from this distribution and use (20.9) to construct the acceptance probability

min{ [‖V_p(φ*)‖^{−1/2} exp{−(h^(m)/2)(y_p − X_p β^(m))'V_p(φ*)⁻¹(y_p − X_p β^(m))} χ_{S_p}(φ*)] / [‖V_p(φ^(m))‖^{−1/2} exp{−(h^(m)/2)(y_p − X_p β^(m))'V_p(φ^(m))⁻¹(y_p − X_p β^(m))}], 1 }

at step m. This follows from expression (13.1).

21 Generalization of Gaussian models

Motivation. In the linear regression model, y_t = β'x_t + ε_t (t = 1, …, T) and ε_t | X ~ IID N(0, h⁻¹). In Section 2 we completed the model with the prior distribution β ~ N(β̲, H̲_β⁻¹), where β̲ ∈ R^k and H̲_β is a k×k positive definite precision matrix,

(21.1) p(β) = (2π)^{−k/2} ‖H̲_β‖^{1/2} exp{−(β − β̲)'H̲_β(β − β̲)/2},

and the independent prior distribution s̲²h ~ χ²(ν̲). In virtually all cases the assumption of normality is analytically convenient, but it has no special grounding in either economic theory or observed behavior. Moreover, in any model in which expected utility is an important part, the distributional assumption for the shock may be of first-order significance. This is especially true when both utility and shocks in a model are unbounded, for then expected utility may depend on the interaction of the tail behavior of the utility function and the distribution of shocks. In this section we take up a generalization of normality that is rich, yet builds on the tools developed in this course. It is illustrated for the case of the normal linear model, but it will be clear that the generalization can be used in much the same way with more elaborate models.

The mixture of normals linear regression model. For the assumption ε_t | X ~ IID N(0, h⁻¹), substitute

ε_t = Σ_{j=1}^m e_{tj}(α_j + h_j^{−1/2} η_t), α = (α_1, …, α_m)' ∈ R^m, h = (h_1, …, h_m)' ∈ R_+^m, η_t | X ~ IID N(0, 1).

The random vectors e_t = (e_{t1}, …, e_{tm})' are i.i.d., each with a multinomial distribution with parameters p_j = P(e_{tj} = 1), j = 1, …, m:

e_t ~ IID MN(p_1, …, p_m), p = (p_1, …, p_m)' ∈ S_m,

where S_m is the unit simplex in R^m. In the full mixture of normals linear model, rank(X) = k and Xa ≠ ι_T for any a (so that an intercept cannot be confounded with the α_j); p_j > 0 ∀ j; and either (a) α_{j−1} < α_j, j = 2, …, m, or (b) h_{j−1} < h_j, j = 2, …, m. In the scale mixture of normals linear model, α_j = 0, j = 1, …, m; X may and
generally does include an intercept, and (b) obtains. In the mean mixture of normals linear model, h_1 = … = h_m = h, and (a) obtains. The orderings in (a) and (b) are labeling restrictions that prevent interchanging the components of the mixture; obviously other labeling restrictions are possible.

For Bayesian inference it is convenient to complete the model with independent prior distributions for β, α, h and p, respectively. The prior distribution for β is (21.1). Except in the scale mixture of normals model, α ~ N(α̲, H̲_α⁻¹), where α̲ ∈ R^m and H̲_α is an m×m positive definite matrix,

(21.2) p(α) = (2π)^{−m/2} ‖H̲_α‖^{1/2} exp{−(α − α̲)'H̲_α(α − α̲)/2},

subject to (a). Except in the mean mixture of normals model, s̲_j² h_j ~ χ²(ν̲_j), j = 1, …, m, where s̲_j > 0, ν̲_j > 0, and the distributions are independent,

(21.3) p(h) = Π_{j=1}^m [2^{ν̲_j/2} Γ(ν̲_j/2)]⁻¹ (s̲_j²)^{ν̲_j/2} h_j^{ν̲_j/2−1} exp(−s̲_j² h_j/2),

possibly subject to (b). Finally, p ~ Dirichlet(r̲_1, …, r̲_m) (the multivariate beta distribution), r̲ ∈ R_+^m,

(21.4) p(p) = Γ(Σ_{j=1}^m r̲_j) [Π_{j=1}^m Γ(r̲_j)]⁻¹ Π_{j=1}^m p_j^{r̲_j−1}.

The data density is

(21.5) p(y | β, α, h, p, X) = (2π)^{−T/2} Π_{t=1}^T Σ_{j=1}^m p_j h_j^{1/2} exp{−(h_j/2)(y_t − α_j − β'x_t)²}.

As a function of β, α, h and p, the data density is unbounded, and hence the conventional theory of maximum likelihood estimators breaks down in this model; see Result 1 in the appendix of this section. However, the product of (21.3) and (21.5) is bounded by a function of h that is finitely integrable; see Result 2 in the appendix. It follows that the posterior distribution for all parameters β, α, h and p exists. Since β, α and p have prior moments of all orders, they also have posterior moments of all orders. It is also the case that h has posterior moments of all orders.

A posterior simulator. In the mixture of normals linear model, p(y, e | β, α, h, p, X) = p(y | e, β, α, h, X) p(e | p). Defining L_t by e_{t,L_t} = 1, and T_j = Σ_{t=1}^T e_{tj},

(21.6) p(e | p) = Π_{j=1}^m p_j^{T_j},

(21.7) p(y | e, β, α, h, X) = (2π)^{−T/2} Π_{j=1}^m h_j^{T_j/2} exp{−(1/2) Σ_{j=1}^m h_j Σ_{t: L_t = j} (y_t − α_j − β'x_t)²}.

The product of (21.6) and (21.7) and the prior density kernels (21.1)-(21.4) is a kernel of the posterior distribution of the latent variables e (equivalently, {L_t}) and the parameter vectors β, α, h and p. Posterior distributions for individual groups of latent variables and parameters, conditional on all the other latent variables and parameters and
the data, are easily derived from these expressions, as follows. The kernel in {L_t} is the product of (21.6) and (21.7), which shows that the L_t are conditionally independent with

P(L_t = j) ∝ p_j h_j^{1/2} exp{−(h_j/2)(y_t − α_j − β'x_t)²}.

The kernel in α is the product of (21.2) and (21.7), yielding a normal distribution with precision H̄_α = H̲_α + diag(T_1 h_1, …, T_m h_m) and mean ᾱ = H̄_α⁻¹[H̲_α α̲ + (h_1 Σ_{t: L_t = 1}(y_t − β'x_t), …, h_m Σ_{t: L_t = m}(y_t − β'x_t))'], subject to α_1 < … < α_m if this labeling restriction has been invoked. The algorithm of Geweke (1991) provides for efficient imposition of the inequality constraints. The kernel in h is the product of (21.3) and (21.7), indicating

[s̲_j² + Σ_{t: L_t = j}(y_t − α_j − β'x_t)²] h_j ~ χ²(ν̲_j + T_j), j = 1, …, m,

subject to h_1 < … < h_m if the labeling restriction on h has been invoked. Whether or not this restriction applies, it is straightforward to draw the h_j sequentially. Finally, the posterior kernel in p is the product of (21.4) and (21.6),

p ~ Dirichlet(r̲_1 + T_1, …, r̲_m + T_m).

It is especially interesting to compare this model to the normal linear model, and to compare models within this class using different prior distributions and/or different values of m. It is straightforward to compute the marginal likelihood as described in Section 14, using (21.1) through (21.5).

Appendix: some important properties of the likelihood function and posterior kernel.

Result 1. The likelihood function is unbounded. The likelihood function is

L(β, α, h, p; y, X) = Π_{t=1}^T L_t(β, α, h, p; y_t, x_t), L_t(β, α, h, p; y_t, x_t) = Σ_{j=1}^m p_j h_j^{1/2} exp{−(h_j/2)(y_t − α_j − β'x_t)²}.

Set β = b, α_2 = … = α_m = 0, h_2 = … = h_m = h = T/[(y − Xb)'(y − Xb)], p_1 = s > 0, p_2 = … = p_m = (1 − s)/(m − 1). Now set α_1 = y_1 − b'x_1. Then as h_1 → ∞, L_1(β, α, h, p; y_1, x_1) → ∞, whereas with probability one L_t(β, α, h, p; y_t, x_t) → (1 − s) h^{1/2} exp{−(h/2)(y_t − b'x_t)²} > 0, t = 2, …, T.

Result 2. The product of the likelihood function and the prior for h is bounded by the kernel of a proper p.d.f. A kernel of the prior density for h is Π_{j=1}^m h_j^{ν̲_j/2−1} exp(−s̲_j² h_j/2). Since (y_t − α_j − β'x_t)² ≥ 0 and p_j ≤ 1, the likelihood function is bounded above by [Σ_{j=1}^m h_j^{1/2}]^T. The logarithm of the product of this bound and the prior kernel for h is f(h) = f_1(h) + f_2(h), with

f_1(h) = T log(Σ_{j=1}^m h_j^{1/2}) − (1/4)Σ_{j=1}^m s̲_j² h_j, f_2(h) = Σ_{j=1}^m (ν̲_j/2 − 1) log h_j − (1/4)Σ_{j=1}^m s̲_j² h_j.

It is straightforward to verify that f_1(h) is globally concave with a regular internal maximum, and that f_2(h) is
the log kernel of a product of proper gamma densities.

Corollary 1. The posterior distribution, for the data density (21.5) and independent prior densities (21.1)-(21.4), exists. Result 2 implies that the product of the likelihood function and prior density kernel is finitely integrable.

Corollary 2. The parameter vector h has finite moments of all orders. This follows because the gamma densities mentioned in the proof of Result 2 have finite moments of all orders.

References

Anderson, T.W. (1971), The Statistical Analysis of Time Series. New York: Wiley.

Anderson, T.W. (1984), An Introduction to Multivariate Statistical Analysis, second edition. New York: Wiley.

Bartlett, M.S. (1957), "A Comment on D.V. Lindley's Statistical Paradox," Biometrika 44: 533-534.

Chib, S. and E. Greenberg (1995), "Understanding the Metropolis-Hastings Algorithm," The American Statistician 49: 327-335.

Chib, S. and E. Greenberg (1996), "Hierarchical Analysis of SUR Models with Extensions to Correlated Serial Errors and Time Varying Parameter Models," Journal of Econometrics.

Devroye, L. (1986), Non-Uniform Random Variate Generation. New York: Springer-Verlag.

Dreze, J.H. (1972), "Econometrics and Decision Theory," Econometrica 40: 1-17.

Gelfand, A.E. and D.K. Dey (1994), "Bayesian Model Choice: Asymptotics and Exact Calculations," Journal of the Royal Statistical Society, Series B 56: 501-514.

Geweke, J. (1989a), "Bayesian Inference in Econometric Models Using Monte Carlo Integration," Econometrica 57: 1317-1340.

Geweke, J. (1989b), "Exact Predictive Densities in Linear Models with ARCH Disturbances," Journal of Econometrics 40: 63-86.

Geweke, J. (1991), "Efficient Simulation from the Multivariate Normal and Student-t Distributions Subject to Linear Constraints," in E.M. Keramidas (ed.), Computing Science and Statistics: Proceedings of the Twenty-Third Symposium on the Interface, 571-578. Fairfax: Interface Foundation of North America, Inc.

Geweke, J., R. Marshall and G. Zarkin (1986), "Mobility Indices in Continuous Time Markov Chains," Econometrica 54: 1407-1424.

Hammersley, J.M. and D.C. Handscomb (1964), Monte Carlo Methods. London: Methuen and Company.

Hastings, W.K. (1970), "Monte Carlo
Sampling Methods Using Markov Chains and Their Applications," Biometrika 57: 97-109.

Jeffreys, H. (1961), Theory of Probability. Oxford: Clarendon.

Kloek, T. and H.K. van Dijk (1978), "Bayesian Estimates of Equation System Parameters: An Application of Integration by Monte Carlo," Econometrica 46: 1-20.

Leamer, E.E. (1973), "Multicollinearity: A Bayesian Interpretation," Review of Economics and Statistics 55: 371-380.

Lindley, D.V. (1957), "A Statistical Paradox," Biometrika 44: 187-192.

Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller (1953), "Equation of State Calculations by Fast Computing Machines," The Journal of Chemical Physics 21: 1087-1092.

Poirier, D. (1995), Intermediate Statistics and Econometrics: A Comparative Approach. Cambridge: MIT Press.

Raftery, A.E. (1995), "Hypothesis Testing and Model Selection via Posterior Simulation," University of Washington working paper.

Tierney, L. (1991), "Exploring Posterior Distributions Using Markov Chains," in E.M. Keramidas (ed.), Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, 563-570. Fairfax: Interface Foundation of North America, Inc.

Tierney, L. (1994), "Markov Chains for Exploring Posterior Distributions" (with discussion and rejoinder), Annals of Statistics 22: 1701-1762.

Varian, H.R. (1984), Microeconomic Analysis, second edition. New York: W.W. Norton.

Zellner, A. (1962), "An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests of Aggregation Bias," Journal of the American Statistical Association 57: 500-509.

Economics 8205, Applied Econometrics II, Lecture Notes (J. Geweke), February 27, 1997, Winter 1997

Nonlinear inequality constraints

Motivating example. Consider the production function

y_t = f(x_t) + ε_t = α'x_t + x_t'Bx_t + ε_t, t = 1, …, T,

where B is symmetric and the constraints ∂f/∂x ≥ 0 ∀ x: 0 ≤ x ≤ c and ∂²f/∂x∂x' negative semidefinite are to be imposed. For given α and B these can easily be checked by (i) computing the eigenvalues of B, and (ii) noting that ∂f/∂x = α + 2Bx ≥ 0 ∀ x: 0 ≤ x ≤ c if and only if α_i ≥ −2Σ_j min(b_ij c_j, 0), i = 1, …, n, since each component of α + 2Bx is linear in x and therefore attains its minimum over the box at a vertex. Complete the model with the prior (15.2), subject to these inequality constraints. There are two ways to proceed. The first
procedure is to modify one of the previously developed algorithms, using rejection to enforce the inequality constraints. There are many possibilities. The simplest is to use the two-block Gibbs sampling algorithm for the normal linear regression model described on February 11, rejecting any draw for which any of the eigenvalues in (i) are positive or any of the inequalities in (ii) are violated. A more elaborate, but potentially more efficient, procedure is to use the linear inequality constraint algorithm set out above to enforce the restrictions in (ii), and then use rejection to enforce the concavity condition in (i).

The second procedure is to employ a random walk Metropolis algorithm. An initial point satisfying all constraints is chosen; this is easy. The variance matrix of the normal proposal distribution is held constant. As described in the lecture of February 13, it is important that this variance matrix be scaled appropriately if the algorithm is to be reasonably efficient. Choosing a good scaling could be more difficult.

Markov chain models

Motivation. Often economic agents or entities can be characterized as being in one of a small number of possible states. For example, an individual might be employed, unemployed, or out of the labor force; or an individual might be married or not married. If the probability of being in a particular state in a period depends only on the state occupied in the previous period, the model is a first-order finite state Markov chain model. While we are interested in these models in part for their own sake, we are also interested in them because they frequently arise as important constituents of more complicated models. Furthermore, the probability of transition can easily be made to depend on more than just the state occupied in the previous period.

The model. There are m possible states of the world that are occupied by agents. For any agent let S_t indicate the state occupied at time t; S_t is an integer between 1 and m inclusive. The first-order finite state Markov chain model specifies
P(S_t = j_t | S_{t−1} = j_{t−1}, S_{t−2} = j_{t−2}, …) = P(S_t = j_t | S_{t−1} = j_{t−1}) ∀ t,

and denotes P(S_t = j | S_{t−1} = i) = p_ij. Let π_tj, j = 1, …, m, denote the probabilities of being in the various states at time t, corresponding to some initial distribution of probabilities π_0j, j = 1, …, m, in period 0. Let π_t' = (π_t1, …, π_tm) and P = [p_ij] (m×m). In this notation the specification of the model is π_tj = Σ_{i=1}^m π_{t−1,i} p_ij, i.e., π_t' = π_{t−1}'P. From this formulation it is clear that the transition from period t−j to period t is given by π_t' = π_{t−j}'P^j.

The eigenvalues and eigenvectors of the transition matrix P are important for the properties of the model. Denote the eigenvalues by λ_1, …, λ_m, ordered so that |λ_1| ≥ |λ_2| ≥ … ≥ |λ_m|. Let the diagonalization of P be P = CΛC⁻¹; the columns of C are right eigenvectors of P and the rows of C⁻¹ are left eigenvectors of P. The eigenvalues of P cannot exceed 1 in modulus because 0 ≤ tr(P^t) ≤ m and tr(P^t) = Σ_{i=1}^m λ_i^t. Since Σ_{j=1}^m p_ij = 1 ∀ i, Pι_m = ι_m, where ι_m is the m×1 vector of ones; hence ι_m is a right eigenvector corresponding to an eigenvalue of 1, and it is convenient to take λ_1 = 1.

A probability distribution over the m states, π̄, is an invariant distribution if π̄' = π̄'P. The vector π̄ must be a left eigenvector corresponding to an eigenvalue equal to 1. If |λ_1| > |λ_2|, then this invariant distribution is unique. Suppose |λ_2| = 1. If λ_2 = 1, then the Markov chain is reducible, with invariant states depending on the initial distribution. The simplest example is P = I_2, and the simplest nontrivial example is

    [ 1−p_12   p_12    0 ]
P = [ p_21     1−p_21  0 ]
    [ 0        0       1 ]

If λ_2 = −1, or if λ_2 is complex with |λ_2| = 1, then the chain is periodic. Examples include

P = [ 0 1 ]        [ 0 1 0 ]
    [ 1 0 ],   P = [ 0 0 1 ]
                   [ 1 0 0 ]

If |λ_2| < 1, then this eigenvalue provides an upper bound on the rate of convergence to the invariant distribution, as indicated in the following result.

Result. If |λ_2| < 1, then for any r with |λ_2| < r < 1, lim_{t→∞} r^{−t}(π_t − π̄) = 0.

Proof. Recall λ_1 = 1, let Λ = diag(λ_1, …, λ_m), and let Λ_* = diag(0, λ_2, …, λ_m). Since π_t' = π_{t−1}'P = π_0'P^t and π̄' = π̄'P,

π_t' − π̄' = (π_0' − π̄')P^t = (π_0' − π̄')CΛ^t C⁻¹ = (π_0' − π̄')CΛ_*^t C⁻¹,

where the last equality obtains because the first column of C is proportional to ι_m and π̄' is proportional to the first row of C⁻¹, so the first element of (π_0' − π̄')C is zero. Then r^{−t}(π_t' − π̄') = r^{−t}(π_0' − π̄')CΛ_*^t C⁻¹ =
(π_0' − π̄')C R_*^t C⁻¹, where R_* = diag(0, r⁻¹λ_2, …, r⁻¹λ_m). Since |r⁻¹λ_i| < 1 for i = 2, …, m, lim_{t→∞} R_*^t = 0, and hence lim_{t→∞} r^{−t}(π_t − π̄) = 0.

Notice that if all λ_j with |λ_j| = |λ_2| are real and positive, then

lim_{t→∞} |λ_2|^{−t}(π_t' − π̄') = (π_0' − π̄')CDC⁻¹,

where D is a diagonal matrix with d_11 = 0, d_22 = 1, and d_jj = 1 or 0 for j = 3, …, m (d_jj = 1 if λ_j = λ_2, and 0 otherwise). For any h, the limiting deviation |λ_2|^{−t}(π_{t+h}' − π̄') is then |λ_2|^h times this same limit. If h = −log 2 / log|λ_2|, then |λ_2|^h = 1/2; this value of h is known as the half-life of the Markov chain. The definition still applies for second-largest roots that are negative or complex, but in that case the limit does not exist as it has been taken here, and the result is in terms of amplitudes of oscillations about the invariant distribution.

Other functions of interest. While the entries p_ij of P uniquely characterize the model, they are not as directly related to the implied dynamics as some functions of these parameters. The invariant vector π̄ and the convergence bound |λ_2| are examples. There are also many measures of mobility between states, including the expected length of stay in state i, (1 − p_ii)⁻¹, and the overall measure of mobility (m − tr P)/(m − 1); for further discussion and properties of these measures, see Geweke, Marshall and Zarkin (1986).
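The quantities discussed in this last section (the invariant distribution, the half-life implied by |λ_2|, the expected length of stay (1 − p_ii)⁻¹, and the overall mobility measure) are simple to compute from P. The sketch below is illustrative rather than part of the notes; the function names are my own, and it assumes the chain is irreducible and aperiodic, so that |λ_2| < 1 and the invariant distribution is unique.

```python
import numpy as np

def invariant_distribution(P):
    """Left eigenvector of P for the unit eigenvalue, scaled to sum to one."""
    eigvals, eigvecs = np.linalg.eig(P.T)  # eigenvectors of P' = left eigenvectors of P
    k = np.argmin(np.abs(eigvals - 1.0))   # column closest to eigenvalue 1
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()                   # normalization also fixes the sign

def half_life(P):
    """h = -log 2 / log|lambda_2|: periods for deviations from the invariant
    distribution to shrink by half (assumes |lambda_2| < 1)."""
    mods = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]  # |lambda_1| >= |lambda_2| >= ...
    return -np.log(2.0) / np.log(mods[1])

def mobility_measures(P):
    """Expected stay lengths 1/(1 - p_ii) and the measure (m - tr P)/(m - 1)."""
    m = P.shape[0]
    return 1.0 / (1.0 - np.diag(P)), (m - np.trace(P)) / (m - 1)
```

For example, with P = [0.9 0.1; 0.2 0.8] the eigenvalues are 1 and 0.7, so the invariant distribution is (2/3, 1/3), the half-life is −log 2 / log 0.7, about 1.94 periods, the expected stays are 10 and 5 periods, and the mobility measure is 0.3.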
