
# Quant Anly Pol Sci Ii POL 213

UCD


This 220 page Class Notes was uploaded by Pierre Huel on Tuesday September 8, 2015. The Class Notes belongs to POL 213 at University of California - Davis taught by Bradford Jones in Fall. Since its upload, it has received 96 views. For similar materials see /class/187556/pol-213-university-of-california-davis in Political Science at University of California - Davis.


Date Created: 09/08/15

## Categorical Models

Brad Jones, Department of Political Science, University of California, Davis

April 28, 2009

POL 213: Research Methods

### Ordinal Response Variables

- Definition: confine ourselves to $y$ well understood to be limited (3- to 7-point scales, for example).
- Basic problems emerge but are often ignored.
- Regression is commonly applied: "Because the dependent variables are categorical, OLS regression is technically inappropriate. We found substantially the same results, however, using ordinal logit models. We report the OLS results because their interpretation is more straightforward." (Zuckerman and Jost 2001, SPQ)

### Illustration

- Suppose we proceed with OLS.
- Response variable: 4-point Likert scale on attitudes toward immigration.
- Use same covariates from last week.
- Model returns:

OLS Estimates for Immigration Effect

| Variable | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| Intercept | 2.4332 | 0.1490 | 16.33 | 0.0000 |
| pid | -0.3022 | 0.0604 | -5.01 | 0.0000 |
| education | 0.0136 | 0.0215 | 0.63 | 0.5280 |
| income | 0.1049 | 0.0379 | 2.77 | 0.0059 |
| latino | -0.8992 | 0.1088 | -8.26 | 0.0000 |

### Problems

- Equal interval scoring assumed.
- A unit change in $X$ results in a $\beta$ change in $E(Y)$.
- False sense of precision.
- Alternative models need to be considered (or should be).

### Utility Motivation

- Assume $Y$ is a discretized measure of $Y^*$.
- Postulate:

$$Y_i^* = x_i\beta + \varepsilon_i \qquad (1)$$

- Cut point rule:

$$Y = \begin{cases} 1 & \text{if } Y^* \le \alpha_1 \\ 2 & \text{if } \alpha_1 < Y^* \le \alpha_2 \\ 3 & \text{if } \alpha_2 < Y^* \le \alpha_3 \\ 4 & \text{if } Y^* > \alpha_3 \end{cases}$$

- Easy to visualize on a line.

[Figure: the $Y^*$ line]

- The idea is to map the latent variable $Y^*$ onto $Y$.
- As in previous slides, we identify thresholds, or cutpoints, that partition the space.
- Since these are unknown, we estimate them given the observed data.
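The cut-point rule above can be sketched in a few lines of R. This is only an illustration under made-up values: the slope `beta`, the cutpoints `alpha`, the seed, and the simulated covariate `x` are all assumptions, not quantities from the notes.

```r
# Hypothetical illustration: discretize a latent variable with three cutpoints.
set.seed(213)                      # arbitrary seed, for reproducibility only
n     <- 1000
x     <- rnorm(n)                  # one made-up covariate
beta  <- 0.8                       # assumed slope, not an estimate from the notes
ystar <- beta * x + rlogis(n)      # latent Y* = x * beta + e, with logistic errors
alpha <- c(-1, 0, 1.5)             # assumed cutpoints alpha_1 < alpha_2 < alpha_3

# Cut-point rule: Y = 1 if Y* <= alpha_1, ..., Y = 4 if Y* > alpha_3.
# right = TRUE makes each interval closed on the right, matching the "<=" in the rule.
y <- cut(ystar, breaks = c(-Inf, alpha, Inf), labels = FALSE, right = TRUE)

table(y)                           # the observed 4-category response
```

In real data only `y` and `x` are observed; `ystar` and the cutpoints are what the ordinal model has to recover.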
### Utility Motivation (continued)

- With likelihood, this means we need to identify some function.
- That is, a distribution function suitable for ordinal categorical data.
- Equivalently, specify a distribution function for $\varepsilon$ in equation (1).

### Probability

- Getting to some data.
- $Y$ is a $J$-category variable.
- In terms of cumulative probabilities:

$$C_{ij} = \Pr(y_i \le j \mid x_i) = \sum_{k=1}^{j} \Pr(y_i = k \mid x_i), \qquad j = 1, \ldots, J$$

- The category probabilities must sum to 1, so the last cumulative probability $C_{iJ}$ equals 1.
- Because of this, we have to impose a constraint on the model.
- Only $J-1$ cumulative probabilities are uniquely identified.
- Thus only $J-1$ thresholds need to be estimated.
- Define a model:

$$C_{ij} = F(\alpha_j - x_i\beta)$$

- Conditional probabilities (four categories):

$$\Pr(y_i = j \mid x) = \begin{cases} F(\alpha_1 - x\beta) & j = 1 \\ F(\alpha_2 - x\beta) - F(\alpha_1 - x\beta) & j = 2 \\ F(\alpha_3 - x\beta) - F(\alpha_2 - x\beta) & j = 3 \\ 1 - F(\alpha_3 - x\beta) & j = J \end{cases}$$

- Looks similar to the cutpoint rule from before. In fact, it is identical.
- Note something critically important in the relationships shown above: the effects of the covariates are invariant to the cutpoints.
- This means the cutpoints have the effect of shifting the probability curve to the left or right.

### Ordinal Models

- Probability curves will be identical, just shifted along the x-axis.
- Generically, this is called the parallel slopes assumption.
- In the logit setting, it is called the proportional odds assumption.
- To test this assumption, we first need to know how to estimate a model.
- A couple of good candidates for $F(\cdot)$ are the logistic distribution and the standard normal.
- Under probit we get:

$$\Pr(y_i = j \mid x) = \Pr(\alpha_{j-1} < Y_i^* \le \alpha_j) = \Phi(\alpha_j - x\beta) - \Phi(\alpha_{j-1} - x\beta)$$

### Ordinal Models: Ordinal Probit

- I've written the probit function using the latent utility motivation.
- It need not be written in this way (see point 2 from two slides back).
- Sometimes $\alpha$ will be referenced as $\tau$.
- Cumulative probabilities in probit. Let $Z$ be the probit linear prediction, $Z = x\beta$:

$$\begin{aligned} \Pr(y_i = 1 \mid x) &= \Phi(\alpha_1 - Z) \\ \Pr(y_i = 2 \mid x) &= \Phi(\alpha_2 - Z) - \Phi(\alpha_1 - Z) \\ \Pr(y_i = 3 \mid x) &= \Phi(\alpha_3 - Z) - \Phi(\alpha_2 - Z) \\ \Pr(y_i = 4 \mid x) &= 1 - \Phi(\alpha_3 - Z) \end{aligned}$$

- These probabilities fully summarize the probability space for a 4-category response variable.

### Ordinal Models: Ordinal Logit

- Retrieving the ordinal logit means specifying $F(\cdot)$ in terms of the logistic:

$$\Pr(y_i \le j \mid x) = \frac{\exp(\alpha_j - x\beta)}{1 + \exp(\alpha_j - x\beta)}$$

- Linear model for log odds:

$$\log\left[\frac{\Pr(y_i \le j \mid x)}{1 - \Pr(y_i \le j \mid x)}\right] = \alpha_j - x\beta, \qquad j = 1, 2, \ldots, J-1$$

- Proportional odds property: for two covariate profiles $x_1$ and $x_2$, the ratio of cumulative odds does not depend on $j$:

$$\frac{\Pr(y_i \le j \mid x_1) / \Pr(y_i > j \mid x_1)}{\Pr(y_i \le j \mid x_2) / \Pr(y_i > j \mid x_2)} = \exp\left[-(x_1 - x_2)\beta\right]$$

- Conditional probabilities, letting $Z$ be the linear predictor:

$$\begin{aligned} \Pr(y_i = 1 \mid x) &= \frac{1}{1 + \exp(Z - \alpha_1)} \\ \Pr(y_i = 2 \mid x) &= \frac{1}{1 + \exp(Z - \alpha_2)} - \frac{1}{1 + \exp(Z - \alpha_1)} \\ \Pr(y_i = 3 \mid x) &= \frac{1}{1 + \exp(Z - \alpha_3)} - \frac{1}{1 + \exp(Z - \alpha_2)} \\ \Pr(y_i = 4 \mid x) &= 1 - \frac{1}{1 + \exp(Z - \alpha_3)} \end{aligned}$$

- As with binary logit/probit, the ordinal versions will be very similar.
- Virtually indistinguishable from one another in many cases.

### Estimation

- Estimation through standard likelihood-based methods.
- Function is generally well behaved. Newton-Raphson is commonly hardcoded; so is Fisher scoring.
- General log-likelihood form, where $d_{ij} = 1$ if $y_i = j$ and 0 otherwise, with $\alpha_0 = -\infty$ and $\alpha_J = +\infty$:

$$\log L = \sum_{i=1}^{n} \sum_{j=1}^{J} d_{ij} \log\left[F(\alpha_j - x_i\beta) - F(\alpha_{j-1} - x_i\beta)\right]$$

- Estimation in R using the MASS package and the `polr` function. Stata's `ologit` or `oprobit` works.
- Turn to example using immigration data. First, logit.

Ordinal logit in R:

    > ologit.mod <- polr(y ~ pid + education + income + latino, method = c("logistic"))
    > summary(ologit.mod)

    Re-fitting to get Hessian

    Call:
    polr(formula = y ~ pid + education + income + latino, method = c("logistic"))

    Coefficients:
                    Value Std. Error     t value
    pid       -0.468236666 0.11984850 -4.18692621
    education  0.02912987 0.04249420  0.6855021
    income     0.19449400 0.07509927  2.5898254
    latino    -1.87699413 0.23360995 -8.03041547

    Intercepts:
          Value   Std. Error t value
    1|2  -1.3725  0.3082     -4.4532
    2|3   0.2447  0.2984      0.8203
    3|4   1.5395  0.3026      5.0878

    Residual Deviance: 1104.433
    AIC: 1118.433
    (6 observations deleted due to missingness)

### Ordinal Logit: Interpretation

- Coefficients are log odds ratios.
- Signs inform you about how the covariate is related to moving up or down the scale, in logit form.
- We would want to compute odds ratios or probabilities.
- Either is simple to do and identical to what we've previously seen.
- Main complication with probabilities is that there are four of them.
- Consider the Latino coefficient of $-1.88$.
- Log odds of responding in higher versus lower categories is $-1.88$.
- Odds ratio: $e^{-1.88} \approx 0.15$.
- An odds ratio less than 1 means that the odds of a Latino respondent answering in higher vs. lower categories are 0.15 times those of non-Latino respondents.
- We can flip the interpretation. If $e^{-1.88}$ is the odds of responding up the scale, then $(e^{-1.88})^{-1} = 1/e^{-1.88}$ gives the odds of responding in lower vs. higher categories.
- These odds are 6.52. This is easier to interpret: the odds that a Latino responds in category $j-1$ vs. category $j$ are 6.52 times those of a non-Latino.
- Probabilities: simple. Let the covariate profile be a Democratic Latino respondent with mean income and mean education.
- This is covariate profile PID = 1, Latino = 1, Income = 3, Education at its mean.

Probabilities for this profile in R:

    > z <- coef[1]*1 + coef[2]*1 + coef[3]*3 + coef[4]*1
    > p1 <- 1/(1 + exp(z + 1.3725486))
    > p2 <- 1/(1 + exp(z - 0.2447428)) - 1/(1 + exp(z + 1.3725485))
    > p3 <- 1/(1 + exp(z - 1.5394910)) - 1/(1 + exp(z - 0.2447428))
    > p4 <- 1 - 1/(1 + exp(z - 1.5394910))
    > p1
    [1] 0.5882365
    > p2
    [1] 0.2888005
    > p3
    [1] 0.08530286
    > p4
    [1] 0.03666003

- These are the conditional probabilities.
- I chose the scenario.
- Should look a lot like binary logit, mechanically.
- These probabilities must sum to 1 (something has to happen).
- Plots of probabilities are helpful.
- Odds ratio interpretation is also a useful thing to do here.
- What about probit?
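As a rough sketch of the arithmetic behind the ordinal logit probabilities and the odds ratio, the helper below turns a set of cutpoints and a linear predictor into category probabilities. The function name `ologit_probs` and the value of `z` are hypothetical; the cutpoints are the three intercepts reported in the `polr` output, and $-1.88$ is the Latino coefficient quoted in the slides.

```r
# Category probabilities under the ordinal logit, given cutpoints and a linear predictor.
# ologit_probs is a hypothetical helper written for this illustration.
ologit_probs <- function(z, alpha) {
  cum <- c(plogis(alpha - z), 1)   # Pr(y <= j | x) for j = 1..J; the last one is 1
  diff(c(0, cum))                  # successive differences give Pr(y = j | x)
}

alpha <- c(-1.3725, 0.2447, 1.5395)  # intercepts 1|2, 2|3, 3|4 from the polr output
z     <- -1.73                       # assumed linear predictor for one covariate profile

p <- ologit_probs(z, alpha)
p                                    # four category probabilities
sum(p)                               # equals 1 up to floating-point error

exp(-1.88)                           # odds ratio for the Latino coefficient, about 0.15
1 / exp(-1.88)                       # flipped: odds of lower vs. higher categories
```

Writing the probabilities as differences of cumulative probabilities makes the "must sum to 1" property automatic, whatever value `z` takes.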
Ordinal probit in R:

    > oprobit.mod <- polr(y ~ pid + education + income + latino, method = c("probit"))
    > summary(oprobit.mod)

    Re-fitting to get Hessian

    Call:
    polr(formula = y ~ pid + education + income + latino, method = c("probit"))

    Coefficients:
                    Value Std. Error   t value
    pid       -0.34138676 0.07114005 -4.798784
    education  0.01932492 0.02525182  0.765288
    income     0.12393886 0.04420907  2.803472
    latino    -1.03617869 0.13033518 -7.942435

    Intercepts:
          Value   Std. Error t value
    1|2  -0.7489  0.1791     -4.1826
    2|3   0.1839  0.1768      1.0401
    3|4   0.9499  0.1784      [illegible]

    Residual Deviance: 1108.417
    AIC: 1122.417
    (6 observations deleted due to missingness)

### Ordinal Models: Ordinal Probit

- Coefficients are of the same sign and significance as logit.
- Interpretation is different.
- There is no odds ratio interpretation (same as binary probit).
- Consider probabilities for the same scenario from before.

Probabilities for the same profile in R:

    > z <- coef[1]*1 + coef[2]*1.5 + coef[3]*3 + coef[4]*1
    > p1 <- probit(-0.748872 - z, inverse = TRUE)
    > p2 <- probit(0.1838660 - z, inverse = TRUE) - probit(-0.7489 - z, inverse = TRUE)
    > p3 <- probit(0.9499398 - z, inverse = TRUE) - probit(0.1838660 - z, inverse = TRUE)
    > p4 <- 1 - probit(0.9499398 - z, inverse = TRUE)
    > p1
    [1] 0.5882385
    > p2
    [1] 0.2883278
    > p3
    [1] 0.1058388
    > p4
    [1] 0.03158003

- Probabilities are essentially the same as in logit.
- The choice of models will generally be a matter of preference.
- I prefer logit because of the odds ratio interpretation.
- And speaking of odds ratios...

### Standard Practice with Ordinal Response Variables

- Assume $Y$ is equal-interval scored and use normal theory methods (OLS).
- Apply ordinal logits/probits but rarely evaluate the proportional odds assumption.
- Or do both: conduct an eyeball test and report OLS.
- We can do better than this.
- Consider the following application from Jones and Westerland (unpublished).
- DV is attitudes on affirmative action (Likert-type scale).

### Application

Table 1: Support for Affirmative Action, OLS and Ordinal Logit Estimates (standard errors in parentheses)

| Variable | OLS | Proportional Odds |
|---|---|---|
| Symbolic Racism | 1.375 (0.111) | 2.562 (0.207) |
| Racial Prejudice | 0.383 (0.193) | 0.728 (0.356) |
| Education | 0.259 (0.102) | 0.510 (0.182) |
| Ideology | 0.066 (0.027) | 0.117 (0.047) |
| Female | 0.025 (0.049) | 0.075 (0.088) |
| Intercept | 1.479 (0.127) | |
| $\alpha_1$ | | 0.652 (0.233) |
| $\alpha_2$ | | 1.888 (0.236) |
| $\alpha_3$ | | [illegible] (0.245) |
| Log-likelihood | [illegible] | [illegible] |

### Non-proportionality

- Both ordinal logit and probit make the parallel slopes assumption.
- In the logit setting, this is equivalent to the proportional odds property.
- It's a good property insofar as, if it holds, all we need to know about movement over the scale is $\beta$.
- But what if it doesn't hold?
- This may mean that one or more covariates has a differential effect over the range of the scale.
- Perhaps stronger effects above vs. below a midpoint.
- Proportionality tests are available.

### Wald Test

- The Wald test is a goodness-of-fit test.
- Useful for testing parameter restrictions.
- Consider the Wald $\chi^2$. Basic form of the restriction:

$$Q\beta = r$$

- $Q$ is a matrix of constants and $r$ is a vector of constants.
- $\beta$ is the vector of parameters.
- Basic form of the statistic:

$$W = (Q\hat\beta - r)'\left[Q\,\hat V(\hat\beta)\,Q'\right]^{-1}(Q\hat\beta - r)$$

- Suppose we have three parameters: $\beta_0$, $\beta_1$, and $\beta_2$.
- Interested in $\beta_1 = \beta_2 = 0$:

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

- In matrix form this symbolizes what we're doing: are the coefficients jointly 0?
- Any other constraint could be imposed (single coefficient, multiple coefficients, etc.).
- For this test, without proof, see Long, p. 91.
- $z$ is the z ratio.
- The Wald test for proportionality is essentially evaluating across-equation parameters for $J-1$
logits.

### Proportionality Tests

- Proportionality tests have been proposed: Brant (Biometrics 1990); Peterson and Harrell (Applied Statistics 1990).
- Basic concept (much simplified):
- Estimate $J-1$ binary logits, coding the outcome 1 if $y > m$ and 0 if $y \le m$.
- Extract estimated parameter vectors and covariance matrices from each model.
- Evaluate the hypothesis that $\beta_{k1} = \beta_{k2} = \cdots = \beta_{k,J-1}$.
- Alex Mayer has coded this up in R; can do in Stata thanks to Scott Long.
- Essentially a series of Wald tests over parameter restrictions.

Table 2: Testing the Proportional Odds Assumption

| Variable | Coefficient, $y > 1$ | Coefficient, $y > 2$ | Coefficient, $y > 3$ |
|---|---|---|---|
| Symbolic Racism | 1.822 | 2.458 | 3.037 |
| Racial Prejudice | 1.384 | 0.219 | 0.664 |
| Education | 0.209 | 0.496 | 0.651 |
| Ideology | 0.116 | 0.105 | 0.139 |
| Female | -0.118 | 0.144 | 0.052 |
| Constant | -0.279 | -1.651 | -3.645 |

| Variable | $\chi^2$ | $p > \chi^2$ | df |
|---|---|---|---|
| All | 29.33 | 0.001 | 10 |
| Symbolic Racism | 12.37 | 0.002 | 2 |
| Racial Prejudice | 8.15 | 0.017 | 2 |
| Education | 2.29 | 0.319 | 2 |
| Ideology | 0.33 | 0.849 | 2 |
| Female | 5.79 | 0.055 | 2 |

### Implications

- Basic property of ordinal logit does not hold.
- Therefore, the model is inappropriate, statistically speaking.
- Substantively, you may be missing out on a lot of stuff.
- Why? Parameters may vary over the scale scores.
- This is potentially very interesting.
- Yet it almost always goes untested.
- There are easy-to-implement models.

### Models: Nonproportional Odds

- Models for nonproportional odds (nonparallel regression).
- Proposed most fully by Peterson and Harrell 1990 (see also Williams 2006).
- Though McCullagh 1980 and others proposed such models.
- Basic idea: let regression parameters be unconstrained over scale scores.

### Some Models for Nonproportional Odds

- Generalized model:

$$\log\left[\frac{\Pr(y \le j \mid x)}{\Pr(y > j \mid x)}\right] = \alpha_j - x\beta_j, \qquad j = 1, 2, \ldots, J-1$$

- Partial proportional odds:

$$\log\left[\frac{\Pr(y \le j \mid x)}{\Pr(y > j \mid x)}\right] = \alpha_j - x\beta - t\gamma_j, \qquad j = 1, 2, \ldots, J-1$$

- Restricted generalized logit:

$$\log\left[\frac{\Pr(y \le j \mid x)}{\Pr(y > j \mid x)}\right] = \alpha_j - x_1\beta_1 - x_2\beta_{2j}, \qquad j = 1, 2, \ldots, J-1$$

### Unconstrained Partial Proportional Odds

- Originators: Peterson and Harrell 1990.
- More details. Probabilities:

$$\Pr(Y \ge j \mid x) = \frac{1}{1 + \exp(-\alpha_j - x\beta - t\gamma_j)}, \qquad j = 1, 2, \ldots, k$$

- $t$ are the $q$ covariates exhibiting nonproportionality, with associated parameter vector $\gamma$; $x$ are the $p$ covariates having proportionality, with associated parameter vector $\beta$.
- Under the original model, $\gamma_j$ gives the increment associated with the $j$th cumulative logit (Peterson and Harrell, p. 208).
- If $\gamma = 0$, proportionality holds and the proportional odds model is obtained.

### Estimation and Implementation

- Log likelihood:

$$\log L = \sum_{i=1}^{n} \sum_{j=0}^{k} d_{ij} \log \Pr(Y = j \mid x_i) = \sum_{i=1}^{n} \sum_{j=0}^{k} d_{ij} \log P_{ij}$$

- Has the same form as ordinal logit, with the difference that cumulative probabilities in UPP have nonproportional factors.
- Implementation:

1. Evaluate the proportional odds assumption (global or covariate-specific tests).
2. Constrain covariates with proportional odds to be equal over logits.
3. Maximize the log likelihood.

### Application: UPP

Table 3: Support for Affirmative Action, Partial Proportional Odds (UPP) Model

[Table 3 survives only in part; the coefficient column is illegible. Legible standard errors: Symbolic Racism 0.285, with two increment rows at 0.257 and 0.358; Racial Prejudice 0.495, with increment rows at 0.440 and 0.586; Female 0.125, with increment rows at 0.111 and 0.145; Education 0.182; Ideology 0.047.]

### Application: Restricted Generalized Ordinal

Table 3a: Support for Affirmative Action, Partial Proportional Odds (Generalized Ordinal Logit), standard errors in parentheses

| Variable | $y > 1$ | $y > 2$ | $y > 3$ |
|---|---|---|---|
| Symbolic Racism | 1.788 (0.285) | 2.588 (0.245) | 3.133 (0.280) |
| Racial Prejudice | 1.563 (0.495) | 0.127 (0.421) | 0.752 (0.462) |
| Education | 0.478 (0.182) | 0.478 (0.182) | 0.478 (0.182) |
| Ideology | 0.122 (0.047) | 0.122 (0.047) | 0.122 (0.047) |
| Female | -0.098 (0.125) | 0.180 (0.101) | 0.049 (0.113) |
| Constant | -0.509 (0.283) | -1.698 (0.261) | -3.639 (0.299) |

$n = 1744$; log-likelihood $= -2274.293$

### Summing Up: Ordinality

- The OLS strategy is really bad in this context.
- Even the putatively correct model (ordinal logit/probit) may not hold.
- It must surely be true that if the logit model doesn't fit, OLS is even more awkward.
- Further, potentially interesting (substantively, theoretically, etc.) information is overlooked.
- Given the prevalence of ordinal scales in research, these models would seem useful.

### What about unordered response variables?

- Nominal categories without any natural ordering to them.
- Makes no sense to think about latency spanning the range of $y$.
- The baseline category, or referent category, is utterly arbitrary.
- Career decisions.
- Models for unordered outcomes are common.
- The most common is multinomial logit.

### Multinomial Logit

- The model is given by $J-1$ nonredundant logits, here with category 1 as the baseline:

$$\log\left[\frac{\Pr(Y = j \mid x)}{\Pr(Y = 1 \mid x)}\right] = \alpha_j + x\beta_j, \qquad j = 2, \ldots, J$$

- As probabilities:

$$P_j = \frac{\exp(\alpha_j + x\beta_j)}{1 + \sum_{h=2}^{J} \exp(\alpha_h + x\beta_h)}$$

- Likelihood:

$$\log L = \sum_{i=1}^{n} \sum_{j=1}^{J} d_{ij} \log P_{ij}$$

### MNL

- Straightforward generalization of binary logit.
- Baseline category is utterly arbitrary.
- Many parameters: $(J-1) \times (K+1)$.
- Easy to impose constraints if necessary.
- Let's see an application using an ordinal variable. This is OK to do.
- Item is on the effect immigrants will have on taking away jobs: 1 = extremely likely, 2 = very likely, 3 = somewhat likely, 4 = not very likely.
- 5 independent variables.
- In R
I'm using VGAM.

Multinomial logit in R:

    > modMNL <- vglm(formula = ImmJobs ~ TraitsHispanics + IDSelfPlace + gdiff + Morals + PersonalRetro, family = multinomial)
    > summary(modMNL)

    Call:
    vglm(formula = ImmJobs ~ TraitsHispanics + IDSelfPlace + gdiff + Morals + PersonalRetro, family = multinomial)

    Pearson Residuals:
                           Min       1Q   Median       3Q    Max
    log(mu[,1]/mu[,4]) -2.3372 -0.36022 -0.26508 -0.16844 3.4832
    log(mu[,2]/mu[,4]) -2.3859 -0.43538 -0.32315 -0.15246 2.5376
    log(mu[,3]/mu[,4]) -2.4399 -0.58041 -0.50494  1.06772 1.4258

    Coefficients:
                            Value Std. Error    t value
    (Intercept):1     -4.74929980   0.811718 -5.8509217
    (Intercept):2     -3.58840085   0.745811 -4.8114057
    (Intercept):3     -1.35328556   0.639192 -2.1171814
    TraitsHispanics:1  3.43740323   1.062604  3.234885
    TraitsHispanics:2  3.02004483   0.991489  3.0459685
    TraitsHispanics:3  1.92820198   0.875752  2.2017673
    IDSelfPlace:1     -0.54782027   0.360729 -1.5186478
    IDSelfPlace:2     -0.05799430   0.338796 -0.1711776
    IDSelfPlace:3     -0.02699044   0.304273 -0.0887048
    gdiff:1            0.03888022   0.106332  0.3656499
    gdiff:2            0.02848386   0.099745  0.2855673
    gdiff:3            0.00057201   0.087389  0.0065456
    Morals:1           4.08198880   0.888800  4.5926961
    Morals:2           3.26978148   0.831379  3.9329605
    Morals:3           1.75242820   0.742368  2.3605932
    PersonalRetro:1    1.21720657   0.475106  2.5619698
    PersonalRetro:2    0.89297254   0.450565  1.9818959
    PersonalRetro:3    0.71437247   0.409490  1.744544

    Number of linear predictors: 3
    Names of linear predictors: log(mu[,1]/mu[,4]), log(mu[,2]/mu[,4]), log(mu[,3]/mu[,4])
    Dispersion Parameter for multinomial family: 1
    Residual Deviance: 1963.678 on 2304 degrees of freedom
    Log-likelihood: -981.839 on 2304 degrees of freedom
    Number of Iterations: 4

The same model in Stata:

    . mlogit ImmJobs TraitsHispanics IDSelfPlace gdiff Morals PersonalRetro

    Iteration 0:   log likelihood = -1010.1141
    Iteration 1:   log likelihood = [illegible]
    Iteration 2:   log likelihood = -981.83989
    Iteration 3:   log likelihood = -981.83909
    Iteration 4:   log likelihood = -981.83909

    Multinomial logistic regression       Number of obs =    774
                                          LR chi2(15)   =  56.55
                                          Prob > chi2   = 0.0000
    Log likelihood = -981.83909           Pseudo R2     = 0.0280

| ImmJobs | Variable | Coef. | Std. Err. |
|---|---|---|---|
| 1 | TraitsHispanics | 1.509201 | .8620393 |
| 1 | IDSelfPlace | -.5208298 | .2879748 |
| 1 | gdiff | .0383082 | .0845181 |
| 1 | Morals | 2.329561 | .7070628 |
| 1 | PersonalRetro | .5028341 | .3666781 |
| 1 | _cons | -3.396014 | .676074 |
| 2 | TraitsHispanics | 1.091843 | .7788878 |
| 2 | IDSelfPlace | -.0310039 | .2614588 |
| 2 | gdiff | .0279118 | .0762702 |
| 2 | Morals | 1.517353 | .6380465 |
| 2 | PersonalRetro | .1786001 | .3088751 |
| 2 | _cons | -2.235115 | .6028382 |
| 4 | TraitsHispanics | -1.928202 | .8758145 |
| 4 | IDSelfPlace | .0269904 | .3042855 |
| 4 | gdiff | -.000572 | .0873925 |
| 4 | Morals | -1.752428 | .7424368 |
| 4 | PersonalRetro | -.7143725 | .4095318 |
| 4 | _cons | 1.353286 | .6392388 |

(ImmJobs == 3 is the base outcome)

[The z, P>|z|, and confidence-interval columns of the Stata table are not reliably legible in the source.]

### MNL

- No difference, except the baseline category is chosen differently here.
- BC is category 3.
- Implications? None really.
- You can force Stata to change the BC: `base(4)` at the end of the model statement would do this.
- To see equivalence, note:

Differences of Stata coefficients:

    . display [1]_b[Traits] - [4]_b[Traits]
    3.4374034
    . display [2]_b[Traits] - [4]_b[Traits]
    3.020045
    . display -[4]_b[Traits]
    1.9282022

- These are differences in log odds. In R, the contrast is with category 4; in Stata, it's category 3.
- If you want to know the log odds of 1 vs. 4, simply subtract the log odds of 4 vs. 3 from 1 vs. 3.

### Issues

- Some issues with MNL:
- Ordinality not preserved.
- Contrasts of more natural interest not directly modeled.
- Though in principle, nothing is particularly wrong with MNL.
- Suppose, however, we could reparameterize the MNL.
- Under MNL, the probabilities are given by:

$$P_j = \frac{\exp(x\beta_j)}{1 + \sum_{h=2}^{J} \exp(x\beta_h)}$$

- Implies probability is assessed relative to the baseline category $P_b$.

### An Alternative Parameterization

- Imagine a 4-point scale.
- BCL using category 1 as baseline. Probability contrasts: 4 vs. 1, 3 vs. 1, 2 vs. 1.
- As an alternative, consider these probability contrasts: 4 vs. 3, 3 vs. 2, 2 vs. 1.
- Here we're contrasting probabilities (or odds) with adjacent categories.
- This is the basic idea behind the adjacent category logit model.

### Adjacent Category Logit Model

- The ideas of the previous slide can be summarized as:

$$\log\left(\frac{P_{j+1}}{P_j}\right) = x\beta_j$$

- Coefficients are indexed by $j$, implying a multinomial model.
- Probabilities are derived in terms of adjacent categories:

$$P = \frac{\exp(Z_c)}{1 + \exp(Z_c)}$$

where $Z_c$ corresponds to the linear predictor from the adjacent logit model.

- Possible to estimate a single parameter vector (Agresti 1996).

Adjacent category logit in R:

    > # Produces Adjacent Category Logit Estimates
    > modACL <- vglm(formula = ImmJobs ~ TraitsHispanics + IDSelfPlace + gdiff + Morals + PersonalRetro, family = acat)
    > summary(modACL)

    Call:
    vglm(formula = ImmJobs ~ TraitsHispanics + IDSelfPlace + gdiff + Morals + PersonalRetro, family = acat)

    Number of linear predictors: 3
    Names of linear predictors: log(P[Y=2]/P[Y=1]), log(P[Y=3]/P[Y=2]), log(P[Y=4]/P[Y=3])
    Dispersion Parameter for acat family: 1
    Residual Deviance: 1931.95 on 2304 degrees of freedom
    Log-likelihood: -965.976 on 2304 degrees of freedom
    Number of Iterations: 4

[The Pearson residual summary and most of the coefficient table are not recoverable from the source. Legible rows: Morals:1 Std. Error 0.78043, t value -0.96271; Morals:2 Std. Error 0.64219, t value -2.24668; Morals:3 Value -1.681674, Std. Error 0.74526, t value -2.25636; PersonalRetro:1 Value -0.320741, Std. Error 0.40503, t value -0.79190; PersonalRetro:3 Value -0.685212, Std. Error 0.40960, t value -1.67288.]

- Because the model is a reparameterized BCL, all fit statistics will be identical.
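The move from baseline-category contrasts to adjacent-category contrasts can be checked numerically. The sketch below uses made-up category probabilities for a 4-point scale (the vector `p` is an assumption, not output from the immigration data) and shows that the adjacent log odds are successive differences of the baseline log odds.

```r
# Hypothetical category probabilities for a 4-point scale (made up; they sum to 1).
p <- c(0.40, 0.30, 0.20, 0.10)

# Baseline-category log odds with category 1 as baseline: log(P_j / P_1), j = 2, 3, 4
bcl <- log(p[2:4] / p[1])

# Adjacent-category log odds: log(P_{j+1} / P_j), j = 1, 2, 3
acl <- log(p[2:4] / p[1:3])

# The adjacent contrasts are successive differences of the baseline contrasts,
# which is why one model is a reparameterization of the other.
all.equal(acl, diff(c(0, bcl)))
```

The same subtraction logic is what links the R (category 4 baseline) and Stata (category 3 baseline) multinomial estimates.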
### Adjacent Category Logit Model (continued)

- There are no statistical grounds upon which to adjudicate one over the other.
- Illustration: log odds for the morality scale.

| Contrast | Log odds | Odds |
|---|---|---|
| C2 vs. C1 | -0.751 | exp(-0.751) = 0.471 |
| C3 vs. C2 | -1.443 | exp(-1.443) = 0.236 |
| C4 vs. C3 | -1.682 | exp(-1.682) = 0.186 |

- The contrasts are between adjacent categories.
- Interpretation: "We see that moral traditionalists are less likely to respond in the higher categories as versus the lower categories. Further, this variable seems to not distinguish well categories 2 vs. 1. Interestingly, both categories represent the view that immigrants are likely (extremely or very) to take away jobs."

## Matrices Meet R

Brad Jones, Department of Political Science, University of California, Davis

April 15, 2008

POL 213: Research Methods

### Some Matrix Basics

- What is a matrix?
- A rectangular array of elements arranged in rows and columns:

$$\begin{pmatrix} 55 & 900 & 0 \\ 67 & 1112 & 1 \\ 73 & 525 & 0 \end{pmatrix}$$

- All data are really matrices.
- Suppose we have votes, money, and PID.
- Dimension of a matrix is $r \times c$: here $3 \times 3$.
- Dimensionality: if we had 435 MCs and 3 variables:

$$\begin{pmatrix} 55 & 900 & 0 \\ 67 & 1112 & 1 \\ 73 & 525 & 0 \\ \vdots & \vdots & \vdots \\ 67 & 874 & 1 \end{pmatrix}$$

- Dimensionality: $435 \times 3$.
- Applied issue: always a good idea to know the dimensions of your research problem.
- Symbolism: we can use symbols to denote row and column positions:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

- Often will denote matrices with bold-faced symbols:

$$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

### Vectors

- A matrix with a single column is a column vector:

$$\begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix}$$

- A matrix with a single row is a row vector:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \end{pmatrix}$$

- Dimensions of the column vector?
- Dimensions of the row vector?
- Answers: $3 \times 1$; $1 \times 3$.
- Linear models: $y$ is usually an $n \times 1$ matrix.

### Transposing Matrices

- The transpose of matrix $A$ is another matrix, $A'$.
- It is obtained by interchanging corresponding columns and rows.
- So if $A$ is

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

- then $A'$ is

$$\begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}$$

- That is, the first column of $A$ is now the first row of $A'$.
- The transpose of a column vector is a row vector, and vice versa.
- Note that the dimension of the input matrix $A$ is $r \times c$.
- For the transpose matrix $A'$, the dimension is $c \times r$.
- Transposing matrices is really important in regression-like settings.

### Fun with R

Generating data in R as vectors:

    x0 <- c(1, 1, 1, 1, 1)
    x1 <- c(2, 4, 6, 8, 10)
    x2 <- c(7.3, 8.6, 9.2, 3.2, 1.2)
    y <- c(32, 14, 71, 81, 101)

Creating a matrix X:

    xmat <- cbind(x0, x1, x2)
    xmat
         x0 x1  x2
    [1,]  1  2 7.3
    [2,]  1  4 8.6
    [3,]  1  6 9.2
    [4,]  1  8 3.2
    [5,]  1 10 1.2
    xmat[1, 3]

### Basic Manipulations

Multiplication by a scalar:

    2 * xmat
         x0 x1   x2
    [1,]  2  4 14.6
    [2,]  2  8 17.2
    [3,]  2 12 18.4
    [4,]  2 16  6.4
    [5,]  2 20  2.4

Transposing X to create X':

    xprime <- t(xmat)
    xprime
       [,1] [,2] [,3] [,4] [,5]
    x0  1.0  1.0  1.0  1.0  1.0
    x1  2.0  4.0  6.0  8.0 10.0
    x2  7.3  8.6  9.2  3.2  1.2

Transposing y to create y':

    y
    [1]  32  14  71  81 101
    yprime <- t(y)
    yprime
         [,1] [,2] [,3] [,4] [,5]
    [1,]   32   14   71   81  101

### Multiplying Matrices

- Multiplying matrices will be important for us in order to derive variances and covariances.
- Multiplication of matrices having the same dimensions is not problematic.
- Suppose $A$ is

$$\begin{pmatrix} 2 & 5 \\ 4 & 1 \end{pmatrix}$$

- Suppose $B$ is

$$\begin{pmatrix} 4 & 6 \\ 5 & 8 \end{pmatrix}$$

- The product is a matrix, $AB$. Its elements are obtained by finding the cross-products of rows of $A$ with columns of $B$.
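The row-by-column rule can be written out explicitly in R. The sketch below takes $A$ and $B$ to be the two $2 \times 2$ matrices from the slide and compares a hand-rolled double loop with R's built-in `%*%` operator; the loop is only an illustration of the rule, not how one would multiply matrices in practice.

```r
A <- matrix(c(2, 5,
              4, 1), nrow = 2, byrow = TRUE)
B <- matrix(c(4, 6,
              5, 8), nrow = 2, byrow = TRUE)

# Row-by-column rule written out: element (i, j) is the sum over k of A[i, k] * B[k, j]
AB <- matrix(0, nrow = nrow(A), ncol = ncol(B))
for (i in seq_len(nrow(A))) {
  for (j in seq_len(ncol(B))) {
    AB[i, j] <- sum(A[i, ] * B[, j])
  }
}

AB                        # rows: 33 52 and 21 32
all.equal(AB, A %*% B)    # the built-in operator gives the same matrix
B %*% A                   # generally differs from AB
```

Printing `B %*% A` next to `AB` is a quick way to see that matrix multiplication is not commutative.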
### Multiplying Matrices (continued)

- For these, the product matrix is obtained by:

$$\begin{aligned} 2(4) + 5(5) &= 33 \\ 2(6) + 5(8) &= 52 \\ 4(4) + 1(5) &= 21 \\ 4(6) + 1(8) &= 32 \end{aligned}$$

- Product matrix:

$$AB = \begin{pmatrix} 33 & 52 \\ 21 & 32 \end{pmatrix}$$

- Above, we say we postmultiplied $A$ by $B$, or premultiplied $B$ by $A$.
- Why the language? Multiplication is not always defined on matrices.
- Scalar multiplication has the commutative property: $xy = yx$.
- However, $AB$ may not equal $BA$.
- The product matrix is defined when the number of columns in $A$ equals the number of rows in $B$.
- When this condition holds, we say the matrices are conformable for multiplication.
- If $A$ is $2 \times 2$ and $B$ is $2 \times 2$, $AB$ is conformable and would have dimension $2 \times 2$.
- If $A$ is $2 \times 3$ and $B$ is $3 \times 1$, $AB$ is conformable and would have dimension $2 \times 1$.
- Note $BA$ is not conformable (why?).

### Regression

- The OLS model can be written as:

$$y = X\beta + \varepsilon \qquad (1)$$

- What are the dimensions of each matrix?
- They are:

$$y: n \times 1, \quad X: n \times k, \quad \beta: k \times 1, \quad \varepsilon: n \times 1 \qquad (2)$$

- Note that $X\beta$ is conformable for multiplication (i.e., we have one parameter for each column vector in $X$).

### Basic Manipulations

Premultiplying matrix X by X' to create X'X:

    xprimex <- xprime %*% xmat
    xprimex
         x0    x1     x2
    x0  5.0  30.0  29.50
    x1 30.0 220.0 141.80
    x2 29.5 141.8 223.57

X' had dimension $3 \times 5$; X had dimension $5 \times 3$. Thus X'X is $3 \times 3$ and is obviously conformable for multiplication (i.e., the number of columns of X' equals the number of rows of X: $5 = 5$).

### Inversion

- One of the most important manipulations we need to perform is matrix inversion.
- In matrix algebra, inversion is kind of like division.
- For a scalar, $k^{-1} = 1/k$.
- Multiplication by an inverse has the property $k k^{-1} = 1$.
- The question for us is this: can we find a matrix $B$ such that $BA = AB = I$?
- What is $I$?

### Identity Matrix

- Suppose $A$ is

$$\begin{pmatrix} 1 & 1 \\ 3 & 4 \end{pmatrix}$$

- Suppose $B$ is

$$\begin{pmatrix} 4 & -1 \\ -3 & 1 \end{pmatrix}$$

- Multiplying $A \times B$ yields

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

- Call this $I$, an identity matrix.
- Here $B$ is the inverse matrix, and $A$ is said to be invertible.
- BTW, verify you could obtain $I$.
- A diagonal matrix is a matrix with all 0s on the off-diagonal.
- An identity matrix is a diagonal matrix with 1s down the main diagonal.
- In matrix algebra, the identity matrix serves the role of a 1.

### Determinants

- Determinants are a scalar (a number) associated with a matrix.
- They are useful in many mathematical endeavors.
- With respect to matrix algebra, the determinant is useful in informing us as to whether or not a matrix is invertible.
- A matrix with a determinant of 0 is a singular matrix.
- Important: a singular matrix is NOT invertible. This will cause us trouble.
- A matrix with a non-zero determinant is a nonsingular matrix.
- If all the elements of any row of a matrix are 0, its determinant is 0 and the matrix cannot be inverted.
- If a matrix is not of full rank, its determinant is 0.
- Finding a determinant is a bit complicated beyond the $2 \times 2$ setting (not mathematically complicated, just computationally tedious).
- Consider this $2 \times 2$ matrix:

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

- In this setting, the determinant is found by multiplying the elements on the main diagonal and subtracting the product of the off-diagonal elements:

$$|A| = a_{11}a_{22} - a_{21}a_{12}$$
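A short R check of the inverse and determinant ideas, using the $A$ and $B$ from the identity-matrix slide. The singular matrix `S` is a made-up example added for illustration; `det()` and `solve()` are base R functions.

```r
A <- matrix(c(1, 1,
              3, 4), nrow = 2, byrow = TRUE)
B <- matrix(c( 4, -1,
              -3,  1), nrow = 2, byrow = TRUE)

A %*% B                                   # identity matrix
B %*% A                                   # also the identity, so B is the inverse of A

# 2 x 2 determinant: main-diagonal product minus off-diagonal product
det_A <- A[1, 1] * A[2, 2] - A[2, 1] * A[1, 2]
det_A                                     # 1, so A is nonsingular

# A made-up singular matrix: a row of zeros forces the determinant to 0
S <- matrix(c(1, 2,
              0, 0), nrow = 2, byrow = TRUE)
det(S)                                    # 0
# solve() raises an error for a singular matrix; try() captures it instead of stopping
inherits(try(solve(S), silent = TRUE), "try-error")
```

Checking the determinant before calling `solve()` mirrors the rule of thumb in the slides: if the determinant is 0, stop.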
Determinants
- Beyond the 2 x 2 setting, finding the determinant is time-consuming. Why? LOTS of cross-product manipulations.
- Easy to do computationally. The process is sometimes called matrix expansion.
- The important idea: if a determinant is 0, stop.
- The matrix is not invertible.
- As noted before, if a matrix is not of full rank, it will not be invertible.
- The rank of a matrix is the dimension (or order) of the largest square submatrix whose determinant is not 0.
- To be invertible, an n x n matrix must be of rank n.

Basic Manipulations: Finding the determinant

    > # Define matrix q
    > q1 <- c(3, 6)
    > q2 <- c(5, 2)
    > qmat <- cbind(q1, q2); qmat
         q1 q2
    [1,]  3  5
    [2,]  6  2
    > # The determinant
    > # Finding the determinant in the 2x2 case
    > det <- (qmat[1,1] * qmat[2,2]) - (qmat[2,1] * qmat[1,2]); det
    [1] -24

The determinant is -24, i.e. it's not 0; therefore, the matrix must be invertible.

Basic Manipulations: Finding the inverse

    > # Solving by hand
    > qmat[1,1] / det
    [1] -0.125
    > qmat[2,2] / det
    [1] -0.08333333
    > -qmat[1,2] / det
    [1] 0.2083333
    > -qmat[2,1] / det
    [1] 0.25

This would produce the inverse matrix. I could tell R to do this for us:

    > # Using R to do it for us
    > qinverse <- solve(qmat); qinverse
              [,1]       [,2]
    q1 -0.08333333  0.2083333
    q2  0.25000000 -0.1250000

These cells are identical to what we computed above.

Recall from before: if $BA = AB = I$, then if we multiply matrix $Q$ by $Q^{-1}$ we should obtain an identity matrix. Let's check:

    > qmat %*% qinverse
         [,1] [,2]
    [1,]    1    0
    [2,]    0    1

Formally, this is $QQ^{-1}$, and it's an identity matrix.

At long last, regression
- The model:
$$y = X\beta + \epsilon \tag{3}$$
- The dimensions of each matrix are as before.
- Recall the residual sum of squares in scalar form:
$$\sum e_i^2 = \sum \left(Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \hat\beta_2 X_{2i} - \dots - \hat\beta_k X_{ki}\right)^2 \tag{4}$$
- In matrix form: $e'e$.
- A bit simpler. But why? What are the dimensions of this matrix? Multiply $e'$ by
$e$ and you will get a scalar.
- That is, the dimension of the first vector is 1 x n (a row vector), the second is n x 1 (a column vector), and the dimension of the product is 1 x 1.

Regression
- Thus SSError $= e'e$.
- Note, in matrix terms, what the residual is:
$$e = y - X\hat\beta \tag{5}$$
- The least squares solution is the one that minimizes the sum of the squared errors (hence "least" and "squares").
- To solve for $\hat\beta$, you partially differentiate the SSError.

Regression
- For a two-variable regression, differentiate
$$\sum e_i^2 = \sum \left(Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \hat\beta_2 X_{2i}\right)^2$$
partially with respect to the three unknown parameter estimates. This gives:
$$\frac{\partial \sum e_i^2}{\partial \hat\beta_0} = -2\sum \left(Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \hat\beta_2 X_{2i}\right)$$
$$\frac{\partial \sum e_i^2}{\partial \hat\beta_1} = -2\sum X_{1i}\left(Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \hat\beta_2 X_{2i}\right)$$
$$\frac{\partial \sum e_i^2}{\partial \hat\beta_2} = -2\sum X_{2i}\left(Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \hat\beta_2 X_{2i}\right)$$

Regression
- Setting these to 0 and rearranging terms produces the normal equations:
$$\sum Y_i = n\hat\beta_0 + \hat\beta_1 \sum X_{1i} + \hat\beta_2 \sum X_{2i}$$
$$\sum X_{1i}Y_i = \hat\beta_0 \sum X_{1i} + \hat\beta_1 \sum X_{1i}^2 + \hat\beta_2 \sum X_{1i}X_{2i}$$
$$\sum X_{2i}Y_i = \hat\beta_0 \sum X_{2i} + \hat\beta_1 \sum X_{1i}X_{2i} + \hat\beta_2 \sum X_{2i}^2$$
- These equations can be rewritten yet again in terms of the parameter estimates:
$$\hat\beta_1 = \frac{\sum (X_{1i}-\bar X_1)(Y_i-\bar Y)\sum (X_{2i}-\bar X_2)^2 - \sum (X_{2i}-\bar X_2)(Y_i-\bar Y)\sum (X_{1i}-\bar X_1)(X_{2i}-\bar X_2)}{\sum (X_{1i}-\bar X_1)^2 \sum (X_{2i}-\bar X_2)^2 - \left[\sum (X_{1i}-\bar X_1)(X_{2i}-\bar X_2)\right]^2}$$
$$\hat\beta_2 = \frac{\sum (X_{2i}-\bar X_2)(Y_i-\bar Y)\sum (X_{1i}-\bar X_1)^2 - \sum (X_{1i}-\bar X_1)(Y_i-\bar Y)\sum (X_{1i}-\bar X_1)(X_{2i}-\bar X_2)}{\sum (X_{1i}-\bar X_1)^2 \sum (X_{2i}-\bar X_2)^2 - \left[\sum (X_{1i}-\bar X_1)(X_{2i}-\bar X_2)\right]^2}$$
$$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X_1 - \hat\beta_2 \bar X_2$$

Regression
- In matrix terms, the normal equations are summarized as
$$(X'X)\hat\beta = X'y \tag{6}$$
- What is $X'X$?
- It is a matrix of the raw sums of squares and cross-products for the X variables (including a constant term, which is a vector of 1s).

Remember from before, we created $X'X$ from our simulated data:

    > xprimex <- xprime %*% xmat; xprimex
         x0    x1     x2
    x0  5.0  30.0  29.50
    x1 30.0 220.0 141.80
    x2 29.5 141.8 223.57

Note several things: element (1,1) is simply $n$; (2,1) is $\sum X_{1i}$; (3,1) is $\sum X_{2i}$; (2,2) is $\sum X_{1i}^2$; (3,3) is $\sum X_{2i}^2$; and (3,2) is $\sum X_{2i}X_{1i}$. That is, these are simply functions of the data,
nothing more.

Regression
- From the normal equations, it is clear we have known quantities in $X'X$ and $X'y$.
- We, of course, do not know what the $\hat\beta$ are.
- We solve for them.
- Let's take advantage of the inversion result: recall $QQ^{-1} = I$ if the inverse exists (it may not).
- So consider this. Let us premultiply equation 6 by $(X'X)^{-1}$:
$$(X'X)^{-1}(X'X)\hat\beta = (X'X)^{-1}X'y$$
$$\hat\beta = (X'X)^{-1}X'y \tag{7}$$
- The existence of an inverse is important. Without it, we cannot obtain the OLS estimator:
$$\hat\beta = (X'X)^{-1}X'y \tag{8}$$
- This is the OLS estimator. It shows how the regression parameters are a function of the observed data.

Continuing with our example, let's create $(X'X)^{-1}$:

    > xxinverse <- solve(xprimex); xxinverse
               x0         x1         x2
    x0  7.8403149 -0.6805436 -0.6028904
    x1 -0.6805436  0.0667601  0.0474547
    x2 -0.6028904  0.0474547  0.0539258

Ok. Now let's verify that $(X'X)^{-1}X'X$ is an identity matrix:

    > solve(xprimex) %*% xprimex

The result has 1s on the main diagonal and off-diagonal entries on the order of 1e-14 to 1e-16, i.e. zero up to floating-point error. Looks like an identity matrix!

We can solve for $\hat\beta$ by first creating $X'y$:

    > xprimey <- xprime %*% y; xprimey

The least squares estimates are obtained by $\hat\beta = (X'X)^{-1}X'y$:

    > b <- xxinverse %*% xprimey; b

Are we correct?

    > model1 <- lm(y ~ x1 + x2); summary(model1)

    Call:
    lm(formula = y ~ x1 + x2)

    Residuals:
            1         2         3         4         5
     1.140513 -2.450416  1.399390 -0.009584 -0.079903

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  0.77654    6.02828   0.129    0.909
    x1           0.95050    0.55627   1.709    0.230
    x2          -0.08466    0.49995  -0.169    0.881

    Residual standard error: 2.153 on 2 degrees of freedom
    Multiple R-Squared: 0.8197,  Adjusted R-squared: 0.6395
    F-statistic: 4.548 on 2 and 2 DF,  p-value: 0.1803

Which, by the way, is how you run regression in R.

Regression
- The hat matrix is a useful matrix (you used it a lot in POL 213).
- Predicted values:
$$\hat y = X(X'X)^{-1}X'y \tag{9}$$
- This isn't mysterious: it is $X$ times $\hat\beta$.
- Let
$H = X(X'X)^{-1}X'$; we call $H$ the hat matrix.
- It puts the hat on $y$.
- Thus $\hat y = Hy$.

Let's compute the hat matrix:

    > hat <- xmat %*% xxinverse %*% xprime

And now let's get the fitted values, $\hat y = Hy$:

    > yhat <- hat %*% y; yhat
              [,1]
    ...
    [4,]  8.109584
    [5,] 10.179903

Now the residuals, $y - \hat y$:

    > residual <- y - yhat; residual
                 [,1]
    [1,]  1.140513374
    [2,] -2.450416307
    [3,]  1.399389560
    [4,] -0.009583693
    [5,] -0.079902934

Since $\sum e_i^2$ = SSError, we can compute this quantity using matrix operations:

    > res <- cbind(residual)
    > eprime <- t(res); eprime
    > eprimee <- eprime %*% res; eprimee
             [,1]
    [1,] 9.270078

Or the old-fashioned way:

    > SSError <- sum(residual^2); SSError
    [1] 9.270078

Other quantities. How about the $R^2$?

    > meany <- rep(mean(y), 5)
    > explained <- yhat - meany
    > SSRegress <- t(explained) %*% explained
    > r2 <- SSRegress / (eprimee + SSRegress); r2
              [,1]
    [1,] 0.8197466

This is what we want. So now that we have the variance components, we can rule the world.

Mean square error:

    > MSE <- eprimee / 2; MSE
             [,1]
    [1,] 4.635039

Root mean square error (s.e. of the estimate):

    > MSE^.5
             [,1]
    [1,] 2.152914

Note that the variance-covariance matrix of $\hat\beta$ is $MSE \cdot (X'X)^{-1}$:

    > varcov <- 4.635039 * xxinverse; varcov
              x0         x1         x2
    x0 36.340165 -3.1543460 -2.7944206
    x1 -3.154346  0.3094358  0.2199544
    x2 -2.794421  0.2199544  0.2499482

Now we can obtain the standard errors:

    > seint <- sqrt(varcov[1,1])
    > seb1 <- sqrt(varcov[2,2])
    > seb2 <- sqrt(varcov[3,3])
    > se <- c(seint, seb1, seb2); se
    [1] 6.0282800 0.5562696 0.4999482

Recall:

    > model1 <- lm(y ~ x1 + x2); summary(model1)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  0.77654    6.02828   0.129    0.909
    x1           0.95050    0.55627   1.709    0.230
    x2          -0.08466    0.49995  -0.169    0.881

    Residual standard error: 2.153 on 2 degrees of freedom
    Multiple R-Squared: 0.8197,  Adjusted R-squared: 0.6395
    F-statistic: 4.548 on 2 and 2 DF,  p-value: 0.1803

We've basically just replicated everything here (or could replicate everything here).

POL 213 notes on conditional logit and
multinomial logit

Using data on automobile choice (taken from the Stata website and used in the Stata manual), we have the following data structure:

    . use choice
    . list id car choice dealer sex income in 1/12

          id   car        choice   dealer   sex      income
      1.   1   American        0       18   male       46.7
      2.   1   Japan           0        8   male       46.7
      3.   1   Europe          1        5   male       46.7
      4.   2   American        1       17   male       26.1
      5.   2   Japan           0        6   male       26.1
      6.   2   Europe          0        2   male       26.1
      7.   3   American        1       12   male       32.7
      8.   3   Japan           0        6   male       32.7
      9.   3   Europe          0        2   male       32.7
     10.   4   American        0       18   female     49.2
     11.   4   Japan           1        7   female     49.2
     12.   4   Europe          0        4   female     49.2

In this case, we have three choices a chooser must decide among. There are two sets of covariates: the first set is choice specific (dealer), noted in lecture as x, and the second set is chooser specific (sex and income), noted in lecture as w. What makes w unique? They are constants within choosers. Implications? Given the CL model, the effects of sex and income would seem not to be estimable. To obtain chooser-specific effects, then, we must resort to interactions.

    . gen japan = (car == 2)
    . gen europe = (car == 3)
    . gen sexJapan = sex * japan
    . gen sexEurope = sex * europe
    . gen incJapan = income * japan
    . gen incEurope = income * europe

Note there is an excluded category. Why? Now let's proceed with conditional logit:

    . clogit choice japan europe sexJapan sexEurope incJapan incEurope dealer, group(id)

    Iteration 0:  log likelihood = -284.51561
    Iteration 1:  log likelihood = -251.47313
    Iteration 2:  log likelihood = -250.78678
    Iteration 3:  log likelihood =  -250.7794
    Iteration 4:  log likelihood =  -250.7794

    Conditional (fixed-effects) logistic regression   Number of obs =    885
                                                      LR chi2(7)    = 146.62
                                                      Prob > chi2   = 0.0000
    Log likelihood = -250.7794                        Pseudo R2     = 0.2262

       choice |     Coef.   Std. Err.      z   P>|z|    [95% Conf. Interval]
    ----------+-------------------------------------------------------------
        japan | -1.352189   .6911829   -1.96   0.050   -2.706882    .0025049
       europe | -2.355249   .8526681   -2.76   0.006   -4.026448   -.6840502
     sexJapan | -.5346039   .3141564   -1.70   0.089   -1.150339    .0811314
    sexEurope |  .5704111   .4540247    1.26   0.209    -.319461    1.460283
     incJapan |  .0325318    .012824    2.54   0.011    .0073973    .0576663
    incEurope |   .032042   .0138676    2.31   0.021     .004862    .0592219
       dealer |  .0680938   .0344465    1.98   0.048      .00058    .1356076

I
use the group option to group the data by some identifying variable. Remember, we're assuming that the chooser can choose among any of the three alternatives. Thus, for every individual, there are 3 records of choices in the data.

To interpret the model, we could use odds ratios:

    . clogit, or

    Conditional (fixed-effects) logistic regression   Number of obs =    885
                                                      LR chi2(7)    = 146.62
                                                      Prob > chi2   = 0.0000
    Log likelihood = -250.7794                        Pseudo R2     = 0.2262

       choice | Odds Ratio  Std. Err.      z   P>|z|    [95% Conf. Interval]
    ----------+-------------------------------------------------------------
        japan |  .2586735   .1787907   -1.96   0.050    .0667446    1.002508
       europe |  .0948699   .0808925   -2.76   0.006    .0178376    .5045692
     sexJapan |  .5859013   .1840647   -1.70   0.089    .3165294    1.084513
    sexEurope |  1.768994    .803167    1.26   0.209    .7265405    4.307179
     incJapan |  1.033067    .013248    2.54   0.011    1.007425    1.059361
    incEurope |  1.032561   .0143191    2.31   0.021    1.004874    1.061011
       dealer |  1.070466   .0368737    1.98   0.048     1.00058    1.145232

To prove a point from above, suppose we try to estimate the conditional logit model with w but without the interactions:

    . clogit choice japan europe sex income dealer, group(id)

    note: sex omitted due to no within-group variance
    note: income omitted due to no within-group variance

    Iteration 0:  log likelihood = -286.19981
    Iteration 1:  log likelihood = -258.95846
    Iteration 2:  log likelihood = -258.62878
    Iteration 3:  log likelihood = -258.62825

    Conditional (fixed-effects) logistic regression   Number of obs =    885
                                                      LR chi2(3)    = 130.92
                                                      Prob > chi2   = 0.0000
    Log likelihood = -258.62825                       Pseudo R2     = 0.2020

       choice |     Coef.   Std. Err.      z   P>|z|    [95% Conf. Interval]
    ----------+-------------------------------------------------------------
        japan | -.7130572   .3938727   -1.81   0.070   -1.485034    .0589192
       europe | -1.070849   .5271377   -2.03   0.042    -2.10402   -.0376779
       dealer |  .0339226   .0324258    1.05   0.295   -.0296307     .097476

Now let me rerun the conditional logit model to illustrate its equivalency to MNL (note that I'm dropping the choice attribute, dealer):

    . clogit choice japan europe sexJapan sexEurope incJapan incEurope, group(id)

    Iteration 0:  log likelihood = -284.66485
    Iteration 1:  log likelihood = -253.31044
    Iteration 2:  log likelihood =  -252.7268
    Iteration 3:  log likelihood = -252.72012
    Iteration 4:  log likelihood = -252.72012

    Conditional (fixed-effects) logistic
    regression                                        Number of obs =    885
                                                      LR chi2(6)    = 142.74
                                                      Prob > chi2   = 0.0000
    Log likelihood = -252.72012                       Pseudo R2     = 0.2202

       choice |     Coef.   Std. Err.      z   P>|z|    [95% Conf. Interval]
    ----------+-------------------------------------------------------------
        japan | -1.962652   .6216804   -3.16   0.002   -3.181123   -.7441806
       europe | -3.180029   .7546837   -4.21   0.000   -4.659182   -1.700876
     sexJapan | -.4694799   .3114939   -1.51   0.132   -1.079997     .141037
    sexEurope |  .5388442   .4525278    1.19   0.234    -.348094    1.425782
     incJapan |  .0276854   .0123666    2.24   0.025    .0034472    .0519236
    incEurope |  .0273669    .013787    1.98   0.047     .000345    .0543889

Because conditional logit has as many records for each observation as there are choices, we need to limit our focus to just the chosen alternative to estimate MNL. Below, I'm dropping all cases except for the one representing the chooser's choice.

    . keep if choice == 1
    (590 observations deleted)
    . table car

The above looks like an MNL dependent variable. Let's estimate MNL:

    . mlogit car sex income

    Iteration 0:  log likelihood =  -259.1712
    Iteration 1:  log likelihood = -252.81165
    Iteration 2:  log likelihood = -252.72014
    Iteration 3:  log likelihood = -252.72012

    Multinomial logistic regression                   Number of obs =    295
                                                      LR chi2(4)    =  12.90

Brad Jones
Department of Political Science, University of California, Davis
April 7, 2009

Today: MLE

Preliminaries
- The concept of maximum likelihood
- The theory of MLE
- MLE in practice

Coin Flips and MLE
- The theory of ML has a close connection to probability.
- Suppose we want to know $\theta$.
- Let $\theta = \Pr(H)$; a priori, we know $p = .5$ for fair coins.
- We could specify a model: $\Pr(H \mid p = .5)$.
- And formalize it in terms of the binomial distribution (PDF):
$$f(y \mid n, p) = \binom{n}{y} p^{y}(1-p)^{n-y} \tag{1}$$

Binomial Distribution
- The binomial PDF:
$$f(y \mid n, p) = \binom{n}{y} p^{y}(1-p)^{n-y}$$
- The distribution has two parameters: $n$ is the number of Bernoulli trials; $p$ is the probability of a success.
- Suitable for Bernoulli processes
- or
binary sequences, like a coin flip for example.
- Let's take a look at the pdf.

    n <- 100
    k <- seq(0, n, by = 1)
    plot(k, dbinom(k, n, .75, log = FALSE), type = "l", ylab = "density",
         main = "Binomial Density (Prob. Scale)")
    lines(k, dbinom(k, n, .75), col = "red", lwd = 2)

    n <- 100
    k <- seq(0, n, by = 1)
    plot(k, dbinom(k, n, .50, log = FALSE), type = "l", ylab = "density",
         main = "Binomial Density (Prob. Scale)")
    lines(k, dbinom(k, n, .50), col = "red", lwd = 2)

[Figures: Binomial Density (Prob. Scale), p = .75 and p = .50; x-axis k, y-axis density]

Binomial Distribution
- The peaks represent maximum probability.
- The location of the peak depends on the parameter $p$.
- In passing, for now, note that the location is unaffected by rescaling the y-axis.
- Suppose we log it.

[Figures: Binomial Density (Log Scale), p = .75 and p = .50; x-axis k from 0 to 100]

Binomial Distribution
- The peaks still lie above the same location on the x-axis.
- However, y is no longer a true probability.
- Coin flipping: suppose the number of flips is $n = 9$ and the number of heads (call it $y$) is 4.
- The joint probability is given by
$$p \times p \times p \times p \times (1-p) \times (1-p) \times (1-p) \times (1-p) \times (1-p)$$
- Or, more succinctly, $p^4(1-p)^5$.
- Note this is a really small number. Why?

Binomial Distribution
- To solve for the probability, utilize the binomial distribution:
$$\binom{9}{4}(.5)^4(.5)^5 = .246$$
- We've answered the following: $\Pr(y \mid p)$.
- That is, the probability of 4 heads from 9 flips is about .25, conditional on the parameter $p = .5$.
- This, of course, is a bit trivial.
- But let's pretend we know nothing.
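The 4-heads-in-9-flips calculation can be reproduced in a few lines. This is a small sketch using base R only; the object names are chosen here for illustration. It separates the probability of one particular ordering from the binomial probability, which counts all orderings.

```r
n <- 9; y <- 4; p <- 0.5

# Probability of one particular ordering of 4 heads and 5 tails: p^4 (1 - p)^5
joint <- p^y * (1 - p)^(n - y)

# Multiply by the number of orderings, choose(9, 4), to get the binomial probability
by_hand <- choose(n, y) * joint

# Same quantity from R's built-in binomial density
builtin <- dbinom(y, size = n, prob = p)

c(joint = joint, by_hand = by_hand, builtin = builtin)
```

The `joint` value is tiny because it is a product of nine probabilities, which is the point behind "this is a really small number."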
Binomial Distribution and Likelihood
- Suppose we didn't know $p$. Above, this is an assumption (and a pretty good one).
- Instead, we might want to posit $L(p \mid H)$.
- Which, in words, asks: what is the value of $p$ that maximizes the likelihood of a given sequence of coin flips?
- Here we do not know the parameter, but we know what the data look like.
- More generic: $L(\theta \mid X)$.
- $\theta$ is some parameter (could be a regression coefficient) and $X$ are data.
- What is the value of $\theta$ that maximizes the likelihood of the observed data?

Binomial Distribution and Likelihood
- What is the appeal of this statement?
- It explicitly acknowledges the fact that our data are a fixed sample.
- Given this, the parameters are not fixed and are certainly not known in advance.
- The question is: what value of $\theta$ is most likely, given the observed data?
- The best that we can do is maximize the likelihood of $\theta$ given the observed data.
- This is among the fundamental principles of the theory of maximum likelihood.
- The historically relevant figure here is R.A. Fisher.
- Among the most important statisticians in history.

Binomial Distribution and Likelihood
- Sometimes analytical solutions for $\theta$ do not exist or are very hard to solve.
- We may need to search for the best answer.
- This entails iterating over possible values of $\theta$.
- Since some values of $\theta$ may be more likely than other values, we apply some mathematical criteria to evaluate the possible solutions.
- We'll discuss this in a bit, but let's do a pedagogical walk-through.

Let's flip a coin

    > sequence <- rbinom(100, 1, .46); summary(sequence)

- What did I just do?
- This returns a binary sequence of 100 trials with a mean of .42.
- The mean of a binary variable is equivalent to the proportion of
1s (or successes, or heads).
- What we want to do is solve this: $L(p \mid H)$.
- What is the value of $p$ that maximizes the likelihood of this coin-flip chain?
- Pretending we know nothing about binomials, we could just pick a value of $p$ and evaluate.

Wandering around
- Let y be:

    > y <- mean(sequence) * 100

- We can derive the probability of $n$ Bernoulli trials through the binomial distribution:
$$L = \frac{n!}{y!\,(n-y)!}\, p^{y}(1-p)^{n-y}$$
- OK, so where to start?
- Remember, any permissible value of $p$ is possible (though some are not very likely).

Let's start with p = .10:

    p1 <- choose(100, 42) * .10^42 * (1 - .10)^58

This returns 6.269305e-17 (basically 0). Can we do better? Let's try p = .50:

    p2 <- choose(100, 42) * .50^42 * (1 - .50)^58

This returns 0.02229227. How about p = .40?

    p3 <- choose(100, 42) * .40^42 * (1 - .40)^58

This returns 0.0742072.

We could keep going, or tell R to do it for us.

    # Probability sequence (coarse mesh)
    p <- seq(0, 1, by = .1)
    # Grid search (coarse mesh)
    s1 <- choose(n, y) * p^y * (1 - p)^(n - y)
    > cbind(p, s1)
            p           s1
     [1,] 0.0 0.000000e+00
     [2,] 0.1 6.269305e-17
     ...
     [5,] 0.4 7.420719e-02   <- Biggest
     [6,] 0.5 2.229227e-02
     ...
    [11,] 1.0 0.000000e+00

    # Plot this wrt p
    plot(p, s1, type = "l", col = "red", xlab = "p", ylab = "prob",
         main = "Binomial Probabilities: 42 Heads, 58 Tails")

[Figure: Binomial Probabilities: 42 Heads, 58 Tails; x-axis p, y-axis prob]

Searches
- What do we see?
- When $p = .40$, the likelihood of the data is highest (about .074).
- We iterated to this solution by substituting in 11 possible values for $p$.
- Any issues? Points of concern?
- What we've just done is a very simple (and crude) grid search.
- The search was for the maximum.
- Given the mesh of the grid, we found the maximum at .40.
- Thus our MLE of $p$ is .40 (given this grid).
- Can we do better?

    # Probability sequence (fine mesh)
    p <- seq(.32, .52, by = .01)   # <- Note beginning and ending points on grid. Why do it?
    # Grid
    # search (fine mesh)
    s2 <- choose(n, y) * p^y * (1 - p)^(n - y)
    > cbind(p, s2)
             p       s2
     [1,] 0.35 0.028248
     [2,] 0.36 0.037523
     [3,] 0.37 0.047575
     [4,] 0.38 0.057648
     [5,] 0.39 0.066834
     [6,] 0.40 0.074207
     [7,] 0.41 0.078976
     [8,] 0.42 0.080621   <- Biggest
     [9,] 0.43 0.078990
    [10,] 0.44 0.074314
    [11,] 0.45 0.067161
    [12,] 0.46 0.058320
    [13,] 0.47 0.048670
    [14,] 0.48 0.039037
    [15,] 0.49 0.030093
    [16,] 0.50 0.022292
    [17,] 0.51 0.015866
    [18,] 0.52 0.010846
    [19,] 0.53 0.007119
    [20,] 0.54 0.004484
    [21,] 0.55 0.002708

    # Plot it
    plot(p, s2, type = "l", col = "red", xlab = "p", ylab = "prob",
         main = "Binomial Probabilities: 42 Heads, 58 Tails")

[Figure: Binomial Probabilities: 42 Heads, 58 Tails; x-axis p from 0.35 to 0.50, y-axis prob]

Searches
- What do we see?
- A finer grid mesh returns a different maximum.
- This one is clearly preferable to the coarse-mesh search from before.
- In either case, however, we had to iterate (though in reality there is no need to: the probability of a binary sequence is maximal for $p$ = proportion of 1s).
- But if we didn't know that, we could still get to the right answer.
- In fact, what we've just done is maximum likelihood estimation.
- The value of $p$ that maximizes the likelihood of the observed data is .42.
- Its likelihood is .081.

MLE
- Back to the theory.
- Likelihood is not synonymous with probability; it is proportional to the probability of the data.
- Proportionality:
$$L(\theta \mid Y) = k(Y)\Pr(Y \mid \theta) \propto \prod_{i=1}^{N} p(y_i \mid \theta) \tag{2}$$
- That is, the likelihood is proportional to the joint probability of the data.
- More generic:
$$L(\theta \mid Y) = \prod_{i=1}^{N} f(y_i \mid \theta)$$
- What is $f$?

MLE
- It's a density function.
- What about the product operator?
- Recall the property of independence: if observations are independent, the joint probability is the product of the individual probabilities.
- iid: independently and identically distributed.
- We just talked about independence (and we'll assume it).
- Identically distributed has something
to do with the PDF.
- Coin flips: events are independent, with a DGP governed by the binomial distribution.

Let's generate a binary sequence using the binomial distribution:

    lik <- function(p, y, n) choose(n, y) * p^y * (1 - p)^(n - y)
    p <- seq(.01, .99, by = .01)
    matplot(p, cbind(lik(p, 10, 25), lik(p, 20, 50)), type = "l",
            xlab = "p", ylab = "Likelihood",
            main = "Likelihood Functions for Two Binomial Sequences")

[Figure: Likelihood Functions for Two Binomial Sequences; x-axis p, y-axis Likelihood]

MLE
- The MLE is found at the mode.
- Technically, it's the point at which the rate of change in the likelihood with respect to a change in $\theta$ is 0.
- That is,
$$\frac{\partial L}{\partial \theta} = 0 \tag{3}$$
- A tangent to the plot at that point would have a slope of zero.
- Consider our coin-flip example in pictures.

[Figure: Likelihood, Coin Flips; x-axis p from 0.35 to 0.50, y-axis Prob]

MLE
- In principle, the likelihood is obtained by specifying a probability function and deriving the joint likelihood:
$$L(\theta \mid Y) = \prod_{i=1}^{N} f(y_i \mid \theta)$$
- Implicit here is some predefined $f$ (like a binomial, for example).
- In reality, we always work with the log of the likelihood function. Why?
- The joint likelihood is way too small a number to handle: $.5^{1000} = 9.332636 \times 10^{-302}$.
- On the other hand, $\log(.5^{1000}) = -693.1472$.
- Easy to work with, and no trickery: the log transformation is a monotonic transformation of $L$.
- The only effect it has is to change the scale.
- But since the scale has no intrinsic meaning (except in a relative sense), this is not a problem.

MLE
- Note that since $L$ is a joint probability (and a very, very small number), $\log L$ must be negative.
- On your computer output, you will be given $\log L$, and by itself it's a meaningless number.
- MLE revealed:
- Step 1: find a suitable probability function.
- Step 2: Evaluate
the score function:
$$U(\theta) = \frac{\partial \log L(\theta)}{\partial \theta} \tag{4}$$
- If multiple parameters are estimated, evaluate the elements of $U(\theta)$:
$$U_k(\theta) = \frac{\partial \log L(\theta)}{\partial \theta_k} \tag{5}$$
- What is the score function?
- It is the derivative of the log-likelihood with respect to the parameters.

MLE
- "Evaluate" really means setting those derivatives to 0 and solving for $\theta$.
- Why 0? That is the point at which a tangent to the log-likelihood function is flat.
- Flat is good. It means you hit the maximum (of course, it may not be THE maximum).
- Coin flips redux.
- The first derivative for the binomial setting is
$$U(p) = \frac{\partial \log L}{\partial p} = \frac{y}{p} - \frac{n-y}{1-p} \tag{6}$$
- Find the value of $p$ that gives a score of 0.
- This is your maximum.

Coin flips: MLE
- The log-likelihood function:
$$\log L = \log\binom{n}{y} + y\log p + (n-y)\log(1-p) \tag{7}$$
- The score function:
$$U(p) = \frac{\partial \log L}{\partial p} = \frac{y}{p} - \frac{n-y}{1-p} \tag{8}$$
- Easy to do in R.

    loglik <- lchoose(n, y) + y * log(p) + (n - y) * log(1 - p)
    score <- y / p - (n - y) / (1 - p)

Let's compare:

    > cbind(p, score, s2)
             p      score       s2
     [1,] 0.32  45.955880 0.008974
     [2,] 0.33  40.705560 0.013838
     [3,] 0.34  35.650620 0.020268
     [4,] 0.35  30.769230 0.028248
     [5,] 0.36  26.041670 0.037523
     [6,] 0.37  21.450020 0.047575
     [7,] 0.38  16.977930 0.057648
     [8,] 0.39  12.610340 0.066834
     [9,] 0.40   8.333333 0.074207
    [10,] 0.41   4.133940 0.078976
    [11,] 0.42   0.000000 0.080621   <- Maximum (score is essentially 0: 1.421085e-14)
    [12,] 0.43  -4.079967 0.078990
    [13,] 0.44  -8.116883 0.074314
    [14,] 0.45 -12.121210 0.067161
    [15,] 0.46 -16.103060 0.058320
    [16,] 0.47 -20.072260 0.048670
    [17,] 0.48 -24.038460 0.039037
    [18,] 0.49 -28.011200 0.030093
    [19,] 0.50 -32.000000 0.022292
    [20,] 0.51 -36.014410 0.015866
    [21,] 0.52 -40.064100 0.010846

    # Plot it
    plot(p, loglik, type = "l", col = "red", xlab = "p", ylab = "Log-Likelihood")
    abline(h = max(loglik))
    abline(v = .42)

[Figure: Log-Likelihood Function; x-axis p from 0.35 to 0.50, y-axis Log-Likelihood]

Are we done yet?
- Did the first-order
condition really identify a maximum?
- Why might it be wrong? (Think of a coarse mesh in a grid search.)
- To establish whether the first-order condition truly yields a maximum, we need to compute the second derivatives.
- We'll call this the Hessian (named for Ludwig Otto Hesse).
- It looks like this:
$$H(\theta) = \frac{\partial^2 \log L(\theta)}{\partial\theta\,\partial\theta'} \tag{9}$$
- It's a bit ugly looking, but it's really just a matrix of second derivatives.
- This thing needs to be negative definite. Why?

Inverse of the Hessian
- The negative of the expected value of the Hessian is called the Fisher information matrix.
- It looks like
$$I(\theta) = -E[H(\theta)] \tag{10}$$
- The information in this matrix is really, really important.
- The inverse (if it exists) gives us the variance of $\hat\theta$:
$$I(\theta)^{-1} = \mathrm{var}(\hat\theta) = \left(-E[H(\theta)]\right)^{-1}$$

Inverse of the Hessian
- If we can't invert the Hessian, we're stuck.
- The inverse of the information matrix gives us the variances down the main diagonal.
- Take the square root, and you've got your standard errors.

    # Negative log-likelihood for the logit model (optim() minimizes by default)
    llk.logit <- function(param, y, x) {
      cons <- rep(1, length(x[,1]))    # constant term
      x <- cbind(cons, x)
      b <- param[1:ncol(x)]
      xb <- x %*% b
      sum(y * log(1 + exp(-xb)) + (1 - y) * log(1 + exp(xb)))
    }

    > ols.result <- lm(y ~ x); ols.result

    Call:
    lm(formula = y ~ x)

    Coefficients:
    (Intercept)            x
         0.5067       0.7750

    > stval <- ols.result$coeff
    > logit.result <- optim(stval, llk.logit, method = "BFGS", hessian = TRUE, y = y, x = x)
    > pe.logit <- logit.result$par; pe.logit
    (Intercept)           x
     0.03303452  5.09369576
    > var.cov <- solve(logit.result$hessian); var.cov
                 (Intercept)            x
    (Intercept)  0.043813933 -0.001781283
    x           -0.001781283  0.569164199
    > std.err <- sqrt(diag(var.cov)); std.err
    (Intercept)           x
      0.2093178   0.7544297
    > log.like <- -logit.result$value; log.like
    [1] -71.13286
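A self-contained version of the `optim()` workflow can be sketched on simulated data. The data, seed, sample size, and true coefficients (0.5 and 1) below are assumptions made for illustration, not the data behind the slides. The analytic gradient `negscore` is also an addition here. The objective is the negative log-likelihood, so the Hessian that `optim()` returns can be inverted directly to get the variance-covariance matrix.

```r
set.seed(213)                               # arbitrary seed, for reproducibility
n <- 500                                    # assumed sample size
x <- rnorm(n)
X <- cbind(1, x)                            # design matrix with a constant
y <- rbinom(n, 1, plogis(0.5 + 1 * x))      # assumed true values: 0.5 and 1

# Negative log-likelihood of the logit model (optim() minimizes by default)
negll <- function(b, y, X) {
  xb <- drop(X %*% b)
  sum(log1p(exp(xb)) - y * xb)
}

# Negative score vector, i.e. the gradient of negll
negscore <- function(b, y, X) {
  drop(t(X) %*% (plogis(drop(X %*% b)) - y))
}

start <- coef(lm(y ~ x))                    # OLS starting values
fit <- optim(start, negll, gr = negscore, method = "BFGS",
             hessian = TRUE, y = y, X = X)

est    <- fit$par
varcov <- solve(fit$hessian)                # Hessian of -logL, so no sign flip needed
se     <- sqrt(diag(varcov))
loglik <- -fit$value

# Compare with the canned routine
check <- glm(y ~ x, family = binomial(link = "logit"))
cbind(optim = est, glm = coef(check))
cbind(optim = se, glm = sqrt(diag(vcov(check))))
c(optim = loglik, glm = as.numeric(logLik(check)))
```

If the two columns disagree by more than small numerical error, the usual suspects are the starting values or a convergence code other than 0 in `fit$convergence`.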
    > logit1 <- glm(d ~ x1, family = binomial(link = "logit")); summary(logit1)

    Call:
    glm(formula = d ~ x1, family = binomial(link = "logit"))

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -2.0330  -0.6081   0.1545   0.6256   2.0836

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  0.03303    0.20932   0.158    0.875
    x1           5.09370    0.75442   6.752 1.46e-11 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 234.14  on 168  degrees of freedom
    Residual deviance: 142.27  on 167  degrees of freedom
    AIC: 146.27

    Number of Fisher Scoring iterations: 5

MLE
- Last time we covered the basics.
- MLE revealed:
- Step 1: find a suitable probability function.
- Step 2: Evaluate the score function:
$$U(\theta) = \frac{\partial \log L(\theta)}{\partial \theta} \tag{12}$$
- If multiple parameters are estimated, evaluate the elements of $U(\theta)$:
$$U_k(\theta) = \frac{\partial \log L(\theta)}{\partial \theta_k} \tag{13}$$
- What is the score function?
- It is the derivative of the log-likelihood with respect to the parameters.

MLE
- "Evaluate" really means setting those derivatives to 0 and solving for $\theta$.
- Why 0? That is the point at which a tangent to the log-likelihood function is flat.
- Often, however, an analytical solution may not be possible.
- This is particularly true for models with nonlinearities.
- As such, we need to iteratively find the maximum.
- "Iteratively" could mean a grid search.
- Or searches using different kinds of optimizers.

Optimizers
- Algorithms to find the maximum (or minimum, depending on how you parameterize these things).
- Technically, these are methods to solve unconstrained nonlinear optimization problems (a.k.a. "hill climbers").
- There are a variety of them.
- They do not all behave the same way.
- The choice is usually not yours to make
- (although you can control the optimization choice).
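To make the step from grid search to an optimizer concrete, here is a small sketch that hands the coin-flip log-likelihood (42 heads in 100 flips) to two base R routines. The mapping of $p$ through `plogis()` is a choice made for this sketch so that the quasi-Newton search is unconstrained; it is not something the slides prescribe.

```r
n <- 100; y <- 42                     # coin-flip data: 42 heads in 100 flips
loglik <- function(p) dbinom(y, size = n, prob = p, log = TRUE)

# One-dimensional search with optimize(), asking for a maximum
opt1 <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE)
opt1$maximum                          # should sit very close to y / n

# The same problem through optim() with a quasi-Newton method (BFGS).
# p is written as plogis(a) so the search over a is unconstrained;
# optim() minimizes, hence the negative sign.
negll <- function(a) -loglik(plogis(a))
opt2 <- optim(par = 0, fn = negll, method = "BFGS")
p_hat <- plogis(opt2$par)
p_hat
```

Both searches should land on the proportion of heads, which is the answer the fine-mesh grid search reached by brute force.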
Optimizers
- The basic issue is: how do you know you've reached a maximum?
- Commonly used optimizers: Newton-Raphson, or quasi-Newton methods such as BFGS (Broyden, Fletcher, Goldfarb, and Shanno).
- Both make use of the Hessian (or an approximate Hessian) and the score vector to update estimates of $\theta$.
- The hill climb stops when you reach the top.
- The top is usually determined to be reached when the change in parameter values from one iteration to the next is effectively nil.

Hessian (importance of)
- The Hessian matrix:
$$H(\theta) = \frac{\partial^2 \log L(\theta)}{\partial\theta\,\partial\theta'} \tag{14}$$
- Useful to us in order to compute estimates of uncertainty, i.e. variances and covariances.
- The negative of the expected value of the Hessian is called the Fisher information matrix.
- It looks like
$$I(\theta) = -E[H(\theta)] \tag{15}$$
- The inverse (if it exists) gives us the variance of $\hat\theta$:
$$I(\theta)^{-1} = \mathrm{var}(\hat\theta) = \left(-E[H(\theta)]\right)^{-1} = \left(-E\left[\frac{\partial^2 \log L(\theta)}{\partial\theta\,\partial\theta'}\right]\right)^{-1} \tag{16}$$

Can we do this on our own?
- Let's play around with an optimizer.
- Consider the Newton-Raphson optimizer.
- Let's start with the Hessian (assume two parameters, $\theta_1$ and $\theta_2$):
$$H(\theta) = \begin{pmatrix} \dfrac{\partial^2 \log L(\theta)}{\partial\theta_1\,\partial\theta_1} & \dfrac{\partial^2 \log L(\theta)}{\partial\theta_1\,\partial\theta_2} \\ \dfrac{\partial^2 \log L(\theta)}{\partial\theta_2\,\partial\theta_1} & \dfrac{\partial^2 \log L(\theta)}{\partial\theta_2\,\partial\theta_2} \end{pmatrix} \tag{17}$$
- Newton-Raphson algorithm:
$$\hat\theta^{1} = \hat\theta^{0} - \left(\frac{\partial^2 \log L(\hat\theta^{0})}{\partial\theta\,\partial\theta'}\right)^{-1} \frac{\partial \log L(\hat\theta^{0})}{\partial\theta} \tag{18}$$

Newton-Raphson
- We've seen these parts before:
$$\hat\theta^{1} = \hat\theta^{0} - H(\hat\theta^{0})^{-1}\,U(\hat\theta^{0}) \tag{19}$$
- The first part is the inverse of the Hessian (up to sign, the variance-covariance matrix) and the second part is the score vector.
- Intuitively, it makes sense to consider both functions.
- Why? The score function gives you the gradient, or the direction of the change in parameter values, while the Hessian gives you the rate of change of the function (is it increasing quickly or slowly?).
- At a maximum, the first derivative should be 0. If the second derivative is < 0, the function is concave down.
- This means the tangent
lines to the function have a decreasing slope: upward before the maximum, flat at it, and downward after it.

Newton-Raphson
- You achieve convergence when the incremental change in $\hat\theta$ is effectively 0.
- Alternatives to Newton-Raphson:
- Method of steepest ascent: only uses the score function (often not a good choice).
- Scoring methods: replace the inverted Hessian with the inverse of the information matrix.
- BHHH, BFGS, etc.
- They do not all behave exactly the same.
- However, with well-behaved data and pdfs, they will all tend to the same answer.

Logit and Newton-Raphson
- Let's work through an example to illustrate all the moving parts.
- We'll estimate a logit model.
- The distribution function for the logistic distribution is given by
$$\Lambda(Z) = \frac{\exp(Z)}{1 + \exp(Z)} = \frac{1}{1 + \exp(-Z)} \tag{20}$$
$$\Pr(Y = 1) = \Lambda(Z), \qquad Z = \sum_{k=0}^{K} \beta_k X_{ik}$$

Logit and Newton-Raphson
- The logit model is obtained by taking the logistic transformation, which yields
$$\log\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} = \sum_{k=0}^{K} \beta_k X_{ik} \tag{21}$$
- In the language of GLMs, the logit is the link function and $Z$ is the linear predictor.
- The model is linear in the log odds.
- The distribution is nice because (a) it is unbounded in the log odds; (b) $\Pr(Y = 1)$ is bounded in the interval (0, 1); (c) the probabilities are a nonlinear function of the covariates.
- We have a distribution function. All we need are data.

Logit and Newton-Raphson
- I use some simulated data and estimate a logit model with one covariate, x, and a binary response variable, y.
- R code:

    logit1 <- glm(d ~ x1, family = binomial(link = "logit"))

- glm uses the Fisher scoring algorithm.
- The logit log-likelihood is globally concave, so the Fisher method should be generally equivalent to other optimizers.
- Known results: $\hat\beta_0 = 0.033$ (s.e. 0.209); $\hat\beta_1 = 5.094$ (s.e. 0.754).
- Log-likelihood: -71.13.
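The two forms of the logistic distribution function and the log-odds transformation can be checked numerically. This is a minimal sketch in base R; the values of `z` are arbitrary illustration points, not estimates from the notes.

```r
z <- c(-5, -1, 0, 1, 5)                     # arbitrary values of the linear predictor Z

lambda1 <- exp(z) / (1 + exp(z))            # first form of the logistic CDF
lambda2 <- 1 / (1 + exp(-z))                # second, algebraically identical form
builtin <- plogis(z)                        # base R's logistic CDF

cbind(z, lambda1, lambda2, builtin)         # probabilities stay strictly inside (0, 1)

# The logistic (log-odds) transformation takes the probabilities back to Z
log_odds <- log(lambda1 / (1 - lambda1))
cbind(z, log_odds, qlogis = qlogis(lambda1))
```

The first table shows properties (b) and (c): probabilities are bounded and respond nonlinearly to `z`. The second shows that the log odds are linear in `z`.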
Optimizing a Logit Model

- Let's start from scratch.
- We know the log-likelihood for a generic BRM (binary response model) is given by

$$\log L(\beta \mid y) = \sum_i \Big[ y_i \log F(x_i\beta) + (1 - y_i) \log\big(1 - F(x_i\beta)\big) \Big] \tag{22}$$

- Look familiar? Think binomial family.
- Logit log-likelihood:

$$\log L(\beta \mid y) = \sum_i \Big[ y_i \log \Lambda_i + (1 - y_i) \log(1 - \Lambda_i) \Big] \tag{23}$$

- What did I just do?
- We've got a function, we've got data, but we don't have any estimates.
- Time to iterate using the Newton-Raphson update, equation (18).

Optimizing a Logit Model

- We've got to start somewhere.
- How about OLS, i.e. $y = \beta_0 + \beta_1 x_1$?
- This returns $\hat\beta_0 = .507$, $\hat\beta_1 = .775$.
- We'll start with these values.

The function:

    llk.logit <- function(param, y, x) {
      cons <- rep(1, length(x[, 1]))
      # The next three lines are only partly legible in the source;
      # reconstructed as: bind the constant, pick out b, form the linear predictor.
      x <- cbind(cons, x)
      b <- param[1:ncol(x)]
      xb <- x %*% b
      # Negative log-likelihood, suitable for a minimizer
      sum(y * log(1 + exp(-xb)) + (1 - y) * log(1 + exp(xb)))
    }

Iteration 1

- Score vector: $U^{(1)} = (-18.263,\ 18.095)'$ (24)
- Information matrix (inverse): $[I^{(1)}]^{-1} = \begin{pmatrix} .026 & .003 \\ .003 & .146 \end{pmatrix}$ (25)
- Directional vector: $\Delta\hat\beta^{(1)} = [I^{(1)}]^{-1} U^{(1)} = (-.413,\ 2.589)'$ (26)
- Updating: $\hat\beta^{(2)} = \hat\beta^{(1)} + \Delta\hat\beta^{(1)} = (.507,\ .775)' + (-.413,\ 2.589)' = (.094,\ 3.364)'$ (27)
- Log-likelihood: -105.110. Deviance: 210.221.

Iteration 2

- Score vector: $U^{(2)} = (-1.603,\ 4.262)'$ (28)
- Information matrix (inverse): $[I^{(2)}]^{-1} = \begin{pmatrix} .034 & .001 \\ .001 & .299 \end{pmatrix}$ (29)
- Directional vector: $\Delta\hat\beta^{(2)} = [I^{(2)}]^{-1} U^{(2)} = (-.049,\ 1.272)'$ (30)
- Updating: $\hat\beta^{(3)} = (.094,\ 3.364)' + (-.049,\ 1.272)' = (.045,\ 4.636)'$ (31)
- Log-likelihood: -74.467. Deviance: 148.934.

Iteration 3

- Score vector: $U^{(3)} = (-.249,\ .875)'$ (32)
- Information matrix (inverse): $[I^{(3)}]^{-1} = \begin{pmatrix} .041 & -.001 \\ -.001 & .481 \end{pmatrix}$ (33)
- Directional vector: $\Delta\hat\beta^{(3)} = [I^{(3)}]^{-1} U^{(3)} = (-.011,\ .421)'$ (34)
- Updating: $\hat\beta^{(4)} = (.045,\ 4.636)' + (-.011,\ .421)' = (.034,\ 5.057)'$ (35)
- Log-likelihood: -71.329. Deviance: 142.658.

Iteration 4

- Score vector: $U^{(4)} = (-.012,\ .065)'$ (36)
- Information matrix (inverse): $[I^{(4)}]^{-1} = \begin{pmatrix} .044 & -.002 \\ -.002 & .562 \end{pmatrix}$ (37)
- Directional vector: $\Delta\hat\beta^{(4)} = [I^{(4)}]^{-1} U^{(4)} = (-.001,\ .036)'$ (38)
- Updating: $\hat\beta^{(5)} = (.034,\ 5.057)' + (-.001,\ .036)' = (.033,\ 5.093)'$ (39)
- Log-likelihood: -71.134. Deviance: 142.268.

Iteration 5

- Score vector: $U^{(5)} = (.000,\ .000)'$ (40)
- Information matrix (inverse): $[I^{(5)}]^{-1} = \begin{pmatrix} .044 & -.002 \\ -.002 & .569 \end{pmatrix}$ (41)
- Directional vector: $\Delta\hat\beta^{(5)} = [I^{(5)}]^{-1} U^{(5)} = (-2.78537 \times 10^{-6},\ 2.44537 \times 10^{-4})'$ (42)
- Updating: $\hat\beta^{(6)} = (.033,\ 5.093)' + (-.000,\ .000)' = (.033,\ 5.094)'$ (43)
- Log-likelihood: -71.133. Deviance: 142.266.

Optimizing a Logit Model

- At iteration 5, we stop.
- Why? The incremental change in $\hat\beta$ is effectively 0.
- If we kept going, we'd get nowhere.
- Our MLE estimates are thus (standard errors in parentheses): $\hat\beta_0 = 0.033\ (0.209)$, $\hat\beta_1 = 5.094\ (0.754)$; log-likelihood -71.133; deviance 142.266.
- Return to canned routine.

R output:

    glm(formula = d ~ x1, family = binomial, trace = TRUE)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -2.0330  -0.6081   0.1545   0.6256   2.0836

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  0.03303    0.20932   0.158    0.875
    x1           5.09370    0.75442   6.752 1.46e-11 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 234.14  on 168  degrees of freedom
    Residual deviance: 142.27  on 167  degrees of freedom
    AIC: 146.27

    Number of Fisher Scoring iterations: 5

Optimizing a Logit Model

- Note again that glm uses Fisher scoring; our own optimizer used quasi-Newton methods.
- Both return the same results.
- Again, this need not always be the case.
- Good data gone bad.

Red Flags

- If a function is globally concave, the maximum is usually THE maximum.
- If not, local maxima may exist.
- What to do? Try different starting values; attempt a grid search over different ranges of $\theta$.
- Nonconcave functions.
- The Hessian isn't invertible. Can happen frequently. A BIG problem
if your model stops iterating with this kind of error.
- Unproductive steps. Not a big deal unless all the steps are unproductive.
- Flat likelihoods: lots of iterations.
- Data problems: scaling, limited variance.

MLE Estimators, Properties of

- MLE typically only has large-sample properties.
- That is, they hold asymptotically.
- There is a premium for large n...
- ...and a price to pay for small n.
- Consider the optimizers: they are data intensive.
- Consistency is a probabilistic statement:

$$\lim_{n \to \infty} \Pr\big(|\hat\theta - \theta| < \epsilon\big) = 1, \qquad \epsilon > 0 \tag{44}$$

- Or: $\operatorname{plim} \hat\theta = \theta$.
- Cramér-Rao Theorem:

$$\operatorname{var}(\hat\theta) \geq \left(-E\left[\frac{\partial^2 \log L}{\partial\theta\,\partial\theta'}\right]\right)^{-1} \tag{45}$$

- Under certain regularity conditions, the variance of the MLE will attain this lower bound. This is asymptotic efficiency.

Binary Response Models: Logit and Probit

- We now have an idea how these estimators work.
- Finally, let's consider some regression models.
- Sometimes the OLS estimator simply will not work well for us.
- The assumption of normally distributed errors may not hold for some DGPs.
- Further, for some kinds of response variables, classic Gaussian assumptions will not hold.
- To begin moving toward ML methods, let's consider what happens to the OLS estimator in a very specific setting: $Y \in \{1, 0\}$.
- In other words, let's consider a regression model again and hope that it gets us thinking about some other way of proceeding.

When your response variable is binary: A regression sidetrip

- Typically assume $y$ is unbounded.
- Since the distribution of $y$ is a function of $\epsilon$, if we assume $\epsilon \sim N$, then $y$ must also be unbounded.
- Hence, in $y = \beta_0 + \beta x + \epsilon$, $y$ must be unbounded.
- With a dichotomous response variable, this clearly isn't the case.
- Sometimes we think of dichotomous $y$ as representing latent utilities or probabilities.
- As such, $y$ really measures $y^*$, which reflects
underlying and continuous probability in the interval [0, 1].

When your response variable is binary

- Problem: a linear change in $E(y \mid x)$ is not natural.
- Why?

$$\Pr(y = 1) = P_i, \qquad \Pr(y = 0) = 1 - P_i = Q_i$$

$$E(Y) = P_i(1) + Q_i(0) = P_i$$

- In OLS, the coefficients give us the expected change in $\Pr(y = 1)$ for a unit increase in $x$.
- This won't always make sense.

Illustration: Regression on dichotomous y

R code and output (the data file name is illegible in the source):

    library(foreign)  # provides read.dta
    lpdat <- read.dta("d:/classes/pol681/…")
    summary(lpdat)
    attach(lpdat)
    p1 <- plot(x1, d)
    title("Dichotomous y, Continuous x")
    reg <- lm(d ~ x1)
    summary(reg)

    Call:
    lm(formula = d ~ x1)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.79552 -0.25913  0.03246  0.26522  0.81012

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  0.50673    0.02904   17.45   <2e-16 ***
    x1           0.775      0.06814   11.37   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.3774 on 167 degrees of freedom
    Multiple R-Squared: 0.4365,  Adjusted R-squared: 0.4331
    F-statistic: 129.3 on 1 and 167 DF,  p-value: < 2.2e-16

Interpretation

- For a unit change in $x$, $E(y)$ changes by .77 units.

R code:

    yhat <- fitted.values(reg)
    plot(d ~ x1)
    abline(reg)

- Oops: some fitted values exceed 1 (e.g. $\hat{y} \approx 1.25$). These cannot be probabilities.

Interpretation

- We could impose constraints: fix $\hat{y}$ to lie in the unit interval [0, 1].
- This is fairly ad hoc.
- But wait. There are other problems.
- For a given $x$, $\epsilon$ assumes only two values:
  - when $y = 1$: $\epsilon = 1 - \sum_k \beta_k x_{ik}$
  - when $y = 0$: $\epsilon = -\sum_k \beta_k x_{ik}$
- Plot them and the squared residuals.

[Figures: residuals plotted against x; squared residuals plotted against x]

Heteroskedasticity

- This is a problem: $x$ is systematically linked to $\epsilon$.
- Since the variance is a function of the squared residuals, and since the squared residuals are a function of $x$, the variance in the model is a function of
$x$. This is heteroskedasticity.

$$E(y_i) = \sum_k \beta_k x_{ik} = P_i \qquad\text{and}\qquad 1 - \sum_k \beta_k x_{ik} = Q_i$$

- Noting without proof that

$$\operatorname{var}(\epsilon_i) = \Big(1 - \sum_k \beta_k x_{ik}\Big)^2 P_i + \Big(-\sum_k \beta_k x_{ik}\Big)^2 Q_i$$

$$\operatorname{var}(\epsilon_i) = Q_i^2 P_i + P_i^2 Q_i = Q_i P_i (Q_i + P_i) = Q_i P_i (1 - P_i + P_i) = Q_i P_i = \Big(1 - \sum_k \beta_k x_{ik}\Big)\Big(\sum_k \beta_k x_{ik}\Big)$$

- The variance of $\epsilon$ is a direct function of $x$: it's non-constant.

OLS

- Heteroskedasticity means the estimator is no longer minimum variance.
- Thus, even though the $\hat\beta$ are unbiased, the estimator is no longer efficient.
- That is, it's not "best" linear unbiased anymore.
- There are fix-ups: weighted least squares.
- What about an alternative estimator?

Initial consideration of a logit estimator

R code and output:

    logit.mod <- glm(d ~ x1, binomial)
    summary(logit.mod)

    Call:
    glm(formula = d ~ x1, family = binomial)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -2.0330  -0.6081   0.1545   0.6256   2.0836

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  0.03303    0.20932   0.158    0.875
    x1           5.09370    0.75442   6.752 1.46e-11 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 234.14  on 168  degrees of freedom
    Residual deviance: 142.27  on 167  degrees of freedom

    Number of Fisher Scoring iterations: 5

Initial Logit

- Doesn't look like regression of the OLS variety.
- It is a linear model, however.
- But only in one part of it.
- Suppose we like the probability interpretation.
- In regression, the response function for probabilities was awkward, i.e. a straight line.
- $\Pr(y = 1)$ or $\Pr(y = 0)$ is ever increasing/decreasing with respect to $\Delta x$.
- Probabilities probably exhibit marginality.
- What do the logit probabilities look like?

[Figure: fitted logit probabilities]

Logit

- Logit resolves the functional form problem (in terms of the response function in the probabilities).
- Note that in the probabilities, logit is a non-linear model.
- Suppose $E(y) = P(y = 1 \mid x) =$
$\sum_k \beta_k x_{ik}$, and $y$ is binary.

$$\Pr(y = 1 \mid x) = \frac{1}{1 + \exp\left(-\sum_k \beta_k x_{ik}\right)}$$

- Let $Z = \sum_k \beta_k x_{ik}$; then

$$\frac{1}{1 + \exp(-Z)} = \frac{\exp(Z)}{1 + \exp(Z)}$$

- This is the cdf for the logistic distribution.
- Problems solved:
  - $Z$ is unbounded.
  - $P$ must stay in the unit interval.
  - $P$ is nonlinearly related to the parameters (though the logit is linear).
  - Prediction of 1 or 0 is impossible.

Logit

- Why no perfect prediction?

$$\log\left(\frac{P_i}{1 - P_i}\right) = Z$$

- If $p = 1$: division by 0.
- Consider log odds.
- If $p = 0$: $\log(0)$, an undefined operation.
- So what?
- This also holds for probit.

Interpreting Logit

- Often stated: logit coefficients are not naturally interpretable.
- W R O N G
- They're about as easily interpretable as any linear regression model.
- The metric of the logit coefficients is the log odds.
- Thus, a unit change in $x$ is associated with a $\beta$ change in the log odds.
- Since the model is linear in the log odds, the printed logit coefficients are simply not impossible to interpret.
- Signs make sense.
- Hypothesis tests make sense.

Interpreting Logit: Odds Ratios

- The big issue is the scale.
- Conversion from log odds to odds is simple.
- Exponentiate $\beta$ and you get an odds ratio.
- These are easy to interpret (though some argue against using them).
- Example: consider the model we estimated from before: $\log\big(p_i / (1 - p_i)\big) = .03 + 5.09\,x_1$.
- Odds ratio for $x_1$: $\exp(5.0937) = 162.99$.

Interpreting Logit: Odds Ratios

- There are a couple of ways to interpret this.
- $\exp(\beta x_1 \mid x = 1) / \exp(\beta x_1 \mid x = 0) = 162.99$
- Note that the odds ratios will be proportional.
- Take the ratio of odds for adjacent scores and they will all be the same.
  - `exp(coef[2] * 1) / exp(coef[2] * .99)` gives 1.052257
  - `exp(coef[2] * .99) / exp(coef[2] * .98)` gives 1.052257
- Odds ratios are really useful (perhaps most useful)
for dummy variables.

[Figure: plots against x over the range -1.0 to 1.0]

Multicategory Choice Models

Brad Jones, Department of Political Science, University of California, Davis

April 30, 2008

Adjacent Category Logit Model

- The ideas of the previous slide can be summarized as

$$\log\left(\frac{P_j}{P_{j+1}}\right) = \alpha_j + \beta_j' x$$

- Coefficients are indexed by $j$, implying a multinomial model.
- Probabilities are derived in terms of adjacent categories:

$$P(j \text{ vs. } j+1) = \frac{\exp(Z_C)}{1 + \exp(Z_C)}$$

  where $Z_C$ corresponds to the linear predictor from the adjacent logit model.
- See last slide set for application.

Reconsiderations of Ordinality

- Ordinality with respect to covariates may not always hold.
- Distinguishability between categories may not exist given some set of covariates.
- Respondents may not view category labels as meaningfully different.
- This, in turn, may call into question assumptions about ordinality of the response variable.

Anderson's Stereotype Logit Model

- Category labels may be loose stereotypes.
- Demarcations assumed by a researcher may not exist in the mind of the assessor.
- Anderson derived the stereotype logit model (1984, Journal of the Royal Statistical Society, Series B).
- It is really a canonical baseline category logit model.

Stereotype Logit Model

- Derived from the MNL:

$$\log\left[\frac{\Pr(Y = y_j \mid x)}{\Pr(Y = y_J \mid x)}\right] = \alpha_j + \beta_j' x, \qquad j = 1, 2, \dots, J - 1$$

- As we've noted, ordinality is not built into this model.
- Anderson postulated:

$$\log\left[\frac{\Pr(Y = y_j \mid x)}{\Pr(Y = y_J \mid x)}\right] = \alpha_j + \phi_j \beta' x, \qquad j = 1, 2, \dots, J - 1$$

- Main differences:
  - Model is one-dimensional (single parameter vector).
  - Parameters are weighted: $\beta_j = \phi_j \beta$.

Stereotype Logit Model

- The $\phi_j$ are scale
parameters. They are weights on the parameters.
- Identification requires restrictions: $\phi_1 = 1$, $\phi_J = 0$.
- Ordinality exists subject to the constraint $1 = \phi_1 > \phi_2 > \dots > \phi_J = 0$.
- Distinguishability is now testable given $\phi$.
- Model may be built up to incorporate multiple dimensions.
- Saturated model is equivalent to MNL.

R code (rrvglm, from the VGAM package) producing the Anderson stereotype model, with output:

    mod.stereo <- rrvglm(ImmJobs ~ Traits_Hispanics + IDSelfPlace + Morals +
                           gdiff + PersonalRetro, multinomial, Rank = 1)
    summary(mod.stereo)

    Call:
    rrvglm(formula = ImmJobs ~ Traits_Hispanics + IDSelfPlace + Morals +
        gdiff + PersonalRetro, family = multinomial, Rank = 1)

    Pearson Residuals:
                           Min       1Q   Median       3Q    Max
    log(mu[,1]/mu[,4]) -4.4894 -0.38081 -0.25154 -0.13939 3.9102
    log(mu[,2]/mu[,4]) -4.2797 -0.44495 -0.31539 -0.14679 3.5604
    log(mu[,3]/mu[,4]) -4.0210 -0.58232 -0.47162  1.03204 1.6050

    Coefficients:
                        Value Std. Error  t value
    I(lv.mat):1       0.74862   0.094984   7.8816
    I(lv.mat):2       0.39212   0.084823   4.6228
    (Intercept):1    -4.28529   0.811086  -5.2834
    (Intercept):2    -2.78453   0.692810  -4.0192
    (Intercept):3    -0.60683   0.529454  -1.1461
    Traits_Hispanics  2.37890   1.035743   2.2968
    IDSelfPlace      -0.45178   0.337441  -1.3388
    Morals            3.94622   [illegible] [illegible]
    gdiff             3.46905   0.682961   5.0794
    PersonalRetro     0.96762   0.443165   2.1834

    Number of linear predictors:  3
    Names of linear predictors:
    log(mu[,1]/mu[,4]), log(mu[,2]/mu[,4]), log(mu[,3]/mu[,4])

    Dispersion Parameter for multinomial family:   1
    Residual Deviance: 1939.257 on 2312 degrees of freedom
    Log-likelihood: -969.6284 on 2312 degrees of freedom
    Number of Iterations: 4

Anderson Model

- Note restrictions on $\phi$.
- Ordinality constraints seem to hold: $\phi_1 > \phi_2 > \phi_3 > \phi_4$.
- Given statistical significance of $\phi$, the distinguishability condition seems to hold.
- $\phi$ conveys information about the relative difference in weights associated with the covariates.
- Covariate effects are largest for "not very likely" vs. "somewhat likely" categories (4 vs. 3).
- The difference is about .39 on the log odds scale. The difference between 2 vs. 1 is about .25.
- Consider the morals scale. Log-odds for the morality scale:
  - C1 vs. C4: 3.95
  - C2 vs. C4: 3.95 × .749 = 2.96
  - C3 vs. C4: 3.95 × .392 = 1.55
- Odds are:
  - C1 vs. C4: exp(3.95) ≈ 52
  - C2 vs. C4: exp(2.96) ≈ 19
  - C3 vs. C4: exp(1.55) ≈ 5
- Clearly a close connection to the BCL (baseline category logit).
- BCL does not fit better than this model. We might prefer to report this model.

Stata output, one-dimensional stereotype model:

    . slogit ImmJobs Traits_Hispanics IDSelfPlace gdiff Morals PersonalRetro, dim(1)

    Stereotype logistic regression              Number of obs =     774
                                                Wald chi2(5)  =   58.70
    Log likelihood = -969.62836                 Prob > chi2   =  0.0000

     ( 1)  [phi1_1]_cons = 1

         ImmJobs |      Coef.  Std. Err.      z   P>|z|   [95% Conf. Interval]
    -------------+------------------------------------------------------------
    Traits_His~s |  -2.378897   1.036031  -2.30   0.022   -4.409481  -.3483125
     IDSelfPlace |    .451782   .3373466   1.34   0.180   -.2094051   1.112969
           gdiff |   -3.46915   .6829146  -5.08   0.000   -4.807637  -2.130662
          Morals |  -3.946216   .8499482  -4.64   0.000   -5.612084  -2.280348
    PersonalRe~o |  -.9676152   .4431091  -2.18   0.029   -1.836093  -.0991372
    -------------+------------------------------------------------------------
         /phi1_1 |          1  (constrained)
         /phi1_2 |   .7486209   .0949184   7.89   0.000    .5625843   .9346575
         /phi1_3 |   .3921207   .0849045   4.62   0.000    .2257109   .5585306
         /phi1_4 |          0  (base outcome)
    -------------+------------------------------------------------------------
         /theta1 |  -4.285288   .8105076  -5.29   0.000   -5.873853  -2.696722
         /theta2 |  -2.784526   .6948011  -4.01   0.000   -4.146311  -1.422741
         /theta3 |  -.6068263   .5302372  -1.14   0.252   -1.646072   .4324196
         /theta4 |          0  (base outcome)
    (ImmJobs=4 is the base outcome)

Stata output, three-dimensional (saturated) stereotype model:

    . slogit ImmJobs Traits_Hispanics IDSelfPlace gdiff Morals PersonalRetro, dim(3)

    Iteration 0:   log likelihood = -965.97497
    Iteration 1:   log likelihood = -965.97497

    Stereotype logistic regression              Number of obs =     774
                                                Wald chi2(15) =   77.37
    Log likelihood = -965.97497                 Prob > chi2   =  0.0000

     ( 1)  [phi1_1]_cons = 1
     ( 2)  [phi1_2]_cons = 0
     ( 3)  [phi1_3]_cons = 0
     ( 4)  [phi2_1]_cons = 0
     ( 5)  [phi2_2]_cons = 1
     ( 6)  [phi2_3]_cons = 0
     ( 7)  [phi3_1]_cons = 0
     ( 8)  [phi3_2]_cons = 0
     ( 9)  [phi3_3]_cons = 1

         ImmJobs |      Coef.  Std. Err.      z   P>|z|   [95% Conf. Interval]
    -------------+------------------------------------------------------------
    dim1         |
    Traits_His~s |  -2.457876   1.083606  -2.27   0.023   -4.581706  -.3340463
     IDSelfPlace |    .517587   .3660112   1.41   0.157   -.1997817   1.234956
           gdiff |  -3.225396   .7167043  -4.50   0.000   -4.630111  -1.820681
          Morals |  -3.875693   .9036463  -4.29   0.000   -5.646807  -2.104579
    PersonalRe~o |  -1.150033   .4829151  -2.38   0.017   -2.096529  -.2035369
    -------------+------------------------------------------------------------
    dim2         |
    Traits_His~s |  -2.571045    1.01397  -2.54   0.011    -4.55839  -.5837004
     IDSelfPlace |   .0632245   .3422315   0.18   0.853   -.6075369   .7339859
           gdiff |  -1.949108   .6910179  -2.82   0.005   -3.303479  -.5947384
          Morals |  -3.124366   .8375387  -3.73   0.000   -4.765912  -1.482821
    PersonalRe~o |  -.8292917   .4531768  -1.83   0.067   -1.717502   .0589185
    -------------+------------------------------------------------------------
    dim3         |
    Traits_His~s |  -1.914011   .8964045  -2.14   0.033   -3.670932  -.1570909
     IDSelfPlace |   .0359823   .3075682   0.12   0.907   -.5668403   .6388049
           gdiff |  -.5755786   .6194035  -0.93   0.353   -1.789587     .63843
          Morals |  -1.681575    .745406  -2.26   0.024   -3.142544  -.2206061
    PersonalRe~o |  -.6852126   .4096908  -1.67   0.094   -1.488192   .1177666
    -------------+------------------------------------------------------------
         /phi1_1 |          1  (constrained)
         /phi1_2 |          0  (constrained)
         /phi1_3 |          0  (constrained)
         /phi1_4 |          0  (base outcome)
         /phi2_1 |          0  (constrained)
         /phi2_2 |          1  (constrained)
         /phi2_3 |          0  (constrained)
         /phi2_4 |          0  (base outcome)
         /phi3_1 |          0  (constrained)
         /phi3_2 |          0  (constrained)
         /phi3_3 |          1  (constrained)
         /phi3_4 |          0  (base outcome)
    -------------+------------------------------------------------------------
         /theta1 |   -4.33276     .82072  -5.28   0.000   -5.941341  -2.724178
         /theta2 |  -3.341766   .7521791  -4.44   0.000   -4.816009  -1.867522
         /theta3 |  -1.304421   .6487651  -2.01   0.044   -2.575977   -.032866
         /theta4 |          0  (base outcome)
    (ImmJobs=4 is the base outcome)

Stata output, multinomial logit (intermediate iteration values are illegible in the source):

    . mlogit ImmJobs Traits_Hispanics IDSelfPlace gdiff Morals PersonalRetro, base(4)

    Iteration 0:   log likelihood = -1010.1141
    ...
    Iteration 4:   log likelihood = -965.97497

    Multinomial logistic regression             Number of obs =     774
                                                LR chi2(15)   =   88.28
                                                Prob > chi2   =  0.0000
    Log likelihood = -965.97497                 Pseudo R2     =  0.0437

         ImmJobs |      Coef.  Std. Err.      z   P>|z|   [95% Conf. Interval]
    -------------+------------------------------------------------------------
    1            |
    Traits_His~s |   2.457876   1.083606   2.27   0.023    .3340463   4.581706
     IDSelfPlace |   -.517587   .3660112  -1.41   0.157   -1.234956   .1997817
           gdiff |   3.225396   .7167043   4.50   0.000    1.820681   4.630111
          Morals |   3.875693   .9036463   4.29   0.000    2.104579   5.646807
    PersonalRe~o |   1.150033   .4829151   2.38   0.017    .2035369   2.096529
           _cons |   -4.33276     .82072  -5.28   0.000   -5.941341  -2.724178
    -------------+------------------------------------------------------------
    2            |
    Traits_His~s |   2.571045    1.01397   2.54   0.011    .5837004    4.55839
     IDSelfPlace |  -.0632245   .3422315  -0.18   0.853   -.7339859   .6075369
           gdiff |   1.949108   .6910179   2.82   0.005    .5947384   3.303479
          Morals |   3.124366   .8375387   3.73   0.000    1.482821   4.765912
    PersonalRe~o |   .8292917   .4531768   1.83   0.067   -.0589185   1.717502
           _cons |  -3.341766   .7521791  -4.44   0.000   -4.816009  -1.867522
    -------------+------------------------------------------------------------
    3            |
    Traits_His~s |   1.914011   .8964045   2.14   0.033    .1570909   3.670932
     IDSelfPlace |  -.0359823   .3075682  -0.12   0.907   -.6388049   .5668403
           gdiff |   .5755786   .6194035   0.93   0.353     -.63843   1.789587
          Morals |   1.681575    .745406   2.26   0.024    .2206061   3.142544
    PersonalRe~o |   .6852126   .4096908   1.67   0.094   -.1177666   1.488192
           _cons |  -1.304421   .6487651  -2.01   0.044   -2.575977   -.032866
    (ImmJobs==4 is the base outcome)

Anderson Model

- Stereotype model gives a test for ordinality and distinguishability.
- It also allows for implicit comparisons with higher dimensional models.
- If it holds, it has the desirable feature of being reduced rank.

Continuation Ratio

- Assumption of explicit ordering will be made.
- Under the CR model, it is assumed that subjects cannot progress to score $j$ until the $j - 1$ score is passed through.
- Let $Y$ record scores $j = 1, \dots, k$.
- As probabilities:

$$\delta_j = \Pr(Y = j \mid Y \geq j, X = x) \tag{1}$$

- The $\delta_j$ are the continuation ratios.

Continuation Ratio

- Model given by

$$G(\delta_j) = \alpha_j + \beta' x, \qquad j = 1, \dots, k \tag{2}$$

- Specifying a logit link, the model becomes

$$\log\left[\frac{\delta_j(x)}{1 - \delta_j(x)}\right] = \alpha_j + \beta' x \tag{3}$$

  which is sometimes referred to as the continuation ratio logit.
- Specification of the model in terms of the complementary log-log link gives

$$\log\big(-\log(1 - \delta_j(x))\big) = \alpha_j + \beta' x \tag{4}$$

- Major assumption: in (3), proportional odds; in (4), proportional hazards.
- Scale differs, but both often yield similar results.

Continuation Ratio

- Previous models are forward continuation ratios; it is possible to consider backward continuation ratios.
- In general, the models will be different because the invariance condition does not hold.
- Intuition is easy to see.
- Suppose there are four outcomes, $j = 0, 1, 2, 3$. Outcome two cannot occur until after event one, three until after event two, and so on.
- Define a series of cutpoints (0 vs. > 0, 1 vs. > 1, 2 vs. > 2, and 3 vs. > 3) and define a dummy variable at the cutpoints.

Generalized Continuation Ratio

- The explicit continuation ratios are, for the first cutpoint,
$$\log\left[\frac{\Pr(Y = 0)}{\Pr(Y > 0)}\right] = \alpha_0 + \beta' x, \tag{5}$$

for the second,

$$\log\left[\frac{\Pr(Y = 1)}{\Pr(Y > 1)}\right] = \alpha_1 + \beta' x, \tag{6}$$

for the third,

$$\log\left[\frac{\Pr(Y = 2)}{\Pr(Y > 2)}\right] = \alpha_2 + \beta' x, \tag{7}$$

and for the fourth,

$$\log\left[\frac{\Pr(Y = 3)}{\Pr(Y > 3)}\right] = \alpha_3 + \beta' x. \tag{8}$$

Continuation Ratio Model

- In (5)-(8), the number of observations successively dwindles as observations exit the risk set.
- Last logit may have a small n problem.
- Models (5)-(8) can be estimated as a single binary logit model.
- Identify cohort parameters:
  - $\alpha_0 = 1 \iff Y = 0$; 0 otherwise
  - $\alpha_1 = 1 \iff Y = 1$; $0 \iff Y \geq 2$
  - $\alpha_2 = 1 \iff Y = 2$; $0 \iff Y \geq 3$
  - $\alpha_3 = 1 \iff Y = 3$; $0 \iff Y \geq 4$
- Treat these as the $\alpha_j$ in (5)-(8) and estimate a binary logit (or specify some other link).

Close Cousins

- Close connection to the proportional odds model (discussed more in paper).
- PO model with cloglog link is equivalent to CR model with cloglog link (if parameterized in a particular way).
- Proportional odds/hazards must be checked.
- Non-proportionality can be handled by interacting a covariate with the cut points:

$$G(\delta_j) = \alpha_j + \beta' x + \gamma'(\alpha_j \times x), \qquad j = 1, \dots, k \tag{10}$$

- Gives rise to a variable cut points model, with the nice interpretation that covariates can have separate effects for different events.
- Therefore, close connection to partial proportional odds.
- Also, very close connection to mover-stayer models.
- Estimable in R using the Design library.

Choosers and Choices

- There are many logit-type models for multicategory choice data.
- We typically fixate on ordinal or multinomial logits...
- ...even though these are not always going to hold or be of most natural interest.
- Now we want to consider some different ways of thinking about modeling categorical data.
- Let's first consider a conditional logit model.

Conditional Logit

- There exist individuals $i$ who choose among $J$ alternatives ($J \geq 3$).
- E.g. voters ($i$) and parties ($j$), $j = 1, 2, \dots, J$.
- RUM (random utility model) formulation: $U_{ij} = \beta x_i + \epsilon_{ij}$
- $U_{ij} = \beta_{j0} + \beta_{j1}\,\text{Ideology} + \beta_{j2}\,\text{Gender} + \beta_{j3}\,\text{Education}$
- Under multinomial logit, $x$ can have different effects for each $j$.
- Hence, the MNL estimator:

$$P_{ij} = \frac{\exp(x_i \beta_j)}{\sum_{k=1}^{J} \exp(x_i \beta_k)}$$

Conditional Logit

- As in binary logit, we can derive a linear model for the log-odds:

$$\log\left[\frac{\Pr_{ij}}{\Pr_{iJ}}\right] = x_i \beta_j$$

- For $J$ alternatives, there exist $J - 1$ non-redundant logits.
- Under this model, we're explicitly concerned with $x$.
- These are attributes of the $i$th individual.
- That is, attributes of the chooser.
- But suppose choice is conditioned on chooser and choice attributes?
- Leads to consideration of the conditional logit model.

Conditional Logit

- $x_{ij}$ are covariates that vary across choosers and choices (party proximity scores).
- $w_i$ are covariates that vary across choosers but not across choices (gender, education, ideological self-placement).
- Hence, choice is conditional on $z_{ij} = (x_{ij}, w_i)$.
- Consideration of these factors might lead to a utility model like

$$U_{ij} = \beta_1\,\text{Proximity}_{ij} + \gamma_{j0} + \gamma_{j1}\,\text{Ideology}_i + \gamma_{j2}\,\text{Gender}_i + \gamma_{j3}\,\text{Education}_i$$

- We could extend the MNL model in principle.

Conditional Logit

- Extension would lead to the conditional logit estimator:

$$\Pr_{ij} = \frac{\exp(\beta' x_{ij} + \gamma' w_i)}{\sum_{k=1}^{J} \exp(\beta' x_{ik} + \gamma' w_i)} = \frac{\exp(\beta' x_{ij})\exp(\gamma' w_i)}{\sum_{k=1}^{J} \exp(\beta' x_{ik})\exp(\gamma' w_i)} \tag{11}$$

- Simplifying:

$$\Pr_{ij} = \frac{\exp(\beta' x_{ij})}{\sum_{k=1}^{J} \exp(\beta' x_{ik})} \tag{12}$$

- This is the conditional logit estimator.
- What happened to $\gamma$?
- Any factor that is constant across the alternatives drops out of the model: $\exp(\gamma' w_i)$ cancels from the numerator and denominator.
- Since the $w_i$ are fixed within choosers, their effects cannot be estimated.
- Conditional logit is sometimes called a "fixed effects" logit.

Conditional Logit

- The conditional logit estimator is algebraically equivalent to a multinomial logit model.
- It seems a limiting feature of the model is the inability to estimate $\gamma_j$ directly.
- However, we can retrieve these relationships through interactions.
- This
model is commonly applied in models of comparative voting behavior.
- Let me illustrate with some made-up data supplied by Stata.
- For this application, I will use Stata.

Conditional Logit

- Data structure is important.
- Since we're modeling attributes of choices, we need choice-level data.
- Hypothetical example: choice among 3 automobile types (American, European, and Japanese).

Stata listing (entries shown as … are illegible in the source):

    . use http://www.stata-press.com/data/r8/choice
    . list id car choice dealer sex income in 1/12

          id   car        choice   dealer   sex      income
      1.   1   American        0       18   male       46.7
      2.   1   Japan           0        8   male       46.7
      3.   1   Europe          1        5   male       46.7
      4.   2   American        1       17   male       26.1
      5.   2   Japan           0        …   male       26.1
      6.   2   Europe          0        …   male       26.1
      7.   3   American        1       12   male       32.7
      8.   3   Japan           0        6   male       32.7
      9.   3   Europe          0        2   male       32.7
     10.   4   American        0       18   female     49.2
     11.   4   Japan           1        7   female     49.2
     12.   4   Europe          0        …   female     49.2

Conditional Logit

- Note which are the choice factors and which are the chooser factors.
- In multinomial logit, the choice factors are usually not considered.
- Note the difference in data structure between this setup and the traditional MNL model.

Stata listing, chosen alternatives only:

    . list id car choice dealer sex income in 1/12 if choice == 1

          id   car        choice   dealer   sex      income
      3.   1   Europe          1        5   male       46.7
      4.   2   American        1       17   male       26.1
      7.   3   American        1       12   male       32.7
     11.   4   Japan           1        7   female     49.2

Conditional Logit

- Usually in the MNL setup, only factors fixed within choosers are considered.
- Though since there is an equivalence between conditional logit and MNL, we should be able to retrieve common information. We will demonstrate this.
- Let's estimate a conditional logit using Stata's clogit command.
- Suppose we try to include all factors, i.e. $z_{ij} = (x_{ij}, w_i)$.

Stata output (iteration log and some summary statistics are illegible in the source):

    . clogit choice dealer income sex, group(id)

    note: income omitted due to no within-group variance.
    note: sex omitted due to no within-group variance.

    Conditional (fixed-effects) logistic regression
                                                Number of obs = [illegible]
                                                LR chi2(1)    = [illegible]
                                                Prob > chi2   =  0.0000
    Log likelihood = [illegible]                Pseudo R2     =  0.1854

          choice |      Coef.  Std. Err.      z   P>|z|   [95% Conf. Interval]
    -------------+------------------------------------------------------------
          dealer |   .0962278   .0090957  10.58   0.000    .0784004   .1140548
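The reason the chooser-level variables drop out can be checked numerically. The sketch below is a minimal illustration under stated assumptions, not part of the original lecture: the function name `conditional_logit_probs` is made up for this example, the dealer coefficient of roughly 0.096 is our reading of the clogit output above, the dealer counts (18, 8, 5) are taken from the listing for chooser 1, and the income coefficient of 0.5 is an arbitrary hypothetical value.

```python
import math


def conditional_logit_probs(x_choices, beta, w_term=0.0):
    """Choice probabilities for one chooser.

    x_choices: choice-specific covariate x_ij, one value per alternative
    beta:      coefficient on x_ij
    w_term:    gamma * w_i, identical across alternatives for this chooser
    """
    utilities = [beta * x + w_term for x in x_choices]
    top = max(utilities)  # subtract the max for numerical stability
    exp_u = [math.exp(u - top) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Chooser 1: dealers for American, Japanese, European cars (from the listing)
dealers = [18, 8, 5]
beta_dealer = 0.0962278  # assumed reading of the dealer coefficient

# Without any chooser-level term
p_without = conditional_logit_probs(dealers, beta_dealer)

# With a hypothetical chooser-level term: 0.5 * income (income = 46.7)
p_with = conditional_logit_probs(dealers, beta_dealer, w_term=0.5 * 46.7)

print(p_without)
print(p_with)
```

Both calls return the same probabilities, because the chooser-level term multiplies the numerator and every term of the denominator by the same factor. This is why clogit reports income and sex as omitted, and why their effects can only be recovered by interacting them with choice indicators.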
