# STAT METHODOLOGY I STAT 613

Texas A&M

These 7-page class notes were uploaded by Celestino Bergnaum on Wednesday, October 21, 2015. They belong to STAT 613 at Texas A&M University, taught by Jianhua Huang in Fall 2015.

# Reading report: Gideon Schwarz (1978), "Estimating the Dimension of a Model," *The Annals of Statistics*, Vol. 6, No. 2, 461–464

*By Souparno Ghosh, for STAT 613 (Professor Jianhua Huang)*

## Abstract

This report discusses the Bayesian Information Criterion (BIC) for model selection, which was proposed to remove some inherent fallacies present in various classical model selection criteria, viz. AIC and Mallows' $C_p$. In the classical setup we study how several models fit a dataset and settle on one of them; this "final model" and the MLE of its parameters are then chosen to work with. However, this approach is flawed in the sense that it ignores the consequences of model uncertainty (see the editorial in *JRSS A*, 2005, Vol. 168, Part 3, 469–472). This gives rise to the need to treat the problem in a Bayesian paradigm. BIC is important because it was one of the first attempts to deal with model selection in a Bayesian framework.

## 1 The problem

The choice of an appropriate model has always been an extremely important and much-debated topic in statistics. When we face the problem of selecting a model of appropriate dimension from among several competing models, the maximum likelihood criterion always leads to choosing the model of highest possible dimension. The problem discussed here is to choose a model of appropriate dimension by imposing a suitable penalty for incorporating an excess number of parameters. The advantage of a parsimonious model is that the variances of the parameter estimates and of the predicted values are much lower than for the corresponding high-dimensional models; thus a parsimonious model yields tighter confidence and prediction bounds than a higher-dimensional one.
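The claim that maximum likelihood alone always favours the largest model is easy to verify numerically. The following sketch is my own illustration, not part of the report: the toy dataset, the `max_loglik` helper, and all constants are assumptions. It fits nested polynomial models and checks that the maximized Gaussian log-likelihood never decreases as the dimension grows:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)  # true model is linear

def max_loglik(degree):
    """Maximized Gaussian log-likelihood of a degree-`degree` polynomial fit."""
    X = np.vander(x, degree + 1)                   # nested design matrices
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    sigma2 = rss / n                               # MLE of the error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

lls = [max_loglik(k) for k in range(1, 8)]
# The maximized log-likelihood is nondecreasing in the number of parameters,
# so the likelihood criterion alone always prefers the largest model.
assert all(b >= a - 1e-6 for a, b in zip(lls, lls[1:]))
```

Because the models are nested, the residual sum of squares can only fall as columns are added, so the likelihood by itself supplies no stopping rule; a penalty on the dimension $k_j$ is needed.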
## 2 Literature review

The earliest and most heuristic idea that served as a criterion for model selection was the adjusted $R^2$, which is basically a correlation measure between the observed and the predicted values under a given model, multiplied by a penalty term. Akaike ("A New Look at the Statistical Model Identification," *IEEE Transactions on Automatic Control*, 1974, Vol. 19, 716–723) proposed a criterion wherein the likelihood $L(M_j)$ of each model $M_j$ is maximised and the model minimising

$$-2 \log L(M_j) + 2k_j$$

is chosen as the working model, $k_j$ being the dimension of model $M_j$. Mallows proposed the $C_p$ criterion,

$$C_p = \frac{RSS_p}{s^2} + 2p - n,$$

where $p$ is the number of parameters, $n$ the number of observations, $RSS_p$ the residual sum of squares using $p$ variables, and $s^2$ an independent estimate of the error variance. More recently, McQuarrie & Tsai (1998) proposed an empirical correction of AIC for small sample sizes; the criterion, named AICc, is given by

$$AIC_c = \log \hat\sigma^2 + \frac{n+k}{n-k-2}.$$

All these criteria suffer from the drawback outlined in the previous section. This necessitated the concept of BIC, which is not merely a variant of AIC. In fact, Kass & Raftery ("Bayes Factors," *JASA*, 1995, Vol. 90, No. 430, 773–795) developed the Bayes factor, which for comparing model $M_j$ against a baseline model $M_0$ is

$$BF(M_j, M_0) = \frac{P(Y \mid M_j)}{P(Y \mid M_0)}.$$

In that paper they showed that, asymptotically, BIC is an approximation to the logarithm of the Bayes factor (p. 778).
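The penalties above can be compared directly. In this sketch (my own illustration; the simulated data, the `fit` helper, and the parameter count are all assumptions, not from the report) AIC charges $2k$ while Schwarz's criterion, BIC $= -2\log L + k\log n$, charges $k \log n$, so for $n > e^2 \approx 7.4$ BIC is the stricter of the two:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = np.linspace(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)   # data truly linear

def fit(degree):
    """Return (AIC, BIC) for a degree-`degree` polynomial fit."""
    X = np.vander(x, degree + 1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    k = degree + 2                  # polynomial coefficients + error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    aic = -2 * loglik + 2 * k
    bic = -2 * loglik + k * np.log(n)   # Schwarz's criterion
    return aic, bic

scores = {d: fit(d) for d in range(1, 7)}
best_aic = min(scores, key=lambda d: scores[d][0])
best_bic = min(scores, key=lambda d: scores[d][1])
# With n = 200, log(n) > 2, so BIC penalizes every model more heavily than AIC.
```

Because the models are nested and BIC's per-parameter penalty is larger, the dimension BIC selects can never exceed the one AIC selects on the same fits.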
## 3 The approach

A full Bayesian approach, i.e., one treating the parameters and the model order themselves as random variables, was undertaken to determine the optimal model. Let the observations $X_1, X_2, \ldots, X_n$ come from an exponential family,

$$f(x, \theta) = \exp\bigl(\theta \cdot y(x) - b(\theta)\bigr), \qquad (1)$$

where $\theta \in \Theta$, $\Theta$ is a convex subset of a $K$-dimensional Euclidean space, and $y$ is a $K$-dimensional sufficient statistic. The competing models are defined as the sets $m_j \cap \Theta$, $m_j$ being a $k_j$-dimensional linear submanifold of the $K$-dimensional space.

### 3.1 Prior specification

The prior probability that model $j$ is true is

$$\pi(j) = \alpha_j. \qquad (2)$$

The conditional prior distribution of the model parameter $\theta$ given model $j$ is

$$\pi(\theta \mid j) = \mu_j(\theta). \qquad (3)$$

Thus the joint prior distribution of the model order and the parameter is

$$\mu(\theta) = \sum_j \alpha_j \mu_j(\theta). \qquad (4)$$

We assume $\pi(\theta \mid j)$ is bounded and locally bounded away from zero throughout $m_j \cap \Theta$. This assumption implies mutual orthogonality of the measures $\mu_j$, since the intersection of two distinct linear manifolds is in general a manifold of lower dimension. Now, using the Bayes formulation, we choose the model with the highest posterior probability, i.e., choose the $j$ that maximizes

$$S(Y, n, j) = \log \int \alpha_j \exp\bigl(n(Y \cdot \theta - b(\theta))\bigr)\, d\mu_j(\theta), \qquad Y = \frac{1}{n}\sum_{i=1}^n y(X_i).$$

## 4 Main results

**Proposition.** For fixed $Y$ and $j$, as $n \to \infty$,

$$S(Y, n, j) = n \sup_{\theta \in m_j}\bigl(Y \cdot \theta - b(\theta)\bigr) - \frac{k_j}{2} \log n + R,$$

where the remainder $R = R(Y, n, j)$ is bounded in $n$ for fixed $Y$ and $j$.

The Proposition is proved with the help of the following lemmas and result.

**Lemma 1.** The Proposition holds when $Y \cdot \theta - b(\theta) = A - \lambda \lVert \theta - \theta_0 \rVert^2$, where $\lambda > 0$, $\theta_0$ is a fixed vector in $m_j$, and $\mu_j$ is Lebesgue measure on $m_j$.

**Lemma 2.** If two bounded positive random variables $U$ and $V$ agree on the set where either exceeds $\rho$, for some $0 < \rho < \sup U$, then as $n \to \infty$,

$$\log E(U^n) - \log E(V^n) \to 0.$$

**Result 2.** As $n \to \infty$,

$$\bigl(E(V^n)\bigr)^{1/n} \to \sup V.$$

**Lemma 3.** For some $0 < \rho < e^A$, where $A = \sup_\theta (Y \cdot \theta - b(\theta))$, a vector $\theta_0$, and some positive $\lambda_1$ and $\lambda_2$, the following holds wherever $\exp(Y \cdot \theta - b(\theta)) > \rho$:

$$A - \lambda_1 \lVert \theta - \theta_0 \rVert^2 < Y \cdot \theta - b(\theta) < A - \lambda_2 \lVert \theta - \theta_0 \rVert^2.$$

## 5 Technical details

**Proof of Lemma 1.**

$$S(Y, n, j) = \log \int_{m_j} \alpha_j \exp\bigl(n(A - \lambda \lVert \theta - \theta_0 \rVert^2)\bigr)\, d\theta. \qquad (5)$$

Without loss of generality we assume $\theta_0 = 0$. Note that $\pi(\theta \mid j)$ is independent of $\theta$, i.e., the conditional prior distribution of $\theta$ is uniform. So (5) yields

$$S(Y, n, j) = \log\Bigl[\alpha_j\, e^{nA} \int_{m_j} e^{-n\lambda \lVert \theta \rVert^2}\, d\theta\Bigr] = \log\Bigl[\alpha_j\, e^{nA} \Bigl(\frac{\pi}{n\lambda}\Bigr)^{k_j/2}\Bigr] = nA - \frac{k_j}{2}\log n + R,$$

where

$$R = \log \alpha_j + \frac{k_j}{2} \log \frac{\pi}{\lambda}.$$

Again we have $\sup_\theta \bigl(A - \lambda \lVert \theta - \theta_0 \rVert^2\bigr) = A$, so $S(Y, n, j) = nA - \frac{k_j}{2}\log n + R$, and the Proposition is established in this case. ∎

**Proof of Result 2.**

$$E(V^n) \le E\bigl((\sup V)^n\bigr) = (\sup V)^n \quad \Rightarrow \quad \bigl(E(V^n)\bigr)^{1/n} \le \sup V. \qquad (6)$$

Again,

$$E(V^n) \ge E\bigl(V^n\, \mathbf{1}\{V \ge \sup V - \varepsilon\}\bigr) \ge (\sup V - \varepsilon)^n\, P(V \ge \sup V - \varepsilon),$$

so, since $P(V \ge \sup V - \varepsilon) > 0$ for all $\varepsilon > 0$,

$$\bigl(E(V^n)\bigr)^{1/n} \ge (\sup V - \varepsilon)\bigl[P(V \ge \sup V - \varepsilon)\bigr]^{1/n} \to \sup V - \varepsilon \quad \text{as } n \to \infty. \qquad (7)$$

Combining (6) and (7) and letting $\varepsilon \to 0$, we get $\bigl(E(V^n)\bigr)^{1/n} \to \sup V$. ∎
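The $-\tfrac{k_j}{2}\log n$ term in the Proposition is exactly the Gaussian integral computed in the proof of Lemma 1. The following quick numerical check is my own illustration, not from the report; the values of $\lambda$, $A$, and the integration grid are arbitrary choices, and $k_j = 1$:

```python
import numpy as np

lam, A = 1.0, 0.0                        # assumed quadratic exponent A - lam*theta^2
theta = np.linspace(-5.0, 5.0, 200_001)  # integration grid (k_j = 1)
h = theta[1] - theta[0]

gaps = []
for n in (10, 100, 1000):
    # S(n) = log of the integral of exp(n(A - lam*theta^2)), via a Riemann sum
    S = np.log(h * np.sum(np.exp(n * (A - lam * theta ** 2))))
    # Lemma 1's closed form: nA - (1/2) log n + (1/2) log(pi/lam)
    approx = n * A - 0.5 * np.log(n) + 0.5 * np.log(np.pi / lam)
    gaps.append(abs(S - approx))
# The gaps stay uniformly small as n grows: the remainder R is bounded in n.
```

The agreement for growing $n$ illustrates why $R$ stays $O(1)$ while the $\log n$ term, the source of BIC's dimension penalty, grows without bound.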
**Proof of Lemma 2.** Four cases arise, viz.,

$$U \le \rho,\ V \le \rho; \qquad (8)$$
$$U \le \rho,\ V > \rho; \qquad (9)$$
$$U > \rho,\ V \le \rho; \qquad (10)$$
$$U > \rho,\ V > \rho. \qquad (11)$$

In the last three cases, i.e., (9), (10) and (11), $U = V$ by hypothesis, so the Lemma is trivially true there. So we only have to show that the Lemma holds for a $V$ that vanishes where $U \le \rho$, i.e., Case (8). Thus we have

$$E(V^n) = E\bigl(V^n\, \mathbf{1}\{U \le \rho\}\bigr) + E\bigl(V^n\, \mathbf{1}\{U > \rho\}\bigr) = E\bigl(U^n\, \mathbf{1}\{U > \rho\}\bigr) \le E(U^n) \qquad (12)$$

and

$$E(U^n) = E\bigl(U^n\, \mathbf{1}\{U \le \rho\}\bigr) + E\bigl(U^n\, \mathbf{1}\{U > \rho\}\bigr) \le \rho^n + E(V^n). \qquad (13)$$

Combining (12) and (13) we get

$$E(V^n) \le E(U^n) \le E(V^n)\Bigl(1 + \frac{\rho^n}{E(V^n)}\Bigr).$$

Now we have to show that $\rho^n / E(V^n) \to 0$ as $n \to \infty$. Using Result 2 we have $\bigl(E(V^n)\bigr)^{1/n} \to \sup V$, and since $U$ and $V$ have the same support on the set where either exceeds $\rho$, we have $\sup V = \sup U > \rho$. So $\bigl(\rho^n / E(V^n)\bigr)^{1/n}$ tends to a limit strictly less than 1; hence $\rho^n / E(V^n) \to 0$, and therefore

$$\log\Bigl(1 + \frac{\rho^n}{E(V^n)}\Bigr) \to 0.$$

Thus $\log E(U^n) - \log E(V^n) \to 0$ as $n \to \infty$. ∎

**Proof of Lemma 3.** Since the $X_i$ belong to an exponential family, the dispersion matrix is given by $\ddot b(\theta)$, which is positive definite; therefore $Y \cdot \theta - b(\theta)$ is strictly concave. Let $\theta_0$ be the point where the maximum $A$ is attained. Expanding $Y \cdot \theta - b(\theta)$ around $\theta_0$ by Taylor expansion, we get

$$Y \cdot \theta - b(\theta) = Y \cdot \theta_0 - b(\theta_0) - \tfrac{1}{2}\,(\theta - \theta_0)^{\top} \ddot b(\theta^{*})\, (\theta - \theta_0)$$

for some $\theta^{*}$ between $\theta$ and $\theta_0$, and by construction $A = Y \cdot \theta_0 - b(\theta_0)$. Now, defining $2\lambda_1$ and $2\lambda_2$ to be, respectively, larger and smaller than all the eigenvalues of $\ddot b(\theta)$ at $\theta_0$, we arrive at the stated inequalities, i.e.,

$$A - \lambda_1 \lVert \theta - \theta_0 \rVert^2 < Y \cdot \theta - b(\theta) < A - \lambda_2 \lVert \theta - \theta_0 \rVert^2$$

on some neighborhood of $\theta_0$. By construction $\rho < e^A$, and this bounds $\exp(Y \cdot \theta - b(\theta))$ outside that neighborhood of $\theta_0$. ∎

Lemma 1 proves the Proposition in the special case, while Lemma 2 and Lemma 3 prove the boundedness condition required for the remainder $R = R(Y, n, j)$.

## 6 References

- Mallows, C.L. (1973). Some comments on $C_p$. *Technometrics* 15, 661–675.
- Akaike, H. (1974). A new look at the statistical model identification. *IEEE Transactions on Automatic Control* 19, 716–723.
- Shibata, R. (1981). An optimal selection of regression variables. *Biometrika* 68, 45–54.
- Kass, R.E. & Raftery, A.E. (1995). Bayes factors. *JASA* 90(430), 773–795.
- McQuarrie, A.D.R. & Tsai, C.-L. (1998). *Regression and Time Series Model Selection*. World Scientific.
- Editorial (2005). *JRSS A* 168(3), 469–472.
