# Mathematical Statistics II STAT 6720

Utah State University

This 192-page set of class notes was uploaded by Geovanny Lakin on Wednesday, October 28, 2015. The class notes belong to STAT 6720 at Utah State University, taught by Juergen Symanzik in Fall. Since its upload, it has received 55 views. For similar materials see /class/230497/stat-6720-utah-state-university in Statistics at Utah State University.


## STAT 6720 Mathematical Statistics II

Spring Semester 2008

Dr. Jürgen Symanzik
Utah State University, Department of Mathematics and Statistics
3900 Old Main Hill, Logan, UT 84322-3900
Tel.: (435) 797-0696, FAX: (435) 797-1822, e-mail: symanzik@math.usu.edu

### Contents

- Acknowledgements
- 6 Limit Theorems
  - 6.1 Modes of Convergence
  - 6.2 Weak Laws of Large Numbers
  - 6.3 Strong Laws of Large Numbers
  - 6.4 Central Limit Theorems
- 7 Sample Moments
  - 7.1 Random Sampling
  - 7.2 Sample Moments and the Normal Distribution
- 8 The Theory of Point Estimation
  - 8.1 The Problem of Point Estimation
  - 8.2 Properties of Estimates
  - 8.3 Sufficient Statistics
  - 8.4 Unbiased Estimation
  - 8.5 Lower Bounds for the Variance of an Estimate
  - 8.6 The Method of Moments
  - 8.7 Maximum Likelihood Estimation
  - 8.8 Decision Theory: Bayes and Minimax Estimation
- 9 Hypothesis Testing
  - 9.1 Fundamental Notions
  - 9.2 The Neyman-Pearson Lemma
  - 9.3 Monotone Likelihood Ratios
  - 9.4 Unbiased and Invariant Tests
- 10 More on Hypothesis Testing
  - 10.1 Likelihood Ratio Tests
  - 10.2 Parametric Chi-Squared Tests
  - 10.3 t-Tests and F-Tests
  - 10.4 Bayes and Minimax Tests
- 11 Confidence Estimation
  - 11.1 Fundamental Notions
  - 11.2 Shortest-Length Confidence Intervals
  - 11.3 Confidence Intervals and Hypothesis Tests
  - 11.4 Bayes Confidence Intervals
- 12 Nonparametric Inference
  - 12.1 Nonparametric Estimation
  - 12.2 Single-Sample Hypothesis Tests
  - 12.3 More on Order Statistics
- 13 Some Results from Sampling
  - 13.1 Simple Random Samples
  - 13.2 Stratified Random Samples
- 14 Some Results from Sequential Statistical Inference
  - 14.1 Fundamentals of Sequential Sampling
  - 14.2 Sequential Probability Ratio Tests
- Index

### Acknowledgements

I would like to thank all my students who helped, from the Fall 1999 through the Spring 2006 semesters, with the creation and improvement of these lecture notes, and for their suggestions on how to improve some of the material presented in class. In addition, I particularly
would like to thank Mike Minnotte and Dan Coster, who previously taught this course at Utah State University, for providing me with their lecture notes and other materials related to this course. Their lecture notes, combined with additional material from a variety of textbooks listed below, form the basis of the script presented here.

The textbook required for this class is:

- Casella, G., and Berger, R. L. (2002), Statistical Inference (Second Edition), Duxbury/Thomson Learning, Pacific Grove, CA.

A web page dedicated to this class is accessible at http://www.math.usu.edu/~symanzik/teaching/2006/stat6720/stat6720.html.

This course follows Casella and Berger (2002) as described in the syllabus. Additional material originates from the lectures of Professors Hering, Trenkler, Gather, and Kreienbrock I have attended while studying at the Universität Dortmund, Germany; the collection of Master's and PhD Preliminary Exam questions from Iowa State University, Ames, Iowa; and the following textbooks:

- Bandelow, C. (1981), Einführung in die Wahrscheinlichkeitstheorie, Bibliographisches Institut, Mannheim, Germany.
- Büning, H., and Trenkler, G. (1978), Nichtparametrische statistische Methoden, Walter de Gruyter, Berlin, Germany.
- Casella, G., and Berger, R. L. (1990), Statistical Inference, Wadsworth & Brooks/Cole, Pacific Grove, CA.
- Fisz, M. (1989), Wahrscheinlichkeitsrechnung und mathematische Statistik, VEB Deutscher Verlag der Wissenschaften, Berlin, German Democratic Republic.
- Gibbons, J. D., and Chakraborti, S. (1992), Nonparametric Statistical Inference (Third Edition, Revised and Expanded), Dekker, New York, NY.
- Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994), Continuous Univariate Distributions, Volume 1 (Second Edition), Wiley, New York, NY.
- Johnson, N. L., Kotz, S., and Balakrishnan, N. (1995), Continuous Univariate Distributions, Volume 2 (Second Edition), Wiley, New York, NY.
- Kelly, D. G. (1994), Introduction to Probability, Macmillan, New York, NY.
- Lehmann, E. L. (1983), Theory of Point Estimation (1991 Reprint), Wadsworth & Brooks/Cole, Pacific Grove, CA.
- Lehmann, E. L. (1986), Testing Statistical Hypotheses (Second Edition, 1994 Reprint), Chapman & Hall, New York, NY.
- Mood, A. M., Graybill, F. A., and Boes, D. C. (1974), Introduction to the Theory of Statistics (Third Edition), McGraw-Hill, Singapore.
- Parzen, E. (1960), Modern Probability Theory and Its Applications, Wiley, New York, NY.
- Rohatgi, V. K. (1976), An Introduction to Probability Theory and Mathematical Statistics, John Wiley and Sons, New York, NY.
- Rohatgi, V. K., and Saleh, A. K. Md. E. (2001), An Introduction to Probability and Statistics (Second Edition), John Wiley and Sons, New York, NY.
- Searle, S. R. (1971), Linear Models, Wiley, New York, NY.
- Tamhane, A. C., and Dunlop, D. D. (2000), Statistics and Data Analysis: From Elementary to Intermediate, Prentice Hall, Upper Saddle River, NJ.

Additional definitions, integrals, sums, etc. originate from the following formula collections:

- Bronstein, I. N., and Semendjajew, K. A. (1985), Taschenbuch der Mathematik (22. Auflage), Verlag Harri Deutsch, Thun, German Democratic Republic.
- Bronstein, I. N., and Semendjajew, K. A. (1986), Ergänzende Kapitel zu Taschenbuch der Mathematik (4. Auflage), Verlag Harri Deutsch, Thun, German Democratic Republic.
- Sieber, H. (1980), Mathematische Formeln: Erweiterte Ausgabe E, Ernst Klett, Stuttgart, Germany.

Jürgen Symanzik, January 7, 2006

## 6 Limit Theorems

(Based on Rohatgi, Chapter 6; Rohatgi/Saleh, Chapter 6; and Casella/Berger, Section 5.5)

**Motivation:** I found this slide from my "Stat 250, Section 003, Introductory Statistics" class, an undergraduate class I taught at George Mason University in Spring 1999:

[Handwritten slide on sample means and limiting behavior; the scanned image did not survive text extraction.]

What does this mean at a more theoretical level?

*Lecture 01: We 01/07/04*

### 6.1 Modes of Convergence

**Definition 6.1.1:** Let $X_1, \ldots, X_n$ be iid rv's with common cdf $F$. Let $T = T(X_1, \ldots, X_n)$ be any statistic, i.e., a
Borel-measurable function of $(X_1, \ldots, X_n)$ that does not involve the population parameters $\theta$, defined on the support $\mathcal{X}$ of $X$. The induced probability distribution of $T$ is called the sampling distribution of $T$.

**Note:**
(i) Commonly used statistics are the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$, the sample variance $S_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X}_n)^2$, the sample median, order statistics, minimum, maximum, etc.
(ii) Recall that if $X_1, \ldots, X_n$ are iid and if $E(X)$ and $Var(X)$ exist, then $E(\bar{X}_n) = \mu = E(X)$, $E(S_n^2) = \sigma^2 = Var(X)$, and $Var(\bar{X}_n) = \frac{\sigma^2}{n}$.
(iii) Recall that if $X_1, \ldots, X_n$ are iid and if $X$ has mgf $M_X(t)$ or characteristic function $\Phi_X(t)$, then $M_{\bar{X}_n}(t) = \left(M_X\left(\frac{t}{n}\right)\right)^n$ and $\Phi_{\bar{X}_n}(t) = \left(\Phi_X\left(\frac{t}{n}\right)\right)^n$.

**Note:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's on some probability space $(\Omega, \mathcal{L}, P)$. Is there any meaning behind the expression $\lim_{n\to\infty} X_n = X$? Not immediately, under the usual definitions of limits. We first need to define modes of convergence for rv's and probabilities.

**Definition 6.1.2:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's with cdf's $\{F_n\}_{n=1}^\infty$, and let $X$ be a rv with cdf $F$. If $F_n(x) \to F(x)$ at all continuity points of $F$, we say that $X_n$ converges in distribution to $X$ ($X_n \xrightarrow{d} X$), or $X_n$ converges in law to $X$ ($X_n \xrightarrow{L} X$), or $F_n$ converges weakly to $F$ ($F_n \xrightarrow{w} F$).

**Example 6.1.3:** Let $X_n \sim N(0, \frac{1}{n})$. Then

$$F_n(x) = \int_{-\infty}^{x} \sqrt{\frac{n}{2\pi}} \exp\left(-\frac{n}{2}t^2\right) dt = \int_{-\infty}^{x\sqrt{n}} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{s^2}{2}\right) ds = \Phi(x\sqrt{n}) \to \begin{cases} \Phi(\infty) = 1, & x > 0 \\ \Phi(0) = \frac{1}{2}, & x = 0 \\ \Phi(-\infty) = 0, & x < 0 \end{cases}$$

where $\Phi(z) = P(Z \le z)$ with $Z \sim N(0,1)$. For

$$F_X(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

it holds that $F_n(x) \to F_X(x)$ everywhere except at $x = 0$, the only point of discontinuity of $F_X$. So $X_n \xrightarrow{d} X$ where $P(X = 0) = 1$, or $X_n \xrightarrow{d} 0$, since the limiting rv here is degenerate, i.e., it has a Dirac(0) distribution.

**Example 6.1.4:** In this example, the sequence $\{F_n\}_{n=1}^\infty$ converges pointwise to something that is not a cdf. Let $X_n \sim$ Dirac($n$), i.e., $P(X_n = n) = 1$. Then

$$F_n(x) = \begin{cases} 0, & x < n \\ 1, & x \ge n \end{cases}$$

It is $F_n(x) \to 0 \ \forall x$, which is not a cdf. Thus, there is no rv $X$ such that $X_n \xrightarrow{d} X$.

**Example 6.1.5:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = n) = \frac{1}{n}$, and let $X \sim$ Dirac(0), i.e., $P(X = 0) = 1$. It is

$$F_n(x) = \begin{cases} 0, & x < 0 \\ 1 - \frac{1}{n}, & 0 \le x < n \\ 1, & x \ge n \end{cases} \qquad F_X(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}$$

It holds that $F_n \xrightarrow{w} F_X$, but

$$E(X_n) = 0 \cdot \left(1 - \frac{1}{n}\right) + n \cdot \frac{1}{n} = 1 \not\to E(X) = 0.$$

Thus, convergence in distribution does not imply convergence of moments/means.

**Note:** Convergence in distribution does not say that the $X_i$'s are close to each other or to $X$. It
only means that their cdf's are eventually close to some cdf $F$. The $X_i$'s do not even have to be defined on the same probability space.

**Example 6.1.6:** Let $X$ and $\{X_n\}_{n=1}^\infty$ be iid $N(0,1)$. Obviously, $X_n \xrightarrow{d} X$, but $\lim_{n\to\infty} X_n \ne X$.

**Theorem 6.1.7:** Let $X$ and $\{X_n\}_{n=1}^\infty$ be discrete rv's with supports $\mathcal{X}$ and $\{\mathcal{X}_n\}_{n=1}^\infty$, respectively. Define the countable set $A = \mathcal{X} \cup \bigcup_{n=1}^\infty \mathcal{X}_n = \{a_k : k = 1, 2, 3, \ldots\}$. Let $p_k = P(X = a_k)$ and $p_k^{(n)} = P(X_n = a_k)$. Then it holds that $p_k^{(n)} \to p_k \ \forall k \iff X_n \xrightarrow{d} X$.

**Theorem 6.1.8:** Let $X$ and $\{X_n\}_{n=1}^\infty$ be continuous rv's with pdf's $f$ and $\{f_n\}_{n=1}^\infty$, respectively. If $f_n(x) \to f(x)$ for almost all $x$ as $n \to \infty$, then $X_n \xrightarrow{d} X$.

**Theorem 6.1.9:** Let $X$ and $\{X_n\}_{n=1}^\infty$ be rv's such that $X_n \xrightarrow{d} X$. Let $c \in \mathbb{R}$ be a constant. Then it holds:
(i) $X_n + c \xrightarrow{d} X + c$
(ii) $cX_n \xrightarrow{d} cX$
(iii) If $a_n \to a$ and $b_n \to b$, then $a_n X_n + b_n \xrightarrow{d} aX + b$.

Proof (part (iii)): Suppose that $a > a_0 > 0$ (if $a < 0$, $a_n < 0$, the result follows via (ii) and $c = -1$). Let $Y_n = a_n X_n + b_n$ and $Y = aX + b$. It is

$$F_Y(y) = P(Y \le y) = P(aX + b \le y) = P\left(X \le \frac{y - b}{a}\right) = F_X\left(\frac{y - b}{a}\right).$$

Likewise, $F_{Y_n}(y) = F_{X_n}\left(\frac{y - b_n}{a_n}\right)$. If $y$ is a continuity point of $F_Y$, then $\frac{y - b}{a}$ is a continuity point of $F_X$. Since $a_n \to a$, $b_n \to b$, and $F_{X_n}(x) \to F_X(x)$, it follows that $F_{Y_n}(y) \to F_Y(y)$ for every continuity point $y$ of $F_Y$. Thus, $a_n X_n + b_n \xrightarrow{d} aX + b$.

**Definition 6.1.10:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's defined on a probability space $(\Omega, \mathcal{L}, P)$. We say that $X_n$ converges in probability to a rv $X$ ($X_n \xrightarrow{P} X$, $P\text{-}\lim_{n\to\infty} X_n = X$) iff

$$\lim_{n\to\infty} P(|X_n - X| > \epsilon) = 0 \quad \forall \epsilon > 0.$$

**Note:** The following are equivalent:

$$\lim_{n\to\infty} P(|X_n - X| > \epsilon) = 0 \iff \lim_{n\to\infty} P(|X_n - X| \le \epsilon) = 1 \iff P(\{\omega : |X_n(\omega) - X(\omega)| > \epsilon\}) \to 0.$$

If $X$ is degenerate, i.e., $P(X = c) = 1$, and $X_n \xrightarrow{P} c$, we say that $X_n$ is consistent for $c$. For example, let $X_n$ be such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = 1) = \frac{1}{n}$. Then

$$P(|X_n| > \epsilon) = \begin{cases} \frac{1}{n}, & 0 < \epsilon < 1 \\ 0, & \epsilon \ge 1 \end{cases}$$

Therefore, $\lim_{n\to\infty} P(|X_n| > \epsilon) = 0 \ \forall \epsilon > 0$. So $X_n \xrightarrow{P} 0$, i.e., $X_n$ is consistent for 0.

**Theorem 6.1.11:**
(i) $X_n \xrightarrow{P} X \iff X_n - X \xrightarrow{P} 0$
(ii) $X_n \xrightarrow{P} X$, $X_n \xrightarrow{P} Y \Rightarrow P(X = Y) = 1$
(iii) $X_n \xrightarrow{P} X \Rightarrow X_n - X_m \xrightarrow{P} 0$ as $n, m \to \infty$
(iv) $X_n \xrightarrow{P} X$, $Y_n \xrightarrow{P} Y \Rightarrow X_n \pm Y_n \xrightarrow{P} X \pm Y$
(v) $X_n \xrightarrow{P} X$, $k \in \mathbb{R}$ a constant $\Rightarrow kX_n \xrightarrow{P} kX$
(vi) $X_n \xrightarrow{P} k$, $k \in \mathbb{R}$ a constant $\Rightarrow X_n^r \xrightarrow{P} k^r$, $r \in \mathbb{N}$
(vii) $X_n \xrightarrow{P} a$, $Y_n \xrightarrow{P} b$, $a, b \in \mathbb{R} \Rightarrow X_n Y_n \xrightarrow{P} ab$
(viii) $X_n \xrightarrow{P} 1 \Rightarrow X_n^{-1} \xrightarrow{P} 1$
(ix) $X_n \xrightarrow{P} a$, $Y_n \xrightarrow{P} b$, $a \in \mathbb{R}$, $b \in \mathbb{R} \setminus \{0\} \Rightarrow \frac{X_n}{Y_n} \xrightarrow{P} \frac{a}{b}$

*Lecture 38: We 11/29/00*

(x) $X_n \xrightarrow{P} X$, $Y$ an arbitrary rv $\Rightarrow X_n Y \xrightarrow{P} XY$
(xi) $X_n \xrightarrow{P} X$, $Y_n \xrightarrow{P} Y \Rightarrow X_n Y_n \xrightarrow{P} XY$

Proof: See Rohatgi, pages 244-245, and Rohatgi/Saleh, pages 260-261, for partial
proofs.

**Theorem 6.1.12:** Let $X_n \xrightarrow{P} X$ and let $g$ be a continuous function on $\mathbb{R}$. Then $g(X_n) \xrightarrow{P} g(X)$.

Proof: Preconditions:
1. $X$ is a rv $\Rightarrow \forall \delta > 0 \ \exists k = k(\delta): P(|X| > k) < \frac{\delta}{2}$.
2. $g$ is continuous on $\mathbb{R}$ $\Rightarrow$ $g$ is also uniformly continuous on $[-k, k]$ (see the definition of uniform continuity in Theorem 3.3.3 (iii)) $\Rightarrow \exists \epsilon' = \epsilon'(\epsilon, k)$ such that $|X| \le k$ and $|X_n - X| < \epsilon'$ imply $|g(X_n) - g(X)| < \epsilon$.

Let

$$A = \{\omega : |X(\omega)| \le k\}, \quad B = \{\omega : |X_n(\omega) - X(\omega)| < \epsilon'\}, \quad C = \{\omega : |g(X_n(\omega)) - g(X(\omega))| < \epsilon\}.$$

If $\omega \in A \cap B$, then $\omega \in C$, so $A \cap B \subseteq C$, hence $C^c \subseteq (A \cap B)^c = A^c \cup B^c$ and $P(C^c) \le P(A^c) + P(B^c)$. Now

$$P(|g(X_n) - g(X)| \ge \epsilon) \le P(|X| > k) + P(|X_n - X| \ge \epsilon') \le \frac{\delta}{2} + \frac{\delta}{2} = \delta \quad \text{for } n \ge n_0(\delta, \epsilon', k),$$

since $X_n \xrightarrow{P} X$.

**Corollary 6.1.13:**
(i) Let $X_n \xrightarrow{P} c$, $c \in \mathbb{R}$, and let $g$ be a continuous function on $\mathbb{R}$. Then $g(X_n) \xrightarrow{P} g(c)$.
(ii) Let $X_n \xrightarrow{d} X$ and let $g$ be a continuous function on $\mathbb{R}$. Then $g(X_n) \xrightarrow{d} g(X)$.
(iii) Let $X_n \xrightarrow{d} c$, $c \in \mathbb{R}$, and let $g$ be a continuous function on $\mathbb{R}$. Then $g(X_n) \xrightarrow{d} g(c)$.

**Theorem 6.1.14:** $X_n \xrightarrow{P} X \Rightarrow X_n \xrightarrow{d} X$.

Proof: $X_n \xrightarrow{P} X \Rightarrow P(|X_n - X| > \epsilon) \to 0$ as $n \to \infty$ $\forall \epsilon > 0$. It holds

$$P(X \le x - \epsilon) = P(X \le x - \epsilon, X_n \le x) + P(X \le x - \epsilon, X_n > x) \stackrel{(A)}{\le} P(X_n \le x) + P(|X_n - X| > \epsilon).$$

(A) holds since $X \le x - \epsilon$ and $X_n$ within $\epsilon$ of $X$ imply $X_n \le x$. Similarly, it holds

$$P(X_n \le x) = P(X_n \le x, |X_n - X| \le \epsilon) + P(X_n \le x, |X_n - X| > \epsilon) \le P(X \le x + \epsilon) + P(|X_n - X| > \epsilon).$$

Combining the two inequalities from above gives

$$P(X \le x - \epsilon) - P(|X_n - X| > \epsilon) \le P(X_n \le x) \le P(X \le x + \epsilon) + P(|X_n - X| > \epsilon),$$

where both $P(|X_n - X| > \epsilon)$ terms $\to 0$ as $n \to \infty$. Therefore,

$$P(X \le x - \epsilon) \le \lim_{n\to\infty} F_n(x) \le P(X \le x + \epsilon).$$

Since the cdf's are not necessarily left-continuous, we get the following result for $\epsilon \downarrow 0$:

$$P(X < x) \le \lim_{n\to\infty} F_n(x) \le P(X \le x).$$

Let $x$ be a continuity point of $F$. Then it holds $P(X < x) = P(X \le x) = F(x)$, so $\lim_{n\to\infty} F_n(x) = F(x)$, i.e., $X_n \xrightarrow{d} X$.

**Theorem 6.1.15:** Let $c \in \mathbb{R}$ be a constant. Then it holds: $X_n \xrightarrow{d} c \iff X_n \xrightarrow{P} c$.

**Example 6.1.16:** In this example we will see that $X_n \xrightarrow{d} X \not\Rightarrow X_n \xrightarrow{P} X$ for some rv $X$. Let $X_n$ be identically distributed rv's and let $(X_n, X)$ have the following joint distribution:

| | $X = 0$ | $X = 1$ |
|---|---|---|
| $X_n = 0$ | $0$ | $\frac{1}{2}$ |
| $X_n = 1$ | $\frac{1}{2}$ | $0$ |

Obviously, $X_n \xrightarrow{d} X$, since all have exactly the same cdf, but for any $\epsilon \in (0,1)$ it is

$$P(|X_n - X| > \epsilon) = P(|X_n - X| = 1) = 1 \ \forall n,$$

so $\lim_{n\to\infty} P(|X_n - X| > \epsilon) \ne 0$. Therefore, $X_n \not\xrightarrow{P} X$.

**Theorem 6.1.17:** Let $\{X_n\}_{n=1}^\infty$ and $\{Y_n\}_{n=1}^\infty$ be sequences of rv's and $X$ be a rv defined on a probability space $(\Omega, \mathcal{L}, P)$. Then it holds: $Y_n \xrightarrow{d} X$ and $X_n - Y_n \xrightarrow{P} 0 \Rightarrow X_n \xrightarrow{d} X$.

Proof: Similar to the proof of Theorem 6.1.14. See also Rohatgi, page 253, Theorem 14, and Rohatgi/Saleh, page 269, Theorem 14.

**Theorem 6.1.18**
(Slutsky's Theorem) *Lecture 41: We 12/06/00*

Let $\{X_n\}_{n=1}^\infty$ and $\{Y_n\}_{n=1}^\infty$ be sequences of rv's and $X$ be a rv defined on a probability space $(\Omega, \mathcal{L}, P)$. Let $c \in \mathbb{R}$ be a constant. Then it holds:
(i) $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{P} c \Rightarrow X_n + Y_n \xrightarrow{d} X + c$
(ii) $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{P} c \Rightarrow X_n Y_n \xrightarrow{d} cX$; if $c = 0$, then also $X_n Y_n \xrightarrow{P} 0$
(iii) $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{P} c \Rightarrow \frac{X_n}{Y_n} \xrightarrow{d} \frac{X}{c}$ if $c \ne 0$

Proof:
(i) $Y_n \xrightarrow{P} c \stackrel{\text{Th. 6.1.11 (i)}}{\Longrightarrow} Y_n - c \xrightarrow{P} 0 \Rightarrow (X_n + Y_n) - (X_n + c) = Y_n - c \xrightarrow{P} 0$. (A)
By Theorem 6.1.9 (i), $X_n + c \xrightarrow{d} X + c$ (B), since $X_n \xrightarrow{d} X$. Combining (A) and (B), it follows from Theorem 6.1.17 that $X_n + Y_n \xrightarrow{d} X + c$.

(ii) Case $c = 0$: $\forall \epsilon > 0 \ \forall k > 0$ it is

$$P(|X_n Y_n| > \epsilon) = P\left(|X_n Y_n| > \epsilon, |X_n| \le k\right) + P\left(|X_n Y_n| > \epsilon, |X_n| > k\right) \le P\left(|Y_n| > \frac{\epsilon}{k}\right) + P(|X_n| > k).$$

Since $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{P} 0$, it follows $\limsup_{n\to\infty} P(|X_n Y_n| > \epsilon) \le P(|X| > k) \to 0$ as $k \to \infty$. Therefore, $X_n Y_n \xrightarrow{P} 0$.

Case $c \ne 0$: Since $X_n \xrightarrow{d} X$ and $Y_n - c \xrightarrow{P} 0$, it follows from (ii), case $c = 0$, that $X_n Y_n - cX_n = X_n(Y_n - c) \xrightarrow{P} 0$. Since $cX_n \xrightarrow{d} cX$ by Theorem 6.1.9 (ii), it follows from Theorem 6.1.17 that $X_n Y_n \xrightarrow{d} cX$.

(iii) $Y_n \xrightarrow{P} c \Rightarrow \frac{Y_n}{c} \xrightarrow{P} 1 \stackrel{\text{Th. 6.1.11 (viii)}}{\Longrightarrow} \frac{c}{Y_n} \xrightarrow{P} 1 \Rightarrow \frac{1}{Y_n} \xrightarrow{P} \frac{1}{c}$. With part (ii) above, it follows from $X_n \xrightarrow{d} X$ and $\frac{1}{Y_n} \xrightarrow{P} \frac{1}{c}$ that $\frac{X_n}{Y_n} \xrightarrow{d} \frac{X}{c}$.

**Definition 6.1.19:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's such that $E(|X_n|^r) < \infty$ for some $r > 0$. We say that $X_n$ converges in the $r$th mean to a rv $X$ ($X_n \xrightarrow{r} X$) if $E(|X|^r) < \infty$ and $\lim_{n\to\infty} E(|X_n - X|^r) = 0$.

**Example 6.1.20:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's defined by $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = 1) = \frac{1}{n}$. It is $E(|X_n|^r) = \frac{1}{n} \to 0 \ \forall r > 0$. Therefore, $X_n \xrightarrow{r} 0 \ \forall r > 0$.

**Note:** The special cases $r = 1$ and $r = 2$ are called convergence in absolute mean for $r = 1$ ($X_n \xrightarrow{1} X$) and convergence in mean square for $r = 2$ ($X_n \xrightarrow{2} X$ or $X_n \xrightarrow{ms} X$).

**Theorem 6.1.21:** Assume that $X_n \xrightarrow{r} X$ for some $r > 0$. Then $X_n \xrightarrow{P} X$.

Proof: Using Markov's Inequality (Corollary 3.5.2), it holds for any $\epsilon > 0$:

$$P(|X_n - X| \ge \epsilon) = P(|X_n - X|^r \ge \epsilon^r) \le \frac{E(|X_n - X|^r)}{\epsilon^r}.$$

$X_n \xrightarrow{r} X \Rightarrow \lim_{n\to\infty} E(|X_n - X|^r) = 0 \Rightarrow \lim_{n\to\infty} P(|X_n - X| \ge \epsilon) \le \lim_{n\to\infty} \frac{E(|X_n - X|^r)}{\epsilon^r} = 0 \Rightarrow X_n \xrightarrow{P} X$.

**Example 6.1.22:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's defined by $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = n^{1/r}) = \frac{1}{n}$ for some $r > 0$. For any $\epsilon > 0$, $P(|X_n| > \epsilon) \to 0$ as $n \to \infty$, so $X_n \xrightarrow{P} 0$. For $0 < s < r$, $E(|X_n|^s) = n^{s/r - 1} \to 0$ as $n \to \infty$, so $X_n \xrightarrow{s} 0$. But $E(|X_n|^r) = 1 \not\to 0$ as $n \to \infty$, so $X_n \not\xrightarrow{r} 0$.

**Theorem 6.1.23:** Let $X_n \xrightarrow{r} X$. Then it holds:
(i) $\lim_{n\to\infty} E(|X_n|^r) = E(|X|^r)$
(ii) $X_n \xrightarrow{s} X$ for $1 \le s < r$

Proof:
(i) For $0 < r \le 1$, it holds that $E(|X_n|^r) = E(|X_n - X + X|^r) \le E(|X_n - X|^r) + E(|X|^r)$, hence $\limsup_{n\to\infty} E(|X_n|^r) \le E(|X|^r)$, since $E(|X_n - X|^r) \to 0$ as $n \to \infty$. Similarly, $E(|X|^r) \le E(|X - X_n|^r) + E(|X_n|^r)$, hence $E(|X|^r) \le \liminf_{n\to\infty} E(|X_n|^r)$. Combining both inequalities gives $\lim_{n\to\infty} E(|X_n|^r) = E(|X|^r)$.

For $r > 1$, it follows from Minkowski's Inequality that

$$(E(|X_n|^r))^{1/r} \le (E(|X_n - X|^r))^{1/r} + (E(|X|^r))^{1/r} \quad (C)$$

and, similarly,

$$(E(|X|^r))^{1/r} \le (E(|X - X_n|^r))^{1/r} + (E(|X_n|^r))^{1/r}. \quad (D)$$

Since $E(|X_n - X|^r) \to 0$ as $n \to \infty$, combining (C) and (D) gives $\lim_{n\to\infty} E(|X_n|^r) = E(|X|^r)$.

*Lecture 42: Fr 12/08/00*

(ii) For $1 \le s < r$, it follows from Lyapunov's Inequality (Theorem 3.5.4) that

$$(E(|X_n - X|^s))^{1/s} \le (E(|X_n - X|^r))^{1/r} \to 0 \quad \text{since } X_n \xrightarrow{r} X.$$

Thus, $X_n \xrightarrow{s} X$. An additional proof is required for $0 < s < r < 1$.

**Definition 6.1.24:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's on $(\Omega, \mathcal{L}, P)$. We say that $X_n$ converges almost surely to a rv $X$ ($X_n \xrightarrow{a.s.} X$), or $X_n$ converges with probability 1 to $X$ ($X_n \xrightarrow{w.p.1} X$), or $X_n$ converges strongly to $X$, iff

$$P(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$$

**Note:** An interesting characterization of convergence with probability 1 and convergence in probability can be found in Parzen (1960), Modern Probability Theory and Its Applications, on page 416 (see Handout).

**Example 6.1.25:** Let $\Omega = [0,1]$ and $P$ a uniform distribution on $\Omega$. Let $X_n(\omega) = \omega + \omega^n$ and $X(\omega) = \omega$. For $\omega \in [0,1)$, $\omega^n \to 0$ as $n \to \infty$, so $X_n(\omega) \to X(\omega) \ \forall \omega \in [0,1)$. However, for $\omega = 1$, $X_n(1) = 2 \ne 1 = X(1) \ \forall n$, i.e., convergence fails at $\omega = 1$. Anyway, since $P(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = P(\omega \in [0,1)) = 1$, it is $X_n \xrightarrow{a.s.} X$.

**Theorem 6.1.26:** $X_n \xrightarrow{a.s.} X \Rightarrow X_n \xrightarrow{P} X$.

Proof: Choose $\epsilon > 0$ and $\delta > 0$. Find $n_0 = n_0(\epsilon, \delta)$ such that

$$P\left(\bigcap_{n=n_0}^{\infty} \{|X_n - X| \le \epsilon\}\right) \ge 1 - \delta.$$

Since $\bigcap_{n=n_0}^{\infty} \{|X_n - X| \le \epsilon\} \subseteq \{|X_m - X| \le \epsilon\} \ \forall m \ge n_0$, it is

$$P(|X_m - X| \le \epsilon) \ge P\left(\bigcap_{n=n_0}^{\infty} \{|X_n - X| \le \epsilon\}\right) \ge 1 - \delta \quad \forall m \ge n_0.$$

Therefore, $P(|X_n - X| \le \epsilon) \to 1$ as $n \to \infty$. Thus, $X_n \xrightarrow{P} X$.

**Example 6.1.27:** $X_n \xrightarrow{P} X \not\Rightarrow X_n \xrightarrow{a.s.} X$. Let $\Omega = (0,1]$ and $P$ a uniform distribution on $\Omega$. Define $A_n$ by $A_1 = (0, \frac{1}{2}]$, $A_2 = (\frac{1}{2}, 1]$, $A_3 = (0, \frac{1}{4}]$, $A_4 = (\frac{1}{4}, \frac{1}{2}]$, $A_5 = (\frac{1}{2}, \frac{3}{4}]$, $A_6 = (\frac{3}{4}, 1]$, $A_7 = (0, \frac{1}{8}]$, etc. Let $X_n(\omega) = I_{A_n}(\omega)$. It is $P(|X_n - 0| \ge \epsilon) \le P(A_n) \to 0 \ \forall \epsilon > 0$, since $X_n$ is 0 except on $A_n$ and
$P(A_n) \to 0$. Thus, $X_n \xrightarrow{P} 0$. But $P(\{\omega : X_n(\omega) \to 0\}) = 0$, and not 1, because any $\omega$ keeps being in some $A_n$ beyond any $n_0$, i.e., the sequence $X_n(\omega)$ looks like $0\ldots010\ldots010\ldots010\ldots$, so $X_n \not\xrightarrow{a.s.} 0$.

**Example 6.1.28:** $X_n \xrightarrow{r} X \not\Rightarrow X_n \xrightarrow{a.s.} X$. Let $X_n$ be independent rv's such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = 1) = \frac{1}{n}$. It is $E(|X_n - 0|^r) = E(|X_n|^r) = \frac{1}{n} \to 0$ as $n \to \infty$, so $X_n \xrightarrow{r} 0 \ \forall r > 0$, and, due to Theorem 6.1.21, also $X_n \xrightarrow{P} 0$. But

$$P(X_n = 0 \ \forall n: m \le n \le n_0) = \prod_{n=m}^{n_0}\left(1 - \frac{1}{n}\right) = \frac{m-1}{m} \cdot \frac{m}{m+1} \cdots \frac{n_0 - 1}{n_0} = \frac{m-1}{n_0}.$$

As $n_0 \to \infty$, it is $P(X_n = 0 \ \forall n \ge m) = 0 \ \forall m$, so $X_n \not\xrightarrow{a.s.} 0$.

**Example 6.1.29:** $X_n \xrightarrow{a.s.} X \not\Rightarrow X_n \xrightarrow{r} X$. Let $\Omega = (0,1)$ and $P$ a uniform distribution on $\Omega$. Let $A_n = (0, \frac{1}{n})$. Let $X_n(\omega) = n \cdot I_{A_n}(\omega)$ and $X(\omega) = 0$. For $\omega > 0$, choose $n_0 = n_0(\omega)$ with $\frac{1}{n_0} < \omega$; then $X_n(\omega) = 0 \ \forall n > n_0$. Since $P(\omega = 0) = 0$, it holds that $X_n \xrightarrow{a.s.} 0$. But $E(|X_n - 0|^r) = n^r \cdot \frac{1}{n} = n^{r-1} \not\to 0$ as $n \to \infty$ for $r \ge 1$, so $X_n \not\xrightarrow{r} X$ for $r \ge 1$.

### 6.2 Weak Laws of Large Numbers

**Theorem 6.2.1 (WLLN, Version I):** Let $\{X_i\}_{i=1}^\infty$ be a sequence of iid rv's with mean $E(X_i) = \mu$ and variance $Var(X_i) = \sigma^2 < \infty$. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$. Then it holds:

$$\lim_{n\to\infty} P(|\bar{X}_n - \mu| \ge \epsilon) = 0 \quad \forall \epsilon > 0, \text{ i.e., } \bar{X}_n \xrightarrow{P} \mu.$$

Proof: By Markov's Inequality (Corollary 3.5.2), it holds for all $\epsilon > 0$:

$$P(|\bar{X}_n - \mu| \ge \epsilon) \le \frac{E(|\bar{X}_n - \mu|^2)}{\epsilon^2} = \frac{Var(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \to 0 \text{ as } n \to \infty.$$

**Note:** For iid rv's with finite variance, $\bar{X}_n$ is consistent for $\mu$. A more general way to derive a WLLN follows in the next definition.

**Definition 6.2.2:** Let $\{X_i\}_{i=1}^\infty$ be a sequence of rv's. Let $T_n = \sum_{i=1}^n X_i$. We say that $\{X_i\}$ obeys the WLLN with respect to a sequence of norming constants $\{B_i\}_{i=1}^\infty$, $B_i > 0$, $B_i \uparrow \infty$, if there exists a sequence of centering constants $\{A_i\}_{i=1}^\infty$ such that $B_n^{-1}(T_n - A_n) \xrightarrow{P} 0$.

**Theorem 6.2.3:** Let $\{X_i\}_{i=1}^\infty$ be a sequence of pairwise uncorrelated rv's with $E(X_i) = \mu_i$ and $Var(X_i) = \sigma_i^2$. If $\sum_{i=1}^n \sigma_i^2 \to \infty$ as $n \to \infty$, we can choose $A_n = \sum_{i=1}^n \mu_i$ and $B_n = \sum_{i=1}^n \sigma_i^2$ and get

$$\frac{\sum_{i=1}^n (X_i - \mu_i)}{\sum_{i=1}^n \sigma_i^2} \xrightarrow{P} 0.$$

*Lecture 39: Fr 12/01/00*

Proof: By Markov's Inequality (Corollary 3.5.2), it holds for all $\epsilon > 0$:

$$P\left(\left|\sum_{i=1}^n X_i - \sum_{i=1}^n \mu_i\right| \ge \epsilon \sum_{i=1}^n \sigma_i^2\right) \le \frac{E\left(\left(\sum_{i=1}^n (X_i - \mu_i)\right)^2\right)}{\epsilon^2 \left(\sum_{i=1}^n \sigma_i^2\right)^2} = \frac{\sum_{i=1}^n \sigma_i^2}{\epsilon^2 \left(\sum_{i=1}^n \sigma_i^2\right)^2} = \frac{1}{\epsilon^2 \sum_{i=1}^n \sigma_i^2} \to 0 \text{ as } n \to \infty.$$

**Note:** To obtain Theorem 6.2.1, we choose $A_n = n\mu$ and $B_n = n\sigma^2$.

**Theorem 6.2.4:** Let $\{X_i\}_{i=1}^\infty$ be a sequence of rv's. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$. A necessary and sufficient condition for $\{X_i\}$ to obey the WLLN with respect to $B_n = n$ is that

$$E\left(\frac{\bar{X}_n^2}{1 + \bar{X}_n^2}\right) \to 0 \text{ as } n \to \infty.$$

Proof: Rohatgi, page 258, Theorem 2, and Rohatgi/Saleh, page 275, Theorem 2.

**Example**
6.2.5: Let $X_1, \ldots, X_n$ be jointly normal with $E(X_i) = 0$, $Var(X_i) = 1$ for all $i$, and $Cov(X_i, X_j) = \rho$ if $|i - j| = 1$, $Cov(X_i, X_j) = 0$ if $|i - j| > 1$. Let $T_n = \sum_{i=1}^n X_i$. Then $T_n \sim N(0, n + 2(n-1)\rho) = N(0, \sigma_n^2)$. It is

$$E\left(\frac{\bar{X}_n^2}{1 + \bar{X}_n^2}\right) = E\left(\frac{T_n^2}{n^2 + T_n^2}\right) = \int_{-\infty}^{\infty} \frac{x^2}{n^2 + x^2} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_n} e^{-x^2/(2\sigma_n^2)}\, dx \stackrel{x = \sigma_n y}{=} \int_{-\infty}^{\infty} \frac{\sigma_n^2 y^2}{n^2 + \sigma_n^2 y^2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\, dy \le \frac{n + 2(n-1)\rho}{n^2} \int_{-\infty}^{\infty} y^2 \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\, dy \stackrel{(*)}{=} \frac{n + 2(n-1)\rho}{n^2} \to 0 \text{ as } n \to \infty,$$

where $(*)$ holds since the last integral equals 1, the variance of the $N(0,1)$ distribution. Thus, $\{X_i\}$ obeys the WLLN by Theorem 6.2.4.

**Note:** We would like to have a WLLN that just depends on means, but does not depend on the existence of finite variances. To approach this, we consider the following. Let $\{X_i\}_{i=1}^\infty$ be a sequence of rv's. Let $T_n = \sum_{i=1}^n X_i$. We truncate each $|X_i|$ at $c > 0$ and get

$$X_i^c = \begin{cases} X_i, & |X_i| \le c \\ 0, & \text{otherwise} \end{cases}$$

Let $T_n^c = \sum_{i=1}^n X_i^c$ and $m_n = \sum_{i=1}^n E(X_i^c)$.

**Lemma 6.2.6:** For $T_n$, $T_n^c$, and $m_n$ as defined in the note above, it holds:

$$P(|T_n - m_n| > \epsilon) \le P(|T_n^c - m_n| > \epsilon) + \sum_{i=1}^n P(|X_i| > c) \quad \forall \epsilon > 0.$$

Proof: It holds for all $\epsilon > 0$:

$$P(|T_n - m_n| > \epsilon) = P(|T_n - m_n| > \epsilon \text{ and } |X_i| \le c \ \forall i \in \{1, \ldots, n\}) + P(|T_n - m_n| > \epsilon \text{ and } |X_i| > c \text{ for at least one } i \in \{1, \ldots, n\})$$
$$\stackrel{(*)}{\le} P(|T_n^c - m_n| > \epsilon) + P(|X_i| > c \text{ for at least one } i \in \{1, \ldots, n\}) \le P(|T_n^c - m_n| > \epsilon) + \sum_{i=1}^n P(|X_i| > c).$$

$(*)$ holds since $T_n^c = T_n$ when $|X_i| \le c \ \forall i \in \{1, \ldots, n\}$.

**Note:** If the $X_i$'s are identically distributed, then $P(|T_n - m_n| > \epsilon) \le P(|T_n^c - m_n| > \epsilon) + nP(|X_1| > c) \ \forall \epsilon > 0$. Note that $P(|X_i| > c) = P(|X_1| > c) \ \forall i$ if the $X_i$'s are identically distributed, and that $E((X_i^c)^2) = E((X_1^c)^2) \ \forall i$ if the $X_i^c$'s are iid.

**Theorem 6.2.7 (Khintchine's WLLN):** Let $\{X_i\}_{i=1}^\infty$ be a sequence of iid rv's with finite mean $E(X_i) = \mu$. Then it holds: $\bar{X}_n = \frac{T_n}{n} \xrightarrow{P} \mu$.

Proof: If we take $c = n$ and replace $\epsilon$ by $n\epsilon$ in the note above, we get

$$P\left(\left|\frac{T_n}{n} - \frac{m_n}{n}\right| > \epsilon\right) = P(|T_n - m_n| > n\epsilon) \le P(|T_n^n - m_n| > n\epsilon) + nP(|X_1| > n).$$

Since $E(|X_1|) < \infty$, it is $nP(|X_1| > n) \to 0$ as $n \to \infty$ by Theorem 3.1.9. From Corollary 3.1.12, we know that $E(|X|^\alpha) = \alpha \int_0^\infty z^{\alpha - 1} P(|X| > z)\, dz$. Therefore, by Markov's Inequality,

$$P(|T_n^n - m_n| > n\epsilon) \le \frac{E(|T_n^n - m_n|^2)}{n^2\epsilon^2} \le \frac{n E((X_1^n)^2)}{n^2\epsilon^2} = \frac{2}{n\epsilon^2}\int_0^n z P(|X_1| > z)\, dz = \frac{2}{n\epsilon^2}\left(\int_0^A z P(|X_1| > z)\, dz + \int_A^n z P(|X_1| > z)\, dz\right) \le \frac{K}{n\epsilon^2} + \frac{2\delta(n - A)}{n\epsilon^2} \le \frac{K}{n\epsilon^2} + \frac{2\delta}{\epsilon^2},$$

where $A$ is chosen sufficiently large such that $zP(|X_1| > z) < \delta \ \forall z \ge A$, for an arbitrary constant $\delta > 0$, and $K > 0$ is a constant. Since $\delta$ is arbitrary, we can make the right-hand side of this
last inequality arbitrarily small for sufficiently large $n$, so $\frac{T_n^n - m_n}{n} \xrightarrow{P} 0$. Since $E(X_i) = \mu \ \forall i$, it is $\frac{m_n}{n} \to \mu$ as $n \to \infty$, and therefore $\bar{X}_n \xrightarrow{P} \mu$.

**Note:** Theorem 6.2.7 meets the previously stated goal of not having a finite variance requirement.

*Lecture 42: Fr 12/08/00*

### 6.3 Strong Laws of Large Numbers

**Definition 6.3.1:** Let $\{X_i\}_{i=1}^\infty$ be a sequence of rv's. Let $T_n = \sum_{i=1}^n X_i$. We say that $\{X_i\}$ obeys the SLLN with respect to a sequence of norming constants $\{B_i\}_{i=1}^\infty$, $B_i > 0$, $B_i \uparrow \infty$, if there exists a sequence of centering constants $\{A_i\}_{i=1}^\infty$ such that $B_n^{-1}(T_n - A_n) \xrightarrow{a.s.} 0$.

**Note:** Unless otherwise specified, we will only use the case $B_n = n$ in this section.

**Theorem 6.3.2:** $X_n \xrightarrow{a.s.} X \iff \lim_{n\to\infty} P\left(\sup_{m \ge n} |X_m - X| > \epsilon\right) = 0 \ \forall \epsilon > 0$.

Proof (see also Rohatgi, page 249, Theorem 11): WLOG, we can assume that $X = 0$, since $X_n \xrightarrow{a.s.} X$ implies $X_n - X \xrightarrow{a.s.} 0$. Thus, we have to prove

$$X_n \xrightarrow{a.s.} 0 \iff \lim_{n\to\infty} P\left(\sup_{m \ge n} |X_m| > \epsilon\right) = 0 \ \forall \epsilon > 0.$$

Choose $\epsilon > 0$ and define

$$A_n(\epsilon) = \left\{\sup_{m \ge n} |X_m| > \epsilon\right\}, \quad C = \{\omega : X_n(\omega) \to 0\}.$$

"$\Rightarrow$": Since $X_n \xrightarrow{a.s.} 0$, we know that $P(C) = 1$, and therefore $P(C^c) = 0$. Let $B_n(\epsilon) = C \cap A_n(\epsilon)$. Note that $B_{n+1}(\epsilon) \subseteq B_n(\epsilon)$ and for the limit set $\bigcap_{n=1}^\infty B_n(\epsilon) = \emptyset$. It follows that

$$\lim_{n\to\infty} P(B_n(\epsilon)) = P\left(\bigcap_{n=1}^\infty B_n(\epsilon)\right) = 0.$$

We also have

$$P(A_n(\epsilon)) = P(A_n(\epsilon) \cap C) + P(A_n(\epsilon) \cap C^c) \le P(B_n(\epsilon)) + P(C^c) = P(B_n(\epsilon)) \to 0 \text{ as } n \to \infty.$$

"$\Leftarrow$": Assume that $\lim_{n\to\infty} P(A_n(\epsilon)) = 0 \ \forall \epsilon > 0$, and define $D(\epsilon) = \{\omega : |X_n(\omega)| > \epsilon \text{ infinitely often}\}$. Since $D(\epsilon) \subseteq A_n(\epsilon) \ \forall n \in \mathbb{N}$, it follows that $P(D(\epsilon)) = 0 \ \forall \epsilon > 0$. Also,

$$C^c = \left\{\omega : \lim_{n\to\infty} X_n(\omega) \ne 0\right\} \subseteq \bigcup_{k=1}^\infty D\left(\frac{1}{k}\right) \Rightarrow P(C^c) \le \sum_{k=1}^\infty P\left(D\left(\frac{1}{k}\right)\right) = 0 \Rightarrow X_n \xrightarrow{a.s.} 0.$$

**Note:**
(i) $X_n \xrightarrow{a.s.} 0$ implies that $\forall \epsilon > 0 \ \forall \delta > 0 \ \exists n_0 \in \mathbb{N}: P\left(\sup_{n \ge n_0} |X_n| > \epsilon\right) < \delta$.
(ii) Recall that, for a given sequence of events $\{A_n\}_{n=1}^\infty$, $\limsup_n A_n = \bigcap_{n=1}^\infty \bigcup_{k=n}^\infty A_k$ is the event that infinitely many of the $A_n$ occur. We write $P(A_n \text{ i.o.}) = P(\limsup_n A_n)$, where "i.o." stands for "infinitely often".
(iii) Using the terminology defined in (ii) above, we can rewrite Theorem 6.3.2 as $X_n \xrightarrow{a.s.} 0 \iff P(|X_n| > \epsilon \text{ i.o.}) = 0 \ \forall \epsilon > 0$.

**Theorem 6.3.3 (Borel-Cantelli Lemma):**
(i) (1st BC Lemma) Let $\{A_n\}_{n=1}^\infty$ be a sequence of events such that $\sum_{n=1}^\infty P(A_n) < \infty$. Then $P(A_n \text{ i.o.}) = 0$.
(ii) (2nd BC Lemma) Let $\{A_n\}_{n=1}^\infty$ be a sequence of independent events such that $\sum_{n=1}^\infty P(A_n) = \infty$. Then $P(A_n \text{ i.o.}) = 1$.

Proof:
(i) $P(\limsup_n A_n) = P\left(\bigcap_{n=1}^\infty \bigcup_{k=n}^\infty A_k\right) = \lim_{n\to\infty} P\left(\bigcup_{k=n}^\infty A_k\right) \le \lim_{n\to\infty} \sum_{k=n}^\infty P(A_k) \stackrel{(*)}{=} 0$, where $(*)$ holds since $\sum_{n=1}^\infty P(A_n) < \infty$.
(ii) We have $(\limsup_n A_n)^c = \bigcup_{n=1}^\infty \bigcap_{k=n}^\infty A_k^c$.
Therefore,

$$P\left(\left(\limsup_n A_n\right)^c\right) = P\left(\bigcup_{n=1}^\infty \bigcap_{k=n}^\infty A_k^c\right) \le \sum_{n=1}^\infty P\left(\bigcap_{k=n}^\infty A_k^c\right).$$

If we choose $n_0 > n$, it holds that $\bigcap_{k=n}^\infty A_k^c \subseteq \bigcap_{k=n}^{n_0} A_k^c$. Therefore,

$$P\left(\bigcap_{k=n}^\infty A_k^c\right) \le \lim_{n_0\to\infty} P\left(\bigcap_{k=n}^{n_0} A_k^c\right) = \lim_{n_0\to\infty} \prod_{k=n}^{n_0} (1 - P(A_k)) \stackrel{(*)}{\le} \lim_{n_0\to\infty} \exp\left(-\sum_{k=n}^{n_0} P(A_k)\right) = 0,$$

*Lecture 02: We 01/10/01*

so $P(A_n \text{ i.o.}) = 1$. $(*)$ holds since $1 - a_j \le e^{-a_j}$ for $0 \le a_j \le 1$, so $\prod_{j=n}^{n_0} (1 - a_j) \le \exp\left(-\sum_{j=n}^{n_0} a_j\right)$ for $n_0 > n$, and $\sum_{k=n}^{n_0} P(A_k) \to \infty$ as $n_0 \to \infty$.

**Example 6.3.4 (Independence is necessary for the 2nd BC Lemma):** Let $\Omega = (0,1)$ and $P$ a uniform distribution on $\Omega$. Let $A_n = \left(0, \frac{1}{n}\right)$. Then $\sum_{n=1}^\infty P(A_n) = \sum_{n=1}^\infty \frac{1}{n} = \infty$. But for any $\omega \in \Omega$, $\omega \in A_n$ occurs only for $n = 1, 2, \ldots, \lfloor \frac{1}{\omega} \rfloor$, where $\lfloor x \rfloor$ denotes the largest integer (floor) that is $\le x$. Therefore, $P(A_n \text{ i.o.}) = 0 \ne 1$.

**Lemma 6.3.5 (Kolmogorov's Inequality):** Let $\{X_i\}_{i=1}^\infty$ be a sequence of independent rv's with common mean $E(X_i) = 0$ and variances $\sigma_i^2$. Let $T_n = \sum_{i=1}^n X_i$. Then it holds:

$$P\left(\max_{1 \le k \le n} |T_k| > \epsilon\right) \le \frac{\sum_{i=1}^n \sigma_i^2}{\epsilon^2} \quad \forall \epsilon > 0.$$

Proof: See Rohatgi, page 268, Lemma 2, and Rohatgi/Saleh, page 284, Lemma 1.

**Lemma 6.3.6 (Kronecker's Lemma):** For any real numbers $\{x_n\}$, if $\sum_{n=1}^\infty x_n$ converges to $s < \infty$ and $B_n \uparrow \infty$, then it holds:

$$\frac{1}{B_n}\sum_{k=1}^n B_k x_k \to 0 \text{ as } n \to \infty.$$

Proof: See Rohatgi, page 269, Lemma 3, and Rohatgi/Saleh, page 285, Lemma 2.

**Theorem 6.3.7 (Cauchy Criterion):**

$$X_n \xrightarrow{a.s.} X \iff \lim_{n\to\infty} P\left(\sup_{m} |X_{n+m} - X_n| \le \epsilon\right) = 1 \ \forall \epsilon > 0.$$

Proof: See Rohatgi, page 270, Theorem 5.

**Theorem 6.3.8:** If $\sum_{n=1}^\infty Var(X_n) < \infty$, then $\sum_{n=1}^\infty (X_n - E(X_n))$ converges almost surely.

Proof: See Rohatgi, page 272, Theorem 6, and Rohatgi/Saleh, page 286, Theorem 4.

**Corollary 6.3.9:** Let $\{X_i\}_{i=1}^\infty$ be a sequence of independent rv's. Let $\{B_i\}_{i=1}^\infty$, $B_i > 0$, $B_i \uparrow \infty$, be a sequence of norming constants. Let $T_n = \sum_{i=1}^n X_i$. If $\sum_{i=1}^\infty \frac{Var(X_i)}{B_i^2} < \infty$, then it holds:

$$\frac{T_n - E(T_n)}{B_n} \xrightarrow{a.s.} 0.$$

Proof: This corollary follows directly from Theorem 6.3.8 and Lemma 6.3.6.

**Lemma 6.3.10 (Equivalence Lemma):** Let $\{X_i\}_{i=1}^\infty$ and $\{X_i'\}_{i=1}^\infty$ be sequences of rv's. Let $T_n = \sum_{i=1}^n X_i$ and $T_n' = \sum_{i=1}^n X_i'$. If the series $\sum_{i=1}^\infty P(X_i \ne X_i') < \infty$, then the sequences $\{X_i\}$ and $\{X_i'\}$ are tail-equivalent, and $T_n$ and $T_n'$ are convergence-equivalent, i.e., for $B_n \uparrow \infty$ the sequences $B_n^{-1} T_n$ and $B_n^{-1} T_n'$ converge on the same event and to the same limit, except for a null set.

Proof: See Rohatgi, page 266, Lemma 1.

**Lemma 6.3.11:** Let $X$ be a rv with $E(|X|) < \infty$. Then it holds:

$$\sum_{n=1}^\infty P(|X| \ge n) \le E(|X|) \le 1 + \sum_{n=1}^\infty P(|X| \ge n).$$
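Before turning to the proof, the two bounds of Lemma 6.3.11 can be checked numerically for a distribution where both sides have closed forms. The sketch below is not part of the original notes: it uses an Exponential(1) rv, for which $E(|X|) = 1$ and $P(|X| \ge n) = e^{-n}$, and the helper name `tail_sum` is an illustrative choice.

```python
import math

def tail_sum(sf, tol=1e-15):
    """Accumulate P(|X| >= n) for n = 1, 2, ... until terms are negligible.
    sf(n) must return the survival probability P(|X| >= n)."""
    total, n = 0.0, 1
    while True:
        term = sf(n)
        total += term
        if term < tol:
            return total
        n += 1

# Exponential(1): E(|X|) = 1 and P(X >= n) = exp(-n), so the tail sum
# has the closed form 1 / (e - 1) ~ 0.582.
s = tail_sum(lambda n: math.exp(-n))
e_abs = 1.0  # E(|X|) for Exponential(1)

# Lemma 6.3.11: tail sum <= E(|X|) <= 1 + tail sum
assert s <= e_abs <= 1 + s
print(round(s, 4), e_abs, round(1 + s, 4))
```

For an integer-valued nonnegative rv the lower bound is attained with equality ($E(X) = \sum_{n\ge1} P(X \ge n)$), which is exactly the identity used twice in the proof below.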
Proof (continuous case only): Let $X$ have a pdf $f$. Then it holds:

$$E(|X|) = \int_{-\infty}^\infty |x| f(x)\, dx = \sum_{k=0}^\infty \int_{\{k \le |x| < k+1\}} |x| f(x)\, dx,$$

and thus

$$\sum_{k=0}^\infty k\, P(k \le |X| < k+1) \le E(|X|) \le \sum_{k=0}^\infty (k+1)\, P(k \le |X| < k+1).$$

It is

$$\sum_{k=0}^\infty k\, P(k \le |X| < k+1) = \sum_{k=0}^\infty \sum_{n=1}^k P(k \le |X| < k+1) = \sum_{n=1}^\infty \sum_{k=n}^\infty P(k \le |X| < k+1) = \sum_{n=1}^\infty P(|X| \ge n).$$

Similarly,

$$\sum_{k=0}^\infty (k+1)\, P(k \le |X| < k+1) = \sum_{n=1}^\infty P(|X| \ge n) + \sum_{k=0}^\infty P(k \le |X| < k+1) = 1 + \sum_{n=1}^\infty P(|X| \ge n).$$

**Theorem 6.3.12:** Let $\{X_i\}_{i=1}^\infty$ be a sequence of independent rv's. Then it holds:

$$X_n \xrightarrow{a.s.} 0 \iff \sum_{n=1}^\infty P(|X_n| > \epsilon) < \infty \ \forall \epsilon > 0.$$

Proof: See Rohatgi, page 265, Theorem 3.

*Lecture 03: Fr 01/12/01*

**Theorem 6.3.13 (Kolmogorov's SLLN):** Let $\{X_i\}_{i=1}^\infty$ be a sequence of iid rv's. Let $T_n = \sum_{i=1}^n X_i$. Then it holds:

$$\frac{T_n}{n} = \bar{X}_n \xrightarrow{a.s.} \mu < \infty \iff E(|X_1|) < \infty \ (\text{and then } \mu = E(X_1)).$$

Proof:
"$\Rightarrow$": Suppose that $\bar{X}_n \xrightarrow{a.s.} \mu < \infty$. It is

$$\frac{X_n}{n} = \frac{T_n}{n} - \frac{n-1}{n} \cdot \frac{T_{n-1}}{n-1} \xrightarrow{a.s.} \mu - 1 \cdot \mu = 0.$$

By Theorem 6.3.12, we have

$$\sum_{n=1}^\infty P\left(\left|\frac{X_n}{n}\right| \ge 1\right) < \infty \Rightarrow \sum_{n=1}^\infty P(|X_1| \ge n) < \infty \stackrel{\text{Lemma 6.3.11}}{\Longrightarrow} E(|X_1|) < \infty.$$

By Theorem 6.2.7 (WLLN), $\bar{X}_n \xrightarrow{P} E(X_1)$. Since $\bar{X}_n \xrightarrow{a.s.} \mu$, it follows by Theorem 6.1.26 that $\bar{X}_n \xrightarrow{P} \mu$. Therefore, it must hold that $\mu = E(X_1)$ by Theorem 6.1.11 (ii).

"$\Leftarrow$": Let $E(|X_1|) < \infty$. Define truncated rv's $X_k'$:

$$X_k' = \begin{cases} X_k, & |X_k| < k \\ 0, & \text{otherwise} \end{cases}$$

and let $T_n' = \sum_{k=1}^n X_k'$. Then it holds:

$$\sum_{k=1}^\infty P(X_k \ne X_k') = \sum_{k=1}^\infty P(|X_k| \ge k) = \sum_{k=1}^\infty P(|X_1| \ge k) \stackrel{\text{Lemma 6.3.11}}{\le} E(|X_1|) < \infty.$$

By Lemma 6.3.10, it follows that $T_n$ and $T_n'$ are convergence-equivalent. Thus, it is sufficient to prove that $\frac{T_n'}{n} \xrightarrow{a.s.} \mu$. We now establish the conditions needed in Corollary 6.3.9. It is

$$\sum_{n=1}^\infty \frac{Var(X_n')}{n^2} \le \sum_{n=1}^\infty \frac{E(X_n'^2)}{n^2} = \sum_{n=1}^\infty \frac{1}{n^2} \sum_{k=0}^{n-1} \int_{\{k \le |x| < k+1\}} x^2 f_X(x)\, dx \le \sum_{n=1}^\infty \frac{1}{n^2} \sum_{k=0}^{n-1} (k+1)^2 P(k \le |X_1| < k+1). \quad (A)$$

From Bronstein, page 30, we know that $\frac{1}{1\cdot2} + \frac{1}{2\cdot3} + \frac{1}{3\cdot4} + \cdots = 1$. Since $\frac{1}{n(n-1)} = \frac{1}{n-1} - \frac{1}{n}$ telescopes, it is

$$\sum_{n=k+1}^\infty \frac{1}{n^2} \le \sum_{n=k+1}^\infty \frac{1}{n(n-1)} = \frac{1}{k} \le \frac{2}{k+1} \text{ for } k \ge 1, \quad \text{and} \quad \sum_{n=1}^\infty \frac{1}{n^2} \le 2 = \frac{2}{0+1} \text{ for } k = 0.$$

Rearranging the first two sums in (A) and using this result, we get

$$\sum_{n=1}^\infty \frac{Var(X_n')}{n^2} \le \sum_{k=0}^\infty (k+1)^2 P(k \le |X_1| < k+1) \sum_{n=k+1}^\infty \frac{1}{n^2} \le 2\sum_{k=0}^\infty (k+1) P(k \le |X_1| < k+1) \le 2(1 + E(|X_1|)) < \infty, \quad (B)$$

where the last step uses an inequality from the proof of Lemma 6.3.11, i.e., $\sum_{k=0}^\infty (k+1) P(k \le |X_1| < k+1) = 1 + \sum_{n=1}^\infty P(|X_1| \ge n) \le 1 + E(|X_1|)$.

Thus, the conditions needed in Corollary 6.3.9 are met. With $B_n = n$, it follows that

$$\frac{T_n' - E(T_n')}{n} \xrightarrow{a.s.} 0. \quad (C)$$

Since $E(X_n') \to E(X_1)$ as $n \to \infty$, it follows (as in Kronecker's Lemma 6.3.6) that $\frac{E(T_n')}{n} = \frac{1}{n}\sum_{k=1}^n E(X_k') \to E(X_1)$. Thus, when we replace $\frac{E(T_n')}{n}$ by $E(X_1)$ in (C), we get $\frac{T_n'}{n} \xrightarrow{a.s.} E(X_1)$, and by Lemma 6.3.10 also $\frac{T_n}{n} \xrightarrow{a.s.} E(X_1)$, since $T_n$ and $T_n'$ are convergence-equivalent as defined in Lemma 6.3.10.

*Lecture 04: We 01/17/01*

### 6.4 Central Limit Theorems

Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's with cdf's $\{F_n\}_{n=1}^\infty$. Suppose that the mgf $M_n(t)$ of $X_n$ exists. Questions: Does $M_n(t)$ converge? Does it converge to a mgf $M(t)$? If it does converge, does it hold that $X_n \xrightarrow{d} X$ for some rv $X$?

**Example 6.4.1:** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's such that $P(X_n = -n) = 1$. Then the mgf is $M_n(t) = E(e^{tX_n}) = e^{-nt}$. So

$$\lim_{n\to\infty} M_n(t) = \begin{cases} 0, & t > 0 \\ 1, & t = 0 \\ \infty, & t < 0 \end{cases}$$

So $M_n(t)$ does not converge to a mgf, and $F_n(x) \to 1 \ \forall x$, but this limit is not a cdf.

**Note:** Due to Example 6.4.1, the existence of mgf's $M_n(t)$ that converge to something is not enough to conclude convergence in distribution. Conversely, suppose that $X_n$ has mgf $M_n(t)$, $X$ has mgf $M(t)$, and $X_n \xrightarrow{d} X$. Does it hold that $M_n(t) \to M(t)$? Not necessarily. See Rohatgi, page 277, Example 2, and Rohatgi/Saleh, page 289, Example 2, as a counterexample. Thus, convergence in distribution of rv's that all have mgf's does not imply the convergence of mgf's. However, we can make the following statement in the next theorem.

**Theorem 6.4.2 (Continuity Theorem):** Let $\{X_n\}_{n=1}^\infty$ be a sequence of rv's with cdf's $\{F_n\}_{n=1}^\infty$ and mgf's $\{M_n(t)\}_{n=1}^\infty$. Suppose that $M_n(t)$ exists for $|t| \le t_0 \ \forall n$. If there exists a rv $X$ with cdf $F$ and mgf $M(t)$, which exists for $|t| \le t_1 < t_0$, such that $\lim_{n\to\infty} M_n(t) = M(t) \ \forall t \in [-t_1, t_1]$, then $F_n \xrightarrow{w} F$, i.e., $X_n \xrightarrow{d} X$.

**Example 6.4.3:** Let $X_n \sim Bin(n, \frac{\lambda}{n})$. Recall, e.g., from Theorem 3.3.12 and related theorems, that for $X \sim Bin(n, p)$ the mgf is $M_X(t) = (1 - p + pe^t)^n$. Thus,

$$M_n(t) = \left(1 - \frac{\lambda}{n} + \frac{\lambda}{n}e^t\right)^n = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n \to e^{\lambda(e^t - 1)} \text{ as } n \to \infty.$$

Here we use the fact that $\lim_{n\to\infty} \left(1 + \frac{c}{n}\right)^n = e^c$. Recall that $e^{\lambda(e^t - 1)}$ is the mgf of a rv $X$ where $X \sim Poisson(\lambda)$. Thus, we have
established the well-known result that the Binomial distribution approaches the Poisson distribution, given that $n \to \infty$ in such a way that $np \to \lambda > 0$.

**Note:** Recall Theorem 3.3.11: Suppose that $\{X_n\}_{n=1}^\infty$ is a sequence of rv's with characteristic functions $\{\Phi_n(t)\}_{n=1}^\infty$. Suppose that $\lim_{n\to\infty} \Phi_n(t) = \Phi(t) \ \forall t \in (-h, h)$ for some $h > 0$, and $\Phi(t)$ is the characteristic function of a rv $X$. Then $X_n \xrightarrow{d} X$.

**Theorem 6.4.4 (Lindeberg-Lévy Central Limit Theorem):** Let $\{X_n\}_{n=1}^\infty$ be a sequence of iid rv's with $E(X_i) = \mu$ and $0 < Var(X_i) = \sigma^2 < \infty$. Then it holds for $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ that

$$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} Z, \text{ where } Z \sim N(0,1).$$

Proof: Let $Z \sim N(0,1)$. According to Theorem 3.3.12 (v), the characteristic function of $Z$ is $\Phi_Z(t) = \exp(-\frac{t^2}{2})$. Let $\Phi(t)$ be the characteristic function of $Y_j = \frac{X_j - \mu}{\sigma}$. We now determine the characteristic function $\Phi_n(t)$ of $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} = \frac{1}{\sqrt{n}}\sum_{j=1}^n Y_j$:

$$\Phi_n(t) = E\left(\exp\left(i\frac{t}{\sqrt{n}}\sum_{j=1}^n Y_j\right)\right) = \prod_{j=1}^n E\left(\exp\left(i\frac{t}{\sqrt{n}} Y_j\right)\right) = \left(\Phi\left(\frac{t}{\sqrt{n}}\right)\right)^n.$$

Recall from Theorem 3.3.5 that if the $k$th moment exists, then $\Phi^{(k)}(0) = i^k E(Y^k)$. In particular, it holds here that $\Phi^{(1)}(0) = iE(Y) = 0$ and $\Phi^{(2)}(0) = i^2 E(Y^2) = -1$. Also recall the definition of a Taylor series in MacLaurin's form:

$$f(x) = f(0) + x f'(0) + \frac{x^2}{2} f''(0) + \frac{x^3}{6} f'''(0) + \cdots$$

Thus, if we develop a Taylor series for $\Phi$ around 0, we get

$$\Phi(s) = 1 + s\,\Phi^{(1)}(0) + \frac{s^2}{2}\Phi^{(2)}(0) + o(s^2) = 1 - \frac{s^2}{2} + o(s^2).$$

Here we make use of the Landau symbol $o$: in general, if we write $u(x) = o(v(x))$ for $x \to L$, this implies $\lim_{x\to L} \frac{u(x)}{v(x)} = 0$, i.e., $u(x)$ goes to 0 faster than $v(x)$. We say that $u(x)$ is of smaller order than $v(x)$ as $x \to L$; an example is $x^2 = o(x^3)$ for $x \to \infty$. See Rohatgi, page 6, for more details on the Landau symbols $O$ and $o$. Combining these results, we get

$$\Phi_n(t) = \left(1 - \frac{t^2}{2n} + o\left(\frac{t^2}{n}\right)\right)^n \stackrel{(*)}{\to} \exp\left(-\frac{t^2}{2}\right) \text{ as } n \to \infty.$$

Thus, $\Phi_n(t) \to \Phi_Z(t) \ \forall t$. For a proof of $(*)$, see Rohatgi, page 278, Lemma 1. According to the note above, it holds that $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} Z$.

**Definition 6.4.5:** Let $X_1, X_2$ be iid non-degenerate rv's with common cdf $F$. Let $a_1, a_2 > 0$. We say that $F$ is stable if there exist constants $A$ and $B$ (depending on $a_1$ and $a_2$) such that $B^{-1}(a_1 X_1 + a_2 X_2 - A)$ also has cdf $F$.

**Note:** When generalizing the previous definition to sequences of rv's, we have the following examples for stable distributions:
2 oltlgtgt n 71 t2 exp7 aanOo Thus7 Dnt ltIgtzt Vt For a proof of 9 see Rohatgi7 page 2787 Lemma 1 According to the Note above7 it holds that De nition 645 Let X17X2 be iid nonidegenerate rV s with common cdf F Let ahag gt 0 We say that F is stable if there exist constants A and B depending on an and a2 such that B 1a1X1 1ng 7 A also has cdf F l Note When generalizing the previous de nition to sequences of rV s7 we have the following examples for stable distributions 7L 0 X iid Cauchy Then N Cauchy here B 7114 0 i1 7L X1 iid N0 1 Then 2X1 N0 1 here B WA 0 i1 De nition 646 11 Let XiBil be a sequence of iid rV s with common cdf F Let T ZXi F belongs to i1 the domain of attraction of a distribution V if there exist norming and centering constants 1373311 B gt 07 and A fj such that PB1Tn An S FB1TTLAWm H Vm as n a 00 at all continuity points z of V Note A very general Theorem from Loeve states that only stable distributions can have domains of attraction From the practical point of View7 a wide class of distributions F belong to the domain of attraction of the Normal distribution l I Lecture 05 Fr 011901 Theorem 647 Lindeberg Central Limit Theorem Let X3221 be a sequence of independent nonidegenerate rV s with cdf s Assume 71 that EXk uk and VarXk 0 lt 00 Let 3 20 k1 If the Fk are absolutely continuous with pdf s fk Fig assume that it holds for all e gt 0 that 1 n 2 A lim i 35 7 M F1zdz o H 8 k2 lm Mkbeml If the Xk are discrete rV s with support zkl and probabilities pkl l 1 2 assume that it holds for all e gt 0 that 1 71 B 7 Z Z W szpkz 0 8 k1 lwkrmlgt69n The conditions A and B are called Lindeberg Condition LC If either LC holds then M Xk Mk 2 w H H 5n where Z N N01 Proof Similar to the proof of Theorem 644 we can use characteristic functions again An alterna tive proof is given in Rohatgi pages 2827288 I Note Feller shows that the LC is a necessary condition if g a 0 and 3 a 00 as n a 00 l Corollary 648 Let X3221 be a sequence of iid rV s such that iXi has the same 
distribution for all 71 If 0 and VarX 1 then X N N0 1 111 M Let F be the common cdf of iXi for all 71 including 71 1 By the CLT 11 1 n 11310 MW g z ltIgtz n where Mm denotes PZ z for Z N NO1 Also P EX z for each n i1 Therefore we must have l Note In general if X1X2 are independent rV s such that there exists a constant A with Pl Xn lg A 1 Vn then the LC is satis ed if 32 a 00 as n a 00 Why Suppose that 3 a 00 as n a 00 Since the l Xk s are uniformly bounded by A so are the rV s Xk 7 Thus for every 6 gt 0 there exists an N5 such that if n 2 N5 then Pl Xk7EXk llt 63 k 1n 1 This implies that the LC holds since we would integrate or sum over the empty set ie the set z 7 uk lgt 68 Q The converse also holds For a sequence of uniformly bounded independent rV s a necessary and suf cient condition for the CLT to hold is that 32 a 00 as n a 00 I Let X3221 be a sequence of independent rV s such that EXk 07 04k Xk PM lt 00 71 for some 6 gt O and Z 04k 0SEL639 k l Does the LC hold It is 1 n A 1 n m 25 2 2fkd S Zst 7ng J WEWE 2 n k1 lwlgt59nl 8 k1 l A 0 as n a 00 6 71 A holds since for l x lgt 63 it is 2 gt 1 B holds since Zak os 6 k1 Thus the LC is satis ed and the CLT holds l Note i In general7 if there exists a 6 gt 0 such that 19le Mk l2 M k 1 836 EOaanoo then the LC holds A V Both the CLT and the WLLN hold for a large class of sequences of rv s XiHa If the Xi s are independent uniformly bounded rv s7 ie7 if Pl Xn lg M 1 Vn the WLLN as formulated in Theorem 623 holds The CLT holds provided that 3 a 00 as n a 00 If the rv s are iid7 then the CLT is a stronger result than the WLLN since the CLT n provides an estimate of the probability P l 2X1 7 71p 2 e z 17 Pl Z lg EVE i1 7 where Z N NO17 and the WLLN follows However7 note that the CLT requires the existence of a 2nd moment while the WLLN does not iii If the are independent but not identically distributed rv s7 the CLT may apply While the WLLN does not iv See Rohatgi7 pages 28972937 and RohatgiSaleh7 pages 29973037 for 
additional details and examples.

7 Sample Moments

7.1 Random Sampling

(Based on Casella/Berger, Sections 5.1 & 5.2)

Definition 7.1.1: Let $X_1, \ldots, X_n$ be iid rv's with common cdf $F$. We say that $X_1, \ldots, X_n$ is a random sample of size $n$ from the population distribution $F$. The vector of values $(x_1, \ldots, x_n)$ is called a realization of the sample. A rv $g(X_1, \ldots, X_n)$, which is a Borel-measurable function of $X_1, \ldots, X_n$ and does not depend on any unknown parameter, is called a (sample) statistic. ∎

Definition 7.1.2: Let $X_1, \ldots, X_n$ be a sample of size $n$ from a population with distribution $F$. Then
$$\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$$
is called the sample mean and
$$S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2$$
is called the sample variance. ∎

Definition 7.1.3: Let $X_1, \ldots, X_n$ be a sample of size $n$ from a population with distribution $F$. The function
$$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^n I_{(-\infty, x]}(X_i)$$
is called the empirical cumulative distribution function (empirical cdf). ∎

Note: For any fixed $x \in \mathbb{R}$, $\hat{F}_n(x)$ is a rv. ∎

Theorem 7.1.4: The rv $\hat{F}_n(x)$ has pmf
$$P\left(\hat{F}_n(x) = \frac{j}{n}\right) = \binom{n}{j} (F(x))^j (1 - F(x))^{n-j}, \quad j = 0, 1, \ldots, n,$$
with $E(\hat{F}_n(x)) = F(x)$ and $Var(\hat{F}_n(x)) = \frac{F(x)(1 - F(x))}{n}$.

Proof: It is $I_{(-\infty, x]}(X_i) \sim Bin(1, F(x))$. Then $n \hat{F}_n(x) \sim Bin(n, F(x))$. The results follow immediately. ∎

Corollary 7.1.5: By the WLLN, it follows that $\hat{F}_n(x) \stackrel{p}{\longrightarrow} F(x)$. ∎

Corollary 7.1.6: By the CLT, it follows that
$$\sqrt{n} \, \frac{\hat{F}_n(x) - F(x)}{\sqrt{F(x)(1 - F(x))}} \stackrel{d}{\longrightarrow} Z,$$
where $Z \sim N(0, 1)$. ∎

Theorem 7.1.7 (Glivenko–Cantelli Theorem): $\hat{F}_n(x)$ converges uniformly to $F(x)$, i.e., it holds for all $\epsilon > 0$ that
$$\lim_{n \to \infty} P\left(\sup_{-\infty < x < \infty} \mid \hat{F}_n(x) - F(x) \mid \, > \epsilon\right) = 0.$$
∎

[Lecture 06, Mo 01/22/01]

Definition 7.1.8: Let $X_1, \ldots, X_n$ be a sample of size $n$ from a population with distribution $F$. We call
$$a_k = \frac{1}{n} \sum_{i=1}^n X_i^k$$
the sample moment of order $k$ and
$$b_k = \frac{1}{n} \sum_{i=1}^n (X_i - a_1)^k = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^k$$
the sample central moment of order $k$. ∎

Note: It is $b_1 = 0$ and $b_2 = \frac{n-1}{n} S^2$. ∎

Theorem 7.1.9: Let $X_1, \ldots, X_n$ be a sample of size $n$ from a population with distribution $F$. Assume that $E(X) = \mu$, $Var(X) = \sigma^2$, and $E((X - \mu)^k) = \mu_k$ exist. Then it holds:

(i) $E(a_1) = \mu$

(ii) $Var(a_1) = Var(\bar{X}) = \frac{\sigma^2}{n}$

(iii) $E(b_2) = \frac{n-1}{n} \sigma^2$

(iv) $Var(b_2) = \left(\frac{n-1}{n}\right)^2 Var(S^2)$

(v) $E(S^2) = \sigma^2$

(vi) $Var(S^2) = \frac{1}{n} \left(\mu_4 - \frac{n-3}{n-1} \sigma^4\right)$

Proof:

(i) $E(a_1) = \frac{1}{n} \sum_{i=1}^n E(X_i) = \mu$.

(ii) $Var(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^n Var(X_i) = \frac{\sigma^2}{n}$.

(iii) It is
$$E(b_2) = E\left(\frac{1}{n} \sum_{i=1}^n X_i^2 - \bar{X}^2\right) = \frac{1}{n} \sum_{i=1}^n E(X_i^2) - E\left(\left(\frac{1}{n} \sum_{i=1}^n X_i\right)^2\right)$$
$$= E(X^2) - \frac{1}{n^2} E\left(\sum_{i=1}^n X_i^2 + \sum_{i=1}^n \sum_{j \neq i} X_i X_j\right) \stackrel{(*)}{=} E(X^2) - \frac{1}{n^2} \left(n E(X^2) + n(n-1)\mu^2\right)$$
$$= \frac{n-1}{n} \left(E(X^2) - \mu^2\right) = \frac{n-1}{n} \sigma^2.$$
$(*)$ holds since $X_i$ and
Xj are independent and then7 due to Theorem 4537 it holds that See CasellaBerger7 page 2147 and Rohatgi7 page 30373067 for the proof of parts iv through Vi and results regarding the 37d and 4th moments and covariances l 72 Sample Moments and the Normal Distribution Based on CasellaBerger Section 53 Theorem 721 Let X1Xn be iid Nu02 rV s Then X and X1 7XXn 7X are independent 111 m By computing the joint mgf of X X1 7 X X2 7 X Xn 7 X we can use Theorem 463 iv to show independence We will use the following two facts 13 aztz exp 25114 A holds by Theorem 464 B follows from Theorem 3312 Vi since the Xi s are iid 2 D 461 n 7 MX17YX27YWXW7Y751 t2 tn 5f E exp tiXi 7 i1 7L i 7L E exp tiX 7 XZtgt i1 11 7L 1 7L E exp Xitl 7 where f 7 Zti i1 n i1 n 7 E expXit7t i1 n 22517 2 Hexpltptrt U 2 gt i1 n n 2 7 039 7 exp HEW t 7 205139 i 792 11 39 i1 0 02 n exp 7 32gt i1 C follows from Theorem 453 since the Xi s are independent D holds since we evaluate MXh expuh Uzzhz for h t 7 t From 1 and 27 it follows M7 Def461 XX17YMXW tt1tn E eXptY t1X1 7 Y tnXn 7 X l E exptYt1X1 7 t1Y 25an 7 273 E exp Xitl39 7 7 i1 13971 V L n t1tn7tZXi E exp ZXlti 7 11 13971 n E exp X475 7 i1 n 397 i n E H exp Xintl nnttgt7 Where 2751 i1 i1 g E exp Xilt ti WM i1 H MXi 13971 n n n 7 02 7 i Hexp infiltt n i 75 13971 n 02 TL 71 7 exp 5 712547712037 2n22tnti7t2 i1 i1 0 2 n 039 7 7 exput exp W ntZ 271252051 7 t 712 2051702 11 11 0 0392 0392 n 7 exp 1115 152 exp T ti 7 t2gt 391 l DMZ M7tMX17YmXn7Yt17 win Thus7 Y and X1 7 Y Xn 7 Y are independent by Theorem 463 iv follows from Theorem 453 since the Xi s are independent holds since we evaluate MXh expuh 022 for h l Corollary 722 Y and 52 are independent M This can be seen since 52 is a function of the vector X1 7 Y 7Xn 7 Y and X1 7 Y 7Xn 7 Y is independent of X as previously shown in Theorem 721 We can use Theorem 427 to formally complete this proof I Corollary 723 71 7 1 5392 Proof Recall the following facts If Z N N0 1 then 22 N X TL IfY1Yn iid Xi then 212 X3 i1 For Xi 
the mgf is Mt 17 min2 If X1 Mew then X37 N0 1 and Xi W N X 2 Xi M2 2 Kw Kw 2 Therefore7 ET N X and W 717 N X1 i 41 Now consider n 09 NZ 39 1 fax 7 Y M 7 WY 7 p Y7 m n 71SZ 0 71Y7 u2 Therefore 0392 7 039 H EH U V i X 7 m2 747 m2 n 7 us i1 W We have an expression of the form W U V Since U and V are functions of and 52 we know by Corollary 722 that they are independent and also that their mgf s factor by Theorem 463 iv Now we can write Mwt MUtMvt gtMvt i 17 min2 17 22342 1 2tn12 Note that this is the mgf of kl by the uniqueness of nigf s Thus V ag N x il l Corollary 724 Proof Recall the following facts If Z N NO1 Y N X3 and Z Y independent then LY tn V o Z1 w N N0 1 Yn1 7071252 N xiil and Z1Yn1 are independent Therefore 7 7w 7w WWW a N5 7 Z1 N 5 Six5 52n71 Ynil n71 f 02n71 n71 Corollary 725 Let X17 7Xm N iid Np1af and Y17 7Yn N iid Nug7 0 Let Xi be independent vm Y777W17W mn72 t N mn72 m 7 mm lt71 1S al V 0 03 Then it holds ln particular7 if 01 02 then Y Y QLlilug mnmn72 Mt 72 m712 717152 mt m 1 2 Proof Homework l Corollary 726 Let X17 7Xm N iid Np1af and Y17 7Yn N iid Nug7 0 Let Xi be independent vm Then it holds 5 N 1 1 5303 m ln particular7 if 01 02 then 5 y N milmil 2 Proof Recall that7 if Y1 N xi and Y2 N x then Ylm F N F YZn mm L 2 7 2 Now7 01 W N xfnil and Cg N xiil Therefore7 mews 5120 amen llmi 1 Sig0 7 703 020171 N m m l39 n a n7 1 If 0391 027111811 52 1 7 Wilflil39 2 8 The Theory of Point Estimation Based on CasellaBerger Chapters 6 amp 7 81 The Problem of Point Estimation Let X be a rV de ned on a probability space 9 L7 P Suppose that the cdf F of X depends on some set of parameters and that the functional form of F is known except for a nite number of these parameters De nition 811 The set of admissible values of 9 is called the parameter space 9 If F9 is the cdf of X when 9 is the parameter7 the set F9 9 E 9 is the family of cdf7s Likewise7 we speak of the family of pdf7s if X is continuous7 and the family of pmf7s if X is discrete l Example 812 X N 
Binnp7 p unknown Then 9 p and 9 p 0 lt p lt1 X N Nuz727 p02 unknown Then0 p02 and 0102 foo lt u lt 0002 gt 0 I De nition 813 Let X be a sample from F97 9 E 9 Q B Let a statistic TX map E to 9 We call TX an estimator of 9 and Tg for a realization g of X an point estimate of 0 ln practice7 the term estimate is used for both I Example 814 Let X17 7Xn be iid Bin1p7 p unknown Estimates ofp include i 1 X X TM X TM X1 TM 514k Obviously7 not all estimates are equally good I Lecture 07 We 012401 82 Properties of Estimates De nition 821 Let X3221 be a sequence of iid rV s with cdf F9 0 E 9 A sequence of point estimates TnX17 7 X Tn is called 0 weakly consistent for 0 if Tn L 0 as n a 00 V0 6 9 0 strongly consistent for 0 if Tn E 0 as n a 00 V0 6 9 0 consistent in the rth mean for 0 if Tn L 0 as n a 00 V0 6 9 Example 822 V L Let XiBil be a sequence of iid Bin1p rV s Let Y Since 197 it i1 follows by the WLLN that Y L 197 ie7 consistency7 and by the SLLN that Y 3 p7 ie7 strong consistency However7 a consistent estimate may not be unique We may even have in nite many consistent estimates7 eg7 TL 2 a LP V nite ab ER Theorem 823 If Tn is a sequence of estimates such that ETn a 0 and VarTn a 0 as n A 007 then Tn is consistent for 0 Proof PlTw0lgte 7 7 02l E2 VMTn 2ElTn ETnETn 0l ETn 0V 2 vmm Em 7 0 62 0 asnaoo A holds due to Corollary 352 Markov s Inequality B holds since VarTn a 0 as naooandETnH0aanoo l De nition 824 Let Q be a group of Borelimeasurable functions of R onto itself which is closed under com position and inverse A family of distributions P9 9 E 9 is invariant under G if for each 9 E Q and for all 9 E 97 there exists a unique 0 0 such that the distribution of 9X is sz Whenever the distribution ofX is P9 We call y the induced function on 9 since P9ltglt gt E A Fae E I Example 825 Let X17 7 X be iid Nuz72 with pdf 1 1 95 z 7ex 77 1 n aw plt 202 The group of linear transformations Q has elements 7 u2gt i1 gx1xnam1baznb agt07 fooltbltoo The pdf of 9X is 1 1 n 4 
fmi7xWexpltingxiapib2gt7 zazib 21771 1 So 1 foo lt u lt 007 02 gt 0 is invariant under this group G with gm 02 1u l l7 1202 where foo lt at b lt 00 and 0202 gt O l De nition 826 Let Q be a group of transformations that leaves F9 9 E 9 invariant An estimate T is invariant under G if T9X17 7909 TX17 7Xn V9 6 Q De nition 827 An estimate T is 0 location invariant if TX1 a7 7Xn a TX17 7 Xn7 a E R 0 scale invariant if TcX17 7an TX17 7Xn7 c E B 7 0 o permutation invariant ifTX 7X777 TX17 7 X77 Vpermutations i17 7i l17 of177n Example 828 Let F9 N N7a702 2 is location invariant Y and 52 are both permutation invariant Neither Y nor 52 is scale invariant l Note Different sources make different use of the term invariant Mood7 Graybill amp Boes 1974 for example de ne location invariant as TX1 a7 7X77 a TX17 7 X77 a page 332 and scale invariant as TcX17 70X cTX17 7X page 336 According to their de nition7 Y is location invariant and scale invariant l 83 Su icient Statistics Based on CasellaBerger Section 62 De nition 831 Let X X17Xn be a sample from F9 9 E 9 Q Bk A statistic T TX is su icient for 9 or for the family of distributions F9 9 E 9 iff the conditional dis tribution of X given T 25 does not depend on 9 except possibly on a null set A where P9T EA 0 W I Note i The sample X is always suf cient but this is not particularly interesting and usually is excluded from further considerations ii ldea Once we have reduced from X to TX7 we have captured all the information in X about 0 iii Usually7 there are several suf cient statistics for a given family of distributions Example 832 Let X X17 7 X be iid Bin1p rv s To estimate 197 can we ignore the order and simply count the number of successes 71 Let x EX It is i1 n PX11 XnnTt PX1z1XnzanXZt i1 PT PCP 0 otherwise PX1177Xnn n 7 m39t it E l 07 otherwise 1 n 77 Zn 2 i1 t 07 otherwise V L This does not depend on p Thus7 T ZXl is suf cient for p i1 Example 833 71 Let X X17 7X be iid Poisson Is T ZXi suf cient for X It is 11 PX X Tt 
pX1m17nt7Xnmn Tt w PT t 111 eiAkmi x39 n 7 13 l t 7 90 t i 5 71A 11 25 O7 otherwise ginAE mi VL H m 7 f M t 7 e nt 7 t 07 otherwise V 7 90 t nt H xi i1 i1 O7 otherwise V L This does not depend on Thus7 T ZXl is suf cient for i1 Example 834 Let X17X2 be iid Poisson Is T X1 2X2 suf cient for X It is PX1 07X2 17X1 2X2 2 PX1 2X2 2 PX1 07X2 1 PX1 07X2 1 PX1 07X2 27X2 e Ae M 64W 5 2A 5A 1 1 PX10X21X12X22 7 Ni ie7 this is a counteriexaniple This expression still depends on Thus7 T X1 2X2 is not suf cient for l Note De nition 831 can be dif cult to check ln addition7 it requires a candidate statistic We need something constructive that helps in nding suf cient statistics without having to check De nition 831 The next Theoren1 helps in nding such statistics l Lecture 08 Theorem 835 Factorization Criterion Fr 012601 Let X1Xn be rV s with pdf or pn1f fz1mn l i97 i9 6 9 Then TX1Xn is suf cient for 0 iff we can write fz1zn l 0 hz1zn gTm1mn l i97 where h does not depend on 0 and 9 does not depend on 1 Wm except as a function of T Proof Discrete case only 77 Suppose TX is suf cient for 0 Let WW P9TX Mi PX l T Then it holds FAX Q Hi Tg t P9TX t PK g l Ni t W l WM holds since 1 g implies that x Tg is 77 Suppose the factorization holds For xed to it is Pemx to Z P9X g Q i T to 50 E h 9T l t9 g i T to 9750 l t9 2 7M A Q i T to lf P9TX to gt 0 it holds FAX 7 Ni to P X T X t 9i gl 0 to FAX g i 7 P9TX t07 1f Tlt gt t0 0 otherwise We l a if Tg to g 9 l 0 Z m g i T to 0 otherwise A if T 35 t h m 7 0 otherwise This last expression does not depend on 0 Thus TX is suf cient for 0 l Note i In the Theorem above 9 and T may be vectors ii If T is suf cient for 0 then also any litoil mapping of T is suf cient for 0 However this does not hold for arbitrary functions of T I Example 836 Let X1 Xn be iid Bin1p It is PX1 1 Xn m l p pEmi ipykgmi Thus hm1 zn 1 and 9Zmi lp pEmi 7p Emi V L Hence T EX is suf cient for p l i1 Example 837 Let X17 7Xn be iid Poisson It is eiAkzi ginAE mi PX mX z 39 Thus hz1 
H357 and 92351 l einAwai V L Hence7 T ZXi is suf cient for l i1 Example 838 Let X17 7Xn be iid NW 02 where u E R and 02 gt 0 are both unknown It is 1 2m 7 m2 1 Ex 2 w 2 7 7 7 7 l 7 7 7 flt1w7n l M70 7 27117 explt 202 n exp 202 M 0392 20392 39 27m TL 71 Hence7 T Xi is suf cient for p 02 l i1 i1 Example 839 Let X17 7Xn be iid U00 1 where foo lt 9 lt 00 It is 3 me 1 0ltxi 01vze1n O7 otherw1se 111900i170091xi 19oomini 17m91maxi Hence7 T X1Xn is suf cient for 0 l De nition 8310 Let f9z 9 E 9 be a family of pdf s or pmf s We say the family is complete if E9gX 0 v0 6 9 implies that P9gX 0 1 V0 6 9 We say a statistic TX is complete if the family of distributions of T is complete I Example 8311 71 Let X1 Xn be iid Bin1p We have seen in Example 836 that T EX is suf cient i1 for p ls it also complete We know that T N Binnp Thus implies that lt17pgtniglttgt gtlt p gt20 vpelt071gt Vt t 17p 71 However 2925 is a polynomial in which is only equal to 0 for all p E 0 1 t if all of its coef cients are 0 Therefore gt 0 for t O 1 n Hence T is complete I E l 8 3 12 Lecture 09 M MO 012901 7L 7L Let X1X be iid MHZ We know from Example 838 that T ZXZX is 391 391 suf cient for 0 ls it also complete 1 l V L We know that EX Nm9 7102 Therefore i1 V L EZX2 7102 71202 not 102 i1 V L MEX 7402 02 27102 i1 It follows that V L V L E lt2ZX2 771 1 2X3 0 v0 i1 i1 V L V L But 9z1 zn 2Zmi2 771 1 is not identically to O i1 11 Therefore T is not complete I Note Recall from Section 52 what it means if we say the family of distributions f9 t9 6 9 is a onparameter or kiparameter exponential family I Theorem 8313 Let f9 9 E 9 be a kiparameter exponential family Let T1 Tk be statistics Then the family of distributions of T1 X Tk is also a kiparameter exponential family given by k 992 exp Ema0 W 92 i1 for suitable 5 25 Proof The proof follows from our Theorems regarding the transformation of rV s l Theorem 8314 Let f9 9 E 9 be a kiparameter exponential family with k g n and let T1 Tk be statistics as in Theorem 
8313 Suppose that the range on Q1 Qk contains an open set in Bk Then I T1X Tk is a complete suf cient statistic Proof Discrete case and k 1 only Write Q0 t9 and let ab Q 9 It follows from the Factorization Criterion Theorem 835 that T is suf cient for 0 Thus we only have to show that T is complete ie that Ee9TX 29tPeT t t i Zgtexp0t 130 SW 0 W B implies gt 0 Vt Note that in A we make use of a result established in Theorem 8313 We now de ne functions 9 and g as 9105 905 if9t Z 0 0 otherwise 0 otherwise wt 90 ifgtlt0 It is gt gt ig t where both functions 9 and g are noninegative functions Using 9 and 9 it turns out that B is equivalent to 29105 with 5375 29705 with 5375 V0 0 t t where the term expD0 in A drops out as a constant on both sides 54 If we x 00 6 11 and de ne W glttgtexplt0ot 5 7 W glttgtexplt0ot 5 7 Zg t exp00t St Zg exp00t 5 t t it is obvious that pt 2 0 Vt and p t 2 0 Vt and by construction Zpt 1 and t Zp 1 Hence7 pJr and p are both pmf s t From C7 it follows for the mgf s MJr and M of pJr and 19 that MW Ze tp t Z 56tgt exp00t St Zgl exPWOt 5 Zgt exp00 6t St W 2917 exp00 6t St W Z 65 9 t expwot 5 Zg exp00t St V6 a7007b700 W W4 lt0 gt0 By the uniqueness of mgf s it follows that pt p t Vt 5 9t 9 05 W gt gt 0 Vt gt T is complete I De nition 8315 Let X X17 7X be a sample from F9 9 E 9 Q Bk and let T TX be a suf cient statistic for 0 T TX is called a minimal su icient statistic for 9 if7 for any other suf cient statistic T TKX Tg is a function of T g l Note i A minimal suf cient statistic achieves the greatest possible data reduction for a suf cient statistic ii If T is minimal suf cient for 0 then also any 17toi1 mapping of T is minimal suf cient for 0 However7 this does not hold for arbitrary functions of T De nition 8316 Let X X17 7X be a sample from F9 9 E 9 Q Bk A statistic T TX is called ancillary if its distribution does not depend on the parameter 0 l Example 8317 Let X17Xn be iid U00 1 where foo lt 9 lt 00 As shown in Example 8397 T X1Xn is suf 
cient for 0 De ne Rn Xltngt X0 Use the result from Stat 67107 Homework Assignment 57 Question viii a to obtain fRW 10 flan 7171 1T 21 TIo1 This means that Rn N Betan 7 17 2 Moreover7 Rn does not depend on 9 and7 therefore7 Rn is ancillary l Theorem 8318 Basu7s Theorem Let X X17 7Xn be a sample from F9 9 E 9 Q Bk lfT TX is a complete and minimal suf cient statistic7 then T is independent of any ancillary statistic l Theorem 8319 Let X X17 7X be a sample from F9 9 E 9 Q Bk If any minimal suf cient statis tic T TX exists for 0 then any complete statistic is also a minimal suf cient statistic I Note 1 iii Due to the last Theorem Basu s Theorem often only is stated in terms of a complete suf cient statistic which automatically is also a minimal suf cient statistic As already shown in Corollary 722 Y and 52 are independent when sampling from a Nu 02 population As outlined in CasellaBerger page 289 we could also use Basu s Theorem to obtain the same result The converse of Basu s Theorem is false ie if T is independent of any ancillary statistic it does not necessarily follow that T is a complete minimal suf cient statis tic V L V L As seen in Examples 838 and 8312 T Xi is suf cient for 9 but it is not i1 i1 complete when X1 Xn are iid N09 02 However it can be shown that T is minimal suf cient So there may be distributions where a minimal suf cient statistic exists but a complete statistic does not exist As with invariance there exist several different de nitions of ancillarity within the lit erature 7 the one de ned in this chapter being the most commonly used 84 Unbiased Estimation Based on CasellaBerger Section 73 De nition 841 Let F9 9 E 97 9 Q R be a nonempty set of cdf s A Borel7measurable function T from R to 9 is called unbiased for 9 or an unbiased estimate for 0 if E9T 0 V0 6 9 Any function 10 for which an unbiased estimate T exists is called an estimable function If T is biased7 b0T E9T 7 0 is called the bias of T l Example 842 If the kth population moment 
exists, the $k$-th sample moment is an unbiased estimate of it.

(ii) If $Var(X) = \sigma^2$, the sample variance $S^2$ is an unbiased estimate of $\sigma^2$. However, note that for $X_1, \ldots, X_n$ iid $N(\mu, \sigma^2)$, $S$ is not an unbiased estimate of $\sigma$: It is
$$\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1} = Gamma\left(\frac{n-1}{2}, 2\right)$$
$$\Longrightarrow E(S) = \frac{\sigma}{\sqrt{n-1}} \, E\left(\sqrt{\frac{(n-1)S^2}{\sigma^2}}\right) = \frac{\sigma}{\sqrt{n-1}} \int_0^\infty \sqrt{z} \; \frac{z^{\frac{n-1}{2} - 1} e^{-z/2}}{\Gamma(\frac{n-1}{2}) \, 2^{\frac{n-1}{2}}} \, dz$$
$$= \frac{\sigma}{\sqrt{n-1}} \cdot \frac{\Gamma(\frac{n}{2}) \, 2^{\frac{n}{2}}}{\Gamma(\frac{n-1}{2}) \, 2^{\frac{n-1}{2}}} \int_0^\infty \frac{z^{\frac{n}{2} - 1} e^{-z/2}}{\Gamma(\frac{n}{2}) \, 2^{\frac{n}{2}}} \, dz \stackrel{(*)}{=} \sigma \sqrt{\frac{2}{n-1}} \cdot \frac{\Gamma(\frac{n}{2})}{\Gamma(\frac{n-1}{2})}$$
$(*)$ holds since $\frac{z^{n/2 - 1} e^{-z/2}}{\Gamma(n/2) \, 2^{n/2}}$ is the pdf of a $Gamma(\frac{n}{2}, 2)$ distribution and thus the integral is 1.

So $S$ is biased for $\sigma$. ∎

Note: If $T$ is unbiased for $\theta$, $g(T)$ is not necessarily unbiased for $g(\theta)$, unless $g$ is a linear function. ∎

[Lecture 10, We 01/31/01]

Example 8.4.3: Unbiased estimates may not exist (see Rohatgi, page 351, Example 2), or they may be absurd, as in the following case: Let $X \sim Poisson(\lambda)$ and let $d(\lambda) = e^{-2\lambda}$. Consider $T(X) = (-1)^X$ as an estimate. It is
$$E_\lambda(T(X)) = \sum_{x=0}^\infty (-1)^x \, \frac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^\infty \frac{(-\lambda)^x}{x!} = e^{-\lambda} e^{-\lambda} = e^{-2\lambda}.$$
Hence, $T$ is unbiased for $d(\lambda)$, but since $T$ alternates between $-1$ and $1$ while $d(\lambda) > 0$, $T$ is not a good estimate. ∎

Note: If there exist 2 unbiased estimates $T_1$ and $T_2$ of $\theta$, then any estimate of the form $\alpha T_1 + (1 - \alpha) T_2$ for $0 < \alpha < 1$ will also be an unbiased estimate of $\theta$. Which one should we choose? ∎

Definition 8.4.4: The mean square error of an estimate $T$ of $\theta$ is defined as
$$MSE_\theta(T) = E_\theta((T - \theta)^2) = Var_\theta(T) + (b_\theta(T))^2.$$
Let $\{T_i\}_{i=1}^\infty$ be a sequence of estimates of $\theta$. If $MSE_\theta(T_i) \to 0$ as $i \to \infty$ $\; \forall \theta \in \Theta$, then $\{T_i\}$ is called a mean-squared-error consistent (MSE-consistent) sequence of estimates of $\theta$. ∎

Note:

(i) If we allow all estimates and compare their MSE, generally it will depend on $\theta$ which estimate is better. For example, $T \equiv 17$ is perfect if $\theta = 17$, but it is lousy otherwise.

(ii) If we restrict ourselves to the class of unbiased estimates, then $MSE_\theta(T) = Var_\theta(T)$.

(iii) MSE-consistency means that both the bias and the variance of $T_i$ approach 0 as $i \to \infty$. ∎

Definition 8.4.5: Let $\theta_0 \in \Theta$ and let $U(\theta_0)$ be the class of all unbiased estimates $T$ of $\theta_0$ such that $E_{\theta_0}(T^2) < \infty$. Then $T_0 \in U(\theta_0)$ is called a locally minimum variance unbiased estimate (LMVUE) at $\theta_0$ if
$$E_{\theta_0}((T_0 - \theta_0)^2) \le E_{\theta_0}((T - \theta_0)^2) \quad \forall T \in U(\theta_0).$$
∎

Definition 8.4.6: Let $U$ be the class of all unbiased estimates $T$ of $\theta \in \Theta$ such that $E_\theta(T^2) < \infty \; \forall \theta$
6 9 Then T0 6 U is called a uniformly minimum variance unbiased estimate UMVUE of 9 if E9T0 7 0 E9T 7 0 v0 6 e VT 6 U An Excursion into Logic II In our rst Excursion into Logic77 in Stat 6710 Mathematical Statistics I we have established the following results A a B is equivalent to B A is equivalent to A V B AlBHAjBl Al Bl Bj l AVB 1 1 1 0 0 1 1 1 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 1 1 When dealing with formal proofs there exists one more technique to show A a B Equiva lently we can show AA B O a technique called Proof by Contradiction This means assuming that A and B hold we show that this implies O ie something that is always false ie a contradiction And here is the corresponding truth table B l AB l AA B l AA BO HHOHU HOHO 0 1 1 0 0 1 0 1 Note We make use of this proof technique in the Proof of the next Theorem I Example Let A z 5 and B 2 25 Obviously A a B But we can also prove this in the following way A m5and B 25 gtx225Az2725 This is impossible ie a contradiction Thus A a B l Theorem 847 Let U be the class of all unbiased estimates T of 9 E 9 with E9T2 lt 00 V0 and suppose that U is noniempty Let U0 be the set of all unbiased estimates of O ie U0 1 Em 0E912 lt 00 v0 6 9 Then T0 6 U is UMVUE iff E9zT0 0 V0 6 9 VII 6 U0 Proof Note that E9zT0 always exists This follows from the Cauchyischwarzilnequality Theorem 457 11 Ema Eeltu2gtEeltT02gt lt oo because E912 lt 00 and E9T02 lt 00 Therefore also E9zT0 lt oo iigt77 We suppose that To 6 U is UMVUE and that E90z0T0 7 0 for some 00 E 9 and some 10 6 U0 It holds E9T0 AVG E9T0 0 VA 6 JR v0 6 9 Therefore TO 10 E U VA 6 R Also E90Vg gt 0 since otherwise P9z0 0 1 and then E90z0T0 0 Now let 7 E90 To V0 E90 Then E90T0 MOP E90T02 QATOVO A214 E90T02 2 V 2 E90Tozl lt E90T027 and therefore VGTQO To A110 lt VGTQO To This means T0 is not UMVUE ie a contradiction iilt77 Let E9zT0 0 for some T0 6 U for all 9 E 9 and all 1 6 U0 We choose T 6 U7 then also T0 7 T 6 U0 and E9T0T0 7 T 0 v0 6 o E9T02 E9T0T v0 6 o It follows from the 
Cauchyischwarzilnequality Theorem 457 ii that 1 1 E9T02 E9T0T S E9T 5E9T25 This implies E9ltT gtgt E9ltT2gtgt and VareT0 Vo n 9T7 where T is an arbitrary unbiased estimate of 0 Thus7 T0 is UMVUE l W Lecture 11 i i i Mo 020501 Let U be the noniempty class of unbiased estimates of 9 E 9 as de ned in Theorem 847 Then there exists at most one UMVUE T E U for 0 Proof Suppose T07 T1 6 U are both UMVUE Then T1 7 To 6 U0 Var9T0 Var9T1 and E9T0T1 7 T0 0 v0 6 o E9T02 E9T0T1 Cov9T0 T1 E9T0T1 7 E9T0E9T1 E9T02 7 0993 2 VareT0 Var9T1 v0 6 o gt0TOT11V0 gt P9aT0 bT1 0 1 for some 071 V0 6 9 0 E9T0 E97gT1 E9T1 v0 6 e j 7 1 gtP9T0T11V0 I Theorem 849 i If an UMVUE T exists for a real function 10 then T is the UMVUE for N109 E E ii If UMVUE s T1 and T2 exist for real functions 110 and 120 respectively then T1 T2 is the UMVUE for 110 120 Proof Homework l Theorem 8410 If a sample consists of n independent observations X1 Xn from the same distribution the UMVUE if it exists is permutation invariant Proof Homework l Theorem 8411 RaoiBlackwell Let F9 9 E 9 be a family of cdf s and let h be any statistic in U where U is the non7 empty class of all unbiased estimates of 9 with E9012 lt 00 Let T be a suf cient statistic for F9 9 E 9 Then the conditional expectation E901 l T is independent of 9 and it is an unbiased estimate of 0 Additionally mm m e 0gt2gt Em e 0 v0 6 9 with equality iff h Eh l T Proof By Theorem 473 E9Eh l T E901 0 Since X l T does not depend on 9 due to suf ciency neither does Eh l T depend on 0 Thus we only have to show that E9Eh l T2 S E9012 19909012 l T Thus we only have to show that EM 1 T2 S EM2 1T But the CauchyPSchwarzilnequality Theorem 457 ii gives us ltElth 1T E012 1TE1 1T E012 1T Equality holds iff EeEh l T2 E9012 E9Eh2 l T EU l T 7 EUI l T2 0 Varh l T 0 ltgt E9Eh 7 Eh l T2 l T 0 ltgt his a function of T and h Eh l T For the proof of the last step7 see Rohatgi7 page 17071717 Theorem 27 Corollary7 and Proof of the Corollary l Theorem 8412 Lehmannischeff e If T is a 
complete suf cient statistic and if there exists an unbiased estimate h of 0 then Eh l T is the unique UMVUE Proof Suppose that 711712 6 U Then E9Eh1 l T E9Eh2 l T 9 by Theorem 8411 Therefore7 E9Eh1 l T 7 Eh2 l T 0 v0 6 9 Since T is complete7 Eh1 l T E012 l T Therefore7 Eh l T must be the same for all h E U and Eh l T improves all h E U There fore7 Eh l T is UMVUE by Theorem 8411 l Note We can use Theorem 8412 to nd the UMVUE in two ways if we have a complete suf cient statistic T i If we can nd an unbiased estimate hT7 it will be the UMVUE since EhT l T hT ii If we have any unbiased estimate h and if we can calculate Eh l T7 then Eh l T will be the UMVUE The process of determining the UMVUE this way often is called Rao Blackwellz39zatz39on iii Even if a complete suf cient statistic does not exist7 the UMVUE may still exist see Rohatgi7 page 35773587 Example 10 Example 8413 V L Let X17 Xn be iid Bin1p Then T ZXi is a complete suf cient statistic as seen in Examples 836 and 8311 i1 Since EX1 p7 X1 is an unbiased estimate of p However7 due to part of the Note aloove7 since X1 is not a function of T7 X1 is not the UMVUE We can use part ii of the Note above to construct the UMVUE It is HampmWF y n E D gt Y is the UMVUE for p If we are interested in the UMVUE for dp p1 7 p p 7 p2 VowX7 we can nd it in the following way nT Eltnltn71gtgt T2 E 771017 1 nTiTZ gtE 71017 1 7 Thus7 due to part of the Note aloove7 2 2T4 is the UMVUE for dp 1917 p 7749 7L 7L 7L E ZX Z Z Xin i1 i1 j1j7 i npnn71p2 71p 7171 1L 2 7171 p 71 iii 2 7171 7171 p n 1P 2 7171 p 297192 dp nil 85 Lower Bounds for the Variance of an Estimate Based on CasellaBerger Section 73 Theorem 851 Cram riRao Lower Bound CRLB Let 9 be an open interval of B Let f9 that the set g f9g 0 is independent of 0 9 E 9 be a family of pdf s or pmf s Assume Let 1M0 be de ned on 9 and let it be differentiable for all 9 E 9 Let T be an unbiased estimate of 10 such that E9T2 lt 00 V0 6 9 Suppose that 1 W3 is de ned for all 0 e 9 ii for 
a pdf f9 lt or for a pmf f9 gem 0 v0 6 9 iii for a pdf f9 8 a E Tgf9gdggt Tg fg gldg v0 6 e or for a pmf f9 8 a ltZTzf9zgt Eng 1089 v0 6 e Let X 9 a R be any measurable function Then it holds 81 X 2 W002 Emu 7 WW E9 v0 6 e A Further7 for any 00 E 97 either W090 0 and equality holds in A for 9 00 or we have l00gt B 31 X 39 E90 ogafesL 2 Finally7 if equality holds in B7 then there exists a real number 0 7 0 such that 810 MK 80 EeoTX X 902 2 Hi X090 K090 0 990 with probability 17 provided that T is not a constant 67 Lecture 12 We 020701 Note i Conditions i7 ii7 and iii are called regularity conditions Conditions under which they hold can be found in Rohatgi7 page 117137 Parts 12 and 13 ii The right hand side of inequality B is called C mme riRao Lower Bound of 00 or7 in symbols CRLB00 Proof From ii7 we get E9 log ME E9ltX010gf9ilgt From iii7 we get E9 ltnxgt 10gfelt gtgt tlt lXixW bghKO ltT log hwy hygdg ltT f gt Mg g ltT f ggt dg T f9gdggt 8 mnampgt w WW E9TXX0logfe 2 E E9 mg 7 WW E9 logfei ie A holds follows from the Cauchyischwarzilnequality Theorem 457 lf W090 7 0 then the leftihand side of A is gt 0 Therefore the rightihand side is gt 0 Thus 2 E90 ltlogfe gt 0 and B follows directly from lf W090 0 but equality does not hold in A then E90 logfeltxgt2 gt 0 and B follows directly from A again Finally if equality holds in B then W090 7 0 because T is not constant Thus MSEX00TX gt O The Cauchyischwarzilnequality Theorem 457 iii gives equality iff there exist constants 04 6 B2 7 0 0 such that 0 1 990 This implies K090 7g and C holds Since T is not a constant it also holds that M00 7g 0 I P MM 7 x090 B 10mm Example 852 If we take x0 1M0 we get from B WWW VamTX Z W If we have 1M0 0 the inequality above reduces to 2 Var9TX 2 E9 Finally if X1 Xn iid with identical f9z the inequality reduces to WWW VareTX 2 m 71 Example 853 Let 17 in p n 7L Since X 2X1 with fpx1 pw117p1w1 x1 6 01 i1 gt10g fplt1 8 gt a log fpz1 En ltlt10gfpXigt2gt So7 if 1p xp p and if T is unbiased for 197 
then VarpTX gt Since Vast W Y attains the CRLB Therefore Y is the UMVUE Example 854 1 logp 17 m1log17p Xn be iid Bin1p Let X N Binnp7 p E 9 01 C B Let 3W4 1p is differentiable with respect to p under the summation sign since it is a nite polynomial i 11 p p1ili p 17p 191 19 VarX1 1 2921 7p 291 19 1 7290729 fmein Let XU00 0E o0eoc1R Thus the CRLB is 9 We know that THXW is the UMVUE since it is a function of a complete suf cient statistic XW see Homework and EXn Var lt gt log f9z2 gt E9 log f9Xgt 2 n1 71 gt log 70 Mm Mm 10 it is 2 X W nltn 2 1 51am 90 7l0g0 02 Z 7 961 191 19 Lecture 13 Fr 020901 How is this possible Quite simple7 since one of the required conditions for Theorem 851 does not hold The support of X depends on 0 l Theorem 855 Chapman Robbins Kiefer Inequality CRK Inequality Let 9 Q B Let f9 9 E 9 be a family of pdf s or pmf s Let 1M0 be de ned on 9 Let T be an unbiased estimate of 1M0 such that E9T2 lt 00 V0 6 9 lf 9 7 19 t9 and 19 E 97 assume that f9z and 129z are different Also assume that there exists such a 19 E 9 such that 9 7 19 and 50 g i Mg gt 0 3 519 3f19 gt 0 Then it holds that 7 2 MARK Wgt W W e e Z sup w 50CS9gt199 Van Proof Since T is unloiased7 it follows Ema M w e e For 19 7 t9 and 519 C 507 it follows Aw Ngch gdg E19TX EeT 1W 7 1M0 f9 7 f19 f9 m m f0 7 0 39 fem femd E9ltfe 1 Therefore 00129 Hz 71gt 1W i 1M It follows by the Cauchyischwarzilnequality Theorem 457 ii that ltwltvgtewlt0gtgt2 lt00U9ltTg1gtgt2 ail gt lt V T X V 7 1 ail V T X V W fl 9 mg Thus7 1p 19 7 1p 0 2 vame 2 9 ltfslt gtgt Finally7 we take the supremum of the rightihand side with respect to 19 519 C 507 19 7 0 which completes the proof I Note i The CRK inequality holds without the previous regularity conditions ii An alternative form of the CRK inequality is Let 0 0 67 6 7 07 be distinct with 50 6 C 50 Let 10 0 De ne 2 J J06 6712 71gt Then the CRK inequality reads as Var9TX Z W with the in mum taken over 6 7 0 such that 50 6 C 50 iii The CRK inequality works for discrete 
$\theta$; the CRLB does not work in such cases. ∎

Example 8.5.6: Let $X \sim U(0, \theta)$, $\theta > 0$. The required conditions for the CRLB are not met. Recall from Example 8.5.4 that $\frac{n+1}{n} X_{(n)}$ is UMVUE with $Var_\theta(\frac{n+1}{n} X_{(n)}) = \frac{\theta^2}{n(n+2)} <$ CRLB.

Let $\vartheta \neq \theta$. If $\vartheta < \theta$, then $S(\vartheta) \subset S(\theta)$. It is
$$E_\theta\left(\left(\frac{f_\vartheta(X)}{f_\theta(X)}\right)^2\right) = \int_0^\vartheta \frac{\theta^2}{\vartheta^2} \cdot \frac{1}{\theta} \, dx = \frac{\theta}{\vartheta}$$
$$\Longrightarrow Var_\theta(T(X)) \ge \sup_{0 < \vartheta < \theta} \frac{(\vartheta - \theta)^2}{\frac{\theta}{\vartheta} - 1} = \sup_{0 < \vartheta < \theta} \vartheta (\theta - \vartheta) \stackrel{(*)}{=} \frac{\theta^2}{4}$$
See Homework for a proof of $(*)$.

Since $X$ is complete and sufficient and $2X$ is unbiased for $\theta$, $T(X) = 2X$ is the UMVUE. It is
$$Var_\theta(2X) = 4 \, Var_\theta(X) = 4 \cdot \frac{\theta^2}{12} = \frac{\theta^2}{3} > \frac{\theta^2}{4}.$$
Since the CRK lower bound is not achieved by the UMVUE, it is not achieved by any unbiased estimate of $\theta$. ∎

Definition 8.5.7: Let $T_1, T_2$ be unbiased estimates of $\theta$ with $E_\theta(T_1^2) < \infty$ and $E_\theta(T_2^2) < \infty \; \forall \theta \in \Theta$. We define the efficiency of $T_1$ relative to $T_2$ by
$$\mathrm{eff}_\theta(T_1, T_2) = \frac{Var_\theta(T_1)}{Var_\theta(T_2)}$$
and say that $T_1$ is more efficient than $T_2$ if $\mathrm{eff}_\theta(T_1, T_2) < 1$. ∎

Definition 8.5.8: Assume the regularity conditions of Theorem 8.5.1 are satisfied by a family of cdf's $\{F_\theta : \theta \in \Theta\}$, $\Theta \subseteq \mathbb{R}$. An unbiased estimate $T$ for $\theta$ is most efficient for the family $\{F_\theta\}$ if
$$Var_\theta(T) = \left[E_\theta\left(\left(\frac{\partial}{\partial \theta} \log f_\theta(\underline{X})\right)^2\right)\right]^{-1}.$$
∎

Definition 8.5.9: Let $T$ be the most efficient estimate for the family of cdf's $\{F_\theta : \theta \in \Theta\}$, $\Theta \subseteq \mathbb{R}$. Then the efficiency of any unbiased estimate $T_1$ of $\theta$ is defined as
$$\mathrm{eff}_\theta(T_1) = \mathrm{eff}_\theta(T_1, T) = \frac{Var_\theta(T)}{Var_\theta(T_1)}.$$
∎

Definition 8.5.10: $T_1$ is asymptotically most efficient if $T_1$ is asymptotically unbiased, i.e., $\lim_{n \to \infty} E_\theta(T_1) = \theta$, and $\lim_{n \to \infty} \mathrm{eff}_\theta(T_1) = 1$, where $n$ is the sample size. ∎

[Lecture 14, Mo 02/12/01]

Theorem 8.5.11: A necessary and sufficient condition for an estimate $T$ of $\theta$ to be most efficient is that $T$ is sufficient and
$$\frac{\partial}{\partial \theta} \log f_\theta(\underline{x}) = K(\theta) \, (T(\underline{x}) - \theta) \quad \forall \theta \in \Theta, \qquad (**)$$
where $K(\theta)$ is defined as in Theorem 8.5.1 and the regularity conditions for Theorem 8.5.1 hold.

Proof:

"$\Longrightarrow$": Theorem 8.5.1 says that if $T$ is most efficient, then $(**)$ holds. Assume that $\Theta = \mathbb{R}$. We define
$$C(\theta_0) = \int_{-\infty}^{\theta_0} K(\theta) \, d\theta, \quad D(\theta_0) = \int_{-\infty}^{\theta_0} \theta \, K(\theta) \, d\theta, \quad \text{and} \quad S(\underline{x}) = \lim_{\theta \to -\infty} \log f_\theta(\underline{x}).$$
Integrating $(**)$ with respect to $\theta$ gives
$$\log f_{\theta_0}(\underline{x}) = T(\underline{x}) \, C(\theta_0) - D(\theta_0) + S(\underline{x}).$$
Therefore,
$$f_{\theta_0}(\underline{x}) = \exp\left(T(\underline{x}) \, C(\theta_0) - D(\theta_0) + S(\underline{x})\right),$$
which belongs to an exponential family. Thus, $T$ is sufficient.

"$\Longleftarrow$": From $(**)$, we get
$$E_\theta\left(\left(\frac{\partial \log f_\theta(\underline{X})}{\partial \theta}\right)^2\right) = (K(\theta))^2 \, Var_\theta(T).$$
Additionally, it holds that
$$E_\theta\left(\frac{\partial \log f_\theta(\underline{X})}{\partial \theta} \, (T(\underline{X}) - \theta)\right) = 1,$$
as shown in the Proof of Theorem 8.5.1. Using $(**)$ in the line above, we get
$$K(\theta) \, E_\theta\left((T(\underline{X}) - \theta)^2\right) = K(\theta) \, Var_\theta(T) = 1, \quad \text{i.e.,} \quad Var_\theta(T) = \frac{1}{K(\theta)}.$$
Therefore,
$$Var_\theta(T) = \left[E_\theta\left(\left(\frac{\partial \log f_\theta(\underline{X})}{\partial \theta}\right)^2\right)\right]^{-1},$$
i.e., $T$ is most efficient for $\theta$. ∎

Note: Instead of saying "a necessary and sufficient condition for an estimate $T$ of $\theta$ to be most efficient" in the previous Theorem, we could say that "an estimate $T$ of $\theta$ is most efficient iff"; i.e., "necessary and sufficient" ($\Longleftarrow$, $\Longrightarrow$) means the same as "iff" ($\Longleftrightarrow$). ∎

8.6 The Method of Moments

(Based on Casella/Berger, Section 7.2.1)

Definition 8.6.1: Let $X_1, \ldots, X_n$ be iid with pdf (or pmf) $f_\theta$, $\theta \in \Theta$. We assume that the first $k$ moments $m_1, \ldots, m_k$ of $f_\theta$ exist. If $\theta$ can be written as $\theta = h(m_1, \ldots, m_k)$, where $h: \mathbb{R}^k \to \mathbb{R}$ is a Borel-measurable function, the method of moments estimate (mom) of $\theta$ is
$$\hat{\theta}_{mom} = T(X_1, \ldots, X_n) = h\left(\frac{1}{n} \sum_{i=1}^n X_i, \; \frac{1}{n} \sum_{i=1}^n X_i^2, \; \ldots, \; \frac{1}{n} \sum_{i=1}^n X_i^k\right).$$
∎

Note:

(i) The Definition above can also be used to estimate joint moments. For example, we use $\frac{1}{n} \sum_{i=1}^n X_i Y_i$ to estimate $E(XY)$.

(ii) Since $E(a_k) = m_k$, method of moments estimates are unbiased for the population moments. The WLLN and the CLT say that these estimates are consistent and asymptotically Normal as well.

(iii) If $\theta$ is not a linear function of the population moments, $\hat{\theta}_{mom}$ will, in general, not be unbiased. However, it will be consistent and usually asymptotically Normal.

(iv) Method of moments estimates do not exist if the related moments do not exist.

(v) Method of moments estimates may not be unique. If there exist multiple choices for the mom, one usually takes the estimate involving the lowest-order sample moment.

(vi) Alternative method of moments estimates can be obtained from central moments rather than from raw moments, or by using moments other than the first $k$ moments. ∎

Example 8.6.2: Let $X_1, \ldots, X_n$ be iid $N(\mu, \sigma^2)$.

Since $\mu = m_1$, it is $\hat{\mu}_{mom} = \bar{X}$. This is an unbiased, consistent, and asymptotically Normal estimate.

Since $\sigma^2 = m_2 - m_1^2$, it is $\hat{\sigma}^2_{mom} = \frac{1}{n} \sum_{i=1}^n X_i^2 - \bar{X}^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2$. This is a consistent, asymptotically Normal estimate.
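The two moment estimates for the Normal case can be computed directly from the first two sample moments. A minimal numerical sketch in Python (the helper `mom_normal` and the data values are illustrative, not part of the notes):

```python
# Method-of-moments estimates for a N(mu, sigma^2) sample:
# mu_hat = a_1 (first sample moment) and sigma2_hat = a_2 - a_1^2,
# which equals (1/n) * sum((x_i - xbar)^2).
def mom_normal(x):
    n = len(x)
    a1 = sum(x) / n                      # first sample moment (sample mean)
    a2 = sum(xi * xi for xi in x) / n    # second sample moment
    return a1, a2 - a1 * a1              # (mu_hat, sigma2_hat)

x = [4.2, 5.1, 3.8, 4.9, 5.0]            # arbitrary illustrative data
mu_hat, sigma2_hat = mom_normal(x)       # mu_hat = 4.6, sigma2_hat = 0.26
```

Note that `sigma2_hat` uses the divisor $n$ (not $n - 1$), i.e., it is the sample central moment $b_2 = \frac{n-1}{n} S^2$ rather than the sample variance $S^2$.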
Ψ(θ) + S(x)), which belongs to a one-parameter exponential family. Thus, T is sufficient.

"⟸": From (5) we get

E_θ((∂ log f_θ(X)/∂θ)²) = (K(θ))² Var_θ(T).

Additionally, it holds that

E_θ(∂ log f_θ(X)/∂θ) = 0 and E_θ(T(X) ∂ log f_θ(X)/∂θ) = 1,

as shown in the Proof of Theorem 8.5.1. Using (5) in the line above, we get

1 = E_θ(T(X) K(θ)(T(X) − θ)) = K(θ) Var_θ(T),

i.e., Var_θ(T) = 1/K(θ). Substituting this into the first equation gives

K(θ) = E_θ((∂ log f_θ(X)/∂θ)²).

Therefore,

Var_θ(T) = 1/K(θ) = [E_θ((∂ log f_θ(X)/∂θ)²)]⁻¹,

i.e., T is most efficient for θ. ∎

Note: Instead of saying "a necessary and sufficient condition for an estimate T of θ to be most efficient" in the previous Theorem, we could say "an estimate T of θ is most efficient iff ...". "iff" (i.e., necessary and sufficient) means the same as "⟺". ∎

8.6 The Method of Moments

(Based on Casella/Berger, Section 7.2.1)

Definition 8.6.1: Let X₁, ..., Xₙ be iid with pdf (or pmf) f_θ, θ ∈ Θ. We assume that the first k moments m₁, ..., m_k of f_θ exist. If θ can be written as θ = h(m₁, ..., m_k), where h: ℝᵏ → ℝ is a Borel-measurable function, the method of moments estimate (mom) of θ is

θ̂_mom = T(X₁, ..., Xₙ) = h((1/n) Σᵢ Xᵢ, (1/n) Σᵢ Xᵢ², ..., (1/n) Σᵢ Xᵢᵏ).

Note:
(i) The Definition above can also be used to estimate joint moments. For example, we use (1/n) Σᵢ XᵢYᵢ to estimate E(XY).
(ii) Since E((1/n) Σᵢ Xᵢʲ) = mⱼ, method of moments estimates are unbiased for the population moments. The WLLN and the CLT say that these estimates are consistent and asymptotically Normal as well.
(iii) If θ is not a linear function of the population moments, θ̂_mom will, in general, not be unbiased. However, it will be consistent and usually asymptotically Normal.
(iv) Method of moments estimates do not exist if the related moments do not exist.
(v) Method of moments estimates may not be unique. If there exist multiple choices for the mom, one usually takes the estimate involving the lowest-order sample moment.
(vi) Alternative method of moments estimates can be obtained from central moments (rather than from raw moments) or by using moments other than the first k moments.

Example 8.6.2: Let X₁, ..., Xₙ be iid N(μ, σ²). Since μ = m₁, it is μ̂_mom = X̄. This is an unbiased, consistent, and asymptotically Normal estimate. Since σ² = m₂ − m₁², it is σ̂²_mom = (1/n) Σᵢ Xᵢ² − X̄² = (1/n) Σᵢ (Xᵢ − X̄)². This is a consistent, asymptotically Normal estimate.
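As a quick illustration of Definition 8.6.1 applied to the Normal case, the sketch below is a simulation of my own (the parameter values μ = 2, σ = 3, the seed, and the function name are arbitrary choices, not from the notes); it computes the two moment estimates from simulated data:

```python
import random

random.seed(1)

def mom_normal(xs):
    """Method-of-moments estimates for N(mu, sigma^2):
    mu_hat = first sample moment, sigma2_hat = m2 - m1^2."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return m1, m2 - m1 * m1  # (mu_hat, sigma2_hat)

# Illustrative simulation with mu = 2, sigma = 3 (sigma^2 = 9):
xs = [random.gauss(2.0, 3.0) for _ in range(100_000)]
mu_hat, s2_hat = mom_normal(xs)
print(mu_hat, s2_hat)  # close to 2 and 9 for large n, by consistency
```

Consistency (Note (ii)/(iii)) is what guarantees that both estimates approach the true values as n grows, even though σ̂²_mom is biased for every fixed n.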
However, it is not unbiased. ∎

Example 8.6.3: Let X₁, ..., Xₙ be iid Poisson(λ). We know that E(X₁) = Var(X₁) = λ. Thus, X̄ and (1/n) Σᵢ (Xᵢ − X̄)² are possible choices for the mom of λ. Due to part (v) of the Note above, one uses λ̂_mom = X̄. ∎

8.7 Maximum Likelihood Estimation

(Based on Casella/Berger, Section 7.2.2)

Definition 8.7.1: Let X = (X₁, ..., Xₙ) be an n-rv with pdf (or pmf) f_θ(x₁, ..., xₙ), θ ∈ Θ. We call the function L(θ; x₁, ..., xₙ) = f_θ(x₁, ..., xₙ) of θ the likelihood function. ∎

Note:
(i) Often θ is a vector of parameters.
(ii) If X₁, ..., Xₙ are iid with pdf or pmf f_θ(x), then L(θ; x₁, ..., xₙ) = Πᵢ f_θ(xᵢ). ∎

Definition 8.7.2: A maximum likelihood estimate (MLE) is a non-constant estimate θ̂_ML such that

L(θ̂_ML; x₁, ..., xₙ) = sup_{θ∈Θ} L(θ; x₁, ..., xₙ). ∎

Note: It is often convenient to work with log L when determining the maximum likelihood estimate. Since the log is monotone, the maximum is the same. ∎

Example 8.7.3: Let X₁, ..., Xₙ be iid N(μ, σ²), where μ and σ² are unknown.

L(μ, σ²; x₁, ..., xₙ) = (1/(σ√(2π)))ⁿ exp(−Σᵢ (xᵢ − μ)²/(2σ²))

log L(μ, σ²; x₁, ..., xₙ) = −(n/2) log σ² − (n/2) log 2π − Σᵢ (xᵢ − μ)²/(2σ²)

The MLE must satisfy

∂ log L/∂μ = (1/σ²) Σᵢ (xᵢ − μ) = 0   (A)

∂ log L/∂σ² = −n/(2σ²) + Σᵢ (xᵢ − μ)²/(2σ⁴) = 0   (B)

These are the two likelihood equations. From equation (A), we get μ̂_ML = X̄. Substituting this for μ into equation (B) and solving for σ², we get σ̂²_ML = (1/n) Σᵢ (Xᵢ − X̄)². Note that σ̂²_ML is biased for σ².

Formally, we still have to verify that we found the maximum (and not a minimum) and that there is no parameter θ at the edge of the parameter space Θ such that the likelihood function does not take its absolute maximum there (which is not detectable by using our approach for local extrema). ∎

Lecture 15: We 02/14/01

Example 8.7.4: Let X₁, ..., Xₙ be iid U(θ − 1/2, θ + 1/2), θ ∈ ℝ.

L(θ; x₁, ..., xₙ) = 1 if θ − 1/2 ≤ xᵢ ≤ θ + 1/2 ∀i; 0 otherwise.

Therefore, any θ̂ such that max xᵢ − 1/2 ≤ θ̂ ≤ min xᵢ + 1/2 is an MLE. Obviously, the MLE is not unique. ∎

Example 8.7.5: Let X ~ Bin(1, p), p ∈ [1/4, 3/4].

L(p; x) = pˣ(1 − p)¹⁻ˣ = { p, if x = 1; 1 − p, if x = 0 }

This is maximized by

p̂ = { 3/4, if x = 1; 1/4, if x = 0 },   i.e., p̂ = (2x + 1)/4.

It is

MSE_p(p̂) = E_p((p̂ − p)²) = E_p(((2X + 1)/4 − p)²) = (1/16) E_p((2X + 1 − 4p)²)
= (1/16) E_p(4X² + 4X + 1 − 16pX − 8p + 16p²)
= (1/16)(4p + 4p + 1 − 16p² − 8p + 16p²) = 1/16,

using E_p(X²) = E_p(X) = p. So p̂ is biased with MSE_p(p̂) = 1/16. If we compare this with the trivial estimate 1/2, used regardless of the data,
we have

MSE_p(1/2) = E_p((1/2 − p)²) = (1/2 − p)² ≤ (1/2 − 1/4)² = 1/16 ∀p ∈ [1/4, 3/4].

Thus, in this example, the MLE is worse than the trivial estimate when comparing their MSE's. ∎

Theorem 8.7.6: Let T be a sufficient statistic for {f_θ : θ ∈ Θ}. If a unique MLE of θ exists, it is a function of T.

Proof:
Since T is sufficient, we can write f_θ(x) = h(x) g_θ(T(x)) due to the Factorization Criterion (Theorem 8.3.5). Maximizing the likelihood function with respect to θ takes h(x) as a constant and therefore is equivalent to maximizing g_θ(T(x)) with respect to θ. But g_θ(T(x)) involves x only through T(x). ∎

Note:
(i) MLE's may not be unique; however, they frequently are.
(ii) MLE's are not necessarily unbiased.
(iii) MLE's may not exist.
(iv) If a unique MLE exists, it is a function of a sufficient statistic.
(v) Often (but not always) the MLE will be a sufficient statistic itself. ∎

Theorem 8.7.7: Suppose the regularity conditions of Theorem 8.5.1 hold and Θ is an open interval in ℝ. If an estimate θ̂ of θ attains the CRLB, it is the unique MLE.

Proof:
If θ̂ attains the CRLB, it follows by Theorem 8.5.11 that

∂ log L(θ; x)/∂θ = K(θ)(θ̂(x) − θ)   with probability 1.

Thus, θ̂ satisfies the likelihood equation. We define λ(θ) = K(θ)(θ̂(x) − θ). Then it follows

∂² log L(θ; x)/∂θ² |_{θ=θ̂} = K′(θ̂)(θ̂ − θ̂) − K(θ̂) = −K(θ̂).

The Proof of Theorem 8.5.11 gives us

K(θ) = E_θ((∂ log f_θ(X)/∂θ)²) > 0 ∀θ ∈ Θ.

So

∂² log L(θ; x)/∂θ² |_{θ=θ̂} = −K(θ̂) < 0,

i.e., log L(θ; x) has a maximum in θ̂. Thus, θ̂ is the MLE. ∎

Note: The previous Theorem does not imply that every MLE is most efficient. ∎

Theorem 8.7.8: Let {f_θ : θ ∈ Θ} be a family of pdf's (or pmf's) with Θ ⊆ ℝᵏ, k ≥ 1. Let h: Θ → Λ be a mapping of Θ onto Λ ⊆ ℝᵖ, 1 ≤ p ≤ k. If θ̂ is an MLE of θ, then h(θ̂) is an MLE of h(θ).

Proof:
For each δ ∈ Λ, we define Θ_δ = {θ ∈ Θ : h(θ) = δ} and

M(δ; x) = sup_{θ∈Θ_δ} L(θ; x),

the likelihood function induced by h. Let θ̂ be an MLE and a member of Θ_δ̂, where δ̂ = h(θ̂). It holds that

M(δ̂; x) = sup_{θ∈Θ_δ̂} L(θ; x) ≥ L(θ̂; x),

but also

sup_{δ∈Λ} M(δ; x) = sup_{δ∈Λ} sup_{θ∈Θ_δ} L(θ; x) = sup_{θ∈Θ} L(θ; x) = L(θ̂; x).

Therefore,

L(θ̂; x) = sup_{δ∈Λ} M(δ; x) = M(δ̂; x).

Thus, δ̂ = h(θ̂) is an MLE. ∎

Example 8.7.9: Let X₁, ..., Xₙ be iid Bin(1, p). Let h(p) = p(1 − p). Since the MLE of p is p̂ = X̄, the MLE of h(p) is h(p̂) = X̄(1 − X̄). ∎

Theorem 8.7.10: Consider the following conditions a pdf f_θ can fulfill:

(i) ∂ log f_θ(x)/∂θ, ∂² log f_θ(x)/∂θ², and ∂³ log f_θ(x)/∂θ³ exist for all θ ∈ Θ and for
all m Also 008 a A fg mdxE9lt9Xgt 0 voeo H 82f9 11 LOO 802 dz0 V069 00 2 iii 700 lt Wham lt 0 v0 6 9 iv There exists a function such that for all 9 E 9 83 log f9z 803 00 lt Hm and Human we lt oo 700 v There exists a function 90 that is positive and twice differentiable for every 9 E 9 and there exists a function such that for all 9 E 9 82 810 z w 90 alga 00 lt Hm and Human we lt oo 700 In case that multiple of these conditions are ful lled we can make the following statements i Cramer Conditions i iii and iv imply that with probability approaching 1 as n a 00 the likelihood equation has a consistent solution ii Cramer Conditions i ii iii and iv imply that a consistent solution n of the likelihood equation is asymptotically Normal ie nio L2 where Z N N0 1 and a2 E9 7 iii Kulldorf Conditions i iii and v imply that with probability approaching 1 as n a 00 the likelihood equation has a consistent solution 81 iv Kulldorf Conditions i7 ii7 iii7 and V imply that a consistent solution n of the likelihood equation is asymptotically Normal 0 n case of a pmf f97 we can de ne similar conditions as in Theorem 8710 88 Decision Theory 7 Bayes and Minimax Estimation Based on CasellaBerger Section 723 amp 734 Let f9 9 E 9 be a family of pdf s or pmf s Let X17 A be the set of possible actions or decisions that are open to the statistician in a given 7Xn be a sample from f9 Let situation eg7 A reject H07 do not reject H0 Hypothesis testing7 see Chapter 9 A artefact found is of Greek7 Roman origin Classi cation A 9 Estimation De nition 881 A decision function 1 is a statistic7 ie7 a Borelimeasurable function7 that maps B into A If X g is observed7 the statistician takes action dg E A l Note For the remainder of this Section7 we are restricting ourselves to A 97 ie7 we are facing the problem of estimation l De nition 882 A noninegative function L that maps 9 X A into B is called a loss function The value Lt97 a is the loss incurred to the statistician if heshe takes action a when 9 is 
the true pa rameter value I De nition 883 Let D be a class of decision functions that map E into A Let L be a loss function on 9 X A The function R that maps 9 x D into B is de ned as 30971 E9L0 1 and is called the risk function of d at 0 Example 884 Let A 9 Q B Let Lt97 a 0 7 12 Then it holds that PM d E9L07 Kim Ee0 NEW Ee0 W 83 Lecture 16 Fr 021601 Note that this is just the MSE If is unbiased this would just be Vm I Note The basic problem of decision theory is that we would like to nd a decision function 1 E D such that Rt9 d is minimized for all 0 E 9 Unfortunately this is usually not possible I De nition 885 The minimax principle is to choose the decision function 1 E D such that maXRt9d maxRt9d Vd E D 966 966 Note If the problem of interest is an estimation problem we call a 1 that satisi es the condition in De nition 885 a minimax estimate of 0 l Example 886 Let X N Bin1p p e 9 i g A We consider the following loss function Nu NH Nu NH 9 P l 4 l 4 4 4 The set of decision functions consists of the following four functions 110 4 111 120 121 2 130 131 140 4 141 2 First we evaluate the loss function for these four decision functions Lltdiltogtgt Llti 0 Lltdilt1gtgt Llti 0 Llt d1ltogtgt HEM Llt d1lt1gtgt L rs Lltd2ltogtgt Llti 0 Lltd2lt1gtgt LQEH Llt d2ltogtgt HEM Llt d2lt1gtgt Llt gt0 Lltd3ltogtgt Llti712 Lltd3lt1gtgt Llti 0 Llt d3ltogtgt i Llt gt0 Llt dslt1gtgt i HEM Lltd4ltogtgt i LQEM Lltd4lt1gtgt i LQEH Llt d4ltogtgt i L 0 Llt d4lt1gtgt Llt gt0 Then the risk function R 1100 EpLp7 dX L097 10 39PpX 0 LP7d1 PpX takes the following values 2 p i Red p Red H p6512 4RP7di 1 0 5 5 2 02 20 2 3 20 were 4 2 0 2 Hence 5 min max R di 7 ie1 2 3 4 pen4 34 4 Thus 12 is the minimax estimate Note Minimax estimation does not require any unusual assumptions However it tends to be very conservative l De nition 887 Suppose we consider 9 to be a rv with pdf 7r0 on 9 We call 7139 the a priori distribution or prior distribution l Note g 0 is the conditional density of g given a xed 0 
The joint density of g and 9 is g whom i 0 the marginal density of g is go mode and the a posteriori distribution or posterior distribution which gives the distribution of 9 after sampling has pdf or pmf W m M 9g 39 I De nition 888 The Bayes risk of a decision function 1 is de ned as Rm d E R07 1 where 7139 is the a priori distribution l Note If 9 is a continuous rv and X is of continuous type then R7rd E R07d R0d7r0d0 E9Lodxvr0d0 ltL0dgfg0dggt we L0d g 0 WW 1 d0 86 L0d 00 010 My Llt0dltggtgthlt0 l 0 d0 d0 Similar expressions can be written if 9 andor X are discrete l Lecture 17 M Tu 022001 A decision function 1 is called a Bayes rule if 1 minimizes the Bayes risk7 ie7 if R7rd Elngnd Theorem 8810 Let A 9 Q B Let L097 0 7 dg2 In this case7 a Bayes rule is 1a E0 l X 9 Proof Minimizing R7rd g 07d200 lg 10gt dg where g is the marginal pdf of X and h is the conditional pdf of 9 given g is the same as minimizing lt0 7 10020010 d0 However7 this is minimized when Kg E09 1 X g as shown in Stat 67107 Homework 3 Question ii7 for the unconditional case I Note Under the conditions of Theorem 88107 dg E09 X g is called the Bayes estimate I Example 8811 Let X N Binnp Let L 7o lm p7 dz2 Let 7rp 1 Vp E 017 ie7 7139 N U017 be the a priori distribution of p Then it holds aw ltgt29m129 m 990 f7pdp 1 71 lt gtpm1ip mdp 0 m 1 W17 pYHdp 0 Eltp z 1phltpwzgtdp 1pltngtp 1ip wdp 1 11910 PYHdp 0 m 1 0 p l 7 pYHdp 1 0 pm1ip mdp Bm2n7z1 7 Bm1n7z1 Pz21 nix1 Pm2nim1 Pz11 nix1 Pm1nim1 i m 1 7 n 2 Thus7 by Theorem 88107 the Bayes rule is A X 1 pBayes 71 2 88 The Bayes risk of 1 X is R017 1 X E Rp7dX 1 X1 E 7 2 d 0 pltltn2 PgtP 1 X1 1 E 272 2 d 0 pltn2 pn2pgtp ltX1gt27 2pltn2gtltX1gtp2ltn WW 1 Wo EPltX22X1QP712X1p2n22gt dp H n j W nplt17pgt ltan 2np 17 2pm 2W 1 19 2 dp H 1 WO 71197 npz 712192 1 2711971717 2712192 7 271197 11an 7 419 192712 471192 4192 dp 1 1 WO 14PnPnP24P2dP 1 1 7 1 74 47 2d n220lt ltn gtplt mp p 1 71742 47713 1 n22p 2 29 3 MO 1 7174 4771 n221 2 3 1 63n712872n n22 6 1 n2 n22 6 6n2 
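The closed form for the Bayes risk just derived can be checked by exact computation. The sketch below is an illustration of my own (the helper names and the choice n = 7 are arbitrary, not from the notes); it evaluates the Bayes risk of the rule d(x) = (x+1)/(n+2) under squared-error loss and the U(0,1) prior by exact rational integration, using the identity ∫₀¹ pᵃ(1−p)ᵇ dp = a! b!/(a+b+1)!:

```python
from fractions import Fraction as F
from math import comb, factorial

def beta_int(a, b):
    # Exact value of the integral of p^a (1-p)^b over (0, 1).
    return F(factorial(a) * factorial(b), factorial(a + b + 1))

def bayes_risk(n):
    """Bayes risk of d(x) = (x+1)/(n+2) for X ~ Bin(n, p), squared-error
    loss, uniform prior on p, computed by exact term-by-term integration."""
    r = F(0)
    for x in range(n + 1):
        c = F(x + 1, n + 2)
        # integral of (c - p)^2 * C(n,x) p^x (1-p)^(n-x) dp, expanded
        r += comb(n, x) * (c**2 * beta_int(x, n - x)
                           - 2 * c * beta_int(x + 1, n - x)
                           + beta_int(x + 2, n - x))
    return r

n = 7  # arbitrary illustrative sample size
print(bayes_risk(n), F(1, 6 * (n + 2)))  # both equal 1/54
```

The exact rational arithmetic avoids any rounding, so the agreement with 1/(6(n+2)) is an identity, not an approximation.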
Now we compare the Bayes rule 1X with the MLE ML This estimate has Bayes risk PM Alw emdp 11191 7d 0 712 p 1 p171 widp 0 n 41an 7 Wm lt p 1 7 1 p2 p3 7 n 2 3 0 a which is as expected larger than the Bayes risk of 1 l Theorem 8812 Let f9 9 E 9 be a family of pdf s or pmf s Suppose that an estimate 1 of 9 is a Bayes estimate corresponding to some prior distribution 7139 on 9 If the risk function R9 1 is constant on 9 then 1 is a minimaX estimate of 9 Proof Homework l De nition 8813 Let F denote the class of pdf s or pmf s f9z A class H of prior distributions is a conju gate family for F if the posterior distribution is in the class H for all f E F all priors in H and all z E X l Note The beta family is conjugate for the binomial family Thus if we start with a beta prior we will end up with a beta posterior See Homework l 9 Hypothesis Testing 91 Fundamental Notions Based on CasellaBerger Section 81 amp 83 We assume that X X1Xn is a random sample from a population distribution F9 9 E 9 Q Bk where the functional form of F9 is known except for the parameter 0 We also assume that 9 contains at least two points De nition 911 A parametric hypothesis is an assumption about the unknown parameter 0 The null hypothesis is of the form H0 2 9 E 90 C The alternative hypothesis is of the form H12 96919790 De nition 912 If 90 or 91 contains only one point we say that H0 and 90 or H1 and 91 are simple In this case the distribution of X is completely speci ed under the null or alternative hypoth esis If 90 or 91 contains more than one point we say that H0 and 90 or H1 and 91 are composite l Example 913 Let X1 Xn be iid Bin1p Examples for hypotheses are p simple p 2 com posite p 7 composite etc I Note The problem of testing a hypothesis can be described as follows Given a sample point g nd a decision rule that will lead to a decision to accept or reject the null hypothesis This means we partition the space B into two disjoint sets C and 00 such that if g E C we reject H0 9 E 90 and we 
accept H1. Otherwise, if x ∈ Cᶜ, we accept H0, i.e., that X ~ F_θ, θ ∈ Θ₀. ∎

Definition 9.1.4: Let X ~ F_θ, θ ∈ Θ. Let C be a subset of ℝⁿ such that, if x ∈ C, then H0 is rejected (with probability 1), i.e.,

C = {x ∈ ℝⁿ : H0 is rejected for this x}.

The set C is called the critical region. ∎

Definition 9.1.5: If we reject H0 when it is true, we call this a Type I error. If we fail to reject H0 when it is false, we call this a Type II error. Usually, H0 and H1 are chosen such that the Type I error is considered more serious. ∎

Example 9.1.6: We first consider a non-statistical example, in this case a jury trial. Our hypotheses are that the defendant is innocent or guilty. Our possible decisions are "guilty" or "not guilty". Since it is considered worse to punish the innocent than to let the guilty go free, we make innocence the null hypothesis. Thus, we have:

                          Truth (unknown)
  Decision (known)        Innocent (H0)      Guilty (H1)
  Not Guilty (H0)         Correct            Type II Error
  Guilty (H1)             Type I Error       Correct

The jury tries to make a decision "beyond a reasonable doubt", i.e., it tries to make the probability of a Type I error small. ∎

Definition 9.1.7: If C is the critical region, then P_θ(C), θ ∈ Θ₀, is a probability of Type I error, and P_θ(Cᶜ), θ ∈ Θ₁, is a probability of Type II error. ∎

Note: We would like both error probabilities to be 0, but this is usually not possible. We usually settle for fixing the probability of Type I error to be small (e.g., 0.05 or 0.01) and minimizing the probability of Type II error. ∎

Definition 9.1.8: Every Borel-measurable mapping φ of ℝⁿ → [0, 1] is called a test function. φ(x) is the probability of rejecting H0 when x is observed. If φ is the indicator function of a subset C ⊆ ℝⁿ, φ is called a nonrandomized test, and C is the critical region of this test function. Otherwise, if φ is not an indicator function of a subset C ⊆ ℝⁿ, φ is called a randomized test. ∎

Definition 9.1.9: Let φ be a test function of the hypothesis H0: θ ∈ Θ₀ against the alternative H1: θ ∈ Θ₁. We say that φ has a level of significance α (or φ is a level-α test, or φ is of size α) if

E_θ(φ(X)) = P_θ(reject H0) ≤ α ∀θ ∈ Θ₀.

In short, we say
that j is a test for the problem a 907 91 De nition 9110 Let 1 be a test for the problem a 907 91 For every 9 E 97 we de ne 9250 E9 gtX P9rei0t H0 We call 1509 the power function of 1 For any 9 E 917 1509 is called the power of 1 against the alternative 0 l De nition 9111 Let Pa be the class of all tests for 04907 91 A test 10 6 Pa is called a most powerful MP test against an alternative 9 E 91 if 92500 2 2509 W 6 10 De nition 9112 Let Pa be the class of all tests for a 907 91 A test 10 6 Pa is called a uniformly most powerful UMP test if 31500 2 We w e ltIgta v0 6 91 Lecture 18 We 022101 Example 9113 Let X17 Xn be iid Nu17 u E 9 0411 p0 lt 1 Let H0 X N Nuo7 1 vs H1 X N Np11 Intuitively7 reject H0 when X is too large7 ie7 if X 2 k for some k Under H0 it holds that Y N NW0 i For a given a we can solve the following equation for k i Yip kip PMOXgtkPltT ogt 1N gtPZgtzaoz Here7 791 Z N N01 and 2a is de ned in such a way that PZ gt 20 a ie7 2a is the upper orquantile of the N01 distribution It follows that ff 2a and therefore7 kp0 Thus7 we obtain the nonrandomized test 17 if E gt uo L m W 07 otherwise 1 has power i z WW1 PM X gt 0 Y 1 P 7 7 lt 1 gtIU39O H17Zagt PZ gt 2a WW1 0 EH gt0 gt 04 The probability of a Type ll error is PType ll error 17 5011 Example 9114 Let X N Bin6p7 p E 9 01 Ho p7 H1ip7 Desired level of signi cance Oz 005 Reasonable plan Since EpX 37 reject H0 when l X 7 3 l c for some constant 0 But 2 how should we select 0 m cm73WB JXM P X73E 07 6 3 0015625 003125 17 5 2 0093750 021875 27 4 1 0234375 068750 3 0 0312500 100000 Thus7 there is no nonrandomized test with 04 005 What can we do instead 7 Three possibilities i Reject if 1 X 7 3 l 37 ie7 use a nonrandomized test of size 04 003125 ii Reject if 1 X 7 3 l 27 ie7 use a nonrandomized test of size 04 021875 iii Reject if 1 X 7 3 l 37 do not reject if 1 X 7 3 lg 17 and reject with probability 005 7 003125 m 01 1f 1 X 7 3 l 2 Thus7 we obtain the randomized test 1 z06 7 we or zL5 0 zza4 This test has size a 
7 Eplt ltXgtgt 10015625 2 01 0093750 2 0 024375 2 03125 005 as intended The power of 1 can be calculated for any p 74 and it is 3AMPRXOOLX OlJMX1OLX 92 The Neymanipearson Lemma Based on CasellaBerger Section 832 Let f9 9 E 9 907 01 be a family of possible distributions of X f9 represents the pdf or pn1f of X For convenience7 we write f0g 1 90 and f g 1 91 Theorem 921 NeymaniPearson Lemma NP Lemma Suppose we wish to test H0 X N f0g vs H1 X N f g where 1 is the pdf or pn1f of K under Hi i 017 where both7 H0 and H17 are simple i Any test of the form for some k 2 0 and 0 g 17 is most powerful of its signi cance level for testing H0 VS H1 is most powerful of size or signi cance level 0 for testing H0 vs H1 If k 007 the test 17 if f0g 0 0 if Mg gt 0 Mg ii Given 0 g 04 g 17 there exists a test of the form or with g constant such that 39y ie7 a E90 ME 04 Proof We prove the continuous case only Let 1 be a test satisfying Let f be any other test with size E90 gt E90 gtX It holds that We 7 gtf1 e kfo d ltf1gtkfolt gtgt e f1 7 were lt f 1ltkfolt gtltggt e gtgf1g e mam 96 Lecture 19 Fr 022301 1kfolt gtltggt 7 gf1g 7 wow 7 o It is we 7 f1 7 were 7 lt17 m he 7 kf0 dg 2 0 f1gtkfo f1gtkfo and we7 ltggtgtltf1ltzgt7kfoltggtgtdg7 lt07 ltzgtgtlthltggt7kfoltggtgtdz20 f1ltkfo f1ltkfoT O Therefore 0 We 7 wax e 7 kfo d 7 E91 an 7 E91 ME 7 kltEeolt ltxgtgt 7 Eeon Since E90 gtX gt E90 gt it holds that W01 7 Bar 01 Z M1990 ME 7 Eeo gtX Z 0 ie j is most powerful If k 00 any test f of size 0 must be 0 on the set g l f0g gt 0 Therefore 0 7 0 E X 7 E X 1 7 d gt 0 BA 1 gs 1 91 gti 91 J Agfo0 gt f1 z 7 7 ie j is most powerful of size 0 ii If 04 0 then use Otherwise assume that 0 lt 04 g 1 and g 39y It is Eeo gti Peof1X gt kf0X VPeof1X kfol 1 P90ltf1X S kfo t VPQOUNX kf0X f1X 101X 7 1 P90 Mi Mi kl Note that the last step is valid since P90f0X 0 O kgtVP90lt Therefore given 0 lt 04 g 1 we want to nd k and V such that E90 gtX a ie Jedi i Note that l g is a rV and therefore P90 g k is a cdf 97 If there 
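The size and power computations for the randomized test of Example 9.1.14 can be reproduced directly. In the sketch below (the function names are mine), size and power are both evaluated as E_p[φ(X)] for X ~ Bin(6, p); at p = 1/2 this is the size, and at any p ≠ 1/2 it is the power against that alternative:

```python
from math import comb

def power(phi, p, n=6):
    """E_p[phi(X)] for X ~ Bin(n, p): the power function of a
    (possibly randomized) test phi; at p under H0 this is the size."""
    return sum(phi(x) * comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(n + 1))

# Randomized test of Example 9.1.14: reject if |x - 3| = 3,
# reject with probability 0.1 if |x - 3| = 2, else do not reject.
def phi(x):
    d = abs(x - 3)
    return 1.0 if d == 3 else (0.1 if d == 2 else 0.0)

print(power(phi, 0.5))  # size: 0.05 (up to float rounding)
print(power(phi, 0.8))  # power against the alternative p = 0.8
```

Randomizing only on the boundary set {|x − 3| = 2} is exactly what makes the size hit 0.05, which no nonrandomized critical region can achieve here.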
exists a k0 such that we choose 39y 0 and k k0 Otherwise if there exists no such k0 then there exists a k1 such that ie the cdf has a jump at k1 In this case we choose k k1 and Theorem 921 mama Let us verify that these values for k1 and 39y meet the necessary conditions Obviously P90ltf1lt gt S 1 ME k1gt1a Also since it follows that 39y 2 0 and since X X X P90 k1 7 17 a lt P90 2 k1 7 P90 g lt k1 f ix i f ix P90 fag k1 P90 fax k1 P90 lt Pen 0 k1 1 7 it follows that y S 1 Overall7 0 S y S 1 as required W 1405822816 1 9 E 9 0001 then the Neymani e If a suf cient statistic T exists for the family f9 Pearson most powerful test is a function of T Proof Homework l Example 923 We want to test H0 X N N01 vs H1 X N Cauchy17 07 based on a single observation It is The MP test is 26XPL ltz 17 lf mzr 0 otherwise where k is determined such that EHO gtX a If 04 lt 01137 we reject H0 if l x lgt 2 where 2 is the upper quantile of a N01 distribution lfa gt 01137 we reject H0 if l x lgt k1 or if l x llt k2 where k1 gt 07 k2 gt 07 such that 1 15 k 2 7 w w and 1 1 XP d 1 a k 7 e 1 hf 1 kg 2 27139 2 Example 923a Example 923 x s 35 x i 35 t al haZZEI 1132 alphaZ EIM 2 E D x I g D 5 z z z s l l m 72 4 585 4 El 1 1585 2 72 4 585 4 u 1 1585 2 100 Why is 04 0113 so interesting For x 07 it is f 1 90 M90 Similarly7 for z x 71585 and z x 15857 it is M90 M90 More importantly7 PHOO X lgt 1585 0113 101 2 aw i1285 7r1i15852 V2 x 07979 7r f10 x 07979 x f00 93 Monotone Likelihood Ratios Based on CasellaBerger Section 832 Suppose we want to test H0 9 g 00 vs H1 9 gt 00 for a family of pdf s f9 9 E 9 Q B In general it is not possible to nd a UMP test However there exist conditions under which UMP tests exist De nition 931 Let f9 9 E 9 Q B be a family of pdf s pmf s on a onekdimensional parameter space We say the family f9 has a monotone likelihood ratio MLR in statistic T if for 01 lt 02 whenever f9 and 1 92 are distinct the ratio 22 is a nondecreasing function of Tg 1 for the set of values g for which 
at least one of f9 and 1 92 is gt O l Note We can also de ne families of densities with nonincreasing MLR in TX but such families can be treated by symmetry l Example 932 Let X1 Xn N U00 9 gt 0 Then the joint pdf is 0 g mmam g 0 1 otherwise 070 9mm7 fag 3 where mm max mi i1n Let 02 gt 01 then f92 7 t91 n1092maz lt gt I 1091 091 maz t92 It is 17 mmam E 0701 7 mmam E 01702 I092maw 091maz 00 since for mm 6 0 01 it holds that mm 6 002 But for 7M 6 09102 it is 091mam 0 gt as TX Xmam increases the density ratio goes from to 00 gt gff is a nondecreasing function of TX Xmam gt the family of U0 0 distributions has a MLR in TX Xmam l 102 Theorem 933 The oneparameter exponential family f9g expQ0Tg D09 Sg7 where 6209 is nondecreasing7 has a MLR in Proof Homework l Example 934 Let X X17 7Xn be a random sample from the Poisson family with parameter gt 0 Then the joint pdf is fg ETAVii ETnAAE1M exp nA logx ilogzilgt 11 11 11 11 which belongs to the oneparameter exponential family Since Q log is a nondecreasing function of 7 it follows by Theorem 933 that the Poisson family with parameter gt 0 has a MLR in TX iXi 11 We can verify this result by De nition 931 fA2 322m 54m 7 ltAzgt2mi nlthwml fA1lt gt 7 A12 57ml 71 i i i If 2 gt A1 then if gt 1 and 1s a nondecreasing function of Zn V L Therefore A has a MLR in g 2X1 I 11 Theorem 935 Let X N f9 9 E 9 Q B where the family f9 has a MLR in For testing H0 0 g 00 vs H1 9 gt 00 00 E 9 any test of the form we v if T has a nondecreasing power function and is UMP of its size E90 gt 01 if the size is not 0 Also7 for every 0 g 01 g 1 and every 00 E 97 there exists a to and a 39y 00 g to g 007 0 g 39y 17 such that the test of form is the UMP size 01 test of H0 vs H1 103 Let 01 lt 02 0102 6 9 and suppose E91 gt 0 ie the size is gt 0 Since f9 has a MLR in T is a nondecreasing function of T Therefore any test of form is equivalent to 1 a test f lt v a z 1 1f fang gt k 7 A f92 7 i 3977 1f fTQ k A f9 Q 0 1f fang lt k which by the NeynianiPearson 
Lemma Theorem 921 is MP of size 04 for testing H0 0 01 vs H1 0 02 Let 10 be the class of tests of size a and let 1 E 10 be the trivial test gttg Oz Vg Then 1 has size and power a The power of the MP test 1 of form must be at least 04 as the MP test 1 cannot have power less than the trivial test t ie Ee2 gt Z Ee2 gtti a E91 gtX Thus for 02 gt 01 Ee2 gti Z E91 gtX7 ie the power function of the test 1 of form is a nondecreasing function of 0 Now let 01 00 and 02 gt 00 We know that a test 1 of form is MP for H0 0 00 vs H1 0 02 gt 00 provided that its size 04 E90 gtX is gt 0 Notice that the test 1 of form does not depend on 02 It only depends on to and 39y Therefore the test 1 of form is MP for all 02 E 91 Thus this test is UMP for simple H0 0 00 vs composite H1 0 gt 00 with size E90 gtX 040 Since 1 is UMP for a class lt1 of tests gt E 1 satisfying mown a0 1 must also be UMP for the more restrictive class PM of tests gt 6 PM satisfying E9 gtWX S 040 V0 S 00 But since the power function of j is nondecreasing it holds for j that Ee gtX Emmi a0 v0 00 Thus 1 is the UMP size 040 test of H0 0 g 00 vs H1 0 gt 00 if 040 gt 0 104 Lecture 22 Fr 030201 77 Use the Neymanipearson Lemma Theorem 921 I Note By interchanging inequalities throughout Theorem 935 and its proof7 we see that this The orem also provides a solution of the dual problem H6 t9 2 00 vs Hi 9 lt 00 l Theorem 936 For the onekparameter exponential family7 there exists a UMP twoisided test of H0 9 g 01 or 9 2 02 Where 01 lt 02 vs H1 01 lt 9 lt 02 of the form 17 if cl lt Tg lt 02 Mg m if T Civ i172 07 if Tg lt cl or if Tg gt 02 Note UMP tests for H0 01 g 9 g 02 and H6 t9 00 do not exist for oneiparameter exponential families I 105 94 Unbiased and Invariant Tests Based on Rohatgi Section 95 RohatgiSaleh Section 95 amp CasellaBerger Section 832 If we look at all size 04 tests in the class P0 there exists no UMP test for many hypotheses Can we nd UMP tests if we reduce 10 by reasonable restrictions De nition 941 A size 04 
test 1 of H0 0 E 90 vs H1 0 E 91 is unbiased if E9 gtX gt a v0 6 91 I Note This condition means that 0 04 V0 6 90 and 0 2 04 V0 6 91 In other words the power of this test is never less than a l De nition 942 Let U0 be the class of all unbiased size 04 tests of H0 vs H1 If there exists a test 1 6 U0 that has maximal power for all 0 E 91 we call 1 a UMP unbiased UMPU size 04 test I Note It holds that U0 Q 10 A UMP test 10 E 10 will have 3 2 04 V0 6 91 since we must compare all tests 10 with the trivial test Mg 04 Thus if a UMP test exists in P0 it is also a UMPU test in Ua l Example 943 Let X1 Xn be iid Nu02 where 02 gt 0 is known Consider H0 1 p0 vs H1 1 7 0 From the NeymaniPearson Lemma we know that for p1 gt 0 the MP test is of the form if X gt uo 20 otherwise 1 ifYltpoi za 0 otherwise If a test is UMP it must have the same rejection region as 1 and 12 However these 2 rejection regions are different actually their intersection is empty Thus there exists no 106 UMP test We next state a helpful Theorem and then continue with this example and see how we can nd a UMPU test I Theorem 944 Let 01 0 E B be constants and f1 g7 fn1g be real7valued functions Let C be the class of functions Mg satisfying 0 Mg 1 and 00 wailgm vz 1n 00 If gt E C satis es 7L 17 if fn1 gt Zkifllg if 0 iffn1lt gt lt Elmg i1 00 for some constants k1 7 k E R then f maximizes gtgfn1gdg among all j E C 00 Proof Let gtg be as above Let Mg be any other function in C Since 0 Mg 1 Vg it is WQ 7 lt gtgt we 7 Zkif o Z 0 Vi i1 This holds since if gtg 17 the left factor is 2 0 and the right factor is 2 0 If gtg 07 the left factor is g 0 and the right factor is g 0 Therefore7 0 S g 7 m ung 7 im dg i1 gt fn1 dg 7 fn1gd Thus7 ciici fn1 dg2 gtgfn1gd Note i If fwd is a pdf7 then f maximizes the power ii The Theorem above is the Neyman7Pearson Lemma if n 17 f1 10907102 1 91 and 3104 107 Example 943 continued So far7 we have seen that there exists no UMP test for H0 u p0 vs H1 u 344 0 We will show that 3 O7 
otherwise 1 ifYlt M0 7 2042 or ifYgt to 2042 is a UMPU size 04 test Due to Theorem 9227 we only have to consider functions of suf cient statistics TX Y Let 7392 To be unbiased and of size a a test 1 must have lt1 gtlttgtmlttgtdt a and H 8 u 3 tfttdt1MMO gtlttgt aw dt O7 ie7 we have a minimum at 0 PPM We want to maximize gttftdt u 344 do such that conditions and ii hold W h b t d 1 t Lecture 23 e c oose an ar 1 rary p1 7 p0 an e MO 030501 mt mt mt 83m I PPM mt M We now consider how the conditions on f in Theorem 944 can be met f3t gt k1f1tk2f2t 1 1 7 2 k1 1 7 2 mTexm we m gt mTexm we M0 kg 1 7 2 U O mTexm we 0 T2 gt 1 7 1 7 1 7 m 7 p0 ltgt eXPV W 02 gt k1 eXPP W HOV k2 eXPP W M02 T2 1 if explt ltltiwogtzeltiemgt2gtgt gt k1k2lt JO 3 7 27 2 E7 ltgt amwimw m gt k1k2 0 108 Note that the left hand side of this inequality is increasing in E if m gt 30 and decreasing in E if m lt 30 Either way7 we can choose k1 and k2 such that the linear function in E crosses the exponential function in E at the two points 039 039 ML M0 20427 MU 0 2042 Obviously7 gt3 satis es We still need to check that gt3 satis es ii and that 3amp3 has a minimum at uo but omit this part from our proof here gt3 is of the form gt in Theorem 944 and therefore gt3 is UMP in C But the trivial test 1 04 also satis es and ii above Therefore7 3amp3 2 Oz VIM 7 30 This means that gt3 is unbiased Overall7 gt3 is a UMPU test of size a l De nition 945 A test 1 is said to be aisimilar on a subset 9 of 9 if 509 E9 X a V0 6 9 A test 1 is said to be similar on 9 Q 9 if it is orsimilar on 9 for some a 0 g 04 g 1 l Note The trivial test Mg 04 is orsimilar on every 9 Q 9 l Theorem 946 Let 1 be an unbiased test of size 04 for H0 0 E 90 vs H1 0 E 91 such that 3amp0 is a continuous function in 0 Then 1 is orsimilar on the boundary A 9io 9i17 where 970 and 9il are the closures of 90 and 917 respectively m Let 0 E A There exist sequences 0 and whith 0n 6 90 and 0 E 91 such that 0 and 0 By continuity 5090 9 2509 and WW 9 3450 
Since 3amp0 04 implies 3amp0 Oz and since 3amp0g 2 04 implies 3amp0 2 04 it must hold that 3amp0 a V0 6 A I 109 De nition 947 A test 1 that is UMP among all orsimilar tests on the boundary A 970 O Gil is called a UMP aisimilar test I Theorem 948 Suppose 0 is continuous in 0 for all tests 1 of H0 0 E 90 vs H1 test of H0 vs H1 is UMP orsimilar then it is UMP unbiased Proof Let 10 be UMP orsimilar and of size a This means that E9 gtX lt 04 V0 6 90 06911fasizea Since the trivial test Mg 04 is orsimilar it must hold for 10 that 00 2 04 V0 6 91 since 10 is UMP orsimiliar This implies that 10 is unbiased Since 0 is continuous in 0 we see from Theorem 946 that the class of unbiased tests is a subclass of the class of orsimilar tests Since 10 is UMP in the larger class it is also UMP in the subclass Thus 0 is UMPU l Note The continuity of the power function 0 cannot always be checked easily I Example 949 Let X1XnNu1 LetH0u 0vsH1ugt0 V L Since the family of densities has a MLR in Z Xi we could use Theorem 935 to nd a UMP i1 test However we want to illustate the use of Theorem 948 here It is A 0 and the power function mo M yexp 7 i m dg of any test 1 is continuous in 0 Thus due to Theorem 946 any unbiased size 04 test of H0 is orsimilar on A WeneedaUMPtest ofH u0vsH1 ugt0 By the NP Lemma a MP test of Hg u 0 vs Hf 1 if exp 7 LT gt k 0 u pl where m gt 0 is given by 0 otherwise 110 Lecture 24 We 030701 or equivalently by Theorem 922 n 1 if T EX gt k 14 i1 0 otherwise Since under H0 T N N0n k is determined by 04 PM0T gt k P k za j is independent of M1 for every 1 gt 0 So 1 is UMP orsimilar for H6 vs H1 T k W gt W ie Finally 1 is of size 04 since for u g 0 it holds that EM gtX PT gt za Pltgt2awgt PZ gt 20 Oz 5 holds since TX N N0 1 for u S 0 and 2a 7 W Z 20 for u S 0 Thus all the requirements are met for Theorem 948 ie gs is continuous and j is UMP orsimilar and of size a and thus 1 is UMPU l Note Rohatgi page 4287430 lists Theorems Without proofs stating that for Normal data onei 
and twoitailed titests one and twoitailed Xzitests twoisample titests and Fitests are all UMPU l Note Recall from De nition 824 that a class of distributions is invariant under a group G of trans formations if for each 9 E Q and for each 9 E 9 there exists a unique 0 E 9 such that if X N P9 then N PQI I De nition 9410 A group G of transformations on X leaves a hypothesis testing problem invariant if Q leaves both P9 9 E 90 and P9 9 E 91 invariant ie if a 9g N hag then f9g 9 E 90 E haw 9 E 90 and f9g 9 E 91 E haw 9 E 91 l 111 Note We want two types of invariance for our tests Measurement Invariance lfa 9g is a 17to71 mapping the decision based on a should be the same as the decision based on g If Mg is the test based on g and gta is the test based on a then it must hold that Mg f gta Formal Invariance If two tests have the same structure ie the same 9 the same pdf s or pmf s and the same hypotheses then we should use the same test in both problems So if the transformed problem in terms of a has the same formal structure as that of the problem in terms of g we must have that gta Mg f We can combine these two requirements in the following de nition I De nition 9411 An invariant test with respect to a group G of tansformations is any test 1 such that we gt9z V V9 6 9 Example 9412 Let X N Binnp Let H0 p vs H1 p 7 Let Q 9192 where 91z n 7 z and 92z m lf 1 is invariant then gtz gtn 7 ls the test problem invariant For 92 the answer is obvious For 91 we get 91X n 7 X N Binn17p H0 pfpxp x Bmmp Bmngtz7 So all the requirements in De nition 9410 are met If for example 71 10 the test 1 ifz 01289 10 WE 0 otherwise is invariant under G For example gt4 0 gt10 7 4 gt6 and in general gtm gt107m Vz e 0 19 10 I 112 Example 9413 Let X17 7Xn N N01 02 where both In and 02 gt 0 are unknown It is Y N N01 and 71772152 N xiil and Y and 52 independent LetHO lu 0vs H1 ugt0 Let Q be the group of scale changes 9 gm 32c gt 0 90 s caczsm The problem is invariant because7 when gc 32 of 32327 then of 
and 0252 are independent gt e M a 6252 N xiil 2 CY N Ncu7 020 n n71 c2 72 So7 this is the same family of distributions and De nition 9410 holds because u g 0 implies that clu 0 for c gt 0 An invariant test satis es ME 52 E twig7 c gt 07 52 gt 0 E R E 7 1 7 2 7 i i 7 2 7 Let c 7 Then Mm s Mg 1 so invariant tests depend on x s only through lf ii 7 91 no restrictions on 1 for different Thus7 invariant tests are exactly those that depend if then there exists no 0 gt 0 such that 127 3 cfhcz So invariance places only on g which are equivalent to tests that are based only on t E Since this mapping SW39 is 17toi17 the invariant test will use T SX N tn1 if u 0 Note that this test does not depend on the nuisance parameter 02 lnvariance often produces such results I De nition 9414 Let Q be a group of transformations on the space of X We say a statistic Tg is maximal invariant under G if i T is invariant7 ie7 Tg Tgg V9 6 Q and ii T is maximal7 ie7 Tg1 Tg2 implies that 1 9g2 for some 9 E Q 113 Lecture 2 Fr 030901 Example 9415 Let g 17mn and 90g x1 677n6 Consider Tg ml 7 mhzn 7 m2 7m 7 zn1 It is Tgcg mn 7 m1 zn 7 m2 zn 7 zn71 Tg7 so T is invariant lfTg Tg 7 then mn7xi z l7m Vi 127717 1 This impliesthat m7m zn7z lc Vi 1277171 Thus ng c7 x c g Therefore7 T is maximal invariant De nition 9416 Let a be the class of all invariant tests of size 04 of H0 9 E 90 vs H1 9 E 91 If there exists a UMP member in Ia it is called the UMP invariant test of H0 vs H1 l Theorem 9417 Let Tg be maximal invariant with respect to Q A test 1 is invariant under Q iff j is a function of T Let 1 be invariant under Q If Tg1 Tg27 then there exists a g E G such that 1 9g2 Thus7 it follows from invariance that Mil lt9lt2 Mgg Since 1 is the same Whenever Tg1 Tg27 1 must be a function of T 77 Let 1 be a function of T7 ie7 Mg It follows that gt9 hT9 2 hT Mg holds since T is invariant This means that j is invariant 114 Example 9418 Consider the test problem HOXNfO13170713n70VSH1wa111707n3971n707 where 9 E B Let Q 
be the group of transformations with 94 1Cznc WherecEB andn 2 As shown in Example 9415 a maximal invariant statistic is TX X1 7 X Xn1 7 X T1 Tn1 Due to Theorem 9417 an invariant test 1 depends on X only through T Since the transformation T1 X1 7 X lt T gt Z Tn71 Xn71 7 Xn Z X is 17to71 there exists inverses Xn Z and X T Xn T Z Vi 1n 7 1 Applying Theorem 435 and integrating out the last component Z X gives us the joint pdf of T T1 Tn1 00 Thus under Hhi O 1 the joint pdf of T is given by fit1 2252 z tn1 22d2 which is independent of 0 The problem is thus reduced to testing a simple hypothesis against a simple alternative By the NP Lemma Theorem 921 the MP test is 1 if t gt c ltt1quotquot t 1 0 if A lt c CO f1t1zt2ztn1zzdz where t t1 tn71 and Mt quot 0 f0t1zt2ztn1zzdz 00 In the homework assignment we use this result to construct a UMP invariant test of H0 KN N01 vs H1 KN Cauchy10 1 where a Cauchy10 distribution has pdf m 0 7m where 9 E R l 7139 13 7 115 10 More on Hypothesis Testing 101 Likelihood Ratio Tests Based on CasellaBerger Section 821 De nition 1011 The likelihood ratio test statistic for H0 0690VS H1 96919790 sup Ma 9680 Ms SUP f9lt gt 39 966 The likelihood ratio test LRT is the test function Mg Iock 7 for some constant c E 017 where c is usually chosen in such a way to make 1 a test of size 04 I Note i We have to select 0 such that 0 g c g 1 since 0 Mg 1 ii LRT s are strongly related to MLE s lf is the unrestricted MLE of 9 over 9 and o is the MLE of 0 over 90 then g f3 g Example 1012 Let X17 7Xn be a sample from Nu1 We want to construct a LRT for H03 MM0VSH13 75110 It is g uo and lil X Thus7 2704 expk 2W HOV WOW2 expki 2 EV M2 expege 7 M 116 Lecture 26 Mo 031901 The LRT rejects H0 if Mg c or equivalently l E 7 pg 2 072 This means the LRT rejects H0 u go if is too far from 0 l Theorem 1013 lf TX is suf cient for 9 and t and Mg are LRT statistics based on T and K respectively then T Mg Vs ie the LRT can be expressed as a function of every suf cient statistic 
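In Example 10.1.2 the likelihood ratio collapses to λ(x) = exp(−(n/2)(x̄ − μ0)²), equivalently −2 log λ(x) = n(x̄ − μ0)², so rejecting for small λ is rejecting for large |x̄ − μ0|. A minimal numerical sketch of this computation (plain Python; the sample values are invented for illustration):

```python
import math

def lrt_normal_known_variance(xs, mu0):
    """Likelihood ratio of Example 10.1.2: X_i iid N(mu, 1), H0: mu = mu0.

    The unrestricted MLE of mu is the sample mean, so the ratio of the
    restricted to the unrestricted maximized likelihood reduces to
    lambda(x) = exp(-n/2 * (xbar - mu0)**2).
    """
    n = len(xs)
    xbar = sum(xs) / n
    lam = math.exp(-0.5 * n * (xbar - mu0) ** 2)
    # -2 log lambda = n * (xbar - mu0)^2; Theorem 10.1.5 supplies the
    # asymptotic chi-squared(1) reference distribution under H0.
    return lam, n * (xbar - mu0) ** 2

xs = [0.3, -0.1, 0.8, 0.2, -0.4]   # hypothetical observations
lam, stat = lrt_normal_known_variance(xs, mu0=0.0)
```

With these values x̄ = 0.16, so −2 log λ = 5 · 0.16² = 0.128, far below the χ²₁ critical value 3.84 at α = 0.05, and H0 would not be rejected.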
for 0 Proof Since T is suf cient it follows from Theorem 835 that its pdf or pmf factorizes as f9g 99Thg Therefore we get 981161 f9lt gt m 60 7 sup f9 z 966 sup 99Th 9680 sup99Th 966 sup 99T 9680 sup 99 T 966 T Thus our simpli ed expression for Mg indeed only depends on a suf cient statistic T l Theorem 1014 If for a given a 0 g 04 g 1 and for a simple hypothesis H0 and a simple alternative H1 a nonirandomized test based on the NP Lemma and a LRT exist then these tests are equivalent Proof See Homework l Note Usually LRT s perform well since they are often UMP or UMPU size 04 tests However this does not always hold Rohatgi Example 4 page 4407441 cites an example where the LRT is not unbiased and it is even worse than the trivial test Mg 04 l Theorem 1015 Under some regularity conditions on f9g the rv 72log MK under H0 has asymptotically 117 a chiisquared distribution with 1 degrees of freedom7 where 1 equals the difference between the number of independent parameters in 9 and 907 ie7 72log MK L X12 under H0 Note The regularity conditions required for Theorem 1015 are basically the same as for Theorem 8710 Under independen 7 parameters we understand parameters that are unspeci ed7 ie7 free to vary Example 1016 Let X17 7Xn N Nuz72 where u E B and 02 gt 0 are both unknown LetHO puovsH1 py luo We have 9 p02 9 01772 u 6 B702 gt 0 and 90 010702 02 gt 0 n It is 90 IMO7 7 I110 and t9 E7 7 i1 39 l H Now7 the LR test statistic Mg can be determined f o Mg I M 1 EwiiMoV 7ltEltwmogt2gt eXplt 2 Emmy 1 Zltwr 2 7ltEltzmgt2gt eXp 72 BMW 200139 Molz A n ngin12 5 E imEman in m E 1 E WV 1 Draw Note that this is a decreasing function of V7477 0 G M0 t 1 i i N n7 zltX7Xgt2 S 118 5 holds due to Corollary 724 So we reject H0 if g is large Now7 7 n 0 mm 1glt1 2 7 3977 2 Under H07 M N N01 and 2072 N kl and both are independent according to Theorem 721 Therefore7 under H07 7 nltXLO2 F1 1 i m ZltXi Xlz Thus7 the mgf of 72 log MK under H0 is Mnt EHoeXp2t10gX EHoeXpnt10g1 n 71 EHoltexpltloglt1 
n M EHOltlt1n71gtmgt Note that P 9 1 mm 1 2 is the pdf of a Fukl distribution 71 Let y 1 g then 1 7 and df fligdy Thus7 Mn wlt1 gtwdf r H 1 n 2 n1 yTa m1y dy 0 119 5 holds since the integral represents the Beta function see also Example 8811 As 71 A 007 we can apply Stirling s formula which states that Pltaltngt1gt o aw o Mltaltngtgtaltngt expltealtngtgt SO Lecture 27 We 032101 MM explteLg2gtmltL tHgt 2 expeL i H 417207 exp7L23 W 2 1exp77nligt72 172072 ltng2gtlt ampgt lt11 t 1 le 1 mwwwf 2 WW W 1 1 1 He as Hoe He i as Hoe 4417207 as Hoe Thus7 Mnt H as n a 00 1 1 H 223 Note that this is the mgf of a x distribution Therefore7 it follows by the Continuity Theorem Theorem 642 that 2 10g Ml L Xi Obviously7 we could use Theorem 1015 after checking that the regularity conditions hold as well to obtain the same result Since under H0 1 parameter 02 is unspeci ed and under H1 2 parameters 1 02 are unspeci ed7 it is 1 2 7 1 1 l 120 102 Parametric ChiSquared Tests Based on Rohatgi Section 103 amp RohatgiSaleh Section 103 De nition 1021 Normal Variance Tests Let X17 Xn be a sample from a Nu7 02 distribution where u may be known or unknown and 02 gt 0 is unknown The following table summarizes the X2 tests that are typically being used Reject H0 at level 04 if H0 H1 1 known 1 unknown 1 0 Z 00 0 lt 00 2 I02 S UEXiqw 32 S X ZL711W 2 H a no a gt no m e m2 2 03x3 32 2 02 HI 7 70 7 3 70 EX 7 W2 3 ngiuiaZ 32 3 Tflelil i tZ 2 Or 7 I Z UngLDL2 or 82 Z Xi1DL2 I Note i In De nition 10217 00 is any xed positive constant ii Tests 1 and H are UMPU if u is unknown and UMP if u is known iii In test 1117 the constants have been chosen in such a way to give equal probability to each tail This is the usual approach However7 this may result in a biased test iv xiglia is the lower 04 quantile and xfw is the upper 17 Oz quantile7 ie7 for X N xi it holds that PX xfmia Oz and PX xgw1i 04 v We can also use X2 tests to test for equality of binomial probabilities as shown in the next few Theorems Theorem 
1022 Let X17 Xk be independent rv s with X N Binnipii 17 7 k Then it holds that X H 2 TZlt 1 77sz gt in 11 MMU Pi asn17nkaoo 121 Proof Homework l Corollary 1023 Let X17 Xk be as in Theorem 1022 above We want to test the hypothesis that H0 p1 p2 pk 197 where p is a known constant vs the alternative H1 that at least one of the ms is different from the other ones An appoximate levelioz test rejects H0 if k 2 l 7MP 2 y Z Xk gQnip p a Theorem 1024 Let X17 Xk be independent rv s with X N Binltnip7i177k Then the MLE ofp is k 2 i1 13 T Z m i1 Proof This can be shown by using thejoint likelihood function or by the fact that E X N B2742 7747p and for X N Binnp7 the MLE is 13 l Theorem 1025 Let X1Xk be independent rv s with X N Binltni7pi7i 17k An approximate levelioz test of H0 p1 p2 pk 197 where p is unknown vs the alternative H1 that at least one of the ms is different from the other ones7 rejects H0 if m 2 l 7 l 2 y Z inlpm where 13 l Theorem 1026 k Let X17 7Xk be a multinomial rv with parameters 71191192 pk where Zpi 1 and i1 k 2X1 n Then it holds that i1 asnaoo An approximate levelioz test of H0 p1 193192 p z pk p2 rejects H0 if k 2 9539 i 711939 2 Z 1 mp l gt inlmz39 i1 Proof Case k 2 only 7 2 7 2 U2 X1 n191 X2 71192 71191 71192 X1 711902 01 X1 711 p12 nP1 711 191 X1 7 mp1 mp1 n17291 1 P1 P1 X w 2 ltlt gt 1 p1 n1911191 X1 71P12 nP11 P1 By the 0LT M L N0 1 Therefore U2 L X I n1911 191 Lecture 28 Theorem 1027 Fr 032301 Let X17 7Xn be a sample from X Let H0 X N F7 where the functional form of F is known completely We partition the real line into k disjoint Borel sets A17Ak and let PX Aipiwherepigt0 Vi1k 7L Let Y ng in A739 Emmi Vj i1 Then7 Y17 7Yk has multinomial distribution with parameters 71191192 pk l Theorem 1028 Let X17 7Xn be a sample from X Let H0 X N FQ where Q 917 707 is unknown Let the MLE Q exist We partition the real line into k disjoint Borel sets A1 Ak and let PX 6 Ai 1317Where13i gt 0 Vi1k 7L Let Y ng in A Emmi Vj i1 Then it holds that k V 7 n A39 2 d Vk 1 
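The multinomial statistic of Theorem 10.2.6, Σⱼ (Yⱼ − npⱼ)² / (npⱼ), is straightforward to compute directly. A short sketch (plain Python; the die-roll counts below are invented for illustration):

```python
def pearson_chi2(counts, probs):
    """Pearson statistic of Theorem 10.2.6: sum_j (y_j - n p_j)^2 / (n p_j).

    Under H0: p_j = p_j^0 for all j, it is approximately chi-squared
    with k - 1 degrees of freedom for large n.
    """
    n = sum(counts)
    return sum((y - n * p) ** 2 / (n * p) for y, p in zip(counts, probs))

# H0: a fair die; 60 rolls sorted into k = 6 cells
counts = [8, 12, 9, 11, 10, 10]
stat = pearson_chi2(counts, [1 / 6] * 6)
```

Here each expected count is 10, the statistic equals 1.0, and since this is well below the χ²₅ critical value 11.07 at α = 0.05, the fair-die hypothesis is not rejected.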
5191 Xii39ril39 i1 mp1 123 An approximate levelioz test of Ho X N Fggejects Ho if k A 2 11239 747239 2 Z 7 gt inrilmy where quotr is the number of parameters in 97that have to be estimated 124 103 tiTests and FiTests Based on Rohatgi Section 104 amp 105 amp RohatgiSaleh Section 104 amp 105 De nition 1031 One7 and TwoiTailed t Tests Let X17 7Xn be a sample from a N01 02 distribution where 02 gt 0 may be known or 7L 7L unknown and u is unknown Let Y and 2 7 Y i1 i1 The following table summarizes the 27 and titests that are typically being used Reject H0 at level 04 if H0 H1 02 known 02 unknown I U P O MgtM0 i2p0 ze EZMO l tniln U MEMO ultu0 E M0 ziia E M0 tmmw 1U MM0 Mo lE7M0l2Ze2 liipolz tnilwZ Note i In De nition 10317 uo is any xed constant ii These tests are based on just one sample and are often called one sample titests iii Tests 1 and H are UMP and test 111 is UMPU if 02 is known Tests 17 H7 and III are UMPU and UMP invariant if 02 is unknown iv For large n 2 307 we can use zitables instead of t tables Also7 for large n we can drop the Normality assumption due to the CLT However7 for small 71 none of these simpli cations is justi ed De nition 1032 Twoisample t Tests Let X17 Xm be a sample from a Np1af distribution where a gt 0 may be known or unknown and M1 is unknown Let Y1 Yn be a sample from a NW2 0 distribution where 0 gt 0 may be known or unknown and Hg is unknown m m Let Y 2X1 and 12 2X 7 Y i1 i1 i 71 71 Let Y 7121 and 2 41211 7 Y i1 39 71SZ ins Let S mlmjg 2 The following table summarizes the 27 and titests that are typically being used 125 Reject H0 at level 04 if H0 H1 00393 known 0 0 unknown 01 02 7 7 If 7 7 7 1 1 I M1 M2g6 Mli 2gt6 z7y26204 l iyZ6tmn72n3p l 2 2 1 17111226 M17M2lt6 fig 6217ozvznili72 Eiy 6tmn7217a5pvi 2 2 HI M1 M26 M1 M27 6 liiyi6lZZa2Va l l a z lfiyi6thmn72a23pl I Note i In De nition 1032 6 is any xed constant ii All tests are UMPU and UMP invariant iii If a 0 02 which is unknown then S is an unbiased estimate of 02 We should check 
that a 0 with an F7test 1v For large m n we can use 27tables instead of t tables Also for large m and large n we can drop the Normality assumption due to the CLT However for small m or small 71 none of these simpli cations is justi ed I De nition 1033 Paired tTests Let X1 Y1 Xn Y be a sample from a bivariate NW1 2 0f 0 0 distribution where all 5 parameters are unknown Let D X 7 Y N NW1 7 pg 0 a 7 200102 7L 7L Let E i213 and 2 ii 21 in i1 13971 The following table summarizes the t7tests that are typically being used H0 H1 Reject H0 at level a if I M1 M2g6 M1M2gt5 EZ6tn71a H M1 M226 M1 M2lt6 E 6tn711704 HI M1 M26 M1 M27 6 l3 6liiitnem2 126 Note i In De nition 10337 6 is any xed constant ii These tests are special cases of one7sample tests All the properties stated in the Note following De nition 1031 hold iii We could do a test based on Normality assumptions if 02 a 0 7 20ch were known7 but that is a very unrealistic assumption De nition 1034 F7Tests Let X17 Xm be a sample from a Nul7 a distribution where 111 may be known or unknown and a is unknown Let Y1 Yn be a sample from a Np2a distribution where p2 may be known or unknown and 0 is unknown Recall that m i n i 09 7 X2 01 7 Y2 11 N X2 i1 N X2 U milv 0 W717 and m i 09 7 X2 i1 m71a 43st 1 1 n gs W77 0 7 2 i1 71 71W The following table summarizes the F7tests that are typically being used Reject H0 at level 04 if phlug unknown H0 H1 17112 known I a g a a gt a Z F7WW g 2 Fm1n1a HI af a af 7g a 2 F mud2 2 FmA AWZ if s 2 s 1 2 2 7 E y397M2 s i g gt F or i 2 Fn71m1a21f8 lt 5 or A izxmiimy 7 nma2 127 Note i Tests I and H are UMPU and UMP invariant if m and 2 are unknown ii Test 111 uses equal tails and therefore may not be unbiased iii If an F7test at level 011 and a t7test at level 012 are both performed7 the combined test has level a 1 7 1 7 a11 7 a2 2 max017012 011 bag if both are small 128 104 Bayes and Minimax Tests Based on Rohatgi Section 106 amp RohatgiSaleh Section 106 Hypothesis testing may be conducted in a 
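The two-sample statistic of Definition 10.3.2, built on the pooled variance S_p² = ((m−1)S₁² + (n−1)S₂²)/(m+n−2), can be sketched as follows (plain Python; the data values are invented for illustration):

```python
import math

def pooled_two_sample_t(xs, ys, delta=0.0):
    """t = (xbar - ybar - delta) / (S_p * sqrt(1/m + 1/n)) as in
    Definition 10.3.2; under H0 (and sigma_1 = sigma_2) it follows
    a t distribution with m + n - 2 degrees of freedom."""
    m, n = len(xs), len(ys)
    xbar, ybar = sum(xs) / m, sum(ys) / n
    s1_sq = sum((x - xbar) ** 2 for x in xs) / (m - 1)
    s2_sq = sum((y - ybar) ** 2 for y in ys) / (n - 1)
    sp = math.sqrt(((m - 1) * s1_sq + (n - 1) * s2_sq) / (m + n - 2))
    return (xbar - ybar - delta) / (sp * math.sqrt(1 / m + 1 / n))

t = pooled_two_sample_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

For these toy samples S_p = 1 and t = −√1.5 ≈ −1.22 on 4 degrees of freedom. As noted above, the equal-variance assumption behind S_p should itself be checked with an F-test.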
decisionitheoretic framework Here our action space A consists of two options a0 fail to reject H0 and a1 reject H0 Usually7 we assume no loss for a correct decision Thus7 our loss function looks like 0 if 0 6 90 1097 Lo 10 if 0 6 91 50 if 0 6 90 1097 al 0 if 0 6 91 We consider the following special cases 071 loss 10 130 17 ie7 all errors are equally bad Generalized 071 loss 10 011 30 CI ie7 all Type 1 errors are equally bad and all Type ll errors are equally bad and Type 1 errors are worse than Type ll errors or Vice versa Then7 the risk function can be written as WWW W7 aoPedX a0 L0a1P9dX a1 7 a0PedX a0 if 0 e 91 T 50Pedx m if 0 e 90 The minimax rule minimizes Igwwmewo3mxwmewo3m Theorem 1041 The minimax rule 1 for testing H0 00ovs H1 001 under the generalized 071 loss function rejects H0 if 1 91 g mmzk 7 129 Lecture 29 Mo 032601 where k is chosen such that R017di 3007 1 ltgt 0111391 WK a0 CIPeodX a1 a 13 ltkgt cfs 2k Proof Let 1 be any other rule o If R001 lt R001 then R001 R011 lt maXR001R011 So 1 is not minimax o If R001 Z R001 ie CIP90 1 11 R00 1 Z R00 1 CIP90 1 11 then Preject H0 l H0 true P90 1 11 2 P901 11 By the NP Lemma the rule 1 is MP of its size Thus P91 1 11 Z P911 11 ltgt P91 1 10 P911 10 gt R011 g R911 gt maXR001R011 R011 g R911 g maXR001R011 V1 gt 1 is minimaX l Example 1042 Let X1Xn beiid N01 1 Let H0 uuo vs H1 1111 gt uo As we have seen before 1691 1090 Therefore we choose k2 such that 2 k1 is equivalent to E 2 k2 Can1 lt k2 CIPMO 2 k2 ltgt 011 P5k211 CIG WWUQ MOWL where ltIgtz PZ z for Z N N0 1 Given 010111011 and n we can solve numerically for k2 using Normal tables I 130 Note Now suppose we have a prior distribution 7r0 on 9 Then the Bayes risk of a decision rule 1 under the loss function introduced before is R7r7d EWR07d R0d7r0d0 639 O b07r0P9dX a1d0 1a07r0P9dX a0d0 if 7139 is a pdf The Bayes risk for a pmf 7139 looks similar see Rohatgi7 page 461 l Theorem 1043 The Bayes rule for testing H0 t9 00 vs H1 t9 01 under the prior 7r00 7r0 and 7r01 
7r1 17 we and the generalized 071 loss function is to reject H0 if 1 91 g CIWO gt Rug 7 cum Proof We wish to minimize R7r7 d We know that Rltmdgt Def888 EWltRlt07d Def7883 E E9L07di N98 ye Llt0dltggtgthlt0iggtd0gtdg V W marginal posterior Defjl WWW M l X g a E E9L07 1 l X Therefore7 it is suf cient to minimize E9Lt97 l The a posteriori distribution of 9 is T ZWWWWE 9 7r0f90lt gt 0 7rof90 7r1f91 7 0 Wifel g 0 01 Wofeo W1f91 7 131 Therefore7 CIhlt00 l g if 0 00dg a1 Ch01 l g if 0 01dg a0 E9L07dX l X g 0 if 0 00dg a0 0 if 0 01dg al This will be minimized if we reject H07 ie7 Kg a1 when 011090 l g 0111091 l g gt cmofedg S CII IF1f91 fMg gt CIWO gt fee g 7 Cum Note For minimax rules and Bayes rules7 the signi cance level 04 is no longer predetermined Example 1044 Let X17Xn be iid NW 1 Let H0 In no vs H1 1 p1 gt no Let C 011 By Theorem 10437 the Bayes rule 1 rejects H0 if 1 91 g gt 7r0 hug 17r0 7 2 7 2 gt exp i E b M Eml2 MO gt 2 leW0 71 27 2 7139 gt expltlt 1 02iwgt 2 1707 met 7r0 7 gt gt 1 029514 2 7 111177r0 1 1111 M0M1 gt 770 7 gt 712 7 n H1 IUIO 2 If 7r0 then we reject H0 if 2 w 132 Note We can generalize Theorem 1043 to the case of classifying among k options 01 70k If we use the 071 loss function 1 ifd 0 ij i L0i d 0 i ifdlt gt then the Bayes rule is to pick 0139 if mfmg Z ij9j Vi 7 i Example 1045 Let X17Xn be lid NL71 Let 1 lt lug lt lag and let 71391 71392 71393 Choosepui if 7 2 7 2 mexp ijexp 7 j7g 7j17273 Similar to Example 10447 these conditions can be transformed as follows 7 39 7 39 39 39 mwrw 2 J 7 17273 In our particular example7 we get the following decision rules 1 Choose m ifs MW and E My ii Choose W ifs 2 1IQ and E g 113 iii Choose pg ifs 2 1I and E 2 1I Note that in and iii the condition in parentheses automatically holds when the other condition holds If M 07 p2 27 and p3 4 we have the decision rules i Choose M if g 1 ii Choose M if 1 g E g 3 iii Choose p3 if 2 3 We do not have to worry how to handle the boundary since the probability that the rV 
will realize on any of the two boundary points is 0.

Lecture 30, We 03/28/01

11 Confidence Estimation

11.1 Fundamental Notions

Based on Casella/Berger Sections 9.1 & 9.3.2

Let X be a rv and a, b be fixed positive numbers with a < b. Then

P(a < X < b) = P(a < X and X < b) = P(a < X and (a/b)X < a) = P((a/b)X < a < X).

The interval ((a/b)X, X) is an example of a random interval: it contains the value a with a certain fixed probability. For example, if X ~ U(0, 1), a = 1/2, and b = 1, then the interval (X/2, X) contains 1/2 with probability 1/2.

Definition 11.1.1: Let {P_θ : θ ∈ Θ ⊆ R^k} be a set of probability distributions of a rv X. A family of subsets S(x) of Θ, where S(x) depends on x but not on θ, is called a family of random sets. In particular, if θ ∈ Θ ⊆ R and S(x) is an interval (θ_L(x), θ_U(x)), where θ_L(x) and θ_U(x) depend on x but not on θ, we call S(X) a random interval, with θ_L(X) and θ_U(X) as lower and upper bounds, respectively. θ_L(X) may be −∞ and θ_U(X) may be +∞.

Note: Frequently in inference we are not interested in estimating a parameter or testing a hypothesis about it. Instead, we are interested in establishing a lower or upper bound (or both) for one or multiple parameters.

Definition 11.1.2: A family of subsets S(x) of Θ ⊆ R^k is called a family of confidence sets at confidence level 1 − α if

P_θ(S(X) ∋ θ) ≥ 1 − α  ∀θ ∈ Θ,

where 0 < α < 1 is usually small. The quantity

inf_{θ ∈ Θ} P_θ(S(X) ∋ θ) = 1 − α

is called the confidence coefficient, i.e., the smallest probability of true coverage is 1 − α.

Definition 11.1.3: For k = 1, we use the following names for some of the confidence sets defined in Definition 11.1.2:

(i) If S(x) = (θ_L(x), ∞), then θ_L(x) is called a level 1 − α lower confidence bound.

(ii) If S(x) = (−∞, θ_U(x)), then θ_U(x) is called a level 1 − α upper confidence bound.

(iii) S(x) = (θ_L(x), θ_U(x)) is called a level 1 − α confidence interval (CI).

Definition 11.1.4: A family of 1 − α level confidence sets {S(x)} is called uniformly most accurate (UMA) if

P_θ(S(X) ∋ θ′) ≤ P_θ(S′(X) ∋ θ′)  ∀θ, θ′ ∈ Θ with θ′ ≠ θ,

for any other 1 − α level family of confidence sets {S′(x)}, i.e., S(x) minimizes the probability of false (or incorrect) coverage.

Theorem 11.1.5: Let X1, ..., Xn ~ F_θ, θ ∈ Θ, where Θ is an interval on R. Let
TX0 be a function on R x 9 such that for each 0 TX0 is a statistic7 and as a function of 0 T is strictly 1 either incieasing or decreasing in 9 at every value of g E B Let A Q R be the range of T and let the equation Tg0 be solvable for 9 for every E A and every g E B If the distribution of T 7 0 is independent of 0 then we can construct a con dence interval for 9 at any level Proof Choose 04 such that 0 lt 04 lt 1 Then we can choose 1a lt 2a which may not necessarily be unique such that P91a lt TL0 lt 2a217 a v0 Since the distribution of T 7 0 is independent of 0 1a and 2a also do not depend on 0 lfTX7 0 is increasing in 0 solve the equations 1a T 7 0 for g and 2a TX7 0 135 for lf TX0 is decreasing in 0 solve the equations 1a TX0 for Hi and 2a mg 0 for g In either case7 it holds that P90X lt 0 lt my 2 17 a W Note i Solvability is guaranteed if T is continuous and strictly monotone as a function of 0 ii If T is not monotone7 we can still use this Theorem to get con dence sets that may not be con dence intervals Example 1116 Let X17 Xn N Nuz727 where u and 02 gt 0 are both unknown We seek a 1 7 04 level con dence interval for u Note that by Corollary 724 X 7 Id N tnil S and TX7 u is independent of u and monotone and decreasing in u TL M We choose 1a and 2a such that P1a lt TXu lt 2a 17 Oz and solve for In which yields 7 lt 2a Thus7 is a 1 7 04 level CI for u We commonly choose 2a 71Oz tn1a2 l Example 1117 Let X17 7Xn U00 We know that maXXl Mazn is the MLE for 9 and suf cient for 0 The pdf of Mazn is given by 7 nynill fny 7 0n 09y Then the W Tn Mquot has the pdf Mt ntn71101t7 which is independent of 0 Tn is monotone and decreasing in 0 We now have to nd numbers 1a and 2a such that P1a lt Tn lt 2Ot 17 Oz 2 gt n tnildt 1704 1 gt A 7 A 17 04 If we choose A2 1 and 1 elln then Mamn7a 1nMazn is a 17 a level CI for 0 This holds since M an 1704 Pa1nlt lt1 9 M an Poz 1 gt gt1 Pofl Mamn gt 0 gt Max 137 112 Shortest7Length Con dence Intervals Based on CasellaBerger Section 922 
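The interval (Max_n, α^(−1/n) Max_n) of Example 11.1.7 has exact coverage 1 − α, which a quick simulation confirms (plain Python; θ, n, and the trial count are arbitrary illustrative choices):

```python
import random

def uniform_theta_ci(xs, alpha):
    """Example 11.1.7: for X_i iid U(0, theta), the interval
    (max_n, alpha**(-1/n) * max_n) covers theta with probability 1 - alpha."""
    mx = max(xs)
    return mx, mx * alpha ** (-1.0 / len(xs))

random.seed(1)
theta, n, alpha, trials = 5.0, 10, 0.10, 2000
hits = 0
for _ in range(trials):
    sample = [random.uniform(0, theta) for _ in range(n)]
    lo, hi = uniform_theta_ci(sample, alpha)
    hits += (lo < theta < hi)
coverage = hits / trials   # close to 1 - alpha = 0.90
```

The lower endpoint Max_n is below θ with probability 1, so all the coverage probability is controlled by the event Max_n > θ α^(1/n), which has probability exactly 1 − α.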
amp 931 In practice we usually want not only an interval with coverage probability 1 7 04 for 0 but if possible the shortest most precise such interval De nition 1121 A rv TX 0 whose distribution is independent of 9 is called a pivot l Note The methods we will discuss here can provide the shortest interval based on a given pivot They will not guarantee that there is no other pivot with a shorter minimal interval l Example 1122 Let X1 Xn N Nu02 where 02 gt 0 is known The obvious pivot for u is Y M T X MN 0 1 0i U 7 Suppose that a b is an interval such that Pa lt Z lt b 1 7 04 where Z N N0 1 A 1 7 04 level Cl based on this pivot is found by Y7 7 a i a 17 P 7 b P X7137 X7 7 Oz ltaltU ltgt lt ltplt 1 The length of the interval is L b 7 00 To minimize L we must choose a and I such that b 7 a is minimal while 17 x2 b 7 ltIgta 5 7dm 17 04 where ltIgtz PZ To nd a minimum we can differentiate these expressions with respect to a However I is d 7a dd However this not a constant but is an implicit function of a Formally we could write is usually shortened to Here we get Lecture 31 Fr 033001 dLiadb 7aa 1 da 7 da 7 39 The minimum occurs when gta b which happens when a b or a 7b If we select a b7 then ltIgtb 7 ltIgta ltIgta 7 ltIgta 0 17 a Thus7 we must have that b 7a 2042 Thus7 the shortest Cl based on TM is i 039 i 039 X 7 ZaZ vX ZaZ De nition 1123 A pdf x is unimodal iff there exists a x such that x is nondecreasing for x g x and x is nonincreasing for x 2 x l Theorem 1124 Let x be a unimodal pdf If the interval 11 satis es 1 Ab xdx1ioz ii Ha b gt 07 and iii a g x b7 where x is a mode of x7 then the interval 071 is the shortest of all intervals which satisfy condition Proof 7 Let 01 7 b be any interval with b ia lt bia We will show that this implies xdx lt 17047 ie7 a contradiction a We assume that a g a The case a lt a is similar 0 Suppose that b g a Then a g b g a g x 139 Theorem 1124a It follows f7 fmdg S WW 1 l90 S b S 90 5 f9 S NU fab7a lb a z b fa lt fab7a lb7altb7aand 
fagt0 bfzdz magma foragng 17a by lt1 0 Suppose b gt a We can immediately exclude that b gt I Since then b 7 a gt b 7 a7 ie7 b 7 a wouldn t be of shorter length than I 7 a Thus7 we have to consider the case that a g a lt b lt 1 Theorem 1124b 140 It holds that bl b a b A a A 7 b Note that Ai z m S faa 7 a and Ab mdm Z fbb 7 13 Therefore7 we get a b A wkb mm fltagtlta7a gt7fltbgtltb7b gt 7 NOW 7 d 7 b 7 b N 7 M fab 1 i b i 1 lt 0 Thus7 I b fxdzlt17oz Note Example 1122 is a special case of Theorem 1124 However7 Theorem 1124 is not immedi ately applicable in the following example since the length of that interval is proportional to 7andnottob7a l Example 1125 Let X17 7Xn N Nuz727 where u is known The obvious pivot for 02 is T02XMNX V 039 So PltaltltXi U2ltbgt 17a 039 ltgt PltZltXibi ylt02ltEltXiaTMgt2gt 17a We wish to minimize 1 1 L a g 09 NZ 7 such that fntdt 1 7 a where fnt is the pdf of a xi distribution 1 We get db fnb 7 fna 0 and dL 1 1 db 1 1 f n a E lt WZ 07 g u2 141 We obtain a minimum if azfna bzfnb Note that in practice equal tails foZ and xiiliaZ are used which do not result in shortesti length 01 s The reason for this selection is simple When these tests were developed com puters did not exist that could solve these equations numerically People in general had to rely on tabulated values Manually solving the equation above for each case obviously wasn t a feasible solution I Example 1126 Let X1Xn N U00 Let Mazn maxXi X Since Tn Mg has pdf ntn 101t which does not depend on 0 Tn can be selected as a our pivot The den sity of Tn is strictly increasing for n 2 2 so we cannot nd constants a and b as in Example 1125 IfPaltTnltb 1704 then HMquot lt0lt 1704 We wish to minimize 1 1 LM 777 axna b 17 such that ntn ldt b 7 a 17 a a We get da da mil 71 71 71b 77m 0 E anil and dL 1 da 1 bnil 1 an1 7 bn1 E Mamn7 Mazn7w MamnW lt 0 for 0 g a lt b g 1 Thus L does not have a local minimum However since lt O L is strictly decreasing as a function of b It is minimized when I 1 ie 
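Theorem 11.2.4's condition f(a) = f(b) can be checked numerically for the N(0, 1) pivot of Example 11.2.2: searching over left endpoints a, with b forced by the coverage constraint Φ(b) − Φ(a) = 1 − α, the minimizing interval comes out symmetric. A sketch (plain Python; the grid step and bisection depth are arbitrary choices):

```python
import math

def Phi(x):
    """Standard normal cdf, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p):
    """Invert Phi by bisection (adequate for an illustration)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def shortest_normal_interval(conf=0.95, step=0.001):
    """Grid search over a; b is determined by Phi(b) = Phi(a) + conf.
    Theorem 11.2.4 predicts f(a) = f(b), i.e. a = -b for N(0,1)."""
    a_max = Phi_inv(1.0 - conf)    # need Phi(a) < 1 - conf
    best = None
    a = -4.0
    while a < a_max - 1e-9:
        b = Phi_inv(Phi(a) + conf)
        if best is None or b - a < best[1] - best[0]:
            best = (a, b)
        a += step
    return best

a, b = shortest_normal_interval()   # approximately (-1.96, 1.96)
```

The search recovers a ≈ −b ≈ −1.96, the equal-tails interval, in agreement with Example 11.2.2.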
when I is as large as possible The corresponding a is selected as a 041 The shortest 1 7 04 level Cl based on Tn is Maznof1 Mazn l 142 Lecture 32 Mo 040201 113 Con dence Intervals and Hypothesis Tests Based on CasellaBerger Section 92 Example 1131 Let X17 Xn N Nuz727 where 02 gt 0 is known In Example 1122 we have shown that the interval Y 7 Za2i7Y 2042 w w is a 17 04 level CI for 0 Suppose we de ne a test 1 of H0 u uo vs H1 u 7 uo that rejects H0 iff uo does not fall in this interval Then7 PMOType I error PMOReject H0 when H0 is true PMO ltY72a27Y2a2 M0gt 17 PM if Za27Y Za2 9 0 i 039 i 039 17 PMO ltXiza2 uo and uo Xza2 gt i 039 i 0 PM X ZaZ Z M0 0F 0 Z XZuz2 gt i 039 i 0 PM X MO Z Zea2 0r X MO S ZaZW Y 0 X 0 PMOlt a ZZaZOF I Sized2 W W PMO by Z Zia2 047 W ie7 1 has size 04 So a test based on the shortest 1 7 04 level Cl obtained in Example 1122 is equivalent to the UMPU test 111 of size 04 introduced in De nition 1031 when 02 is known Conversely7 if gtg7 MO is a family of size 04 tests of H0 u IMO the set 00 Mg uo fails to reject H0 is a level 1 7 04 con dence set for ILLQ l 143 Theorem 1132 Denote H0090 for H0 t9 00 and H1090 for the alternative Let 14090 00 E 9 denote the acceptance region of a levelioz test of H0090 For each possible observation g de ne 535 a g e A00 e 9 Then 5g is a family of 1 7 04 level con dence sets for 0 If moreover 14090 is UMP for aH000H100 then 5g minimizes P9SX 9 0 V0 6 H107 among all 17 04 level families of con dence sets ie 5g is UMA Proof It holds that SQ 9 t9 iffg 6 140 Therefore 13455 9 a Peg e 14019 a Let 5 be any other family of 1704 level con dence sets De ne A0 g 5 9 0 Then P4X 6 A0 P9SX 9 9217 04 Since 14090 is UMP it holds that FAX E A00 Z P4X 6 14090 V0 6 H1090 This implies that 139991 9 00 2 PM 6 A00 134mg 9 00 v0 6 H1090 Example 1133 Let X be a rv that belongs to a onekparameter exponential family with pdf Mm expQ0T Sh 130 where 6209 is nonidecreasing We consider a test H0 t9 00 vs H1 9 lt 00 By Theorem 933 the family f9 
has a MLR in TX It follows by the Note after Theorem 935 that the acceptance region of a UMP size 04 test of H0 has the form 14090 z Tm gt C00 and this test has a noniincreasing power function Now consider a similar test H6 t9 01 vs Hi 9 lt 01 The acceptance region of a UMP size 04 test of H6 also has the form A01 z Tm gt C01 Thus for 01 2 00 P90ltTltXgt C00 a P91ltTltXgt C01 P90ltTltXgt C01 144 since for a UMP test it holds that power 2 size Therefore we can choose 00 as non7 decreasing A level 1 7 04 CI for 0 is then 595 0 i 95 6140 007C 1T7 where c 1Tm Sgp0 00 l Example 1134 Let X N Ezp0 with f9z e glmmdm which belongs to a one7parameter exponential familty Then Q0 7 is non7decreasing and Tm x We want to test H0 0 00 vs H1 0 lt 00 The acceptance region of a UMP size 04 test of H0 has the form A00 where C00 C00 1 7i 7 a f90mdm 7e 90dm17e 90 0 0 t90 C90 1 e 90 17a gt C0000l0g17a m m 2 C00 Thus Therefore the UMA family of 1 7 04 level con dence sets is of the form 590 0 x E A0 0 0 710g 13 07 l 10glt gtl Note Just as we frequently restrict the class of tests when UMP tests don t exist we can make the same sorts of restrictions on 01 s I De nition 1135 A family Sg of con dence sets for parameter 0 is said to be unbiased at level 1 7 04 if P9SX 9 0 Z 1704 and P9SX 9 0 g 1704 V00 E 9 0 7 0 If Sg is unbiased and minimizes P9SX 9 0 among all unbiased 01s at level 1 7 04 it is called uniformly most accurate unbiased UMAU l 145 Theorem 1136 Let 1400 be the acceptance region of a UMPU size 04 test of H0 0 00 vs H1 0 7 00 for all 00 Then 5g 0 g 6 140 is a UMAU family of con dence sets at level 17 a Proof Since 140 is unbiased it holds that P9SX 9 0 P4X 6 140 17 01 Thus S is unbiased Let 5 be any other unbiased family of level 1 7 04 con dence sets where 140 g 5g 9 0 It holds that P4X 6 140 P9SX 9 0 g 17 01 Therefore 14 0 is the acceptance region of an unbiased size 04 test Thus P9SX 9 0 P4X 6 140 Z PAX 6 140 P9SX 9 0 5 holds since 140 is the acceptance region of a UMPU test 
I Th 11 3 7 Lecture 33 We 040401 Let 9 be an interval on R and f9 be the pdf of X Let 5X be a family of 1 7 04 level Cl s where 5X 0X70X7 0 and 0 increasing functions of X and 0X 7 0X is a nite rv Then it holds that MM 7 M 7 91 7 Qltzgtgthlt2gtd 7 max 9 0 10 v0 6 9 99M Proof 7 7 9 It holds that 0 7 0 10 Thus for all 0 E 9 0 E99XQX A cw mvemdz 7 1 171 fi 0 0 R Q 10 f9 d d0 17 P9X Q 10 7 710 l W R RPQSx90d0 Al msg 9 0 de I Note Theorem 1137 says that the expected length of the Cl is the probability that 5X includes the false 0 averaged over all false values of 0 l Corollary 1138 lf 5X is UMAU7 then E9 X 7 is minimized among all unbiased families of 01 s Proof In Theorem 1137 we have shown that E g X 7 0 X elt i 44gt M Since a UMAU Cl minimizes this probability for all 0 the entire integral is minimized l 9 max 9 0 10 Example 1139 Let X17 7Xn N Nuz727 where 02 gt 0 is known By Example 11227 Y7 za2Y 2a2 is the shortest 17 04 level CI for u By Example 9437 the equivalent test is UMPU So by Theorem 1136 this interval is UMAU and by Corollary 1138 it has shortest expected length as well I Example 11310 Let X17 7Xn N Nuz727 where u and 02 gt 0 are both unknown Notethat 2 71715 Thus7 7152 7152 7152 P02 A1ltMltA2 1704 ltgt P02 ulta2ltu 1704 02 2 1 We now de ne P39y as 7152 7152 T T P02ltlt02ltgtPltflt YltlgtP1 YltTcrlt2 YP39Y7 2 1 2 where 39y If our test is unbiased7 then it follows from De nition 1135 that 131 17ozandP39ylt 17oz V39yy l This implies that we can nd A1 2 such that P1 1 7 Oz and dP y 7 d A27 71 3AW fToWWV dv 2 mom A2 7 mm A1 0 171 2fTo2 1fTo1 0 71 7 where fTO is the pdf that relates to TU7 ie7 a the pdf of a xiil distribution follows from Leibniz s Rule Theorem 324 We can solve for A1 2 numerically Then7 n 71S2 n 71S2 A2 7 1 is an unbiased 1 7 04 level CI for 02 Rohatgi7 Theorem 4b7 page 42874297 states that the related test is UMPU Therefore7 by Theorem 1136 and Corollary 11387 our Cl is UMAU with shortest expected length among all unbiased intervals Note 
that this Cl is different from the equal7tail Cl based on De nition 10217 1117 and from the shortest7length Cl obtained in Example 1125 I 148 114 Bayes Con dence Intervals Based on CasellaBerger Section 924 De nition 1141 Given a posterior distribution h0 g a level 1 7 Oz credible set Bayesian con dence set is any set A such that P0 AlgAh0lgd017a Note If A is an interval7 we speak of a Bayesian con dence interval l Example 1142 Let X N Binnp and 7rp N U01 In Example 88117 we have shown that n7m m 17 MW 7 1p p pm1p mdp 0 396 1771 7 m 1 1pm1p m1o1p l l P im1o1P 0119 N Betam 1771 7 m 17 where Ba7 b W is the beta function evaluated for a and b and Betaz 171 7 z 1 represents a Beta distribution with parameters z 1 and n 7 z 1 Using the observed value for z and tables for incomplete beta integrals or a numerical ap proach7 we can nd 1 and 2 such that PPMh lt p lt A2 17 04 So 17 A2 is a credible interval for p l Note i The de nitions and interpretations of credible intervals and con dence intervals are quite different Therefore7 very different intervals may result ii We can often use Theorem 1124 to nd the shortest credible interval if the precondi tions hold 149 Example 1143 Let 17 7Xn be iid Nu1 and 7ru N N01 We want to construct a Bayesian level 17aleorlu By De nition 8877 the posterior distribution of 1 given g is where 9g 7r f l M 710 l g gm Hz Md 1 more 1 my co 1 1 1 1 L E 8Xplt 2W 1 5 2 d 1 1n 2 lt n1lt 2 n ex 77 z ex 7 72 d 2 plt 2121 p 2 n1 1 1n 2 0 n1 n1 2 n1 n1 2 7 7 d 2wexplt 2211gtLmexplt 2 lt n1 1 2 n1 ltgtltgt 1 1 1 lt n gt2 d iexp quot7 M77 M 00 27rnrl 21n1 n1 1 since pdf of a Nanlnrl distribution 1 n1 lt 1 n 2 71212 gt 2 exp 9039 7 2702 2 l 2n1 150 Lecture 34 Mo 040901 Therefore7 hwg i n1 1 n 2 7 n 2 1 2 1 n 2 71212 r2 explt mi m 5 mger n1 n1 2 2 7 2 71212 ex 77 7 77 WW p 2 quot n1 m n12n1 1 lt 1 1 lt 71 gt2 iexp 77f M77 7 QWTL 2 TH n1 n 1 h N 7 mg WNW Therefore7 a Bayesian level 1 7 04 CI for u is lt n if 2042 71 2042 gt n1 xn17n1 7l1 39 The shortest 
classical level 1 − α CI for μ (treating μ as fixed) is

( X̄ − z_{α/2}/√n, X̄ + z_{α/2}/√n ),

as seen in Example 11.2.2. Thus, the Bayesian CI is slightly shorter than the classical CI, since we use additional information in constructing the Bayesian CI. ∎

12 Nonparametric Inference

12.1 Nonparametric Estimation

Definition 12.1.1:
A statistical method which does not rely on assumptions about the distributional form of a rv (except, perhaps, that it is absolutely continuous or purely discrete) is called a nonparametric or distribution-free method. ∎

Note:
Unless otherwise specified, we make the following assumptions for the remainder of this chapter: Let X₁, …, Xₙ be iid ~ F, where F is unknown. Let 𝒫 be the class of all possible distributions of X. ∎

Definition 12.1.2:
A statistic T is sufficient for a family of distributions 𝒫 if the conditional distribution of X given T = t is the same for all F ∈ 𝒫. ∎

Example 12.1.3:
Let X₁, …, Xₙ be absolutely continuous. Let T = (X₍₁₎, …, X₍ₙ₎) be the order statistics. It holds that P(X = x | T = t) = 1/n!, so T is sufficient for the family of absolutely continuous distributions on ℝ. ∎

Definition 12.1.4:
A family of distributions 𝒫 is complete if the only unbiased estimate of 0 is 0 itself, i.e., E_F(h(X)) = 0 for all F ∈ 𝒫 implies h(x) = 0 for all x. ∎

Definition 12.1.5:
A statistic T(X) is complete in relation to 𝒫 if the class of induced distributions of T is complete. ∎

Theorem 12.1.6:
The order statistic (X₍₁₎, …, X₍ₙ₎) is a complete sufficient statistic, provided that X₁, …, Xₙ are of either purely discrete or purely continuous type. ∎

Definition 12.1.7:
A parameter g(F) is called estimable if it has an unbiased estimate, i.e., if there exists a T(X) such that E_F(T(X)) = g(F) for all F ∈ 𝒫. ∎

Example 12.1.8:
Let 𝒫 be the class of distributions for which second moments exist. Then X̄ is unbiased for μ(F). Thus, μ(F) is estimable. ∎

Definition 12.1.9:
The degree m of an estimable parameter g(F) is the smallest sample size for which an unbiased estimate exists for all F ∈ 𝒫. An unbiased estimate based on a sample of size m is called a kernel. ∎

Lemma 12.1.10:
There exists a symmetric kernel for every
estimable parameter.

Proof:
Let T(X₁, …, X_m) be a kernel of g(F). Define

T_s(X₁, …, X_m) = (1/m!) Σ T(X_{i₁}, …, X_{i_m}),

where the summation is over all m! permutations (i₁, …, i_m) of (1, …, m). Clearly, T_s is symmetric and E(T_s) = g(F). ∎

Example 12.1.11:
(i) E(X₁) = μ(F), so μ(F) has degree 1 with kernel X₁.
(ii) E(I_{(c,∞)}(X₁)) = P_F(X > c), where c is a known constant. So g(F) = P_F(X > c) has degree 1 with kernel I_{(c,∞)}(X₁).
(iii) There exists no T(X₁) such that E(T(X₁)) = σ²(F) = ∫ (x − μ(F))² dF(x) for all F. But E(T(X₁, X₂)) = E(X₁² − X₁X₂) = σ²(F). So σ²(F) has degree 2 with kernel X₁² − X₁X₂. Note that X₂² − X₁X₂ is another kernel.
(iv) A symmetric kernel for σ²(F) is

T_s(X₁, X₂) = ½ ((X₁² − X₁X₂) + (X₂² − X₁X₂)) = ½ (X₁ − X₂)². ∎

Definition 12.1.12:
Let g(F) be an estimable parameter of degree m. Let X₁, …, Xₙ be a sample of size n, n ≥ m. Given a kernel T(X₁, …, X_m) of g(F), we define a U-statistic by

U(X₁, …, Xₙ) = (1 / (n choose m)) Σ_c T_s(X_{i₁}, …, X_{i_m}),

where T_s is defined as in Lemma 12.1.10 and the summation Σ_c is over all combinations of m integers (i₁, …, i_m) from (1, …, n). U(X₁, …, Xₙ) is symmetric in the Xᵢ's, and E_F(U(X)) = g(F) for all F. ∎

Example 12.1.13:
For estimating μ(F) (degree m = 1):
Symmetric kernel: T_s(X_{i₁}) = X_{i₁}. U-statistic: U(X₁, …, Xₙ) = (1/n) Σᵢ₌₁ⁿ Xᵢ = X̄.

For estimating σ²(F) (degree m = 2):
Symmetric kernel: T_s(X_{i₁}, X_{i₂}) = ½ (X_{i₁} − X_{i₂})², i₁, i₂ = 1, …, n, i₁ ≠ i₂. U-statistic:

U(X₁, …, Xₙ) = (1 / (n choose 2)) Σ_{i₁<i₂} ½ (X_{i₁} − X_{i₂})²
= (1 / (n(n−1))) Σ_{i₁≠i₂} ½ (X_{i₁} − X_{i₂})²

Lecture 35: We 04/11/01

= (1 / (n(n−1))) · ½ Σ_{i₁≠i₂} ( X_{i₁}² − 2 X_{i₁} X_{i₂} + X_{i₂}² )
= (1 / (n(n−1))) · ½ ( (n−1) Σᵢ₌₁ⁿ Xᵢ² − 2 ( (Σᵢ₌₁ⁿ Xᵢ)² − Σᵢ₌₁ⁿ Xᵢ² ) + (n−1) Σᵢ₌₁ⁿ Xᵢ² )
= (1 / (n(n−1))) ( n Σᵢ₌₁ⁿ Xᵢ² − (Σᵢ₌₁ⁿ Xᵢ)² )
= (1 / (n−1)) ( Σᵢ₌₁ⁿ Xᵢ² − n X̄² )
= (1 / (n−1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²
= S². ∎

Theorem 12.1.14:
Let 𝒫 be the class of all absolutely continuous or all purely discrete distribution functions on ℝ. Any estimable function g(F), F ∈ 𝒫, has a unique estimate that is unbiased and symmetric in the observations and has uniformly minimum variance among all unbiased estimates.

Proof:
Let X₁, …, Xₙ be iid ~ F ∈ 𝒫, with T(X₁, …, Xₙ) an unbiased estimate of g(F). We define

Tᵢ(X₁, …, Xₙ) = T(X_{i₁}, X_{i₂}, …, X_{iₙ}), i = 1, …, n!,

over all n! possible permutations (i₁, …, iₙ) of (1, …, n). Let T̄ = (1/n!) Σᵢ₌₁^{n!} Tᵢ. Then, since each Tᵢ has the same distribution as T and by the Cauchy–Schwarz inequality,

E(T̄²) = E( ( (1/n!) Σᵢ₌₁^{n!} Tᵢ )² )
= (1/(n!)²) Σᵢ₌₁^{n!} Σⱼ₌₁^{n!} E(Tᵢ Tⱼ)
≤ (1/(n!)²) Σᵢ₌₁^{n!} Σⱼ₌₁^{n!} √( E(Tᵢ²) E(Tⱼ²) )
= (1/(n!)²) Σᵢ₌₁^{n!} Σⱼ₌₁^{n!} E(T²)
= E(T²).

Equality holds iff
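As a quick numerical aside (not part of the original notes), the identity U = S² worked out in Example 12.1.13 can be checked by brute force: averaging the symmetric kernel ½(Xᵢ − Xⱼ)² over all pairs reproduces the sample variance. A minimal Python sketch with an arbitrary illustrative sample:

```python
from itertools import combinations

def u_stat_variance(x):
    """U-statistic for sigma^2(F): the symmetric kernel
    T_s(x1, x2) = (x1 - x2)**2 / 2 averaged over all C(n, 2) pairs."""
    pairs = list(combinations(x, 2))
    return sum((a - b) ** 2 / 2 for a, b in pairs) / len(pairs)

def sample_variance(x):
    """S^2 = (1 / (n - 1)) * sum((x_i - xbar)^2)."""
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / (n - 1)

data = [2.0, 5.0, 1.5, 3.0, 7.5]   # arbitrary sample, for illustration only
print(u_stat_variance(data))       # agrees with sample_variance(data)
print(sample_variance(data))
```

The agreement is exact (up to floating-point rounding) for any sample, which is just Example 12.1.13 restated numerically.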
Tl Tj Vij 1 nl gt T is symmetric in X1 X and T T gt by Rohatgi Problem 4 page 538 T is a function of order statistics gt by Rohatgi Theorem 1 page 535 T is a complete suf cient statistic gt by Note following Theorem 8412 T is UMVUE l Corollary 12115 lf TX1 Xn is unbiased for gFF E 73 the corresponding Uistatistic is an essentially unique UMVUE l 156 De nition 12116 Suppose we have independent samples X17 Xm 5 F E 737 11 Yn E G E 73 C may or may not equal Let gF7 G be an estimable function with unbiased estimator TX17 Xk Y1 7Y1 De ne 1 TSX1Xk nun1 WZZTXi1Xlk EMMA PX PY Where PX and Py are permutations of X and Y and 1 UX7XWZZT5Xi17H397Xik7 YJ3917 397YJ39L k 1 OX CY Where CX and Cy are combinations of X and Y U is a called a generalized Uistatistic l Example 12117 Let X17 Xm and Y17 71 be independent random samples from F and G7 respectively7 with F7 G E 73 We wish to estimate 9F7G PFGX Let us de ne 17 Xi S Yj Zij 07 Xi gt 2 for each pair Xi YjJ 127m7j 17 7n m V L Then ZZlj is the number of X s Yj and Z Zij is the number of Y s gt Xi i1 j1 EltIltXi 19gt M G PFaltX Y and degrees k and l are 17 so we use 1 UX7X mgtngtZZT5Xi17H397Xik73917quot397l 1 1 OX CY m71ln71l 1 TZZWZZNXWHWXW EMMA1 39 39 OX CY quotPXPY This Mann7Whitney estimator or Wilcoxin 2Sample estimator is unbiased and symmetric in the X s and Y s It follows by Corollary 12115 that it has minimum variance 157 122 SingleSample Hypothesis Tests Let X17 Xn be a sample from a distribution F The problem of t is to test the hypoth esis that the sample X17 Xn is from some speci ed distribution against the alternative that it is from some other distribution7 ie7 H0 F F0 vs H1 7 F0z for some m De nition 1221 Let X17 7Xn 1131 F7 and let the corresponding empirical cdf be 1 71 i1 The statistic Dn sup l Fi 7 Fl 12 is called the twoisided Kolmogorovismirnov statistic K78 statistic The oneisided K78 statistics are D sgplsz e Fzl and D sng 7 FM Theorem 1222 For any continuous distribution F7 the K7S statistics D 7 D 7 D are 
distribution free Proof Let X1Xn be the order statistics of X1Xn7 ie7 X1 X9 XW and de ne X0 foo and XWH 00 Then7 4 Z g for g m lt Xlti1 2 07 771 Therefore7 L 1 Dn 7 Orgy sup l Fml XiSwltXi1 n 2 l i g XSIEWFltQ gtH ltgt i FX 7 02 lt w 71 max 13112 7FXi7 0 5 holds since F is nondecreasing in XlXi1 158 Lecture 36 Fr 041301 Note that D is a function of FX In order to make some inference about Di the dis tribution of FX must be known We know from the Probability Integral Transformation see Rohatgi page 203 Theorem 1 that for a rV X with continuous cdf FX it holds that FXX U0 1 Thus FX is the ith order statistic of a sample from U0 1 independent from F There fore the distribution of D is independent of F Similarly the distribution of iil Dn max 1233 FXi 0 71 is independent of F Since a sup 1 Fax e M 1 max 12th 12 the distribution of Dn is also independent of F l Theorem 1223 If F is continuous then 7 if 1 g 0 1 14 14 142quot PDn l 12 a 2 mWIZ fada if0ltult 2n 1 A 2 71 1 1le 2 3T where nl if0ltu1ltu2ltltu lt1 fgfu177un l n O otherw1se is the joint pdf of an order statistic of a sample of size n from U0 1 l Note As Gibbons amp Chakraborti 1992 page 1087109 point out this result must be interpreted carefully Consider the case n 2 For 0 lt 1 lt 3 it holds that 1 Hi W Oltu1ltu2lt1 4 When 0 lt 1 lt i it automatically holds that 0 lt ul lt 1L2 lt 1 Thus for 0 lt 1 lt i it holds that 1 Hi Vt PD2 g l i 1 3 dug dul 11 Vi 1x glsa 2 gtd 1 vi QIl 21 dul 171 1 2l 214mg 2l 212 For i g 1 lt the region of integration is as follows Note to Theorem 1223 HZ Areai ArEaZ 34rrlu D 347nu immu m Thus for i g 1 lt 37 it holds that 1 Hi Vt PD2 1 1 3 2 dug dul Z V oltu1ltu2lt1 Z V vi 1 giv 1 771 ul 0 371 vi 1 7V 1 7V ltU2iu1gt ltU2igiVgtdU1 Vti v 3 17u1du1 177Vdu1 31 0 4 160 2 V 4 371 2101721 ltmgt4 1 z 1 H4 3 7ug2 fl4 7 2V1 2 V1 2 4 2 1 12 1 1 3l 3 ll3 23 1 7 7 7 7 777 177 7771 777 771 71 4 2 4 32 4 4 32 4 16 1 7 2 7 if 2i V 2 16 1 722 3 L 1 1 8 Combining these results gives 07 lfV 0 2 2142 ifO lt 
1 lt l PD2ltzl 4 4 72u23u7 wiglt2 A 3 17 1f121 Theorem 1224 Let F be a continuous cdf Then it holds V2 2 O 00 i Z 397 91310 Wgt L1ltzgt17 gem lexplte22222gt Theorem 1225 Let F be a continuous cdf Then it holds L71 n 07 1 u PDzPDz 172 1 17 where g is de ned in Theorem 1223 Note It should be obvious that the statistics D and Dg have the same distribution because of symmetry 161 ifz 0 us ug zizlizfgdg1f0ltzlt1 if221 I Lecture 37 Mo 041601 Theorem 1226 Let F be a continuous cdf Then it holds V2 2 O i i i 7 i 2 ggHgOPwn wggngonn wL zFl explt 22 Corollary 1227 Let Vn 4nD 2 Then it holds V L X3 ie7 this transformation of D has an asymptotic xg distribution M Let x 2 0 Then it follows 73320 P04 z 4 73320 PM 422 JgngoP4nD2 422 ggngome 2 Th 26 17 exp7222 422w 17 exp7z2 Thus7 lim PVn z 17 exp7z2 for z 2 0 Note that this is the cdf of a xg distribu Hoe tion I De nition 1228 Let Dma be the smallest value such that PDn gt Dma a Likewise7 let Dag be the smallest value such that PDr gt Darn a The Kolmogorovismirnov test K78 test rejects H0 F0m Vz at level 04 if D gt law It rejects H6 2 F0m Vz at level 04 if Dg gt Dim and it rejects Hg F0z Vz at level 04 if D gt D7120 l Note Rohatgi7 Table 7 7 page 6617 gives values of Dma and D3201 for selected values of Oz and small 71 Theorems 1224 and 1226 allow the approximation of Dma and Dim for large n l 162 Example 1229 Let X1 Xn N C10 We want to test whether H0 X N N0 1 The following data has been observed for mu 71310 7142 70437019026030045064096 197 and 468 The results for the K7S test have been obtained through the following Siplus session ie Dfo 002219616 D170 03025681 and D10 03025681 gt x c 142 043 019 026 030 045 064 096 197 468 gt FX pnormx gt FX 1 007780384 033359782 042465457 060256811 061791142 067364478 7 073891370 083147239 097558081 099999857 gt Dp 11010 FX gt Dp 1 2219616902 1335978901 1246546e01 2025681e01 1179114e01 6 7364478902 3891370e02 3147239902 7558081e02 1434375906 gt Dm FX 0910 gt Dm 1 007780384 023359782 
022465457 030256811 021791142 017364478 7 013891370 013147239 017558081 009999857 gt maxDp 1 002219616 gt maXDm 1 03025681 gt maxmaxDp maxDm 1 03025681 gt gt ksgofx alternative quottwosidedquot mean 0 sd 1 Onesample KolmogorovSmirnov Test Hypothesized distribution normal data x ks 03026 pvalue 02617 alternative hypothesis True cdf is not the normal distn with the specified parameters Using Rohatgi Table 7 page 661 we have to use D10020 0323 for 04 020 Since D10 03026 lt 0323 D10020 it is p gt 020 The K7S test does not reject H0 at level 04 020 As Siplus shows the precise pivalue is even p 02617 l 163 Note Comparison between X2 and K7S goodness of t tests o K7S uses all available data X2 bins the data and loses information o K7S works for all sample sizes X2 requires large sample sizes 0 it is more dif cult to modify K7S for estimated parameters X2 can be easily adapted for estimated parameters 0 K7S is conservative for discrete data7 ie7 it tends to accept Hg for such data 0 the order matters for K78 X2 is better for unordered categorical data 164 123 More on Order Statistics De nition 1231 Let F be a continuous cdf A tolerance interval for F with tolerance coe icient 39y is a random interval such that the probability is 39y that this random interval covers at least a speci ed percentage 100p of the distribution l Theorem 1232 lf order statistics X0 lt Xs are used as the endpoints for a tolerance interval for a continuous cdf F7 it holds that 97771 n I I v Z ltgtp 1p l 2 10 Proof According to De nition 12317 it holds that v PXX PXltXT lt X lt Xe 2 p Since F is continuous7 it holds that FX X N U01 Therefore7 PXX lt X lt Xs PX lt Xs 7 PX Xm FXs FX7quot Ultsgt Ultrgtv where Us and Um are the order statistics of a U07 1 distribution Thus7 V PXXS PXXT lt X lt Xs Z P PUs Um Z P By Therorem 4447 we can determine the joint distribution of order statistics and calculate 39y as 1 y 17 V 7 77quot 771 97771 n79 7 1 7 d d V 0 7 71ls77 71ln7sl y g y a y Rather than solving this 
integral directly7 we make the transformation U Ultsgt Um V UM Then the joint pdf of U and V is 7 uT 1u9 T 11 7 Unis7 if 0 lt u lt 1 lt1 faxMU O7 otherwise 165 and the marginal pdf of U is 1 fUU OfUVu7vdv x 1 WWWOMM U W1 UV d 77 97771 n7sr 1 771 n75 7 1711 tangoJ7 177 dt B39rn7s1 9777 n79 7T 7 7 11 nl n7srls7r71l 7 1 nlt 1gtu9117 u srl01u 01 us7r7l17 un7s39r01 A is based on the transformation 25 11 7 u v 7 u 17 74257 17 v 17 u 7 17 ut 7 U 17 u17t and d1 17 udt It follows that v P UM 7 Um 2 p PU 2 p 1 7 nlt 77 1 gtus77 7117un7s7 du 17 37771 PY lt s 7 r l where Y N Binnp 97771 n I I Z lt4gtp117pn7139 13970 Z B holds due to Rohatgi7 Remark 3 after Theorem 53187 page 2167 since for X N Binnp7 it holds that PX lt k Dm u 7 mwk dz 166 Example 1233 Let s n and r 1 Then7 n72 v 23 31931 29 1 7pquot 7 WWW 7p i lfp 08 and n 107 then V10 170810 i 10 089 02 0624 ie7 X1X10 de nes a 624 tolerance interval for 80 probability lfp 08 and n 207 then m 170820 7 20 0819 02 0931 and ifp 08 and n 307 then 30 170830 7 30 0829 02 0989 Theorem 1234 Let kl be the pth quantile of a continuous cdf F Let X1 Xn be the order statistics of a sample of size n from F Then it holds that 971 n 239 n4 PXr Skp Xs ZltigtP1P 27quot Proof It holds that Pat least r of the Xi s are kp i 929317 FY i39r PXr S 1 Therefore7 PX39r S kp PXs lt 1 i0 I pi1p i Z PX7quot S kp S Xs is s 1 gtpi1p i i39r 167 Lecture 38 We 041801 Corollary 1235 971 i n 7 i XTXs 1s a level ltigtpl1 7p l con dence Interval for kp l Example 1236 Let n 10 We want a 95 con dence interval for the median7 ie7 kp where p 971 n We get the following probabilities pm Z lt 4 i39r Zgtpi1 719W that XTXs covers kov5 p739 3 2 3 4 5 6 7 8 9 10 1 001 005 017 038 062 083 094 099 0998 2 004 016 037 061 082 093 098 099 3 012 032 057 077 089 093 094 4 021 045 066 077 082 083 T 5 025 045 057 061 062 6 7 8 9 021 032 037 038 012 016 017 004 005 001 Only the random intervals X1X9 X1X10 X2X9 and X2X10 give the desired coverage probability Therefore7 we use the one 
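The binomial sums above are straightforward to evaluate numerically. The following Python sketch (not part of the original notes) reproduces the tolerance coefficients of Example 12.3.3 via Theorem 12.3.2 and the coverage probability of the interval (X₍₂₎, X₍₉₎) from Example 12.3.6 via Corollary 12.3.5:

```python
from math import comb

def tolerance_coeff(n, p, r, s):
    """Theorem 12.3.2: gamma = P(Y < s - r) for Y ~ Bin(n, p), the
    tolerance coefficient of (X_(r), X_(s)) for 100p% of F."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(s - r))

def quantile_coverage(n, p, r, s):
    """Corollary 12.3.5: P(X_(r) <= k_p <= X_(s)) =
    sum_{i=r}^{s-1} C(n, i) p^i (1 - p)^(n - i)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(r, s))

# Example 12.3.3: r = 1, s = n, p = 0.8
for n in (10, 20, 30):
    print(n, round(tolerance_coeff(n, 0.8, 1, n), 3))
# n = 10 -> 0.624, n = 20 -> 0.931, n = 30 -> 0.989

# Example 12.3.6: n = 10, median (p = 0.5), interval (X_(2), X_(9))
print(round(quantile_coverage(10, 0.5, 2, 9), 3))
# -> 0.979, i.e., the coverage closest to (and above) 0.95
```

Both functions are the same binomial sum with different summation limits, which mirrors how Theorem 12.3.2 and Corollary 12.3.5 differ only in the range of the index i.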
that comes closest to 957 ie7 X97 Xg7 as the 95 con dence interval for the median l 168 13 Some Results from Sampling 131 Simple Random Samples De nition 1311 Let Q be a population of size N with mean u and variance 02 A sampling method of size n is called simple if the set S of possible samples contains all combinations of n elements of 9 without repetition and the probability for each sample 3 E S to become selected depends only on n ie7 193 jlr V3 6 S Then we call 3 E S a simple random sample SRS of size n l Theorem 1312 Let Q be a population of size N with mean u and variance 02 Let Y Q a R be a measurable function Let m be the total number of times the parameter y occurs in the population and pi E be the relative frequency the parameter y occurs in the population Let 291 7 yn be a SRS of size n with respect to Y7 where PY pi Then the components yi i 17 771 are identically distributed as Y and it holds for i 7 j 71km k 7 I 1 PW yin yj yl nkz where nkl WWW 17 k l NN 7 1 Note i In Sampling7 many authors use capital letters to denote properties of the population and small letters to denote properties of the random sample In particular7 ms and ms are considered as random variables related to the sample They are not seen as speci c realizations ii The following equalities hold in the scenario of Theorem 1312 139 1 1 72 N WZ l l 1 2 2 Zm yi 7 N i 169 Theorem 1313 n Let the same conditions hold as in Theorem 1312 Let y i be the sample mean of a 391 SRS of size n Then it holds 1 i Ey u ie7 the sample mean is unbiased for the population mean u 1 2 Nilgin 11 My 3V N 2 17fN710 7Wheref Proof n Ey 21991 1 since Ere 1 Vi i1 ii Vady i Varyi 2 Z Cowgi7 i1 iltj COUltM7 24739 EW 24739 EyiEy Eyi 24 7 M2 Emmi 21m 271 7 2 W Th 1 g 1 2 W Zykymkm 1 i 1 i M2 7 k7 l k 1 m 23143171an i M2 w k 23km lm i i M2 k z k My 1 NZ 7 Ma2 2 i 2 NN 71 ltNM202M212N 1gt 1 7N71027fori7 j 170 gt Vary 1 TL 7 E Varyi 2 200143 z ZltJ ltn02nn71lt7N17102gtgt ilt111gt02 5130 17N1U2 1 JCNJY1 72 Theorem 1314 
Let y be the sample mean of a SRS of size n Then it holds that 3n M N m0 d 1 f a N071 where N a 00 and f is a constant In particular7 when the yi s are 0717distributed with 7 Pyi 1 p Vi then it holds that M7 y p igtNlt071gt7 17fxP1 P where N a 00 and f is a constant 171 Lecture 39 Fr 042001 132 Strati ed Random Samples De nition 1321 Let Q be a population of size N that is split into m disjoint sets 9739 called strata of size m Nj j 1 m where N ZNJ39 If we independently draw a random sample of size 71739 in j1 each strata we speak of a strati ed random sample I Note i The random samples in each strata are not always SRS s ii Strati ed random samples are used in practice as a means to reduce the sample variance in the case that data in each strata is homogeneous and data among different strata is heterogeneous iii Frequently used strata in practice are gender state or county income range ethnic background etc De nition 1322 Let Y Q a R be a measurable function In case of a strati ed random sample we use the following notation Let ij j 1 m k 1 Nj be the elements in 97 Then we de ne Ni i Z the total in the jth strata k1 ii W the mean in the jth strata m 1 ZNJIuj the expectation or grand mean in M N j1 m m Ni iv Np Zig Z Z 13 the total j1 j1 k1 Ni v 072 7 M2 the variance in the jth strata and k1 m Ni vi 02 7 u2 the variance j1 k1 172 M vii We denote an ordered sample in 97 of size 717 as 3471 7yam and yj Z yjk the k1 sample mean in the jth strata Theorem 1323 Let the same conditions hold as in De nitions 1321 and 1322 Let j be an unbiased estimate of W and Varmj be an unbiased estimate of Varmj Then it holds 1 m 1 m EU ZNjElj NM M N 7 N 7 771 771 By independence of the samples within each strata7 1 m WWW Nil I31 7 ii WW imwm imam mo j1 7391 Theorem 1324 Let the same conditions hold as in Theorem 1323 If we draw a SRS in each strata7 then it holds m M i Ill ZNij is unbiased for u where yj Eggk j1m m 1 N39 Varamp Zanju 7 fjf10727Where fj j1 173 m A 1 ii Varamp E Nfi17 ms 
is unbiased for Varamp where 71 39 71 7 W 2 1 5739 771 Zij 722 k1 71739 Proof For a SRS in the jth strata7 it follows by Theorem 1313 Elt jgt M 1 N1 2 WNW 77711 MN 710739 Also7 we can show that N 7110 7 Now the proof follows directly from Theorem 1323 I De nition 1325 Let the same conditions hold as in De nitions 1321 and 1322 If the sample in each strata is of size 71739 71 j 17 7m7 where n is the total sample size7 then we speak of pro portional selection I Note i In the case of proportional selection7 it holds that fj f j 17 7 m ii Proportional strata cannot always be obtained for each combination of m7 n and N Theorem 1326 Let the same conditions hold as in De nition 1325 If we draw a SRS in each strata7 then it holds in case of proportional selection that A 1 17 f m WWW 77 EN 727 j1 where 772 072 Proof The proof follows directly from Theorem 1324 l Theorem 1327 If we draw 1 a strati ed random sample that consists of SRS s of sizes 717 under proportional m selection and 2 a SRS of size n Z 717 from the same population7 then it holds that j1 7 A 1 N 7 n m 2 1 m 2 Vary 7 VaTUI LENA 7 7 I 7 N N aj 7 Proof See Homework l 175 14 Some Results from Sequential Statistical Inference 141 Fundamentals of Sequential Sampling Example 1411 A particular machine produces a large number of items every day Each item can be either defective or nonidefective The unknown proportion of defective items in the production of a particular day is p Let X1 Xm be a sample from the daily production where z 1 when the item is m defective and z 0 when the item is nonidefective Obviously Sm EX Binmp i1 denotes the total number of defective items in the sample assuming that m is small compared to the daily production We might be interested to test H0 and use this decision to trash the entire daily production and have the machine xed if indeed where c is chosen such that 11 is a levelioz test p p0 vs H1 p gt p0 at a given signi cance level 04 p gt p0 A suitable test could be 1 ifsmgtc 111 7mm 0 
ifsmgc However wouldn t it be more bene cial if we sequentially sample the items eg take item 57 623 1005 1286 2663 etc and stop the machine as soon as it becomes obvious that it produces too many bad items Alternatively we could also nish the time consuming and expensive process to determine whether an item is defective or nonidefective if it is impossible to surpass a certain proportion of defectives For example if for some j lt m it already holds that 5739 gt c then we could stop and immediately call maintenance and reject H0 after only j observations More formally let us de ne T minj Sj gt c and T minTm We can now con sider a decision rule that stops with the sampling process at random time T and rejects H0 if T g m Thus ifwe consider R0 m1mm tg m and R1 m1mm 3m gt c as critical regions of two tests 10 and 11 then these two tests are equivalent l 176 Lecture 4 We 042501 De nition 1412 Let 9 be the parameter space and A the set of actions the statistician can take We assume that the rv s X1X2 are observed sequentially and iid with common pdf or pmf f9z A sequential decision procedure is de ned as follows i A stopping rule speci es whether an element of A should be chosen without taking any further observation If at least one observation is taken this rule speci es for every set of observed values 1 2 m n 2 1 whether to stop sampling and choose an action in A or to take another observation 13n1 A V A decision rule speci es the decision to be taken If no observation has been taken then we take action do 6 A If n 2 1 observation have been taken then we take action dnm1 z E A where dnm1 zn speci es the action that has to be taken for the set 1mn of observed values Once an action has been taken the sampling process is stopped I Note In the remainder of this chapter we assume that the statistician takes at least one observation I De nition 1413 Let Rn Q B n 1 2 be a sequence of Borelimeasurable sets such that the sampling process is stopped after observing X1 m1X2 m2 
Xn m if 1mn E Rn lf ml zn Rn then another observation ml is taken The sets Rn n 1 2 are called stopping regions I De nition 1414 With every sequential stopping rule we associate a stopping random variable N which takes on the values 1 2 3 Thus N is a rv that indicates the total number of observations taken before the sampling is stopped I Note We use the sloppy notation N n to denote the event that sampling is stopped after observing exactly 71 values m1 mn ie sampling is not stopped before taking 71 samples Then the following equalities hold N1 R1 177 N n zl zn E R l sampling is stopped after 71 observations but not before RloRgooRn10 Rn Rf Rg RfLmRn Here we will only consider closed sequential sampling procedures ie procedures where sampling eventually stops with probability 1 ie PN lt oo 1 PNoo17PNltoo0 Theorem 1415 Wald7s Equation N Let X1X2 be iid rv s with X1 lt 00 Let N bea stoppingvariable Let SN ZXk k1 lf EN lt 00 then it holds EltSNgt EltX1gtEltNgt Proof De ne a sequence of rv s Yi i 1 2 where 1 if no decision is reached up to the 7 1 stage ie N gt 7 1 y 0 otherwise Then each Y is a function of X1X2 XFl only and Yi is independent of Xi Consider the rv 00 Z XnYn n1 Obviously it holds that 00 SN Z XnYn n1 Thus it follows that 00 ESN E 2 my 9 n1 It holds that iEOXnYnl iE XMWHYnl 1 n1 Elt1X11gtiPltN2ngt n1 178 EX1iiPNk n1kn i EltX1igtinPltNngt n1 EX1EN lt 00 A holds due to the following rearrangement of indizes We may therefore interchange the expectation and summation signs in and get Lecmre 423 Fr 04 27 01 00 ESN E 2 XML n1 which completes the proof I 179 142 Sequential Probability Ratio Tests De nition 1421 Let X17X27 be a sequence of iid rv s with common pdf or pmf f9z We want to test a simple hypothesis H0 X N 1 90 vs a simple alternative H1 X N 1 97 when the observations are taken sequentially Let fOn and fln denote the joint pdf s or pmf s of X17 7 Xn under H0 and H1 respectively7 ie7 n n f0nz17 7 mn H f90mi and f1nz17 7m H f91z7 i1 i1 Finally7 let 7 
λₙ(x) = f₁ₙ(x) / f₀ₙ(x), where x = (x₁, …, xₙ). Then a sequential probability ratio test (SPRT) for testing H₀ vs H₁ is the following decision rule:

(i) If at any stage of the sampling process it holds that λₙ(x) ≥ A, then stop and reject H₀.
(ii) If at any stage of the sampling process it holds that λₙ(x) ≤ B, then stop and accept H₀, i.e., reject H₁.
(iii) If B < λₙ(x) < A, then continue sampling by taking another observation xₙ₊₁. ∎

Note:
(i) It is usually convenient to define

Zᵢ = log( f_{θ₁}(Xᵢ) / f_{θ₀}(Xᵢ) ),

where Z₁, Z₂, … are iid rv's. Then we work with

log λₙ(x) = Σᵢ₌₁ⁿ log f_{θ₁}(xᵢ) − Σᵢ₌₁ⁿ log f_{θ₀}(xᵢ) = Σᵢ₌₁ⁿ zᵢ

instead of using λₙ(x). Obviously, we now have to use constants b = log B and a = log A instead of the original constants B and A.
(ii) A and B, where A > B, are constants such that the SPRT will have strength (α, β), where α = P(Type I error) = P(Reject H₀ | H₀) and β = P(Type II error) = P(Accept H₀ | H₁). If N is the stopping rv, then α = P_{θ₀}(λ_N(X) ≥ A) and β = P_{θ₁}(λ_N(X) ≤ B).

Example 14.2.2:
Let X₁, X₂, … be iid N(μ, σ²), where μ is unknown and σ² > 0 is known. We want to test H₀: μ = μ₀ vs H₁: μ = μ₁, where μ₀ < μ₁. If our data is sampled sequentially, we can construct a SPRT as follows:

log λₙ(x) = Σᵢ₌₁ⁿ ( −(xᵢ − μ₁)²/(2σ²) + (xᵢ − μ₀)²/(2σ²) )
= (1/(2σ²)) Σᵢ₌₁ⁿ ( (xᵢ − μ₀)² − (xᵢ − μ₁)² )
= (1/(2σ²)) Σᵢ₌₁ⁿ ( 2xᵢ(μ₁ − μ₀) + μ₀² − μ₁² )
= ((μ₁ − μ₀)/σ²) Σᵢ₌₁ⁿ xᵢ − n(μ₁² − μ₀²)/(2σ²).

We decide for H₀ if log λₙ(x) ≤ b, i.e.,

Σᵢ₌₁ⁿ xᵢ ≤ b* + n(μ₀ + μ₁)/2, where b* = σ²b/(μ₁ − μ₀).

We decide for H₁ if log λₙ(x) ≥ a, i.e.,

Σᵢ₌₁ⁿ xᵢ ≥ a* + n(μ₀ + μ₁)/2, where a* = σ²a/(μ₁ − μ₀).

Otherwise, we continue sampling.

[Figure (Example 14.2.2): the cumulative sum Σxᵢ plotted against n, with the two parallel stopping lines b* + n(μ₀ + μ₁)/2 ("accept H₀") and a* + n(μ₀ + μ₁)/2 ("accept H₁"); between the lines we continue sampling.]

Theorem 14.2.3:
For a SPRT with stopping bounds A and B, A > B, and strength (α, β), we have

A ≤ (1 − β)/α and B ≥ β/(1 − α),

where 0 < α < 1 and 0 < β < 1. ∎

Theorem 14.2.4:
Assume we select, for given α, β ∈ (0, 1) with α + β ≤ 1, the stopping bounds

A = (1 − β)/α and B = β/(1 − α).

Then it holds that the SPRT with stopping bounds A and B has strength (α′, β′), where

α′ ≤ α/(1 − β), β′ ≤ β/(1 − α), and α′ + β′ ≤ α + β. ∎

Note:
(i) The approximation A ≈ (1 − β)/α and B ≈ β/(1 − α) in Theorem 14.2.4 is called the Wald approximation for the optimal stopping bounds of a SPRT.
(ii) A and B are functions of α and β only and do not depend on the pdf's or pmf's f_{θ₀} and f_{θ₁}. Therefore, they
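In terms of the zᵢ's, the SPRT is a random walk that stops as soon as the cumulative sum leaves the interval (b, a). A minimal Python simulation sketch (not part of the original notes) for the normal case of Example 14.2.2, using the Wald-approximation bounds A = (1 − β)/α and B = β/(1 − α); the data stream and the seed are arbitrary illustrative choices:

```python
import math
import random

def sprt_normal(xs, mu0, mu1, sigma, alpha, beta):
    """SPRT for H0: mu = mu0 vs H1: mu = mu1 on the data stream xs.
    Uses log-bounds a = log((1 - beta)/alpha), b = log(beta/(1 - alpha))
    (Wald approximation).  Returns (decision, number of observations)."""
    a = math.log((1 - beta) / alpha)
    b = math.log(beta / (1 - alpha))
    s = 0.0  # running sum of z_i = log f_theta1(x_i) - log f_theta0(x_i)
    for n, x in enumerate(xs, start=1):
        # z_i for the normal case: ((mu1 - mu0) x - (mu1^2 - mu0^2)/2) / sigma^2
        s += ((mu1 - mu0) * x - (mu1**2 - mu0**2) / 2) / sigma**2
        if s >= a:
            return "reject H0", n
        if s <= b:
            return "accept H0", n
    return "continue", len(list(xs))

random.seed(0)
stream = (random.gauss(1.0, 1.0) for _ in range(10_000))  # data generated under H1
print(sprt_normal(stream, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05))
```

With data generated under H₁ the walk drifts upward and typically crosses a after a handful of observations; under H₀ it drifts toward b instead, which is exactly the geometry of the two stopping lines in Example 14.2.2.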
can be computed once and for all fezs i O7 1 THE END 183 Index Daisimilar 109 071 Loss 129 A Posteriori Distribution 86 A Priori Distribution 86 Action 83 Alternative Hypothesis 91 Ancill 56 Asymptotically Most Ef cient 73 Basu7s Theorem 56 Bayes Estimate 87 Bayes Risk 86 Bayes Rule 87 Bayesian Con dence Interval 149 Bayesian Con dence Set 149 Bias 58 Borel antelli Lemma 21 Cauchy Criterion 23 Centering Constants 15 19 Central Limit Theorem Lindeberg 33 Central Limit Theorem LindebergiLevy 30 Chapman Robbins Kiefer Inequality 71 Complete 52 152 Complete in Relation to 73 152 Composite 91 Con dence Bound Lower 135 Con dence Bound Upper 135 Con dence Level 134 Con dence Sets 134 Conjugate Family 90 Consistent 5 Consistent in the rthMean 45 Consistent MeaniSquarediError 59 Consistent Strongly 45 Consistent Weakly 45 Continuity Theorem 29 Contradiction Proof by 61 Convergence Almost Sure 12 Convergence In 7 Mean 10 Convergence In Absolute Mean 10 Convergence In Distribution 2 Convergence In Law 2 Convergence In Mean Square 10 Convergence In Probability 5 Convergence Strong 12 Convergence Weak 2 184 Convergence With Probability 1 12 ConvergenchEquivalent 23 Cram riRao Lower Bound 67 68 Credible Set 149 Critical Region 92 CRK Inequality 71 CRLB 67 68 Decision Function 83 Decision Rule 177 Degree 153 Distribution A Posteriori 86 Distribution Population 36 Distribution Sampling 2 DistributioniFree 152 Domain of Attraction 32 Ef ciency 73 E icient Asymptotically Most 73 E icient More 73 E icient Most 73 Empirical CDF 36 Empirical Cumulative Distribution Function 36 Equivalence Lemma 23 Error Type I 92 Error Type II 92 Estimable 153 Estimable Function 58 Estimate Bayes 87 Estimate Maximum Likelihood 77 Estimate Method of Moments 75 Estimate Minimax 84 Estimate Point 44 Estimator 44 Estimator ManniWhitney 157 Estimator WilcoXin 2aSample 157 Exponential Family OncLParameter 53 FiTest 127 Factorization Criterion 50 Family of CDF7s 44 Family of Con dence Sets 134 Family of PDF7s 
44 Family of PMF7s 44 Family of Random Sets 134 Formal Invariance 112 Generalized UiStatistic 157 ClivenkmCantelli Theorem 37 Hypothesis Alternative 91 Hypothesis Null 91 Independence of Y and 52 41 Induced Function 46 Inequality Kolmogorov75 22 Interval Random 134 Invariance Measurement 112 Invariant 46 111 Invariant Test 112 Invariant Location 47 Invariant Maximal 113 Invariant Permutation 47 Invariant Scale 47 KiS Statistic 158 KiS Test 162 Kernel 153 Kernel Symmetric 153 Khintchine7s Weak Law of Large Numbers 18 Kolmogorov7s Inequality 22 Kolmogorov7s SLLN 25 KolmogoroviSmirnov Statistic 158 KolmogoroviSmirnov Test 162 Kronecker75 Lemma 22 Landau Symbols 0 and 0 31 LehmanniSche tee 65 Level of Signi cance 93 Level iTest 93 Likelihood Function 77 Likelihood Ratio Test 116 Likelihood Ratio Test Statistic 116 Lindeberg Central Limit Theorem 33 Lindeberg Condition 33 LindebergiLevy Central Limit Theorem 30 LMVUE 60 Locally Minumum Variance Unbiased Estimate 60 Location Invariant 47 Logic 61 Loss Function 83 Lower Con dence Bound 135 LRT 116 Mannin1itney Estimator 157 Maximal Invariant 113 Maximum Likelihood Estimate 77 Mean Square Error 59 MeanaSquarediError Consistent 59 Measurement Invariance 112 Method of Moments Estimate 75 Minimax Estimate 84 Minimax Principle 84 Minmal Suf cient 56 MLE 77 MLR 102 MOM 75 Monotone Likelihood Ratio 102 185 More Ef cient 73 Most Ef cient 73 Most Powerful Test 93 MP 93 MSEConsistent 59 NeymaniPearson Lemma 96 Nonparametric 152 Nonrandomized Test 93 Normal Variance Tests 121 Norming Constants 15 19 NP Lemma 96 Null Hypothesis 91 One Sample tiTest 125 OncLTailed tiTest 125 Paired tiTest 126 Parameter Space 44 Parametric Hypothesis 91 Permutation Invariant 47 Pivot 138 Point Estimate 44 Point Estimation 44 Population Distribution 36 Posterior Distribution 86 Power 93 Power Function 93 Prior Distribution 86 Probability Integral Transformation 159 Probability Ratio Test Sequential 180 Problem of Fit 158 Proof by Contradiction 61 
Proportional Selection 174 Random Interval 134 Random Sample 36 Random Sets 134 Random Variable Stopping 177 Randomized Test 93 RaoiBlackwell 64 RaoiBlackwellization 65 Realization 36 Regularity Conditions 68 Risk Function 83 Risk Bayes 86 Sample 36 Sample Central Moment of Order k 37 Sample Mean 36 Sample Moment of Order k 37 Sample Statistic 36 Sample Variance 36 Sampling Distribution 2 Scale Invariant 47 Selection Proportional 174 Sequential Decision Procedure 177 Sequential Probability Ratio Test 180 Signi cance Level 93 Similar 109 9 Simple Random Sample 169 Size 93 Slutsky7s Theorem 8 Statistic KolmogoroviSmirnov 158 Statistic Likelihood Ratio Test 116 Stopping Random Variable 177 Stopping Regions 177 Stopping Rule 177 Strata 172 Strati ed Random Sample 172 Strong Law of Large Numbers Kolmogorov7s 25 Strongly Consistent 45 Suf cient 48 152 Suf cient Minimal 56 Symmetric Kernel 153 tiTest 125 TailiEquivalent 23 Taylor Series 31 Test Function 93 Test Invariant 112 Test KolmogoroviSmirnov 162 Test Likelihood Ratio 116 Test Most Powerful 93 Test Nonrandomized 93 Test Randomized 93 Test Uniformly Most Powerful 93 Tolerance Coe icient 165 Tolerance Interval 165 TwoiSample tiTest 125 TwoiTailed tiTest 125 Type 1 Error 92 Type II Error 92 UaStatistic 154 UaStatistic Generalized 157 UMA 135 UMAU 145 UMP 93 UMP Daisimilar 110 UMVUE 60 Unbiased 58 106 145 Uniformly Minumum Variance Unbiased Estimate 60 Uniformly Most Accurate 135 Uniformly Most Accurate Unbiased 145 Uniformly Most Powerful Test 93 Unimodal 139 Upper Con dence Bound 135 Wald7s Equation 178 WaldiApproXimation 183 Weak Law Of Large Numbers 15 Weak Law Of Large Numbers Khintchine7s 18 Weakly Consistent 45 WilcoXin 2aSample Estimator 157
