# Mathematical Statistics I STAT 6710

Utah State University


These 277 pages of class notes were uploaded by Geovanny Lakin on Wednesday, October 28, 2015. The class notes belong to STAT 6710 at Utah State University, taught by Juergen Symanzik in Fall. Since their upload, they have received 22 views.


Date Created: 10/28/15

**STAT 6710 Mathematical Statistics I, Fall Semester 1999**

Dr. Jürgen Symanzik, Utah State University, Department of Mathematics and Statistics, 3900 Old Main Hill, Logan, UT 84322-3900

Tel.: (435) 797-0696, FAX: (435) 797-4822, e-mail: symanzik@sunfs.math.usu.edu

## Contents

- Acknowledgements
- 1 Axioms of Probability
  - 1.1 σ-Fields
  - 1.2 Manipulating Probability
  - 1.3 Combinatorics and Counting
  - 1.4 Conditional Probability and Independence
- 2 Random Variables
  - 2.1 Measurable Functions
  - 2.2 Probability Distribution of a Random Variable
  - 2.3 Discrete and Continuous Random Variables
  - 2.4 Transformations of Random Variables
- 3 Moments and Generating Functions
  - 3.1 Expectation
  - 3.2 Generating Functions
  - 3.3 Complex-Valued Random Variables and Characteristic Functions
  - 3.4 Probability Generating Functions
  - 3.5 Moment Inequalities
- 4 Random Vectors
  - 4.1 Joint, Marginal, and Conditional Distributions
  - 4.2 Independent Random Variables
  - 4.3 Functions of Random Vectors
  - 4.4 Order Statistics
  - 4.5 Multivariate Expectation
  - 4.6 Multivariate Generating Functions
  - 4.7 Conditional Expectation
  - 4.8 Inequalities and Identities
- 5 Particular Distributions
  - 5.1 Multivariate Normal Distributions
  - 5.2 Exponential Family of Distributions
- 6 Limit Theorems
  - 6.1 Modes of Convergence
  - 6.2 Weak Laws of Large Numbers
  - 6.3 Strong Laws of Large Numbers

## Acknowledgements

I would like to thank my students, Hanadi B. Eltahir, Rich Madsen, and Bill Morphet, who helped in typesetting these lecture notes using LaTeX and for their suggestions on how to improve some of the material presented in class.

In addition, I particularly would like to thank Mike Minnotte and Dan Coster, who previously taught this course at Utah State University, for providing me with their lecture notes and other materials related to this course. Their lecture notes, combined with additional material from Rohatgi (1976) and other sources listed below, form the basis of the script presented here.

The textbook required for this class is:

- Rohatgi, V. K.
  (1976): *An Introduction to Probability Theory and Mathematical Statistics*, John Wiley and Sons, New York.

A Web page dedicated to this class is accessible at:

http://www.math.usu.edu/~symanzik/teaching/1999_stat6710/stat6710.html

This course closely follows Rohatgi (1976) as described in the syllabus. Additional material originates from the lectures of Professors Hering, Trenkler, and Gather I have attended while studying at the Universität Dortmund, Germany, the collection of Masters and PhD Preliminary Exam questions from Iowa State University, Ames, Iowa, and the following textbooks:

- Bandelow, C. (1981): *Einführung in die Wahrscheinlichkeitstheorie*, Bibliographisches Institut, Mannheim, Germany.
- Casella, G., and Berger, R. L. (1990): *Statistical Inference*, Wadsworth & Brooks/Cole, Pacific Grove, CA.
- Fisz, M. (1989): *Wahrscheinlichkeitsrechnung und mathematische Statistik*, VEB Deutscher Verlag der Wissenschaften, Berlin, German Democratic Republic.
- Kelly, D. G. (1994): *Introduction to Probability*, Macmillan, New York, NY.
- Mood, A. M., Graybill, F. A., and Boes, D. C. (1974): *Introduction to the Theory of Statistics* (Third Edition), McGraw-Hill, Singapore.
- Parzen, E. (1960): *Modern Probability Theory and Its Applications*, Wiley, New York, NY.
- Searle, S. R. (1971): *Linear Models*, Wiley, New York, NY.

Additional definitions, integrals, sums, etc. originate from the following formula collections:

- Bronstein, I. N., and Semendjajew, K. A. (1985): *Taschenbuch der Mathematik* (22. Auflage), Verlag Harri Deutsch, Thun, German Democratic Republic.
- Bronstein, I. N., and Semendjajew, K. A. (1986): *Ergänzende Kapitel zu Taschenbuch der Mathematik* (4. Auflage), Verlag Harri Deutsch, Thun, German Democratic Republic.
- Sieber, H. (1980): *Mathematische Formeln, Erweiterte Ausgabe E*, Ernst Klett, Stuttgart, Germany.

Jürgen Symanzik, January 18, 2000

*Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 2: Wednesday 9/1/1999. Scribe: Jürgen Symanzik*

## 1 Axioms of Probability

### 1.1 σ-Fields

Let $\Omega$ be the sample space of all possible outcomes of a chance experiment. Let $\omega \in \Omega$ (or $x \in \Omega$) be any outcome.

**Example:** Count # of
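heads in $n$ coin tosses: $\Omega = \{0, 1, 2, \ldots, n\}$.

Any subset $A$ of $\Omega$ is called an event. For each event $A \subseteq \Omega$, we would like to assign a number (i.e., a probability). Unfortunately, we cannot always do this for every subset of $\Omega$. Instead, we consider classes of subsets of $\Omega$ called fields and σ-fields.

**Definition 1.1.1:** A class $L$ of subsets of $\Omega$ is called a field if $\Omega \in L$ and $L$ is closed under complements and finite unions, i.e., $L$ satisfies

(i) $\Omega \in L$

(ii) $A \in L \Rightarrow A^C \in L$

(iii) $A, B \in L \Rightarrow A \cup B \in L$

Since $\Omega^C = \emptyset$, (i) and (ii) imply $\emptyset \in L$. Therefore, (i)': $\emptyset \in L$ can replace (i).

Recall De Morgan's Laws:

$$\bigcup_{A \in \mathcal{A}} A = \Big( \bigcap_{A \in \mathcal{A}} A^C \Big)^C \quad \text{and} \quad \bigcap_{A \in \mathcal{A}} A = \Big( \bigcup_{A \in \mathcal{A}} A^C \Big)^C$$

**Note:** So (ii), (iii) imply (iii)': $A, B \in L \Rightarrow A \cap B \in L$, which can replace (iii).

**Proof:** $A, B \in L \stackrel{\text{(ii)}}{\Rightarrow} A^C, B^C \in L \stackrel{\text{(iii)}}{\Rightarrow} (A^C \cup B^C) \in L \stackrel{\text{(ii)}}{\Rightarrow} (A^C \cup B^C)^C \in L \stackrel{\text{De Morgan}}{\Rightarrow} A \cap B \in L$

**Definition 1.1.2:** A class $L$ of subsets of $\Omega$ is called a σ-field (Borel field, σ-algebra) if it is a field and closed under countable unions, i.e.,

(iv) $\{A_n\}_{n=1}^{\infty} \in L \Rightarrow \bigcup_{n=1}^{\infty} A_n \in L$

**Note:** (iv) implies (iii) by taking $A_n = \emptyset$ for $n \ge 3$.

**Example 1.1.3:** For some $\Omega$, let $L$ contain all finite and all cofinite sets ($A$ is cofinite if $A^C$ is finite). Then $L$ is a field. But $L$ is a σ-field iff (if and only if) $\Omega$ is finite.

Example: $\Omega = \mathbb{Z}$. Take $A_n = \{n\}$, each finite, so $A_n \in L$. But $\bigcup_{n=1}^{\infty} A_n = \mathbb{Z}^+ \notin L$, since the set is not finite (it is infinite) and also not cofinite ($\big(\bigcup_{n=1}^{\infty} A_n\big)^C = \mathbb{Z}_0^-$ is infinite, too).

Question: Does this construction work for $\Omega = \mathbb{Z}^+$?

**Note:** The largest σ-field in $\Omega$ is the power set $\mathcal{P}(\Omega)$ of all subsets of $\Omega$. The smallest σ-field is $L = \{\emptyset, \Omega\}$.

**Terminology:** A set $A \in L$ is said to be "measurable $L$".

*Lecture 3: Friday 9/3/1999. Scribe: Jürgen Symanzik*

We often begin with a class of sets, say $a$, which may not be a field or a σ-field.

**Definition 1.1.4:** The σ-field generated by $a$, $\sigma(a)$, is the smallest σ-field containing $a$, or the intersection of all σ-fields containing $a$.

**Note:** (i) Such σ-fields containing $a$ always exist (e.g., $\mathcal{P}(\Omega)$), and (ii) the intersection of an arbitrary # of σ-fields is always a σ-field.

**Proof:** (ii) Suppose $L = \bigcap_{\theta} L_\theta$. We have to show that conditions (i) and (ii) of Def. 1.1.1 and (iv) of Def. 1.1.2 are fulfilled:

(i) $\Omega \in L_\theta \ \forall \theta \Rightarrow \Omega \in L$

(ii) Let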
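$A \in L \Rightarrow A \in L_\theta \ \forall \theta \Rightarrow A^C \in L_\theta \ \forall \theta \Rightarrow A^C \in L$

(iv) Let $A_n \in L \ \forall n \Rightarrow A_n \in L_\theta \ \forall \theta \ \forall n \Rightarrow \bigcup_n A_n \in L_\theta \ \forall \theta \Rightarrow \bigcup_n A_n \in L$

**Example 1.1.5:** $\Omega = \{0, 1, 2, 3\}$, $a = \{\{0\}\}$, $b = \{\{0\}, \{1\}\}$.

What is $\sigma(a)$? $\sigma(a)$ must include $\Omega, \emptyset, \{0\}$; also $\{1, 2, 3\}$ by 1.1.1 (ii). Since all unions are included, we have $\sigma(a) = \{\Omega, \emptyset, \{0\}, \{1, 2, 3\}\}$.

What is $\sigma(b)$? $\sigma(b)$ must include $\Omega, \emptyset, \{0\}, \{1\}$; also $\{1, 2, 3\}, \{0, 2, 3\}$ by 1.1.1 (ii); $\{0, 1\}$ by 1.1.1 (iii); $\{2, 3\}$ by 1.1.1 (ii). Since all unions are included, we have $\sigma(b) = \{\Omega, \emptyset, \{0\}, \{1\}, \{0, 1\}, \{2, 3\}, \{0, 2, 3\}, \{1, 2, 3\}\}$.

If $\Omega$ is finite or countable, we will usually use $L = \mathcal{P}(\Omega)$. If $|\Omega| = n < \infty$, then $|L| = 2^n$. If $\Omega$ is uncountable, $\mathcal{P}(\Omega)$ may be too large to be useful and we may have to use some smaller σ-field.

**Definition 1.1.6:** If $\Omega = \mathbb{R}$, an important special case is the Borel σ-field, i.e., the σ-field generated from all half-open intervals of the form $(a, b]$, denoted $\mathcal{B}$ or $\mathcal{B}_1$. The sets of $\mathcal{B}$ are called Borel sets. The Borel σ-field on $\mathbb{R}^d$ ($\mathcal{B}_d$) is the σ-field generated by $d$-dimensional rectangles of the form $\{(x_1, x_2, \ldots, x_d) : a_i < x_i \le b_i;\ i = 1, 2, \ldots, d\}$.

**Note:** $\mathcal{B}$ contains all points: $\{x\} = \bigcap_{n=1}^{\infty} (x - \tfrac{1}{n}, x]$; closed intervals: $[x, y] = (x, y] \cup \{x\}$; open intervals: $(x, y) = (x, y] \cap \{y\}^C$; and semi-infinite intervals: $(x, \infty) = \bigcup_{n=1}^{\infty} (x, x + n]$.

We now have a measurable space $(\Omega, L)$. We next define a probability measure $P(\cdot)$ on $(\Omega, L)$ to obtain a probability space $(\Omega, L, P)$.

**Definition 1.1.7: Kolmogorov Axioms of Probability.** A probability measure (pm) $P$ on $(\Omega, L)$ is a set function $P : L \to \mathbb{R}$ satisfying

(i) $0 \le P(A) \ \forall A \in L$

(ii) $P(\Omega) = 1$

(iii) If $\{A_n\}_{n=1}^{\infty}$ are disjoint sets in $L$ and $\bigcup_{n=1}^{\infty} A_n \in L$, then $P\big(\bigcup_{n=1}^{\infty} A_n\big) = \sum_{n=1}^{\infty} P(A_n)$.

**Note:** $\bigcup_{n=1}^{\infty} A_n \in L$ holds automatically if $L$ is a σ-field, but it is needed as a precondition in the case that $L$ is just a field. Property (iii) is called countable additivity.

*Lecture 4: Wednesday 9/8/1999. Scribe: Jürgen Symanzik, Bill Morphet*

### 1.2 Manipulating Probability

**Theorem 1.2.1:** For $P$ a pm on $(\Omega, L)$, it holds:

(i) $P(\emptyset) = 0$

(ii) $P(A^C) = 1 - P(A) \ \forall A \in L$

(iii) $P(A) \le 1 \ \forall A \in L$

(iv) $P(A \cup B) = P(A) + P(B) - P(A \cap B) \ \forall A, B \in L$

(v) If $A \subseteq B$, then $P(A) \le P(B)$.

**Proof:**

(i) Let $A_n = \emptyset \ \forall n \Rightarrow \bigcup_{n=1}^{\infty} A_n = \emptyset \in L$. Since $A_i \cap A_j = \emptyset \cap \emptyset = \emptyset \ \forall i, j$, the $A_n$ are disjoint $\forall n$. Then

$$P(\emptyset) = P\Big(\bigcup_{n=1}^{\infty} A_n\Big) \stackrel{\text{Def 1.1.7 (iii)}}{=} \sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} P(\emptyset).$$

This can only hold if $P(\emptyset) = 0$.

(ii) Let $A_1 = A$, $A_2 = A^C$, $A_n = \emptyset \ \forall n \ge 3$. Then $\Omega = \bigcup_{n=1}^{\infty} A_n = A_1 \cup A_2 \cup \bigcup_{n=3}^{\infty} A_n$. Since $A_1 \cap A_2 = A_1 \cap \emptyset = A_2 \cap \emptyset = \emptyset$, the sets $A_1, A_2, \emptyset$ are disjoint. Therefore

$$1 = P(\Omega) = P\Big(\bigcup_{n=1}^{\infty} A_n\Big) \stackrel{\text{Def 1.1.7 (iii)}}{=} \sum_{n=1}^{\infty} P(A_n) = P(A_1) + P(A_2) + \sum_{n=3}^{\infty} P(A_n) \stackrel{\text{Th 1.2.1 (i)}}{=} P(A) + P(A^C)$$

$\Rightarrow P(A^C) = 1 - P(A) \ \forall A \in L$.

(iii) By Th. 1.2.1 (ii), $P(A) = 1 - P(A^C) \Rightarrow P(A) \le 1 \ \forall A \in L$, since $P(A^C) \ge 0$ by Def. 1.1.7 (i).

(iv) $A \cup B = (A \cap B^C) \cup (A \cap B) \cup (B \cap A^C)$. So $A \cup B$ can be written as a union of the disjoint sets $(A \cap B^C)$, $(A \cap B)$, $(B \cap A^C)$:

$$\begin{aligned}
P(A \cup B) &= P\big((A \cap B^C) \cup (A \cap B) \cup (B \cap A^C)\big) \\
&\stackrel{\text{Def 1.1.7 (iii)}}{=} P(A \cap B^C) + P(A \cap B) + P(B \cap A^C) \\
&= P(A \cap B^C) + P(A \cap B) + P(B \cap A^C) + P(A \cap B) - P(A \cap B) \\
&= \big(P(A \cap B^C) + P(A \cap B)\big) + \big(P(B \cap A^C) + P(A \cap B)\big) - P(A \cap B) \\
&= P(A) + P(B) - P(A \cap B)
\end{aligned}$$

(v) $B = (B \cap A^C) \cup A$, where $(B \cap A^C)$ and $A$ are disjoint sets. Hence

$$P(B) = P\big((B \cap A^C) \cup A\big) \stackrel{\text{Def 1.1.7 (iii)}}{=} P(B \cap A^C) + P(A)$$

$\Rightarrow P(A) = P(B) - P(B \cap A^C) \Rightarrow P(A) \le P(B)$, since $P(B \cap A^C) \ge 0$ by Def. 1.1.7 (i).

**Theorem 1.2.2: Principle of Inclusion-Exclusion.** Let $A_1, A_2, \ldots, A_n \in L$. Then

$$P\Big(\bigcup_{k=1}^{n} A_k\Big) = \sum_{k=1}^{n} P(A_k) - \sum_{k_1 < k_2} P(A_{k_1} \cap A_{k_2}) + \sum_{k_1 < k_2 < k_3} P(A_{k_1} \cap A_{k_2} \cap A_{k_3}) - \ldots + (-1)^{n+1} P\Big(\bigcap_{k=1}^{n} A_k\Big)$$

**Proof:** $n = 1$ is trivial; $n = 2$ is Theorem 1.2.1 (iv); use induction for higher $n$ (Homework).

**Theorem 1.2.3: Bonferroni's Inequality.** Let $A_1, A_2, \ldots, A_n \in L$. Then

$$\sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) \le P\Big(\bigcup_{i=1}^{n} A_i\Big) \le \sum_{i=1}^{n} P(A_i)$$

**Proof:**

Right side, induction base: For $n = 1$, the right side of Th. 1.2.3 evaluates to $P(A_1) \le P(A_1)$, which is true. For $n = 2$, the right side evaluates to $P(A_1 \cup A_2) \le P(A_1) + P(A_2)$. This holds because

$$P(A_1 \cup A_2) \stackrel{\text{Th 1.2.1 (iv)}}{=} P(A_1) + P(A_2) - P(A_1 \cap A_2) \le P(A_1) + P(A_2),$$

since $P(A_1 \cap A_2) \ge 0$ by Def. 1.1.7 (i). This establishes the induction base for the right side of Th. 1.2.3.

Right side, induction step: Assume the right side of Th. 1.2.3 is true for $n$ and show that it is true for $n + 1$:

$$\begin{aligned}
P\Big(\bigcup_{i=1}^{n+1} A_i\Big) &= P\Big(\Big(\bigcup_{i=1}^{n} A_i\Big) \cup A_{n+1}\Big) \\
&\stackrel{\text{Th 1.2.1 (iv)}}{=} P\Big(\bigcup_{i=1}^{n} A_i\Big) + P(A_{n+1}) - P\Big(\Big(\bigcup_{i=1}^{n} A_i\Big) \cap A_{n+1}\Big) \\
&\stackrel{\text{Def 1.1.7 (i)}}{\le} P\Big(\bigcup_{i=1}^{n} A_i\Big) + P(A_{n+1}) \\
&\stackrel{\text{I.B.}}{\le} \sum_{i=1}^{n} P(A_i) + P(A_{n+1}) \\
&= \sum_{i=1}^{n+1} P(A_i)
\end{aligned}$$

Left side, induction base: For $n = 1$, the left side of Th. 1.2.3 evaluates to $P(A_1) \le P(A_1)$, which is true. For $n = 2$, the left side evaluates to $P(A_1) + P(A_2) - P(A_1 \cap A_2) \le P(A_1 \cup A_2)$, which is true by Th. 1.2.1 (iv). For $n = 3$, the left side evaluates to

$$P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) \le P(A_1 \cup A_2 \cup A_3).$$

This holds since

$$\begin{aligned}
P(A_1 \cup A_2 \cup A_3) &= P\big((A_1 \cup A_2) \cup A_3\big) \\
&\stackrel{\text{Th 1.2.1 (iv)}}{=} P(A_1 \cup A_2) + P(A_3) - P\big((A_1 \cup A_2) \cap A_3\big) \\
&= P(A_1) + P(A_2) - P(A_1 \cap A_2) + P(A_3) - P\big((A_1 \cap A_3) \cup (A_2 \cap A_3)\big) \\
&\stackrel{\text{Th 1.2.1 (iv)}}{=} P(A_1) + P(A_2) - P(A_1 \cap A_2) + P(A_3) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P\big((A_1 \cap A_3) \cap (A_2 \cap A_3)\big) \\
&= P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3) \\
&\stackrel{\text{Def 1.1.7 (i)}}{\ge} P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3)
\end{aligned}$$

This establishes the induction base for the left side of Th. 1.2.3.

Left side, induction step: Assume the left side of Th. 1.2.3 is true for $n$ and show that it is true for $n + 1$:

$$\begin{aligned}
P\Big(\bigcup_{i=1}^{n+1} A_i\Big) &= P\Big(\Big(\bigcup_{i=1}^{n} A_i\Big) \cup A_{n+1}\Big) \\
&= P\Big(\bigcup_{i=1}^{n} A_i\Big) + P(A_{n+1}) - P\Big(\Big(\bigcup_{i=1}^{n} A_i\Big) \cap A_{n+1}\Big) \\
&\stackrel{\text{left I.B.}}{\ge} \sum_{i=1}^{n} P(A_i) - \sum_{i<j}^{n} P(A_i \cap A_j) + P(A_{n+1}) - P\Big(\bigcup_{i=1}^{n} (A_i \cap A_{n+1})\Big) \\
&= \sum_{i=1}^{n+1} P(A_i) - \sum_{i<j}^{n} P(A_i \cap A_j) - P\Big(\bigcup_{i=1}^{n} (A_i \cap A_{n+1})\Big) \\
&\stackrel{\text{Th 1.2.3 right side}}{\ge} \sum_{i=1}^{n+1} P(A_i) - \sum_{i<j}^{n} P(A_i \cap A_j) - \sum_{i=1}^{n} P(A_i \cap A_{n+1}) \\
&= \sum_{i=1}^{n+1} P(A_i) - \sum_{i<j}^{n+1} P(A_i \cap A_j)
\end{aligned}$$

*Lecture 5: Friday 9/10/1999. Scribe: Jürgen Symanzik, Rich Madsen*

**Theorem 1.2.4: Boole's Inequality.** Let $A, B \in L$. Then

(i) $P(A \cap B) \ge P(A) + P(B) - 1$

(ii) $P(A \cap B) \ge 1 - P(A^C) - P(B^C)$

**Proof:** Homework.

**Definition 1.2.5: Continuity of sets.** For a sequence of sets $\{A_n\}_{n=1}^{\infty}$, $A_n \in L$, and $A \in L$, we say

(i) $A_n \uparrow A$ if $A_1 \subseteq A_2 \subseteq A_3 \subseteq \ldots$ and $A = \bigcup_{n=1}^{\infty} A_n$.

(ii) $A_n \downarrow A$ if $A_1 \supseteq A_2 \supseteq A_3 \supseteq \ldots$ and $A = \bigcap_{n=1}^{\infty} A_n$.

**Theorem 1.2.6:** If $\{A_n\}_{n=1}^{\infty}$, $A_n \in L$, and $A \in L$, then $\lim_{n \to \infty} P(A_n) = P(A)$ if 1.2.5 (i) or 1.2.5 (ii) holds.

**Proof:**

Part (i): Assume that 1.2.5 (i) holds. Let $B_1 = A_1$ and $B_k = A_k - A_{k-1} = A_k \cap A_{k-1}^C \ \forall k \ge 2$. By construction, $B_i \cap B_j = \emptyset$ for $i \ne j$. It is $A = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$ and also $A_n = \bigcup_{i=1}^{n} A_i = \bigcup_{i=1}^{n} B_i$. Then

$$P(A) = P\Big(\bigcup_{k=1}^{\infty} B_k\Big) \stackrel{\text{Def 1.1.7 (iii)}}{=} \sum_{k=1}^{\infty} P(B_k) = \lim_{n \to \infty} \Big[\sum_{k=1}^{n} P(B_k)\Big] \stackrel{\text{Def 1.1.7 (iii)}}{=} \lim_{n \to \infty} \Big[P\Big(\bigcup_{k=1}^{n} B_k\Big)\Big] = \lim_{n \to \infty} \Big[P\Big(\bigcup_{k=1}^{n} A_k\Big)\Big] = \lim_{n \to \infty} P(A_n)$$

The last step is possible since $A_n = \bigcup_{k=1}^{n} A_k$.

Part (ii): Assume that 1.2.5 (ii) holds. Then $A_1^C \subseteq A_2^C \subseteq A_3^C \subseteq \ldots$ and $A^C = \big(\bigcap_{n=1}^{\infty} A_n\big)^C \stackrel{\text{De Morgan}}{=} \bigcup_{n=1}^{\infty} A_n^C$. By Part (i),

$$P(A^C) = \lim_{n \to \infty} P(A_n^C).$$

So $1 - P(A^C) = 1 - \lim_{n \to \infty} P(A_n^C) \Rightarrow P(A) = \lim_{n \to \infty} \big(1 - P(A_n^C)\big) = \lim_{n \to \infty} P(A_n)$.

**Theorem 1.2.7:**

(i) Countable unions of probability 0 sets have probability 0.

(ii) Countable intersections of probability 1 sets have probability 1.

**Proof:**

Part (i): Let $\{A_n\}_{n=1}^{\infty} \in L$, $P(A_n) = 0 \ \forall n$. Then

$$0 \stackrel{\text{Def 1.1.7 (i)}}{\le} P\Big(\bigcup_{n=1}^{\infty} A_n\Big) \stackrel{\text{Bonferroni's Inequality}}{\le} \sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} 0 = 0.$$

Therefore $P\big(\bigcup_{n=1}^{\infty} A_n\big) = 0$.

Part (ii): Let $\{A_n\}_{n=1}^{\infty} \in L$, $P(A_n) = 1 \ \forall n$. By Th. 1.2.1 (ii), $P(A_n^C) = 0 \ \forall n$. By Part (i), $P\big(\bigcup_{n=1}^{\infty} A_n^C\big) = 0$. By De Morgan's Laws and Th. 1.2.1 (ii), $P\big(\bigcap_{n=1}^{\infty} A_n\big) = 1$.

*Lecture 6: Monday 9/13/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir*

### 1.3 Combinatorics and Counting

For now, we restrict ourselves to sample spaces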
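containing a finite number of points. Let $\Omega = \{\omega_1, \ldots, \omega_n\}$ and $L = \mathcal{P}(\Omega)$. For any $A \in L$, $P(A) = \sum_{\omega_j \in A} P(\omega_j)$.

**Definition 1.3.1:** We say the elements of $\Omega$ are equally likely (or occur with uniform probability) if $P(\omega_j) = \frac{1}{n} \ \forall j = 1, \ldots, n$.

**Note:** If this is true, $P(A) = \dfrac{\text{number of } \omega_j \text{ in } A}{\text{number of } \omega_j \text{ in } \Omega}$. Therefore, to calculate such probabilities, we just need to be able to count elements accurately.

**Theorem 1.3.2: Fundamental Theorem of Counting.** If we wish to select one element ($a_1$) out of $n_1$ choices, a second element ($a_2$) out of $n_2$ choices, and so on for a total of $k$ elements, there are $n_1 \times n_2 \times n_3 \times \ldots \times n_k$ ways to do it.

**Proof:** By induction.

Induction base: $k = 1$: trivial. $k = 2$: There are $n_1$ ways to choose $a_1$. For each, there are $n_2$ ways to choose $a_2$. Total # of ways $= \underbrace{n_2 + n_2 + \ldots + n_2}_{n_1 \text{ times}} = n_1 \times n_2$.

Induction step: Suppose it is true for $(k - 1)$. We show that it is true for $k = (k - 1) + 1$. There are $n_1 \times n_2 \times n_3 \times \ldots \times n_{k-1}$ ways to select one element ($a_1$) out of $n_1$ choices, a second element ($a_2$) out of $n_2$ choices, and so on, up to the $(k-1)$th element ($a_{k-1}$) out of $n_{k-1}$ choices. For each of these $n_1 \times n_2 \times n_3 \times \ldots \times n_{k-1}$ possible ways, we can select the $k$th element ($a_k$) out of $n_k$ choices. Thus, the total # of ways $= (n_1 \times n_2 \times n_3 \times \ldots \times n_{k-1}) \times n_k$.

**Definition 1.3.3:** For a positive integer $n$, we define $n$ factorial as $n! = n \times (n-1) \times (n-2) \times \ldots \times 2 \times 1 = n \times (n-1)!$ and $0! = 1$.

**Definition 1.3.4:** For nonnegative integers $n \ge r$, we define the binomial coefficient (read as "$n$ choose $r$") as

$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!} = \frac{n (n-1) \cdots (n-r+1)}{1 \cdot 2 \cdot 3 \cdots r}.$$

**Note:** Most counting problems consist of drawing a fixed number of times from a set of elements (e.g., $\{1, 2, 3, 4, 5, 6\}$). To solve such problems, we need to know

(i) the size of the set, $n$;

(ii) the size of the sample, $r$;

(iii) whether the result will be ordered (i.e., is $\{1, 2\}$ different from $\{2, 1\}$); and

(iv) whether the draws are with replacement (i.e., can results like $\{1, 1\}$ occur?).

**Theorem 1.3.5:** The number of ways to draw $r$ elements from a set of $n$, if

(i) ordered, without replacement, is $\dfrac{n!}{(n-r)!}$;

(ii) ordered, with replacement, is $n^r$;

(iii) unordered, without replacement, is $\dfrac{n!}{r!\,(n-r)!} = \dbinom{n}{r}$;

(iv) unordered, with replacement, is $\dfrac{(n+r-1)!}{r!\,(n-1)!} = \dbinom{n+r-1}{r}$.

**Proof:**

(i) $n$ choices to select the 1st, $n - 1$ choices to select the 2nd, ..., $n - r + 1$ choices to select the $r$th.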
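By Theorem 1.3.2, there are

$$n \times (n-1) \times \ldots \times (n-r+1) = \frac{n \times (n-1) \times \ldots \times (n-r+1) \times (n-r)!}{(n-r)!} = \frac{n!}{(n-r)!}$$

ways to do so.

**Corollary:** The number of permutations of $n$ objects is $n!$.

(ii) $n$ choices to select the 1st, $n$ choices to select the 2nd, ..., $n$ choices to select the $r$th. By Theorem 1.3.2, there are $\underbrace{n \times n \times \ldots \times n}_{r \text{ times}} = n^r$ ways to do so.

(iii) We know from (i) above that there are $\frac{n!}{(n-r)!}$ ways to draw $r$ elements out of $n$ elements without replacement in the ordered case. However, for each unordered set of size $r$, there are $r!$ related ordered sets that consist of the same elements. Thus, there are $\frac{n!}{(n-r)!} \cdot \frac{1}{r!} = \binom{n}{r}$ ways to draw $r$ elements out of $n$ elements without replacement in the unordered case.

(iv) There is no immediate direct way to show this part. We have to come up with some extra motivation. We assume that there are $(n - 1)$ walls that separate the $n$ bins of possible outcomes and there are $r$ markers. If we shake everything, there are $(n - 1 + r)!$ permutations to arrange these $(n - 1)$ walls and $r$ markers, according to the Corollary. Since the $r$ markers are indistinguishable and the $(n - 1)$ walls are also indistinguishable, we have to divide the number of permutations by $r!$ to get rid of identical permutations where only the markers are changed, and by $(n - 1)!$ to get rid of identical permutations where only the walls are changed. Thus, there are $\frac{(n-1+r)!}{r!\,(n-1)!} = \binom{n+r-1}{r}$ ways to draw $r$ elements out of $n$ elements with replacement in the unordered case.

**Theorem 1.3.6: The Binomial Theorem.** If $n$ is a non-negative integer, then

$$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r.$$

**Proof:** By induction.

Induction base:

$n = 0$: $1 = (1 + x)^0 = \sum_{r=0}^{0} \binom{0}{r} x^r = \binom{0}{0} x^0 = 1$

$n = 1$: $(1 + x)^1 = \sum_{r=0}^{1} \binom{1}{r} x^r = \binom{1}{0} x^0 + \binom{1}{1} x^1 = 1 + x$

Induction step: Suppose it is true for $k$. We show that it is true for $k + 1$.

$$\begin{aligned}
(1 + x)^{k+1} &= (1 + x)^k (1 + x) \\
&\stackrel{\text{I.B.}}{=} \Big(\sum_{r=0}^{k} \binom{k}{r} x^r\Big)(1 + x) \\
&= \sum_{r=0}^{k} \binom{k}{r} x^r + \sum_{r=0}^{k} \binom{k}{r} x^{r+1} \\
&= \binom{k}{0} x^0 + \sum_{r=1}^{k} \Big[\binom{k}{r} + \binom{k}{r-1}\Big] x^r + \binom{k}{k} x^{k+1} \\
&\stackrel{(*)}{=} \binom{k+1}{0} x^0 + \sum_{r=1}^{k} \binom{k+1}{r} x^r + \binom{k+1}{k+1} x^{k+1} \\
&= \sum_{r=0}^{k+1} \binom{k+1}{r} x^r
\end{aligned}$$

$(*)$ Here we use Theorem 1.3.8 (i). Since the proof of Theorem 1.3.8 (i) only needs algebraic transformations without using the Binomial Theorem, part (i) of Theorem 1.3.8 can be applied here.

**Corollary 1.3.7:** For a non-negative integer $n$, it holds:

(i) $\binom{n}{0} + \binom{n}{1} + \ldots + \binom{n}{n} = 2^n$

(ii) $\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \ldots + (-1)^n \binom{n}{n} = 0$

(iii) $1 \cdot \binom{n}{1} + 2 \cdot \binom{n}{2} + \ldots + n \cdot \binom{n}{n} = n\,2^{n-1}$

(iv) $1 \cdot \binom{n}{1} - 2 \cdot \binom{n}{2} + 3 \cdot \binom{n}{3} - \ldots + (-1)^{n-1}\, n \cdot \binom{n}{n} = 0$

**Proof:** Use the Binomial Theorem.

(i) Let $x = 1$. Then $2^n = (1 + 1)^n = \sum_{r=0}^{n} \binom{n}{r} 1^r = \sum_{r=0}^{n} \binom{n}{r}$.

(ii) Let $x = -1$. Then $0 = (1 + (-1))^n = \sum_{r=0}^{n} \binom{n}{r} (-1)^r$. (For $n = 0$ the sum is $\binom{0}{0} = 1$, so this requires $n \ge 1$.)

(iii) $\frac{d}{dx} (1 + x)^n = \frac{d}{dx} \sum_{r=0}^{n} \binom{n}{r} x^r \Rightarrow n (1 + x)^{n-1} = \sum_{r=1}^{n} r \cdot \binom{n}{r} x^{r-1}$. Substitute $x = 1$; then $n\,2^{n-1} = \sum_{r=1}^{n} r \cdot \binom{n}{r}$.

(iv) Substitute $x = -1$ in (iii) above; then, for $n \ge 2$, $0 = n (1 + (-1))^{n-1} = \sum_{r=1}^{n} r \cdot \binom{n}{r} (-1)^{r-1}$. The same holds with $(-1)^{r}$ in place of $(-1)^{r-1}$, since for $\sum a_i = 0$ also $\sum (-a_i) = 0$.

**Note:** A useful extension for the binomial coefficient for $n < r$ is

$$\binom{n}{r} = \frac{n (n-1) \cdots 0 \cdots (n-r+1)}{1 \cdot 2 \cdots r} = 0.$$

**Theorem 1.3.8:** For non-negative integers $n, m, r$, it holds:

(i) $\binom{n-1}{r} + \binom{n-1}{r-1} = \binom{n}{r}$

(ii) $\binom{n}{0}\binom{m}{r} + \binom{n}{1}\binom{m}{r-1} + \ldots + \binom{n}{r}\binom{m}{0} = \binom{m+n}{r}$

(iii) $\binom{0}{r} + \binom{1}{r} + \binom{2}{r} + \ldots + \binom{n}{r} = \binom{n+1}{r+1}$

**Proof:** Homework.

*Lecture 7: Wednesday 9/15/1999. Scribe: Jürgen Symanzik, Bill Morphet*

### 1.4 Conditional Probability and Independence

So far, we have computed probability based only on the information that $\Omega$ is used for a probability space $(\Omega, L, P)$. Suppose, instead, we know that event $H \in L$ has happened. What statement should we then make about the chance of an event $A \in L$?

**Definition 1.4.1:** Given $(\Omega, L, P)$ and $H \in L$, $P(H) > 0$, and $A \in L$, we define

$$P(A \mid H) = \frac{P(A \cap H)}{P(H)} = P_H(A)$$

and call this the conditional probability of $A$ given $H$.

**Note:** This is undefined if $P(H) = 0$.

**Theorem 1.4.2:** In the situation of Definition 1.4.1, $(\Omega, L, P_H)$ is a probability space.

**Proof:** If $P_H$ is a probability measure, it must satisfy Def. 1.1.7.

(i) $P(H) > 0$ and, by Def. 1.1.7 (i), $P(A \cap H) \ge 0 \Rightarrow P_H(A) = \frac{P(A \cap H)}{P(H)} \ge 0 \ \forall A \in L$.

(ii) $P_H(\Omega) = \frac{P(\Omega \cap H)}{P(H)} = \frac{P(H)}{P(H)} = 1$.

(iii) Let $\{A_n\}_{n=1}^{\infty}$ be a sequence of disjoint sets. Then

$$P_H\Big(\bigcup_{n=1}^{\infty} A_n\Big) \stackrel{\text{Def 1.4.1}}{=} \frac{P\big(\big(\bigcup_{n=1}^{\infty} A_n\big) \cap H\big)}{P(H)} = \frac{P\big(\bigcup_{n=1}^{\infty} (A_n \cap H)\big)}{P(H)} \stackrel{\text{Def 1.1.7 (iii)}}{=} \frac{\sum_{n=1}^{\infty} P(A_n \cap H)}{P(H)} \stackrel{\text{Def 1.4.1}}{=} \sum_{n=1}^{\infty} P_H(A_n)$$

**Note:** What we have done is to move to a new sample space $H$ and a new σ-field $L_H = L \cap H$ of subsets $A \cap H$ for $A \in L$. We thus have a new measurable space $(H, L_H)$ and a new probability space $(H, L_H, P_H)$.

**Note:** From Definition 1.4.1, if $A, B \in L$, $P(A) > 0$, and $P(B) > 0$, then

$$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B),$$

which generalizes to

**Theorem 1.4.3: Multiplication Rule.** If $A_1, \ldots, A_n \in L$ and $P\big(\bigcap_{j=1}^{n-1} A_j\big) > 0$, then

$$P\Big(\bigcap_{j=1}^{n} A_j\Big) = P(A_1) \cdot P(A_2 \mid A_1) \cdot P(A_3 \mid A_1 \cap A_2) \cdot \ldots \cdot P\Big(A_n \,\Big|\, \bigcap_{j=1}^{n-1} A_j\Big).$$

**Proof:** Homework.

**Definition 1.4.4:** A collection of subsets $\{A_n\}_{n=1}^{\infty}$ of $\Omega$ forms a partition of $\Omega$ if

(i) $\bigcup_{n=1}^{\infty} A_n = \Omega$, and

(ii) $A_i \cap A_j = \emptyset \ \forall i \ne j$, i.e., elements are pairwise disjoint.

**Theorem 1.4.5: Law of Total Probability.** If $\{H_j\}_{j=1}^{\infty}$ is a partition of $\Omega$ and $P(H_j) > 0 \ \forall j$,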
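then for $A \in L$,

$$P(A) = \sum_{j=1}^{\infty} P(A \cap H_j) = \sum_{j=1}^{\infty} P(H_j) P(A \mid H_j).$$

**Proof:** By the Note preceding Theorem 1.4.3, the summands on both sides are equal, so the right side of Th. 1.4.5 is true. The left side proof: The $H_j$ are disjoint $\Rightarrow$ the $A \cap H_j$ are disjoint.

$$A = A \cap \Omega \stackrel{\text{Def 1.4.4}}{=} A \cap \Big(\bigcup_{j=1}^{\infty} H_j\Big) = \bigcup_{j=1}^{\infty} (A \cap H_j)$$

$$\Rightarrow P(A) = P\Big(\bigcup_{j=1}^{\infty} (A \cap H_j)\Big) \stackrel{\text{Def 1.1.7 (iii)}}{=} \sum_{j=1}^{\infty} P(A \cap H_j)$$

**Theorem 1.4.6: Bayes' Rule.** Let $\{H_j\}_{j=1}^{\infty}$ be a partition of $\Omega$ and $P(H_j) > 0 \ \forall j$. Let $A \in L$ and $P(A) > 0$. Then

$$P(H_j \mid A) = \frac{P(H_j) P(A \mid H_j)}{\sum_{n=1}^{\infty} P(H_n) P(A \mid H_n)} \quad \forall j.$$

**Proof:**

$$P(H_j \cap A) \stackrel{\text{Def 1.4.1}}{=} P(A) \cdot P(H_j \mid A) = P(H_j) \cdot P(A \mid H_j)$$

$$\Rightarrow P(H_j \mid A) = \frac{P(H_j) \cdot P(A \mid H_j)}{P(A)} \stackrel{\text{Th 1.4.5}}{=} \frac{P(H_j) \cdot P(A \mid H_j)}{\sum_{n=1}^{\infty} P(H_n) P(A \mid H_n)}$$

**Definition 1.4.7:** For $A, B \in L$, $A$ and $B$ are independent iff $P(A \cap B) = P(A) P(B)$.

**Note:**

- There are no restrictions on $P(A)$ or $P(B)$.
- If $A$ and $B$ are independent, then $P(A \mid B) = P(A)$ (given that $P(B) > 0$) and $P(B \mid A) = P(B)$ (given that $P(A) > 0$).
- If $A$ and $B$ are independent, then the following events are independent as well: $A$ and $B^C$; $A^C$ and $B$; $A^C$ and $B^C$.

**Definition 1.4.8:** Let $\mathcal{A}$ be a collection of $L$-sets. The events of $\mathcal{A}$ are pairwise independent iff for every distinct $A_1, A_2 \in \mathcal{A}$ it holds that $P(A_1 \cap A_2) = P(A_1) P(A_2)$.

**Definition 1.4.9:** Let $\mathcal{A}$ be a collection of $L$-sets. The events of $\mathcal{A}$ are mutually independent (or completely independent) iff for every finite subcollection $\{A_{i_1}, \ldots, A_{i_k}\}$, $A_{i_j} \in \mathcal{A}$, it holds that

$$P\Big(\bigcap_{j=1}^{k} A_{i_j}\Big) = \prod_{j=1}^{k} P(A_{i_j}).$$

**Note:** To check for mutual independence of $n$ events $\{A_1, \ldots, A_n\} \in L$, there are $2^n - n - 1$ relations (i.e., all subcollections of size 2 or more) to check.

**Example 1.4.10:** Flip a fair coin twice. $\Omega = \{HH, HT, TH, TT\}$.

$A_1$ = "H on 1st toss", $A_2$ = "H on 2nd toss", $A_3$ = "Exactly one H".

Obviously, $P(A_1) = P(A_2) = P(A_3) = \frac{1}{2}$.

Question: Are $A_1$, $A_2$, and $A_3$ pairwise independent and also mutually independent?

$P(A_1 \cap A_2) = .25 = .5 \cdot .5 = P(A_1) \cdot P(A_2) \Rightarrow A_1, A_2$ are independent.

$P(A_1 \cap A_3) = .25 = .5 \cdot .5 = P(A_1) \cdot P(A_3) \Rightarrow A_1, A_3$ are independent.

$P(A_2 \cap A_3) = .25 = .5 \cdot .5 = P(A_2) \cdot P(A_3) \Rightarrow A_2, A_3$ are independent.

Thus, $A_1, A_2, A_3$ are pairwise independent.

$P(A_1 \cap A_2 \cap A_3) = 0 \ne .5 \cdot .5 \cdot .5 = P(A_1) \cdot P(A_2) \cdot P(A_3) \Rightarrow A_1, A_2, A_3$ are not mutually independent.

*Lecture 8: Friday 9/17/1999. Scribe: Jürgen Symanzik, Rich Madsen*

**Example 1.4.11** (from Rohatgi, Example 5):

- $r$ students, 365 possible birthdays for each student that are equally likely.
- One student at a time is asked for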
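  his/her birthday.
- If one of the other students hears this birthday and it matches his/her birthday, this other student has to raise his/her hand; if at least one other student raises his/her hand, the procedure is over.
- We are interested in $p_k = P(\text{procedure terminates at the } k\text{th student}) = P(\text{a hand is first raised when the } k\text{th student is asked for his/her birthday})$.

It is

$$p_1 = P(\text{at least 1 other from the } (r-1) \text{ students has a birthday on this particular day}) = 1 - P(\text{all } (r-1) \text{ students have a birthday on the remaining 364 out of 365 days}) = 1 - \Big(\frac{364}{365}\Big)^{r-1}$$

$p_2 = P(\text{no student has a birthday matching the first student and at least one of the other } (r-2) \text{ students has a b-day matching the second student})$

Let $A \equiv$ No student has a b-day matching the 1st student.

Let $B \equiv$ At least one of the other $(r - 2)$ has a b-day matching the 2nd.

So

$$\begin{aligned}
p_2 &= P(A \cap B) = P(A) \cdot P(B \mid A) \\
&= P(\text{no student has a matching b-day with the 1st student}) \times P(\text{at least one of the remaining students has a matching b-day with the second, given that no one matched the first}) \\
&= (1 - p_1)\,\big[1 - P(\text{all } (r-2) \text{ students have a b-day on the remaining 363 out of 364 days})\big] \\
&= \Big(\frac{364}{365}\Big)^{r-1} \Big[1 - \Big(\frac{363}{364}\Big)^{r-2}\Big]
\end{aligned}$$

Working backwards from the book:

$$p_2 = \frac{{}_{365}P_{2-1}}{365^{2-1}} \Big(1 - \frac{2-1}{365}\Big)^{r-2+1} \Big[1 - \Big(\frac{365-2}{365-2+1}\Big)^{r-2}\Big] = \frac{365}{365} \Big(1 - \frac{1}{365}\Big)^{r-1} \Big[1 - \Big(\frac{363}{364}\Big)^{r-2}\Big] = \Big(\frac{364}{365}\Big)^{r-1} \Big[1 - \Big(\frac{363}{364}\Big)^{r-2}\Big]$$

Same as what we found before.

$p_3 = P(\text{no one has the same b-day as the first, and no one the same as the second, and at least one of the remaining } (r-3) \text{ has a matching b-day with the 3rd student})$

Let $A \equiv$ No one has the same b-day as the first student.

Let $B \equiv$ No one has the same b-day as the second student.

Let $C \equiv$ At least one of the other $(r - 3)$ has the same b-day as the third student.

Now

$$\begin{aligned}
p_3 &= P(A \cap B \cap C) = P(A) \cdot P(B \mid A) \cdot P(C \mid A \cap B) \\
&= \Big(\frac{364}{365}\Big)^{r-1} \Big(\frac{363}{364}\Big)^{r-2} \big[1 - P(\text{all } (r-3) \text{ students have a b-day on the remaining 362 out of 363 days})\big] \\
&= \frac{364^{r-1}}{365^{r-1}} \cdot \frac{363^{r-2}}{364^{r-2}} \Big[1 - \Big(\frac{362}{363}\Big)^{r-3}\Big] \\
&= \frac{364 \cdot 363^{r-2}}{365^{r-1}} \Big[1 - \Big(\frac{362}{363}\Big)^{r-3}\Big]
\end{aligned}$$

Working backwards from the book:

$$p_3 = \frac{{}_{365}P_{3-1}}{365^{3-1}} \Big(1 - \frac{3-1}{365}\Big)^{r-3+1} \Big[1 - \Big(\frac{365-3}{365-3+1}\Big)^{r-3}\Big] = \frac{365 \cdot 364}{365^2} \Big(\frac{363}{365}\Big)^{r-2} \Big[1 - \Big(\frac{362}{363}\Big)^{r-3}\Big] = \frac{364 \cdot 363^{r-2}}{365^{r-1}} \Big[1 - \Big(\frac{362}{363}\Big)^{r-3}\Big]$$

Same as what we found before.

For general $p_k$ and restrictions on $r$ and $k$, see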
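Homework.

## 2 Random Variables

### 2.1 Measurable Functions

**Definition 2.1.1:**

- A random variable (rv) is a set function from $\Omega$ to $\mathbb{R}$.
- More formally: Let $(\Omega, L, P)$ be any probability space. Suppose $X : \Omega \to \mathbb{R}$ and that $X$ is a measurable function; then we call $X$ a random variable.
- More generally: If $X : \Omega \to \mathbb{R}^k$, we call $X$ a random vector, $X = (X_1(\omega), X_2(\omega), \ldots, X_k(\omega))$.

What does it mean to say that a function is measurable?

**Definition 2.1.2:** Suppose $(\Omega, L)$ and $(S, \mathcal{B})$ are two measurable spaces and $X : \Omega \to S$ is a mapping from $\Omega$ to $S$. We say that $X$ is measurable $L$-$\mathcal{B}$ if $X^{-1}(B) \in L$ for every set $B \in \mathcal{B}$, where $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$.

**Example 2.1.3:** Record the opinion of 50 people: "yes" ($y$) or "no" ($n$).

$\Omega = \{\text{all } 2^{50} \text{ possible sequences of } y/n\}$ (HUGE!), $L = \mathcal{P}(\Omega)$.

$X : \Omega \to S = \{\text{all } 2^{50} \text{ possible sequences of 1 } (= y) \text{ and 0 } (= n)\}$, $\mathcal{B} = \mathcal{P}(S)$.

$X$ is a random vector since each element in $S$ has a corresponding element in $\Omega$; for $B \in \mathcal{B}$, $X^{-1}(B) \in L = \mathcal{P}(\Omega)$.

Consider $X : \Omega \to S = \{0, 1, 2, \ldots, 50\}$, where $X(\omega)$ = "# of $y$'s in $\omega$" is a more manageable random variable.

A simple function, which takes only finitely many values $x_1, \ldots, x_k$, is measurable iff $X^{-1}(x_i) \in L \ \forall x_i$.

Here, $X^{-1}(k) = \{\omega \in \Omega : \text{# 1's in sequence } \omega = k\}$ is a subset of $\Omega$, so it is in $L = \mathcal{P}(\Omega)$.

**Example 2.1.4:** Let $\Omega$ = "infinite fair coin tossing space", i.e., infinite sequences of H's and T's. Let $L_n$ be a σ-field for the 1st $n$ tosses. Define $L = \sigma\big(\bigcup_{n=1}^{\infty} L_n\big)$.

Let $X_n : \Omega \to \mathbb{R}$ be $X_n(\omega)$ = proportion of H's in the 1st $n$ tosses.

For each $n$, $X_n(\cdot)$ is simple (values $\{0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n}\}$) and $X_n^{-1}(\frac{k}{n}) \in L_n \ \forall k = 0, 1, \ldots, n$. Therefore, $X_n^{-1}(\frac{k}{n}) \in L$. So every random variable $X_n(\cdot)$ is measurable $L$-$\mathcal{B}$. Now we have a sequence of rv's $\{X_n\}_{n=1}^{\infty}$. We will show later that $P(\{\omega : X_n(\omega) \to \frac{1}{2}\}) = 1$, i.e., the Strong Law of Large Numbers (SLLN).

*Lecture 9: Monday 9/20/1999. Scribe: Jürgen Symanzik*

**Some Technical Points about Measurable Functions**

**2.1.5:** Suppose $(\Omega, L)$ and $(S, \mathcal{B})$ are measurable spaces and that a collection of sets $\mathcal{A}$ generates $\mathcal{B}$, i.e., $\sigma(\mathcal{A}) = \mathcal{B}$. Let $X : \Omega \to S$. If $X^{-1}(A) \in L \ \forall A \in \mathcal{A}$, then $X$ is measurable $L$-$\mathcal{B}$.

This means we only have to check measurability on a basis collection $\mathcal{A}$. The usage is: $\mathcal{B}$ on $\mathbb{R}$ is generated by $\{(-\infty, x] : x \in \mathbb{R}\}$.

**2.1.6:** If $(\Omega, L)$, $(\Omega', L')$, and $(\Omega'', L'')$ are measurable spaces and $X : \Omega \to \Omega'$ and $Y : \Omega' \to \Omega''$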
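are measurable, then the composition $(YX) : \Omega \to \Omega''$ is measurable $L$-$L''$.

**2.1.7:** If $f : \mathbb{R}^i \to \mathbb{R}^k$ is a continuous function, then $f$ is measurable $\mathcal{B}_i$-$\mathcal{B}_k$.

**2.1.8:** If $f_j : \Omega \to \mathbb{R}$, $j = 1, \ldots, k$, and $g : \mathbb{R}^k \to \mathbb{R}$ are measurable, then $g(f_1(\cdot), \ldots, f_k(\cdot))$ is measurable.

The usage is: $g$ could be sum, average, difference, product, (finite) maximums and minimums of $x_1, \ldots, x_k$, etc.

**2.1.9: Limits.** Extend the real line to $[-\infty, \infty] = \mathbb{R} \cup \{-\infty, \infty\}$. We say $f : \Omega \to \overline{\mathbb{R}}$ is measurable $L$-$\mathcal{B}$ if

(i) $f^{-1}(B) \in L \ \forall B \in \mathcal{B}$, and

(ii) $f^{-1}(-\infty), f^{-1}(\infty) \in L$ also.

**2.1.10:** Suppose $f_1, f_2, \ldots$ is a sequence of real-valued measurable functions $(\Omega, L) \to (\mathbb{R}, \mathcal{B})$. Then it holds:

(i) $\sup_n f_n$, $\inf_n f_n$, $\limsup_n f_n$, $\liminf_n f_n$ are measurable.

(ii) If $f = \lim_{n \to \infty} f_n$ exists, then $f$ is measurable.

(iii) The set $\{\omega : f_n(\omega) \text{ converges}\} \in L$.

(iv) If $f$ is any measurable function, the set $\{\omega : f_n(\omega) \to f(\omega)\} \in L$.

*Lecture 10: Wednesday 9/22/1999. Scribe: Jürgen Symanzik, Bill Morphet*

### 2.2 Probability Distribution of a Random Variable

The definition of a random variable $X : (\Omega, L) \to (S, \mathcal{B})$ makes no mention of $P$. We now introduce a probability measure on $(S, \mathcal{B})$.

**Theorem 2.2.1:** A random variable $X$ on $(\Omega, L, P)$ induces a probability measure on a space $(\mathbb{R}, \mathcal{B}, Q)$ with the probability distribution $Q$ of $X$ defined by

$$Q(B) = P(X^{-1}(B)) = P(\{\omega : X(\omega) \in B\}) \quad \forall B \in \mathcal{B}.$$

**Note:** By the definition of a random variable, $X^{-1}(B) \in L \ \forall B \in \mathcal{B}$.

**Proof:** If $X$ induces a probability measure $Q$ on $(\mathbb{R}, \mathcal{B})$, then $Q$ must satisfy the Kolmogorov Axioms of probability.

$X : (\Omega, L) \to (S, \mathcal{B})$. $X$ is a rv $\Rightarrow X^{-1}(B) = \{\omega : X(\omega) \in B\} = A \in L \ \forall B \in \mathcal{B}$.

(i) $Q(B) = P(X^{-1}(B)) = P(\{\omega : X(\omega) \in B\}) = P(A) \stackrel{\text{Def 1.1.7 (i)}}{\ge} 0 \ \forall B \in \mathcal{B}$

(ii) $Q(\mathbb{R}) = P(X^{-1}(\mathbb{R})) = P(\Omega) \stackrel{\text{Def 1.1.7 (ii)}}{=} 1$

(iii) Let $\{B_n\}_{n=1}^{\infty} \in \mathcal{B}$, $B_i \cap B_j = \emptyset \ \forall i \ne j$. Then

$$Q\Big(\bigcup_{n=1}^{\infty} B_n\Big) = P\Big(X^{-1}\Big(\bigcup_{n=1}^{\infty} B_n\Big)\Big) \stackrel{(*)}{=} P\Big(\bigcup_{n=1}^{\infty} X^{-1}(B_n)\Big) \stackrel{\text{Def 1.1.7 (iii)}}{=} \sum_{n=1}^{\infty} P(X^{-1}(B_n)) = \sum_{n=1}^{\infty} Q(B_n)$$

$(*)$ holds since $X^{-1}(\cdot)$ commutes with unions/intersections and preserves disjointedness.

**Definition 2.2.2:** A real-valued function $F$ on $(-\infty, \infty)$ that is non-decreasing, right-continuous, and satisfies $F(-\infty) = 0$, $F(\infty) = 1$ is called a cumulative distribution function (cdf) on $\mathbb{R}$.

**Note:** No mention of probability space or measure $P$ in Definition 2.2.2 above.

**Definition 2.2.3:** Let $P$ be a probability measure on $(\mathbb{R}, \mathcal{B})$. The cdf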
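associated with $P$ is

$$F(x) = F_P(x) = P((-\infty, x]) = P(\{\omega : X(\omega) \le x\}) = P(X \le x)$$

for a random variable $X$ defined on $(\mathbb{R}, \mathcal{B}, P)$.

**Note:** $F(\cdot)$ defined as in Definition 2.2.3 above indeed is a cdf.

**Proof:**

(i) Let $x_1 < x_2 \Rightarrow (-\infty, x_1] \subset (-\infty, x_2]$

$$\stackrel{\text{Th 1.2.1 (v)}}{\Rightarrow} F(x_1) = P(\{\omega : X(\omega) \le x_1\}) \le P(\{\omega : X(\omega) \le x_2\}) = F(x_2)$$

Thus, since $x_1 < x_2$ and $F(x_1) \le F(x_2)$, $F(\cdot)$ is non-decreasing.

(ii) Since $F$ is non-decreasing, it is sufficient to show that $F(\cdot)$ is right-continuous if for any sequence of numbers $x_n \to x+$ (which means that $x_n$ is approaching $x$ from the right) with $x_1 > x_2 > \ldots > x_n > \ldots > x$: $F(x_n) \to F(x)$.

Let $A_n = \{\omega : X(\omega) \in (x, x_n]\} \in L$ and $A_n \downarrow \emptyset$. None of the intervals $(x, x_n]$ contains $x$. As $x_n \to x+$, the number of points $\omega$ in $A_n$ diminishes until the set is empty. Formally,

$$\lim_{n \to \infty} A_n = \lim_{n \to \infty} \bigcap_{i=1}^{n} A_i = \bigcap_{n=1}^{\infty} A_n = \emptyset.$$

By Theorem 1.2.6 it follows that

$$\lim_{n \to \infty} P(A_n) = P\big(\lim_{n \to \infty} A_n\big) = P(\emptyset) = 0.$$

It is

$$P(A_n) = P(\{\omega : X(\omega) \le x_n\}) - P(\{\omega : X(\omega) \le x\}) = F(x_n) - F(x)$$

$$\Rightarrow \big(\lim_{n \to \infty} F(x_n)\big) - F(x) = \lim_{n \to \infty} \big(F(x_n) - F(x)\big) = \lim_{n \to \infty} P(A_n) = 0$$

$\Rightarrow \lim_{n \to \infty} F(x_n) = F(x) \Rightarrow F(x)$ is right-continuous.

(iii) $F(-n) \stackrel{\text{Def 2.2.3}}{=} P(\{\omega : X(\omega) \le -n\})$

$$\Rightarrow F(-\infty) = \lim_{n \to \infty} F(-n) = \lim_{n \to \infty} P(\{\omega : X(\omega) \le -n\}) = P\big(\lim_{n \to \infty} \{\omega : X(\omega) \le -n\}\big) = P(\emptyset) = 0$$

(iv) $F(n) \stackrel{\text{Def 2.2.3}}{=} P(\{\omega : X(\omega) \le n\})$

$$\Rightarrow F(\infty) = \lim_{n \to \infty} F(n) = \lim_{n \to \infty} P(\{\omega : X(\omega) \le n\}) = P\big(\lim_{n \to \infty} \{\omega : X(\omega) \le n\}\big) = P(\Omega) = 1$$

Note that (iii) and (iv) implicitly use Theorem 1.2.6. In (iii), we use $A_n = (-\infty, -n)$, where $A_n \supset A_{n+1}$ and $A_n \downarrow \emptyset$. In (iv), we use $A_n = (-\infty, n)$, where $A_n \subset A_{n+1}$ and $A_n \uparrow \mathbb{R}$.

**Definition 2.2.4:** If a random variable $X : \Omega \to \mathbb{R}$ has induced a probability measure $P_X$ on $(\mathbb{R}, \mathcal{B})$ with cdf $F(x)$, we say

(i) rv $X$ is continuous if $F(x)$ is continuous in $x$.

(ii) rv $X$ is discrete if $F(x)$ is a step function in $x$.

**Note:** There are rvs that are mixtures of continuous and discrete rvs. One such example is a truncated failure time distribution. We assume a continuous distribution (e.g., exponential) up to a given truncation point $x$ and assign the remaining probability to the truncation point. Thus, a single point has a probability $> 0$ and $F(x)$ jumps at the truncation point $x$.

**Definition 2.2.5:** Two random variables $X$ and $Y$ are identically distributed iff

$$P_X(X \in A) = P_Y(Y \in A) \quad \forall A \in L.$$

**Note:** Def. 2.2.5 does not mean that $X(\omega) = Y(\omega) \ \forall \omega \in \Omega$. For example, let $X$ = # H in 3 coin tosses and $Y$ = # T in 3 coin tosses. $X, Y$ are both $Bin(3, 0.5)$, i.e., identically distributed, but for $\omega = (H, H, T)$, $X(\omega) = 2 \ne 1 = Y(\omega)$, i.e., $X \ne Y$.

**Theorem 2.2.6:** The following two statements are equivalent:

(i) $X, Y$ are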
identically distributed.

(ii) $F_X(x) = F_Y(x) \ \forall x \in \mathbb{R}$.

**Proof:**

(i) $\Rightarrow$ (ii): $F_X(x) = P_X((-\infty, x]) \stackrel{\text{by Def 2.2.5}}{=} P_Y((-\infty, x]) = F_Y(x)$

(ii) $\Rightarrow$ (i): Requires extra knowledge from measure theory.

*Lecture 11: Friday 9/24/1999. Scribe: Jürgen Symanzik*

### 2.3 Discrete and Continuous Random Variables

We now extend Definition 2.2.4 to make our definitions a little bit more formal.

**Definition 2.3.1:** Let $X$ be a real-valued random variable with cdf $F$ on $(\Omega, L, P)$. $X$ is discrete if there exists a countable set $E \subset \mathbb{R}$ such that $P(X \in E) = 1$, i.e., $P(\{\omega : X(\omega) \in E\}) = 1$. The points of $E$ which have positive probability are the jump points of the step function $F$, i.e., the cdf of $X$.

Define $p_i = P(\{\omega : X(\omega) = x_i, x_i \in E\}) = P_X(X = x_i) \ \forall i \ge 1$. Then $p_i \ge 0$, $\sum_{i=1}^{\infty} p_i = 1$.

We call $\{p_i : p_i > 0\}$ the probability mass function (pmf) (also: probability frequency function) of $X$.

**Note:** Given any set of numbers $\{p_n\}_{n=1}^{\infty}$, $p_n > 0 \ \forall n \ge 1$, $\sum_{n=1}^{\infty} p_n = 1$, $\{p_n\}_{n=1}^{\infty}$ is the pmf of some rv $X$.

**Note:** The issue of continuous rv's and probability density functions (pdfs) is more complicated. A rv $X : \Omega \to \mathbb{R}$ always has a cdf $F$. Whether there exists a function $f$ such that $f$ integrates to $F$ and $F'$ exists and equals $f$ (almost everywhere) depends on something stronger than just continuity.

**Definition 2.3.2:** A real-valued function $F$ is continuous in $x_0 \in \mathbb{R}$ iff

$$\forall \epsilon > 0 \ \exists \delta > 0 \ \forall x : |x - x_0| < \delta \Rightarrow |F(x) - F(x_0)| < \epsilon.$$

$F$ is continuous iff $F$ is continuous in all $x \in \mathbb{R}$.

**Definition 2.3.3:** A real-valued function $F$ defined on $[a, b]$ is absolutely continuous on $[a, b]$ iff

$$\forall \epsilon > 0 \ \exists \delta > 0 \ \forall \text{ finite subcollections of disjoint subintervals } [a_i, b_i],\ i = 1, \ldots, n : \quad \sum_{i=1}^{n} (b_i - a_i) < \delta \Rightarrow \sum_{i=1}^{n} |F(b_i) - F(a_i)| < \epsilon.$$

**Note:** Absolute continuity implies continuity.

**Theorem 2.3.4:**

(i) If $F$ is absolutely continuous, then $F'$ exists almost everywhere.

(ii) A function $F$ is an indefinite integral iff it is absolutely continuous. Thus, every absolutely continuous function $F$ is the indefinite integral of its derivative $F'$.

**Definition 2.3.5:** Let $X$ be a random variable on $(\Omega, L, P)$ with cdf $F$. We say $X$ is a continuous rv iff $F$ is absolutely continuous. In this case, there exists a non-negative integrable function $f$, the probability density function (pdf) of $X$, such
that

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = P(X \le x).$$

From this it follows that, if $a, b \in \mathbb{R}$, $a < b$, then

$$P_X(a < X \le b) = F(b) - F(a) = \int_{a}^{b} f(t)\,dt$$

exists and is well defined.

**Theorem 2.3.6:** Let $X$ be a continuous random variable with pdf $f$. Then it holds:

(i) For every Borel set $B \in \mathcal{B}$, $P(B) = \int_B f(t)\,dt$.

(ii) If $F$ is absolutely continuous and $f$ is continuous at $x$, then $F'(x) = \frac{dF(x)}{dx} = f(x)$.

**Proof:** Part (i): From Definition 2.3.5 above. Part (ii): By the Fundamental Theorem of Calculus.

**Note:** As already stated in the Note following Definition 2.2.4, not every rv will fall into one of these two (or, if you prefer, three, i.e., discrete, continuous/absolutely continuous) classes. However, most rvs which arise in practice will. We look at one example that is unlikely to occur in practice in the next Homework assignment.

However, note that every cdf $F$ can be written as

$$F(x) = a F_d(x) + (1 - a) F_c(x), \quad 0 \le a \le 1,$$

where $F_d$ is the cdf of a discrete rv and $F_c$ is a continuous (but not necessarily absolutely continuous) cdf.

Some authors, such as Marek Fisz, *Wahrscheinlichkeitsrechnung und mathematische Statistik*, VEB Deutscher Verlag der Wissenschaften, Berlin, 1989, are even more specific. There it is stated that every cdf $F$ can be written as

$$F(x) = a_1 F_d(x) + a_2 F_c(x) + a_3 F_s(x), \quad a_1, a_2, a_3 \ge 0, \ a_1 + a_2 + a_3 = 1.$$

Here, $F_d(x)$ and $F_c(x)$ are discrete and continuous cdfs (as above). $F_s(x)$ is called a singular cdf. Singular means that $F_s(x)$ is continuous and its derivative $F_s'(x)$ equals 0 almost everywhere (i.e., everywhere but in those points that belong to a Borel-measurable set of probability 0).

Question: Does "continuous" but "not absolutely continuous" mean "singular"? We will (hopefully) see later.

*Lecture 12: Monday 9/27/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir*

**Example 2.3.7:** Consider

$$F(x) = \begin{cases} 0, & x < 0 \\ 1/2, & x = 0 \\ 1/2 + x/2, & 0 < x < 1 \\ 1, & x \ge 1 \end{cases}$$

We can write $F(x)$ as $a F_d(x) + (1 - a) F_c(x)$, $0 \le a \le 1$. How?

Since $F(x)$ has only one jump at $x = 0$, it is reasonable to get started with a pmf $p_0 = 1$ and corresponding cdf

$$F_d(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}$$

Since $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x \ge 1$, it must clearly hold that $F_c(x) = 0$ for $x < 0$ and $F_c(x) = 1$ for $x \ge 1$. In addition, $F(x)$ increases linearly in $0 < x < 1$. A good guess would be a pdf $f_c(x) = 1 \cdot I_{(0,1)}(x)$ and corresponding
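cdf
$$F_c(x) = \begin{cases} 0, & x \le 0 \\ x, & 0 < x < 1 \\ 1, & x \ge 1 \end{cases}$$
Knowing that $F(0) = 1/2$, we have at least to multiply $F_d(x)$ by $1/2$. And, indeed, $F(x)$ can be written as
$$F(x) = \tfrac{1}{2} F_d(x) + \tfrac{1}{2} F_c(x).$$

Definition 2.3.8: The two-valued function $I_A(x)$ is called indicator function and it is defined as follows: $I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$ if $x \notin A$, for any set $A$.

An Excursion into Logic

When proving theorems we only used direct methods so far. We used induction proofs to show that something holds for arbitrary $n$. To show that a statement $A$ implies a statement $B$, i.e., $A \Rightarrow B$, we used proofs of the type $A \Rightarrow A_1 \Rightarrow A_2 \Rightarrow \ldots \Rightarrow A_{n-1} \Rightarrow A_n \Rightarrow B$, where one step directly follows from the previous step. However, there are different approaches to obtain the same result.

$A \Rightarrow B$ is equivalent to $\neg B \Rightarrow \neg A$ is equivalent to $\neg A \vee B$:
$$\begin{array}{cc|c|cc|c|c}
A & B & A \Rightarrow B & \neg A & \neg B & \neg B \Rightarrow \neg A & \neg A \vee B \\ \hline
1 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 1
\end{array}$$

$A \Leftrightarrow B$ is equivalent to $(A \Rightarrow B) \wedge (B \Rightarrow A)$ is equivalent to $(\neg A \vee B) \wedge (A \vee \neg B)$:
$$\begin{array}{cc|c|cc|c|cc|c}
A & B & A \Leftrightarrow B & A \Rightarrow B & B \Rightarrow A & (A \Rightarrow B) \wedge (B \Rightarrow A) & \neg A \vee B & A \vee \neg B & (\neg A \vee B) \wedge (A \vee \neg B) \\ \hline
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{array}$$

Negations of Quantifiers:
$\neg \forall x \in X : B(x)$ is equivalent to $\exists x \in X : \neg B(x)$.
$\neg \exists x \in X : B(x)$ is equivalent to $\forall x \in X : \neg B(x)$.
$\exists x \in X \;\forall y \in Y : B(x, y)$ implies $\forall y \in Y \;\exists x \in X : B(x, y)$.

Lecture 13: Wednesday 9/29/1999. Scribe: Jürgen Symanzik, Rich Madsen

2.4 Transformations of Random Variables

Let $X$ be a real-valued random variable on $(\Omega, L, P)$, i.e., $X : (\Omega, L) \to (\mathbb{R}, \mathcal{B})$. Let $g$ be any Borel-measurable real-valued function on $\mathbb{R}$. Then, by statement 2.1.6, $Y = g(X)$ is a random variable.

Theorem 2.4.1: Given a random variable $X$ with known induced distribution and a Borel-measurable function $g$, then the distribution of the random variable $Y = g(X)$ is determined.

Proof:
$F_Y(y) = P_Y(Y \le y) = P(\{\omega : g(X(\omega)) \le y\}) = P(\{\omega : X(\omega) \in B_y\})$, where $B_y = g^{-1}((-\infty, y]) \in \mathcal{B}$ since $g$ is Borel-measurable, $= P(X^{-1}(B_y))$.

Note: From now on, we restrict ourselves to real-valued (vector-valued) functions that are Borel-measurable, i.e., measurable with respect to $(\mathbb{R}, \mathcal{B})$ or $(\mathbb{R}^k, \mathcal{B}^k)$. More generally, $P_Y(Y \in C) = P_X(X \in g^{-1}(C)) \;\forall C \in \mathcal{B}$.

Example 2.4.2: Suppose $X$ is a discrete random variable. Let $A$ be a countable set such that P(X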
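∈ A) = 1 and $P(X = x) > 0 \;\forall x \in A$. Let $Y = g(X)$. Obviously, the sample space of $Y$ is also countable. Then
$$P_Y(Y = y) = \sum_{x \in g^{-1}(\{y\})} P_X(X = x) = \sum_{\{x \,:\, g(x) = y\}} P_X(X = x) \quad \forall y \in g(A).$$

Example 2.4.3: $X \sim U(-1, 1)$, so the pdf of $X$ is $f_X(x) = \frac{1}{2} I_{[-1,1]}(x)$, which, according to Definition 2.3.8, reads as $f_X(x) = 1/2$ for $-1 \le x \le 1$ and 0 otherwise.

Let $Y = X^+ = X$ if $X \ge 0$ and $0$ otherwise. Then
$$F_Y(y) = P_Y(Y \le y) = \begin{cases} 0, & y < 0 \\ 1/2, & y = 0 \\ 1/2 + y/2, & 0 < y < 1 \\ 1, & y \ge 1 \end{cases}$$
This is the mixed discrete/continuous distribution from Example 2.3.7.

Note: We need to put some conditions on $g$ to ensure $g(X)$ is continuous if $X$ is continuous and avoid cases as in Example 2.4.3 above.

Definition 2.4.4: For a random variable $X$ from $(\Omega, L, P)$ to $(\mathbb{R}, \mathcal{B})$, the support of $X$ (or $P$) is any set $A \in L$ for which $P(A) = 1$. For a continuous random variable $X$ with pdf $f_X$, we can think of the support of $X$ as $\mathcal{X} = X^{-1}(\{x : f_X(x) > 0\})$.

Definition 2.4.5: Let $f$ be a real-valued function defined on $D \subseteq \mathbb{R}$, $D \in \mathcal{B}$. We say:
$f$ is (strictly) non-decreasing if $x < y \Rightarrow f(x) \,(<)\, \le f(y) \;\forall x, y \in D$;
$f$ is (strictly) non-increasing if $x < y \Rightarrow f(x) \,(>)\, \ge f(y) \;\forall x, y \in D$;
$f$ is monotonic on $D$ if $f$ is either increasing or decreasing, and we write $f \uparrow$ or $f \downarrow$.

Theorem 2.4.6: Let $X$ be a continuous rv with pdf $f_X$ and support $\mathcal{X}$. Let $y = g(x)$ be differentiable for all $x$ and either (i) $g'(x) > 0$ or (ii) $g'(x) < 0$ for all $x$. Then $Y = g(X)$ is also a continuous rv with pdf
$$f_Y(y) = f_X(g^{-1}(y)) \cdot \left| \frac{d}{dy} g^{-1}(y) \right| \cdot I_{g(\mathcal{X})}(y).$$

Proof:
Part (i): $g'(x) > 0 \;\forall x \in \mathcal{X}$. So $g$ is strictly increasing and continuous. Therefore, $x = g^{-1}(y)$ exists and it is also strictly increasing and also differentiable. Then, from Rohatgi, page 9, Theorem 15:
$$\frac{d}{dy} g^{-1}(y) = \left( \frac{d}{dx} g(x) \Big|_{x = g^{-1}(y)} \right)^{-1} > 0.$$
We get $F_Y(y) = P_Y(Y \le y) = P_Y(g(X) \le y) = P_X(X \le g^{-1}(y)) = F_X(g^{-1}(y))$ for $y \in g(\mathcal{X})$ and, by differentiation (chain rule),
$$f_Y(y) = F_Y'(y) = \frac{d}{dy} F_X(g^{-1}(y)) = f_X(g^{-1}(y)) \cdot \frac{d}{dy} g^{-1}(y).$$

Part (ii): $g'(x) < 0 \;\forall x \in \mathcal{X}$. So $g$ is strictly decreasing and continuous. Therefore, $x = g^{-1}(y)$ exists and it is also strictly decreasing and also differentiable. Then, from Rohatgi, page 9, Theorem 15:
$$\frac{d}{dy} g^{-1}(y) = \left( \frac{d}{dx} g(x) \Big|_{x = g^{-1}(y)} \right)^{-1} < 0.$$
We get $F_Y(y) = P_Y(Y \le y) = P_Y(g(X) \le y) = P_X(X \ge g^{-1}(y)) = 1 - P_X(X \le g^{-1}(y)) = 1 - F_X(g^{-1}(y))$ for $y \in g(\mathcal{X})$ and, by differentiation (chain rule),
$$f_Y(y) = F_Y'(y) = \frac{d}{dy} \left( 1 - F_X(g^{-1}(y)) \right) = - f_X(g^{-1}(y)) \cdot \frac{d}{dy} g^{-1}(y).$$
Since $\frac{d}{dy} g^{-1}(y) < 0$, the negative sign will cancel out, always giving us a positive value. Hence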
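the need for the absolute value signs.

Combining parts (i) and (ii), we can therefore write
$$f_Y(y) = f_X(g^{-1}(y)) \cdot \left| \frac{d}{dy} g^{-1}(y) \right| \cdot I_{g(\mathcal{X})}(y).$$

Lectures 14 & 15: Friday 10/1/1999 & Monday 10/4/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir

Note: In Theorem 2.4.6, we can also write
$$f_Y(y) = \frac{f_X(x)}{\left| \frac{dg(x)}{dx} \right|} \Bigg|_{x = g^{-1}(y)}, \quad y \in g(\mathcal{X}).$$
If $g$ is monotonic over disjoint intervals, we can also get an expression for the pdf/cdf of $Y = g(X)$ as stated in the following theorem:

Theorem 2.4.7: Let $Y = g(X)$ where $X$ is a rv with pdf $f_X(x)$ on support $\mathcal{X}$. Suppose there exists a partition $A_0, A_1, \ldots, A_k$ of $\mathcal{X}$ such that $P(X \in A_0) = 0$ and $f_X(x)$ is continuous on each $A_i$. Suppose there exist functions $g_1(x), \ldots, g_k(x)$ defined on $A_1$ through $A_k$, respectively, satisfying
(i) $g(x) = g_i(x) \;\forall x \in A_i$,
(ii) $g_i(x)$ is monotonic on $A_i$,
(iii) the set $\mathcal{Y} = g_i(A_i) = \{y : y = g_i(x) \text{ for some } x \in A_i\}$ is the same for each $i = 1, \ldots, k$, and
(iv) $g_i^{-1}(y)$ has a continuous derivative on $\mathcal{Y}$ for each $i = 1, \ldots, k$.
Then
$$f_Y(y) = \sum_{i=1}^{k} f_X(g_i^{-1}(y)) \cdot \left| \frac{d}{dy} g_i^{-1}(y) \right| \cdot I_{\mathcal{Y}}(y).$$

Note: Rohatgi, page 73, Theorem 4, removes condition (iii) by defining $n = n(y)$ and $x_1(y), \ldots, x_n(y)$.

Example 2.4.8: Let $X$ be a rv with pdf $f_X(x) = \frac{2x}{\pi^2} \cdot I_{(0,\pi)}(x)$. Let $Y = \sin(X)$. What is $f_Y(y)$?

Since $\sin$ is not monotonic on $(0, \pi)$, Theorem 2.4.6 cannot be used to determine the pdf of $Y$. Two possible approaches:

Method 1: cdfs.
For $0 < y < 1$ we have
$$F_Y(y) = P_Y(Y \le y) = P_X(\sin X \le y) = P_X([0 \le X \le \sin^{-1}(y)] \text{ or } [\pi - \sin^{-1}(y) \le X \le \pi]) = F_X(\sin^{-1}(y)) + (1 - F_X(\pi - \sin^{-1}(y))),$$
since $[0 \le X \le \sin^{-1}(y)]$ and $[\pi - \sin^{-1}(y) \le X \le \pi]$ are disjoint sets. Then
$$f_Y(y) = F_Y'(y) = f_X(\sin^{-1}(y)) \frac{1}{\sqrt{1 - y^2}} + (-1) f_X(\pi - \sin^{-1}(y)) \frac{-1}{\sqrt{1 - y^2}} = \frac{1}{\sqrt{1 - y^2}} \left( f_X(\sin^{-1}(y)) + f_X(\pi - \sin^{-1}(y)) \right)$$
$$= \frac{1}{\sqrt{1 - y^2}} \left( \frac{2 \sin^{-1}(y)}{\pi^2} + \frac{2 (\pi - \sin^{-1}(y))}{\pi^2} \right) = \frac{1}{\sqrt{1 - y^2}} \cdot \frac{2\pi}{\pi^2} = \frac{2}{\pi \sqrt{1 - y^2}} \cdot I_{(0,1)}(y).$$

Method 2: Use of Theorem 2.4.7.
Let $A_1 = (0, \frac{\pi}{2})$, $A_2 = (\frac{\pi}{2}, \pi)$, and $A_0 = \{\frac{\pi}{2}\}$. Let $g_1^{-1}(y) = \sin^{-1}(y)$ and $g_2^{-1}(y) = \pi - \sin^{-1}(y)$. It is $\frac{d}{dy} g_1^{-1}(y) = \frac{1}{\sqrt{1 - y^2}} = - \frac{d}{dy} g_2^{-1}(y)$ and $\mathcal{Y} = (0, 1)$. Thus, by use of Theorem 2.4.7, we get
$$f_Y(y) = \sum_{i=1}^{2} f_X(g_i^{-1}(y)) \cdot \left| \frac{d}{dy} g_i^{-1}(y) \right| \cdot I_{\mathcal{Y}}(y) = \left( \frac{2 \sin^{-1}(y)}{\pi^2} \frac{1}{\sqrt{1 - y^2}} + \frac{2 (\pi - \sin^{-1}(y))}{\pi^2} \frac{1}{\sqrt{1 - y^2}} \right) I_{(0,1)}(y) = \frac{2}{\pi \sqrt{1 - y^2}} \, I_{(0,1)}(y).$$
Obviously, both results are identical.

Theorem 2.4.9: Let $X$ be a rv with a continuous cdf $F_X(x)$ and let $Y = F_X(X)$. Then $Y \sim U(0, 1)$.

Proof: We have to consider two possible cases:
(a) $F_X$ is strictly increasing, i.e., $F_X(x_1) < F_X(x_2)$ for $x_1 < x_2$, and
(b) $F_X$ is non-decreasing, i.e., there exist $x_1 < x_2$ with $F_X(x_1) = F_X(x_2)$. Assume that $x_1$ is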
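the infimum and $x_2$ the supremum of those values for which $F_X(x_1) = F_X(x_2)$ holds.

In (a), $F_X^{-1}(y)$ is uniquely defined. In (b), we define $F_X^{-1}(y) = \inf\{x : F_X(x) \ge y\}$.

Without loss of generality: $F_X^{-1}(1) = +\infty$ if $F_X(x) < 1 \;\forall x \in \mathbb{R}$, and $F_X^{-1}(0) = -\infty$ if $F_X(x) > 0 \;\forall x \in \mathbb{R}$.

For $Y = F_X(X)$ and $0 < y < 1$, we have
$$P(Y \le y) = P(F_X(X) \le y) \overset{F_X^{-1} \uparrow}{=} P(F_X^{-1}(F_X(X)) \le F_X^{-1}(y)) \overset{(*)}{=} P(X \le F_X^{-1}(y)) = F_X(F_X^{-1}(y)) = y.$$
At the endpoints, we have $P(Y \le y) = 1$ if $y \ge 1$ and $P(Y \le y) = 0$ if $y \le 0$.

But why is $(*)$ true? In (a), if $F_X$ is strictly increasing and continuous, it is certainly $x = F_X^{-1}(F_X(x))$. In (b), if $F_X(x_1) = F_X(x_2)$ for $x_1 < x < x_2$, it may be that $F_X^{-1}(F_X(x)) \ne x$. But by definition, $F_X^{-1}(F_X(x)) = x_1 \;\forall x \in [x_1, x_2]$. $(*)$ holds since on $[x_1, x_2]$ it is $P(X \le x) = P(X \le x_1) \;\forall x \in [x_1, x_2]$. The flat cdf denotes $F_X(x_2) - F_X(x_1) = P(x_1 < X \le x_2) = 0$ by definition.

Lecture 16: Wednesday 10/6/1999. Scribe: Jürgen Symanzik

3 Moments and Generating Functions

3.1 Expectation

Definition 3.1.1: Let $X$ be a real-valued rv with cdf $F_X$ and pdf $f_X$ if $X$ is continuous (or pmf $f_X$ and support $\mathcal{X}$ if $X$ is discrete). The expected value (mean) of a measurable function $g(\cdot)$ of $X$ is
$$E(g(X)) = \begin{cases} \int_{-\infty}^{\infty} g(x) f_X(x)\,dx, & \text{if } X \text{ is continuous} \\ \sum_{x \in \mathcal{X}} g(x) f_X(x), & \text{if } X \text{ is discrete} \end{cases}$$
if $E(|g(X)|) < \infty$; otherwise $E(g(X))$ is undefined, i.e., it does not exist.

Example: $X \sim$ Cauchy, $f_X(x) = \frac{1}{\pi (1 + x^2)}$, $-\infty < x < \infty$:
$$E(|X|) = \frac{2}{\pi} \int_0^{\infty} \frac{x}{1 + x^2}\,dx = \frac{1}{\pi} \left[ \log(1 + x^2) \right]_0^{\infty} = \infty.$$
So $E(X)$ does not exist for the Cauchy distribution.

Theorem 3.1.2: If $E(X)$ exists and $a$ and $b$ are finite constants, then $E(aX + b)$ exists and equals $aE(X) + b$.

Proof: Continuous case only.
Existence:
$$E(|aX + b|) = \int_{-\infty}^{\infty} |ax + b| f_X(x)\,dx \le \int_{-\infty}^{\infty} (|a| \cdot |x| + |b|) f_X(x)\,dx = |a| \int_{-\infty}^{\infty} |x| f_X(x)\,dx + |b| \int_{-\infty}^{\infty} f_X(x)\,dx = |a| E(|X|) + |b| < \infty.$$
Numerical Result:
$$E(aX + b) = \int_{-\infty}^{\infty} (ax + b) f_X(x)\,dx = a \int_{-\infty}^{\infty} x f_X(x)\,dx + b \int_{-\infty}^{\infty} f_X(x)\,dx = aE(X) + b.$$

Theorem 3.1.3: If $X$ is bounded, i.e., there exists an $M$, $0 < M < \infty$, such that $P(|X| < M) = 1$, then $E(X)$ exists.

Definition 3.1.4: The $k$th moment of $X$, if it exists, is $m_k = E(X^k)$. The $k$th central moment of $X$, if it exists, is $\mu_k = E((X - E(X))^k)$.

Definition 3.1.5: The variance of $X$, if it exists, is the second central moment of $X$, i.e., $Var(X) = E((X - E(X))^2)$.

Theorem 3.1.6: $Var(X) = E(X^2) - (E(X))^2$.

Proof:
$$Var(X) = E((X - E(X))^2) = E(X^2 - 2XE(X) + (E(X))^2) = E(X^2) - 2E(X)E(X) + (E(X))^2 = E(X^2) - (E(X))^2.$$

Theorem 3.1.7: If $Var(X)$ exists and $a$ and $b$ are finite constants, then $Var(aX + b)$ exists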
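and equals $a^2 Var(X)$.

Proof: Existence & Numerical Result:
$Var(aX + b) = E(((aX + b) - E(aX + b))^2)$ exists if $E(|((aX + b) - E(aX + b))^2|)$ exists. It holds that
$$E(|((aX + b) - E(aX + b))^2|) = E(((aX + b) - E(aX + b))^2) = Var(aX + b)$$
$$= E((aX + b)^2) - (E(aX + b))^2 = E(a^2 X^2 + 2abX + b^2) - (aE(X) + b)^2$$
$$= a^2 E(X^2) + 2abE(X) + b^2 - a^2 (E(X))^2 - 2abE(X) - b^2 = a^2 (E(X^2) - (E(X))^2) = a^2 Var(X) < \infty,$$
since $Var(X)$ exists.

Lecture 17: Friday 10/8/1999. Scribe: Jürgen Symanzik

Theorem 3.1.8: If the $t$th moment of a rv $X$ exists, then all moments of order $0 < s < t$ exist.

Proof: Continuous case only:
$$E(|X|^s) = \int_{|x| \le 1} |x|^s f_X(x)\,dx + \int_{|x| > 1} |x|^s f_X(x)\,dx \le \int_{|x| \le 1} 1 \cdot f_X(x)\,dx + \int_{|x| > 1} |x|^t f_X(x)\,dx \le P(|X| \le 1) + E(|X|^t) < \infty.$$

Theorem 3.1.9: If the $t$th moment of a rv $X$ exists, then $\lim_{n \to \infty} n^t P(|X| > n) = 0$.

Proof: Continuous case only:
$$\infty > \int_{\mathbb{R}} |x|^t f_X(x)\,dx = \lim_{n \to \infty} \int_{|x| \le n} |x|^t f_X(x)\,dx \;\Longrightarrow\; \lim_{n \to \infty} \int_{|x| > n} |x|^t f_X(x)\,dx = 0.$$
But
$$\lim_{n \to \infty} \int_{|x| > n} |x|^t f_X(x)\,dx \ge \lim_{n \to \infty} n^t \int_{|x| > n} f_X(x)\,dx = \lim_{n \to \infty} n^t P(|X| > n) \ge 0,$$
so $\lim_{n \to \infty} n^t P(|X| > n) = 0$.

The inverse is not necessarily true, i.e., if $\lim_{n \to \infty} n^t P(|X| > n) = 0$, then the $t$th moment of a rv $X$ does not necessarily exist. We can only approach $t$ up to some $\delta > 0$, as the following Theorem 3.1.10 indicates.

Lecture 18: Monday 10/11/1999. Scribe: Jürgen Symanzik, Bill Morphet

Theorem 3.1.10: Let $X$ be a rv with a distribution such that $\lim_{n \to \infty} n^t P(|X| > n) = 0$ for some $t > 0$. Then $E(|X|^s) < \infty \;\forall\, 0 < s < t$.

Note: To prove this Theorem, we need Lemma 3.1.11 and Corollary 3.1.12.

Lemma 3.1.11: Let $X$ be a non-negative rv with cdf $F$. Then
$$E(X) = \int_0^{\infty} (1 - F_X(x))\,dx$$
(if either side exists).

Proof: Continuous case only:
To prove that the left side implies that the right side is finite and both sides are identical, we assume that $E(X)$ exists. It is
$$E(X) = \int_0^{\infty} x f_X(x)\,dx = \lim_{n \to \infty} \int_0^n x f_X(x)\,dx.$$
Replace the expression for the right side integral using integration by parts. Let $u = x$ and $dv = f_X(x)\,dx$, then
$$\int_0^n x f_X(x)\,dx = \left[ x F_X(x) \right]_0^n - \int_0^n F_X(x)\,dx = n F_X(n) - 0 \cdot F_X(0) - \int_0^n F_X(x)\,dx$$
$$= n F_X(n) - n + n - \int_0^n F_X(x)\,dx = n F_X(n) - n + \int_0^n [1 - F_X(x)]\,dx = -n[1 - F_X(n)] + \int_0^n [1 - F_X(x)]\,dx$$
$$= -n P(X > n) + \int_0^n [1 - F_X(x)]\,dx \overset{X \ge 0}{=} -n^1 P(|X| > n) + \int_0^n [1 - F_X(x)]\,dx.$$
$$\Longrightarrow E(X) = \lim_{n \to \infty} \left[ -n^1 P(|X| > n) + \int_0^n [1 - F_X(x)]\,dx \right] \overset{Th.\,3.1.9}{=} 0 + \lim_{n \to \infty} \int_0^n [1 - F_X(x)]\,dx = \int_0^{\infty} [1 - F_X(x)]\,dx.$$
Thus, the existence of $E(X)$ implies that $\int_0^{\infty} [1 - F_X(x)]\,dx$ is finite and that both sides are identical.

We still have to show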
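the converse implication: If $\int_0^{\infty} [1 - F_X(x)]\,dx$ is finite, then $E(X)$ exists, i.e., $E(|X|) = E(X) < \infty$, and both sides are identical. It is
$$\int_0^n x f_X(x)\,dx \overset{X \ge 0}{=} \int_0^n |x| f_X(x)\,dx = -n[1 - F_X(n)] + \int_0^n [1 - F_X(x)]\,dx$$
as seen above. Since $-n[1 - F_X(n)] \le 0$, we get
$$\int_0^n |x| f_X(x)\,dx \le \int_0^n [1 - F_X(x)]\,dx \le \int_0^{\infty} [1 - F_X(x)]\,dx < \infty \quad \forall n.$$
Thus,
$$\lim_{n \to \infty} \int_0^n |x| f_X(x)\,dx = \int_0^{\infty} |x| f_X(x)\,dx \le \int_0^{\infty} [1 - F_X(x)]\,dx < \infty$$
$\Longrightarrow E(X)$ exists and is identical to $\int_0^{\infty} [1 - F_X(x)]\,dx$ as seen above.

Corollary 3.1.12:
$$E(|X|^s) = s \int_0^{\infty} y^{s-1} P(|X| > y)\,dy.$$

Proof:
$$E(|X|^s) \overset{Lemma\,3.1.11}{=} \int_0^{\infty} [1 - F_{|X|^s}(z)]\,dz = \int_0^{\infty} P(|X|^s > z)\,dz.$$
Let $z = y^s$. Then $\frac{dz}{dy} = s y^{s-1}$ and $dz = s y^{s-1}\,dy$. Therefore,
$$\int_0^{\infty} P(|X|^s > z)\,dz = \int_0^{\infty} P(|X|^s > y^s)\, s y^{s-1}\,dy = s \int_0^{\infty} y^{s-1} P(|X|^s > y^s)\,dy \overset{\text{monotonic} \uparrow}{=} s \int_0^{\infty} y^{s-1} P(|X| > y)\,dy.$$

Proof of Theorem 3.1.10:
For any given $\epsilon > 0$, choose $N$ such that the tail probability $P(|X| > n) < \frac{\epsilon}{n^t} \;\forall n \ge N$. Then
$$E(|X|^s) \overset{Cor.\,3.1.12}{=} s \int_0^{\infty} y^{s-1} P(|X| > y)\,dy = s \int_0^N y^{s-1} P(|X| > y)\,dy + s \int_N^{\infty} y^{s-1} P(|X| > y)\,dy$$
$$\le \int_0^N s y^{s-1} \cdot 1\,dy + s \int_N^{\infty} y^{s-1} \frac{\epsilon}{y^t}\,dy = \left[ y^s \right]_0^N + s \epsilon \int_N^{\infty} y^{s-1-t}\,dy = N^s + s \epsilon \int_N^{\infty} y^{s-1-t}\,dy.$$
It is
$$\int_N^{\infty} y^c\,dy = \begin{cases} \left[ \frac{1}{c+1} y^{c+1} \right]_N^{\infty}, & c \ne -1 \\ \left[ \ln y \right]_N^{\infty}, & c = -1 \end{cases} \;=\; \begin{cases} \infty, & c \ge -1 \\ -\frac{1}{c+1} N^{c+1} < \infty, & c < -1 \end{cases}$$
Thus, for $E(|X|^s) < \infty$, it must hold that $s - 1 - t < -1$, or equivalently, $s < t$. So $E(|X|^s) < \infty$, i.e., it exists, for every $s$ with $0 < s < t$ for a rv $X$ with a distribution such that $\lim_{n \to \infty} n^t P(|X| > n) = 0$ for some $t > 0$.

Lecture 19: Wednesday 10/13/1999. Scribe: Jürgen Symanzik, Rich Madsen

Theorem 3.1.13: Let $X$ be a rv such that
$$\lim_{k \to \infty} \frac{P(|X| > \alpha k)}{P(|X| > k)} = 0 \quad \forall \alpha > 1.$$
Then all moments of $X$ exist.

Proof:
• For $\epsilon > 0$, we select some $k_0$ such that $\frac{P(|X| > \alpha k)}{P(|X| > k)} < \epsilon \;\forall k \ge k_0$.
• Select $k_1$ such that $P(|X| > k) < \epsilon \;\forall k \ge k_1$.
• Select $N = \max(k_0, k_1)$.
• If we have some fixed positive integer $r$:
$$\frac{P(|X| > \alpha^r k)}{P(|X| > k)} = \frac{P(|X| > \alpha k)}{P(|X| > k)} \cdot \frac{P(|X| > \alpha \cdot \alpha k)}{P(|X| > \alpha k)} \cdot \frac{P(|X| > \alpha \cdot \alpha^2 k)}{P(|X| > \alpha^2 k)} \cdot \ldots \cdot \frac{P(|X| > \alpha \cdot \alpha^{r-1} k)}{P(|X| > \alpha^{r-1} k)}$$
• Note: Each of these $r$ terms on the right side is $< \epsilon$ by our original statement of selecting some $k_0$ such that $\frac{P(|X| > \alpha k)}{P(|X| > k)} < \epsilon \;\forall k \ge k_0$ and since $\alpha > 1$ and therefore $\alpha^n k \ge k_0$.
• Now we get for our entire expression that $\frac{P(|X| > \alpha^r k)}{P(|X| > k)} \le \epsilon^r$ for $k \ge N$ (since in this case also $k \ge$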
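$k_0$) and $\alpha > 1$.
• Overall, we have $P(|X| > \alpha^r k) \le \epsilon^r P(|X| > k) \le \epsilon^{r+1}$ for $k \ge N$ (since in this case also $k \ge k_1$).
• For a fixed positive integer $n$:
$$E(|X|^n) \overset{Cor.\,3.1.12}{=} n \int_0^{\infty} x^{n-1} P(|X| > x)\,dx = n \int_0^N x^{n-1} P(|X| > x)\,dx + n \int_N^{\infty} x^{n-1} P(|X| > x)\,dx$$
• We know that
$$n \int_0^N x^{n-1} P(|X| > x)\,dx \le \int_0^N n x^{n-1}\,dx = \left[ x^n \right]_0^N = N^n < \infty,$$
but is $n \int_N^{\infty} x^{n-1} P(|X| > x)\,dx < \infty$?
• To check the second part, we use:
$$\int_N^{\infty} x^{n-1} P(|X| > x)\,dx = \sum_{r=1}^{\infty} \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1} P(|X| > x)\,dx$$
• We know that
$$\int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1} P(|X| > x)\,dx \le \epsilon^r \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1}\,dx.$$
This step is possible since $\epsilon^r \ge P(|X| > \alpha^{r-1} N) \ge P(|X| > x) \ge P(|X| \ge \alpha^r N) \;\forall x \in (\alpha^{r-1} N, \alpha^r N)$ and $N = \max(k_0, k_1)$.
• Since $(\alpha^{r-1} N)^{n-1} \le x^{n-1} \le (\alpha^r N)^{n-1} \;\forall x \in (\alpha^{r-1} N, \alpha^r N)$, we get
$$\epsilon^r \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1}\,dx \le \epsilon^r (\alpha^r N)^{n-1} \int_{\alpha^{r-1} N}^{\alpha^r N} 1\,dx \le \epsilon^r (\alpha^r N)^{n-1} (\alpha^r N) = \epsilon^r (\alpha^r N)^n.$$
• Now we go back to our original inequality:
$$\int_N^{\infty} x^{n-1} P(|X| > x)\,dx \le \sum_{r=1}^{\infty} \epsilon^r \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1}\,dx \le \sum_{r=1}^{\infty} \epsilon^r (\alpha^r N)^n = N^n \sum_{r=1}^{\infty} (\epsilon \alpha^n)^r = \frac{N^n \epsilon \alpha^n}{1 - \epsilon \alpha^n}$$
if $\epsilon \alpha^n < 1$, or equivalently, if $\epsilon < \frac{1}{\alpha^n}$.
• Since $\frac{N^n \epsilon \alpha^n}{1 - \epsilon \alpha^n}$ is finite, all moments $E(|X|^n)$ exist.

Lecture 20: Friday 10/15/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir

3.2 Generating Functions

Definition 3.2.1: Let $X$ be a rv with cdf $F_X$. The moment generating function (mgf) of $X$ is defined as
$$M_X(t) = E(e^{tX})$$
provided that this expectation exists in an (open) interval around 0, i.e., for $-h < t < h$ for some $h > 0$.

Theorem 3.2.2: If a rv $X$ has a mgf $M_X(t)$ that exists for $-h < t < h$ for some $h > 0$, then
$$E(X^n) = M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t) \Big|_{t=0}.$$

Proof: We assume that we can differentiate under the integral sign. If, and when, this really is true will be discussed later in this section.
$$\frac{d}{dt} M_X(t) = \frac{d}{dt} \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx = \int_{-\infty}^{\infty} \left( \frac{\partial}{\partial t} e^{tx} f_X(x) \right) dx = \int_{-\infty}^{\infty} x e^{tx} f_X(x)\,dx = E(X e^{tX})$$
Evaluating this at $t = 0$, we get $\frac{d}{dt} M_X(t) \big|_{t=0} = E(X)$.
By iteration, we get for $n \ge 2$:
$$\frac{d^n}{dt^n} M_X(t) = \frac{d}{dt} \left( \frac{d^{n-1}}{dt^{n-1}} M_X(t) \right) = \frac{d}{dt} \int_{-\infty}^{\infty} x^{n-1} e^{tx} f_X(x)\,dx = \int_{-\infty}^{\infty} \left( \frac{\partial}{\partial t} x^{n-1} e^{tx} f_X(x) \right) dx = \int_{-\infty}^{\infty} x^n e^{tx} f_X(x)\,dx = E(X^n e^{tX})$$
Evaluating this at $t = 0$, we get $\frac{d^n}{dt^n} M_X(t) \big|_{t=0} = E(X^n)$.

Example 3.2.3: $X \sim U(a, b)$; $f_X(x) = \frac{1}{b - a} \cdot I_{[a,b]}(x)$. Then
$$M_X(t) = \int_a^b \frac{e^{tx}}{b - a}\,dx = \frac{e^{tb} - e^{ta}}{t (b - a)}.$$
$M_X(0)$ is of the form $\frac{0}{0}$, and by L'Hospital's rule $\lim_{t \to 0} \frac{b e^{tb} - a e^{ta}}{b - a} = 1$. So $M_X(0) = 1$ and, since $\frac{e^{tb} - e^{ta}}{t (b - a)}$ is continuous, it also exists in an open interval around 0 (in fact, it exists for every $t \in \mathbb{R}$).
$$M_X'(t) = \frac{(b e^{tb} - a e^{ta})\, t (b - a) - (e^{tb} - e^{ta})(b - a)}{t^2 (b - a)^2} = \frac{t (b e^{tb} - a e^{ta}) - (e^{tb} - e^{ta})}{t^2 (b - a)}$$
$\Longrightarrow E(X) = M_X'(0)$, which is of the form $\frac{0}{0}$; by L'Hospital's rule it equals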
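$$\lim_{t \to 0} \frac{b e^{tb} - a e^{ta} + t b^2 e^{tb} - t a^2 e^{ta} - b e^{tb} + a e^{ta}}{2t (b - a)} = \lim_{t \to 0} \frac{t b^2 e^{tb} - t a^2 e^{ta}}{2t (b - a)} = \lim_{t \to 0} \frac{b^2 e^{tb} - a^2 e^{ta}}{2 (b - a)} = \frac{b^2 - a^2}{2 (b - a)} = \frac{b + a}{2}.$$

Note: In the previous example, we made use of L'Hospital's rule. This rule gives conditions under which we can resolve indefinite expressions of the type "$\frac{\pm 0}{\pm 0}$" and "$\frac{\pm \infty}{\pm \infty}$":

(i) Let $f$ and $g$ be functions that are differentiable in an open interval around $x_0$, say in $(x_0 - \delta, x_0 + \delta)$, but not necessarily differentiable in $x_0$. Let $f(x_0) = g(x_0) = 0$ and $g'(x) \ne 0 \;\forall x \in (x_0 - \delta, x_0 + \delta) - \{x_0\}$. Then $\lim_{x \to x_0} \frac{f'(x)}{g'(x)} = A$ implies that also $\lim_{x \to x_0} \frac{f(x)}{g(x)} = A$. The same holds for the cases $\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = \infty$ and $x \to x_0^+$ or $x \to x_0^-$.

(ii) Let $f$ and $g$ be functions that are differentiable for $x > a$ ($a > 0$). Let $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$ and $\lim_{x \to \infty} g'(x) \ne 0$. Then $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = A$ implies that also $\lim_{x \to \infty} \frac{f(x)}{g(x)} = A$.

(iii) We can iterate this process as long as the required conditions are met and derivatives exist, e.g., if the first derivatives still result in an indefinite expression, we can look at the second derivatives, then at the third derivatives, and so on.

(iv) It is recommended to keep expressions as simple as possible. If we have identical factors in the numerator and denominator, we can exclude them from both and continue with the simpler functions.

(v) Indefinite expressions of the form "$0 \cdot \infty$" can be handled by rearranging them to "$\frac{0}{1/\infty}$", and $\lim_{x \to -\infty} \frac{f(x)}{g(x)}$ can be handled by use of the rules for $\lim_{x \to \infty} \frac{f(-x)}{g(-x)}$.

The following Theorems provide us with rules that tell us when we can differentiate under the integral sign. Theorem 3.2.4 relates to finite integral bounds $a(\theta)$ and $b(\theta)$, and Theorems 3.2.5 and 3.2.6 to infinite bounds.

Theorem 3.2.4: Leibnitz's Rule
If $f(x, \theta)$, $a(\theta)$, and $b(\theta)$ are differentiable with respect to $\theta$ (for all $x$) and $-\infty < a(\theta) < b(\theta) < \infty$, then
$$\frac{d}{d\theta} \int_{a(\theta)}^{b(\theta)} f(x, \theta)\,dx = f(b(\theta), \theta) \frac{d}{d\theta} b(\theta) - f(a(\theta), \theta) \frac{d}{d\theta} a(\theta) + \int_{a(\theta)}^{b(\theta)} \frac{\partial}{\partial \theta} f(x, \theta)\,dx.$$
The first 2 terms are vanishing if $a(\theta)$ and $b(\theta)$ are constant in $\theta$.

Proof: Uses the Fundamental Theorem of Calculus and the chain rule.

Theorem 3.2.5: Lebesgue's Dominated Convergence Theorem
Let $g$ be an integrable function such that $\int_{-\infty}^{\infty} |g(x)|\,dx < \infty$. If $|f_n| \le g$ almost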
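everywhere (i.e., except for a set of Borel-measure 0) and if $f_n \to f$ almost everywhere, then $f_n$ and $f$ are integrable and
$$\lim_{n \to \infty} \int f_n(x)\,dx = \int f(x)\,dx.$$

Note: If $f$ is differentiable with respect to $\theta$, then
$$\frac{\partial}{\partial \theta} f(x, \theta) = \lim_{\delta \to 0} \frac{f(x, \theta + \delta) - f(x, \theta)}{\delta}$$
and
$$\int_{-\infty}^{\infty} \frac{\partial}{\partial \theta} f(x, \theta)\,dx = \int_{-\infty}^{\infty} \lim_{\delta \to 0} \frac{f(x, \theta + \delta) - f(x, \theta)}{\delta}\,dx$$
while
$$\frac{\partial}{\partial \theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx = \lim_{\delta \to 0} \int_{-\infty}^{\infty} \frac{f(x, \theta + \delta) - f(x, \theta)}{\delta}\,dx.$$

Theorem 3.2.6: Let $f_n(x, \theta_0) = \frac{f(x, \theta_0 + \delta_n) - f(x, \theta_0)}{\delta_n}$ for some $\theta_0$. Suppose there exists an integrable function $g(x)$ such that $\int_{-\infty}^{\infty} |g(x)|\,dx < \infty$ and $|f_n(x, \theta)| \le g(x) \;\forall x$, then
$$\left[ \frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx \right]_{\theta = \theta_0} = \int_{-\infty}^{\infty} \left[ \frac{\partial}{\partial \theta} f(x, \theta) \Big|_{\theta = \theta_0} \right] dx.$$
Usually, if $f$ is differentiable for all $\theta$, we write
$$\frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx = \int_{-\infty}^{\infty} \frac{\partial}{\partial \theta} f(x, \theta)\,dx.$$

Corollary 3.2.7: Let $f(x, \theta)$ be differentiable for all $\theta$. Suppose there exists an integrable function $g(x, \theta)$ such that $\int_{-\infty}^{\infty} g(x, \theta)\,dx < \infty$ and $\left| \frac{\partial}{\partial \theta} f(x, \theta) \big|_{\theta = \theta_0} \right| \le g(x, \theta) \;\forall x \;\forall \theta_0$ in some $\epsilon$-neighborhood of $\theta$, then
$$\frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx = \int_{-\infty}^{\infty} \frac{\partial}{\partial \theta} f(x, \theta)\,dx.$$

More on Moment Generating Functions

Consider
$$\left| \frac{\partial}{\partial t} e^{tx} f_X(x) \Big|_{t = t'} \right| = |x| e^{t'x} f_X(x) \quad \text{for } |t' - t| \le \delta_0.$$
Choose $t$, $\delta_0$ small enough such that $t + \delta_0 \in (-h, h)$ and $t - \delta_0 \in (-h, h)$. Then
$$\left| \frac{\partial}{\partial t} e^{tx} f_X(x) \Big|_{t = t'} \right| \le g(x, t) \quad \text{where} \quad g(x, t) = \begin{cases} |x| e^{(t + \delta_0) x} f_X(x), & x \ge 0 \\ |x| e^{(t - \delta_0) x} f_X(x), & x < 0 \end{cases}$$
To verify $\int g(x, t)\,dx < \infty$, we need to know $f_X(x)$.

Suppose the mgf $M_X(t)$ exists for $|t| \le h$ for some $h > 1$. Then $|t + \delta_0 + 1| < h$ and $|t - \delta_0 - 1| < h$. Since $|x| \le e^{|x|} \;\forall x$, we get
$$g(x, t) \le \begin{cases} e^{(t + \delta_0 + 1) x} f_X(x), & x \ge 0 \\ e^{(t - \delta_0 - 1) x} f_X(x), & x < 0 \end{cases}$$
Then $\int_0^{\infty} g(x, t)\,dx \le M_X(t + \delta_0 + 1) < \infty$ and $\int_{-\infty}^0 g(x, t)\,dx \le M_X(t - \delta_0 - 1) < \infty$, and therefore $\int_{-\infty}^{\infty} g(x, t)\,dx < \infty$.

Together with Corollary 3.2.7, this establishes that we can differentiate under the integral in the Proof of Theorem 3.2.2.

If $h \le 1$, we may need to check more carefully to see if the condition holds.

Note: If $M_X(t)$ exists for $t \in (-h, h)$, then we have an infinite collection of moments. Does a collection of integer moments $\{m_k : k = 1, 2, 3, \ldots\}$ completely characterize the distribution, i.e., cdf, of $X$? Unfortunately not, as Example 3.2.8 shows.

Lecture 21: Monday 10/18/1999. Scribe: Jürgen Symanzik

Example 3.2.8: Let $X_1$ and $X_2$ be rv's with pdfs
$$f_{X_1}(x) = \frac{1}{\sqrt{2\pi}} \frac{1}{x} \exp\left( -\frac{1}{2} (\log x)^2 \right) \cdot I_{(0,\infty)}(x)$$
and
$$f_{X_2}(x) = f_{X_1}(x) \cdot (1 + \sin(2\pi \log x)) \cdot I_{(0,\infty)}(x).$$
It is $E(X_1^r) = E(X_2^r) = e^{r^2/2}$ for $r = 0, 1, 2, \ldots$, as you have to show in the Homeworks.

Two different pdfs/cdfs have the same moment sequence! What went wrong?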
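In this example, $M_{X_1}(t)$ does not exist, as shown in the Homeworks!

Theorem 3.2.9: Let $X$ and $Y$ be 2 rv's with cdf's $F_X$ and $F_Y$ for which all moments exist.
(i) If $F_X$ and $F_Y$ have bounded support, then $F_X(u) = F_Y(u) \;\forall u$ iff $E(X^r) = E(Y^r)$ for $r = 0, 1, 2, \ldots$.
(ii) If both mgf's exist, i.e., $M_X(t) = M_Y(t)$ for $t$ in some neighborhood of 0, then $F_X(u) = F_Y(u) \;\forall u$.

Note: The existence of moments is not equivalent to the existence of a mgf, as seen in Example 3.2.8 above and some of the Homework assignments.

Theorem 3.2.10: Suppose rv's $\{X_i\}_{i=1}^{\infty}$ have mgf's $M_{X_i}(t)$ and that $\lim_{i \to \infty} M_{X_i}(t) = M_X(t) \;\forall t \in (-h, h)$ for some $h > 0$ and that $M_X(t)$ itself is a mgf. Then, there exists a cdf $F_X$ whose moments are determined by $M_X(t)$ and for all continuity points $x$ of $F_X(x)$ it holds that $\lim_{i \to \infty} F_{X_i}(x) = F_X(x)$, i.e., the convergence of mgf's implies the convergence of cdf's.

Proof: Uniqueness of Laplace transformations, etc.

Theorem 3.2.11: For constants $a$ and $b$, the mgf of $Y = aX + b$ is
$$M_Y(t) = e^{bt} M_X(at),$$
given that $M_X(t)$ exists.

Proof:
$$M_Y(t) = E(e^{(aX + b)t}) = E(e^{aXt} e^{bt}) = e^{bt} E(e^{Xat}) = e^{bt} M_X(at).$$

3.3 Complex-Valued Random Variables and Characteristic Functions

Recall the following facts regarding complex numbers:

$i^0 = +1$; $i = \sqrt{-1}$; $i^2 = -1$; $i^3 = -i$; $i^4 = +1$; etc.

In the planar Gaussian number plane it holds that $i = (0, 1)$.

$z = a + ib = r(\cos \phi + i \sin \phi)$

$r = |z| = \sqrt{a^2 + b^2}$

$\tan \phi = \frac{b}{a}$

Euler's Relation: $z = r(\cos \phi + i \sin \phi) = r e^{i\phi}$

Mathematical Operations on Complex Numbers:
$z_1 \pm z_2 = (a_1 \pm a_2) + i(b_1 \pm b_2)$
$z_1 \cdot z_2 = r_1 r_2 e^{i(\phi_1 + \phi_2)} = r_1 r_2 (\cos(\phi_1 + \phi_2) + i \sin(\phi_1 + \phi_2))$
$\frac{z_1}{z_2} = \frac{r_1}{r_2} e^{i(\phi_1 - \phi_2)} = \frac{r_1}{r_2} (\cos(\phi_1 - \phi_2) + i \sin(\phi_1 - \phi_2))$

Moivre's Theorem: $z^n = (r(\cos \phi + i \sin \phi))^n = r^n (\cos(n\phi) + i \sin(n\phi))$

$\sqrt[n]{z} = \sqrt[n]{r} \left( \cos\left( \frac{\phi + 2k\pi}{n} \right) + i \sin\left( \frac{\phi + 2k\pi}{n} \right) \right)$ for $k = 0, 1, \ldots, (n - 1)$ and the main value for $k = 0$

$\ln z = \ln(a + ib) = \ln(|z|) + i\phi \pm 2n\pi i$ where $\phi = \arctan \frac{b}{a}$ and the main value for $n = 0$

Conjugate Complex Numbers: For $z = a + ib$, we define the conjugate complex number $\bar{z} = a - ib$. It holds:
$\bar{\bar{z}} = z$
$\overline{z_1 \pm z_2} = \bar{z}_1 \pm \bar{z}_2$
$\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$
$z \cdot \bar{z} = a^2 + b^2$
$|z| = \sqrt{a^2 + b^2} = \sqrt{z \cdot \bar{z}}$

Definition 3.3.1: Let $(\Omega, L, P)$ be a probability space and $X$ and $Y$ real-valued rv's, i.e., $X, Y : (\Omega, L) \to (\mathbb{R}, \mathcal{B})$.
(i) $Z = X + iY : (\Omega, L) \to (\mathbb{C}, \mathcal{B}_{\mathbb{C}})$ is called a complex-valued random variable ($\mathbb{C}$-rv).
(ii) If $E(X)$ and $E(Y)$ exist, then $E(Z)$ is defined as $E(Z) = E(X) + iE(Y) \in \mathbb{C}$.

Note: $E(Z)$ exists iff $E(|X|)$ and $E(|Y|)$ exist. It also holds that if $E(Z)$ exists, then $|E(Z)| \le E(|Z|)$ (see Homework).

Definition 3.3.2: Let $X$ be a real-valued rv on $(\Omega, L, P)$. Then, $\Phi_X(t) : \mathbb{R} \to \mathbb{C}$ with $\Phi_X(t) = E(e^{itX})$ is called the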
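characteristic function of $X$.

Note:
(i) $\Phi_X(t) = \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx = \int_{-\infty}^{\infty} \cos(tx) f_X(x)\,dx + i \int_{-\infty}^{\infty} \sin(tx) f_X(x)\,dx$ if $X$ is continuous.
(ii) $\Phi_X(t) = \sum_{x \in \mathcal{X}} e^{itx} P(X = x) = \sum_{x \in \mathcal{X}} \cos(tx) P(X = x) + i \sum_{x \in \mathcal{X}} \sin(tx) P(X = x)$ if $X$ is discrete and $\mathcal{X}$ is the support of $X$.
(iii) $\Phi_X(t)$ exists for all real-valued rv's $X$ since $|e^{itx}| = 1$.

Lecture 22: Wednesday 10/20/1999. Scribe: Jürgen Symanzik, Bill Morphet

Theorem 3.3.3: Let $\Phi_X$ be the characteristic function of a real-valued rv $X$. Then it holds:
(i) $\Phi_X(0) = 1$.
(ii) $|\Phi_X(t)| \le 1 \;\forall t \in \mathbb{R}$.
(iii) $\Phi_X$ is uniformly continuous, i.e., $\forall \epsilon > 0 \;\exists \delta > 0 \;\forall t_1, t_2 \in \mathbb{R} : |t_1 - t_2| < \delta \Rightarrow |\Phi(t_1) - \Phi(t_2)| < \epsilon$.
(iv) $\Phi_X$ is a positive definite function, i.e., $\forall n \in \mathbb{N} \;\forall \alpha_1, \ldots, \alpha_n \in \mathbb{C} \;\forall t_1, \ldots, t_n \in \mathbb{R} : \sum_{l=1}^{n} \sum_{j=1}^{n} \alpha_l \bar{\alpha}_j \Phi_X(t_l - t_j) \ge 0$.
(v) $\Phi_X(t) = \overline{\Phi_X(-t)}$.
(vi) If $X$ is symmetric around 0, i.e., if $X$ has a pdf that is symmetric around 0, then $\Phi_X(t) \in \mathbb{R} \;\forall t \in \mathbb{R}$.
(vii) $\Phi_{aX + b}(t) = e^{itb} \Phi_X(at)$.

Proof: See Homework for parts (i), (ii), (iv), (v), (vi), and (vii).

Part (iii): Known conditions:
(i) Let $\epsilon > 0$.
(ii) $\exists a > 0 : P(-a < X < a) > 1 - \frac{\epsilon}{4}$ and $P(|X| > a) < \frac{\epsilon}{4}$.
(iii) $\exists \delta > 0 : |e^{i(t' - t)x} - 1| < \frac{\epsilon}{2} \;\forall x$ s.t. $|x| < a$ and $\forall (t' - t)$ s.t. $0 < (t' - t) < \delta$.
This third condition holds since $|e^{i0} - 1| = 0$ and the exponential function is continuous. Therefore, if we select $(t' - t)$ and $x$ small enough, $|e^{i(t' - t)x} - 1|$ will be $< \frac{\epsilon}{2}$ for a given $\epsilon$.

Let $t, t' \in \mathbb{R}$, $t < t'$, and $t' - t < \delta$. Then
$$|\Phi_X(t') - \Phi_X(t)| = \left| \int_{-\infty}^{\infty} e^{it'x} f_X(x)\,dx - \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx \right|$$
$$= \left| \int_{-\infty}^{-a} (e^{it'x} - e^{itx}) f_X(x)\,dx + \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx + \int_{a}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right|$$
$$\le \left| \int_{-\infty}^{-a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| + \left| \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| + \left| \int_{a}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right|$$
We now take a closer look at the first and third of these absolute integrals. It is
$$\left| \int_{-\infty}^{-a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \le \left| \int_{-\infty}^{-a} e^{it'x} f_X(x)\,dx \right| + \left| \int_{-\infty}^{-a} e^{itx} f_X(x)\,dx \right| \le \int_{-\infty}^{-a} |e^{it'x}| f_X(x)\,dx + \int_{-\infty}^{-a} |e^{itx}| f_X(x)\,dx$$
$$\overset{(A)}{=} \int_{-\infty}^{-a} 1 \cdot f_X(x)\,dx + \int_{-\infty}^{-a} 1 \cdot f_X(x)\,dx = \int_{-\infty}^{-a} 2 f_X(x)\,dx.$$
(A) holds due to Note (iii) that follows Definition 3.3.2. Similarly,
$$\left| \int_{a}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \le \int_{a}^{\infty} 2 f_X(x)\,dx.$$
Returning to the main part of the proof, we get
$|\Phi_X(t') - \Phi_X(t)| \le \int_{-\infty}^{-a} 2 f_X(x)\,dx + \left| \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| + \int_{a}^{\infty} 2 f_X(x)\,dx = 2P(|X| > a) + \left| \int_{-a}^{a} e^{itx} (e^{i(t' - t)x} - 1) f_X(x)\,dx \right| \overset{\text{condition (ii)}}{<} 2 \cdot \frac{\epsilon}{4} + \int_{-a}^{a} |e^{itx}| \cdot |e^{i(t' - t)x} - 1| \, f_X(x)\,dx$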
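$\overset{(B)}{\le} \frac{\epsilon}{2} + \int_{-a}^{a} 1 \cdot \frac{\epsilon}{2} f_X(x)\,dx \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
(B) holds due to Note (iii) that follows Definition 3.3.2 and due to condition (iii).

Theorem 3.3.4: Bochner's Theorem
Let $\Phi : \mathbb{R} \to \mathbb{C}$ be any function with properties (i), (ii), (iii), and (iv) from Theorem 3.3.3. Then there exists a real-valued rv $X$ with $\Phi_X = \Phi$.

Theorem 3.3.5: Let $X$ be a real-valued rv and $E(X^k)$ exists for an integer $k$. Then, $\Phi_X$ is $k$ times differentiable and $\Phi_X^{(k)}(t) = i^k E(X^k e^{itX})$. In particular, for $t = 0$, it is $\Phi_X^{(k)}(0) = i^k m_k$.

Theorem 3.3.6: Let $X$ be a real-valued rv with characteristic function $\Phi_X$ and let $\Phi_X$ be $k$ times differentiable, where $k$ is an even integer. Then the $k$th moment of $X$, $m_k$, exists and it is $\Phi_X^{(k)}(0) = i^k m_k$.

Theorem 3.3.7: Levy's Theorem
Let $X$ be a real-valued rv with cdf $F_X$ and characteristic function $\Phi_X$. Let $a, b \in \mathbb{R}$, $a < b$. If $P(X = a) = P(X = b) = 0$, i.e., $F_X$ is continuous in $a$ and $b$, then
$$F(b) - F(a) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-ita} - e^{-itb}}{it} \Phi_X(t)\,dt.$$

Theorem 3.3.8: Let $X$ and $Y$ be real-valued rv's with characteristic functions $\Phi_X$ and $\Phi_Y$. If $\Phi_X = \Phi_Y$, then $X$ and $Y$ are identically distributed.

Theorem 3.3.9: Let $X$ be a real-valued rv with characteristic function $\Phi_X$ such that $\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt < \infty$. Then $X$ has pdf
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \Phi_X(t)\,dt.$$

Theorem 3.3.10: Let $X$ be a real-valued rv with mgf $M_X(t)$, i.e., the mgf exists. Then $\Phi_X(t) = M_X(it)$.

Theorem 3.3.11: Suppose real-valued rv's $\{X_i\}_{i=1}^{\infty}$ have cdf's $\{F_{X_i}\}_{i=1}^{\infty}$ and characteristic functions $\{\Phi_{X_i}(t)\}_{i=1}^{\infty}$. If $\lim_{i \to \infty} \Phi_{X_i}(t) = \Phi_X(t) \;\forall t \in (-h, h)$ for some $h > 0$ and $\Phi_X(t)$ is itself a characteristic function (of a rv $X$ with cdf $F_X$), then $\lim_{i \to \infty} F_{X_i}(x) = F_X(x)$ for all continuity points $x$ of $F_X(x)$, i.e., the convergence of characteristic functions implies the convergence of cdf's.

Lecture 23: Friday 10/22/1999. Scribe: Jürgen Symanzik, Rich Madsen

Theorem 3.3.12: Characteristic functions for some well-known distributions:
$$\begin{array}{lll}
 & \text{Distribution} & \Phi_X(t) \\ \hline
\text{(i)} & X \sim Dirac(c) & e^{itc} \\
\text{(ii)} & X \sim Bin(1, p) & 1 + p(e^{it} - 1) \\
\text{(iii)} & X \sim Poisson(c) & \exp(c(e^{it} - 1)) \\
\text{(iv)} & X \sim U(a, b) & \frac{e^{itb} - e^{ita}}{(b - a) it} \\
\text{(v)} & X \sim N(0, 1) & \exp(-t^2/2) \\
\text{(vi)} & X \sim N(\mu, \sigma^2) & e^{it\mu} \exp(-\sigma^2 t^2/2) \\
\text{(vii)} & X \sim \Gamma(p, q) & (1 - \frac{it}{q})^{-p} \\
\text{(viii)} & X \sim Exp(c) & (1 - \frac{it}{c})^{-1} \\
\text{(ix)} & X \sim \chi^2_n & (1 - 2it)^{-n/2}
\end{array}$$

Proof:
(i) $\Phi_X(t) = E(e^{itX}) = e^{itc} P(X = c) = e^{itc}$.
(ii) $\Phi_X(t) = \sum_{k=0}^{1} e^{itk} P(X = k) = e^{it0} (1 - p) + e^{it1} p = 1 + p(e^{it} - 1)$.
(iii) $\Phi_X(t) = \sum_{n \in \mathbb{N}_0} e^{itn} \cdot \frac{c^n}{n!} e^{-c}$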
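$= e^{-c} \sum_{n=0}^{\infty} \frac{(c \cdot e^{it})^n}{n!} = e^{-c} \cdot e^{c \cdot e^{it}} = e^{c(e^{it} - 1)}$, since $\sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x$.
(iv) $\Phi_X(t) = \frac{1}{b - a} \int_a^b e^{itx}\,dx = \frac{1}{b - a} \left[ \frac{e^{itx}}{it} \right]_a^b = \frac{e^{itb} - e^{ita}}{(b - a) it}$.
(v) $X \sim N(0, 1)$ is symmetric around 0 $\Longrightarrow$ $\Phi_X(t)$ is real since there is no imaginary part according to Theorem 3.3.3 (vi)
$$\Longrightarrow \Phi_X(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(tx) e^{-x^2/2}\,dx.$$
Since $E(X)$ exists, $\Phi_X$ is differentiable according to Theorem 3.3.5 and the following holds:
$$\Phi_X'(t) = Re(\Phi_X'(t)) = Re\left( \int_{-\infty}^{\infty} i x\, e^{itx} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \right) = Re\left( \int_{-\infty}^{\infty} i x \cos(tx) \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx + \int_{-\infty}^{\infty} (-x) \sin(tx) \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \right)$$
$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \sin(tx) \left( -x e^{-x^2/2} \right) dx.$$
Integrating by parts with $u = \sin(tx)$, $v' = -x e^{-x^2/2}$, so that $u' = t \cos(tx)$ and $v = e^{-x^2/2}$:
$$\Phi_X'(t) = \frac{1}{\sqrt{2\pi}} \left[ \sin(tx) e^{-x^2/2} \right]_{-\infty}^{\infty} - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} t \cos(tx) e^{-x^2/2}\,dx = -t\, \Phi_X(t),$$
where the boundary term vanishes since $e^{-x^2/2} \to 0$ as $x \to \pm\infty$.
Thus, $\Phi_X'(t) = -t \Phi_X(t)$. It follows that $\frac{\Phi_X'(t)}{\Phi_X(t)} = -t$ and, by integrating both sides, we get $\ln |\Phi_X(t)| = -\frac{1}{2} t^2 + c$ with $c \in \mathbb{R}$.
For $t = 0$, we know that $\Phi_X(0) = 1$ by Theorem 3.3.3 (i) and $\ln |\Phi_X(0)| = 0$. It follows that $0 = 0 + c$. Therefore, $c = 0$ and $|\Phi_X(t)| = e^{-t^2/2}$.
If we take $t = 0$, then $\Phi_X(0) = 1$ by Theorem 3.3.3 (i). Since $\Phi_X$ is continuous, $\Phi_X$ must take the value 0 before it can eventually take a negative value. However, since $e^{-t^2/2} > 0 \;\forall t \in \mathbb{R}$, $\Phi_X$ cannot take 0 as a possible value and therefore cannot pass into the negative numbers. So, it must hold that $\Phi_X(t) = e^{-t^2/2} \;\forall t \in \mathbb{R}$.
(vi) For $\sigma > 0$, $\mu \in \mathbb{R}$, we know that if $X \sim N(0, 1)$, then $\sigma X + \mu \sim N(\mu, \sigma^2)$. By Theorem 3.3.3 (vii) we have
$$\Phi_{\sigma X + \mu}(t) = e^{it\mu} \Phi_X(\sigma t) = e^{it\mu} e^{-\sigma^2 t^2/2}.$$
(vii)
$$\Phi_X(t) = \int_0^{\infty} e^{itx} \frac{q^p}{\Gamma(p)} x^{p-1} e^{-qx}\,dx = \frac{q^p}{\Gamma(p)} \int_0^{\infty} x^{p-1} e^{-(q - it)x}\,dx.$$
Substituting $u = (q - it)x$, $du = (q - it)\,dx$:
$$\Phi_X(t) = \frac{q^p}{\Gamma(p)} (q - it)^{-p} \int_0^{\infty} u^{p-1} e^{-u}\,du = q^p (q - it)^{-p} = \left( 1 - \frac{it}{q} \right)^{-p},$$
since $\int_0^{\infty} u^{p-1} e^{-u}\,du = \Gamma(p)$.
(viii) Since an $Exp(c)$ distribution is a $\Gamma(1, c)$ distribution, we get for $X \sim Exp(c) = \Gamma(1, c)$: $\Phi_X(t) = (1 - \frac{it}{c})^{-1}$.
(ix) Since a $\chi^2_n$ distribution (for $n \in \mathbb{N}$) is a $\Gamma(\frac{n}{2}, \frac{1}{2})$ distribution, we get for $X \sim \chi^2_n = \Gamma(\frac{n}{2}, \frac{1}{2})$: $\Phi_X(t) = (1 - \frac{it}{1/2})^{-n/2} = (1 - 2it)^{-n/2}$.

Example 3.3.13: Since we know that $m_1 = E(X)$ and $m_2 = E(X^2)$ exist for $X \sim Bin(1, p)$, we can determine these moments according to Theorem 3.3.5 using the characteristic function. It is
$\Phi_X(t) = 1 + p(e^{it} - 1)$
$\Phi_X'(t) = p i e^{it}$
$\Phi_X'(0) = p i$
$\Longrightarrow m_1 = \frac{\Phi_X'(0)}{i} = \frac{p i}{i} = p = E(X)$
$\Phi_X''(t) = p i^2 e^{it}$
$\Phi_X''(0) = p i^2$
$\Longrightarrow m_2 = \frac{\Phi_X''(0)}{i^2} = \frac{p i^2}{i^2} = p = E(X^2)$
$\Longrightarrow Var(X) = E(X^2) - (E(X))^2 = p - p^2 = p(1 - p)$

Note: The restriction $\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt < \infty$ in Theorem 3.3.9 works in such a way that we don't end up with a (non-existing) pdf if $X$ is a discrete rv. For example, $X \sim Dirac(c)$: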
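$\int_{-\infty}^{\infty} |e^{itc}|\,dt = \int_{-\infty}^{\infty} 1\,dt = [t]_{-\infty}^{\infty}$, which is undefined.

Also for $X \sim Bin(1, p)$:
$$\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt = \int_{-\infty}^{\infty} |1 + p(e^{it} - 1)|\,dt = \int_{-\infty}^{\infty} |p e^{it} - (p - 1)|\,dt \ge \int_{-\infty}^{\infty} \big| |p e^{it}| - |p - 1| \big|\,dt = \int_{-\infty}^{\infty} |p - (1 - p)|\,dt = |2p - 1| \int_{-\infty}^{\infty} 1\,dt = |2p - 1| \, [t]_{-\infty}^{\infty},$$
which is undefined for $p \ne 1/2$.

If $p = 1/2$, we have
$$\int_{-\infty}^{\infty} |p e^{it} + (1 - p)|\,dt = \frac{1}{2} \int_{-\infty}^{\infty} |e^{it} + 1|\,dt = \frac{1}{2} \int_{-\infty}^{\infty} |\cos t + i \sin t + 1|\,dt = \frac{1}{2} \int_{-\infty}^{\infty} \sqrt{(\cos t + 1)^2 + (\sin t)^2}\,dt$$
$$= \frac{1}{2} \int_{-\infty}^{\infty} \sqrt{\cos^2 t + 2 \cos t + 1 + \sin^2 t}\,dt = \frac{1}{2} \int_{-\infty}^{\infty} \sqrt{2 + 2 \cos t}\,dt,$$
which also does not exist.

Lecture 24: Monday 10/25/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir

3.4 Probability Generating Functions

Definition 3.4.1: Let $X$ be a discrete rv which only takes non-negative integer values, i.e., $p_k = P(X = k)$ and $\sum_{k=0}^{\infty} p_k = 1$. Then, the probability generating function (pgf) of $X$ is defined as
$$G(s) = \sum_{k=0}^{\infty} p_k s^k.$$

Theorem 3.4.2: $G(s)$ converges for $|s| \le 1$.

Proof:
$$|G(s)| \le \sum_{k=0}^{\infty} |p_k s^k| \le \sum_{k=0}^{\infty} |p_k| = 1.$$

Theorem 3.4.3: Let $X$ be a discrete rv which only takes non-negative integer values and has pgf $G(s)$. Then it holds:
$$P(X = k) = \frac{1}{k!} \frac{d^k}{ds^k} G(s) \Big|_{s=0}.$$

Theorem 3.4.4: Let $X$ be a discrete rv which only takes non-negative integer values and has pgf $G(s)$. If $E(X)$ exists, then it holds:
$$E(X) = \frac{d}{ds} G(s) \Big|_{s=1}.$$

Definition 3.4.5: The $k$th factorial moment of $X$ is defined as
$$E[X(X - 1)(X - 2) \cdot \ldots \cdot (X - k + 1)]$$
if this expectation exists.

Theorem 3.4.6: Let $X$ be a discrete rv which only takes non-negative integer values and has pgf $G(s)$. If $E[X(X - 1)(X - 2) \cdot \ldots \cdot (X - k + 1)]$ exists, then it holds:
$$E[X(X - 1)(X - 2) \cdot \ldots \cdot (X - k + 1)] = \frac{d^k}{ds^k} G(s) \Big|_{s=1}.$$

Note: Similar to the Cauchy distribution for the continuous case, there exist discrete distributions where the mean (or higher moments) do not exist. See Homework.

3.5 Moment Inequalities

Theorem 3.5.1: Let $h(X)$ be a non-negative Borel-measurable function of a rv $X$. If $E(h(X))$ exists, then it holds:
$$P(h(X) \ge \epsilon) \le \frac{E(h(X))}{\epsilon} \quad \forall \epsilon > 0.$$

Proof: Continuous case only:
$$E(h(X)) = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx = \int_A h(x) f_X(x)\,dx + \int_{A^C} h(x) f_X(x)\,dx, \quad \text{where } A = \{x : h(x) \ge \epsilon\},$$
$$\ge \int_A h(x) f_X(x)\,dx \ge \epsilon \int_A f_X(x)\,dx = \epsilon P(h(X) \ge \epsilon) \quad \forall \epsilon > 0.$$
Therefore, $P(h(X) \ge \epsilon) \le \frac{E(h(X))}{\epsilon} \;\forall \epsilon > 0$.

Corollary 3.5.2: Markov's Inequality
Let $h(X) = |X|^r$ and $\epsilon = k^r$ where $r > 0$ and $k > 0$. If $E(|X|^r)$ exists, then it holds:
$$P(|X| \ge k) \le \frac{E(|X|^r)}{k^r}.$$

Proof: Since $P(|X| \ge k) = P(|X|^r \ge k^r)$ for $k > 0$, it follows using Theorem 3.5.1:
$$P(|X| \ge k) = P(|X|^r \ge k^r) \overset{Th.\,3.5.1}{\le} \frac{E(|X|^r)}{k^r}.$$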
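A small numerical illustration of Markov's Inequality (Corollary 3.5.2) may help. The sketch below is not part of the original notes; it assumes $X \sim Exp(1)$, for which the tail probability $P(X \ge k) = e^{-k}$ and the moments $E(|X|^r) = \Gamma(r + 1)$ are known in closed form, and compares the exact tail with the bound $E(|X|^r)/k^r$. The function names are hypothetical and chosen only for this example.

```python
import math


def markov_bound(abs_moment_r: float, k: float, r: float) -> float:
    """Upper bound E(|X|^r) / k^r on P(|X| >= k) from Markov's Inequality."""
    if k <= 0 or r <= 0:
        raise ValueError("k and r must be positive")
    return abs_moment_r / k**r


def exp1_tail(k: float) -> float:
    """Exact P(X >= k) for X ~ Exp(1) (assumed example distribution), k >= 0."""
    return math.exp(-k)


def exp1_abs_moment(r: float) -> float:
    """Exact E(|X|^r) = Gamma(r + 1) for X ~ Exp(1)."""
    return math.gamma(r + 1)


if __name__ == "__main__":
    # Compare the exact tail with the Markov bound for a few (r, k) pairs.
    for r in (1, 2, 3):
        for k in (0.5, 1.0, 2.0, 5.0):
            bound = markov_bound(exp1_abs_moment(r), k, r)
            print(f"r={r} k={k}: tail={exp1_tail(k):.4f} bound={bound:.4f}")
```

The bound is often loose (for small $k$ it can exceed 1), and which $r$ gives the tightest bound depends on $k$; the inequality only guarantees that the exact tail never exceeds it.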
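Corollary 3.5.3: Chebychev's Inequality
Let $h(X) = (X - \mu)^2$ and $\epsilon = k^2 \sigma^2$ where $E(X) = \mu$, $Var(X) = \sigma^2 < \infty$, and $k > 0$. Then it holds:
$$P(|X - \mu| > k\sigma) \le \frac{1}{k^2}.$$

Proof: Since $P(|X - \mu| > k\sigma) = P(|X - \mu|^2 > k^2 \sigma^2)$ for $k > 0$, it follows using Theorem 3.5.1:
$$P(|X - \mu| > k\sigma) = P(|X - \mu|^2 > k^2 \sigma^2) \overset{Th.\,3.5.1}{\le} \frac{E(|X - \mu|^2)}{k^2 \sigma^2} = \frac{Var(X)}{k^2 \sigma^2} = \frac{\sigma^2}{k^2 \sigma^2} = \frac{1}{k^2}.$$

Theorem 3.5.4: Lyapunov Inequality
Let $0 < \beta_n = E(|X|^n) < \infty$. For arbitrary $k$ such that $2 \le k \le n$, it holds that
$$(\beta_{k-1})^{\frac{1}{k-1}} \le (\beta_k)^{\frac{1}{k}}, \quad \text{i.e.,} \quad (E(|X|^{k-1}))^{\frac{1}{k-1}} \le (E(|X|^k))^{\frac{1}{k}}.$$

Proof: Continuous case only:
Let $Q(u, v) = E\left( (u |X|^{\frac{j-1}{2}} + v |X|^{\frac{j+1}{2}})^2 \right)$ where $1 \le j \le k - 1$. Obviously, $Q(u, v) \ge 0 \;\forall u, v \in \mathbb{R}$. Also,
$$Q(u, v) = \int_{-\infty}^{\infty} (u |x|^{\frac{j-1}{2}} + v |x|^{\frac{j+1}{2}})^2 f_X(x)\,dx = u^2 \int_{-\infty}^{\infty} |x|^{j-1} f_X(x)\,dx + 2uv \int_{-\infty}^{\infty} |x|^j f_X(x)\,dx + v^2 \int_{-\infty}^{\infty} |x|^{j+1} f_X(x)\,dx$$
$$= u^2 \beta_{j-1} + 2uv \beta_j + v^2 \beta_{j+1} \ge 0 \quad \forall u, v \in \mathbb{R}.$$
Using the fact that $Ax^2 + 2Bxy + Cy^2 \ge 0 \;\forall x, y \in \mathbb{R}$ iff $A > 0$ and $AC - B^2 > 0$ (see Rohatgi, page 6, Section P2.4), we get with $A = \beta_{j-1}$, $B = \beta_j$, and $C = \beta_{j+1}$:
$$\beta_{j-1} \beta_{j+1} - \beta_j^2 \ge 0 \;\Longrightarrow\; \beta_j^2 \le \beta_{j-1} \beta_{j+1} \;\Longrightarrow\; \beta_j^{2j} \le \beta_{j-1}^j \beta_{j+1}^j.$$
This means that $\beta_1^2 \le \beta_0 \beta_2$, $\beta_2^4 \le \beta_1^2 \beta_3^2$, $\beta_3^6 \le \beta_2^3 \beta_4^3$, and so on until $\beta_{k-1}^{2(k-1)} \le \beta_{k-2}^{k-1} \beta_k^{k-1}$. Multiplying these, we get:
$$\prod_{j=1}^{k-1} \beta_j^{2j} \le \prod_{j=1}^{k-1} \beta_{j-1}^j \beta_{j+1}^j = (\beta_0 \beta_2)(\beta_1^2 \beta_3^2)(\beta_2^3 \beta_4^3) \cdots (\beta_{k-2}^{k-1} \beta_k^{k-1}) = \beta_0 \, \beta_{k-1}^{k-2} \, \beta_k^{k-1} \prod_{j=1}^{k-2} \beta_j^{2j}.$$
Dividing both sides by $\prod_{j=1}^{k-2} \beta_j^{2j}$, we get:
$$\beta_{k-1}^{2k-2} \le \beta_0 \, \beta_k^{k-1} \beta_{k-1}^{k-2} \;\overset{\beta_0 = 1}{\Longrightarrow}\; \beta_{k-1}^{k} \le \beta_k^{k-1} \;\Longrightarrow\; \beta_{k-1} \le \beta_k^{\frac{k-1}{k}} \;\Longrightarrow\; \beta_{k-1}^{\frac{1}{k-1}} \le \beta_k^{\frac{1}{k}}.$$

Lecture 25: Wednesday 10/27/1999. Scribe: Jürgen Symanzik

4 Random Vectors

4.1 Joint, Marginal, and Conditional Distributions

Definition 4.1.1: The vector $\underline{X}' = (X_1, \ldots, X_n)$ on $(\Omega, L, P) \to \mathbb{R}^n$ defined by $\underline{X}(\omega) = (X_1(\omega), \ldots, X_n(\omega))'$, $\omega \in \Omega$, is an $n$-dimensional random vector ($n$-rv) if $\underline{X}^{-1}(I) = \{\omega : X_1(\omega) \le a_1, \ldots, X_n(\omega) \le a_n\} \in L$ for all $n$-dimensional intervals $I = \{(x_1, \ldots, x_n) : -\infty < x_i \le a_i, \; a_i \in \mathbb{R} \;\forall i = 1, \ldots, n\}$.

Note: It follows that if $X_1, \ldots, X_n$ are any $n$ rv's on $(\Omega, L, P)$, then $\underline{X} = (X_1, \ldots, X_n)'$ is an $n$-rv on $(\Omega, L, P)$ since for any $I$, it holds:
$$\underline{X}^{-1}(I) = \{\omega : X_1(\omega) \le a_1, \ldots, X_n(\omega) \le a_n\} = \bigcap_{k=1}^{n} \{\omega : X_k(\omega) \le a_k\} \in L,$$
since each of the sets $\{\omega : X_k(\omega) \le a_k\}$ is in $L$.

Definition 4.1.2: For an $n$-rv $\underline{X}$, a function $F$ defined by
$$F(\underline{x}) = P(\underline{X} \le \underline{x}) = P(X_1 \le x_1, \ldots, X_n \le x_n) \quad \forall \underline{x} \in \mathbb{R}^n$$
is the joint cumulative distribution function (joint cdf) of $\underline{X}$.

Note:
(i) $F$ is non-decreasing and right-continuous in each of its arguments $x_i$.
(ii) $\lim_{\underline{x} \to \infty} F(\underline{x}) = \lim_{x_1 \to \infty, \ldots, x_n \to \infty} F(\underline{x}) = 1$ and $\lim_{x_k \to -\infty} F(\underline{x}) = 0 \;\forall x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n \in \mathbb{R}$.
However, conditions (i) and (ii) together are not sufficient for $F$ to be a joint cdf. Instead we need the conditions from the next Theorem.

Theorem 4.1.3: A function $F(\underline{x}) = F(x_1, \ldots, x_n)$ is the joint cdf of some $n$-rv $\underline{X}$ iff
(i) $F$ is non-decreasing and right-continuous with respect to each $x_i$,
(ii) $F$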
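$(-\infty, x_2, \ldots, x_n) = F(x_1, -\infty, x_3, \ldots, x_n) = \ldots = F(x_1, \ldots, x_{n-1}, -\infty) = 0$ and $F(\infty, \ldots, \infty) = 1$, and
(iii) $\forall \underline{x} \in \mathbb{R}^n \;\forall \epsilon_i > 0$, $i = 1, \ldots, n$, the following inequality holds:
$$F(\underline{x} + \underline{\epsilon}) - \sum_{i=1}^{n} F(x_1 + \epsilon_1, \ldots, x_{i-1} + \epsilon_{i-1}, x_i, x_{i+1} + \epsilon_{i+1}, \ldots, x_n + \epsilon_n)$$
$$+ \sum_{1 \le i < j \le n} F(x_1 + \epsilon_1, \ldots, x_{i-1} + \epsilon_{i-1}, x_i, x_{i+1} + \epsilon_{i+1}, \ldots, x_{j-1} + \epsilon_{j-1}, x_j, x_{j+1} + \epsilon_{j+1}, \ldots, x_n + \epsilon_n) \mp \ldots + (-1)^n F(\underline{x}) \ge 0.$$

Note: We won't prove this Theorem but just see why we need condition (iii) for $n = 2$:
$$P(x_1 < X \le x_2, y_1 < Y \le y_2) = P(X \le x_2, Y \le y_2) - P(X \le x_1, Y \le y_2) - P(X \le x_2, Y \le y_1) + P(X \le x_1, Y \le y_1) \ge 0.$$

We will restrict ourselves to $n = 2$ for most of the next Definitions and Theorems, but those can be easily generalized to $n > 2$. The term bivariate rv is often used to refer to a 2-rv and multivariate rv is used to refer to an $n$-rv, $n \ge 2$.

Definition 4.1.4: A 2-rv $(X, Y)$ is discrete if there exists a countable collection $\mathcal{X}$ of pairs $(x_i, y_j)$ that has probability 1. Let $p_{ij} = P(X = x_i, Y = y_j) > 0 \;\forall (x_i, y_j) \in \mathcal{X}$. Then, $\sum_{i,j} p_{ij} = 1$ and $\{p_{ij}\}$ is the joint probability mass function (joint pmf) of $(X, Y)$.

Definition 4.1.5: Let $(X, Y)$ be a discrete 2-rv with joint pmf $\{p_{ij}\}$. Define
$$p_{i\cdot} = \sum_{j=1}^{\infty} p_{ij} = \sum_{j=1}^{\infty} P(X = x_i, Y = y_j) = P(X = x_i)$$
and
$$p_{\cdot j} = \sum_{i=1}^{\infty} p_{ij} = \sum_{i=1}^{\infty} P(X = x_i, Y = y_j) = P(Y = y_j).$$
Then $\{p_{i\cdot}\}$ is called the marginal probability mass function (marginal pmf) of $X$ and $\{p_{\cdot j}\}$ is called the marginal probability mass function of $Y$.

Definition 4.1.6: A 2-rv $(X, Y)$ is continuous if there exists a non-negative function $f$ such that
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du \quad \forall (x, y) \in \mathbb{R}^2,$$
where $F$ is the joint cdf of $(X, Y)$. We call $f$ the joint probability density function (joint pdf) of $(X, Y)$.

Note: If $f$ is continuous at $(x, y)$, then $\frac{\partial^2 F(x, y)}{\partial x\, \partial y} = f(x, y)$.

Definition 4.1.7: Let $(X, Y)$ be a continuous 2-rv with joint pdf $f$. Then $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ is called the marginal probability density function (marginal pdf) of $X$ and $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$ is called the marginal probability density function of $Y$.

Note:
(i) $\int_{-\infty}^{\infty} f_X(x)\,dx = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(x, y)\,dy \right) dx = F(\infty, \infty) = 1 = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(x, y)\,dx \right) dy = \int_{-\infty}^{\infty} f_Y(y)\,dy$, and $f_X(x) \ge 0 \;\forall x \in \mathbb{R}$ and $f_Y(y) \ge 0 \;\forall y \in \mathbb{R}$.
(ii) Given a 2-rv $(X, Y)$ with joint cdf $F(x, y)$, how do we generate a marginal cdf $F_X(x) = P(X \le x)$? The answer is $P(X \le x) = P(X \le x, -\infty < Y < \infty) = F(x, \infty)$.

Lecture 26: Friday 10/29/1999. Scribe: Jürgen Symanzik

Definition 4.1.8: If $F_{\underline{X}}(x_1, \ldots, x_n) = F_{\underline{X}}(\underline{x})$ is the joint cdf of an $n$-rv $\underline{X} = (X_1, \ldots, X_n)$, then the marginal cumulative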
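distribution function of $(X_{i_1}, \ldots, X_{i_k})$, $1 \le k \le n$, $1 \le i_1 < i_2 < \ldots < i_k \le n$, is given by
$$\lim_{x_i \to \infty, \; i \ne i_1, \ldots, i_k} F_{\underline{X}}(\underline{x}) = F_{\underline{X}}(\infty, \ldots, \infty, x_{i_1}, \infty, \ldots, \infty, x_{i_2}, \infty, \ldots, \infty, x_{i_k}, \infty, \ldots, \infty).$$

Note: In Definition 1.4.1, we defined conditional probability distributions in some probability space $(\Omega, L, P)$. This definition extends to conditional distributions of 2-rv's $(X, Y)$.

Definition 4.1.9: Let $(X, Y)$ be a discrete 2-rv. If $P(Y = y_j) = p_{\cdot j} > 0$, then the conditional probability mass function (conditional pmf) of $X$ given $Y = y_j$ (for fixed $j$) is defined as
$$p_{i|j} = P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{p_{ij}}{p_{\cdot j}}.$$

Note: For a continuous 2-rv $(X, Y)$ with pdf $f$, $P(X \le x \mid Y = y)$ is not defined. Let $\epsilon > 0$ and suppose that $P(y - \epsilon < Y \le y + \epsilon) > 0$. For every $x$ and every interval $(y - \epsilon, y + \epsilon]$, consider the conditional probability of $X \le x$ given $Y \in (y - \epsilon, y + \epsilon]$. We have
$$P(X \le x \mid y - \epsilon < Y \le y + \epsilon) = \frac{P(X \le x, \; y - \epsilon < Y \le y + \epsilon)}{P(y - \epsilon < Y \le y + \epsilon)},$$
which is well-defined if $P(y - \epsilon < Y \le y + \epsilon) > 0$ holds.
So, when does $\lim_{\epsilon \to 0^+} P(X \le x \mid Y \in (y - \epsilon, y + \epsilon])$ exist? See the next definition.

Definition 4.1.10: The conditional cumulative distribution function (conditional cdf) of a rv $X$ given that $Y = y$ is defined to be
$$F_{X|Y}(x \mid y) = \lim_{\epsilon \to 0^+} P(X \le x \mid Y \in (y - \epsilon, y + \epsilon])$$
provided that this limit exists. If it does exist, the conditional probability density function (conditional pdf) of $X$ given that $Y = y$ is any non-negative function $f_{X|Y}(x \mid y)$ satisfying
$$F_{X|Y}(x \mid y) = \int_{-\infty}^{x} f_{X|Y}(t \mid y)\,dt \quad \forall x \in \mathbb{R}.$$

Note: For fixed $y$, $f_{X|Y}(x \mid y) \ge 0$ and $\int_{-\infty}^{\infty} f_{X|Y}(x \mid y)\,dx = 1$. So it is really a pdf.

Theorem 4.1.11: Let $(X, Y)$ be a continuous 2-rv with joint pdf $f_{X,Y}$. It holds that at every point $(x, y)$ where $f$ is continuous and the marginal pdf $f_Y(y) > 0$, we have
$$F_{X|Y}(x \mid y) = \lim_{\epsilon \to 0^+} \frac{P(X \le x, \; Y \in (y - \epsilon, y + \epsilon])}{P(Y \in (y - \epsilon, y + \epsilon])} = \lim_{\epsilon \to 0^+} \frac{\frac{1}{2\epsilon} \int_{-\infty}^{x} \int_{y - \epsilon}^{y + \epsilon} f_{X,Y}(u, v)\,dv\,du}{\frac{1}{2\epsilon} \int_{y - \epsilon}^{y + \epsilon} f_Y(v)\,dv} = \frac{\int_{-\infty}^{x} f_{X,Y}(u, y)\,du}{f_Y(y)} = \int_{-\infty}^{x} \frac{f_{X,Y}(u, y)}{f_Y(y)}\,du.$$
Thus, $f_{X|Y}(x \mid y)$ exists and equals $\frac{f_{X,Y}(x, y)}{f_Y(y)}$, provided that $f_Y(y) > 0$. Furthermore, since
$$\int_{-\infty}^{x} f_{X,Y}(u, y)\,du = f_Y(y) F_{X|Y}(x \mid y),$$
we get the following marginal cdf of $X$:
$$F_X(x) = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{x} f_{X,Y}(u, y)\,du \right) dy = \int_{-\infty}^{\infty} f_Y(y) F_{X|Y}(x \mid y)\,dy.$$

Example 4.1.12: Consider
$$f_{X,Y}(x, y) = \begin{cases} 2, & 0 < x < y < 1 \\ 0, & \text{otherwise} \end{cases}$$
We calculate the marginal pdf's $f_X(x)$ and $f_Y(y)$ first:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \int_x^1 2\,dy = 2(1 - x) \quad \text{for } 0 < x < 1$$
and
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx = \int_0^y 2\,dx = 2y \quad \text{for } 0 < y < 1.$$
The conditional pdf's $f_{Y|X}(y \mid x)$ and $f_{X|Y}(x \mid y)$ are calculated as follows:
$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{2}{2(1 - x)} = \frac{1}{1 - x} \quad \text{for } x < y < 1 \text{ (where } 0 < x < 1)$$
and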
1 leyayM77for0ltmltywhere0ltylt1 My 2J 9 Thus it holds that Y X a N Uar 1 and X Y y N U0y ie both conditional pdfquots are related to uniform distributions l 42 Independent Random Variables Example 421 from Rohatgi page 119 Example 1 Let f1 f2f3 be 3 pdf s with cdf s F1 72173 and let a g 1 Define fawuw2w3 f1m1f2m2f3w34 1 cr2Flm1 12Fzm2 12F3m3 1 We can show a fa is a pdf for all 139 e 11 ii fa 1 s a g 1 all have marginal pdfquots f1f2f3 See book for proof and further discussion but when do the marginal distributions uniquely determine the joint distribution l 77 Definition 422 Let ny 31 y be the joint cdf and fix31 and Fyy be the marginal cdf s of a 2rv X Y X and Y are independent if 00 y FxFgt y W00 9 E 3552 I Lemma 423 If X and Y are independent 11 ad 6 IR and a lt I and c lt 1 then Pa lt X g Ilt Y 3 d Pa lt X g bPclt Y3 1 Proof P lt X S I C lt Y S d FXgtId FXgtLd FXgtI3 FXQam 2 FX 3Fy FX aFy FX 3Fy 3 FX aFy 3 FXU FXQFYd FvW Pa lt X g bPc lt Y 3 d I Definition 424 A collection of rv s X1 Xn with joint cdf and marginal cdf s are mutually or completely independent ifl ng Fmei Vz 6 IR i1 I Note We often simply that the rv s X1 X are independent when we really mean that they are mutually independent I 78 Stat 6710 Mathematical Statistics I Fall Semester 1999 Lecture 27 Monday 1111999 Scribe Jurgen Symanzik Bill Morphet Theorem 425 Factorization Theorem i A necessary and su icient condition for discrete rv s X1 X to be independent is that R PgPX1y1Xn mn HPOQ 39 me x 7 where X C IR is the countable support of i ii For an absolutely continuous n rv i X1 Xn X1 X are independent iff leygtXn 7lf Wm i1 where f is the joint pdf and fixI are the marginal pdfs of i Proof i Discrete case This is a generalization of Rohatgi page 120 Theorem 1 to ndimensional random vectors The Theorem is based on Lemma 423 see also Rohatgi page 120 Lemma 1 which gives 39 of a bounded region in a twodimensional support A L the a lt X g I c lt Y 5 d The extension of the Lemma to ndimensional space 
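gives the cumulative probability in a completely bounded region in an $n$-dimensional support hyperspace. Let $\underline{X}$ be a random vector whose components are independent random variables of the discrete type with $P(\underline{X} = \underline{b}) > 0$. Lemma 4.2.3 extends to
\[ \lim_{\underline{a} \uparrow \underline{b}} P(\underline{a} < \underline{X} \le \underline{b}) = \lim_{a_i \uparrow b_i\ \forall i} P(a_1 < X_1 \le b_1,\ a_2 < X_2 \le b_2,\ \dots,\ a_n < X_n \le b_n) = P(X_1 = b_1, X_2 = b_2, \dots, X_n = b_n), \]
and, by independence, the same limit equals
\[ \lim_{a_i \uparrow b_i\ \forall i}\ \prod_{i=1}^{n} P(a_i < X_i \le b_i) = \prod_{i=1}^{n} P(X_i = b_i). \]

Before considering the converse factorization of the joint cdf of an n-rv, recall that independence allows each value in the support of a component to combine with each of all possible combinations of the other component values: a 3-dimensional vector where the first component has support $\{x_{1a}, x_{1b}, x_{1c}\}$, the second component has support $\{x_{2a}, x_{2b}, x_{2c}\}$, and the third component has support $\{x_{3a}, x_{3b}, x_{3c}\}$ has $3 \times 3 \times 3 = 27$ points in its support. Due to independence, all these vectors can be arranged into three sets, the first set having $x_{1a}$, the second set having $x_{1b}$, and the third set having $x_{1c}$, with each set having 9 combinations of $x_2$ and $x_3$. Summing the factorized probabilities set by set gives
\[ F_{\underline{X}}(x_1, x_2, x_3) = \sum_{x_{1i} \le x_1} \sum_{x_{2j} \le x_2} \sum_{x_{3k} \le x_3} P(X_1 = x_{1i})\, P(X_2 = x_{2j})\, P(X_3 = x_{3k}) = F_{X_1}(x_1) F_{X_2}(x_2) F_{X_3}(x_3). \]

More generally, for $n$ dimensions, let $\underline{x}_i = (x_{i1}, x_{i2}, \dots, x_{in})$ and $B = \{ \underline{x}_i : x_{i1} \le x_1,\ x_{i2} \le x_2,\ \dots,\ x_{in} \le x_n \}$, $B \subseteq \mathcal{X}$. Then it holds:
\[ F_{\underline{X}}(\underline{x}) = \sum_{\underline{x}_i \in B} P(\underline{X} = \underline{x}_i) = \sum_{\underline{x}_i \in B} P(X_1 = x_{i1}, X_2 = x_{i2}, \dots, X_n = x_{in}) \]
\[ \stackrel{\text{indep.}}{=} \sum_{\underline{x}_i \in B} P(X_1 = x_{i1})\, P(X_2 = x_{i2}, \dots, X_n = x_{in}) \]
\[ = \sum_{x_{i1} \le x_1}\ \sum_{x_{i2} \le x_2, \dots, x_{in} \le x_n} P(X_1 = x_{i1})\, P(X_2 = x_{i2}, \dots, X_n = x_{in}) \]
\[ = \sum_{x_{i1} \le x_1} P(X_1 = x_{i1}) \sum_{x_{i2} \le x_2, \dots, x_{in} \le x_n} P(X_2 = x_{i2}, \dots, X_n = x_{in}) \]
\[ = F_{X_1}(x_1) \sum_{x_{i2} \le x_2, \dots, x_{in} \le x_n} P(X_2 = x_{i2}, \dots, X_n = x_{in}) \]
\[ \stackrel{\text{indep.}}{=} F_{X_1}(x_1) \sum_{x_{i2} \le x_2} P(X_2 = x_{i2}) \sum_{x_{i3} \le x_3, \dots, x_{in} \le x_n} P(X_3 = x_{i3}, \dots, X_n = x_{in}) \]
\[ = F_{X_1}(x_1) F_{X_2}(x_2) \sum_{x_{i3} \le x_3, \dots, x_{in} \le x_n} P(X_3 = x_{i3}, \dots, X_n = x_{in}) = \dots = \prod_{i=1}^{n} F_{X_i}(x_i). \]

(ii) Continuous case: Homework.

Theorem 4.2.6: $X_1, \dots, X_n$ are independent iff
\[ P(X_i \in A_i,\ i = 1, \dots, n) = \prod_{i=1}^{n} P(X_i \in A_i) \quad \forall \text{ Borel sets } A_i \in \mathcal{B} \]
(i.e., rv's are independent iff all events involving these rv's are independent).

Proof: Lemma 4.2.3 and the definition of Borel sets.

Theorem 4.2.7: Let $X_1, \dots, X_n$ be independent rv's and $g_1, \dots, g_n$ be Borel-measurable functions. Then $g_1(X_1), g_2(X_2), \dots, g_n(X_n)$ are independent.

Proof:
\[ F_{g_1(X_1), \dots, g_n(X_n)}(h_1, \dots, h_n) = P(g_1(X_1) \le h_1, \dots, g_n(X_n) \le h_n) \]
\[ = P\bigl(X_1 \in g_1^{-1}((-\infty, h_1]), \dots, X_n \in g_n^{-1}((-\infty, h_n])\bigr) \]
\[ \stackrel{(*)\ \text{Th. 4.2.6}}{=} \prod_{i=1}^{n} P\bigl(X_i \in g_i^{-1}((-\infty, h_i])\bigr) = \prod_{i=1}^{n} P(g_i(X_i) \le h_i) = \prod_{i=1}^{n} F_{g_i(X_i)}(h_i) \]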
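($*$) holds since $g_1^{-1}((-\infty, h_1]) \in \mathcal{B}, \dots, g_n^{-1}((-\infty, h_n]) \in \mathcal{B}$.

Theorem 4.2.8: If $X_1, \dots, X_n$ are independent, then also every subcollection $X_{i_1}, \dots, X_{i_k}$, $k = 2, \dots, n - 1$, $1 \le i_1 < i_2 < \dots < i_k \le n$, is independent.

Definition 4.2.9: A set (or a sequence) of rv's $\{X_n\}_{n=1}^{\infty}$ is independent iff every finite subcollection is independent.

Note: Recall that $X$ and $Y$ are identically distributed iff $F_X(x) = F_Y(x)\ \forall x \in \mathbb{R}$, according to Definition 2.2.5 and Theorem 2.2.6.

Definition 4.2.10: We say that $\{X_n\}_{n=1}^{\infty}$ is a set (or a sequence) of independent identically distributed (iid) rv's if $\{X_n\}_{n=1}^{\infty}$ is independent and all $X_n$ are identically distributed.

Note: Recall that $X$ and $Y$ being identically distributed does not say that $X = Y$ with probability 1. If this happens, we say that $X$ and $Y$ are equivalent rv's.

Note: We can also extend the definition of independence to 2 random vectors $\underline{X}$ and $\underline{Y}$: $\underline{X}$ and $\underline{Y}$ are independent iff
\[ F_{\underline{X}, \underline{Y}}(\underline{x}, \underline{y}) = F_{\underline{X}}(\underline{x})\, F_{\underline{Y}}(\underline{y}) \quad \text{for all } \underline{x} \text{ and } \underline{y}. \]
This does not mean that the components $X_i$ of $\underline{X}$ or the components $Y_i$ of $\underline{Y}$ are independent. However, it does mean that each pair of components $(X_i, Y_i)$ are independent, any subcollections $(X_{i_1}, \dots, X_{i_k})$ and $(Y_{j_1}, \dots, Y_{j_l})$ are independent, and any Borel-measurable functions $f(\underline{X})$ and $g(\underline{Y})$ are independent.

Corollary 4.2.11 (to Factorization Theorem 4.2.5): If $X$ and $Y$ are independent rv's, then
\[ F_{X \mid Y}(x \mid y) = F_X(x)\ \forall x \quad \text{and} \quad F_{Y \mid X}(y \mid x) = F_Y(y)\ \forall y. \]

4.3 Functions of Random Vectors

Theorem 4.3.1: If $X$ and $Y$ are rv's on $(\Omega, L, P) \to \mathbb{R}$, then
(i) $X \pm Y$ is a rv.
(ii) $XY$ is a rv.
(iii) If $\{\omega : Y(\omega) = 0\} = \emptyset$, then $\frac{X}{Y}$ is a rv.

Theorem 4.3.2: Let $X_1, \dots, X_n$ be rv's on $(\Omega, L, P) \to \mathbb{R}$. Define
\[ MAX_n = \max\{X_1, \dots, X_n\} = X_{(n)} \quad \text{by} \quad MAX_n(\omega) = \max\{X_1(\omega), \dots, X_n(\omega)\}\ \forall \omega \in \Omega \]
and
\[ MIN_n = \min\{X_1, \dots, X_n\} = X_{(1)} = -\max\{-X_1, \dots, -X_n\} \quad \text{by} \quad MIN_n(\omega) = \min\{X_1(\omega), \dots, X_n(\omega)\}\ \forall \omega \in \Omega. \]
Then,
(i) $MIN_n$ and $MAX_n$ are rv's.
(ii) If $X_1, \dots, X_n$ are independent, then
\[ F_{MAX_n}(z) = P(MAX_n \le z) = P(X_i \le z\ \forall i = 1, \dots, n) = \prod_{i=1}^{n} F_{X_i}(z) \]
and
\[ F_{MIN_n}(z) = P(MIN_n \le z) = 1 - P(X_i > z\ \forall i = 1, \dots, n) = 1 - \prod_{i=1}^{n} (1 - F_{X_i}(z)). \]
(iii) If $\{X_i\}_{i=1}^{n}$ are iid rv's with common cdf $F_X$, then
\[ F_{MAX_n}(z) = F_X^{\,n}(z) \quad \text{and} \quad F_{MIN_n}(z) = 1 - (1 - F_X(z))^n. \]
If $F_X$ is absolutely continuous with pdf $f_X$, then the pdf's of $MAX_n$ and $MIN_n$ are
\[ f_{MAX_n}(z) = n\, F_X^{\,n-1}(z)\, f_X(z) \quad \text{and} \quad f_{MIN_n}(z) = n\, (1 - F_X(z))^{n-1}\, f_X(z) \]
for all continuity points of $F_X$.

Note: Using Theorem 4.3.2, it is easy to derive the joint cdf and pdf of $MAX_n$ and $MIN_n$ for iid rv's $X_1, \dots, X_n$. For example, if the $X_i$'s are iid with cdf $F_X$ and pdf $f_X$, then the joint pdf of $MAX_n$ and $MIN_n$ is
\[ f_{MAX_n, MIN_n}(x, y) = \begin{cases} 0, & x \le y \\ n(n - 1)\,(F_X(x) - F_X(y))^{n-2}\, f_X(x) f_X(y), & x > y. \end{cases} \]
However, note that $MAX_n$ and $MIN_n$ are not independent. See Rohatgi, page 129, Corollary, for more details.

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 28: Wednesday 11/3/1999. Scribe: Jürgen Symanzik, Rich Madsen.

Note: The previous transformations are special cases of the following Theorem:

Theorem 4.3.3: If $g : \mathbb{R}^n \to \mathbb{R}^m$ is a Borel-measurable function (i.e., $\forall B \in \mathcal{B}^m : g^{-1}(B) \in \mathcal{B}^n$) and if $\underline{X} = (X_1, \dots, X_n)$ is an n-rv, then $g(\underline{X})$ is an m-rv.

Proof: If $B \in \mathcal{B}^m$, then $\{\omega : g(\underline{X}(\omega)) \in B\} = \{\omega : \underline{X}(\omega) \in g^{-1}(B)\} \in L$.

Question: How do we handle more general transformations of $\underline{X}$?

Discrete Case: Let $\underline{X} = (X_1, \dots, X_n)$ be a discrete n-rv and $\mathcal{X} \subset \mathbb{R}^n$ be the countable support of $\underline{X}$, i.e., $P(\underline{X} \in \mathcal{X}) = 1$ and $P(\underline{X} = \underline{x}) > 0\ \forall \underline{x} \in \mathcal{X}$. Define $u_i = g_i(x_1, \dots, x_n)$, $i = 1, \dots, n$, to be 1-to-1 mappings of $\mathcal{X}$ onto $B$. Let $\underline{u} = (u_1, \dots, u_n)$. Then
\[ P(\underline{U} = \underline{u}) = P(g_1(\underline{X}) = u_1, \dots, g_n(\underline{X}) = u_n) = P(X_1 = h_1(\underline{u}), \dots, X_n = h_n(\underline{u})) \quad \forall \underline{u} \in B, \]
where $x_i = h_i(\underline{u})$, $i = 1, \dots, n$, is the inverse transformation (and $P(\underline{U} = \underline{u}) = 0\ \forall \underline{u} \notin B$). The joint marginal pmf of any subcollection of $u_i$'s is now obtained by summing over the other remaining $u_j$'s.

Example 4.3.4: Let $X, Y$ be iid $Bin(n, p)$, $0 < p < 1$. Let $U = \frac{X}{Y + 1}$ and $V = Y + 1$. Then $X = UV$ and $Y = V - 1$. So the joint pmf of $U, V$ is
\[ P(U = u, V = v) = \binom{n}{uv} p^{uv} (1 - p)^{n - uv} \binom{n}{v - 1} p^{v - 1} (1 - p)^{n + 1 - v} = \binom{n}{uv} \binom{n}{v - 1} p^{uv + v - 1} (1 - p)^{2n + 1 - uv - v} \]
for $v \in \{1, 2, \dots, n + 1\}$ and $uv \in \{0, 1, \dots, n\}$.

Continuous Case: Let $\underline{X} = (X_1, \dots, X_n)$ be a continuous n-rv with joint cdf $F_{\underline{X}}$ and joint pdf $f_{\underline{X}}$. Let $\underline{U} = (U_1, \dots, U_n)' = g(\underline{X})$, i.e., $U_i = g_i(\underline{X})$, be a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$. If $B \in \mathcal{B}^n$, then
\[ P(\underline{U} \in B) = P(\underline{X} \in g^{-1}(B)) = \int_{g^{-1}(B)} f_{\underline{X}}(\underline{x})\, d\underline{x} = \int \dots \int_{g^{-1}(B)} f_{\underline{X}}(\underline{x}) \prod_{i=1}^{n} dx_i, \]
where $g^{-1}(B) = \{ \underline{x} = (x_1, \dots, x_n) \in \mathbb{R}^n : g(\underline{x}) \in B \}$. Suppose we define $B$ as the half-infinite $n$-dimensional interval
\[ B_{\underline{u}} = \{ (u_1', \dots, u_n') : -\infty < u_i' \le u_i\ \forall i = 1, \dots, n \} \]
for any $\underline{u} \in \mathbb{R}^n$. Then the joint cdf of $\underline{U}$ is
\[ G_{\underline{U}}(\underline{u}) = P(\underline{U} \in B_{\underline{u}}) = P(g_1(\underline{X}) \le u_1, \dots, g_n(\underline{X}) \le u_n) = \int \dots \int_{g^{-1}(B_{\underline{u}})} f_{\underline{X}}(\underline{x})\, d\underline{x}. \]
If $G$ happens to be absolutely continuous, the joint pdf of $\underline{U}$ will be given by $f_{\underline{U}}(\underline{u}) = \frac{\partial^n G(\underline{u})}{\partial u_1 \partial u_2 \dots \partial u_n}$ at every continuity point of $f_{\underline{U}}$. Under certain conditions, we can write $f_{\underline{U}}$ in terms of the original pdf $f_{\underline{X}}$ of $\underline{X}$, as stated in the next Theorem:

Theorem 4.3.5: Multivariate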
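Transformation: Let $\underline{X} = (X_1, \dots, X_n)$ be a continuous n-rv with joint pdf $f_{\underline{X}}$.
(i) Let $\underline{U} = (U_1, \dots, U_n)' = g(\underline{X})$, i.e., $U_i = g_i(\underline{X})$, be a 1-to-1 mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$, i.e., there exist inverses $h_i$, $i = 1, \dots, n$, such that $x_i = h_i(\underline{u}) = h_i(u_1, \dots, u_n)$, $i = 1, \dots, n$, over the range of the transformation $g$.
(ii) Assume both $g$ and $h$ are continuous.
(iii) Assume partial derivatives $\frac{\partial x_i}{\partial u_j} = \frac{\partial h_i(\underline{u})}{\partial u_j}$, $i, j = 1, \dots, n$, exist and are continuous.
(iv) Assume that the Jacobian of the inverse transformation
\[ J = \det \begin{pmatrix} \frac{\partial x_1}{\partial u_1} & \cdots & \frac{\partial x_1}{\partial u_n} \\ \vdots & & \vdots \\ \frac{\partial x_n}{\partial u_1} & \cdots & \frac{\partial x_n}{\partial u_n} \end{pmatrix} \]
is different from 0 for all $\underline{u}$ in the range of $g$.
Then the n-rv $\underline{U} = g(\underline{X})$ has a joint absolutely continuous cdf with corresponding joint pdf
\[ f_{\underline{U}}(\underline{u}) = |J|\, f_{\underline{X}}(h_1(\underline{u}), \dots, h_n(\underline{u})). \]

Proof: Let $\underline{u} \in \mathbb{R}^n$ and $B_{\underline{u}} = \{ (u_1', \dots, u_n') : -\infty < u_i' \le u_i\ \forall i = 1, \dots, n \}$. Then
\[ G_{\underline{U}}(\underline{u}) = \int \dots \int_{g^{-1}(B_{\underline{u}})} f_{\underline{X}}(\underline{x})\, d\underline{x} = \int \dots \int_{B_{\underline{u}}} f_{\underline{X}}(h_1(\underline{u}'), \dots, h_n(\underline{u}'))\, |J|\, d\underline{u}'. \]
The result follows from differentiation of $G_{\underline{U}}$. For additional steps of the proof, see Rohatgi (page 135 and Theorem 17 on page 10) or a book on multivariate calculus.

Theorem 4.3.6: Let $\underline{X} = (X_1, \dots, X_n)$ be a continuous n-rv with joint pdf $f_{\underline{X}}$.
(i) Let $\underline{U} = (U_1, \dots, U_n)' = g(\underline{X})$, i.e., $U_i = g_i(\underline{X})$, be a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$.
(ii) Let $\mathcal{X} = \{ \underline{x} : f_{\underline{X}}(\underline{x}) > 0 \}$ be the support of $\underline{X}$.
(iii) Suppose that for each $\underline{u} \in B = \{ \underline{u} \in \mathbb{R}^n : \underline{u} = g(\underline{x}) \text{ for some } \underline{x} \in \mathcal{X} \}$ there is a finite number $k = k(\underline{u})$ of inverses.
(iv) Suppose we can partition $\mathcal{X}$ into $\mathcal{X}_0, \mathcal{X}_1, \dots, \mathcal{X}_k$ such that
  (a) $P(\underline{X} \in \mathcal{X}_0) = 0$.
  (b) $\underline{U} = g(\underline{X})$ is a 1-to-1 mapping from $\mathcal{X}_l$ onto $B$ for all $l = 1, \dots, k$, with inverse transformation $h_l(\underline{u}) = (h_{l1}(\underline{u}), \dots, h_{ln}(\underline{u}))'$, $\underline{u} \in B$, i.e., for each $\underline{u} \in B$, $h_l(\underline{u})$ is the unique $\underline{x} \in \mathcal{X}_l$ such that $\underline{u} = g(\underline{x})$.
(v) Assume partial derivatives $\frac{\partial x_i}{\partial u_j} = \frac{\partial h_{li}(\underline{u})}{\partial u_j}$, $l = 1, \dots, k$, $i, j = 1, \dots, n$, exist and are continuous.
(vi) Assume the Jacobian of each of the inverse transformations
\[ J_l = \det \begin{pmatrix} \frac{\partial h_{l1}}{\partial u_1} & \cdots & \frac{\partial h_{l1}}{\partial u_n} \\ \vdots & & \vdots \\ \frac{\partial h_{ln}}{\partial u_1} & \cdots & \frac{\partial h_{ln}}{\partial u_n} \end{pmatrix}, \quad l = 1, \dots, k, \]
is different from 0 for all $\underline{u}$ in the range of $g$.
Then the joint pdf of $\underline{U}$ is given by
\[ f_{\underline{U}}(\underline{u}) = \sum_{l=1}^{k} |J_l|\, f_{\underline{X}}(h_{l1}(\underline{u}), \dots, h_{ln}(\underline{u})). \]

Example 4.3.7: Let $X, Y$ be iid $N(0, 1)$. Define
\[ U = g_1(X, Y) = \begin{cases} \frac{X}{Y}, & Y \neq 0 \\ 0, & Y = 0 \end{cases} \quad \text{and} \quad V = g_2(X, Y) = |Y|. \]
$\mathcal{X} = \mathbb{R}^2$, but $U, V$ are not 1-to-1 mappings since $(U, V)(x, y) = (U, V)(-x, -y)$, i.e., the conditions for the use of Theorem 4.3.5 do not apply. Let
\[ \mathcal{X}_0 = \{(x, y) : y = 0\}, \quad \mathcal{X}_1 = \{(x, y) : y > 0\}, \quad \mathcal{X}_2 = \{(x, y) : y < 0\}. \]
Then $P((X, Y) \in \mathcal{X}_0) = 0$. Let $B = \{(u, v) : v > 0\} = g(\mathcal{X}_1) = g(\mathcal{X}_2)$.
Inverses:
\[ B \to \mathcal{X}_1 : \quad x = h_{11}(u, v) = uv, \quad y = h_{12}(u, v) = v \]
\[ B \to \mathcal{X}_2 : \quad x = h_{21}(u, v) = -uv, \quad y = h_{22}(u, v) = -v \]
\[ J_1 = \det \begin{pmatrix} v & u \\ 0 & 1 \end{pmatrix} = v \implies |J_1| = |v|, \qquad J_2 = \det \begin{pmatrix} -v & -u \\ 0 & -1 \end{pmatrix} = v \implies |J_2| = |v|. \]
With $f_{X,Y}(x, y) = \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2}$, we get
\[ f_{U,V}(u, v) = |v| \frac{1}{2\pi} e^{-(uv)^2/2} e^{-v^2/2} + |v| \frac{1}{2\pi} e^{-(-uv)^2/2} e^{-(-v)^2/2} = \frac{v}{\pi} e^{-(u^2 + 1) v^2 / 2}, \quad -\infty < u < \infty,\ 0 < v < \infty. \]
Marginal (substituting $z = v^2$, $dz = 2v\, dv$):
\[ f_U(u) = \int_{0}^{\infty} \frac{v}{\pi} e^{-(u^2 + 1) v^2 / 2}\, dv = \int_{0}^{\infty} \frac{1}{2\pi} e^{-(u^2 + 1) z / 2}\, dz = \frac{1}{2\pi} \left[ -\frac{2}{u^2 + 1} e^{-(u^2 + 1) z / 2} \right]_{0}^{\infty} = \frac{1}{\pi (1 + u^2)}, \quad -\infty < u < \infty. \]
Thus, the quotient of two iid $N(0, 1)$ rv's is a rv that has a Cauchy distribution.

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 29: Friday 11/5/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir.

4.4 Order Statistics

Definition 4.4.1: Let $(X_1, \dots, X_n)$ be an n-rv. The $k$-th order statistic $X_{(k)}$ is the $k$-th smallest of the $X_i$'s, i.e., $X_{(1)} = \min\{X_1, \dots, X_n\}$, $X_{(2)} = \min\{\{X_1, \dots, X_n\} - X_{(1)}\}$, ..., $X_{(n)} = \max\{X_1, \dots, X_n\}$. It is $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$, and $\{X_{(1)}, X_{(2)}, \dots, X_{(n)}\}$ is the set of order statistics for $(X_1, \dots, X_n)$.

Note: As shown in Theorem 4.3.2, $X_{(1)}$ and $X_{(n)}$ are rv's. This result will be extended in the following Theorem:

Theorem 4.4.2: Let $(X_1, \dots, X_n)$ be an n-rv. Then the $k$-th order statistic $X_{(k)}$, $k = 1, \dots, n$, is also a rv.

Theorem 4.4.3: Let $X_1, \dots, X_n$ be continuous iid rv's with pdf $f_X$. The joint pdf of $X_{(1)}, \dots, X_{(n)}$ is
\[ f_{X_{(1)}, \dots, X_{(n)}}(x_1, \dots, x_n) = \begin{cases} n! \prod_{i=1}^{n} f_X(x_i), & x_1 \le x_2 \le \dots \le x_n \\ 0, & \text{otherwise.} \end{cases} \]

Proof: For the case $n = 3$, look at the following scenario of how $X_1, X_2$, and $X_3$ can possibly be ordered to yield $X_{(1)} < X_{(2)} < X_{(3)}$. Columns represent $X_{(1)}, X_{(2)}$, and $X_{(3)}$. Rows represent $X_1, X_2$, and $X_3$:
\[ X_1 < X_2 < X_3 : \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad X_1 < X_3 < X_2 : \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \qquad X_2 < X_1 < X_3 : \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
\[ X_2 < X_3 < X_1 : \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \qquad X_3 < X_1 < X_2 : \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \qquad X_3 < X_2 < X_1 : \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]
For $n = 3$, there are $3! = 6$ possible arrangements. In general, there are $n!$ arrangements of $X_1, \dots, X_n$ for each $(X_{(1)}, \dots, X_{(n)})$. This mapping is not 1-to-1. For each mapping, we have an $n \times n$ matrix $J$ that results from an $n \times n$ identity matrix through the rearrangement of rows. Therefore, $J = \pm 1$ and $|J| = 1$. By Theorem 4.3.6, we get for $x_1 \le x_2 \le \dots \le x_n$
\[ f_{X_{(1)}, \dots, X_{(n)}}(x_1, \dots, x_n) = n!\, f_{X_1, \dots, X_n}(x_{k_1}, x_{k_2}, \dots, x_{k_n}) = n! \prod_{i=1}^{n} f_{X_i}(x_{k_i}) = n! \prod_{i=1}^{n} f_X(x_i). \]

Theorem 4.4.4: Let $X_1, \dots, X_n$ be continuous iid rv's with pdf $f_X$ and cdf $F_X$. Then the following holds:
(i) The marginal pdf of $X_{(k)}$, $k = 1, \dots, n$, is
\[ f_{X_{(k)}}(x) = \frac{n!}{(k - 1)!\,(n - k)!}\, (F_X(x))^{k - 1} (1 - F_X(x))^{n - k} f_X(x). \]
(ii) The joint pdf of $X_{(j)}$ and $X_{(k)}$, $1 \le j < k \le n$, is
\[ f_{X_{(j)}, X_{(k)}}(x_j, x_k) = \frac{n!}{(j - 1)!\,(k - j - 1)!\,(n - k)!}\, (F_X(x_j))^{j - 1} (F_X(x_k) - F_X(x_j))^{k - j - 1} (1 - F_X(x_k))^{n - k} f_X(x_j) f_X(x_k) \]
if $x_j < x_k$, and 0 otherwise.

4.5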
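As a rough numerical sanity check of the marginal pdf in Theorem 4.4.4 (i), the sketch below evaluates the formula for an assumed special case that does not appear in the notes: $n = 5$ iid $U(0, 1)$ rv's and $k = 2$. The helper names `order_stat_pdf` and `midpoint_integral` are made up for this illustration. The density should integrate to 1, and for the uniform case its mean should be $k / (n + 1)$.

```python
from math import factorial


def order_stat_pdf(x, k, n, cdf, pdf):
    """Marginal pdf of the k-th order statistic of n iid continuous rv's
    (formula of Theorem 4.4.4 (i))."""
    coef = factorial(n) // (factorial(k - 1) * factorial(n - k))
    big_f = cdf(x)
    return coef * big_f ** (k - 1) * (1.0 - big_f) ** (n - k) * pdf(x)


def midpoint_integral(func, a, b, steps=20000):
    """Plain midpoint rule; adequate for smooth integrands on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


# Assumed illustration: X_1, ..., X_5 iid U(0, 1), second-smallest value.
n, k = 5, 2
uniform_cdf = lambda x: x      # F_X(x) = x on (0, 1)
uniform_pdf = lambda x: 1.0    # f_X(x) = 1 on (0, 1)


def density(x):
    return order_stat_pdf(x, k, n, uniform_cdf, uniform_pdf)


total_mass = midpoint_integral(density, 0.0, 1.0)
mean = midpoint_integral(lambda x: x * density(x), 0.0, 1.0)

print(total_mass)  # numerically close to 1
print(mean)        # numerically close to k / (n + 1)
```

Other continuous distributions can be tried by passing a different `cdf`/`pdf` pair and integrating over the corresponding support.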
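Multivariate Expectation

In this section, we assume that $\underline{X} = (X_1, \dots, X_n)$ is an n-rv and $g : \mathbb{R}^n \to \mathbb{R}^m$ is a Borel-measurable function.

Definition 4.5.1: If $m = 1$, i.e., $g$ is univariate, we define the following:
(i) Let $\underline{X}$ be discrete with joint pmf $p_{i_1, \dots, i_n} = P(X_1 = x_{i_1}, \dots, X_n = x_{i_n})$. If $\sum_{i_1, \dots, i_n} p_{i_1, \dots, i_n} \cdot |g(x_{i_1}, \dots, x_{i_n})| < \infty$, we define
\[ E(g(\underline{X})) = \sum_{i_1, \dots, i_n} p_{i_1, \dots, i_n} \cdot g(x_{i_1}, \dots, x_{i_n}), \]
and this value exists.
(ii) Let $\underline{X}$ be continuous with joint pdf $f_{\underline{X}}(\underline{x})$. If $\int_{\mathbb{R}^n} |g(\underline{x})|\, f_{\underline{X}}(\underline{x})\, d\underline{x} < \infty$, we define
\[ E(g(\underline{X})) = \int_{\mathbb{R}^n} g(\underline{x})\, f_{\underline{X}}(\underline{x})\, d\underline{x}, \]
and this value exists.

Note: The above can be extended to vector-valued functions $g$ ($m > 1$) in the obvious way. For example, if $g$ is the identity mapping from $\mathbb{R}^n \to \mathbb{R}^n$, then
\[ E(\underline{X}) = (E(X_1), \dots, E(X_n))' = (\mu_1, \dots, \mu_n)', \]
provided that $E(|X_i|) < \infty\ \forall i = 1, \dots, n$. Similarly, provided that all expectations exist, we get for the variance-covariance matrix:
\[ Var(\underline{X}) = \Sigma_{\underline{X}} = E\bigl( (\underline{X} - E(\underline{X}))(\underline{X} - E(\underline{X}))' \bigr) \]
with $(i, j)$-th component
\[ E\bigl( (X_i - E(X_i))(X_j - E(X_j)) \bigr) = Cov(X_i, X_j) \]
and with $(i, i)$-th component
\[ E\bigl( (X_i - E(X_i))^2 \bigr) = Var(X_i) = \sigma_i^2. \]
Joint higher-order moments can be defined similarly when needed.

Note: We are often interested in (weighted) sums of rv's or products of rv's and their expectations. This will be addressed in the next two Theorems:

Theorem 4.5.2: Let $X_i$, $i = 1, \dots, n$, be rv's such that $E(|X_i|) < \infty$. Let $a_1, \dots, a_n \in \mathbb{R}$ and define $S = \sum_{i=1}^{n} a_i X_i$. Then it holds that $E(|S|) < \infty$ and
\[ E(S) = \sum_{i=1}^{n} a_i E(X_i). \]

Proof: Continuous case only:
\[ E(|S|) = \int_{\mathbb{R}^n} \Bigl| \sum_{i=1}^{n} a_i x_i \Bigr| f_{\underline{X}}(\underline{x})\, d\underline{x} \le \int_{\mathbb{R}^n} \sum_{i=1}^{n} |a_i|\, |x_i|\, f_{\underline{X}}(\underline{x})\, d\underline{x} \]
\[ = \sum_{i=1}^{n} |a_i| \int_{\mathbb{R}} |x_i| \left( \int_{\mathbb{R}^{n-1}} f_{\underline{X}}(\underline{x})\, dx_1 \dots dx_{i-1}\, dx_{i+1} \dots dx_n \right) dx_i = \sum_{i=1}^{n} |a_i| \int_{\mathbb{R}} |x_i|\, f_{X_i}(x_i)\, dx_i = \sum_{i=1}^{n} |a_i|\, E(|X_i|) < \infty. \]
It follows that $E(S) = \sum_{i=1}^{n} a_i E(X_i)$ by the same argument without using the absolute values.

Note: If $X_i$, $i = 1, \dots, n$, are iid with $E(X_i) = \mu$, then
\[ E(\bar{X}) = E\Bigl( \frac{1}{n} \sum_{i=1}^{n} X_i \Bigr) = \sum_{i=1}^{n} \frac{1}{n} E(X_i) = \mu. \]

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 30: Monday 11/8/1999. Scribe: Jürgen Symanzik, Bill Morphet.

Theorem 4.5.3: Let $X_i$, $i = 1, \dots, n$, be independent rv's such that $E(|X_i|) < \infty$. Let $g_i$, $i = 1, \dots, n$, be Borel-measurable functions. Then
\[ E\Bigl( \prod_{i=1}^{n} g_i(X_i) \Bigr) = \prod_{i=1}^{n} E(g_i(X_i)) \]
if all expectations exist.

Proof: By Theorem 4.2.5, $f_{\underline{X}}(\underline{x}) = \prod_{i=1}^{n} f_{X_i}(x_i)$, and by Theorem 4.2.7, $g_i(X_i)$, $i = 1, \dots, n$, are also independent. Therefore,
\[ E\Bigl( \prod_{i=1}^{n} g_i(X_i) \Bigr) = \int_{\mathbb{R}^n} \Bigl( \prod_{i=1}^{n} g_i(x_i) \Bigr) f_{\underline{X}}(\underline{x})\, d\underline{x} \stackrel{\text{Th. 4.2.5}}{=} \int_{\mathbb{R}} \dots \int_{\mathbb{R}} \prod_{i=1}^{n} \bigl( g_i(x_i) f_{X_i}(x_i) \bigr)\, dx_1 \dots dx_n \]
\[ = \int_{\mathbb{R}} g_1(x_1) f_{X_1}(x_1)\, dx_1 \int_{\mathbb{R}} g_2(x_2) f_{X_2}(x_2)\, dx_2 \dots \int_{\mathbb{R}} g_n(x_n) f_{X_n}(x_n)\, dx_n = \prod_{i=1}^{n} \int_{\mathbb{R}} g_i(x_i) f_{X_i}(x_i)\, dx_i = \prod_{i=1}^{n} E(g_i(X_i)). \]

Corollary 4.5.4: If $X, Y$ are independent, then $Cov(X, Y) = 0$.

Theorem 4.5.5: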
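A small exact check of Theorem 4.5.3 and Corollary 4.5.4 for the discrete case may help make the product rule concrete. The pmf's and the functions `g1`, `g2` below are arbitrary values chosen for this sketch (they are not taken from the notes); exact rational arithmetic avoids any rounding issues.

```python
from fractions import Fraction
from itertools import product

# Assumed toy marginals (hypothetical values chosen for illustration only).
px = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
py = {-1: Fraction(1, 3), 1: Fraction(2, 3)}

# Independence: the joint pmf is the product of the marginals (Theorem 4.2.5 (i)).
joint = {(x, y): px[x] * py[y] for x, y in product(px, py)}


def expect(pmf, func):
    """E(func(V)) for a discrete rv V given as a dict {value: probability}."""
    return sum(prob * func(value) for value, prob in pmf.items())


g1 = lambda x: x ** 2 + 1   # any Borel-measurable function of X
g2 = lambda y: 3 * y - 2    # any Borel-measurable function of Y

# Theorem 4.5.3: E(g1(X) g2(Y)) = E(g1(X)) E(g2(Y)).
lhs = expect(joint, lambda xy: g1(xy[0]) * g2(xy[1]))
rhs = expect(px, g1) * expect(py, g2)

# Corollary 4.5.4: Cov(X, Y) = E(XY) - E(X) E(Y) = 0.
cov = expect(joint, lambda xy: xy[0] * xy[1]) - expect(px, lambda x: x) * expect(py, lambda y: y)

print(lhs == rhs, cov == 0)
```

Replacing `joint` by a pmf that does not factorize would, in general, make both comparisons fail, which is the content of the converse direction discussed next.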
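Two rv's $X, Y$ are independent iff for all pairs of Borel-measurable functions $g_1$ and $g_2$ it holds that
\[ E(g_1(X)\, g_2(Y)) = E(g_1(X)) \cdot E(g_2(Y)) \]
if all expectations exist.

Proof:
"$\Longrightarrow$": It follows from Theorem 4.5.3 and the independence of $X$ and $Y$ that $E(g_1(X) g_2(Y)) = E(g_1(X)) \cdot E(g_2(Y))$.
"$\Longleftarrow$": From Theorem 4.2.6, we know that $X$ and $Y$ are independent iff
\[ P(X \in A_1, Y \in A_2) = P(X \in A_1)\, P(Y \in A_2) \quad \forall \text{ Borel sets } A_1 \text{ and } A_2. \]
How do we relate Theorem 4.2.6 to $g_1$ and $g_2$? Let us define two Borel-measurable functions $g_1$ and $g_2$ as:
\[ g_1(x) = I_{A_1}(x) = \begin{cases} 1, & x \in A_1 \\ 0, & \text{otherwise} \end{cases} \qquad g_2(y) = I_{A_2}(y) = \begin{cases} 1, & y \in A_2 \\ 0, & \text{otherwise.} \end{cases} \]
Then,
\[ E(g_1(X)) = 0 \cdot P(X \in A_1^c) + 1 \cdot P(X \in A_1) = P(X \in A_1), \]
\[ E(g_2(Y)) = 0 \cdot P(Y \in A_2^c) + 1 \cdot P(Y \in A_2) = P(Y \in A_2), \]
and
\[ E(g_1(X) g_2(Y)) = P(X \in A_1, Y \in A_2). \]
\[ \implies P(X \in A_1, Y \in A_2) = E(g_1(X) g_2(Y)) \stackrel{\text{given}}{=} E(g_1(X)) \cdot E(g_2(Y)) = P(X \in A_1)\, P(Y \in A_2) \]
$\implies$ $X, Y$ independent by Theorem 4.2.6.

Definition 4.5.6: The $(i_1, i_2, \dots, i_n)$-th multi-way moment of $\underline{X} = (X_1, \dots, X_n)$ is defined as
\[ m_{i_1 i_2 \dots i_n} = E(X_1^{i_1} X_2^{i_2} \dots X_n^{i_n}) \]
if it exists. The $(i_1, i_2, \dots, i_n)$-th multi-way central moment of $\underline{X} = (X_1, \dots, X_n)$ is defined as
\[ \mu_{i_1 i_2 \dots i_n} = E\Bigl( \prod_{j=1}^{n} (X_j - E(X_j))^{i_j} \Bigr) \]
if it exists.

Note: If we set $i_r = i_s = 1$ and $i_j = 0\ \forall j \neq r, s$ in Definition 4.5.6 for the multi-way central moment, we get
\[ \mu_{0 \dots 0\, 1\, 0 \dots 0\, 1\, 0 \dots 0} = \mu_{rs} = Cov(X_r, X_s). \]

Theorem 4.5.7 (Cauchy-Schwarz Inequality): Let $X, Y$ be 2 rv's with finite variance. Then it holds:
(i) $Cov(X, Y)$ exists.
(ii) $(E(XY))^2 \le E(X^2) E(Y^2)$.
(iii) $(E(XY))^2 = E(X^2) E(Y^2)$ iff there exists an $(\alpha, \beta) \in \mathbb{R}^2 - \{(0, 0)\}$ such that $P(\alpha X + \beta Y = 0) = 1$.

Proof: Assumptions: $Var(X), Var(Y) < \infty$. Then also $E(X^2), E(X), E(Y^2), E(Y) < \infty$.
Result used in proof:
\[ 0 \le (a - b)^2 = a^2 - 2ab + b^2 \implies ab \le \frac{a^2 + b^2}{2}, \]
\[ 0 \le (a + b)^2 = a^2 + 2ab + b^2 \implies -ab \le \frac{a^2 + b^2}{2}, \]
\[ \implies |ab| \le \frac{a^2 + b^2}{2} \quad \forall a, b \in \mathbb{R}. \]
(i)
\[ E(|XY|) = \int_{\mathbb{R}^2} |xy|\, f_{X,Y}(x, y)\, dx\, dy \le \int_{\mathbb{R}^2} \frac{x^2 + y^2}{2}\, f_{X,Y}(x, y)\, dx\, dy = \int_{\mathbb{R}^2} \frac{x^2}{2} f_{X,Y}(x, y)\, dy\, dx + \int_{\mathbb{R}^2} \frac{y^2}{2} f_{X,Y}(x, y)\, dx\, dy \]
\[ = \int_{\mathbb{R}} \frac{x^2}{2} f_X(x)\, dx + \int_{\mathbb{R}} \frac{y^2}{2} f_Y(y)\, dy = \frac{E(X^2) + E(Y^2)}{2} < \infty \]
$\implies E(XY)$ exists $\implies Cov(X, Y) = E(XY) - E(X) E(Y)$ exists.
(ii)
\[ 0 \le E((\alpha X + \beta Y)^2) = \alpha^2 E(X^2) + 2 \alpha \beta E(XY) + \beta^2 E(Y^2) \quad \forall \alpha, \beta \in \mathbb{R} \qquad (A) \]
If $E(X^2) = 0$, then $X$ has a degenerate 1-point Dirac(0) distribution and the inequality trivially is true. Therefore, we can assume that $E(X^2) > 0$. As (A) is true for all $\alpha, \beta \in \mathbb{R}$, we can choose $\alpha = -\frac{E(XY)}{E(X^2)}$, $\beta = 1$.
\[ \implies \frac{(E(XY))^2}{E(X^2)} - 2 \frac{(E(XY))^2}{E(X^2)} + E(Y^2) \ge 0 \implies -(E(XY))^2 + E(Y^2) E(X^2) \ge 0 \implies (E(XY))^2 \le E(X^2) E(Y^2). \]
(iii) When are the left and right sides of the inequality in (ii) equal? Assume that $E(X^2) > 0$. $(E(XY))^2 = E(X^2) E(Y^2)$ holds iff $E((\alpha X + \beta Y)^2) = 0$. This can only happen if $P(\alpha X + \beta Y = 0) = 1$. Conversely, if $P(\alpha X + \beta Y = 0) = P(Y = -\frac{\alpha X}{\beta}) = 1$ for some $(\alpha, \beta) \in \mathbb{R}^2 - \{(0, 0)\}$, i.e., $Y$ is linearly dependent on $X$ with probability 1, this implies:
\[ (E(XY))^2 = \Bigl( E\Bigl( X \cdot \frac{-\alpha X}{\beta} \Bigr) \Bigr)^2 = \Bigl( \frac{\alpha}{\beta} \Bigr)^2 (E(X^2))^2 = E(X^2) \Bigl( \frac{\alpha}{\beta} \Bigr)^2 E(X^2) = E(X^2) E(Y^2). \]

4.6 Multivariate Generating Functions

Definition 4.6.1: Let $\underline{X} = (X_1, \dots, X_n)$ be an n-rv. We define the multivariate moment generating function (mmgf) of $\underline{X}$ as
\[ M_{\underline{X}}(\underline{t}) = E(e^{\underline{t}' \underline{X}}) = E\Bigl( \exp\Bigl( \sum_{i=1}^{n} t_i X_i \Bigr) \Bigr) \]
if this expectation exists for $|\underline{t}| = \sqrt{\sum_{i=1}^{n} t_i^2} < h$ for some $h > 0$.

Definition 4.6.2: Let $\underline{X} = (X_1, \dots, X_n)$ be an n-rv. We define the $n$-dimensional characteristic function $\Phi_{\underline{X}} : \mathbb{R}^n \to \mathbb{C}$ of $\underline{X}$ as
\[ \Phi_{\underline{X}}(\underline{t}) = E(e^{i \underline{t}' \underline{X}}) = E\Bigl( \exp\Bigl( i \sum_{j=1}^{n} t_j X_j \Bigr) \Bigr). \]

Note:
(i) $\Phi_{\underline{X}}(\underline{t})$ exists for any real-valued n-rv.
(ii) If $M_{\underline{X}}(\underline{t})$ exists, then $\Phi_{\underline{X}}(\underline{t}) = M_{\underline{X}}(i \underline{t})$.

Theorem 4.6.3:
(i) If $M_{\underline{X}}(\underline{t})$ exists, it is unique and uniquely determines the joint distribution of $\underline{X}$. $\Phi_{\underline{X}}(\underline{t})$ is also unique and uniquely determines the joint distribution of $\underline{X}$.
(ii) $M_{\underline{X}}(\underline{t})$ (if it exists) and $\Phi_{\underline{X}}(\underline{t})$ uniquely determine all marginal distributions of $\underline{X}$, i.e., $M_{X_i}(t_i) = M_{\underline{X}}(0, \dots, 0, t_i, 0, \dots, 0)$ and $\Phi_{X_i}(t_i) = \Phi_{\underline{X}}(0, \dots, 0, t_i, 0, \dots, 0)$.
(iii) Joint moments of all orders (if they exist) can be obtained as
\[ m_{i_1 \dots i_n} = \frac{\partial^{\,i_1 + i_2 + \dots + i_n}}{\partial t_1^{i_1} \partial t_2^{i_2} \dots \partial t_n^{i_n}} M_{\underline{X}}(\underline{t}) \Big|_{\underline{t} = \underline{0}} = E(X_1^{i_1} X_2^{i_2} \dots X_n^{i_n}) \]
if the mmgf exists, and
\[ m_{i_1 \dots i_n} = \frac{1}{i^{\,i_1 + i_2 + \dots + i_n}} \frac{\partial^{\,i_1 + i_2 + \dots + i_n}}{\partial t_1^{i_1} \partial t_2^{i_2} \dots \partial t_n^{i_n}} \Phi_{\underline{X}}(\underline{t}) \Big|_{\underline{t} = \underline{0}} = E(X_1^{i_1} X_2^{i_2} \dots X_n^{i_n}). \]
(iv) $X_1, \dots, X_n$ are independent rv's iff
\[ M_{\underline{X}}(t_1, \dots, t_n) = M_{\underline{X}}(t_1, 0, \dots, 0) \cdot M_{\underline{X}}(0, t_2, 0, \dots, 0) \cdot \ldots \cdot M_{\underline{X}}(0, \dots, 0, t_n) \quad \forall t_1, \dots, t_n \in \mathbb{R}, \]
given that $M_{\underline{X}}(\underline{t})$ exists. Similarly, $X_1, \dots, X_n$ are independent rv's iff
\[ \Phi_{\underline{X}}(t_1, \dots, t_n) = \Phi_{\underline{X}}(t_1, 0, \dots, 0) \cdot \Phi_{\underline{X}}(0, t_2, 0, \dots, 0) \cdot \ldots \cdot \Phi_{\underline{X}}(0, \dots, 0, t_n) \quad \forall t_1, \dots, t_n \in \mathbb{R}. \]

Proof: Rohatgi, page 162: Theorem 7, Corollary, Theorem 8, and Theorem 9 (for mmgf and the case $n = 2$).

Theorem 4.6.4: Let $X_1, \dots, X_n$ be independent rv's.
(i) If mgf's $M_{X_1}(t), \dots, M_{X_n}(t)$ exist, then the mgf of $Y = \sum_{i=1}^{n} a_i X_i$ is
\[ M_Y(t) = \prod_{i=1}^{n} M_{X_i}(a_i t) \]
on the common interval where all individual mgf's exist.
(ii) The characteristic function of $Y = \sum_{j=1}^{n} a_j X_j$ is
\[ \Phi_Y(t) = \prod_{j=1}^{n} \Phi_{X_j}(a_j t). \]
(iii) If mgf's $M_{X_1}(t), \dots, M_{X_n}(t)$ exist, then the mmgf of $\underline{X}$ is
\[ M_{\underline{X}}(\underline{t}) = \prod_{i=1}^{n} M_{X_i}(t_i) \]
on the common interval where all individual mgf's exist.
(iv) The $n$-dimensional characteristic function of $\underline{X}$ is
\[ \Phi_{\underline{X}}(\underline{t}) = \prod_{j=1}^{n} \Phi_{X_j}(t_j). \]

Proof: Homework (parts (ii) and (iv) only).

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 31: Wednesday 11/10/1999. Scribe: Jürgen Symanzik, Rich Madsen.

Theorem 4.6.5: Let $X_1, \dots, X_n$ be independent discrete rv's on the non-negative integers with pgf's $G_{X_1}(s), \dots, G_{X_n}(s)$. The pgf of $Y = \sum_{i=1}^{n} X_i$ is
\[ G_Y(s) = \prod_{i=1}^{n} G_{X_i}(s). \]

Proof:
Version 1:
\[ G_Y(s) = E(s^Y) = E\bigl( s^{\sum_{i=1}^{n} X_i} \bigr) = E\Bigl( \prod_{i=1}^{n} s^{X_i} \Bigr) \stackrel{\text{Th. 4.5.3}}{=} \prod_{i=1}^{n} E(s^{X_i}) = \prod_{i=1}^{n} G_{X_i}(s). \]
Version 2 (case $n = 2$ only):
\[ G_Y(s) = P(Y = 0) + P(Y = 1) s + P(Y = 2) s^2 + \dots \]
\[ = P(X_1 = 0, X_2 = 0) + \bigl( P(X_1 = 1, X_2 = 0) + P(X_1 = 0, X_2 = 1) \bigr) s + \bigl( P(X_1 = 2, X_2 = 0) + P(X_1 = 1, X_2 = 1) + P(X_1 = 0, X_2 = 2) \bigr) s^2 + \dots \]
\[ \stackrel{\text{indep.}}{=} P(X_1 = 0) P(X_2 = 0) + \bigl( P(X_1 = 1) P(X_2 = 0) + P(X_1 = 0) P(X_2 = 1) \bigr) s + \bigl( P(X_1 = 2) P(X_2 = 0) + P(X_1 = 1) P(X_2 = 1) + P(X_1 = 0) P(X_2 = 2) \bigr) s^2 + \dots \]
\[ = \bigl( P(X_1 = 0) + P(X_1 = 1) s + P(X_1 = 2) s^2 + \dots \bigr) \cdot \bigl( P(X_2 = 0) + P(X_2 = 1) s + P(X_2 = 2) s^2 + \dots \bigr) = G_{X_1}(s) \cdot G_{X_2}(s). \]
A generalized proof for $n \ge 3$ needs to be done by induction on $n$.

Theorem 4.6.6: Let $X_1, \dots, X_N$ be iid discrete rv's on the non-negative integers with common pgf $G_X(s)$. Let $N$ be a discrete rv on the non-negative integers with pgf $G_N(s)$. Let $N$ be independent of the $X_i$'s. Define $S_N = \sum_{i=1}^{N} X_i$. The pgf of $S_N$ is
\[ G_{S_N}(s) = G_N(G_X(s)). \]

Proof:
\[ P(S_N = k) = \sum_{n=0}^{\infty} P(S_N = k \mid N = n) \cdot P(N = n) \]
\[ \implies G_{S_N}(s) = \sum_{k=0}^{\infty} \sum_{n=0}^{\infty} P(S_N = k \mid N = n) \cdot P(N = n) \cdot s^k = \sum_{n=0}^{\infty} P(N = n) \sum_{k=0}^{\infty} P(S_N = k \mid N = n) \cdot s^k \]
\[ = \sum_{n=0}^{\infty} P(N = n) \sum_{k=0}^{\infty} P(S_n = k) \cdot s^k = \sum_{n=0}^{\infty} P(N = n) \sum_{k=0}^{\infty} P\Bigl( \sum_{i=1}^{n} X_i = k \Bigr) \cdot s^k \]
\[ \stackrel{\text{Th. 4.6.5}}{=} \sum_{n=0}^{\infty} P(N = n) \prod_{i=1}^{n} G_{X_i}(s) \stackrel{\text{iid}}{=} \sum_{n=0}^{\infty} P(N = n) \cdot (G_X(s))^n = G_N(G_X(s)). \]

Example 4.6.7: Starting with a single cell at time 0, after one time unit there is probability $p$ that the cell will have split (2 cells), probability $q$ that it will survive without splitting (1 cell), and probability $r$ that it will have died (0 cells). It holds that $p, q, r \ge 0$ and $p + q + r = 1$. Any surviving cells have the same probabilities of splitting or dying. What is the pgf for the number of cells at time 2?
\[ G_X(s) = G_N(s) = p s^2 + q s + r \]
\[ G_{S_N}(s) = p (p s^2 + q s + r)^2 + q (p s^2 + q s + r) + r \]

Theorem 4.6.8: Let $X_1, \dots, X_N$ be iid rv's with common mgf $M_X(t)$. Let $N$ be a discrete rv on the non-negative integers with mgf $M_N(t)$. Let $N$ be independent of the $X_i$'s. Define $S_N = \sum_{i=1}^{N} X_i$. The mgf of $S_N$ is
\[ M_{S_N}(t) = M_N(\ln M_X(t)). \]

Proof: Consider the case that the $X_i$'s are non-negative integers. We know that
\[ G_X(s) = E(s^X) = E(e^{\ln s^X}) = E(e^{(\ln s) \cdot X}) = M_X(\ln s) \implies M_X(t) = G_X(e^t) \]
\[ \implies M_{S_N}(t) = G_{S_N}(e^t) \stackrel{\text{Th. 4.6.6}}{=} G_N(G_X(e^t)) = G_N(M_X(t)) = M_N(\ln M_X(t)). \]
In the general case, i.e., if the $X_i$'s are not non-negative integers, we need results from Section 4.7 (conditional expectation) to prove this Theorem.

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 32: Friday 11/12/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir.

4.7 Conditional Expectation

In Section 4.1, we established that the conditional pmf of $X$ given $Y = y_j$ (for $P_Y(y_j) > 0$) is a pmf. For continuous rv's $X$ and $Y$, when $f_Y(y) > 0$, $f_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$, and $f_{X,Y}$ and $f_Y$ are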
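continuous, then $f_{X \mid Y}(x \mid y)$ is a pdf and it is the conditional pdf of $X$ given $Y = y$.

Definition 4.7.1: Let $X, Y$ be rv's on $(\Omega, L, P)$. Let $h$ be a Borel-measurable function. Assume that $E(h(X))$ exists. Then the conditional expectation of $h(X)$ given $Y$, i.e., $E(h(X) \mid Y)$, is a rv that takes the value $E(h(X) \mid y)$. It is defined as
\[ E(h(X) \mid y) = \begin{cases} \sum_{x \in \mathcal{X}} h(x)\, P(X = x \mid Y = y), & \text{if } (X, Y) \text{ is discrete and } P(Y = y) > 0 \\ \int_{-\infty}^{\infty} h(x)\, f_{X \mid Y}(x \mid y)\, dx, & \text{if } (X, Y) \text{ is continuous and } f_Y(y) > 0. \end{cases} \]

Note:
(i) The rv $E(h(X) \mid Y) = g(Y)$ is a function of $Y$ as a rv.
(ii) The usual properties of expectations apply to the conditional expectation:
  (a) $E(c \mid Y) = c\ \forall c \in \mathbb{R}$.
  (b) $E(aX + b \mid Y) = a E(X \mid Y) + b\ \forall a, b \in \mathbb{R}$.
  (c) If $g_1, g_2$ are Borel-measurable functions and if $E(g_1(X)), E(g_2(X))$ exist, then $E(a_1 g_1(X) + a_2 g_2(X) \mid Y) = a_1 E(g_1(X) \mid Y) + a_2 E(g_2(X) \mid Y)\ \forall a_1, a_2 \in \mathbb{R}$.
  (d) If $X \ge 0$, then $E(X \mid Y) \ge 0$.
  (e) If $X_1 \ge X_2$, then $E(X_1 \mid Y) \ge E(X_2 \mid Y)$.
(iii) Moments are defined in the usual way. If $E(|X|^r) < \infty$, then $E(X^r \mid Y)$ exists and is the $r$-th conditional moment of $X$ given $Y$.

Example 4.7.2: Recall Example 4.1.12:
\[ f_{X,Y}(x, y) = \begin{cases} 2, & 0 < x < y < 1 \\ 0, & \text{otherwise.} \end{cases} \]
The conditional pdf's $f_{Y \mid X}(y \mid x)$ and $f_{X \mid Y}(x \mid y)$ have been calculated as:
\[ f_{Y \mid X}(y \mid x) = \frac{1}{1 - x} \quad \text{for } x < y < 1 \ (\text{where } 0 < x < 1) \]
and
\[ f_{X \mid Y}(x \mid y) = \frac{1}{y} \quad \text{for } 0 < x < y \ (\text{where } 0 < y < 1). \]
So,
\[ E(X \mid y) = \int_{0}^{y} \frac{x}{y}\, dx = \frac{y}{2} \]
and
\[ E(Y \mid x) = \int_{x}^{1} \frac{y}{1 - x}\, dy = \frac{1}{1 - x} \cdot \frac{y^2}{2} \Big|_{x}^{1} = \frac{1}{2} \cdot \frac{1 - x^2}{1 - x} = \frac{1 + x}{2}. \]
Therefore, we get the rv's $E(X \mid Y) = \frac{Y}{2}$ and $E(Y \mid X) = \frac{1 + X}{2}$.

Theorem 4.7.3: If $E(h(X))$ exists, then
\[ E_Y(E_{X \mid Y}(h(X) \mid Y)) = E(h(X)). \]

Proof: Continuous case only:
\[ E_Y(E(h(X) \mid Y)) = \int_{-\infty}^{\infty} E(h(X) \mid y)\, f_Y(y)\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x)\, f_{X \mid Y}(x \mid y)\, f_Y(y)\, dx\, dy \]
\[ = \int_{-\infty}^{\infty} h(x) \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy\, dx = \int_{-\infty}^{\infty} h(x)\, f_X(x)\, dx = E(h(X)). \]

Theorem 4.7.4: If $E(X^2)$ exists, then
\[ Var_Y(E(X \mid Y)) + E_Y(Var(X \mid Y)) = Var(X). \]

Proof:
\[ Var_Y(E(X \mid Y)) + E_Y(Var(X \mid Y)) = E_Y((E(X \mid Y))^2) - (E_Y(E(X \mid Y)))^2 + E_Y\bigl( E(X^2 \mid Y) - (E(X \mid Y))^2 \bigr) \]
\[ = E_Y((E(X \mid Y))^2) - (E(X))^2 + E(X^2) - E_Y((E(X \mid Y))^2) = E(X^2) - (E(X))^2 = Var(X). \]

Note: If $E(X^2)$ exists, then $Var(X) \ge Var_Y(E(X \mid Y))$. $Var(X) = Var_Y(E(X \mid Y))$ iff $X = g(Y)$. The inequality directly follows from Theorem 4.7.4. For equality, it is necessary that
\[ E_Y(Var(X \mid Y)) = E_Y\bigl( E((X - E(X \mid Y))^2 \mid Y) \bigr) = 0, \]
which holds iff $X = E(X \mid Y)$ with probability 1.
If $X, Y$ are independent, $F_{X \mid Y}(x \mid y) = F_X(x)\ \forall x$. Thus, if $E(h(X))$ exists, then $E(h(X) \mid Y) = E(h(X))$.

4.8 Inequalities and Identities

Lemma 4.8.1: Let $a, b$ be positive numbers and $p, q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$ (i.e., $pq = p + q$ and $q = \frac{p}{p - 1}$). Then it holds that
\[ \frac{1}{p} a^p + \frac{1}{q} b^q \ge ab \]
with equality iff $a^p = b^q$.

Proof: Fix $b$. Let
\[ g(a) = \frac{1}{p} a^p + \frac{1}{q} b^q - ab \]
\[ \implies g'(a) = a^{p - 1} - b = 0 \implies b = a^{p - 1}, \qquad g''(a) = (p - 1) a^{p - 2} > 0. \]
Since $g''(a) > 0$, this is really a minimum. The minimum value is obtained for $b = a^{p - 1}$ and it is
\[ \frac{1}{p} a^p + \frac{1}{q} (a^{p - 1})^q - a\, a^{p - 1} = \frac{1}{p} a^p + \frac{1}{q} a^p - a^p = a^p \Bigl( \frac{1}{p} + \frac{1}{q} - 1 \Bigr) = 0, \]
where $(p - 1) q = p$ has been used. Since $g''(a) > 0$, the minimum is unique and $g(a) \ge 0$. Therefore,
\[ g(a) + ab = \frac{1}{p} a^p + \frac{1}{q} b^q \ge ab. \]

Theorem 4.8.2 (Hölder's Inequality): Let $X, Y$ be 2 rv's. Let $p, q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$ (i.e., $pq = p + q$ and $q = \frac{p}{p - 1}$). Then it holds that
\[ E(|XY|) \le (E(|X|^p))^{1/p}\, (E(|Y|^q))^{1/q}. \]

Proof: In Lemma 4.8.1, let
\[ a = \frac{|X|}{(E(|X|^p))^{1/p}} \quad \text{and} \quad b = \frac{|Y|}{(E(|Y|^q))^{1/q}}. \]
\[ \stackrel{\text{Lemma 4.8.1}}{\implies} \frac{1}{p} \frac{|X|^p}{E(|X|^p)} + \frac{1}{q} \frac{|Y|^q}{E(|Y|^q)} \ge \frac{|XY|}{(E(|X|^p))^{1/p} (E(|Y|^q))^{1/q}} \]
Taking expectations on both sides of this inequality, we get
\[ 1 = \frac{1}{p} + \frac{1}{q} \ge \frac{E(|XY|)}{(E(|X|^p))^{1/p} (E(|Y|^q))^{1/q}}. \]
The result follows immediately when multiplying both sides with $(E(|X|^p))^{1/p} (E(|Y|^q))^{1/q}$.

Note: Note that Theorem 4.5.7 (ii) (Cauchy-Schwarz Inequality) is a special case of Theorem 4.8.2 with $p = q = 2$.

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 33: Monday 11/15/1999. Scribe: Jürgen Symanzik, Bill Morphet.

Theorem 4.8.3 (Minkowski's Inequality): Let $X, Y$ be 2 rv's. Then it holds for $1 \le p < \infty$ that
\[ (E(|X + Y|^p))^{1/p} \le (E(|X|^p))^{1/p} + (E(|Y|^p))^{1/p}. \]

Proof:
\[ E(|X + Y|^p) = E(|X + Y| \cdot |X + Y|^{p - 1}) \le E((|X| + |Y|) \cdot |X + Y|^{p - 1}) = E(|X| \cdot |X + Y|^{p - 1}) + E(|Y| \cdot |X + Y|^{p - 1}) \]
\[ \stackrel{(A)}{\le} (E(|X|^p))^{1/p} \bigl( E(|X + Y|^{(p - 1) q}) \bigr)^{1/q} + (E(|Y|^p))^{1/p} \bigl( E(|X + Y|^{(p - 1) q}) \bigr)^{1/q} \]
\[ = \bigl( (E(|X|^p))^{1/p} + (E(|Y|^p))^{1/p} \bigr) \cdot (E(|X + Y|^p))^{1/q}. \]
Divide the left and right side of this inequality by $(E(|X + Y|^p))^{1/q}$. The result on the left side is $(E(|X + Y|^p))^{1 - 1/q} = (E(|X + Y|^p))^{1/p}$, and the result on the right side is $(E(|X|^p))^{1/p} + (E(|Y|^p))^{1/p}$. Therefore, Theorem 4.8.3 holds.
In (A), we define $q = \frac{p}{p - 1}$. Therefore, $\frac{1}{p} + \frac{1}{q} = 1$ and $(p - 1) q = p$. (A) holds since $\frac{1}{p} + \frac{1}{q} = 1$, a condition to meet Hölder's Inequality.

Definition 4.8.4: A function $g(x)$ is convex if
\[ g(\lambda x + (1 - \lambda) y) \le \lambda g(x) + (1 - \lambda) g(y) \quad \forall x, y \in \mathbb{R}\ \ \forall\, 0 < \lambda < 1. \]

Note:
(i) Geometrically, a convex function falls above all of its tangent lines. Also, a connecting line between any pair of points $(x, g(x))$ and $(y, g(y))$ in the 2-dimensional plane always falls above the curve.
(ii) A function $g(x)$ is concave iff $-g(x)$ is convex.

Theorem 4.8.5 (Jensen's Inequality): Let $X$ be a rv. If $g(x)$ is a convex function, then
\[ E(g(X)) \ge g(E(X)), \]
given that both expectations exist.

Proof: Construct a tangent line $l(x)$ to $g(x)$ at the (constant) point $x_0 = E(X)$:
\[ l(x) = ax + b \quad \text{for some } a, b \in \mathbb{R}. \]
The function $g(x)$ is a convex function and falls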
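above the tangent line $l(x)$:
\[ \implies g(x) \ge ax + b \quad \forall x \in \mathbb{R} \]
\[ \implies E(g(X)) \ge E(aX + b) = a E(X) + b = l(E(X)) \stackrel{\text{tangent point}}{=} g(E(X)). \]
Therefore, Theorem 4.8.5 holds.

Note: Typical convex functions $g$ are:
(i) $g_1(x) = |x| \implies E(|X|) \ge |E(X)|$.
(ii) $g_2(x) = x^2 \implies E(X^2) \ge (E(X))^2 \implies Var(X) \ge 0$.
(iii) $g_3(x) = \frac{1}{x^p}$ for $x > 0$, $p > 0$ $\implies E\bigl( \frac{1}{X^p} \bigr) \ge \frac{1}{(E(X))^p}$; for $p = 1$: $E\bigl( \frac{1}{X} \bigr) \ge \frac{1}{E(X)}$.
(iv) Other convex functions are $x^p$ for $x > 0$, $p \ge 1$; $\theta^x$ for $\theta > 1$; $-\ln(x)$ for $x > 0$; etc.
(v) Recall that if $g$ is convex and twice differentiable, then $g''(x) \ge 0\ \forall x$.
(vi) If the function $g$ is concave, the direction of the inequality in Jensen's Inequality is reversed, i.e., $E(g(X)) \le g(E(X))$.

Example 4.8.6: Given the real numbers $a_1, a_2, \dots, a_n > 0$, we define
\[ \text{arithmetic mean:} \quad a_A = \frac{1}{n}(a_1 + a_2 + \dots + a_n) = \frac{1}{n} \sum_{i=1}^{n} a_i \]
\[ \text{geometric mean:} \quad a_G = (a_1 \cdot a_2 \cdot \ldots \cdot a_n)^{1/n} = \Bigl( \prod_{i=1}^{n} a_i \Bigr)^{1/n} \]
\[ \text{harmonic mean:} \quad a_H = \frac{1}{\frac{1}{n} \bigl( \frac{1}{a_1} + \frac{1}{a_2} + \dots + \frac{1}{a_n} \bigr)} = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{a_i}} \]
Let $X$ be a rv that takes values $a_1, a_2, \dots, a_n > 0$ with probability $\frac{1}{n}$ each.
(i) $a_A \ge a_G$:
\[ \ln(a_A) = \ln\Bigl( \frac{1}{n} \sum_{i=1}^{n} a_i \Bigr) = \ln(E(X)) \stackrel{\ln \text{ concave}}{\ge} E(\ln(X)) = \sum_{i=1}^{n} \frac{1}{n} \ln a_i = \frac{1}{n} \ln\Bigl( \prod_{i=1}^{n} a_i \Bigr) = \ln\Bigl( \bigl( \prod_{i=1}^{n} a_i \bigr)^{1/n} \Bigr) = \ln(a_G). \]
Taking the anti-log of both sides gives $a_A \ge a_G$.
(ii) $a_A \ge a_H$:
\[ \frac{1}{a_A} = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} a_i} = \frac{1}{E(X)} \stackrel{1/x \text{ convex}}{\le} E\Bigl( \frac{1}{X} \Bigr) = \sum_{i=1}^{n} \frac{1}{n} \frac{1}{a_i} = \frac{1}{n} \Bigl( \frac{1}{a_1} + \frac{1}{a_2} + \dots + \frac{1}{a_n} \Bigr) = \frac{1}{a_H}. \]
Inverting both sides gives $a_A \ge a_H$.
(iii) $a_G \ge a_H$:
\[ -\ln(a_H) = \ln(a_H^{-1}) = \ln\Bigl( \frac{1}{n} \Bigl( \frac{1}{a_1} + \dots + \frac{1}{a_n} \Bigr) \Bigr) = \ln\Bigl( E\Bigl( \frac{1}{X} \Bigr) \Bigr) \stackrel{\ln \text{ concave}}{\ge} E\Bigl( \ln \frac{1}{X} \Bigr) = \sum_{i=1}^{n} \frac{1}{n} \ln \frac{1}{a_i} = -\frac{1}{n} \ln\Bigl( \prod_{i=1}^{n} a_i \Bigr) = -\ln\Bigl( \prod_{i=1}^{n} a_i \Bigr)^{1/n} = -\ln(a_G). \]
Multiplying both sides with $-1$ gives $\ln a_H \le \ln a_G$. Then taking the anti-log of both sides gives $a_H \le a_G$.
In summary, $a_H \le a_G \le a_A$. Note that it would have been sufficient to prove steps (i) and (iii) only to establish this result. However, step (ii) has been included to provide another example of how to apply Theorem 4.8.5.

Theorem 4.8.7 (Covariance Inequality): Let $X$ be a rv with finite mean $\mu$.
(i) If $g(x)$ is non-decreasing, then
\[ E(g(X)(X - \mu)) \ge 0 \]
if this expectation exists.
(ii) If $g(x)$ is non-decreasing and $h(x)$ is non-increasing, then
\[ E(g(X) h(X)) \le E(g(X))\, E(h(X)) \]
if all expectations exist.
(iii) If $g(x)$ and $h(x)$ are both non-decreasing or if $g(x)$ and $h(x)$ are both non-increasing, then
\[ E(g(X) h(X)) \ge E(g(X))\, E(h(X)) \]
if all expectations exist.

Proof: Homework.

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 34: Wednesday 11/17/1999. Scribe: Jürgen Symanzik, Rich Madsen.

5 Particular Distributions

5.1 Multivariate Normal Distributions

Definition 5.1.1: A rv $X$ has a (univariate) Normal distribution, i.e., $X \sim N(\mu, \sigma^2)$ with $\mu \in \mathbb{R}$ and $\sigma > 0$, iff it has the pdf
\[ f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x - \mu)^2}. \]
$X$ has a standard Normal distribution iff $\mu = 0$ and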
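$\sigma^2 = 1$, i.e., $X \sim N(0, 1)$.

Note: If $X \sim N(\mu, \sigma^2)$, then $E(X) = \mu$ and $Var(X) = \sigma^2$. If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent and $c_1, c_2 \in \mathbb{R}$, then
\[ Y = c_1 X_1 + c_2 X_2 \sim N(c_1 \mu_1 + c_2 \mu_2,\ c_1^2 \sigma_1^2 + c_2^2 \sigma_2^2). \]

Definition 5.1.2: A 2-rv $(X, Y)$ has a bivariate Normal distribution iff there exist constants $a_{11}, a_{12}, a_{21}, a_{22}, \mu_1, \mu_2 \in \mathbb{R}$ and iid $N(0, 1)$ rv's $Z_1$ and $Z_2$ such that
\[ X = \mu_1 + a_{11} Z_1 + a_{12} Z_2, \qquad Y = \mu_2 + a_{21} Z_1 + a_{22} Z_2. \]
If we define
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad \underline{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \underline{X} = \begin{pmatrix} X \\ Y \end{pmatrix}, \quad \underline{Z} = \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}, \]
then we can write
\[ \underline{X} = A \underline{Z} + \underline{\mu}. \]

Note:
\[ E(X) = \mu_1 + a_{11} E(Z_1) + a_{12} E(Z_2) = \mu_1 \quad \text{and} \quad E(Y) = \mu_2 + a_{21} E(Z_1) + a_{22} E(Z_2) = \mu_2. \]
The marginal distributions are $X \sim N(\mu_1, a_{11}^2 + a_{12}^2)$ and $Y \sim N(\mu_2, a_{21}^2 + a_{22}^2)$. Thus, $X$ and $Y$ have (univariate) Normal marginal densities or degenerate marginal densities (which correspond to Dirac distributions) if $a_{i1} = a_{i2} = 0$.

Theorem 5.1.3: Define $g : \mathbb{R}^2 \to \mathbb{R}^2$ as $g(\underline{x}) = C \underline{x} + \underline{d}$ with $C \in \mathbb{R}^{2 \times 2}$ a $2 \times 2$ matrix and $\underline{d} \in \mathbb{R}^2$ a 2-dimensional vector. If $\underline{X}$ is a bivariate Normal rv, then $g(\underline{X})$ also is a bivariate Normal rv.

Proof:
\[ g(\underline{X}) = C \underline{X} + \underline{d} = C(A \underline{Z} + \underline{\mu}) + \underline{d} = \underbrace{(CA)}_{\text{another matrix}} \underline{Z} + \underbrace{(C \underline{\mu} + \underline{d})}_{\text{another vector}}, \]
which represents another bivariate Normal distribution.

Note:
\[ \rho \sigma_1 \sigma_2 = Cov(X, Y) = Cov(a_{11} Z_1 + a_{12} Z_2,\ a_{21} Z_1 + a_{22} Z_2) \]
\[ = a_{11} a_{21} Cov(Z_1, Z_1) + (a_{11} a_{22} + a_{12} a_{21}) Cov(Z_1, Z_2) + a_{12} a_{22} Cov(Z_2, Z_2) = a_{11} a_{21} + a_{12} a_{22}, \]
since $Z_1, Z_2$ are iid $N(0, 1)$ rv's.

Definition 5.1.4: The variance-covariance matrix of $(X, Y)$ is
\[ \Sigma = A A' = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix} = \begin{pmatrix} a_{11}^2 + a_{12}^2 & a_{11} a_{21} + a_{12} a_{22} \\ a_{11} a_{21} + a_{12} a_{22} & a_{21}^2 + a_{22}^2 \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}. \]

Theorem 5.1.5: Assume that $\sigma_1 > 0$, $\sigma_2 > 0$, and $|\rho| < 1$. Then the joint pdf of $\underline{X} = (X, Y)' = A \underline{Z} + \underline{\mu}$ (as defined in Definition 5.1.2) is
\[ f_{\underline{X}}(\underline{x}) = \frac{1}{2\pi \sqrt{|\Sigma|}} \exp\Bigl( -\frac{1}{2} (\underline{x} - \underline{\mu})' \Sigma^{-1} (\underline{x} - \underline{\mu}) \Bigr) \]
\[ = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\Bigl( -\frac{1}{2(1 - \rho^2)} \Bigl[ \Bigl( \frac{x - \mu_1}{\sigma_1} \Bigr)^2 - 2\rho \Bigl( \frac{x - \mu_1}{\sigma_1} \Bigr) \Bigl( \frac{y - \mu_2}{\sigma_2} \Bigr) + \Bigl( \frac{y - \mu_2}{\sigma_2} \Bigr)^2 \Bigr] \Bigr). \]

Proof: The mapping $\underline{Z} \to \underline{X}$ is 1-to-1:
\[ \underline{X} = A \underline{Z} + \underline{\mu} \implies \underline{Z} = A^{-1}(\underline{X} - \underline{\mu}), \]
where the requirement is that $A$ is invertible. The Jacobian is
\[ J = |A^{-1}| = \frac{1}{|A|}, \qquad |A| = \sqrt{|A|^2} = \sqrt{|A| \cdot |A'|} = \sqrt{|A A'|} = \sqrt{|\Sigma|} = \sqrt{\sigma_1^2 \sigma_2^2 - \rho^2 \sigma_1^2 \sigma_2^2} = \sigma_1 \sigma_2 \sqrt{1 - \rho^2}. \]
We can use this result to get to the second line of the theorem. The joint pdf of $\underline{Z}$ is
\[ f_{\underline{Z}}(\underline{z}) = \frac{1}{\sqrt{2\pi}} e^{-z_1^2 / 2} \cdot \frac{1}{\sqrt{2\pi}} e^{-z_2^2 / 2} = \frac{1}{2\pi} e^{-\frac{1}{2} \underline{z}' \underline{z}}. \]
As already stated, the mapping from $\underline{Z}$ to $\underline{X}$ is 1-to-1, so we can apply Theorem 4.3.5:
\[ f_{\underline{X}}(\underline{x}) = \frac{1}{2\pi \sqrt{|\Sigma|}} \exp\Bigl( -\frac{1}{2} (\underline{x} - \underline{\mu})' (A^{-1})' A^{-1} (\underline{x} - \underline{\mu}) \Bigr) \stackrel{(*)}{=} \frac{1}{2\pi \sqrt{|\Sigma|}} \exp\Bigl( -\frac{1}{2} (\underline{x} - \underline{\mu})' \Sigma^{-1} (\underline{x} - \underline{\mu}) \Bigr). \]
This proves the 1st line of the Theorem. Step ($*$) holds since
\[ (A^{-1})' A^{-1} = (A')^{-1} A^{-1} = (A A')^{-1} = \Sigma^{-1}. \]
The second line of the Theorem is based on the following transformations:
\[ \sqrt{|\Sigma|} = \sigma_1 \sigma_2 \sqrt{1 - \rho^2}, \qquad \Sigma^{-1} = \frac{1}{\sigma_1^2 \sigma_2^2 (1 - \rho^2)} \begin{pmatrix} \sigma_2^2 & -\rho \sigma_1 \sigma_2 \\ -\rho \sigma_1 \sigma_2 & \sigma_1^2 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sigma_1^2 (1 - \rho^2)} & \frac{-\rho}{\sigma_1 \sigma_2 (1 - \rho^2)} \\ \frac{-\rho}{\sigma_1 \sigma_2 (1 - \rho^2)} & \frac{1}{\sigma_2^2 (1 - \rho^2)} \end{pmatrix}. \]

Note: In the situation of Theorem 5.1.5, we say that $(X, Y) \sim N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$.

Theorem 5.1.6: If $(X, Y)$ has a non-degenerate $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ distribution, then the conditional distribution of $X$ given $Y = y$ is
\[ N\Bigl( \mu_1 + \rho \frac{\sigma_1}{\sigma_2} (y - \mu_2),\ \sigma_1^2 (1 - \rho^2) \Bigr). \]

Proof: Homework.

Example 5.1.7: Let rv's $(X_1, Y_1)$ be $N(0, 0, 1, 1, 0)$ with pdf $f_1(x, y)$ and $(X_2, Y_2)$ be $N(0, 0, 1, 1, \rho)$ with pdf $f_2(x, y)$. Let $(X, Y)$ be the rv that corresponds to the pdf
\[ f_{X,Y}(x, y) = \frac{1}{2} f_1(x, y) + \frac{1}{2} f_2(x, y). \]
$(X, Y)$ is a bivariate Normal rv iff $\rho = 0$. However, the marginal distributions of $X$ and $Y$ are always $N(0, 1)$ distributions. See also Rohatgi, page 229, Remark 2.

Stat 6710 Mathematical Statistics I, Fall Semester 1999. Lecture 35: Friday 11/19/1999. Scribe: Jürgen Symanzik, Bill Morphet.

Theorem 5.1.8: The mgf $M_{\underline{X}}(\underline{t})$ of a bivariate Normal rv $\underline{X} = (X, Y)$ is
\[ M_{\underline{X}}(\underline{t}) = M_{X,Y}(t_1, t_2) = \exp\Bigl( \underline{\mu}' \underline{t} + \frac{1}{2} \underline{t}' \Sigma \underline{t} \Bigr) = \exp\Bigl( \mu_1 t_1 + \mu_2 t_2 + \frac{1}{2} (\sigma_1^2 t_1^2 + \sigma_2^2 t_2^2 + 2 \rho \sigma_1 \sigma_2 t_1 t_2) \Bigr). \]

Proof: The mgf of a univariate Normal rv $X \sim N(\mu, \sigma^2)$ will be used to develop the mgf of a bivariate Normal rv $\underline{X} = (X, Y)$:
\[ M_X(t) = E(\exp(tX)) = \int_{-\infty}^{\infty} \exp(tx) \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl( -\frac{1}{2\sigma^2} (x - \mu)^2 \Bigr) dx \]
\[ = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl( -\frac{1}{2\sigma^2} \bigl[ -2\sigma^2 t x + (x - \mu)^2 \bigr] \Bigr) dx \]
\[ = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl( -\frac{1}{2\sigma^2} \bigl[ x^2 - 2(\mu + t\sigma^2) x + (\mu + t\sigma^2)^2 - (\mu + t\sigma^2)^2 + \mu^2 \bigr] \Bigr) dx \]
\[ = \exp\Bigl( -\frac{1}{2\sigma^2} \bigl[ -(\mu + t\sigma^2)^2 + \mu^2 \bigr] \Bigr) \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl( -\frac{1}{2\sigma^2} \bigl[ x - (\mu + t\sigma^2) \bigr]^2 \Bigr) dx}_{\text{pdf of } N(\mu + t\sigma^2,\ \sigma^2) \text{ that integrates to 1}} \]
\[ = \exp\Bigl( -\frac{1}{2\sigma^2} \bigl[ -\mu^2 - 2\mu t \sigma^2 - t^2 \sigma^4 + \mu^2 \bigr] \Bigr) = \exp\Bigl( \frac{2\mu t \sigma^2 + t^2 \sigma^4}{2\sigma^2} \Bigr) = \exp\Bigl( \mu t + \frac{1}{2} \sigma^2 t^2 \Bigr). \]
Bivariate Normal mgf, with $\beta_X = \mu_2 + \rho \frac{\sigma_2}{\sigma_1} (x - \mu_1)$:
\[ M_{X,Y}(t_1, t_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp(t_1 x + t_2 y)\, f_{X,Y}(x, y)\, dx\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp(t_1 x) \exp(t_2 y)\, f_X(x)\, f_{Y \mid X}(y \mid x)\, dy\, dx \]
\[ = \int_{-\infty}^{\infty} \Bigl( \int_{-\infty}^{\infty} \exp(t_2 y)\, f_{Y \mid X}(y \mid x)\, dy \Bigr) \exp(t_1 x)\, f_X(x)\, dx \]
\[ \stackrel{(A)}{=} \int_{-\infty}^{\infty} \Bigl( \int_{-\infty}^{\infty} \frac{\exp(t_2 y)}{\sigma_2 \sqrt{2\pi (1 - \rho^2)}} \exp\Bigl( -\frac{(y - \beta_X)^2}{2 \sigma_2^2 (1 - \rho^2)} \Bigr) dy \Bigr) \exp(t_1 x)\, f_X(x)\, dx \]
\[ \stackrel{(B)}{=} \int_{-\infty}^{\infty} \exp\Bigl( \beta_X t_2 + \frac{1}{2} \sigma_2^2 (1 - \rho^2) t_2^2 \Bigr) \exp(t_1 x)\, f_X(x)\, dx \]
\[ = \int_{-\infty}^{\infty} \exp\Bigl( \mu_2 t_2 + \rho \frac{\sigma_2}{\sigma_1} t_2 x - \rho \frac{\sigma_2}{\sigma_1} \mu_1 t_2 + \frac{1}{2} \sigma_2^2 (1 - \rho^2) t_2^2 + t_1 x \Bigr) f_X(x)\, dx \]
\[ = \exp\Bigl( \frac{1}{2} \sigma_2^2 (1 - \rho^2) t_2^2 + t_2 \mu_2 - \rho \frac{\sigma_2}{\sigma_1} \mu_1 t_2 \Bigr) \int_{-\infty}^{\infty} \exp\Bigl( \bigl( t_1 + \rho \frac{\sigma_2}{\sigma_1} t_2 \bigr) x \Bigr) f_X(x)\, dx \]
\[ \stackrel{(C)}{=} \exp\Bigl( \frac{1}{2} \sigma_2^2 (1 - \rho^2) t_2^2 + t_2 \mu_2 - \rho \frac{\sigma_2}{\sigma_1} \mu_1 t_2 \Bigr) \cdot \exp\Bigl( \mu_1 \bigl( t_1 + \rho \frac{\sigma_2}{\sigma_1} t_2 \bigr) + \frac{1}{2} \sigma_1^2 \bigl( t_1 + \rho \frac{\sigma_2}{\sigma_1} t_2 \bigr)^2 \Bigr) \]
\[ = \exp\Bigl( \frac{1}{2} \sigma_2^2 t_2^2 - \frac{1}{2} \rho^2 \sigma_2^2 t_2^2 + \mu_2 t_2 - \rho \frac{\sigma_2}{\sigma_1} \mu_1 t_2 + \mu_1 t_1 + \mu_1 \rho \frac{\sigma_2}{\sigma_1} t_2 + \frac{1}{2} \sigma_1^2 t_1^2 + \rho \sigma_1 \sigma_2 t_1 t_2 + \frac{1}{2} \rho^2 \sigma_2^2 t_2^2 \Bigr) \]
\[ = \exp\Bigl( \mu_1 t_1 + \mu_2 t_2 + \frac{\sigma_1^2 t_1^2 + \sigma_2^2 t_2^2 + 2 \rho \sigma_1 \sigma_2 t_1 t_2}{2} \Bigr). \]
(A) follows from Theorem 5.1.6 since $Y \mid X \sim N(\beta_X, \sigma_2^2 (1 - \rho^2))$. (B) follows when we apply our calculations of the mgf of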
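a $N(\mu, \sigma^2)$ distribution to a $N(\beta_X, \sigma_2^2 (1 - \rho^2))$ distribution. (C) holds since the integral represents $M_X(t_1 + \rho \frac{\sigma_2}{\sigma_1} t_2)$.

Corollary 5.1.9: Let $(X, Y)$ be a bivariate Normal rv. $X$ and $Y$ are independent iff $\rho = 0$.

Definition 5.1.10: Let $\underline{Z}$ be a k-rv of $k$ iid $N(0, 1)$ rv's. Let $A \in \mathbb{R}^{k \times k}$ be a $k \times k$ matrix, and let $\underline{\mu} \in \mathbb{R}^k$ be a $k$-dimensional vector. Then $\underline{X} = A \underline{Z} + \underline{\mu}$ has a multivariate Normal distribution with mean vector $\underline{\mu}$ and variance-covariance matrix $\Sigma = A A'$.

Note:
(i) If $\Sigma$ is non-singular, $\underline{X}$ has the joint pdf
\[ f_{\underline{X}}(\underline{x}) = \frac{1}{(2\pi)^{k/2}\, |\Sigma|^{1/2}} \exp\Bigl( -\frac{1}{2} (\underline{x} - \underline{\mu})' \Sigma^{-1} (\underline{x} - \underline{\mu}) \Bigr). \]
(ii) If $\Sigma$ is singular, then $\underline{X} - \underline{\mu}$ takes values in a linear subspace of $\mathbb{R}^k$ with probability 1.
(iii) If $\Sigma$ is non-singular, then $\underline{X}$ has mgf
\[ M_{\underline{X}}(\underline{t}) = \exp\Bigl( \underline{\mu}' \underline{t} + \frac{1}{2} \underline{t}' \Sigma \underline{t} \Bigr). \]

Theorem 5.1.11: The components $X_1, \dots, X_k$ of a normally distributed k-rv $\underline{X}$ are independent iff $Cov(X_i, X_j) = 0\ \forall i, j = 1, \dots, k$, $i \neq j$.

Theorem 5.1.12: Let $\underline{X} = (X_1, \dots, X_k)'$. $\underline{X}$ has a $k$-dimensional Normal distribution iff every linear function of $\underline{X}$, i.e., $\underline{X}' \underline{t} = t_1 X_1 + t_2 X_2 + \dots + t_k X_k$, has a univariate Normal distribution.

Proof: The Note following Definition 5.1.1 states that any linear function of two Normal rv's has a univariate Normal distribution. By induction on $k$, we can show that every linear function of $\underline{X}$, i.e., $\underline{X}' \underline{t}$, has a univariate Normal distribution.
Conversely, if $\underline{X}' \underline{t}$ has a univariate Normal distribution, we know from Theorem 5.1.8 that
\[ M_{\underline{X}' \underline{t}}(s) = \exp\Bigl( E(\underline{X}' \underline{t}) \cdot s + \frac{1}{2} Var(\underline{X}' \underline{t}) \cdot s^2 \Bigr) = \exp\Bigl( \underline{\mu}' \underline{t}\, s + \frac{1}{2} \underline{t}' \Sigma \underline{t}\, s^2 \Bigr) \]
\[ \implies M_{\underline{X}' \underline{t}}(1) = \exp\Bigl( \underline{\mu}' \underline{t} + \frac{1}{2} \underline{t}' \Sigma \underline{t} \Bigr) = M_{\underline{X}}(\underline{t}). \]
By uniqueness of the mgf and Note (iii) that follows Definition 5.1.10, $\underline{X}$ has a multivariate Normal distribution.

5.2 Exponential Family of Distributions

Definition 5.2.1: Let $\vartheta$ be an interval on the real line. Let $\{ f(\cdot\,; \theta) : \theta \in \vartheta \}$ be a family of pdf's (or pmf's). We assume that the set $\{ \underline{x} : f(\underline{x}; \theta) > 0 \}$ is independent of $\theta$, where $\underline{x} = (x_1, \dots, x_n)$. We say that the family $\{ f(\cdot\,; \theta) : \theta \in \vartheta \}$ is a one-parameter exponential family if there exist real-valued functions $Q(\theta)$ and $D(\theta)$ on $\vartheta$ and Borel-measurable functions $T(\underline{x})$ and $S(\underline{x})$ on $\mathbb{R}^n$ such that
\[ f(\underline{x}; \theta) = \exp\bigl( Q(\theta) T(\underline{x}) + D(\theta) + S(\underline{x}) \bigr). \]

Note: We can also write
\[ f(\underline{x}; \eta) = h(\underline{x})\, c(\eta) \exp(\eta\, T(\underline{x})), \]
where $h(\underline{x}) = \exp(S(\underline{x}))$, $\eta = Q(\theta)$, and $c(\eta) = \exp(D(Q^{-1}(\eta)))$, and call this the exponential family in canonical form for a natural parameter $\eta$.

Definition 5.2.2: Let $\vartheta \subseteq \mathbb{R}^k$ be a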
kwdimensional interval Let f Q Q E be a family of pdf s or pmf s We assume that the set f Q gt 0 is independent of Q where g an 4 We that the family f Q Q E is a k parameter exponential family if there exist realwvalued functions 91QQkQ and DQ on Q and Borel measurable functions T1 and Hi on IR such that k f Q exp QAQWHQ 196 5m 9 izl Note Similar to the Note following Definition 521 we can express the k parameter exponential family in canonical form for a natural k x 1 parameter vector 97 r11rk l Example 523 Let X N NW 72 with both parameters 1 and 72 unknown We have 1 1 2 1 2 M 1392 1 2 faQ exp m p exp a Ea Eln27ra39 Q 102 Q 01472 p E 175502 gt 0 120 Therefore we 32 T100 302 ME 7123 3639 mg in2m2 830 Thus this is a 2 parameter exponential family 121 re 2 lt 322 a A 5 i z z a 8w 4 s 2g 3 EYE f mm 313 r f 7 i ziix LE 1 sq L4 9 Hits 511 T Li 1135 it their 6 v i frft r Fri 571fter 1 iii 3 5 g E A ii a i 61 Modes of Convergence Definition 611 Let X1 X be iid rv s with common cdf FX 31 Let I 1X be any statistic ie a Borelw measurable function of X that does not involve the population parameters 19 defined on the support X of X The induced probability distribution of 1X is called the sampling distribu tion of 1X l Note i Commonly used statistics are El Sample Mean X X n Sample Variance 8 ZXi X02 i1 Sample Median Order Statistics Min Max etc ii Recall that if X1Xn are iid and if and Va39rX exist then EXXR p E7692 72 Vm X and VaMXn L2 n T6 iii Recall that if X1 X are iid and if X has mgf IX or characteristic function then 1M iiIX or 1 Note Let X6311 be a sequence of rv s on some probability space 12 L P Is there any meaning behind the expression lign X X 339 Not immediately under the usual definitions of limits We R 00 first need to define modes of convergence for rv s and probabilities l Definition 612 Let Xn hi be a sequence of rv s with cdf s FR 351 and let X be a rv with cdf F If F 31 gt Fy at all continuity points of F we that X converges in distribution to X Xn d gt 
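$X$), or $X_n$ converges in law to $X$ ($X_n \xrightarrow{L} X$), or $F_n$ converges weakly to $F$ ($F_n \xrightarrow{w} F$).

Example 6.1.3: Let $X_n \sim N(0, \frac{1}{n})$. Then

$$F_n(x) = \int_{-\infty}^{x} \frac{\exp(-\frac{1}{2} n t^2)}{\sqrt{2\pi / n}} \, dt = \int_{-\infty}^{\sqrt{n} x} \frac{\exp(-\frac{1}{2} s^2)}{\sqrt{2\pi}} \, ds = \Phi(\sqrt{n} x)$$

$$\Longrightarrow F_n(x) \to \begin{cases} \Phi(\infty) = 1, & \text{if } x > 0 \\ \Phi(0) = \frac{1}{2}, & \text{if } x = 0 \\ \Phi(-\infty) = 0, & \text{if } x < 0 \end{cases}$$

If $F_X(x) = 1$ for $x \ge 0$ and $F_X(x) = 0$ for $x < 0$, the only point of discontinuity is at $x = 0$. Everywhere else, $\Phi(\sqrt{n} x) = F_n(x) \to F_X(x)$. So, $X_n \xrightarrow{d} X$, where $P(X = 0) = 1$, or $X_n \xrightarrow{d} 0$ since the limiting rv here is degenerate, i.e., it has a Dirac(0) distribution.

Example 6.1.4: In this example, the sequence $\{F_n\}_{n=1}^{\infty}$ converges pointwise to something that is not a cdf. Let $X_n \sim$ Dirac($n$), i.e., $P(X_n = n) = 1$. Then

$$F_n(x) = \begin{cases} 0, & x < n \\ 1, & x \ge n \end{cases}$$

It is $F_n(x) \to 0 \ \forall x$, which is not a cdf. Thus, there is no rv $X$ such that $X_n \xrightarrow{d} X$.

Example 6.1.5: Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of rv's such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = n) = \frac{1}{n}$, and let $X \sim$ Dirac(0), i.e., $P(X = 0) = 1$. It is

$$F_n(x) = \begin{cases} 0, & x < 0 \\ 1 - \frac{1}{n}, & 0 \le x < n \\ 1, & x \ge n \end{cases} \qquad F_X(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}$$

It holds that $F_n \xrightarrow{w} F_X$, but $E(X_n^k) = n^{k-1} \not\to E(X^k) = 0$. Thus, convergence in distribution does not imply convergence of moments/means.

Note: Convergence in distribution does not say that the $X_n$'s are close to each other or to $X$. It only means that their cdf's are (eventually) close to some cdf $F$. The $X_n$'s do not even have to be defined on the same probability space.

Example 6.1.6: Let $X$ and $\{X_n\}_{n=1}^{\infty}$ be iid $N(0, 1)$. Obviously, $X_n \xrightarrow{d} X$, but $\lim_{n \to \infty} X_n \ne X$.

Theorem 6.1.7: Let $X$ and $\{X_n\}_{n=1}^{\infty}$ be discrete rv's with support $\mathcal{X}$ and $\{\mathcal{X}_n\}_{n=1}^{\infty}$, respectively. Define the countable set $A = \mathcal{X} \cup \bigcup_{n=1}^{\infty} \mathcal{X}_n = \{a_k : k = 1, 2, 3, \ldots\}$. Let $p_k = P(X = a_k)$ and $p_{nk} = P(X_n = a_k)$. Then it holds that $p_{nk} \to p_k \ \forall k$ iff $X_n \xrightarrow{d} X$.

Theorem 6.1.8: Let $X$ and $\{X_n\}_{n=1}^{\infty}$ be continuous rv's with pdf's $f$ and $\{f_n\}_{n=1}^{\infty}$, respectively. If $f_n(x) \to f(x)$ for almost all $x$ as $n \to \infty$, then $X_n \xrightarrow{d} X$.

Theorem 6.1.9: Let $X$ and $\{X_n\}_{n=1}^{\infty}$ be rv's such that $X_n \xrightarrow{d} X$. Let $c \in \mathbb{R}$ be a constant. Then it holds:
(i) $X_n + c \xrightarrow{d} X + c$.
(ii) $cX_n \xrightarrow{d} cX$.
(iii) If $a_n \to a$ and $b_n \to b$, then $a_n X_n + b_n \xrightarrow{d} aX + b$.

Proof: Part (iii): Suppose that $a > 0$, $a_n > 0$. Let $Y_n = a_n X_n + b_n$ and $Y = aX + b$. It is

$$F_Y(y) = P(Y \le y) = P(aX + b \le y) = P\left(X \le \frac{y - b}{a}\right) = F_X\left(\frac{y - b}{a}\right).$$

Likewise, $F_{Y_n}(y) = F_{X_n}\left(\frac{y - b_n}{a_n}\right)$. If $y$ is a continuity point of $F_Y$, $\frac{y - b}{a}$ is a continuity point of $F_X$. Since $a_n \to a$, $b_n \to b$, and $F_{X_n}(x) \to F_X(x)$, it follows that $F_{Y_n}(y) \to F_Y(y)$ for every continuity point $y$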
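of $F_Y$. Thus, $a_n X_n + b_n \xrightarrow{d} aX + b$.

Stat 6710 Mathematical Statistics I, Fall Semester 1999
Lecture 37: Wednesday 11/24/1999. Scribe: Jürgen Symanzik, Rich Madsen

Definition 6.1.10: Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of rv's defined on a probability space $(\Omega, L, P)$. We say that $X_n$ converges in probability to a rv $X$ ($X_n \xrightarrow{p} X$, $P\text{-}\lim_{n \to \infty} X_n = X$) if

$$\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0 \quad \forall \varepsilon > 0.$$

Note: The following are equivalent:

$$P(|X_n - X| > \varepsilon) \to 0 \iff P(|X_n - X| \le \varepsilon) \to 1 \iff X_n - X \xrightarrow{p} 0.$$

If $X$ is degenerate, i.e., $P(X = c) = 1$, we say that $X_n$ is consistent for $c$. For example, let $X_n$ be such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = 1) = \frac{1}{n}$. Then

$$P(|X_n| > \varepsilon) = \begin{cases} \frac{1}{n}, & 0 < \varepsilon < 1 \\ 0, & \varepsilon \ge 1 \end{cases}$$

Therefore, $\lim_{n \to \infty} P(|X_n| > \varepsilon) = 0 \ \forall \varepsilon > 0$. So $X_n \xrightarrow{p} 0$, i.e., $X_n$ is consistent for 0.

Theorem 6.1.11:
(i) $X_n \xrightarrow{p} X \iff X_n - X \xrightarrow{p} 0$.
(ii) $X_n \xrightarrow{p} X$, $X_n \xrightarrow{p} Y \Longrightarrow P(X = Y) = 1$.
(iii) $X_n \xrightarrow{p} X$, $X_m \xrightarrow{p} X \Longrightarrow X_n - X_m \xrightarrow{p} 0$ as $n, m \to \infty$.
(iv) $X_n \xrightarrow{p} X$, $Y_n \xrightarrow{p} Y \Longrightarrow X_n \pm Y_n \xrightarrow{p} X \pm Y$.
(v) $X_n \xrightarrow{p} X$, $k \in \mathbb{R}$ a constant $\Longrightarrow kX_n \xrightarrow{p} kX$.
(vi) $X_n \xrightarrow{p} k$, $k \in \mathbb{R}$ a constant $\Longrightarrow X_n^r \xrightarrow{p} k^r \ \forall r \in \mathbb{N}$.
(vii) $X_n \xrightarrow{p} a$, $Y_n \xrightarrow{p} b$, $a, b \in \mathbb{R} \Longrightarrow X_n Y_n \xrightarrow{p} ab$.
(viii) $X_n \xrightarrow{p} 1 \Longrightarrow X_n^{-1} \xrightarrow{p} 1$.
(ix) $X_n \xrightarrow{p} a$, $Y_n \xrightarrow{p} b$, $a \in \mathbb{R}$, $b \in \mathbb{R} - \{0\} \Longrightarrow \frac{X_n}{Y_n} \xrightarrow{p} \frac{a}{b}$.
(x) $X_n \xrightarrow{p} X$, $Y$ an arbitrary rv $\Longrightarrow X_n Y \xrightarrow{p} XY$.
(xi) $X_n \xrightarrow{p} X$, $Y_n \xrightarrow{p} Y \Longrightarrow X_n Y_n \xrightarrow{p} XY$.

Proof: See Rohatgi, pages 244-245, for partial proofs.

Theorem 6.1.12: Let $X_n \xrightarrow{p} X$ and let $g$ be a continuous function on $\mathbb{R}$. Then $g(X_n) \xrightarrow{p} g(X)$.

Proof: Preconditions:
1.) $X$ rv $\Longrightarrow \forall \varepsilon > 0 \ \exists k = k(\varepsilon) : P(|X| > k) < \frac{\varepsilon}{2}$.
2.) $g$ is continuous on $\mathbb{R}$ $\Longrightarrow$ $g$ is also uniformly continuous (see the definition of u.c. in Theorem 3.3.3 (iii)) on $[-k, k]$ $\Longrightarrow \exists \delta = \delta(\varepsilon, k) : |X| \le k, \ |X_n - X| < \delta \Longrightarrow |g(X_n) - g(X)| < \varepsilon$.

Let
$A = \{|X| \le k\} = \{\omega : |X(\omega)| \le k\}$,
$B = \{|X_n - X| < \delta\} = \{\omega : |X_n(\omega) - X(\omega)| < \delta\}$,
$C = \{|g(X_n) - g(X)| < \varepsilon\} = \{\omega : |g(X_n(\omega)) - g(X(\omega))| < \varepsilon\}$.

If $\omega \in A \cap B \Longrightarrow \omega \in C$
$\Longrightarrow A \cap B \subseteq C$
$\Longrightarrow C^c \subseteq (A \cap B)^c = A^c \cup B^c$
$\Longrightarrow P(C^c) \le P(A^c \cup B^c) \le P(A^c) + P(B^c)$.

Now

$$P(|g(X_n) - g(X)| \ge \varepsilon) \le P(|X| > k) + P(|X_n - X| \ge \delta) \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \quad \text{for } n \ge n_0(\varepsilon, \delta, k),$$

where the first term is bounded by 1.) and the second term is at most $\frac{\varepsilon}{2}$ for $n \ge n_0(\varepsilon, \delta, k)$ since $X_n \xrightarrow{p} X$.

Corollary 6.1.13: Let $X_n \xrightarrow{p} c$, $c \in \mathbb{R}$, and let $g$ be a continuous function on $\mathbb{R}$. Then $g(X_n) \xrightarrow{p} g(c)$.

Theorem 6.1.14: $X_n \xrightarrow{p} X \Longrightarrow X_n \xrightarrow{d} X$.

Proof: $X_n \xrightarrow{p} X \iff P(|X_n - X| > \varepsilon) \to 0$ as $n \to \infty \ \forall \varepsilon > 0$. It holds:

$$P(X \le x - \varepsilon) = P(X \le x - \varepsilon, |X_n - X| \le \varepsilon) + P(X \le x - \varepsilon, |X_n - X| > \varepsilon) \overset{(A)}{\le} P(X_n \le x) + P(|X_n - X| > \varepsilon)$$

(A) holds since $X \le x - \varepsilon$ and $X_n$ within $\varepsilon$ of $X$, thus $X_n \le x$. Similarly, it holds:

$$P(X_n \le x) = P(X_n \le x, |X_n - X| \le \varepsilon) + P(X_n \le x, |X_n - X| > \varepsilon) \le P(X \le x + \varepsilon) + P(|X_n - X| > \varepsilon)$$

Combining the 2 inequalities from above gives:

$$P(X \le x - \varepsilon) - P(|X_n - X| > \varepsilon) \le P(X_n \le x) = F_n(x) \le P(X \le x + \varepsilon) + P(|X_n - X| > \varepsilon),$$

where both terms $P(|X_n - X| > \varepsilon) \to 0$ as $n \to \infty$. Therefore,

$$P(X \le x - \varepsilon) \le F_n(x) \le P(X \le x + \varepsilon) \quad \text{as } n \to \infty.$$

Since the cdf's $F_n(\cdot)$ are not necessarily left continuous, we get the following result for $\varepsilon \downarrow 0$:

$$P(X < x) \le \lim_{n \to \infty} F_n(x) \le P(X \le x) = F_X(x).$$

Let $x$ be a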
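continuity point of $F$. Then it holds:

$$F(x) = P(X < x) \le \lim_{n \to \infty} F_n(x) \le F(x) \Longrightarrow F_n(x) \to F(x) \Longrightarrow X_n \xrightarrow{d} X.$$

Theorem 6.1.15: Let $c \in \mathbb{R}$ be a constant. Then it holds:

$$X_n \xrightarrow{d} c \iff X_n \xrightarrow{p} c.$$

Example 6.1.16: In this example, we will see that $X_n \xrightarrow{d} X$ does not imply $X_n \xrightarrow{p} X$ for some rv $X$. Let $X_n$ be identically distributed rv's and let $(X_n, X)$ have the following joint distribution:

             X = 0    X = 1
  X_n = 0      0       1/2     1/2
  X_n = 1     1/2       0      1/2
              1/2      1/2      1

Obviously, $X_n \xrightarrow{d} X$ since all have exactly the same cdf, but for any $\varepsilon \in (0, 1)$, it is

$$P(|X_n - X| > \varepsilon) = P(|X_n - X| = 1) = 1 \quad \forall n,$$

so $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) \ne 0$. Therefore, $X_n \not\xrightarrow{p} X$.

Theorem 6.1.17: Let $\{X_n\}_{n=1}^{\infty}$ and $\{Y_n\}_{n=1}^{\infty}$ be sequences of rv's and $X$ be a rv defined on a probability space $(\Omega, L, P)$. Then it holds:

$$Y_n \xrightarrow{d} X, \ |X_n - Y_n| \xrightarrow{p} 0 \Longrightarrow X_n \xrightarrow{d} X.$$

Proof: Similar to the proof of Theorem 6.1.14. See also Rohatgi, page 253, Theorem 14.

Lecture 38: Monday 11/29/1999. Scribe: Jürgen Symanzik, Bill Morphet

Theorem 6.1.18: Slutsky's Theorem
Let $\{X_n\}_{n=1}^{\infty}$ and $\{Y_n\}_{n=1}^{\infty}$ be sequences of rv's and $X$ be a rv defined on a probability space $(\Omega, L, P)$. Let $c \in \mathbb{R}$ be a constant. Then it holds:
(i) $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{p} c \Longrightarrow X_n + Y_n \xrightarrow{d} X + c$.
(ii) $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{p} c \Longrightarrow X_n Y_n \xrightarrow{d} cX$. If $c = 0$, then also $X_n Y_n \xrightarrow{p} 0$.
(iii) $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{p} c \ne 0 \Longrightarrow \frac{X_n}{Y_n} \xrightarrow{d} \frac{X}{c}$.

Proof:
(i) $Y_n \xrightarrow{p} c \overset{Th. 6.1.11 (i)}{\iff} Y_n - c \xrightarrow{p} 0$
$\Longrightarrow (Y_n + X_n) - (X_n + c) = Y_n - c \xrightarrow{p} 0$ (A)
$X_n \xrightarrow{d} X \overset{Th. 6.1.9 (i)}{\Longrightarrow} X_n + c \xrightarrow{d} X + c$ (B)
Combining (A) and (B), it follows from Theorem 6.1.17: $X_n + Y_n \xrightarrow{d} X + c$.

(ii) Case $c = 0$: $\forall \varepsilon > 0 \ \forall k > 0$, it is

$$P(|X_n Y_n| > \varepsilon) = P(|X_n Y_n| > \varepsilon, |Y_n| \le \tfrac{\varepsilon}{k}) + P(|X_n Y_n| > \varepsilon, |Y_n| > \tfrac{\varepsilon}{k}) \le P(|X_n| \tfrac{\varepsilon}{k} > \varepsilon) + P(|Y_n| > \tfrac{\varepsilon}{k}) = P(|X_n| > k) + P(|Y_n| > \tfrac{\varepsilon}{k}).$$

Since $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} 0$, it follows

$$\limsup_{n \to \infty} P(|X_n Y_n| > \varepsilon) \le P(|X| > k) \to 0 \quad \text{as } k \to \infty.$$

Therefore, $X_n Y_n \xrightarrow{p} 0$.

Case $c \ne 0$: Since $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$, it follows from the case $c = 0$ that $X_n Y_n - cX_n = X_n (Y_n - c) \xrightarrow{p} 0$
$\Longrightarrow X_n Y_n - cX_n \xrightarrow{p} 0$.
Since $cX_n \xrightarrow{d} cX$ by Theorem 6.1.9 (ii), it follows from Theorem 6.1.17: $X_n Y_n \xrightarrow{d} cX$.

(iii) Let $Z_n \xrightarrow{p} 1$ and let $Y_n = cZ_n$. Since $c \ne 0$, it is $\frac{1}{Y_n} = \frac{1}{Z_n} \cdot \frac{1}{c}$
$\overset{Th. 6.1.11 (v, viii)}{\Longrightarrow} \frac{1}{Y_n} \xrightarrow{p} \frac{1}{c}$.
With part (ii) above, it follows: $X_n \xrightarrow{d} X$ and $\frac{1}{Y_n} \xrightarrow{p} \frac{1}{c} \Longrightarrow \frac{X_n}{Y_n} \xrightarrow{d} \frac{X}{c}$.

Definition 6.1.19: Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of rv's such that $E(|X_n|^r) < \infty$ for some $r > 0$. We say that $X_n$ converges in the $r$th mean to a rv $X$ ($X_n \xrightarrow{r} X$) if $E(|X|^r) < \infty$ and

$$\lim_{n \to \infty} E(|X_n - X|^r) = 0.$$

Example 6.1.20: Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of rv's defined by $P(X_n = 0) = 1 - \frac{1}{n}$ and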
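$P(X_n = 1) = \frac{1}{n}$. It is $E(|X_n|^r) = \frac{1}{n} \ \forall r > 0$. Therefore, $X_n \xrightarrow{r} 0 \ \forall r > 0$.

Note: The special cases $r = 1$ and $r = 2$ are called convergence in absolute mean for $r = 1$ ($X_n \xrightarrow{1} X$) and convergence in mean square for $r = 2$ ($X_n \xrightarrow{ms} X$ or $X_n \xrightarrow{2} X$).

Theorem 6.1.21: Assume that $X_n \xrightarrow{r} X$ for some $r > 0$. Then $X_n \xrightarrow{p} X$.

Proof: Using Markov's Inequality (Corollary 3.5.2), it holds for any $\varepsilon > 0$:

$$\frac{E(|X_n - X|^r)}{\varepsilon^r} \ge P(|X_n - X| \ge \varepsilon) \ge P(|X_n - X| > \varepsilon)$$

$X_n \xrightarrow{r} X \Longrightarrow \lim_{n \to \infty} E(|X_n - X|^r) = 0$

$$\Longrightarrow \lim_{n \to \infty} P(|X_n - X| > \varepsilon) \le \lim_{n \to \infty} \frac{E(|X_n - X|^r)}{\varepsilon^r} = 0 \Longrightarrow X_n \xrightarrow{p} X.$$

Example 6.1.22: Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of rv's defined by $P(X_n = 0) = 1 - \frac{1}{n^r}$ and $P(X_n = n) = \frac{1}{n^r}$ for some $r > 0$. For any $\varepsilon > 0$, $P(|X_n| > \varepsilon) \to 0$ as $n \to \infty$, so $X_n \xrightarrow{p} 0$. For $0 < s < r$, $E(|X_n|^s) = \frac{1}{n^{r-s}} \to 0$ as $n \to \infty$, so $X_n \xrightarrow{s} 0$. But $E(|X_n|^r) = 1 \not\to 0$ as $n \to \infty$, so $X_n \not\xrightarrow{r} 0$.

Theorem 6.1.23: If $X_n \xrightarrow{r} X$, then it holds:
(i) $\lim_{n \to \infty} E(|X_n|^r) = E(|X|^r)$; and
(ii) $X_n \xrightarrow{s} X$ for $0 < s < r$.

Proof:
(i) For $0 < r \le 1$, it holds:
$E(|X_n|^r) = E(|X_n - X + X|^r) \le E(|X_n - X|^r + |X|^r)$
$\Longrightarrow E(|X_n|^r) - E(|X|^r) \le E(|X_n - X|^r)$
$\Longrightarrow \lim_{n \to \infty} E(|X_n|^r) - E(|X|^r) \le \lim_{n \to \infty} E(|X_n - X|^r) = 0$
$\Longrightarrow \lim_{n \to \infty} E(|X_n|^r) \le E(|X|^r)$ (A)

Similarly,
$E(|X|^r) = E(|X - X_n + X_n|^r) \le E(|X_n - X|^r + |X_n|^r)$
$\Longrightarrow E(|X|^r) - E(|X_n|^r) \le E(|X_n - X|^r)$
$\Longrightarrow E(|X|^r) - \lim_{n \to \infty} E(|X_n|^r) \le \lim_{n \to \infty} E(|X_n - X|^r) = 0$
$\Longrightarrow E(|X|^r) \le \lim_{n \to \infty} E(|X_n|^r)$ (B)

Combining (A) and (B) gives $\lim_{n \to \infty} E(|X_n|^r) = E(|X|^r)$.

For $r > 1$, it follows from Minkowski's Inequality (Theorem 4.8.3):
$[E(|X_n - X + X|^r)]^{1/r} \le [E(|X_n - X|^r)]^{1/r} + [E(|X|^r)]^{1/r}$
$\Longrightarrow [E(|X_n|^r)]^{1/r} - [E(|X|^r)]^{1/r} \le [E(|X_n - X|^r)]^{1/r}$
$\Longrightarrow \lim_{n \to \infty} [E(|X_n|^r)]^{1/r} - [E(|X|^r)]^{1/r} \le \lim_{n \to \infty} [E(|X_n - X|^r)]^{1/r} = 0$ since $X_n \xrightarrow{r} X$
$\Longrightarrow \lim_{n \to \infty} [E(|X_n|^r)]^{1/r} \le [E(|X|^r)]^{1/r}$ (C)

Similarly,
$[E(|X - X_n + X_n|^r)]^{1/r} \le [E(|X_n - X|^r)]^{1/r} + [E(|X_n|^r)]^{1/r}$
$\Longrightarrow [E(|X|^r)]^{1/r} - \lim_{n \to \infty} [E(|X_n|^r)]^{1/r} \le \lim_{n \to \infty} [E(|X_n - X|^r)]^{1/r} = 0$ since $X_n \xrightarrow{r} X$
$\Longrightarrow [E(|X|^r)]^{1/r} \le \lim_{n \to \infty} [E(|X_n|^r)]^{1/r}$ (D)

Combining (C) and (D) gives $\lim_{n \to \infty} [E(|X_n|^r)]^{1/r} = [E(|X|^r)]^{1/r} \Longrightarrow \lim_{n \to \infty} E(|X_n|^r) = E(|X|^r)$.

(ii) For $1 \le s < r$, it follows from Lyapunov's Inequality (Theorem 3.5.4):
$[E(|X_n - X|^s)]^{1/s} \le [E(|X_n - X|^r)]^{1/r}$
$\Longrightarrow E(|X_n - X|^s) \le [E(|X_n - X|^r)]^{s/r}$
$\Longrightarrow \lim_{n \to \infty} E(|X_n - X|^s) \le \lim_{n \to \infty} [E(|X_n - X|^r)]^{s/r} = 0$ since $X_n \xrightarrow{r} X$
$\Longrightarrow X_n \xrightarrow{s} X$.
An additional proof is required for $0 < s < r \le 1$.

Lectures 39 & 40: Wednesday 12/1/1999 & Friday 12/3/1999. Scribe: Jürgen Symanzik, Hanadi B. Eltahir

Definition 6.1.24: Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of rv's on $(\Omega, L, P)$. We say that $X_n$ converges almost surely to a rv $X$ ($X_n \xrightarrow{a.s.} X$), or $X_n$ converges with probability 1 to $X$ ($X_n \xrightarrow{w.p.1} X$), or $X_n$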
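converges strongly to $X$ if

$$P(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$$

Note: An interesting characterization of convergence with probability 1 and convergence in probability can be found in Parzen (1960), "Modern Probability Theory and Its Applications", on page 416 (see Handout).

Example 6.1.25: Let $\Omega = [0, 1]$ and $P$ a uniform distribution on $\Omega$. Let $X_n(\omega) = \omega + \omega^n$ and $X(\omega) = \omega$. For $\omega \in [0, 1)$, $\omega^n \to 0$ as $n \to \infty$. So $X_n(\omega) \to X(\omega) \ \forall \omega \in [0, 1)$. However, for $\omega = 1$, $X_n(1) = 2 \ne 1 = X(1) \ \forall n$, i.e., convergence fails at $\omega = 1$. Anyway, since $P(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = P(\{\omega \in [0, 1)\}) = 1$, it is $X_n \xrightarrow{a.s.} X$.

Theorem 6.1.26: $X_n \xrightarrow{a.s.} X \Longrightarrow X_n \xrightarrow{p} X$.

Proof: Choose $\varepsilon > 0$ and $\delta > 0$. Find $n_0 = n_0(\varepsilon, \delta)$ such that

$$P\left( \bigcap_{n=n_0}^{\infty} \{|X_n - X| \le \varepsilon\} \right) \ge 1 - \delta.$$

Since $\bigcap_{n=n_0}^{\infty} \{|X_n - X| \le \varepsilon\} \subseteq \{|X_n - X| \le \varepsilon\} \ \forall n \ge n_0$, it is

$$P(\{|X_n - X| \le \varepsilon\}) \ge P\left( \bigcap_{n=n_0}^{\infty} \{|X_n - X| \le \varepsilon\} \right) \ge 1 - \delta \quad \forall n \ge n_0.$$

Therefore, $P(\{|X_n - X| \le \varepsilon\}) \to 1$ as $n \to \infty$. Thus, $X_n \xrightarrow{p} X$.

Example 6.1.27: $X_n \xrightarrow{p} X$ does not imply $X_n \xrightarrow{a.s.} X$.
Let $\Omega = (0, 1]$ and $P$ a uniform distribution on $\Omega$. Define $A_n$ by
$A_1 = (0, \frac{1}{2}]$, $A_2 = (\frac{1}{2}, 1]$,
$A_3 = (0, \frac{1}{4}]$, $A_4 = (\frac{1}{4}, \frac{1}{2}]$, $A_5 = (\frac{1}{2}, \frac{3}{4}]$, $A_6 = (\frac{3}{4}, 1]$,
$A_7 = (0, \frac{1}{8}]$, $\ldots$
Let $X_n(\omega) = I_{A_n}(\omega)$. It is $P(|X_n - 0| \ge \varepsilon) \to 0 \ \forall \varepsilon > 0$ since $X_n$ is 0 except on $A_n$ and $P(A_n) \downarrow 0$. Thus $X_n \xrightarrow{p} 0$. But $P(\{\omega : X_n(\omega) \to 0\}) = 0$ (and not 1) because any $\omega$ keeps being in some $A_n$ beyond any $n_0$, i.e., $X_n(\omega)$ looks like $0 \ldots 010 \ldots 010 \ldots 010 \ldots$, so $X_n \not\xrightarrow{a.s.} 0$.

Example 6.1.28: $X_n \xrightarrow{r} X$ does not imply $X_n \xrightarrow{a.s.} X$.
Let $X_n$ be independent rv's such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = 1) = \frac{1}{n}$. It is $E(|X_n - 0|^r) = E(|X_n|^r) = E(X_n) = \frac{1}{n} \to 0$ as $n \to \infty$, so $X_n \xrightarrow{r} 0 \ \forall r > 0$. But

$$P(X_n = 0 \ \forall m \le n \le n_0) = \prod_{n=m}^{n_0} \left(1 - \frac{1}{n}\right) = \left(\frac{m-1}{m}\right)\left(\frac{m}{m+1}\right)\left(\frac{m+1}{m+2}\right) \cdots \left(\frac{n_0 - 2}{n_0 - 1}\right)\left(\frac{n_0 - 1}{n_0}\right) = \frac{m-1}{n_0}.$$

As $n_0 \to \infty$, it is $P(X_n = 0 \ \forall m \le n \le n_0) \to 0 \ \forall m$, so $X_n \not\xrightarrow{a.s.} 0$.

Example 6.1.29: $X_n \xrightarrow{a.s.} X$ does not imply $X_n \xrightarrow{r} X$.
Let $\Omega = [0, 1]$ and $P$ a uniform distribution on $\Omega$. Let $A_n = [0, \frac{1}{\ln n}]$. Let $X_n(\omega) = n I_{A_n}(\omega)$ and $X(\omega) = 0$. It holds that $\forall \omega > 0 \ \exists n_0 : \frac{1}{\ln n_0} < \omega \Longrightarrow X_n(\omega) = 0 \ \forall n > n_0$, and $P(\omega = 0) = 0$. Thus $X_n \xrightarrow{a.s.} 0$. But $E(|X_n - 0|^r) = \frac{n^r}{\ln n} \to \infty \ \forall r > 0$, so $X_n \not\xrightarrow{r} X$.

Lecture 41: Monday 12/6/1999. Scribe: Jürgen Symanzik

6.2 Weak Laws of Large Numbers

Theorem 6.2.1: WLLN, Version I
Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of iid rv's with mean $E(X_i) = \mu$ and variance $Var(X_i) = \sigma^2 < \infty$. Let $\overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. Then it holds

$$\lim_{n \to \infty} P(|\overline{X}_n - \mu| \ge \varepsilon) = 0 \quad \forall \varepsilon > 0,$$

i.e., $\overline{X}_n \xrightarrow{p} \mu$.

Proof: By Markov's Inequality, it holds for all $\varepsilon > 0$:

$$P(|\overline{X}_n - \mu| \ge \varepsilon) \le \frac{E((\overline{X}_n - \mu)^2)}{\varepsilon^2} = \frac{Var(\overline{X}_n)}{\varepsilon^2} = \frac{\sigma^2}{n \varepsilon^2} \to 0 \quad \text{as } n \to \infty.$$

Note: For iid rv's with finite variance, $\overline{X}_n$ is consistent for $\mu$. A more general way to derive a WLLN follows in the next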
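Definition.

Definition 6.2.2: Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of rv's. Let $T_n = \sum_{i=1}^{n} X_i$. We say that $\{X_i\}$ obeys the WLLN with respect to a sequence of norming constants $\{B_i\}_{i=1}^{\infty}$, $B_i > 0$, $B_i \uparrow \infty$, if there exists a sequence of centering constants $\{A_i\}_{i=1}^{\infty}$ such that

$$B_n^{-1} (T_n - A_n) \xrightarrow{p} 0.$$

Theorem 6.2.3: Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of pairwise uncorrelated rv's with $E(X_i) = \mu_i$ and $Var(X_i) = \sigma_i^2$, $i \in \mathbb{N}$. If $\sum_{i=1}^{n} \sigma_i^2 \to \infty$ as $n \to \infty$, we can choose $A_n = \sum_{i=1}^{n} \mu_i$ and $B_n = \sum_{i=1}^{n} \sigma_i^2$ and get

$$\frac{\sum_{i=1}^{n} (X_i - \mu_i)}{\sum_{i=1}^{n} \sigma_i^2} \xrightarrow{p} 0.$$

Proof: By Markov's Inequality, it holds for all $\varepsilon > 0$:

$$P\left( \left| \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i \right| > \varepsilon \sum_{i=1}^{n} \sigma_i^2 \right) \le \frac{E\left( \left( \sum_{i=1}^{n} (X_i - \mu_i) \right)^2 \right)}{\varepsilon^2 \left( \sum_{i=1}^{n} \sigma_i^2 \right)^2} = \frac{1}{\varepsilon^2 \sum_{i=1}^{n} \sigma_i^2} \to 0 \quad \text{as } n \to \infty.$$

Note: To obtain Theorem 6.2.1, we choose $A_n = n\mu$ and $B_n = n\sigma^2$.

Theorem 6.2.4: Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of rv's. Let $\overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. A necessary and sufficient condition for $\{X_i\}$ to obey the WLLN with respect to $B_n = n$ is that

$$E\left( \frac{\overline{X}_n^2}{1 + \overline{X}_n^2} \right) \to 0 \quad \text{as } n \to \infty.$$

Proof: Rohatgi, page 258, Theorem 2.

Example 6.2.5: Let $(X_1, \ldots, X_n)$ be jointly Normal with $E(X_i) = 0$, $E(X_i^2) = 1$ for all $i$, and $Cov(X_i, X_j) = \rho$ if $|i - j| = 1$ and $Cov(X_i, X_j) = 0$ if $|i - j| > 1$. Then $T_n \sim N(0, n + 2(n-1)\rho) = N(0, \sigma^2)$. It is

$$E\left( \frac{\overline{X}_n^2}{1 + \overline{X}_n^2} \right) = E\left( \frac{T_n^2}{n^2 + T_n^2} \right) = \frac{2}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} \frac{x^2}{n^2 + x^2} e^{-\frac{x^2}{2\sigma^2}} \, dx$$

and, with the substitution $y = \frac{x}{\sigma}$, $dy = \frac{dx}{\sigma}$,

$$= \frac{2}{\sqrt{2\pi}} \int_0^{\infty} \frac{\sigma^2 y^2}{n^2 + \sigma^2 y^2} e^{-\frac{y^2}{2}} \, dy = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} \frac{(n + 2(n-1)\rho) y^2}{n^2 + (n + 2(n-1)\rho) y^2} e^{-\frac{y^2}{2}} \, dy \le \frac{n + 2(n-1)\rho}{n^2} \int_0^{\infty} \frac{2}{\sqrt{2\pi}} y^2 e^{-\frac{y^2}{2}} \, dy \to 0 \quad \text{as } n \to \infty,$$

since the last integral equals 1 (the variance of a $N(0, 1)$ distribution).

Note: We would like to have a WLLN that just depends on means but does not depend on the existence of finite variances. To approach this, we consider the following:
Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of rv's. Let $T_n = \sum_{i=1}^{n} X_i$. We truncate each $|X_i|$ at $c > 0$ and get

$$X_i^c = \begin{cases} X_i, & |X_i| \le c \\ 0, & \text{otherwise} \end{cases}$$

Let $T_n^c = \sum_{i=1}^{n} X_i^c$ and $m_n = \sum_{i=1}^{n} E(X_i^c)$.

Lemma 6.2.6: For $T_n$, $T_n^c$, and $m_n$ as defined in the Note above, it holds:

$$P(|T_n - m_n| > \varepsilon) \le P(|T_n^c - m_n| > \varepsilon) + \sum_{i=1}^{n} P(|X_i| > c) \quad \forall \varepsilon > 0.$$

Proof: It holds for all $\varepsilon > 0$:

$P(|T_n - m_n| > \varepsilon) = P(|T_n - m_n| > \varepsilon \text{ and } |X_i| \le c \ \forall i \in \{1, \ldots, n\}) + P(|T_n - m_n| > \varepsilon \text{ and } |X_i| > c \text{ for at least one } i \in \{1, \ldots, n\})$
$\overset{(*)}{\le} P(|T_n^c - m_n| > \varepsilon) + P(|X_i| > c \text{ for at least one } i \in \{1, \ldots, n\})$
$\le P(|T_n^c - m_n| > \varepsilon) + \sum_{i=1}^{n} P(|X_i| > c)$

$(*)$ holds since $T_n^c = T_n$ when $|X_i| \le c \ \forall i \in \{1, \ldots, n\}$.

Note: If the $X_i$'s are identically distributed, then

$$P(|T_n - m_n| > \varepsilon) \le P(|T_n^c - m_n| > \varepsilon) + n P(|X_1| > c) \quad \forall \varepsilon > 0.$$

If the $X_i$'s are iid,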
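then

$$P(|T_n - m_n| > \varepsilon) \le \frac{n E((X_1^c)^2)}{\varepsilon^2} + n P(|X_1| > c) \quad \forall \varepsilon > 0. \qquad (*)$$

Note that $P(|X_i| > c) = P(|X_1| > c) \ \forall i \in \mathbb{N}$ if the $X_i$'s are identically distributed and that $E((X_i^c)^2) = E((X_1^c)^2) \ \forall i \in \mathbb{N}$ if the $X_i$'s are iid.

Theorem 6.2.7: Khintchine's WLLN
Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of iid rv's with finite mean $E(X_i) = \mu$. Then it holds:

$$\overline{X}_n = \frac{1}{n} T_n \xrightarrow{p} \mu.$$

Proof: If we take $c = n$ and replace $\varepsilon$ by $n\varepsilon$ in $(*)$ in the Note above, we get

$$P\left( \left| \frac{T_n - m_n}{n} \right| > \varepsilon \right) = P(|T_n - m_n| > n\varepsilon) \le \frac{E((X_1^n)^2)}{n \varepsilon^2} + n P(|X_1| > n).$$

Since $E(|X_1|) < \infty$, it is $n P(|X_1| > n) \to 0$ as $n \to \infty$ by Theorem 3.1.9. From Corollary 3.1.12, we know that $E(|X|^\alpha) = \alpha \int_0^{\infty} x^{\alpha - 1} P(|X| > x) \, dx$. Therefore,

$$E((X_1^n)^2) = 2 \int_0^{n} x P(|X_1^n| > x) \, dx = 2 \int_0^{A} x P(|X_1^n| > x) \, dx + 2 \int_A^{n} x P(|X_1^n| > x) \, dx \overset{(+)}{\le} K + \delta \int_A^{n} dx \le K + n\delta.$$

In $(+)$, $A$ is chosen sufficiently large such that $x P(|X_1^n| > x) < \frac{\delta}{2} \ \forall x \ge A$ for an arbitrary constant $\delta > 0$, and $K > 0$ is a constant. Therefore,

$$\frac{E((X_1^n)^2)}{n \varepsilon^2} \le \frac{K}{n \varepsilon^2} + \frac{\delta}{\varepsilon^2}.$$

Since $\delta$ is arbitrary, we can make the right-hand side of this last inequality arbitrarily small for sufficiently large $n$. Since $E(X_i) = \mu \ \forall i$, it is $\frac{m_n}{n} \to \mu$ as $n \to \infty$.

Note: Theorem 6.2.7 meets the previously stated goal of not having a finite variance requirement.

Lecture 42: Wednesday 12/8/1999. Scribe: Jürgen Symanzik

6.3 Strong Laws of Large Numbers

Definition 6.3.1: Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of rv's. Let $T_n = \sum_{i=1}^{n} X_i$. We say that $\{X_i\}$ obeys the SLLN with respect to a sequence of norming constants $\{B_i\}_{i=1}^{\infty}$, $B_i > 0$, $B_i \uparrow \infty$, if there exists a sequence of centering constants $\{A_i\}_{i=1}^{\infty}$ such that

$$B_n^{-1} (T_n - A_n) \xrightarrow{a.s.} 0.$$

Note: Unless otherwise specified, we will only use the case that $B_n = n$ in this section.

Theorem 6.3.2: $X_n \xrightarrow{a.s.} X \iff \lim_{n \to \infty} P(\sup_{m \ge n} |X_m - X| > \varepsilon) = 0 \ \forall \varepsilon > 0$.

Proof (see also Rohatgi, page 249, Theorem 11): WLOG, we can assume that $X = 0$ since $X_n \xrightarrow{a.s.} X$ implies $X_n - X \xrightarrow{a.s.} 0$. Thus, we have to prove:

$$X_n \xrightarrow{a.s.} 0 \iff \lim_{n \to \infty} P(\sup_{m \ge n} |X_m| > \varepsilon) = 0 \quad \forall \varepsilon > 0.$$

"$\Longrightarrow$": Choose $\varepsilon > 0$ and define

$$A_n(\varepsilon) = \{\sup_{m \ge n} |X_m| > \varepsilon\}, \qquad C = \{\lim_{n \to \infty} X_n = 0\}.$$

We know that $P(C) = 1$ and therefore $P(C^c) = 0$. Let $B_n(\varepsilon) = C \cap A_n(\varepsilon)$. Note that $B_{n+1}(\varepsilon) \subseteq B_n(\varepsilon)$ and for the limit set $\bigcap_{n=1}^{\infty} B_n(\varepsilon) = \emptyset$. It follows that

$$\lim_{n \to \infty} P(B_n(\varepsilon)) = P\left( \bigcap_{n=1}^{\infty} B_n(\varepsilon) \right) = 0.$$

We also have

$$P(B_n(\varepsilon)) = P(A_n(\varepsilon) \cap C) = 1 - P(C^c \cup A_n^c(\varepsilon)) = 1 - P(C^c) - P(A_n^c(\varepsilon)) + P(C^c \cap A_n^c(\varepsilon)) = P(A_n(\varepsilon)),$$

since $P(C^c) = 0$ and $P(C^c \cap A_n^c(\varepsilon)) = 0$

$\Longrightarrow \lim_{n \to \infty}$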
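$P(A_n(\varepsilon)) = 0$.

"$\Longleftarrow$": Assume that $\lim_{n \to \infty} P(A_n(\varepsilon)) = 0 \ \forall \varepsilon > 0$ and define $D(\varepsilon) = \{\limsup_{n \to \infty} |X_n| > \varepsilon\}$. Since $D(\varepsilon) \subseteq A_n(\varepsilon) \ \forall n \in \mathbb{N}$, it follows that $P(D(\varepsilon)) = 0 \ \forall \varepsilon > 0$. Also,

$$C^c = \{\lim_{n \to \infty} X_n \ne 0\} \subseteq \bigcup_{k=1}^{\infty} \left\{ \limsup_{n \to \infty} |X_n| > \frac{1}{k} \right\}$$

$$\Longrightarrow 1 - P(C) \le \sum_{k=1}^{\infty} P\left( D\left( \frac{1}{k} \right) \right) = 0 \Longrightarrow X_n \xrightarrow{a.s.} 0.$$

Note:
(i) $X_n \xrightarrow{a.s.} 0$ implies that $\forall \varepsilon > 0 \ \forall \delta > 0 \ \exists n_0 \in \mathbb{N} : P(\sup_{n \ge n_0} |X_n| > \varepsilon) < \delta$.
(ii) Recall that for a given sequence of events $\{A_n\}_{n=1}^{\infty}$,

$$A = \limsup_{n \to \infty} A_n = \lim_{n \to \infty} \bigcup_{k=n}^{\infty} A_k = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$

is the event that infinitely many of the $A_n$ occur. We write $P(A) = P(A_n \text{ i.o.})$, where i.o. stands for "infinitely often".
(iii) Using the terminology defined in (ii) above, we can rewrite Theorem 6.3.2 as

$$X_n \xrightarrow{a.s.} 0 \iff P(|X_n| > \varepsilon \text{ i.o.}) = 0 \quad \forall \varepsilon > 0.$$

Theorem 6.3.3: Borel-Cantelli Lemma
(i) 1st BC Lemma: Let $\{A_n\}_{n=1}^{\infty}$ be a sequence of events such that $\sum_{n=1}^{\infty} P(A_n) < \infty$. Then $P(A) = 0$.
(ii) 2nd BC Lemma: Let $\{A_n\}_{n=1}^{\infty}$ be a sequence of independent events such that $\sum_{n=1}^{\infty} P(A_n) = \infty$. Then $P(A) = 1$.

Proof:
(i)

$$P(A) = P\left( \lim_{n \to \infty} \bigcup_{k=n}^{\infty} A_k \right) = \lim_{n \to \infty} P\left( \bigcup_{k=n}^{\infty} A_k \right) \le \lim_{n \to \infty} \sum_{k=n}^{\infty} P(A_k) = \lim_{n \to \infty} \left( \sum_{k=1}^{\infty} P(A_k) - \sum_{k=1}^{n-1} P(A_k) \right) \overset{(*)}{=} 0$$

$(*)$ holds since $\sum_{n=1}^{\infty} P(A_n) < \infty$.

(ii) We have $A^c = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k^c$. Therefore,

$$P(A^c) = P\left( \lim_{n \to \infty} \bigcap_{k=n}^{\infty} A_k^c \right) = \lim_{n \to \infty} P\left( \bigcap_{k=n}^{\infty} A_k^c \right).$$

If we choose $n_0 > n$, it holds that $\bigcap_{k=n}^{\infty} A_k^c \subseteq \bigcap_{k=n}^{n_0} A_k^c$. Therefore,

$$P\left( \bigcap_{k=n}^{\infty} A_k^c \right) \le \lim_{n_0 \to \infty} P\left( \bigcap_{k=n}^{n_0} A_k^c \right) \overset{indep.}{=} \lim_{n_0 \to \infty} \prod_{k=n}^{n_0} (1 - P(A_k)) \overset{(+)}{\le} \lim_{n_0 \to \infty} \exp\left( - \sum_{k=n}^{n_0} P(A_k) \right) = 0$$

$\Longrightarrow P(A) = 1$.
$(+)$ holds since $1 - x \le e^{-x}$.

Example 6.3.4: Independence is necessary for the 2nd BC Lemma.
Let $\Omega = (0, 1)$ and $P$ a uniform distribution on $\Omega$. Let $A_n = I_{(0, \frac{1}{n})}(\omega)$. Therefore,

$$\sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} \frac{1}{n} = \infty.$$

But for any $\omega \in \Omega$, $A_n$ occurs only for $1, 2, \ldots, \lfloor \frac{1}{\omega} \rfloor$, where $\lfloor \frac{1}{\omega} \rfloor$ denotes the largest integer ("floor") that is $\le \frac{1}{\omega}$. Therefore, $P(A) = P(A_n \text{ i.o.}) = 0$.

Lemma 6.3.5: Kolmogorov's Inequality
Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of independent rv's with common mean 0 and variances $\sigma_i^2$. Let $T_n = \sum_{i=1}^{n} X_i$. Then it holds:

$$P\left( \max_{1 \le k \le n} |T_k| \ge \varepsilon \right) \le \frac{\sum_{i=1}^{n} \sigma_i^2}{\varepsilon^2} \quad \forall \varepsilon > 0.$$

Proof: See Rohatgi, page 268, Lemma 2.

Lemma 6.3.6: Kronecker's Lemma
If $\sum_{i=1}^{\infty} X_i$ converges to $s < \infty$ and $B_n \uparrow \infty$, then it holds:

$$\frac{1}{B_n} \sum_{k=1}^{n} B_k X_k \to 0.$$

Proof: See Rohatgi, page 269, Lemma 3.

Theorem 6.3.7: Cauchy Criterion

$$X_n \xrightarrow{a.s.} X \iff \lim_{n \to \infty} P\left( \sup_{m} |X_{n+m} - X_n| \le \varepsilon \right) = 1 \quad \forall \varepsilon > 0.$$

Proof: See Rohatgi, page 270, Theorem 5.

Theorem 6.3.8: If $\sum_{n=1}^{\infty} Var(X_n) < \infty$, then $\sum_{n=1}^{\infty} (X_n - E(X_n))$ converges almost surely.

Proof: See Rohatgi, page 272, Theorem 6.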
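As a numerical illustration of the Borel-Cantelli Lemma (Theorem 6.3.3) and of Example 6.3.4 above, the following Python sketch contrasts the nested events $A_n = (0, \frac{1}{n})$ with independent events that have the same probabilities $P(A_n) = \frac{1}{n}$. The function names, the truncation level `N`, and the seed are assumptions made for this illustration only, and a finite simulation can only suggest, not prove, a statement about events occurring "infinitely often".

```python
import random


def count_nested(omega, N):
    """Count how many of the nested events A_n = (0, 1/n), n = 1..N, contain omega."""
    return sum(1 for n in range(1, N + 1) if 0.0 < omega < 1.0 / n)


def occurrences_independent(N, seed=0):
    """Return the indices n <= N at which independent events with P(A_n) = 1/n occur.

    Each event is simulated with its own uniform draw, so the events are
    independent, as the 2nd Borel-Cantelli Lemma requires.
    """
    rng = random.Random(seed)
    return [n for n in range(1, N + 1) if rng.random() < 1.0 / n]


if __name__ == "__main__":
    # Nested case (Example 6.3.4): the count stops growing once n exceeds 1/omega.
    for N in (10, 1_000, 100_000):
        print("nested, omega = 0.3, N =", N, "->", count_nested(0.3, N))

    # Independent case: new occurrences keep appearing (slowly) as N grows.
    for N in (10, 1_000, 100_000):
        hits = occurrences_independent(N)
        print("independent, N =", N, "-> count", len(hits), ", last index", hits[-1])
```

In the nested case the count is the same for every truncation level beyond $\frac{1}{\omega}$, in line with $P(A_n \text{ i.o.}) = 0$. In the independent case the number of occurrences grows roughly like the harmonic sum $\sum_{n \le N} \frac{1}{n}$, which is consistent with, but of course no proof of, $P(A_n \text{ i.o.}) = 1$.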
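Lecture 43: Friday 12/10/1999. Scribe: Jürgen Symanzik

Corollary 6.3.9: Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of independent rv's. Let $\{B_i\}_{i=1}^{\infty}$, $B_i > 0$, $B_i \uparrow \infty$, be a sequence of norming constants. Let $T_n = \sum_{i=1}^{n} X_i$. If $\sum_{i=1}^{\infty} \frac{Var(X_i)}{B_i^2} < \infty$, then it holds:

$$\frac{T_n - E(T_n)}{B_n} \xrightarrow{a.s.} 0.$$

Proof: This Corollary follows directly from Theorem 6.3.8 and Lemma 6.3.6.

Lemma 6.3.10: Equivalence Lemma
Let $\{X_i\}_{i=1}^{\infty}$ and $\{X_i'\}_{i=1}^{\infty}$ be sequences of rv's. Let $T_n = \sum_{i=1}^{n} X_i$ and $T_n' = \sum_{i=1}^{n} X_i'$. If the series $\sum_{i=1}^{\infty} P(X_i \ne X_i') < \infty$, then the series $\{X_i\}$ and $\{X_i'\}$ are tail-equivalent and $T_n$ and $T_n'$ are convergence-equivalent, i.e., for $B_n \uparrow \infty$ the sequences $\frac{1}{B_n} T_n$ and $\frac{1}{B_n} T_n'$ converge on the same event and to the same limit, except for a null set.

Proof: See Rohatgi, page 266, Lemma 1.

Lemma 6.3.11: Let $X$ be a rv with $E(|X|) < \infty$. Then it holds:

$$\sum_{n=1}^{\infty} P(|X| \ge n) \le E(|X|) \le 1 + \sum_{n=1}^{\infty} P(|X| \ge n).$$

Proof: Continuous case only: Let $X$ have a pdf $f$. Then it holds:

$$E(|X|) = \int_{-\infty}^{\infty} |x| f(x) \, dx = \sum_{k=0}^{\infty} \int_{k \le |x| \le k+1} |x| f(x) \, dx$$

$$\Longrightarrow \sum_{k=0}^{\infty} k P(k \le |X| \le k+1) \le E(|X|) \le \sum_{k=0}^{\infty} (k+1) P(k \le |X| \le k+1).$$

It is

$$\sum_{k=0}^{\infty} k P(k \le |X| \le k+1) = \sum_{k=0}^{\infty} \sum_{n=1}^{k} P(k \le |X| \le k+1) = \sum_{n=1}^{\infty} \sum_{k=n}^{\infty} P(k \le |X| \le k+1) = \sum_{n=1}^{\infty} P(|X| \ge n).$$

Similarly,

$$\sum_{k=0}^{\infty} (k+1) P(k \le |X| \le k+1) = \sum_{n=1}^{\infty} P(|X| \ge n) + \sum_{k=0}^{\infty} P(k \le |X| \le k+1) = \sum_{n=1}^{\infty} P(|X| \ge n) + 1.$$

Theorem 6.3.12: Kolmogorov's SLLN
Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of iid rv's. Let $T_n = \sum_{i=1}^{n} X_i$. Then it holds:

$$\frac{T_n}{n} = \overline{X}_n \xrightarrow{a.s.} \mu < \infty \iff E(|X|) < \infty \quad (\text{and then } \mu = E(X)).$$

Proof:
"$\Longrightarrow$": Suppose that $\overline{X}_n \xrightarrow{a.s.} \mu < \infty$. It is

$$T_n = \sum_{i=1}^{n} X_i = \sum_{i=1}^{n-1} X_i + X_n = T_{n-1} + X_n$$

$$\Longrightarrow \frac{X_n}{n} = \frac{T_n}{n} - \frac{n-1}{n} \cdot \frac{T_{n-1}}{n-1} \xrightarrow{a.s.} \mu - 1 \cdot \mu = 0.$$

Hence $P(|X_n| \ge n \text{ i.o.}) = 0$. Since the $X_n$ are independent, the 2nd Borel-Cantelli Lemma (in its contrapositive form) implies that we must have $\sum_{n=1}^{\infty} P(|X_n| \ge n) < \infty$
$\overset{Lemma\ 6.3.11}{\Longrightarrow} E(|X|) < \infty$
$\overset{Th.\ 6.2.7\ (WLLN)}{\Longrightarrow} \overline{X}_n \xrightarrow{p} E(X)$.
Since $\overline{X}_n \xrightarrow{a.s.} \mu$, it also holds that $\overline{X}_n \xrightarrow{p} \mu$. Therefore, it must hold that $\mu = E(X)$.

"$\Longleftarrow$": Let $E(|X|) < \infty$. Define truncated rv's:

$$X_k' = \begin{cases} X_k, & \text{if } |X_k| \le k \\ 0, & \text{otherwise} \end{cases} \qquad T_n' = \sum_{k=1}^{n} X_k', \qquad \overline{X}_n' = \frac{T_n'}{n}.$$

Then it holds:

$$\sum_{k=1}^{\infty} P(X_k \ne X_k') = \sum_{k=1}^{\infty} P(|X_k| > k) \le \sum_{k=1}^{\infty} P(|X_k| \ge k) \overset{iid}{=} \sum_{k=1}^{\infty} P(|X| \ge k) \overset{Lemma\ 6.3.11}{\le} E(|X|) < \infty.$$

By Lemma 6.3.10, it follows that $T_n$ and $T_n'$ are convergence-equivalent. Thus, it is sufficient to prove that $\overline{X}_n' \xrightarrow{a.s.} E(X)$.

We now establish the conditions needed in Corollary 6.3.9. It is

$$Var(X_n') \le E((X_n')^2) = \int_{-n}^{n} x^2 f_X(x) \, dx = \sum_{k=0}^{n-1} \int_{k \le |x| < k+1} x^2 f_X(x) \, dx \le \sum_{k=0}^{n-1} (k+1)^2 P(k \le |X| < k+1)$$

$$\Longrightarrow \sum_{n=1}^{\infty} \frac{1}{n^2} Var(X_n') \le \sum_{n=1}^{\infty} \sum_{k=0}^{n-1} \frac{(k+1)^2}{n^2} P(k \le |X| < k+1)$$

$$= \sum_{n=1}^{\infty} \sum_{k=1}^{n-1} \frac{(k+1)^2}{n^2} P(k \le |X| < k+1) + \sum_{n=1}^{\infty} \frac{1}{n^2} P(0 \le |X| < 1)$$

$$\overset{(*)}{\le} \sum_{k=1}^{\infty} (k+1)^2 P(k \le |X| < k+1) \left( \sum_{n=k+1}^{\infty} \frac{1}{n^2} \right) + 2 P(0 \le |X| < 1) \qquad (A)$$

$(*)$ holds since $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} < 2$ and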
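the first two sums can be rearranged as follows: summing over $n \ge 1$ and $1 \le k \le n - 1$ covers the same index pairs as summing over $k \ge 1$ and $n \ge k + 1$ (for $n = 2$: $k = 1$; for $n = 3$: $k = 1, 2$; and so on, which means for $k = 1$: $n = 2, 3, \ldots$; for $k = 2$: $n = 3, 4, \ldots$).

From Bronstein, page 30, # 7, we know that

$$1 = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \ldots + \frac{1}{n(n+1)} + \ldots$$

$$\Longrightarrow \frac{1}{k} = \frac{1}{k(k+1)} + \frac{1}{(k+1)(k+2)} + \ldots = \sum_{n=k+1}^{\infty} \frac{1}{n(n-1)}.$$

Therefore,

$$\sum_{n=k}^{\infty} \frac{1}{n^2} = \frac{1}{k^2} + \frac{1}{(k+1)^2} + \frac{1}{(k+2)^2} + \ldots \le \frac{1}{k^2} + \frac{1}{k(k+1)} + \frac{1}{(k+1)(k+2)} + \ldots = \frac{1}{k^2} + \frac{1}{k} \le \frac{2}{k},$$

and in particular $\sum_{n=k+1}^{\infty} \frac{1}{n^2} \le \frac{2}{k}$.

Using this result in (A), we get

$$\sum_{n=1}^{\infty} \frac{1}{n^2} Var(X_n') \le 2 \sum_{k=1}^{\infty} \frac{(k+1)^2}{k} P(k \le |X| < k+1) + 2 P(0 \le |X| < 1)$$

$$= 2 \sum_{k=0}^{\infty} k P(k \le |X| < k+1) + 4 \sum_{k=1}^{\infty} P(k \le |X| < k+1) + 2 \sum_{k=1}^{\infty} \frac{1}{k} P(k \le |X| < k+1) + 2 P(0 \le |X| < 1)$$

$$\overset{(B)}{\le} 2 E(|X|) + 4 + 2 + 2 < \infty.$$

To establish (B), we use an inequality from the Proof of Lemma 6.3.11. Thus, the conditions needed in Corollary 6.3.9 are met. It follows that

$$\frac{1}{n} T_n' - \frac{1}{n} E(T_n') \xrightarrow{a.s.} 0. \qquad (C)$$

Since $E(X_n') \to E(X)$ as $n \to \infty$, it follows by Kronecker's Lemma (6.3.6) that $\frac{1}{n} E(T_n') \to E(X)$. Thus, when we replace $\frac{1}{n} E(T_n')$ by $E(X)$ in (C), we get

$$\frac{1}{n} T_n' \xrightarrow{a.s.} E(X) \overset{Lemma\ 6.3.10}{\Longrightarrow} \frac{1}{n} T_n \xrightarrow{a.s.} E(X),$$

since $T_n$ and $T_n'$ are convergence-equivalent.

Merry Xmas and a Happy New Millennium!

STAT 6710 Mathematical Statistics I
Fall Semester 2008
Dr. Jürgen Symanzik
Utah State University
Department of Mathematics and Statistics
3900 Old Main Hill
Logan, UT 84322-3900
Tel.: (435) 797-0696
FAX: (435) 797-1822
e-mail: symanzik@math.usu.edu

Contents
Acknowledgements
1 Axioms of Probability
  1.1 σ-Fields
  1.2 Manipulating Probability
  1.3 Combinatorics and Counting
  1.4 Conditional Probability and Independence
2 Random Variables
  2.1 Measurable Functions
  2.2 Probability Distribution of a Random Variable
  2.3 Discrete and Continuous Random Variables
  2.4 Transformations of Random Variables
3 Moments and Generating Functions
  3.1 Expectation
  3.2 Generating Functions
  3.3 Complex-Valued Random Variables and Characteristic Functions
  3.4 Probability Generating Functions
  3.5 Moment Inequalities
4 Random Vectors
  4.1 Joint, Marginal, and Conditional Distributions
  4.2 Independent Random Variables
  4.3 Functions of Random Vectors
  4.4 Order Statistics
  4.5 Multivariate Expectation
  4.6 Multivariate Generating Functions
  4.7 Conditional Expectation
  4.8 Inequalities and Identities
5 Particular Distributions
  5.1 Multivariate Normal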
Distributions
  5.2 Exponential Family of Distributions
Index

Acknowledgements

I would like to thank my students, Hanadi B. Eltahir, Rich Madsen, and Bill Morphet, who helped during the Fall 1999 and Spring 2000 semesters in typesetting these lecture notes using LaTeX and for their suggestions how to improve some of the material presented in class. Thanks are also due to more than 20 students who took Stat 6710/20 with me since the Fall 2000 semester for their valuable comments that helped to improve and correct these lecture notes.

In addition, I particularly would like to thank Mike Minnotte and Dan Coster, who previously taught this course at Utah State University, for providing me with their lecture notes and other materials related to this course. Their lecture notes, combined with additional material from Casella/Berger (2001), Rohatgi (1976), and other sources listed below, form the basis of the script presented here.

The primary textbook required for this class is:
- Casella, G., and Berger, R. L. (2002): Statistical Inference (Second Edition), Duxbury Press/Thomson Learning, Pacific Grove, CA.

A Web page dedicated to this class is accessible at:
http://www.math.usu.edu/~symanzik/teaching/2008_stat6710/stat6710.html

This course closely follows Casella and Berger (2002) as described in the syllabus. Additional material originates from the lectures from Professors Hering, Trenkler, and Gather I have attended while studying at the Universität Dortmund, Germany, the collection of Masters and PhD Preliminary Exam questions from Iowa State University, Ames, Iowa, and the following textbooks:
- Bandelow, C. (1981): Einführung in die Wahrscheinlichkeitstheorie, Bibliographisches Institut, Mannheim, Germany.
- Casella, G., and Berger, R. L. (1990): Statistical Inference, Wadsworth & Brooks/Cole, Pacific Grove, CA.
- Fisz, M. (1989): Wahrscheinlichkeitsrechnung und mathematische Statistik, VEB Deutscher Verlag der Wissenschaften, Berlin, German Democratic Republic.
- Kelly, D. G. (1994): Introduction to Probability, Macmillan, New York, NY.
- Mood, A. M., and Graybill, F. A., and Boes, D. C.
(1974): Introduction to the Theory of Statistics (Third Edition), McGraw-Hill, Singapore.
- Parzen, E. (1960): Modern Probability Theory and Its Applications, Wiley, New York, NY.
- Rohatgi, V. K. (1976): An Introduction to Probability Theory and Mathematical Statistics, John Wiley and Sons, New York, NY.
- Rohatgi, V. K., and Saleh, A. K. Md. E. (2001): An Introduction to Probability and Statistics (Second Edition), John Wiley and Sons, New York, NY.
- Searle, S. R. (1971): Linear Models, Wiley, New York, NY.

Additional definitions, integrals, sums, etc. originate from the following formula collections:
- Bronstein, I. N. and Semendjajew, K. A. (1985): Taschenbuch der Mathematik (22. Auflage), Verlag Harri Deutsch, Thun, German Democratic Republic.
- Bronstein, I. N. and Semendjajew, K. A. (1986): Ergänzende Kapitel zu Taschenbuch der Mathematik (4. Auflage), Verlag Harri Deutsch, Thun, German Democratic Republic.
- Sieber, H. (1980): Mathematische Formeln, Erweiterte Ausgabe E, Ernst Klett, Stuttgart, Germany.

Jürgen Symanzik, August 25, 2008

1 Axioms of Probability

(Based on Casella/Berger, Sections 1.1, 1.2 & 1.3)

1.1 σ-Fields

Let $\Omega$ be the sample space of all possible outcomes of a chance experiment. Let $\omega \in \Omega$ (or $x \in \Omega$) be any outcome.

Example: Count the number of heads in $n$ coin tosses. $\Omega = \{0, 1, 2, \ldots, n\}$.

Any subset $A$ of $\Omega$ is called an event. For each event $A \subseteq \Omega$, we would like to assign a number (i.e., a probability). Unfortunately, we cannot always do this for every subset of $\Omega$. Instead, we consider classes of subsets of $\Omega$ called fields and σ-fields.

Definition 1.1.1: A class $L$ of subsets of $\Omega$ is called a field if $\Omega \in L$ and $L$ is closed under complements and finite unions, i.e., $L$ satisfies
(i) $\Omega \in L$
(ii) $A \in L \Longrightarrow A^c \in L$
(iii) $A, B \in L \Longrightarrow A \cup B \in L$

Since $\Omega^c = \emptyset$, (i) and (ii) imply $\emptyset \in L$. Therefore, (i)': $\emptyset \in L$ can replace (i).

Note: De Morgan's Laws
For any class $\mathcal{A}$ of sets, and sets $A \in \mathcal{A}$, it holds:

$$\bigcup_{A \in \mathcal{A}} A = \left( \bigcap_{A \in \mathcal{A}} A^c \right)^c \quad \text{and} \quad \bigcap_{A \in \mathcal{A}} A = \left( \bigcup_{A \in \mathcal{A}} A^c \right)^c.$$

Note: So (ii), (iii) imply (iii)': $A, B \in L \Longrightarrow A \cap B \in L$ can replace (iii).

Proof: $A, B \in L \overset{(ii)}{\Longrightarrow} A^c, B^c \in L \overset{(iii)}{\Longrightarrow} A^c \cup B^c \in L \overset{(ii)}{\Longrightarrow} (A^c \cup B^c)^c \in L \overset{DM}{\Longrightarrow} A \cap B \in L$.

Definition 1.1.2: A class $L$ of subsets of $\Omega$ is called a σ-field (Borel field, σ-algebra) if it is a field and closed under
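countable unions, i.e.,
(iv) $\{A_n\}_{n=1}^{\infty} \in L \Longrightarrow \bigcup_{n=1}^{\infty} A_n \in L$.

Note: (iv) implies (iii) by taking $A_n = \emptyset$ for $n \ge 3$.

Example 1.1.3: For some $\Omega$, let $L$ contain all finite and all cofinite sets ($A$ is cofinite if $A^c$ is finite; for example, if $\Omega = \mathbb{N}$, $A = \{x \mid x \ge c\}$ is not finite, but since $A^c = \{x \mid x < c\}$ is finite, $A$ is cofinite). Then $L$ is a field. But $L$ is a σ-field iff (if and only if) $\Omega$ is finite.

For example, let $\Omega = \mathbb{Z}$. Take $A_n = \{n\}$, each finite, so $A_n \in L$. But $\bigcup_{n=1}^{\infty} A_n = \mathbb{Z}^+ \notin L$, since the set is not finite (it is infinite) and also not cofinite ($\left( \bigcup_{n=1}^{\infty} A_n \right)^c = \mathbb{Z}_0^-$ is infinite, too).

Question: Does this construction work for $\Omega = \mathbb{Z}^+$?
If we take $A_n = \{n\}$, then $\left( \bigcup_{n=1}^{\infty} A_n \right)^c = \emptyset \in L$. But, if we take $A_n = \{2n\}$, then $\bigcup_{n=1}^{\infty} A_n = \{2, 4, 6, \ldots\} \notin L$ and $\left( \bigcup_{n=1}^{\infty} A_n \right)^c = \{1, 3, 5, \ldots\} \notin L$.

Note: The largest σ-field in $\Omega$ is the power set $\mathcal{P}(\Omega)$ of all subsets of $\Omega$. The smallest σ-field is $L = \{\emptyset, \Omega\}$.

Terminology: A set $A \in L$ is said to be "measurable $L$".

Note: We often begin with a class of sets, say $a$, which may not be a field or a σ-field.

Definition 1.1.4: The σ-field generated by $a$, $\sigma(a)$, is the smallest σ-field containing $a$, or the intersection of all σ-fields containing $a$.

Note:
(i) Such σ-fields containing $a$ always exist (e.g., $\mathcal{P}(\Omega)$), and
(ii) the intersection of an arbitrary number of σ-fields is always a σ-field.

Proof: (ii) Suppose $L = \bigcap_\theta L_\theta$. We have to show that conditions (i) and (ii) of Def. 1.1.1 and (iv) of Def. 1.1.2 are fulfilled:
(i) $\Omega \in L_\theta \ \forall \theta \Longrightarrow \Omega \in L$
(ii) Let $A \in L \Longrightarrow A \in L_\theta \ \forall \theta \Longrightarrow A^c \in L_\theta \ \forall \theta \Longrightarrow A^c \in L$
(iv) Let $A_n \in L \ \forall n \Longrightarrow A_n \in L_\theta \ \forall \theta \ \forall n \Longrightarrow \bigcup_n A_n \in L_\theta \ \forall \theta \Longrightarrow \bigcup_n A_n \in L$

Example 1.1.5: $\Omega = \{0, 1, 2, 3\}$, $a_0 = \{\{0\}\}$, $a_1 = \{\{0\}, \{0, 1\}\}$.

What is $\sigma(a_0)$? $\sigma(a_0)$ must include $\Omega$, $\emptyset$, $\{0\}$; also $\{1, 2, 3\}$ by 1.1.1 (ii). Since all unions are included, we have $\sigma(a_0) = \{\Omega, \emptyset, \{0\}, \{1, 2, 3\}\}$.

What is $\sigma(a_1)$? $\sigma(a_1)$ must include $\Omega$, $\emptyset$, $\{0\}$, $\{0, 1\}$; also $\{1, 2, 3\}$, $\{2, 3\}$ by 1.1.1 (ii); $\{0, 2, 3\}$ by 1.1.1 (iii); $\{1\}$ by 1.1.1 (ii). Since all unions are included, we have $\sigma(a_1) = \{\Omega, \emptyset, \{0\}, \{1\}, \{0, 1\}, \{2, 3\}, \{0, 2, 3\}, \{1, 2, 3\}\}$.

Note: If $\Omega$ is finite or countable, we will usually use $L = \mathcal{P}(\Omega)$. If $|\Omega| = n < \infty$, then $|L| = 2^n$. If $\Omega$ is uncountable, $\mathcal{P}(\Omega)$ may be too large to be useful and we may have to use some smaller σ-field.

Definition 1.1.6: If $\Omega = \mathbb{R}$, an important special case is the Borel σ-field, i.e., the σ-field generated from all half-open intervals of the form $(a, b]$, denoted $\mathcal{B}$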
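or $\mathcal{B}_1$. The sets of $\mathcal{B}$ are called Borel sets. The Borel σ-field on $\mathbb{R}^d$ ($\mathcal{B}_d$) is the σ-field generated by $d$-dimensional rectangles of the form $\{(x_1, x_2, \ldots, x_d) \mid a_i < x_i \le b_i; \ i = 1, 2, \ldots, d\}$.

Note: $\mathcal{B}$ contains all points: $\{x\} = \bigcap_{n=1}^{\infty} (x - \frac{1}{n}, x]$,
closed intervals: $[x, y] = (x, y] \cup \{x\}$,
open intervals: $(x, y) = (x, y] - \{y\}$,
and semi-infinite intervals: $(x, \infty) = \bigcup_{n=1}^{\infty} (x, x + n]$.

Note: We now have a measurable space $(\Omega, L)$. We next define a probability measure $P(\cdot)$ on $(\Omega, L)$ to obtain a probability space $(\Omega, L, P)$.

Definition 1.1.7: Kolmogorov Axioms of Probability
A probability measure (pm), $P$, on $(\Omega, L)$ is a set function $P : L \to \mathbb{R}$ satisfying
(i) $0 \le P(A) \ \forall A \in L$
(ii) $P(\Omega) = 1$
(iii) If $\{A_n\}_{n=1}^{\infty}$ are disjoint sets in $L$ and $\bigcup_{n=1}^{\infty} A_n \in L$, then $P\left( \bigcup_{n=1}^{\infty} A_n \right) = \sum_{n=1}^{\infty} P(A_n)$.

Note: $\bigcup_{n=1}^{\infty} A_n \in L$ holds automatically if $L$ is a σ-field, but it is needed as a precondition in the case that $L$ is just a field. Property (iii) is called countable additivity.

1.2 Manipulating Probability

Theorem 1.2.1: For $P$ a pm on $(\Omega, L)$, it holds:
(i) $P(\emptyset) = 0$
(ii) $P(A^c) = 1 - P(A) \ \forall A \in L$
(iii) $P(A) \le 1 \ \forall A \in L$
(iv) $P(A \cup B) = P(A) + P(B) - P(A \cap B) \ \forall A, B \in L$
(v) If $A \subseteq B$, then $P(A) \le P(B)$.

Proof:
(i) Let $A_n = \emptyset \ \forall n \Longrightarrow \bigcup_{n=1}^{\infty} A_n = \emptyset \in L$.
$A_i \cap A_j = \emptyset \cap \emptyset = \emptyset \ \forall i, j \Longrightarrow A_n$ are disjoint $\forall n$.

$$P(\emptyset) = P\left( \bigcup_{n=1}^{\infty} A_n \right) \overset{Def.\ 1.1.7\ (iii)}{=} \sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} P(\emptyset)$$

This can only hold if $P(\emptyset) = 0$.

(ii) Let $A_1 = A$, $A_2 = A^c$, $A_n = \emptyset \ \forall n \ge 3$.
$\Omega = \bigcup_{n=1}^{\infty} A_n = A_1 \cup A_2 \cup \bigcup_{n=3}^{\infty} A_n = A_1 \cup A_2 \cup \emptyset$.
$A_1 \cap A_2 = A_1 \cap \emptyset = A_2 \cap \emptyset = \emptyset \Longrightarrow A_1, A_2, \emptyset$ are disjoint.

$$1 = P(\Omega) = P\left( \bigcup_{n=1}^{\infty} A_n \right) \overset{Def.\ 1.1.7\ (iii)}{=} \sum_{n=1}^{\infty} P(A_n) = P(A_1) + P(A_2) + \sum_{n=3}^{\infty} P(A_n) \overset{Th.\ 1.2.1\ (i)}{=} P(A_1) + P(A_2) = P(A) + P(A^c)$$

$\Longrightarrow P(A^c) = 1 - P(A) \ \forall A \in L$.

(iii) By Th. 1.2.1 (ii), $P(A) = 1 - P(A^c) \Longrightarrow P(A) \le 1 \ \forall A \in L$ since $P(A^c) \ge 0$ by Def. 1.1.7 (i).

(iv) $A \cup B = (A \cap B^c) \cup (A \cap B) \cup (B \cap A^c)$. So, $A \cup B$ can be written as a union of disjoint sets $(A \cap B^c)$, $(A \cap B)$, $(B \cap A^c)$:

$P(A \cup B) = P((A \cap B^c) \cup (A \cap B) \cup (B \cap A^c))$
$\overset{Def.\ 1.1.7\ (iii)}{=} P(A \cap B^c) + P(A \cap B) + P(B \cap A^c)$
$= P(A \cap B^c) + P(A \cap B) + P(B \cap A^c) + P(A \cap B) - P(A \cap B)$
$= (P(A \cap B^c) + P(A \cap B)) + (P(B \cap A^c) + P(A \cap B)) - P(A \cap B)$
$\overset{Def.\ 1.1.7\ (iii)}{=} P(A) + P(B) - P(A \cap B)$

(v) $B = (B \cap A^c) \cup A$, where $B \cap A^c$ and $A$ are disjoint sets.
$P(B) = P((B \cap A^c) \cup A) \overset{Def.\ 1.1.7\ (iii)}{=} P(B \cap A^c) + P(A)$
$\Longrightarrow P(A) = P(B) - P(B \cap A^c)$
$\Longrightarrow P(A) \le P(B)$ since $P(B \cap A^c) \ge 0$ by Def. 1.1.7 (i).

Theorem 1.2.2: Principle of Inclusion-Exclusion
Let $A_1, A_2, \ldots, A_n \in L$. Then

$$P\left( \bigcup_{k=1}^{n} A_k \right) = \sum_{k=1}^{n} P(A_k) - \sum_{k_1 < k_2} P(A_{k_1} \cap A_{k_2}) + \sum_{k_1 < k_2 < k_3} P(A_{k_1} \cap A_{k_2} \cap A_{k_3}) - \ldots + (-1)^{n+1} P\left( \bigcap_{k=1}^{n} A_k \right).$$

Proof: $n = 1$ is trivial. $n = 2$ is Theorem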
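1.2.1 (iv). Use induction for higher $n$ (Homework).

Note: A proof by induction consists of two steps:
First, we have to establish the induction base. For example, if we state that something holds for all non-negative integers, then we have to show that it holds for $n = 0$. Similarly, if we state that something holds for all positive integers, then we have to show that it holds for $n = 1$. Formally, it is sufficient to verify a claim for the smallest valid integer only. However, to get some feeling how the proof from $n$ to $n + 1$ might work, it is sometimes beneficial to verify a claim for 1, 2, or 3 as well.
In the second step, we have to establish the result in the induction step, showing that something holds for $n + 1$, using the fact that it holds for $n$ (alternatively, we can show that it holds for $n$, using the fact that it holds for $n - 1$).

Theorem 1.2.3: Bonferroni's Inequality
Let $A_1, A_2, \ldots, A_n \in L$. Then

$$\sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) \le P\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} P(A_i).$$

Proof:
Right side: $P\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} P(A_i)$

Induction Base: For $n = 1$, the right side evaluates to $P(A_1) \le P(A_1)$, which is true. Formally, the next step is not required. However, it does not harm to verify the claim for $n = 2$ as well. For $n = 2$, the right side evaluates to $P(A_1 \cup A_2) \le P(A_1) + P(A_2)$:

$$P(A_1 \cup A_2) \overset{Th.\ 1.2.1\ (iv)}{=} P(A_1) + P(A_2) - P(A_1 \cap A_2) \le P(A_1) + P(A_2)$$

since $P(A_1 \cap A_2) \ge 0$ by Def. 1.1.7 (i). This establishes the induction base for the right side.

Induction Step: We assume the right side is true for $n$ and show that it is true for $n + 1$:

$P\left( \bigcup_{i=1}^{n+1} A_i \right) = P\left( \left( \bigcup_{i=1}^{n} A_i \right) \cup A_{n+1} \right)$
$\overset{Th.\ 1.2.1\ (iv)}{=} P\left( \bigcup_{i=1}^{n} A_i \right) + P(A_{n+1}) - P\left( \left( \bigcup_{i=1}^{n} A_i \right) \cap A_{n+1} \right)$
$\overset{Def.\ 1.1.7\ (i)}{\le} P\left( \bigcup_{i=1}^{n} A_i \right) + P(A_{n+1})$
$\overset{I.B.}{\le} \sum_{i=1}^{n} P(A_i) + P(A_{n+1})$
$= \sum_{i=1}^{n+1} P(A_i)$

Left side: $\sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) \le P\left( \bigcup_{i=1}^{n} A_i \right)$

Induction Base: For $n = 1$, the left side evaluates to $P(A_1) \le P(A_1)$, which is true. For $n = 2$, the left side evaluates to $P(A_1) + P(A_2) - P(A_1 \cap A_2) \le P(A_1 \cup A_2)$, which is true by Th. 1.2.1 (iv). For $n = 3$, the left side evaluates to

$$P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) \le P(A_1 \cup A_2 \cup A_3).$$

This holds since

$P(A_1 \cup A_2 \cup A_3) = P((A_1 \cup A_2) \cup A_3)$
$\overset{Th.\ 1.2.1\ (iv)}{=} P(A_1 \cup A_2) + P(A_3) - P((A_1 \cup A_2) \cap A_3)$
$= P(A_1 \cup A_2) + P(A_3) - P((A_1 \cap A_3) \cup (A_2 \cap A_3))$
$\overset{Th.\ 1.2.1\ (iv)}{=} P(A_1) + P(A_2) - P(A_1 \cap A_2) + P(A_3) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P((A_1 \cap A_3) \cap (A_2 \cap A_3))$
$= P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)$
$\overset{Def.\ 1.1.7\ (i)}{\ge} P(A_1)$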
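$+ P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3)$

This establishes the induction base for the left side.

Induction Step: We assume the left side is true for $n$ and show that it is true for $n + 1$:

$P\left( \bigcup_{i=1}^{n+1} A_i \right) = P\left( \left( \bigcup_{i=1}^{n} A_i \right) \cup A_{n+1} \right)$
$\overset{Th.\ 1.2.1\ (iv)}{=} P\left( \bigcup_{i=1}^{n} A_i \right) + P(A_{n+1}) - P\left( \left( \bigcup_{i=1}^{n} A_i \right) \cap A_{n+1} \right)$
$\overset{left\ I.B.}{\ge} \sum_{i=1}^{n} P(A_i) - \sum_{i<j}^{n} P(A_i \cap A_j) + P(A_{n+1}) - P\left( \bigcup_{i=1}^{n} (A_i \cap A_{n+1}) \right)$
$= \sum_{i=1}^{n+1} P(A_i) - \sum_{i<j}^{n} P(A_i \cap A_j) - P\left( \bigcup_{i=1}^{n} (A_i \cap A_{n+1}) \right)$
$\overset{Th.\ 1.2.3\ right\ side}{\ge} \sum_{i=1}^{n+1} P(A_i) - \sum_{i<j}^{n} P(A_i \cap A_j) - \sum_{i=1}^{n} P(A_i \cap A_{n+1})$
$= \sum_{i=1}^{n+1} P(A_i) - \sum_{i<j}^{n+1} P(A_i \cap A_j)$

Theorem 1.2.4: Boole's Inequality
Let $A, B \in L$. Then
(i) $P(A \cap B) \ge P(A) + P(B) - 1$
(ii) $P(A \cap B) \ge 1 - P(A^c) - P(B^c)$

Proof: Homework

Definition 1.2.5: Continuity of sets
For a sequence of sets $\{A_n\}_{n=1}^{\infty}$, $A_n \in L$, and $A \in L$, we say
(i) $A_n \uparrow A$ if $A_1 \subseteq A_2 \subseteq A_3 \subseteq \ldots$ and $A = \bigcup_{n=1}^{\infty} A_n$.
(ii) $A_n \downarrow A$ if $A_1 \supseteq A_2 \supseteq A_3 \supseteq \ldots$ and $A = \bigcap_{n=1}^{\infty} A_n$.

Theorem 1.2.6: If $\{A_n\}_{n=1}^{\infty}$, $A_n \in L$, and $A \in L$, then $\lim_{n \to \infty} P(A_n) = P(A)$ if 1.2.5 (i) or 1.2.5 (ii) holds.

Proof:
Part (i): Assume that 1.2.5 (i) holds. Let $B_1 = A_1$ and $B_k = A_k - A_{k-1} = A_k \cap A_{k-1}^c \ \forall k \ge 2$. By construction, $B_i \cap B_j = \emptyset$ for $i \ne j$. It is $A = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$ and also $A_n = \bigcup_{i=1}^{n} A_i = \bigcup_{i=1}^{n} B_i$.

$$P(A) = P\left( \bigcup_{k=1}^{\infty} B_k \right) \overset{Def.\ 1.1.7\ (iii)}{=} \sum_{k=1}^{\infty} P(B_k) = \lim_{n \to \infty} \sum_{k=1}^{n} P(B_k) \overset{Def.\ 1.1.7\ (iii)}{=} \lim_{n \to \infty} P\left( \bigcup_{k=1}^{n} B_k \right) = \lim_{n \to \infty} P\left( \bigcup_{k=1}^{n} A_k \right) = \lim_{n \to \infty} P(A_n)$$

The last step is possible since $A_n = \bigcup_{k=1}^{n} A_k$.

Part (ii): Assume that 1.2.5 (ii) holds. Then $A_1^c \subseteq A_2^c \subseteq A_3^c \subseteq \ldots$ and $A^c = \left( \bigcap_{n=1}^{\infty} A_n \right)^c \overset{De\ Morgan}{=} \bigcup_{n=1}^{\infty} A_n^c$.
By Part (i), $P(A^c) = \lim_{n \to \infty} P(A_n^c)$.
So $1 - P(A^c) = 1 - \lim_{n \to \infty} P(A_n^c) \Longrightarrow P(A) = \lim_{n \to \infty} (1 - P(A_n^c)) \overset{Th.\ 1.2.1\ (ii)}{=} \lim_{n \to \infty} P(A_n)$.

Theorem 1.2.7:
(i) Countable unions of probability 0 sets have probability 0.
(ii) Countable intersections of probability 1 sets have probability 1.

Proof:
Part (i): Let $\{A_n\}_{n=1}^{\infty} \in L$, $P(A_n) = 0 \ \forall n$.

$$0 \overset{Def.\ 1.1.7\ (i)}{\le} P\left( \bigcup_{n=1}^{\infty} A_n \right) \le \sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} 0 = 0,$$

where the second inequality is countable subadditivity. Therefore, $P\left( \bigcup_{n=1}^{\infty} A_n \right) = 0$.

Part (ii): Let $\{A_n\}_{n=1}^{\infty} \in L$, $P(A_n) = 1 \ \forall n$.
$\overset{Th.\ 1.2.1\ (ii)}{\Longrightarrow} P(A_n^c) = 0 \ \forall n \overset{Th.\ 1.2.7\ (i)}{\Longrightarrow} P\left( \bigcup_{n=1}^{\infty} A_n^c \right) = 0 \overset{Th.\ 1.2.1\ (ii)}{\Longrightarrow} P\left( \left( \bigcup_{n=1}^{\infty} A_n^c \right)^c \right) = 1 \overset{De\ Morgan}{\Longrightarrow} P\left( \bigcap_{n=1}^{\infty} A_n \right) = 1$

1.3 Combinatorics and Counting

For now, we restrict ourselves to sample spaces containing a finite number of points. Let $\Omega = \{\omega_1, \ldots, \omega_n\}$ and $L = \mathcal{P}(\Omega)$. For any $A \in L$, $P(A) = \sum_{\omega_j \in A} P(\omega_j)$.

Definition 1.3.1: We say the elements of $\Omega$ are equally likely (or occur with uniform probability) if $P(\omega_j) = \frac{1}{n} \ \forall j = 1, \ldots, n$.

Note: If this is true, $P(A) = \frac{\text{number of } \omega_j \text{ in } A}{\text{number of } \omega_j \text{ in } \Omega}$. Therefore, to calculate such probabilities, we just need to be able to count elements accurately.

Theorem 1.3.2: Fundamental Theorem of Counting
If we wish to select one element ($a_1$) out of $n_1$ choices,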
a second element ($a_2$) out of $n_2$ choices, and so on for a total of $k$ elements, there are

$$n_1 \times n_2 \times n_3 \times \ldots \times n_k$$

ways to do it.

Proof: By Induction.
Induction Base:
$k = 1$: trivial
$k = 2$: $n_1$ ways to choose $a_1$. For each, $n_2$ ways to choose $a_2$.
Total number of ways $= n_2 + n_2 + \ldots + n_2$ ($n_1$ times) $= n_1 \times n_2$.
Induction Step: Suppose it is true for $k - 1$. We show that it is true for $k = (k - 1) + 1$. There are $n_1 \times n_2 \times n_3 \times \ldots \times n_{k-1}$ ways to select one element ($a_1$) out of $n_1$ choices, a second element ($a_2$) out of $n_2$ choices, and so on, up to the $(k-1)$th element ($a_{k-1}$) out of $n_{k-1}$ choices. For each of these $n_1 \times n_2 \times n_3 \times \ldots \times n_{k-1}$ possible ways, we can select the $k$th element ($a_k$) out of $n_k$ choices. Thus, the total number of ways $= (n_1 \times n_2 \times n_3 \times \ldots \times n_{k-1}) \times n_k$.

Definition 1.3.3: For positive integer $n$, we define $n$ factorial as $n! = n \times (n-1) \times (n-2) \times \ldots \times 2 \times 1 = n \times (n-1)!$ and $0! = 1$.

Definition 1.3.4: For nonnegative integers $n \ge r$, we define the binomial coefficient (read as "$n$ choose $r$") as

$$\binom{n}{r} = \frac{n!}{r! (n-r)!} = \frac{n \cdot (n-1) \cdot (n-2) \cdot \ldots \cdot (n-r+1)}{1 \cdot 2 \cdot 3 \cdot \ldots \cdot r}.$$

Note: A useful extension for the binomial coefficient for $n < r$ is

$$\binom{n}{r} = \frac{n \cdot (n-1) \cdot \ldots \cdot 0 \cdot \ldots \cdot (n-r+1)}{1 \cdot 2 \cdot \ldots \cdot r} = 0.$$

Note: Most counting problems consist of drawing a fixed number of times from a set of elements (e.g., $\{1, 2, 3, 4, 5, 6\}$). To solve such problems, we need to know
(i) the size of the set, $n$;
(ii) the size of the sample, $r$;
(iii) whether the result will be ordered (i.e., is $\{1, 2\}$ different from $\{2, 1\}$); and
(iv) whether the draws are with replacement (i.e., can results like $\{1, 1\}$ occur?).

Theorem 1.3.5: The number of ways to draw $r$ elements from a set of $n$, if
(i) ordered, without replacement, is $\frac{n!}{(n-r)!}$;
(ii) ordered, with replacement, is $n^r$;
(iii) unordered, without replacement, is $\frac{n!}{r! (n-r)!} = \binom{n}{r}$;
(iv) unordered, with replacement, is $\frac{(n+r-1)!}{r! (n-1)!} = \binom{n+r-1}{r}$.

Proof:
(i) $n$ choices to select the 1st, $n - 1$ choices to select the 2nd, ..., $n - r + 1$ choices to select the $r$th. By Theorem 1.3.2, there are

$$n \times (n-1) \times \ldots \times (n-r+1) = \frac{n \times (n-1) \times \ldots \times (n-r+1) \times (n-r)!}{(n-r)!} = \frac{n!}{(n-r)!}$$

ways to do so.

Corollary: The number of permutations of $n$ objects is $n!$.

(ii) $n$ choices to select the 1st, $n$ choices to select the 2nd, ..., $n$ choices to select the $r$th. By Theorem 1.3.2, there are $n \times n \times \ldots \times n$ ($r$ times) $= n^r$ ways to do so.

(iii) We know from above that there
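are $\frac{n!}{(n-r)!}$ ways to draw $r$ elements out of $n$ elements without replacement in the ordered case. However, for each unordered set of size $r$, there are $r!$ related ordered sets that consist of the same elements. Thus, there are $\frac{n!}{(n-r)!} \cdot \frac{1}{r!} = \binom{n}{r}$ ways to draw $r$ elements out of $n$ elements without replacement in the unordered case.

(iv) There is no immediate direct way to show this part. We have to come up with some extra motivation. We assume that there are $(n-1)$ walls that separate the $n$ bins of possible outcomes and there are $r$ markers. If we shake everything, there are $(n-1+r)!$ permutations to arrange these $(n-1)$ walls and $r$ markers according to the Corollary. Since the $r$ markers are indistinguishable and the $(n-1)$ walls are also indistinguishable, we have to divide the number of permutations by $r!$ to get rid of identical permutations where only the markers are changed and by $(n-1)!$ to get rid of identical permutations where only the walls are changed. Thus, there are $\frac{(n-1+r)!}{r!\,(n-1)!} = \binom{n+r-1}{r}$ ways to draw $r$ elements out of $n$ elements with replacement in the unordered case.

Theorem 1.3.6 (The Binomial Theorem): If $n$ is a non-negative integer, then

$$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r$$

Proof: By Induction.
Induction Base:
$n = 0$: $1 = (1+x)^0 = \sum_{r=0}^{0} \binom{0}{r} x^r = \binom{0}{0} x^0 = 1$
$n = 1$: $(1+x)^1 = \sum_{r=0}^{1} \binom{1}{r} x^r = \binom{1}{0} x^0 + \binom{1}{1} x^1 = 1 + x$
Induction Step: Suppose it is true for $k$. We show that it is true for $k + 1$:

$$(1+x)^{k+1} = (1+x)^k (1+x) = \left(\sum_{r=0}^{k} \binom{k}{r} x^r\right)(1+x) = \sum_{r=0}^{k} \binom{k}{r} x^r + \sum_{r=0}^{k} \binom{k}{r} x^{r+1}$$

$$= \binom{k}{0} x^0 + \sum_{r=1}^{k} \left[\binom{k}{r} + \binom{k}{r-1}\right] x^r + \binom{k}{k} x^{k+1} \overset{(*)}{=} \binom{k+1}{0} x^0 + \sum_{r=1}^{k} \binom{k+1}{r} x^r + \binom{k+1}{k+1} x^{k+1} = \sum_{r=0}^{k+1} \binom{k+1}{r} x^r$$

$(*)$ Here we use Theorem 1.3.8 (i). Since the proof of Theorem 1.3.8 (i) only needs algebraic transformations without using the Binomial Theorem, part (i) of Theorem 1.3.8 can be applied here.

Corollary 1.3.7: For integers $n$, it holds:
(i) $\binom{n}{0} + \binom{n}{1} + \ldots + \binom{n}{n} = 2^n$, $n \ge 0$
(ii) $\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \ldots + (-1)^n \binom{n}{n} = 0$, $n \ge 1$
(iii) $1 \cdot \binom{n}{1} + 2 \cdot \binom{n}{2} + 3 \cdot \binom{n}{3} + \ldots + n \cdot \binom{n}{n} = n 2^{n-1}$, $n \ge 0$
(iv) $1 \cdot \binom{n}{1} - 2 \cdot \binom{n}{2} + 3 \cdot \binom{n}{3} - \ldots + (-1)^{n-1} n \cdot \binom{n}{n} = 0$, $n \ge 2$

Proof: Use the Binomial Theorem.
(i) Let $x = 1$. Then, for $n \ge 0$, $2^n = (1+1)^n = \sum_{r=0}^{n} \binom{n}{r} 1^r = \sum_{r=0}^{n} \binom{n}{r}$.
(ii) Let $x = -1$. Then, for $n \ge 1$, $0 = (1 + (-1))^n = \sum_{r=0}^{n} \binom{n}{r} (-1)^r$.
(iii) Differentiating both sides, $\frac{d}{dx}(1+x)^n = \frac{d}{dx} \sum_{r=0}^{n} \binom{n}{r} x^r$, gives $n(1+x)^{n-1} = \sum_{r=1}^{n} r \binom{n}{r} x^{r-1}$. Substitute $x = 1$; then, for $n \ge 0$, $n 2^{n-1} = n(1+1)^{n-1} = \sum_{r=1}^{n} r \binom{n}{r}$.
(iv) Substitute $x = -1$ in (iii) above; then, for $n \ge 2$, $0 = n(1 + (-1))^{n-1} = \sum_{r=1}^{n} r \binom{n}{r} (-1)^{r-1}$. Since $\sum a_i = 0$ implies $\sum (-a_i) = 0$, it also holds that $\sum_{r=1}^{n} r \binom{n}{r} (-1)^{r} = 0$.

Theorem 1.3.8: For non-negative integers $n$, $m$, $r$, it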
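holds:
(i) $\binom{n-1}{r} + \binom{n-1}{r-1} = \binom{n}{r}$
(ii) $\binom{n}{0}\binom{m}{r} + \binom{n}{1}\binom{m}{r-1} + \ldots + \binom{n}{r}\binom{m}{0} = \binom{m+n}{r}$
(iii) $\binom{0}{r} + \binom{1}{r} + \binom{2}{r} + \ldots + \binom{n}{r} = \binom{n+1}{r+1}$
Proof: Homework.

1.4 Conditional Probability and Independence

So far, we have computed probability based only on the information that $\Omega$ is used for a probability space $(\Omega, L, P)$. Suppose, instead, we know that event $H \in L$ has happened. What statement should we then make about the chance of an event $A \in L$?

Definition 1.4.1: Given $(\Omega, L, P)$ and $H \in L$, $P(H) > 0$, and $A \in L$, we define

$$P(A \mid H) = \frac{P(A \cap H)}{P(H)} = P_H(A)$$

and call this the conditional probability of $A$ given $H$.

Note: $P(A \mid H)$ is undefined if $P(H) = 0$.

Theorem 1.4.2: In the situation of Definition 1.4.1, $(\Omega, L, P_H)$ is a probability space.

Proof: If $P_H$ is a probability measure, it must satisfy Def. 1.1.7.
(i) $P(H) > 0$ and, by Def. 1.1.7 (i), $P(A \cap H) \ge 0$. Hence $P_H(A) = \frac{P(A \cap H)}{P(H)} \ge 0 \ \forall A \in L$.
(ii) $P_H(\Omega) = \frac{P(\Omega \cap H)}{P(H)} = \frac{P(H)}{P(H)} = 1$.
(iii) Let $\{A_n\}_{n=1}^{\infty}$ be a sequence of disjoint sets. Then

$$P_H\left(\bigcup_{n=1}^{\infty} A_n\right) \overset{\text{Def. 1.4.1}}{=} \frac{P\left(\left(\bigcup_{n=1}^{\infty} A_n\right) \cap H\right)}{P(H)} = \frac{P\left(\bigcup_{n=1}^{\infty} (A_n \cap H)\right)}{P(H)} \overset{\text{Def. 1.1.7 (iii)}}{=} \frac{\sum_{n=1}^{\infty} P(A_n \cap H)}{P(H)} \overset{\text{Def. 1.4.1}}{=} \sum_{n=1}^{\infty} P_H(A_n)$$

Note: What we have done is to move to a new sample space $H$ and a new $\sigma$-field $L_H = L \cap H$ of subsets $A \cap H$ for $A \in L$. We thus have a new measurable space $(H, L_H)$ and a new probability space $(H, L_H, P_H)$.

Note: From Definition 1.4.1, if $A, B \in L$, $P(A) > 0$, and $P(B) > 0$, then $P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$, which generalizes to the following Theorem.

Theorem 1.4.3 (Multiplication Rule): If $A_1, \ldots, A_n \in L$ and $P\left(\bigcap_{j=1}^{n-1} A_j\right) > 0$, then

$$P\left(\bigcap_{j=1}^{n} A_j\right) = P(A_1) \cdot P(A_2 \mid A_1) \cdot P(A_3 \mid A_1 \cap A_2) \cdot \ldots \cdot P\left(A_n \,\Big|\, \bigcap_{j=1}^{n-1} A_j\right)$$

Proof: Homework.

Definition 1.4.4: A collection of subsets $\{A_n\}_{n=1}^{\infty}$ of $\Omega$ form a partition of $\Omega$ if
(i) $\bigcup_{n=1}^{\infty} A_n = \Omega$, and
(ii) $A_i \cap A_j = \emptyset \ \forall i \ne j$, i.e., elements are pairwise disjoint.

Theorem 1.4.5 (Law of Total Probability): If $\{H_j\}_{j=1}^{\infty}$ is a partition of $\Omega$ and $P(H_j) > 0 \ \forall j$, then, for $A \in L$,

$$P(A) = \sum_{j=1}^{\infty} P(A \cap H_j) = \sum_{j=1}^{\infty} P(H_j) P(A \mid H_j)$$

Proof: By the Note preceding Theorem 1.4.3, the summands on both sides are equal, so the right side of Th. 1.4.5 is true. The left side proof: the $H_j$ are disjoint, so the $A \cap H_j$ are disjoint. Then

$$A = A \cap \Omega \overset{\text{Def. 1.4.4}}{=} A \cap \left(\bigcup_{j=1}^{\infty} H_j\right) = \bigcup_{j=1}^{\infty} (A \cap H_j)$$

and therefore $P(A) = P\left(\bigcup_{j=1}^{\infty} (A \cap H_j)\right) \overset{\text{Def. 1.1.7 (iii)}}{=} \sum_{j=1}^{\infty} P(A \cap H_j)$.

Theorem 1.4.6 (Bayes' Rule): Let $\{H_j\}_{j=1}^{\infty}$ be a partition of $\Omega$ and $P(H_j) > 0 \ \forall j$. Let $A \in L$ and $P(A) > 0$. Then

$$P(H_j \mid A) = \frac{P(H_j) P(A \mid H_j)}{\sum_{n=1}^{\infty} P(H_n) P(A \mid H_n)} \quad \forall j$$

Proof: $P(H_j \cap A) \overset{\text{Def. 1.4.1}}{=} P(A) \cdot P(H_j \mid A) = P(H_j) \cdot P(A \mid H_j)$, and therefore $P(H_j \mid A)$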
$= \frac{P(H_j) P(A \mid H_j)}{P(A)} \overset{\text{Th. 1.4.5}}{=} \frac{P(H_j) P(A \mid H_j)}{\sum_{n=1}^{\infty} P(H_n) P(A \mid H_n)}$.

Definition 1.4.7: For $A, B \in L$, $A$ and $B$ are independent iff $P(A \cap B) = P(A) P(B)$.

Note:
• There are no restrictions on $P(A)$ or $P(B)$.
• If $A$ and $B$ are independent, then $P(A \mid B) = P(A)$ (given that $P(B) > 0$) and $P(B \mid A) = P(B)$ (given that $P(A) > 0$).
• If $A$ and $B$ are independent, then the following events are independent as well: $A$ and $B^C$; $A^C$ and $B$; $A^C$ and $B^C$.

Definition 1.4.8: Let $\mathcal{A}$ be a collection of $L$-sets. The events of $\mathcal{A}$ are pairwise independent iff for every distinct $A_1, A_2 \in \mathcal{A}$ it holds $P(A_1 \cap A_2) = P(A_1) P(A_2)$.

Definition 1.4.9: Let $\mathcal{A}$ be a collection of $L$-sets. The events of $\mathcal{A}$ are mutually independent (or completely independent) iff for every finite subcollection $\{A_{i_1}, \ldots, A_{i_k}\}$, $A_{i_j} \in \mathcal{A}$, it holds

$$P\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} P(A_{i_j})$$

Note: To check for mutual independence of $n$ events $A_1, \ldots, A_n \in L$, there are $2^n - n - 1$ relations (i.e., all subcollections of size 2 or more) to check.

Example 1.4.10: Flip a fair coin twice. $\Omega = \{HH, HT, TH, TT\}$.
$A_1$ = "H on 1st toss", $A_2$ = "H on 2nd toss", $A_3$ = "Exactly one H".
Obviously, $P(A_1) = P(A_2) = P(A_3) = \frac{1}{2}$.
Question: Are $A_1$, $A_2$, and $A_3$ pairwise independent and also mutually independent?
$P(A_1 \cap A_2) = .25 = .5 \cdot .5 = P(A_1) \cdot P(A_2)$, so $A_1, A_2$ are independent.
$P(A_1 \cap A_3) = .25 = .5 \cdot .5 = P(A_1) \cdot P(A_3)$, so $A_1, A_3$ are independent.
$P(A_2 \cap A_3) = .25 = .5 \cdot .5 = P(A_2) \cdot P(A_3)$, so $A_2, A_3$ are independent.
Thus, $A_1, A_2, A_3$ are pairwise independent.
$P(A_1 \cap A_2 \cap A_3) = 0 \ne .5 \cdot .5 \cdot .5 = P(A_1) \cdot P(A_2) \cdot P(A_3)$, so $A_1, A_2, A_3$ are not mutually independent.

Example 1.4.11 (from Rohatgi, page 37, Example 5):
• $r$ students, 365 possible birthdays for each student that are equally likely.
• One student at a time is asked for his/her birthday.
• If one of the other students hears this birthday and it matches his/her birthday, this other student has to raise his/her hand. If at least one other student raises his/her hand, the procedure is over.
• We are interested in $p_k = P(\text{procedure terminates at the } k\text{th student}) = P(\text{a hand is first raised when the } k\text{th student is asked for his/her birthday})$.
• The textbook (Rohatgi) claims (without proof) that

$$p_1 = 1 - \left(\frac{364}{365}\right)^{r-1}$$

and

$$p_k = \frac{{}_{365}P_{k-1}}{365^{k-1}} \left(1 - \frac{k-1}{365}\right)^{r-k+1} \left[1 - \left(\frac{365-k}{365-k+1}\right)^{r-k}\right], \quad k = 2, 3, \ldots,$$

where ${}_nP_r = n(n-1) \cdots (n-r+1)$.

Proof: It is $p_1 = P$(at least 1 other from the $(r-1)$ students
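has a birthday on this particular day) $= 1 - P$(all $(r-1)$ students have a birthday on the remaining 364 out of 365 days) $= 1 - \left(\frac{364}{365}\right)^{r-1}$.

$p_2 = P$(no student has a birthday matching the first student and at least one of the other $(r-2)$ students has a b-day matching the second student).

Let $A \equiv$ No student has a b-day matching the 1st student.
Let $B \equiv$ At least one of the other $(r-2)$ has a b-day matching the 2nd.

So

$p_2 = P(A \cap B) = P(A) \cdot P(B \mid A)$
$= P$(no student has a matching b-day with the 1st student) $\times$ $P$(at least one of the remaining students has a matching b-day with the second, given that no one matched the first)
$= (1 - p_1)\,[1 - P(\text{all } (r-2) \text{ students have a b-day on the remaining 363 out of 364 days})]$

$$= \left(\frac{364}{365}\right)^{r-1} \left[1 - \left(\frac{363}{364}\right)^{r-2}\right] \quad (*)$$

$$= \frac{365}{365} \left(1 - \frac{1}{365}\right)^{r-1} \left[1 - \left(\frac{363}{364}\right)^{r-2}\right] = \frac{{}_{365}P_{2-1}}{365^{2-1}} \left(1 - \frac{2-1}{365}\right)^{r-2+1} \left[1 - \left(\frac{365-2}{365-2+1}\right)^{r-2}\right]$$

Formally, we have to write this sequence of equalities in this order. However, it often might be easier to first work from both sides towards a particular result and combine partial results afterwards. Here, one might decide to stop at $(*)$ with the "forward" direction of the equalities and first work "backwards" from the book, which makes things a lot simpler:

$$p_2 = \frac{{}_{365}P_{2-1}}{365^{2-1}} \left(1 - \frac{2-1}{365}\right)^{r-2+1} \left[1 - \left(\frac{365-2}{365-2+1}\right)^{r-2}\right] = \frac{365}{365} \left(1 - \frac{1}{365}\right)^{r-1} \left[1 - \left(\frac{363}{364}\right)^{r-2}\right] = \left(\frac{364}{365}\right)^{r-1} \left[1 - \left(\frac{363}{364}\right)^{r-2}\right]$$

We see that this is the same result as $(*)$.

Now let us consider $p_3$:

$p_3 = P$(no one has the same b-day as the first and no one the same as the second, and at least one of the remaining $(r-3)$ has a matching b-day with the 3rd student).

Let $A \equiv$ No one has the same b-day as the first student.
Let $B \equiv$ No one has the same b-day as the second student.
Let $C \equiv$ At least one of the other $(r-3)$ has the same b-day as the third student.

Now

$p_3 = P(A \cap B \cap C) = P(A) \cdot P(B \mid A) \cdot P(C \mid A \cap B)$
$= \left(\frac{364}{365}\right)^{r-1} \left(\frac{363}{364}\right)^{r-2} [1 - P(\text{all } (r-3) \text{ students have a b-day on the remaining 362 out of 363 days})]$

$$= \left(\frac{364}{365}\right)^{r-1} \left(\frac{363}{364}\right)^{r-2} \left[1 - \left(\frac{362}{363}\right)^{r-3}\right] = \frac{364}{365} \left(\frac{363}{365}\right)^{r-2} \left[1 - \left(\frac{362}{363}\right)^{r-3}\right]$$

$$= \frac{365 \cdot 364}{365^2} \left(1 - \frac{2}{365}\right)^{r-2} \left[1 - \left(\frac{362}{363}\right)^{r-3}\right] = \frac{{}_{365}P_{2}}{365^{2}} \left(1 - \frac{2}{365}\right)^{r-2} \left[1 - \left(\frac{362}{363}\right)^{r-3}\right]$$

Once again, working "backwards" from the book should help to better understand these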
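transformations. For general $p_k$ and restrictions on $r$ and $k$, see Homework.

2 Random Variables

(Based on Casella/Berger, Sections 1.4, 1.5, 1.6 & 2.1)

2.1 Measurable Functions

Definition 2.1.1:
• A random variable (rv) is a set function from $\Omega$ to $\mathbb{R}$.
• More formally: Let $(\Omega, L, P)$ be any probability space. Suppose $X: \Omega \to \mathbb{R}$ and that $X$ is a measurable function; then we call $X$ a random variable.
• More generally: If $X: \Omega \to \mathbb{R}^k$, we call $X$ a random vector, $X = (X_1(\omega), X_2(\omega), \ldots, X_k(\omega))$.

What does it mean to say that a function is measurable?

Definition 2.1.2: Suppose $(\Omega, L)$ and $(S, \mathcal{B})$ are two measurable spaces and $X: \Omega \to S$ is a mapping from $\Omega$ to $S$. We say that $X$ is measurable $L - \mathcal{B}$ if $X^{-1}(B) \in L$ for every set $B \in \mathcal{B}$, where $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$.

Example 2.1.3: Record the opinion of 50 people: "yes" (y) or "no" (n).
$\Omega$ = {all $2^{50}$ possible sequences of y/n}, which is HUGE! $L = \mathcal{P}(\Omega)$.
(i) Consider $X: \Omega \to S$ = {all $2^{50}$ possible sequences of 1 (= y) and 0 (= n)}, $\mathcal{B} = \mathcal{P}(S)$. $X$ is a random vector since each element in $S$ has a corresponding element in $\Omega$; for $B \in \mathcal{B}$, $X^{-1}(B) \in L = \mathcal{P}(\Omega)$.
(ii) Consider $X: \Omega \to S = \{0, 1, 2, \ldots, 50\}$, where $X(\omega)$ = "number of y's in $\omega$" is a more manageable random variable.
A simple function, i.e., a function that takes only finitely many values $x_1, \ldots, x_k$, is measurable iff $X^{-1}(x_i) \in L \ \forall x_i$. Here, $X^{-1}(k) = \{\omega \in \Omega : \text{number of y's in sequence } \omega = k\}$ is a subset of $\Omega$, so it is in $L = \mathcal{P}(\Omega)$.

Example 2.1.4: Let $\Omega$ = "infinite fair coin tossing space", i.e., infinite sequences of H's and T's. Let $L_n$ be a $\sigma$-field for the first $n$ tosses. Define $L = \sigma\left(\bigcup_{n=1}^{\infty} L_n\right)$. Let $X_n: \Omega \to \mathbb{R}$ be $X_n(\omega)$ = proportion of H's in the first $n$ tosses. For each $n$, $X_n(\cdot)$ is simple (values $\{0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n}\}$) and $X_n^{-1}(\frac{k}{n}) \in L_n \ \forall k = 0, 1, \ldots, n$. Therefore, $X_n^{-1}(\frac{k}{n}) \in L$. So every random variable $X_n(\cdot)$ is measurable $L - \mathcal{B}$. Now we have a sequence of rv's $\{X_n\}_{n=1}^{\infty}$. We will show later that $P(\{\omega : X_n(\omega) \to \frac{1}{2}\}) = 1$, i.e., the Strong Law of Large Numbers (SLLN).

Some Technical Points about Measurable Functions

2.1.5: Suppose $(\Omega, L)$ and $(S, \mathcal{B})$ are measure spaces and that a collection of sets $\mathcal{A}$ generates $\mathcal{B}$, i.e., $\sigma(\mathcal{A}) = \mathcal{B}$. Let $X: \Omega \to S$. If $X^{-1}(A) \in L \ \forall A \in \mathcal{A}$, then $X$ is measurable $L - \mathcal{B}$. This means we only have to check measurability on a basis collection $\mathcal{A}$. The usage is: $\mathcal{B}$ on $\mathbb{R}$ is generated by $\{(-\infty, x] : x \in \mathbb{R}\}$.

2.1.6: If $(\Omega, L)$, $(\Omega', L')$, and $(\Omega'', L'')$ are measure spaces and $X$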
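$: \Omega \to \Omega'$ and $Y: \Omega' \to \Omega''$ are measurable, then the composition $(YX): \Omega \to \Omega''$ is measurable $L - L''$.

2.1.7: If $f: \mathbb{R}^i \to \mathbb{R}^k$ is a continuous function, then $f$ is measurable $\mathcal{B}^i - \mathcal{B}^k$.

2.1.8: If $f_j: \Omega \to \mathbb{R}$, $j = 1, \ldots, k$, and $g: \mathbb{R}^k \to \mathbb{R}$ are measurable, then $g(f_1(\cdot), \ldots, f_k(\cdot))$ is measurable. The usage is: $g$ could be the sum, average, difference, product, or (finite) maximum and minimum of $x_1, \ldots, x_k$, etc.

2.1.9: Limits: Extend the real line to $[-\infty, \infty] = \mathbb{R} \cup \{-\infty, \infty\}$. We say $f: \Omega \to \bar{\mathbb{R}}$ is measurable $L - \bar{\mathcal{B}}$ if
(i) $f^{-1}(B) \in L \ \forall B \in \mathcal{B}$, and
(ii) $f^{-1}(-\infty), f^{-1}(\infty) \in L$ also.

2.1.10: Suppose $f_1, f_2, \ldots$ is a sequence of real-valued measurable functions $(\Omega, L) \to (\mathbb{R}, \mathcal{B})$. Then it holds:
(i) $\sup_n f_n$ (supremum), $\inf_n f_n$ (infimum), $\limsup_{n \to \infty} f_n$ (limit superior), and $\liminf_{n \to \infty} f_n$ (limit inferior) are measurable.
(ii) If $f = \lim_{n \to \infty} f_n$ exists, then $f$ is measurable.
(iii) The set $\{\omega : f_n(\omega) \text{ converges}\} \in L$.
(iv) If $f$ is any measurable function, the set $\{\omega : f_n(\omega) \to f(\omega)\} \in L$.

Example 2.1.11:
(i) Let $f_n(x) = \frac{1}{x^n}$, $x > 1$. It holds:
• $\sup_n f_n(x) = \frac{1}{x}$
• $\inf_n f_n(x) = 0$
• $\limsup_{n \to \infty} f_n(x) = 0$
• $\liminf_{n \to \infty} f_n(x) = 0$
• $\lim_{n \to \infty} f_n(x) = \limsup_{n \to \infty} f_n(x) = \liminf_{n \to \infty} f_n(x) = 0$

(ii) Let $f_n(x) = x^3$ for $x \in [-n, n]$ and $f_n(x) = 0$ otherwise. It holds:
• $\sup_n f_n(x) = x^3$ for $x \ge -1$, and $0$ otherwise
• $\inf_n f_n(x) = x^3$ for $x \le 1$, and $0$ otherwise
• $\limsup_{n \to \infty} f_n(x) = x^3$
• $\liminf_{n \to \infty} f_n(x) = x^3$
• $\lim_{n \to \infty} f_n(x) = \limsup_{n \to \infty} f_n(x) = \liminf_{n \to \infty} f_n(x) = x^3$

(iii) Let $f_n(x) = (-1)^n x^3$ for $x \in [-n, n]$ and $f_n(x) = 0$ otherwise. It holds:
• $\sup_n f_n(x) = |x|^3$
• $\inf_n f_n(x) = -|x|^3$
• $\limsup_{n \to \infty} f_n(x) = |x|^3$
• $\liminf_{n \to \infty} f_n(x) = -|x|^3$
• $\lim_{n \to \infty} f_n(x) = \limsup_{n \to \infty} f_n(x) = \liminf_{n \to \infty} f_n(x) = 0$ if $x = 0$, but $\lim_{n \to \infty} f_n(x)$ does not exist for $x \ne 0$ since $\limsup_{n \to \infty} f_n(x) \ne \liminf_{n \to \infty} f_n(x)$ for $x \ne 0$.

2.2 Probability Distribution of a Random Variable

The definition of a random variable $X: (\Omega, L) \to (S, \mathcal{B})$ makes no mention of $P$. We now introduce a probability measure on $(S, \mathcal{B})$.

Theorem 2.2.1: A random variable $X$ on $(\Omega, L, P)$ induces a probability measure on a space $(\mathbb{R}, \mathcal{B}, Q)$ with the probability distribution $Q$ of $X$ defined by

$$Q(B) = P(X^{-1}(B)) = P(\{\omega : X(\omega) \in B\}) \quad \forall B \in \mathcal{B}$$

Note: By the definition of a random variable, $X^{-1}(B) \in L \ \forall B \in \mathcal{B}$. $Q$ is called the induced probability.

Proof: If $X$ induces a probability measure $Q$ on $(\mathbb{R}, \mathcal{B})$, then $Q$ must satisfy the Kolmogorov Axioms of probability.
$X: (\Omega, L) \to (S, \mathcal{B})$. $X$ is a rv, so $X^{-1}(B) = \{\omega : X(\omega) \in B\} = A \in L \ \forall B \in \mathcal{B}$.
(i) $Q(B) = P(X^{-1}(B))$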
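$= P(\{\omega : X(\omega) \in B\}) = P(A) \overset{\text{Def. 1.1.7 (i)}}{\ge} 0 \quad \forall B \in \mathcal{B}$
(ii) $Q(\mathbb{R}) = P(X^{-1}(\mathbb{R})) \overset{X \text{ rv}}{=} P(\Omega) \overset{\text{Def. 1.1.7 (ii)}}{=} 1$
(iii) Let $\{B_n\}_{n=1}^{\infty} \in \mathcal{B}$, $B_i \cap B_j = \emptyset \ \forall i \ne j$. Then

$$Q\left(\bigcup_{n=1}^{\infty} B_n\right) = P\left(X^{-1}\left(\bigcup_{n=1}^{\infty} B_n\right)\right) \overset{(*)}{=} P\left(\bigcup_{n=1}^{\infty} X^{-1}(B_n)\right) \overset{\text{Def. 1.1.7 (iii)}}{=} \sum_{n=1}^{\infty} P(X^{-1}(B_n)) = \sum_{n=1}^{\infty} Q(B_n)$$

$(*)$ holds since $X^{-1}(\cdot)$ commutes with unions/intersections and preserves disjointedness.

Definition 2.2.2: A real-valued function $F$ on $(-\infty, \infty)$ that is non-decreasing, right-continuous, and satisfies $F(-\infty) = 0$, $F(\infty) = 1$ is called a cumulative distribution function (cdf) on $\mathbb{R}$.

Note: No mention of a probability space or measure $P$ in Definition 2.2.2 above.

Definition 2.2.3: Let $P$ be a probability measure on $(\mathbb{R}, \mathcal{B})$. The cdf associated with $P$ is

$$F(x) = F_P(x) = P((-\infty, x]) = P(\{\omega : X(\omega) \le x\}) = P(X \le x)$$

for a random variable $X$ defined on $(\mathbb{R}, \mathcal{B}, P)$.

Note: $F(\cdot)$ defined as in Definition 2.2.3 above indeed is a cdf.

Proof (of Note):
(i) Let $x_1 < x_2$. Then $(-\infty, x_1] \subset (-\infty, x_2]$, and therefore $F(x_1) = P(\{\omega : X(\omega) \le x_1\}) \overset{\text{Th. 1.2.1 (v)}}{\le} P(\{\omega : X(\omega) \le x_2\}) = F(x_2)$. Thus, since $x_1 < x_2$ and $F(x_1) \le F(x_2)$, $F(\cdot)$ is non-decreasing.

(ii) Since $F$ is non-decreasing, it is sufficient to show that $F(\cdot)$ is right-continuous if, for any sequence of numbers $x_n \to x+$ (which means that $x_n$ is approaching $x$ from the right) with $x_1 > x_2 > \ldots > x_n > \ldots > x$, it holds that $F(x_n) \to F(x)$.
Let $A_n = \{\omega : X(\omega) \in (x, x_n]\} \in L$ and $A_n \downarrow \emptyset$. None of the intervals $(x, x_n]$ contains $x$. As $x_n \to x+$, the number of points $\omega$ in $A_n$ diminishes until the set is empty. Formally,

$$\lim_{n \to \infty} A_n = \lim_{n \to \infty} \bigcap_{i=1}^{n} A_i = \bigcap_{n=1}^{\infty} A_n = \emptyset$$

By Theorem 1.2.6 it follows that $\lim_{n \to \infty} P(A_n) = P(\lim_{n \to \infty} A_n) = P(\emptyset) = 0$.
It is $P(A_n) = P(\{\omega : X(\omega) \le x_n\}) - P(\{\omega : X(\omega) \le x\}) = F(x_n) - F(x)$. Therefore,

$$\left(\lim_{n \to \infty} F(x_n)\right) - F(x) = \lim_{n \to \infty} (F(x_n) - F(x)) = \lim_{n \to \infty} P(A_n) = 0,$$

so $\lim_{n \to \infty} F(x_n) = F(x)$ and $F(x)$ is right-continuous.

(iii) $F(-n) \overset{\text{Def. 2.2.3}}{=} P(\{\omega : X(\omega) \le -n\})$. Therefore,

$$F(-\infty) = \lim_{n \to \infty} F(-n) = \lim_{n \to \infty} P(\{\omega : X(\omega) \le -n\}) = P\left(\lim_{n \to \infty} \{\omega : X(\omega) \le -n\}\right) = P(\emptyset) = 0$$

(iv) $F(n) \overset{\text{Def. 2.2.3}}{=} P(\{\omega : X(\omega) \le n\})$. Therefore,

$$F(\infty) = \lim_{n \to \infty} F(n) = \lim_{n \to \infty} P(\{\omega : X(\omega) \le n\}) = P\left(\lim_{n \to \infty} \{\omega : X(\omega) \le n\}\right) = P(\Omega) = 1$$

Note that (iii) and (iv) implicitly use Theorem 1.2.6. In (iii), we use $A_n = (-\infty, -n]$, where $A_n \supset A_{n+1}$ and $A_n \downarrow \emptyset$. In (iv), we use $A_n = (-\infty, n]$, where $A_n \subset A_{n+1}$ and $A_n \uparrow \mathbb{R}$.

Definition 2.2.4: If a random variable $X: \Omega \to \mathbb{R}$ has induced a probability measure $P_X$ on $(\mathbb{R}, \mathcal{B})$ with cdf $F(x)$, we say
(i) rv $X$ is continuous if $F(x)$ is continuous in $x$.
(ii) rv $X$ is discrete if $F(x)$ is a step function in $x$.

Note: There are rvs that are mixtures of continuous and discrete rvs. One such example is a truncated failure time distribution.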
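As a hedged illustration of such a mixed distribution, the sketch below takes an exponential failure time that is only recorded up to a truncation point and places all remaining probability on that point. The function names, the rate `rate`, and the truncation point `c` are assumptions made for this sketch only; they do not come from the notes.

```python
import math
import random


def truncated_exp_cdf(y, rate=1.0, c=2.0):
    """cdf of Y = min(X, c) with X ~ Exponential(rate).

    rate and c are illustrative values, not taken from the notes.
    """
    if y < 0:
        return 0.0
    if y < c:
        return 1.0 - math.exp(-rate * y)  # continuous (exponential) part
    return 1.0  # all remaining probability sits on the point c


def jump_at_truncation(rate=1.0, c=2.0):
    """Size of the jump of the cdf at c, i.e. P(Y = c) = P(X >= c)."""
    return math.exp(-rate * c)


def simulate_point_mass(n=100_000, rate=1.0, c=2.0, seed=0):
    """Monte Carlo estimate of P(Y = c) from n simulated failure times."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if min(rng.expovariate(rate), c) == c)
    return hits / n
```

With these assumed values, the cdf is continuous below `c` and jumps by `jump_at_truncation()` at `c`, so the single point `c` carries positive probability; `simulate_point_mass()` should come out close to that jump size.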
We assume a continuous distribution (e.g., exponential) up to a given truncation point $x$ and assign the remaining probability to the truncation point. Thus, a single point has a probability $> 0$ and $F(x)$ jumps at the truncation point $x$.

Definition 2.2.5: Two random variables $X$ and $Y$ are identically distributed iff $P_X(X \in A) = P_Y(Y \in A) \ \forall A \in L$.

Note: Def. 2.2.5 does not mean that $X(\omega) = Y(\omega) \ \forall \omega \in \Omega$. For example,
$X$ = number of H in 3 coin tosses,
$Y$ = number of T in 3 coin tosses.
$X$, $Y$ are both $Bin(3, 0.5)$, i.e., identically distributed, but for $\omega = (H, H, T)$, $X(\omega) = 2 \ne 1 = Y(\omega)$, i.e., $X \ne Y$.

Theorem 2.2.6: The following two statements are equivalent:
(i) $X$, $Y$ are identically distributed.
(ii) $F_X(x) = F_Y(x) \ \forall x \in \mathbb{R}$.

Proof:
(i) $\Rightarrow$ (ii): $F_X(x) = P_X((-\infty, x]) = P(\{\omega : X(\omega) \in (-\infty, x]\}) \overset{\text{by (i)}}{=} P(\{\omega : Y(\omega) \in (-\infty, x]\}) = P_Y((-\infty, x]) = F_Y(x)$
(ii) $\Rightarrow$ (i): Requires extra knowledge from measure theory.

2.3 Discrete and Continuous Random Variables

We now extend Definition 2.2.4 to make our definitions a little bit more formal.

Definition 2.3.1: Let $X$ be a real-valued random variable with cdf $F$ on $(\Omega, L, P)$. $X$ is discrete if there exists a countable set $E \subset \mathbb{R}$ such that $P(X \in E) = 1$, i.e., $P(\{\omega : X(\omega) \in E\}) = 1$. The points of $E$ which have positive probability are the jump points of the step function $F$, i.e., the cdf of $X$.
Define $p_i = P(\{\omega : X(\omega) = x_i, x_i \in E\}) = P_X(X = x_i) \ \forall i \ge 1$. Then $p_i \ge 0$ and $\sum_{i=1}^{\infty} p_i = 1$.
We call $\{p_i : p_i \ge 0\}$ the probability mass function (pmf) (also: probability frequency function) of $X$.

Note: Given any set of numbers $\{p_n\}_{n=1}^{\infty}$ with $p_n \ge 0 \ \forall n \ge 1$ and $\sum_{n=1}^{\infty} p_n = 1$, $\{p_n\}_{n=1}^{\infty}$ is the pmf of some rv $X$.

Note: The issue of continuous rv's and probability density functions (pdfs) is more complicated. A rv $X: \Omega \to \mathbb{R}$ always has a cdf $F$. Whether there exists a function $f$ such that $f$ integrates to $F$ and $F'$ exists and equals $f$ (almost everywhere) depends on something stronger than just continuity.

Definition 2.3.2: A real-valued function $F$ is continuous in $x_0 \in \mathbb{R}$ iff
$$\forall \epsilon > 0 \ \exists \delta > 0 \ \forall x: \ |x - x_0| < \delta \Rightarrow |F(x) - F(x_0)| < \epsilon$$
$F$ is continuous iff $F$ is continuous in all $x \in \mathbb{R}$.

Definition 2.3.3: A real-valued function $F$ defined on $[a, b]$ is absolutely continuous on $[a, b]$ iff $\forall \epsilon > 0 \ \exists \delta > 0 \ \forall$ finite subcollections of disjoint subintervals $[a_i, b_i]$, $i = 1, \ldots, n$: $\sum_{i=1}^{n} (b_i - a_i) < \delta \Rightarrow \sum_{i=1}^{n} |F(b_i) - F(a_i)|$
$< \epsilon$.

Note: Absolute continuity implies continuity.

Theorem 2.3.4:
(i) If $F$ is absolutely continuous, then $F'$ exists almost everywhere.
(ii) A function $F$ is an indefinite integral iff it is absolutely continuous. Thus, every absolutely continuous function $F$ is the indefinite integral of its derivative $F'$.

Definition 2.3.5: Let $X$ be a random variable on $(\Omega, L, P)$ with cdf $F$. We say $X$ is a continuous rv iff $F$ is absolutely continuous. In this case, there exists a non-negative integrable function $f$, the probability density function (pdf) of $X$, such that

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = P(X \le x)$$

From this it follows that, if $a, b \in \mathbb{R}$, $a < b$, then

$$P_X(a < X \le b) = F(b) - F(a) = \int_{a}^{b} f(t)\,dt$$

exists and is well defined.

Theorem 2.3.6: Let $X$ be a continuous random variable with pdf $f$. Then it holds:
(i) For every Borel set $B \in \mathcal{B}$, $P(B) = \int_B f(t)\,dt$.
(ii) If $F$ is absolutely continuous and $f$ is continuous at $x$, then $F'(x) = \frac{dF(x)}{dx} = f(x)$.

Proof:
Part (i): From Definition 2.3.5 above.
Part (ii): By the Fundamental Theorem of Calculus.

Note: As already stated in the Note following Definition 2.2.4, not every rv will fall into one of these two (or, if you prefer, three, i.e., discrete, continuous/absolutely continuous) classes. However, most rv's which arise in practice will. We look at one example that is unlikely to occur in practice in the next Homework assignment.

However, note that every cdf $F$ can be written as

$$F(x) = a F_d(x) + (1 - a) F_c(x), \quad 0 \le a \le 1,$$

where $F_d$ is the cdf of a discrete rv and $F_c$ is a continuous (but not necessarily absolutely continuous) cdf.

Some authors, such as Marek Fisz, Wahrscheinlichkeitsrechnung und mathematische Statistik, VEB Deutscher Verlag der Wissenschaften, Berlin, 1989, are even more specific. There it is stated that every cdf $F$ can be written as

$$F(x) = a_1 F_d(x) + a_2 F_c(x) + a_3 F_s(x), \quad a_1, a_2, a_3 \ge 0, \ a_1 + a_2 + a_3 = 1$$

Here, $F_d(x)$ and $F_c(x)$ are discrete and absolutely continuous cdfs. $F_s(x)$ is called a singular cdf. Singular means that $F_s(x)$ is continuous and its derivative $F_s'(x)$ equals 0 almost everywhere (i.e., everywhere but in those points that belong to a Borel-measurable set of probability 0).

Question: Does "continuous" but "not absolutely continuous" mean "singular"? We will (hopefully) see
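later.

Example 2.3.7: Consider

$$F(x) = \begin{cases} 0, & x < 0 \\ 1/2, & x = 0 \\ 1/2 + x/2, & 0 < x < 1 \\ 1, & x \ge 1 \end{cases}$$

We can write $F(x)$ as $a F_d(x) + (1 - a) F_c(x)$, $0 \le a \le 1$. How?

Since $F(x)$ has only one jump at $x = 0$, it is reasonable to get started with a pmf $p_0 = 1$ and corresponding cdf

$$F_d(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}$$

Since $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x \ge 1$, it must clearly hold that $F_c(x) = 0$ for $x < 0$ and $F_c(x) = 1$ for $x \ge 1$. In addition, $F(x)$ increases linearly in $0 < x < 1$. A good guess would be a pdf $f_c(x) = 1 \cdot I_{(0,1)}(x)$ and corresponding cdf

$$F_c(x) = \begin{cases} 0, & x \le 0 \\ x, & 0 < x < 1 \\ 1, & x \ge 1 \end{cases}$$

Knowing that $F(0) = 1/2$, we have at least to multiply $F_d(x)$ by $1/2$. And, indeed, $F(x)$ can be written as

$$F(x) = \frac{1}{2} F_d(x) + \frac{1}{2} F_c(x)$$

Definition 2.3.8: The two-valued function $I_A(x)$ is called the indicator function and it is defined as follows: $I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$ if $x \notin A$, for any set $A$.

An Excursion into Logic

When proving theorems, we only used direct methods so far. We used induction proofs to show that something holds for arbitrary $n$. To show that a statement $A$ implies a statement $B$, i.e., $A \Rightarrow B$, we used proofs of the type $A \Rightarrow A_1 \Rightarrow A_2 \Rightarrow \ldots \Rightarrow A_{n-1} \Rightarrow A_n \Rightarrow B$, where one step directly follows from the previous step. However, there are different approaches to obtain the same result.

Basic Operators:

Boolean Logic makes assertions on statements that can either be true (represented as 1) or false (represented as 0). Basic operators are "not" ($\neg$), "and" ($\wedge$), "or" ($\vee$), "implies" ($\Rightarrow$), "equivalent" ($\Leftrightarrow$), and "exclusive or" ($\oplus$).

These operators are defined as follows:

| $A$ | $B$ | $\neg A$ | $\neg B$ | $A \wedge B$ | $A \vee B$ | $A \Rightarrow B$ | $A \Leftrightarrow B$ | $A \oplus B$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 |
| 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 |
| 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 |

Implication: ($A$ implies $B$)

$A \Rightarrow B$ is equivalent to $\neg B \Rightarrow \neg A$ is equivalent to $\neg A \vee B$:

| $A$ | $B$ | $A \Rightarrow B$ | $\neg A$ | $\neg B$ | $\neg B \Rightarrow \neg A$ | $\neg A \vee B$ |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 0 | 1 | 1 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 | 0 | 1 | 1 |
| 0 | 0 | 1 | 1 | 1 | 1 | 1 |

Equivalence: ($A$ is equivalent to $B$)

$A \Leftrightarrow B$ is equivalent to $(A \Rightarrow B) \wedge (B \Rightarrow A)$ is equivalent to $(\neg A \vee B) \wedge (A \vee \neg B)$:

| $A$ | $B$ | $A \Leftrightarrow B$ | $A \Rightarrow B$ | $B \Rightarrow A$ | $(A \Rightarrow B) \wedge (B \Rightarrow A)$ | $\neg A \vee B$ | $A \vee \neg B$ | $(\neg A \vee B) \wedge (A \vee \neg B)$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

Negations of Quantifiers:

The quantifiers "for all" ($\forall$) and "it exists" ($\exists$) are used to indicate that a statement holds for all possible values or that there exists such a value that makes the statement true, respectively. When negating a statement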
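with a quantifier, this means that we flip from one quantifier to the other with the remaining statement negated as well, i.e., $\forall$ becomes $\exists$ and $\exists$ becomes $\forall$.

$\neg(\forall x \in X: B(x))$ is equivalent to $\exists x \in X: \neg B(x)$
$\neg(\exists x \in X: B(x))$ is equivalent to $\forall x \in X: \neg B(x)$
$\exists x \in X \ \forall y \in Y: B(x, y)$ implies $\forall y \in Y \ \exists x \in X: B(x, y)$

2.4 Transformations of Random Variables

Let $X$ be a real-valued random variable on $(\Omega, L, P)$, i.e., $X: (\Omega, L) \to (\mathbb{R}, \mathcal{B})$. Let $g$ be any Borel-measurable real-valued function on $\mathbb{R}$. Then, by statement 2.1.6, $Y = g(X)$ is a random variable.

Theorem 2.4.1: Given a random variable $X$ with known induced distribution and a Borel-measurable function $g$, then the distribution of the random variable $Y = g(X)$ is determined.

Proof:
$$F_Y(y) = P_Y(Y \le y) = P(\{\omega : g(X(\omega)) \le y\}) = P(\{\omega : X(\omega) \in B_y\}),$$
where $B_y = g^{-1}((-\infty, y]) \in \mathcal{B}$ since $g$ is Borel-measurable.

Note: From now on, we restrict ourselves to real-valued (vector-valued) functions that are Borel-measurable, i.e., measurable with respect to $(\mathbb{R}, \mathcal{B})$ or $(\mathbb{R}^k, \mathcal{B}^k)$. More generally, $P_Y(Y \in C) = P_X(X \in g^{-1}(C)) \ \forall C \in \mathcal{B}$.

Example 2.4.2: Suppose $X$ is a discrete random variable. Let $A$ be a countable set such that $P(X \in A) = 1$ and $P(X = x) > 0 \ \forall x \in A$. Let $Y = g(X)$. Obviously, the sample space of $Y$ is also countable. Then,

$$P_Y(Y = y) = \sum_{x \in g^{-1}(\{y\})} P_X(X = x) = \sum_{\{x : g(x) = y\}} P_X(X = x) \quad \forall y \in g(A)$$

Example 2.4.3: $X \sim U(-1, 1)$, so the pdf of $X$ is $f_X(x) = \frac{1}{2} I_{[-1,1]}(x)$, which, according to Definition 2.3.8, reads as $f_X(x) = 1/2$ for $-1 \le x \le 1$ and $0$ otherwise.

Let $Y = X^+ = \begin{cases} x, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$

Then,

$$F_Y(y) = P_Y(Y \le y) = \begin{cases} 0, & y < 0 \\ 1/2, & y = 0 \\ 1/2 + y/2, & 0 < y < 1 \\ 1, & y \ge 1 \end{cases}$$

This is the mixed discrete/continuous distribution from Example 2.3.7.

Note: We need to put some conditions on $g$ to ensure $g(X)$ is continuous if $X$ is continuous and avoid cases as in Example 2.4.3 above.

Definition 2.4.4: For a random variable $X$ from $(\Omega, L, P)$ to $(\mathbb{R}, \mathcal{B})$, the support of $X$ (or $P$) is any set $A \in L$ for which $P(A) = 1$. For a continuous random variable $X$ with pdf $f$, we can think of the support of $X$ as $\mathcal{X} = \{x : f_X(x) > 0\}$.

Definition 2.4.5: Let $f$ be a real-valued function defined on $D \subseteq \mathbb{R}$, $D \in \mathcal{B}$. We say:
$f$ is non-decreasing if $x < y \Rightarrow f(x) \le f(y) \ \forall x, y \in D$;
$f$ is strictly non-decreasing (or increasing) if $x < y \Rightarrow f(x) < f(y) \ \forall x, y \in D$;
$f$ is non-increasing if $x < y \Rightarrow f(x) \ge f(y) \ \forall x, y \in D$;
$f$ is strictly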
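non-increasing (or decreasing) if $x < y \Rightarrow f(x) > f(y) \ \forall x, y \in D$;
$f$ is monotonic on $D$ if $f$ is either increasing or decreasing, and we write $f \uparrow$ or $f \downarrow$.

Theorem 2.4.6: Let $X$ be a continuous rv with pdf $f_X$ and support $\mathcal{X}$. Let $y = g(x)$ be differentiable for all $x$ and either (i) $g'(x) > 0$ or (ii) $g'(x) < 0$ for all $x$. Then, $Y = g(X)$ is also a continuous rv with pdf

$$f_Y(y) = f_X(g^{-1}(y)) \cdot \left|\frac{d}{dy} g^{-1}(y)\right| \cdot I_{g(\mathcal{X})}(y)$$

Proof:
Part (i): $g'(x) > 0 \ \forall x \in \mathcal{X}$. So $g$ is strictly increasing and continuous. Therefore, $x = g^{-1}(y)$ exists and it is also strictly increasing and also differentiable. It holds that

$$\frac{d}{dy} g^{-1}(y) = \left(\frac{d}{dx} g(x) \Big|_{x = g^{-1}(y)}\right)^{-1} > 0$$

We get $F_Y(y) = P_Y(Y \le y) = P_Y(g(X) \le y) = P_X(X \le g^{-1}(y)) = F_X(g^{-1}(y))$ for $y \in g(\mathcal{X})$ and, by differentiation,

$$f_Y(y) = F_Y'(y) = \frac{d}{dy}\left(F_X(g^{-1}(y))\right) \overset{\text{chain rule}}{=} f_X(g^{-1}(y)) \cdot \frac{d}{dy} g^{-1}(y)$$

Part (ii): $g'(x) < 0 \ \forall x \in \mathcal{X}$. So $g$ is strictly decreasing and continuous. Therefore, $x = g^{-1}(y)$ exists and it is also strictly decreasing and also differentiable. It holds that

$$\frac{d}{dy} g^{-1}(y) = \left(\frac{d}{dx} g(x) \Big|_{x = g^{-1}(y)}\right)^{-1} < 0$$

We get $F_Y(y) = P_Y(Y \le y) = P_Y(g(X) \le y) = P_X(X \ge g^{-1}(y)) = 1 - P_X(X \le g^{-1}(y)) = 1 - F_X(g^{-1}(y))$ for $y \in g(\mathcal{X})$ and, by differentiation,

$$f_Y(y) = F_Y'(y) = \frac{d}{dy}\left(1 - F_X(g^{-1}(y))\right) = -f_X(g^{-1}(y)) \cdot \frac{d}{dy} g^{-1}(y) = f_X(g^{-1}(y)) \cdot \left(-\frac{d}{dy} g^{-1}(y)\right)$$

Since $\frac{d}{dy} g^{-1}(y) < 0$, the negative sign will cancel out, always giving us a positive value. Hence the need for the absolute value signs.

Combining parts (i) and (ii), we can therefore write

$$f_Y(y) = f_X(g^{-1}(y)) \cdot \left|\frac{d}{dy} g^{-1}(y)\right| \cdot I_{g(\mathcal{X})}(y)$$

Note: In Theorem 2.4.6, we can also write

$$f_Y(y) = \frac{f_X(x)}{\left|\frac{dg(x)}{dx}\right|} \Bigg|_{x = g^{-1}(y)}, \quad y \in g(\mathcal{X})$$

If $g$ is monotonic over disjoint intervals, we can also get an expression for the pdf/cdf of $Y = g(X)$, as stated in the following Theorem.

Theorem 2.4.7: Let $Y = g(X)$, where $X$ is a rv with pdf $f_X(x)$ on support $\mathcal{X}$. Suppose there exists a partition $A_0, A_1, \ldots, A_k$ of $\mathcal{X}$ such that $P(X \in A_0) = 0$ and $f_X(x)$ is continuous on each $A_i$. Suppose there exist functions $g_1(x), \ldots, g_k(x)$ defined on $A_1$ through $A_k$, respectively, satisfying
(i) $g(x) = g_i(x) \ \forall x \in A_i$,
(ii) $g_i(x)$ is monotonic on $A_i$,
(iii) the set $\mathcal{Y} = g_i(A_i) = \{y : y = g_i(x) \text{ for some } x \in A_i\}$ is the same for each $i = 1, \ldots, k$, and
(iv) $g_i^{-1}(y)$ has a continuous derivative on $\mathcal{Y}$ for each $i = 1, \ldots, k$.
Then,

$$f_Y(y) = \sum_{i=1}^{k} f_X(g_i^{-1}(y)) \cdot \left|\frac{d}{dy} g_i^{-1}(y)\right| \cdot I_{\mathcal{Y}}(y)$$

Example 2.4.8: Let $X$ be a rv with pdf $f_X(x) = \frac{2x}{\pi^2} \cdot I_{(0, \pi)}(x)$. Let $Y = \sin(X)$. What is $f_Y(y)$? Since $\sin$ is not monotonic on $(0, \pi)$,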
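Theorem 2.4.6 cannot be used to determine the pdf of $Y$.

[Figure: Example 2.4.8, $f_X(x)$ and $g(x) = \sin(x)$ on $(0, \pi)$, with $\sin^{-1}(y)$ and $\pi - \sin^{-1}(y)$ marked on the $x$-axis.]

Two possible approaches:

Method 1: cdfs
For $0 < y < 1$ we have

$$F_Y(y) = P_Y(Y \le y) = P_X(\sin X \le y) = P_X([0 \le X \le \sin^{-1}(y)] \text{ or } [\pi - \sin^{-1}(y) \le X \le \pi]) = F_X(\sin^{-1}(y)) + (1 - F_X(\pi - \sin^{-1}(y)))$$

since $[0 \le X \le \sin^{-1}(y)]$ and $[\pi - \sin^{-1}(y) \le X \le \pi]$ are disjoint sets. Then,

$$f_Y(y) = F_Y'(y) = f_X(\sin^{-1}(y)) \frac{1}{\sqrt{1 - y^2}} + (-1) f_X(\pi - \sin^{-1}(y)) \frac{-1}{\sqrt{1 - y^2}}$$

$$= \frac{1}{\sqrt{1 - y^2}} \left(f_X(\sin^{-1}(y)) + f_X(\pi - \sin^{-1}(y))\right) = \frac{1}{\sqrt{1 - y^2}} \left(\frac{2 \sin^{-1}(y)}{\pi^2} + \frac{2(\pi - \sin^{-1}(y))}{\pi^2}\right)$$

$$= \frac{1}{\pi^2 \sqrt{1 - y^2}} \cdot 2\pi = \frac{2}{\pi \sqrt{1 - y^2}} \cdot I_{(0,1)}(y)$$

Method 2: Use of Theorem 2.4.7
Let $A_1 = (0, \frac{\pi}{2})$, $A_2 = (\frac{\pi}{2}, \pi)$, and $A_0 = \{\frac{\pi}{2}\}$.
Let $g_1^{-1}(y) = \sin^{-1}(y)$ and $g_2^{-1}(y) = \pi - \sin^{-1}(y)$.
It is $\frac{d}{dy} g_1^{-1}(y) = \frac{1}{\sqrt{1 - y^2}} = -\frac{d}{dy} g_2^{-1}(y)$ and $\mathcal{Y} = (0, 1)$.
Thus, by use of Theorem 2.4.7, we get

$$f_Y(y) = \sum_{i=1}^{2} f_X(g_i^{-1}(y)) \cdot \left|\frac{d}{dy} g_i^{-1}(y)\right| \cdot I_{\mathcal{Y}}(y) = \frac{2 \sin^{-1}(y)}{\pi^2} \frac{1}{\sqrt{1 - y^2}} I_{(0,1)}(y) + \frac{2(\pi - \sin^{-1}(y))}{\pi^2} \frac{1}{\sqrt{1 - y^2}} I_{(0,1)}(y)$$

$$= \frac{2\pi}{\pi^2} \frac{1}{\sqrt{1 - y^2}} I_{(0,1)}(y) = \frac{2}{\pi \sqrt{1 - y^2}} I_{(0,1)}(y)$$

Obviously, both results are identical.

Theorem 2.4.9: Let $X$ be a rv with a continuous cdf $F_X(x)$ and let $Y = F_X(X)$. Then, $Y \sim U(0, 1)$.

Proof: We have to consider two possible cases:
(a) $F_X$ is strictly increasing, i.e., $F_X(x_1) < F_X(x_2)$ for $x_1 < x_2$, and
(b) $F_X$ is non-decreasing, i.e., there exist $x_1 < x_2$ with $F_X(x_1) = F_X(x_2)$. Assume that $x_1$ is the infimum and $x_2$ the supremum of those values for which $F_X(x_1) = F_X(x_2)$ holds.

[Figure: Theorem 2.4.9, cases (a) and (b), $F_X(x)$ plotted against $x$.]

In (a), $F_X^{-1}(y)$ is uniquely defined. In (b), we define $F_X^{-1}(y) = \inf\{x : F_X(x) \ge y\}$.

Without loss of generality: $F_X^{-1}(1) = +\infty$ if $F_X(x) < 1 \ \forall x \in \mathbb{R}$, and $F_X^{-1}(0) = -\infty$ if $F_X(x) > 0 \ \forall x \in \mathbb{R}$.

For $Y = F_X(X)$ and $0 < y < 1$, we have

$$P(Y \le y) = P(F_X(X) \le y) \overset{F_X^{-1} \uparrow}{=} P(F_X^{-1}(F_X(X)) \le F_X^{-1}(y)) \overset{(*)}{=} P(X \le F_X^{-1}(y)) = F_X(F_X^{-1}(y)) = y$$

At the endpoints, we have $P(Y \le y) = 1$ if $y \ge 1$ and $P(Y \le y) = 0$ if $y \le 0$.

But why is $(*)$ true? In (a), if $F_X$ is strictly increasing and continuous, it is certainly $x = F_X^{-1}(F_X(x))$. In (b), if $F_X(x_1) = F_X(x_2)$ for $x_1 < x < x_2$, it may be that $F_X^{-1}(F_X(x)) \ne x$. But, by definition, $F_X^{-1}(F_X(x)) = x_1 \ \forall x \in [x_1, x_2]$. $(*)$ holds since on $[x_1, x_2]$ it is $P(X \le x) = P(X \le x_1) \ \forall x \in [x_1, x_2]$. The flat cdf denotes $F_X(x_2) - F_X(x_1) = P(x_1 < X \le x_2) = 0$ by definition.

Note: This proof also holds if there exist multiple intervals with $x_i < x_j$ and $F_X(x_i) = F_X(x_j)$, i.e., if the support of $X$ is split in more than just 2 disjoint intervals.

3 Moments and Generating Functions

3.1 Expectation

(Based on Casella/Berger, Sections 2.2 & 2.3, and Outside Material)

Definition 3.1.1: Let $X$ be a real-valued rv with cdf $F_X$ and pdf $f_X$ if $X$ is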
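continuous (or pmf $f_X$ and support $\mathcal{X}$ if $X$ is discrete). The expected value (mean) of a measurable function $g(\cdot)$ of $X$ is

$$E(g(X)) = \begin{cases} \int_{-\infty}^{\infty} g(x) f_X(x)\,dx, & \text{if } X \text{ is continuous} \\ \sum_{x \in \mathcal{X}} g(x) f_X(x), & \text{if } X \text{ is discrete} \end{cases}$$

if $E(|g(X)|) < \infty$; otherwise $E(g(X))$ is undefined, i.e., it does not exist.

Example: $X \sim$ Cauchy, $f_X(x) = \frac{1}{\pi(1 + x^2)}$, $-\infty < x < \infty$:

$$E(|X|) = \frac{2}{\pi} \int_0^{\infty} \frac{x}{1 + x^2}\,dx = \frac{1}{\pi} \left[\log(1 + x^2)\right]_0^{\infty} = \infty$$

So, $E(X)$ does not exist for the Cauchy distribution.

Theorem 3.1.2: If $E(X)$ exists and $a$ and $b$ are finite constants, then $E(aX + b)$ exists and equals $aE(X) + b$.

Proof: Continuous case only:
Existence:

$$E(|aX + b|) = \int_{-\infty}^{\infty} |ax + b| f_X(x)\,dx \le \int_{-\infty}^{\infty} (|a||x| + |b|) f_X(x)\,dx = |a| \int_{-\infty}^{\infty} |x| f_X(x)\,dx + |b| \int_{-\infty}^{\infty} f_X(x)\,dx = |a| E(|X|) + |b| < \infty$$

Numerical Result:

$$E(aX + b) = \int_{-\infty}^{\infty} (ax + b) f_X(x)\,dx = a \int_{-\infty}^{\infty} x f_X(x)\,dx + b \int_{-\infty}^{\infty} f_X(x)\,dx = aE(X) + b$$

Theorem 3.1.3: If $X$ is bounded, i.e., there exists an $M$, $0 < M < \infty$, such that $P(|X| < M) = 1$, then $E(X)$ exists.

Definition 3.1.4: The $k$th moment of $X$, if it exists, is $m_k = E(X^k)$. The $k$th central moment of $X$, if it exists, is $\mu_k = E((X - E(X))^k)$.

Definition 3.1.5: The variance of $X$, if it exists, is the second central moment of $X$, i.e., $Var(X) = E((X - E(X))^2)$.

Theorem 3.1.6: $Var(X) = E(X^2) - (E(X))^2$.

Proof:
$$Var(X) = E((X - E(X))^2) = E(X^2 - 2XE(X) + (E(X))^2) = E(X^2) - 2E(X)E(X) + (E(X))^2 = E(X^2) - (E(X))^2$$

Theorem 3.1.7: If $Var(X)$ exists and $a$ and $b$ are finite constants, then $Var(aX + b)$ exists and equals $a^2 Var(X)$.

Proof: Existence & Numerical Result:
$Var(aX + b) = E((aX + b - E(aX + b))^2)$ exists if $E(|aX + b - E(aX + b)|^2)$ exists. It holds that

$$E(|(aX + b) - E(aX + b)|^2) = E(((aX + b) - E(aX + b))^2) = Var(aX + b)$$

$$\overset{\text{Th. 3.1.6}}{=} E((aX + b)^2) - (E(aX + b))^2 \overset{\text{Th. 3.1.2}}{=} E(a^2 X^2 + 2abX + b^2) - (aE(X) + b)^2$$

$$= a^2 E(X^2) + 2abE(X) + b^2 - a^2 (E(X))^2 - 2abE(X) - b^2 = a^2 (E(X^2) - (E(X))^2) = a^2 Var(X) < \infty$$

since $Var(X)$ exists.

Theorem 3.1.8: If the $t$th moment of a rv $X$ exists, then all moments of order $0 < s < t$ exist.

Proof: Continuous case only:

$$E(|X|^s) = \int_{|x| \le 1} |x|^s f_X(x)\,dx + \int_{|x| > 1} |x|^s f_X(x)\,dx \le \int_{|x| \le 1} 1 \cdot f_X(x)\,dx + \int_{|x| > 1} |x|^t f_X(x)\,dx \le P(|X| \le 1) + E(|X|^t) < \infty$$

Theorem 3.1.9: If the $t$th moment of a rv $X$ exists, then $\lim_{n \to \infty} n^t P(|X| > n) = 0$.

Proof: Continuous case only:

$$\infty > \int_{\mathbb{R}} |x|^t f_X(x)\,dx = \lim_{n \to \infty} \int_{|x| \le n} |x|^t f_X(x)\,dx \quad \Rightarrow \quad \lim_{n \to \infty} \int_{|x| > n} |x|^t f_X(x)\,dx = 0$$

But,

$$\lim_{n \to \infty} \int_{|x| > n} |x|^t f_X(x)\,dx \ge \lim_{n \to \infty} n^t \int_{|x| > n} f_X(x)\,dx = \lim_{n \to \infty} n^t P(|X| > n),$$

so $\lim_{n \to \infty} n^t P(|X| > n) = 0$.

Note: The inverse is not necessarily true, i.e., if $\lim_{n \to \infty} n^t P(|X| > n) = 0$, then the $t$th moment of a rv $X$ does not necessarily exist. We can only approach $t$ up to some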
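$\delta > 0$, as the following Theorem 3.1.10 indicates.

Theorem 3.1.10: Let $X$ be a rv with a distribution such that $\lim_{n \to \infty} n^t P(|X| > n) = 0$ for some $t > 0$. Then, $E(|X|^s) < \infty \ \forall\, 0 < s < t$.

Note: To prove this Theorem, we need Lemma 3.1.11 and Corollary 3.1.12.

Lemma 3.1.11: Let $X$ be a non-negative rv with cdf $F$. Then,

$$E(X) = \int_0^{\infty} (1 - F_X(x))\,dx$$

(if either side exists).

Proof: Continuous case only:
To prove that the left side implies that the right side is finite and both sides are identical, we assume that $E(X)$ exists. It is

$$E(X) = \int_0^{\infty} x f_X(x)\,dx = \lim_{n \to \infty} \int_0^{n} x f_X(x)\,dx$$

Replace the expression for the right side integral using integration by parts. Let $u = x$ and $dv = f_X(x)\,dx$; then

$$\int_0^n x f_X(x)\,dx = \left[x F_X(x)\right]_0^n - \int_0^n F_X(x)\,dx = n F_X(n) - 0 \cdot F_X(0) - \int_0^n F_X(x)\,dx$$

$$= n F_X(n) - n + n - \int_0^n F_X(x)\,dx = n F_X(n) - n + \int_0^n [1 - F_X(x)]\,dx = n[F_X(n) - 1] + \int_0^n [1 - F_X(x)]\,dx$$

$$= -n[1 - F_X(n)] + \int_0^n [1 - F_X(x)]\,dx = -n P(X > n) + \int_0^n [1 - F_X(x)]\,dx \overset{X \ge 0}{=} -n^1 P(|X| > n) + \int_0^n [1 - F_X(x)]\,dx$$

Therefore,

$$E(X) = \lim_{n \to \infty} \left(-n^1 P(|X| > n) + \int_0^n [1 - F_X(x)]\,dx\right) \overset{\text{Th. 3.1.9}}{=} 0 + \lim_{n \to \infty} \int_0^n [1 - F_X(x)]\,dx = \int_0^{\infty} [1 - F_X(x)]\,dx$$

Thus, the existence of $E(X)$ implies that $\int_0^{\infty} [1 - F_X(x)]\,dx$ is finite and that both sides are identical.

We still have to show the converse implication: If $\int_0^{\infty} [1 - F_X(x)]\,dx$ is finite, then $E(X)$ exists, i.e., $E(|X|) = E(X) < \infty$, and both sides are identical. It is

$$\int_0^n |x| f_X(x)\,dx \overset{X \ge 0}{=} \int_0^n x f_X(x)\,dx = -n[1 - F_X(n)] + \int_0^n [1 - F_X(x)]\,dx$$

as seen above. Since $-n[1 - F_X(n)] \le 0$, we get

$$\int_0^n |x| f_X(x)\,dx \le \int_0^n [1 - F_X(x)]\,dx \le \int_0^{\infty} [1 - F_X(x)]\,dx < \infty \quad \forall n$$

Thus,

$$\lim_{n \to \infty} \int_0^n |x| f_X(x)\,dx = \int_0^{\infty} |x| f_X(x)\,dx \le \int_0^{\infty} [1 - F_X(x)]\,dx < \infty,$$

so $E(X)$ exists and is identical to $\int_0^{\infty} [1 - F_X(x)]\,dx$, as seen above.

Corollary 3.1.12: $E(|X|^s) = s \int_0^{\infty} y^{s-1} P(|X| > y)\,dy$

Proof:

$$E(|X|^s) \overset{\text{Lemma 3.1.11}}{=} \int_0^{\infty} [1 - F_{|X|^s}(z)]\,dz = \int_0^{\infty} P(|X|^s > z)\,dz$$

Let $z = y^s$. Then $\frac{dz}{dy} = s y^{s-1}$ and $dz = s y^{s-1}\,dy$. Therefore,

$$\int_0^{\infty} P(|X|^s > z)\,dz = \int_0^{\infty} P(|X|^s > y^s)\, s y^{s-1}\,dy = s \int_0^{\infty} y^{s-1} P(|X|^s > y^s)\,dy \overset{\text{monotonic } \uparrow}{=} s \int_0^{\infty} y^{s-1} P(|X| > y)\,dy$$

Proof (of Theorem 3.1.10):
For any given $\epsilon > 0$, choose $N$ such that the tail probability $P(|X| > n) < \frac{\epsilon}{n^t} \ \forall n \ge N$. Then

$$E(|X|^s) \overset{\text{Cor. 3.1.12}}{=} s \int_0^{\infty} y^{s-1} P(|X| > y)\,dy = s \int_0^{N} y^{s-1} P(|X| > y)\,dy + s \int_N^{\infty} y^{s-1} P(|X| > y)\,dy$$

$$\le \int_0^N s y^{s-1} \cdot 1\,dy + s \int_N^{\infty} y^{s-1} \frac{\epsilon}{y^t}\,dy = \left[y^s\right]_0^N + s\epsilon \int_N^{\infty} y^{s-1-t}\,dy = N^s + s\epsilon \int_N^{\infty} y^{s-1-t}\,dy$$

It is

$$\int_N^{\infty} y^c\,dy = \begin{cases} \left[\frac{1}{c+1} y^{c+1}\right]_N^{\infty}, & c \ne -1 \\ \left[\ln y\right]_N^{\infty}, & c = -1 \end{cases} = \begin{cases} \infty, & c \ge -1 \\ -\frac{1}{c+1} N^{c+1} < \infty, & c < -1 \end{cases}$$

Thus, for $E(|X|^s) < \infty$, it must hold that $s - 1 - t < -1$, or equivalently, $s < t$. So $E(|X|^s) < \infty$, i.e., it exists, for every $s$ with $0 < s <$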
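$t$ for a rv $X$ with a distribution such that $\lim_{n \to \infty} n^t P(|X| > n) = 0$ for some $t > 0$.

Theorem 3.1.13: Let $X$ be a rv such that

$$\lim_{k \to \infty} \frac{P(|X| > \alpha k)}{P(|X| > k)} = 0 \quad \forall \alpha > 1$$

Then, all moments of $X$ exist.

Proof:
• For $\epsilon > 0$, we select some $k_0$ such that $\frac{P(|X| > \alpha k)}{P(|X| > k)} < \epsilon \ \forall k \ge k_0$.
• Select $k_1$ such that $P(|X| > k) < \epsilon \ \forall k \ge k_1$.
• Select $N = \max(k_0, k_1)$.
• If we have some fixed positive integer $r$:

$$\frac{P(|X| > \alpha^r k)}{P(|X| > k)} = \frac{P(|X| > \alpha k)}{P(|X| > k)} \cdot \frac{P(|X| > \alpha^2 k)}{P(|X| > \alpha k)} \cdot \frac{P(|X| > \alpha^3 k)}{P(|X| > \alpha^2 k)} \cdot \ldots \cdot \frac{P(|X| > \alpha^r k)}{P(|X| > \alpha^{r-1} k)}$$

$$= \frac{P(|X| > \alpha k)}{P(|X| > k)} \cdot \frac{P(|X| > \alpha \cdot (\alpha k))}{P(|X| > 1 \cdot (\alpha k))} \cdot \frac{P(|X| > \alpha \cdot (\alpha^2 k))}{P(|X| > 1 \cdot (\alpha^2 k))} \cdot \ldots \cdot \frac{P(|X| > \alpha \cdot (\alpha^{r-1} k))}{P(|X| > 1 \cdot (\alpha^{r-1} k))}$$

• Note: Each of these $r$ terms on the right side is $< \epsilon$ by our original statement of selecting some $k_0$ such that $\frac{P(|X| > \alpha k)}{P(|X| > k)} < \epsilon \ \forall k \ge k_0$, and since $\alpha > 1$ and therefore $\alpha^n k \ge k_0$.
• Now we get for our entire expression that $\frac{P(|X| > \alpha^r k)}{P(|X| > k)} \le \epsilon^r$ for $k \ge N$ (since in this case also $k \ge k_0$) and $\alpha > 1$.
• Overall, we have $P(|X| > \alpha^r k) \le \epsilon^r P(|X| > k) \le \epsilon^{r+1}$ for $k \ge N$ (since in this case also $k \ge k_1$).
• For a fixed positive integer $n$:

$$E(|X|^n) \overset{\text{Cor. 3.1.12}}{=} n \int_0^{\infty} x^{n-1} P(|X| > x)\,dx = n \int_0^{N} x^{n-1} P(|X| > x)\,dx + n \int_N^{\infty} x^{n-1} P(|X| > x)\,dx$$

• We know that

$$n \int_0^N x^{n-1} P(|X| > x)\,dx \le \int_0^N n x^{n-1}\,dx = \left[x^n\right]_0^N = N^n < \infty,$$

but is $n \int_N^{\infty} x^{n-1} P(|X| > x)\,dx < \infty$?
• To check the second part, we use

$$\int_N^{\infty} x^{n-1} P(|X| > x)\,dx = \sum_{r=1}^{\infty} \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1} P(|X| > x)\,dx$$

• We know that

$$\int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1} P(|X| > x)\,dx \le \epsilon^r \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1}\,dx$$

This step is possible since $\epsilon^r \ge P(|X| > \alpha^{r-1} N) \ge P(|X| > x) \ge P(|X| > \alpha^r N) \ \forall x \in (\alpha^{r-1} N, \alpha^r N)$ and $N = \max(k_0, k_1)$.
• Since $(\alpha^{r-1} N)^{n-1} \le x^{n-1} \le (\alpha^r N)^{n-1} \ \forall x \in (\alpha^{r-1} N, \alpha^r N)$, we get

$$\epsilon^r \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1}\,dx \le \epsilon^r (\alpha^r N)^{n-1} \int_{\alpha^{r-1} N}^{\alpha^r N} 1\,dx \le \epsilon^r (\alpha^r N)^{n-1} (\alpha^r N) = \epsilon^r (\alpha^r N)^n$$

• Now we go back to our original inequality:

$$\int_N^{\infty} x^{n-1} P(|X| > x)\,dx \le \sum_{r=1}^{\infty} \epsilon^r \int_{\alpha^{r-1} N}^{\alpha^r N} x^{n-1}\,dx \le \sum_{r=1}^{\infty} \epsilon^r (\alpha^r N)^n = N^n \sum_{r=1}^{\infty} (\epsilon \alpha^n)^r = \frac{N^n \epsilon \alpha^n}{1 - \epsilon \alpha^n}$$

if $\epsilon \alpha^n < 1$ or, equivalently, if $\epsilon < \frac{1}{\alpha^n}$.
• Since $\frac{N^n \epsilon \alpha^n}{1 - \epsilon \alpha^n}$ is finite, all moments $E(|X|^n)$ exist.

3.2 Generating Functions

(Based on Casella/Berger, Sections 2.3 & 2.4)

Definition 3.2.1: Let $X$ be a rv with cdf $F_X$. The moment generating function (mgf) of $X$ is defined as $M_X(t) = E(e^{tX})$, provided that this expectation exists for $t$ in an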
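open interval around 0, i.e., for $-h < t < h$ for some $h > 0$.

Theorem 3.2.2: If a rv $X$ has a mgf $M_X(t)$ that exists for $-h < t < h$ for some $h > 0$, then

$$E(X^n) = M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t) \Big|_{t=0}$$

Proof: We assume that we can differentiate under the integral sign. If, and when, this really is true will be discussed later in this section.

$$\frac{d}{dt} M_X(t) = \frac{d}{dt} \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx = \int_{-\infty}^{\infty} \left(\frac{\partial}{\partial t} e^{tx} f_X(x)\right) dx = \int_{-\infty}^{\infty} x e^{tx} f_X(x)\,dx = E(X e^{tX})$$

Evaluating this at $t = 0$, we get $\frac{d}{dt} M_X(t) \big|_{t=0} = E(X)$.

By induction, we get for $n \ge 2$:

$$\frac{d^n}{dt^n} M_X(t) = \frac{d}{dt} \left(\frac{d^{n-1}}{dt^{n-1}} M_X(t)\right) = \frac{d}{dt} \left(\int_{-\infty}^{\infty} x^{n-1} e^{tx} f_X(x)\,dx\right) = \int_{-\infty}^{\infty} \left(\frac{\partial}{\partial t} x^{n-1} e^{tx} f_X(x)\right) dx = \int_{-\infty}^{\infty} x^n e^{tx} f_X(x)\,dx = E(X^n e^{tX})$$

Evaluating this at $t = 0$, we get $\frac{d^n}{dt^n} M_X(t) \big|_{t=0} = E(X^n)$.

Note: We use the notation $\frac{\partial}{\partial t} f(x, t)$ for the partial derivative of $f$ with respect to $t$ and the notation $\frac{d}{dt} f(t)$ for the ordinary derivative of $f$ with respect to $t$.

Example 3.2.3: $X \sim U(a, b)$, where $a < b$; $f_X(x) = \frac{1}{b - a} I_{[a,b]}(x)$. Then,

$$M_X(t) = \int_a^b \frac{e^{tx}}{b - a}\,dx = \frac{e^{tb} - e^{ta}}{t(b - a)}$$

$$M_X(0) = \frac{0}{0} \overset{\text{L'Hospital}}{=} \frac{b e^{tb} - a e^{ta}}{b - a} \Big|_{t=0} = 1$$

So $M_X(0) = 1$ and, since $\frac{e^{tb} - e^{ta}}{t(b - a)}$ is continuous, it also exists in an open interval around 0 (in fact, it exists for every $t \in \mathbb{R}$).

$$M_X'(t) = \frac{(b e^{tb} - a e^{ta})\, t(b - a) - (e^{tb} - e^{ta})(b - a)}{t^2 (b - a)^2} = \frac{t(b e^{tb} - a e^{ta}) - (e^{tb} - e^{ta})}{t^2 (b - a)}$$

Therefore,

$$E(X) = M_X'(0) = \frac{0}{0} \overset{\text{L'Hospital}}{=} \frac{b e^{tb} - a e^{ta} + t b^2 e^{tb} - t a^2 e^{ta} - b e^{tb} + a e^{ta}}{2t(b - a)} \Big|_{t=0} = \frac{t b^2 e^{tb} - t a^2 e^{ta}}{2t(b - a)} \Big|_{t=0} = \frac{b^2 e^{tb} - a^2 e^{ta}}{2(b - a)} \Big|_{t=0} = \frac{b^2 - a^2}{2(b - a)} = \frac{b + a}{2}$$

Note: In the previous example, we made use of L'Hospital's rule. This rule gives conditions under which we can resolve indefinite expressions of the type "$\frac{\pm 0}{\pm 0}$" and "$\frac{\pm \infty}{\pm \infty}$".

(i) Let $f$ and $g$ be functions that are differentiable in an open interval around $x_0$, say in $(x_0 - \delta, x_0 + \delta)$, but not necessarily differentiable in $x_0$. Let $f(x_0) = g(x_0) = 0$ and $g'(x) \ne 0 \ \forall x \in (x_0 - \delta, x_0 + \delta) - \{x_0\}$. Then, $\lim_{x \to x_0} \frac{f'(x)}{g'(x)} = A$ implies that also $\lim_{x \to x_0} \frac{f(x)}{g(x)} = A$. The same holds for the cases $\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = \infty$ and $x \to x_0^+$ or $x \to x_0^-$.

(ii) Let $f$ and $g$ be functions that are differentiable for $x > a$ ($a > 0$). Let $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$ and $g'(x) \ne 0$ for $x > a$. Then, $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = A$ implies that also $\lim_{x \to \infty} \frac{f(x)}{g(x)} = A$.

(iii) We can iterate this process as long as the required conditions are met and derivatives exist, e.g., if the first derivatives still result in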
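an indefinite expression, we can look at the second derivatives, then at the third derivatives, and so on.

(iv) It is recommended to keep expressions as simple as possible. If we have identical factors in the numerator and denominator, we can exclude them from both and continue with the simpler functions.

(v) Indefinite expressions of the form "$0 \cdot \infty$" can be handled by rearranging them to "$\frac{0}{1/\infty}$", and $\lim_{x \to -\infty} \frac{f(x)}{g(x)}$ can be handled by use of the rules for $\lim_{x \to \infty} \frac{f(-x)}{g(-x)}$.

Note: The following Theorems provide us with rules that tell us when we can differentiate under the integral sign. Theorem 3.2.4 relates to finite integral bounds $a(\theta)$ and $b(\theta)$, and Theorems 3.2.5 and 3.2.6 to infinite bounds.

Theorem 3.2.4: Leibnitz's Rule
If $f(x, \theta)$, $a(\theta)$, and $b(\theta)$ are differentiable with respect to $\theta$ (for all $x$) and $-\infty < a(\theta) < b(\theta) < \infty$, then
$$\frac{d}{d\theta} \int_{a(\theta)}^{b(\theta)} f(x, \theta)\,dx = f(b(\theta), \theta) \frac{d}{d\theta} b(\theta) - f(a(\theta), \theta) \frac{d}{d\theta} a(\theta) + \int_{a(\theta)}^{b(\theta)} \frac{\partial}{\partial \theta} f(x, \theta)\,dx.$$
The first 2 terms are vanishing if $a(\theta)$ and $b(\theta)$ are constant in $\theta$.

Proof: Uses the Fundamental Theorem of Calculus and the chain rule.

Theorem 3.2.5: Lebesgue's Dominated Convergence Theorem
Let $g$ be an integrable function such that $\int_{-\infty}^{\infty} g(x)\,dx < \infty$. If $|f_n| \le g$ almost everywhere (i.e., except for a set of Borel-measure 0) and if $f_n \to f$ almost everywhere, then $f_n$ and $f$ are integrable and
$$\int f_n(x)\,dx \to \int f(x)\,dx.$$

Note: If $f$ is differentiable with respect to $\theta$, then
$$\frac{\partial}{\partial \theta} f(x, \theta) = \lim_{\delta \to 0} \frac{f(x, \theta + \delta) - f(x, \theta)}{\delta}$$
and
$$\int_{-\infty}^{\infty} \frac{\partial}{\partial \theta} f(x, \theta)\,dx = \int_{-\infty}^{\infty} \lim_{\delta \to 0} \frac{f(x, \theta + \delta) - f(x, \theta)}{\delta}\,dx$$
while
$$\frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx = \lim_{\delta \to 0} \int_{-\infty}^{\infty} \frac{f(x, \theta + \delta) - f(x, \theta)}{\delta}\,dx.$$

Theorem 3.2.6: Let $f_n(x, \theta_0) = \frac{f(x, \theta_0 + \delta_n) - f(x, \theta_0)}{\delta_n}$ for some $\theta_0$. Suppose there exists an integrable function $g(x)$ such that $\int_{-\infty}^{\infty} g(x)\,dx < \infty$ and $|f_n(x, \theta)| \le g(x) \ \forall x$, then
$$\left[ \frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx \right]_{\theta = \theta_0} = \int_{-\infty}^{\infty} \left[ \frac{\partial}{\partial \theta} f(x, \theta) \Big|_{\theta = \theta_0} \right] dx.$$
Usually, if $f$ is differentiable for all $\theta$, we write
$$\frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx = \int_{-\infty}^{\infty} \frac{\partial}{\partial \theta} f(x, \theta)\,dx.$$

Corollary 3.2.7: Let $f(x, \theta)$ be differentiable for all $\theta$. Suppose there exists an integrable function $g(x, \theta)$ such that $\int_{-\infty}^{\infty} g(x, \theta)\,dx < \infty$ and $\left| \frac{\partial}{\partial \theta} f(x, \theta) \big|_{\theta = \theta_0} \right| \le g(x, \theta) \ \forall x \ \forall \theta_0$ in some $\epsilon$-neighborhood of $\theta$, then
$$\frac{d}{d\theta} \int_{-\infty}^{\infty} f(x, \theta)\,dx = \int_{-\infty}^{\infty} \frac{\partial}{\partial \theta} f(x, \theta)\,dx.$$

More on Moment Generating Functions

Consider
$$\left| \frac{\partial}{\partial t} e^{tx} f_X(x) \Big|_{t = t'} \right| = |x| e^{t'x} f_X(x) \quad \text{for } |t' - t| \le \delta_0.$$
Choose $t$, $\delta_0$ small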
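enough such that $t + \delta_0 \in (-h, h)$ and $t - \delta_0 \in (-h, h)$, or, equivalently, $|t + \delta_0| < h$ and $|t - \delta_0| < h$.

[Figure (Section 3.2, MGF): number line showing $t - \delta_0$, $t$, and $t + \delta_0$ inside the interval $(-h, h)$.]

Then,
$$\left| \frac{\partial}{\partial t} e^{tx} f_X(x) \Big|_{t = t'} \right| \le g(x, t), \quad \text{where} \quad g(x, t) = \begin{cases} |x| e^{(t + \delta_0)x} f_X(x), & x \ge 0 \\ |x| e^{(t - \delta_0)x} f_X(x), & x < 0 \end{cases}$$

To verify $\int g(x, t)\,dx < \infty$, we need to know $f_X(x)$.

Suppose the mgf $M_X(t)$ exists for $|t| \le h^*$ for some $h^* > 1$, where $h^* - 1 \ge h$. Then $|t + \delta_0 + 1| < h^*$ and $|t - \delta_0 - 1| < h^*$. Since $|x| \le e^{|x|} \ \forall x$, we get
$$g(x, t) \le \begin{cases} e^{(t + \delta_0 + 1)x} f_X(x), & x \ge 0 \\ e^{(t - \delta_0 - 1)x} f_X(x), & x < 0 \end{cases}$$
Then $\int_0^\infty g(x, t)\,dx \le M_X(t + \delta_0 + 1) < \infty$ and $\int_{-\infty}^0 g(x, t)\,dx \le M_X(t - \delta_0 - 1) < \infty$, and therefore $\int_{-\infty}^{\infty} g(x, t)\,dx < \infty$.

Together with Corollary 3.2.7, this establishes that we can differentiate under the integral in the Proof of Theorem 3.2.2.

If $h^* \le 1$, we may need to check more carefully to see if the condition holds.

Note: If $M_X(t)$ exists for $t \in (-h, h)$, then we have an infinite collection of moments. Does a collection of integer moments $\{m_k : k = 1, 2, 3, \ldots\}$ completely characterize the distribution, i.e., the cdf, of $X$? Unfortunately not, as Example 3.2.8 shows.

Example 3.2.8: Let $X_1$ and $X_2$ be rv's with pdfs
$$f_{X_1}(x) = \frac{1}{\sqrt{2\pi}\,x} \exp\left( -\frac{1}{2} (\log x)^2 \right) I_{(0, \infty)}(x)$$
and
$$f_{X_2}(x) = f_{X_1}(x) \cdot (1 + \sin(2\pi \log x)) \cdot I_{(0, \infty)}(x).$$
It is $E(X_1^r) = E(X_2^r) = e^{r^2/2}$ for $r = 0, 1, 2, \ldots$ as you have to show in the Homework.

Two different pdfs/cdfs have the same moment sequence! What went wrong? In this example, $M_{X_1}(t)$ does not exist, as shown in the Homework!

Theorem 3.2.9: Let $X$ and $Y$ be 2 rv's with cdf's $F_X$ and $F_Y$ for which all moments exist.

(i) If $F_X$ and $F_Y$ have bounded support, then $F_X(u) = F_Y(u) \ \forall u$ iff $E(X^r) = E(Y^r)$ for $r = 0, 1, 2, \ldots$

(ii) If both mgf's exist, i.e., $M_X(t) = M_Y(t)$ for $t$ in some neighborhood of 0, then $F_X(u) = F_Y(u) \ \forall u$.

Note: The existence of moments is not equivalent to the existence of a mgf, as seen in Example 3.2.8 above and some of the Homework assignments.

Theorem 3.2.10: Suppose rv's $\{X_i\}_{i=1}^{\infty}$ have mgf's $M_{X_i}(t)$ and that $\lim_{i \to \infty} M_{X_i}(t) = M_X(t) \ \forall t \in (-h, h)$ for some $h > 0$ and that $M_X(t)$ itself is a mgf. Then there exists a cdf $F_X$ whose moments are determined by $M_X(t)$, and for all continuity points $x$ of $F_X(x)$ it holds that $\lim_{i \to \infty} F_{X_i}(x) = F_X(x)$, i.e., the convergence of mgf's implies the convergence of cdf's.

Proof: Uniqueness of Laplace transformations, etc.

Theorem 3.2.11: For constants $a$ and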
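$b$, the mgf of $Y = aX + b$ is
$$M_Y(t) = e^{bt} M_X(at),$$
given that $M_X(t)$ exists.

Proof:
$$M_Y(t) = E(e^{(aX + b)t}) = E(e^{aXt} e^{bt}) = e^{bt} E(e^{Xat}) = e^{bt} M_X(at)$$

3.3 Complex-Valued Random Variables and Characteristic Functions

Based on Casella/Berger, Section 2.6, and Outside Material

Recall the following facts regarding complex numbers:

[Figure (Section 3.3, Complex Numbers): the point $a + ib$ in the complex plane, with a real and an imaginary axis.]

$i^0 = +1$; $i = \sqrt{-1}$; $i^2 = -1$; $i^3 = -i$; $i^4 = +1$; etc.

In the planar Gauss'ian number plane it holds that $i = (0, 1)$ and
$$z = a + ib = r(\cos \phi + i \sin \phi), \quad r = |z| = \sqrt{a^2 + b^2}, \quad \tan \phi = \frac{b}{a}.$$

Euler's Relation: $z = r(\cos \phi + i \sin \phi) = r e^{i\phi}$

Mathematical Operations on Complex Numbers:

$$z_1 \pm z_2 = (a_1 \pm a_2) + i(b_1 \pm b_2)$$
$$z_1 \cdot z_2 = r_1 r_2 e^{i(\phi_1 + \phi_2)} = r_1 r_2 (\cos(\phi_1 + \phi_2) + i \sin(\phi_1 + \phi_2))$$
$$\frac{z_1}{z_2} = \frac{r_1}{r_2} e^{i(\phi_1 - \phi_2)} = \frac{r_1}{r_2} (\cos(\phi_1 - \phi_2) + i \sin(\phi_1 - \phi_2))$$

Moivre's Theorem: $z^n = (r(\cos \phi + i \sin \phi))^n = r^n (\cos(n\phi) + i \sin(n\phi))$

$$\sqrt[n]{z} = \sqrt[n]{r} \left( \cos\left( \frac{\phi + k \cdot 2\pi}{n} \right) + i \sin\left( \frac{\phi + k \cdot 2\pi}{n} \right) \right) \quad \text{for } k = 0, 1, \ldots, (n-1),$$
and the main value is obtained for $k = 0$.

$$\ln z = \ln(a + ib) = \ln(|z|) + i\phi \pm ik \cdot 2\pi, \quad \text{where } \phi = \arctan\frac{b}{a}, \ k = 0, \pm 1, \pm 2, \ldots,$$
and the main value is obtained for $k = 0$.

Note: Similar to real numbers, where we define $\sqrt{4} = 2$ while it holds that $2^2 = 4$ and $(-2)^2 = 4$, the $n$th root and also the logarithm of complex numbers have one main value. However, if we read $n$th root and logarithm as mappings where the inverse mappings (power and exponential function) yield the original values again, there exist additional solutions that produce the original values.

For example, the main value of $\sqrt{-1}$ is $i$. However, it holds that $i^2 = -1$ and $(-i)^2 = (-1)^2 i^2 = i^2 = -1$. So, all solutions to $\sqrt{-1}$ are $\{i, -i\}$.

Conjugate Complex Numbers: For $z = a + ib$, we define the conjugate complex number $\bar{z} = a - ib$. It holds:
$$\bar{\bar{z}} = z; \qquad z = \bar{z} \text{ iff } z \in \mathbb{R}; \qquad \overline{z_1 \pm z_2} = \bar{z}_1 \pm \bar{z}_2; \qquad \overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$$

Definition 3.3.1: Let $(\Omega, L, P)$ be a probability space and $X$ and $Y$ real-valued rv's, i.e., $X, Y : (\Omega, L) \to (\mathbb{R}, \mathcal{B})$.

(i) $Z = X + iY : (\Omega, L) \to (\mathbb{C}, \mathcal{B}_{\mathbb{C}})$ is called a complex-valued random variable ($\mathbb{C}$-rv).

(ii) If $E(X)$ and $E(Y)$ exist, then $E(Z)$ is defined as $E(Z) = E(X) + iE(Y) \in \mathbb{C}$.

Note: $E(Z)$ exists iff $E(|X|)$ and $E(|Y|)$ exist. It also holds that if $E(Z)$ exists, then $|E(Z)| \le E(|Z|)$ (see Homework).

Definition 3.3.2: Let $X$ be a real-valued rv on $(\Omega, L, P)$. Then, $\Phi_X(t) : \mathbb{R} \to \mathbb{C}$ with $\Phi_X(t) = E(e^{itX})$ is called the characteristic function of $X$.

Note:

(i) $\Phi_X(t) = \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx = \int_{-\infty}^{\infty} \cos(tx) f_X(x)\,dx + i \int_{-\infty}^{\infty} \sin(tx) f_X(x)\,dx$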
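if $X$ is continuous.

(ii) $\Phi_X(t) = \sum_{x \in \mathcal{X}} e^{itx} P(X = x) = \sum_{x \in \mathcal{X}} \cos(tx) P(X = x) + i \sum_{x \in \mathcal{X}} \sin(tx) P(X = x)$ if $X$ is discrete and $\mathcal{X}$ is the support of $X$.

(iii) $\Phi_X(t)$ exists for all real-valued rv's $X$ since $|e^{itx}| = 1$.

Theorem 3.3.3: Let $\Phi_X$ be the characteristic function of a real-valued rv $X$. Then it holds:

(i) $\Phi_X(0) = 1$.

(ii) $|\Phi_X(t)| \le 1 \ \forall t \in \mathbb{R}$.

(iii) $\Phi_X$ is uniformly continuous, i.e., $\forall \epsilon > 0 \ \exists \delta > 0 \ \forall t_1, t_2 \in \mathbb{R} : |t_1 - t_2| < \delta \Rightarrow |\Phi(t_1) - \Phi(t_2)| < \epsilon$.

(iv) $\Phi_X$ is a positive definite function, i.e., $\forall n \in \mathbb{N} \ \forall \alpha_1, \ldots, \alpha_n \in \mathbb{C} \ \forall t_1, \ldots, t_n \in \mathbb{R}$:
$$\sum_{l=1}^{n} \sum_{j=1}^{n} \alpha_l \bar{\alpha}_j \Phi_X(t_l - t_j) \ge 0.$$

(v) $\Phi_X(t) = \overline{\Phi_X(-t)}$.

(vi) If $X$ is symmetric around 0, i.e., if $X$ has a pdf that is symmetric around 0, then $\Phi_X(t) \in \mathbb{R} \ \forall t \in \mathbb{R}$.

(vii) $\Phi_{aX + b}(t) = e^{itb} \Phi_X(at)$.

Proof: See Homework for parts (i), (ii), (iv), (v), (vi), and (vii).

Part (iii): Known conditions:

(i) Let $\epsilon > 0$.

(ii) $\exists a > 0 : P(-a < X < +a) > 1 - \frac{\epsilon}{4}$ and $P(|X| \ge a) \le \frac{\epsilon}{4}$.

(iii) $\exists \delta > 0 : |e^{i(t' - t)x} - 1| < \frac{\epsilon}{2} \ \forall x$ s.t. $|x| < a$ and $\forall (t' - t)$ s.t. $0 < (t' - t) < \delta$.

This third condition holds since $|e^{i0} - 1| = 0$ and the exponential function is continuous. Therefore, if we select $(t' - t)$ and $x$ small enough, $|e^{i(t' - t)x} - 1|$ will be $< \frac{\epsilon}{2}$ for a given $\epsilon$.

Let $t, t' \in \mathbb{R}$, $t < t'$, and $t' - t < \delta$. Then
$$\begin{aligned}
|\Phi_X(t') - \Phi_X(t)|
&= \left| \int_{-\infty}^{\infty} e^{it'x} f_X(x)\,dx - \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx \right| \\
&= \left| \int_{-\infty}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \\
&= \left| \int_{-\infty}^{-a} (e^{it'x} - e^{itx}) f_X(x)\,dx + \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx + \int_{a}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \\
&\le \left| \int_{-\infty}^{-a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| + \left| \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| + \left| \int_{a}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right|
\end{aligned}$$

We now take a closer look at the first and third of these absolute integrals. It is
$$\begin{aligned}
\left| \int_{-\infty}^{-a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right|
&= \left| \int_{-\infty}^{-a} e^{it'x} f_X(x)\,dx - \int_{-\infty}^{-a} e^{itx} f_X(x)\,dx \right| \\
&\le \left| \int_{-\infty}^{-a} e^{it'x} f_X(x)\,dx \right| + \left| \int_{-\infty}^{-a} e^{itx} f_X(x)\,dx \right| \\
&\le \int_{-\infty}^{-a} |e^{it'x}| f_X(x)\,dx + \int_{-\infty}^{-a} |e^{itx}| f_X(x)\,dx \\
&\overset{(A)}{=} \int_{-\infty}^{-a} 1 f_X(x)\,dx + \int_{-\infty}^{-a} 1 f_X(x)\,dx = \int_{-\infty}^{-a} 2 f_X(x)\,dx.
\end{aligned}$$
(A) holds due to Note (iii) that follows Definition 3.3.2.

Similarly,
$$\left| \int_{a}^{\infty} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \le \int_{a}^{\infty} 2 f_X(x)\,dx.$$

Returning to the main part of the proof, we get
$$\begin{aligned}
|\Phi_X(t') - \Phi_X(t)|
&\le \int_{-\infty}^{-a} 2 f_X(x)\,dx + \left| \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| + \int_{a}^{\infty} 2 f_X(x)\,dx \\
&= 2 \left( \int_{-\infty}^{-a} f_X(x)\,dx + \int_{a}^{\infty} f_X(x)\,dx \right) + \left| \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \\
&= 2 P(|X| \ge a) + \left| \int_{-a}^{a} (e^{it'x} - e^{itx}) f_X(x)\,dx \right| \\
&\overset{\text{Condition (ii)}}{\le} 2 \cdot \frac{\epsilon}{4} + \left| \int_{-a}^{a} e^{itx} (e^{i(t' - t)x} - 1) f_X(x)\,dx \right| \\
&\le \frac{\epsilon}{2} + \int_{-a}^{a} |e^{itx}| \cdot |e^{i(t' - t)x} - 1| f_X(x)\,dx \\
&\overset{(B)}{\le} \frac{\epsilon}{2} + \int_{-a}^{a} 1 \cdot \frac{\epsilon}{2} f_X(x)\,dx \\
&\le \frac{\epsilon}{2} + \frac{\epsilon}{2} \int_{-\infty}^{\infty} f_X(x)\,dx = \epsilon.
\end{aligned}$$
(B) holds due to Note (iii) that follows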
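Definition 3.3.2 and due to condition (iii).

Theorem 3.3.4: Bochner's Theorem
Let $\Phi : \mathbb{R} \to \mathbb{C}$ be any function with properties (i), (ii), (iii), and (iv) from Theorem 3.3.3. Then there exists a real-valued rv $X$ with $\Phi_X = \Phi$.

Theorem 3.3.5: Let $X$ be a real-valued rv and $E(X^k)$ exists for an integer $k$. Then, $\Phi_X$ is $k$ times differentiable and $\Phi_X^{(k)}(t) = i^k E(X^k e^{itX})$. In particular, for $t = 0$, it is $\Phi_X^{(k)}(0) = i^k m_k$.

Theorem 3.3.6: Let $X$ be a real-valued rv with characteristic function $\Phi_X$ and let $\Phi_X$ be $k$ times differentiable, where $k$ is an even integer. Then the $k$th moment of $X$, $m_k$, exists and it is $\Phi_X^{(k)}(0) = i^k m_k$.

Theorem 3.3.7: Levy's Theorem
Let $X$ be a real-valued rv with cdf $F_X$ and characteristic function $\Phi_X$. Let $a, b \in \mathbb{R}$, $a < b$. If $P(X = a) = P(X = b) = 0$, i.e., $F_X$ is continuous in $a$ and $b$, then
$$F(b) - F(a) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-ita} - e^{-itb}}{it} \Phi_X(t)\,dt.$$

Theorem 3.3.8: Let $X$ and $Y$ be real-valued rv's with characteristic functions $\Phi_X$ and $\Phi_Y$. If $\Phi_X = \Phi_Y$, then $X$ and $Y$ are identically distributed.

Theorem 3.3.9: Let $X$ be a real-valued rv with characteristic function $\Phi_X$ such that $\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt < \infty$. Then $X$ has pdf
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \Phi_X(t)\,dt.$$

Theorem 3.3.10: Let $X$ be a real-valued rv with mgf $M_X(t)$, i.e., the mgf exists. Then $\Phi_X(t) = M_X(it)$.

Theorem 3.3.11: Suppose real-valued rv's $\{X_i\}_{i=1}^{\infty}$ have cdf's $\{F_{X_i}\}_{i=1}^{\infty}$ and characteristic functions $\{\Phi_{X_i}(t)\}_{i=1}^{\infty}$. If $\lim_{i \to \infty} \Phi_{X_i}(t) = \Phi_X(t) \ \forall t \in (-h, h)$ for some $h > 0$ and $\Phi_X(t)$ is itself a characteristic function (of a rv $X$ with cdf $F_X$), then $\lim_{i \to \infty} F_{X_i}(x) = F_X(x)$ for all continuity points $x$ of $F_X(x)$, i.e., the convergence of characteristic functions implies the convergence of cdf's.

Theorem 3.3.12: Characteristic functions for some well-known distributions:

(i) $X \sim \text{Dirac}(c)$: $\Phi_X(t) = e^{itc}$
(ii) $X \sim \text{Bin}(1, p)$: $\Phi_X(t) = 1 + p(e^{it} - 1)$
(iii) $X \sim \text{Poisson}(c)$: $\Phi_X(t) = \exp(c(e^{it} - 1))$
(iv) $X \sim U(a, b)$: $\Phi_X(t) = \frac{e^{itb} - e^{ita}}{(b - a)it}$
(v) $X \sim N(0, 1)$: $\Phi_X(t) = \exp(-t^2/2)$
(vi) $X \sim N(\mu, \sigma^2)$: $\Phi_X(t) = e^{it\mu} \exp(-\sigma^2 t^2 / 2)$
(vii) $X \sim \Gamma(p, q)$: $\Phi_X(t) = (1 - \frac{it}{q})^{-p}$
(viii) $X \sim \text{Exp}(c)$: $\Phi_X(t) = (1 - \frac{it}{c})^{-1}$
(ix) $X \sim \chi^2_n$: $\Phi_X(t) = (1 - 2it)^{-n/2}$

Proof:

(i) $\Phi_X(t) = E(e^{itX}) = e^{itc} P(X = c) = e^{itc}$

(ii) $\Phi_X(t) = \sum_{k=0}^{1} e^{itk} P(X = k) = e^{it0}(1 - p) + e^{it1} p = 1 + p(e^{it} - 1)$

(iii) $\Phi_X(t) = \sum_{n \in \mathbb{N}_0} e^{itn} \frac{c^n}{n!} e^{-c} = e^{-c} \sum_{n=0}^{\infty} \frac{1}{n!} (c\,e^{it})^n = e^{-c} e^{c\,e^{it}} = e^{c(e^{it} - 1)}$, since $\sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x$.

(iv) $\Phi_X(t) = \frac{1}{b - a} \int_a^b e^{itx}\,dx = \frac{1}{b - a} \left[ \frac{e^{itx}}{it} \right]_a^b = \frac{e^{itb} - e^{ita}}{(b - a)it}$

(v) $X \sim N(0, 1)$ is symmetric around 0 $\Rightarrow$ $\Phi_X(t)$ is real since there is no imaginary part according to Theorem 3.3.3 (vi)
$$\Rightarrow \Phi_X(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{itx} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(tx)\,e^{-x^2/2}\,dx$$
Since $E(X)$ exists, $\Phi_X(t)$ is differentiable according to Theorem 3.3.5 and the following holds:
$$\begin{aligned}
\Phi_X'(t) = \text{Re}(\Phi_X'(t))
&= \text{Re}\left( \int_{-\infty}^{\infty} ix\,e^{itx} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \right) \\
&= \text{Re}\left( \int_{-\infty}^{\infty} ix \cos(tx) \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx + \int_{-\infty}^{\infty} (-x \sin(tx)) \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \right) \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (-\sin(tx))\,x\,e^{-x^2/2}\,dx
\end{aligned}$$
Integrating by parts with $u = \sin(tx)$, $v' = -x e^{-x^2/2}$, so that $u' = t \cos(tx)$ and $v = e^{-x^2/2}$:
$$\Phi_X'(t) = \frac{1}{\sqrt{2\pi}} \sin(tx)\,e^{-x^2/2} \Big|_{-\infty}^{\infty} - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} t \cos(tx)\,e^{-x^2/2}\,dx = -t\,\Phi_X(t),$$
where the first term is 0 since sin is odd.

Thus, $\Phi_X'(t) = -t\,\Phi_X(t)$. It follows that $\frac{\Phi_X'(t)}{\Phi_X(t)} = -t$, and by integrating both sides we get $\ln|\Phi_X(t)| = -\frac{1}{2} t^2 + c$ with $c \in \mathbb{R}$.

For $t = 0$, we know that $\Phi_X(0) = 1$ by Theorem 3.3.3 (i) and $\ln|\Phi_X(0)| = 0$. It follows that $0 = 0 + c$. Therefore, $c = 0$ and $|\Phi_X(t)| = e^{-t^2/2}$.

If we take $t = 0$, then $\Phi_X(0) = 1$ by Theorem 3.3.3 (i). Since $\Phi_X$ is uniformly continuous, $\Phi_X$ must take the value 0 before it can eventually take a negative value. However, since $e^{-t^2/2} > 0 \ \forall t \in \mathbb{R}$, $\Phi_X$ cannot take 0 as a possible value and therefore cannot pass into the negative numbers. So, it must hold that $\Phi_X(t) = e^{-t^2/2} \ \forall t \in \mathbb{R}$.

(vi) For $\sigma > 0$, $\mu \in \mathbb{R}$, we know that if $X \sim N(0, 1)$, then $\sigma X + \mu \sim N(\mu, \sigma^2)$. By Theorem 3.3.3 (vii) we have
$$\Phi_{\sigma X + \mu}(t) = e^{it\mu} \Phi_X(\sigma t) = e^{it\mu} e^{-\sigma^2 t^2 / 2}.$$

(vii)
$$\begin{aligned}
\Phi_X(t) &= \int_0^{\infty} e^{itx} \frac{q^p}{\Gamma(p)} x^{p-1} e^{-qx}\,dx = \frac{q^p}{\Gamma(p)} \int_0^{\infty} x^{p-1} e^{-(q - it)x}\,dx \\
&= \frac{q^p}{\Gamma(p)} (q - it)^{-p} \int_0^{\infty} ((q - it)x)^{p-1} e^{-(q - it)x} (q - it)\,dx \qquad | \ u = (q - it)x, \ du = (q - it)\,dx \\
&= \frac{q^p}{\Gamma(p)} (q - it)^{-p} \int_0^{\infty} u^{p-1} e^{-u}\,du = q^p (q - it)^{-p} = \left( \frac{q - it}{q} \right)^{-p} = \left( 1 - \frac{it}{q} \right)^{-p}
\end{aligned}$$

(viii) Since an $\text{Exp}(c)$ distribution is a $\Gamma(1, c)$ distribution, we get for $X \sim \text{Exp}(c) = \Gamma(1, c)$:
$$\Phi_X(t) = \left( 1 - \frac{it}{c} \right)^{-1}$$

(ix) Since a $\chi^2_n$ distribution (for $n \in \mathbb{N}$) is a $\Gamma(\frac{n}{2}, \frac{1}{2})$ distribution, we get for $X \sim \chi^2_n = \Gamma(\frac{n}{2}, \frac{1}{2})$:
$$\Phi_X(t) = \left( 1 - \frac{it}{1/2} \right)^{-n/2} = (1 - 2it)^{-n/2}$$

Example 3.3.13: Since we know that $m_1 = E(X)$ and $m_2 = E(X^2)$ exist for $X \sim \text{Bin}(1, p)$, we can determine these moments according to Theorem 3.3.5 using the characteristic function. It is
$$\Phi_X(t) = 1 + p(e^{it} - 1)$$
$$\Phi_X'(t) = pi\,e^{it}, \quad \Phi_X'(0) = pi \ \Rightarrow \ m_1 = \frac{\Phi_X'(0)}{i} = \frac{pi}{i} = p = E(X)$$
$$\Phi_X''(t) = pi^2 e^{it}, \quad \Phi_X''(0) = pi^2 \ \Rightarrow \ m_2 = \frac{\Phi_X''(0)}{i^2} = \frac{pi^2}{i^2} = p = E(X^2)$$
$$\Rightarrow Var(X) = E(X^2) - (E(X))^2 = p - p^2 = p(1 - p)$$

Note: The restriction $\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt < \infty$ in Theorem 3.3.9 works in such a way that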
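we don't end up with a non-existing pdf if $X$ is a discrete rv. For example,

- $X \sim \text{Dirac}(c)$:
$$\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt = \int_{-\infty}^{\infty} |e^{itc}|\,dt = \int_{-\infty}^{\infty} 1\,dt = t \Big|_{-\infty}^{\infty},$$
which is undefined.

- Also for $X \sim \text{Bin}(1, p)$:
$$\begin{aligned}
\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt &= \int_{-\infty}^{\infty} |1 + p(e^{it} - 1)|\,dt = \int_{-\infty}^{\infty} |p e^{it} - (p - 1)|\,dt \\
&\ge \int_{-\infty}^{\infty} \big|\,|p e^{it}| - |p - 1|\,\big|\,dt \ge \int_{-\infty}^{\infty} |p e^{it}|\,dt - \int_{-\infty}^{\infty} |p - 1|\,dt \\
&= p \int_{-\infty}^{\infty} 1\,dt - (1 - p) \int_{-\infty}^{\infty} 1\,dt = (2p - 1) \int_{-\infty}^{\infty} 1\,dt = (2p - 1)\,t \Big|_{-\infty}^{\infty},
\end{aligned}$$
which is undefined for $p \ne 1/2$. If $p = 1/2$, we have
$$\begin{aligned}
\int_{-\infty}^{\infty} |p e^{it} - (p - 1)|\,dt &= \frac{1}{2} \int_{-\infty}^{\infty} |e^{it} + 1|\,dt = \frac{1}{2} \int_{-\infty}^{\infty} |\cos t + i \sin t + 1|\,dt \\
&= \frac{1}{2} \int_{-\infty}^{\infty} \sqrt{(\cos t + 1)^2 + (\sin t)^2}\,dt = \frac{1}{2} \int_{-\infty}^{\infty} \sqrt{\cos^2 t + 2 \cos t + 1 + \sin^2 t}\,dt \\
&= \frac{1}{2} \int_{-\infty}^{\infty} \sqrt{2 + 2 \cos t}\,dt,
\end{aligned}$$
which also does not exist.

- Otherwise, $X \sim N(0, 1)$:
$$\int_{-\infty}^{\infty} |\Phi_X(t)|\,dt = \int_{-\infty}^{\infty} \exp(-t^2/2)\,dt = \sqrt{2\pi} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-t^2/2)\,dt = \sqrt{2\pi} < \infty$$

3.4 Probability Generating Functions

Based on Casella/Berger, Section 2.6, and Outside Material

Definition 3.4.1: Let $X$ be a discrete rv which only takes non-negative integer values, i.e., $p_k = P(X = k)$, and $\sum_{k=0}^{\infty} p_k = 1$. Then, the probability generating function (pgf) of $X$ is defined as
$$G(s) = \sum_{k=0}^{\infty} p_k s^k.$$

Theorem 3.4.2: $G(s)$ converges for $|s| \le 1$.

Proof:
$$|G(s)| \le \sum_{k=0}^{\infty} |p_k s^k| \le \sum_{k=0}^{\infty} |p_k| = 1$$

Theorem 3.4.3: Let $X$ be a discrete rv which only takes non-negative integer values and has pgf $G(s)$. Then it holds:
$$P(X = k) = \frac{1}{k!} \frac{d^k}{ds^k} G(s) \Big|_{s=0}$$

Theorem 3.4.4: Let $X$ be a discrete rv which only takes non-negative integer values and has pgf $G(s)$. If $E(X)$ exists, then it holds:
$$E(X) = \frac{d}{ds} G(s) \Big|_{s=1}$$

Definition 3.4.5: The $k$th factorial moment of $X$ is defined as
$$E[X(X - 1)(X - 2) \cdot \ldots \cdot (X - k + 1)]$$
if this expectation exists.

Theorem 3.4.6: Let $X$ be a discrete rv which only takes non-negative integer values and has pgf $G(s)$. If $E[X(X - 1)(X - 2) \cdot \ldots \cdot (X - k + 1)]$ exists, then it holds:
$$E[X(X - 1)(X - 2) \cdot \ldots \cdot (X - k + 1)] = \frac{d^k}{ds^k} G(s) \Big|_{s=1}$$

Note: Similar to the Cauchy distribution for the continuous case, there exist discrete distributions where the mean (or higher moments) do not exist. See Homework.

Example 3.4.7: Let $X \sim \text{Poisson}(c)$ with
$$P(X = k) = p_k = e^{-c} \frac{c^k}{k!}, \quad k = 0, 1, 2, \ldots$$
It follows
$$G(s) = \sum_{k=0}^{\infty} p_k s^k = e^{-c} \sum_{k=0}^{\infty} \frac{(cs)^k}{k!} = e^{-c} e^{cs}.$$
From Theorem 3.4.3, we get
$$P(X = k) = \frac{1}{k!} \frac{d^k}{ds^k} G(s) \Big|_{s=0} = \frac{1}{k!} e^{-c} c^k e^{cs} \Big|_{s=0} = e^{-c} \frac{c^k}{k!}.$$
From Theorem 3.4.4, we get
$$E(X) = \frac{d}{ds} G(s) \Big|_{s=1} = e^{-c} c\,e^{cs} \Big|_{s=1} = c.$$
From Theorem 3.4.6, we get
$$E(X^2) - E(X) = E(X(X - 1)) = \frac{d^2}{ds^2} G(s) \Big|_{s=1} = e^{-c} c^2 e^{cs} \Big|_{s=1} = c^2.$$
It follows
$$Var(X) = E(X^2) - (E(X))^2 = E(X^2) - E(X) + E(X) - (E(X))^2 = c^2 + c - c^2 = c.$$

3.5 Moment Inequalities

Based on Casella/Berger,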
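Sections 3.6, 3.8, and Outside Material

Theorem 3.5.1: Let $h(X)$ be a non-negative Borel-measurable function of a rv $X$. If $E(h(X))$ exists, then it holds:
$$P(h(X) \ge \epsilon) \le \frac{E(h(X))}{\epsilon} \quad \forall \epsilon > 0$$

Proof: Continuous case only:
$$\begin{aligned}
E(h(X)) &= \int_{-\infty}^{\infty} h(x) f_X(x)\,dx = \int_A h(x) f_X(x)\,dx + \int_{A^C} h(x) f_X(x)\,dx, \quad \text{where } A = \{x : h(x) \ge \epsilon\} \\
&\ge \int_A h(x) f_X(x)\,dx \ge \int_A \epsilon f_X(x)\,dx = \epsilon P(h(X) \ge \epsilon) \quad \forall \epsilon > 0
\end{aligned}$$
Therefore, $P(h(X) \ge \epsilon) \le \frac{E(h(X))}{\epsilon} \ \forall \epsilon > 0$.

Corollary 3.5.2: Markov's Inequality
Let $h(X) = |X|^r$ and $\epsilon = k^r$ where $r > 0$ and $k > 0$. If $E(|X|^r)$ exists, then it holds:
$$P(|X| \ge k) \le \frac{E(|X|^r)}{k^r}$$

Proof: Since $P(|X| \ge k) = P(|X|^r \ge k^r)$ for $k > 0$, it follows using Theorem 3.5.1:
$$P(|X| \ge k) = P(|X|^r \ge k^r) \overset{\text{Th. 3.5.1}}{\le} \frac{E(|X|^r)}{k^r}$$

Corollary 3.5.3: Chebychev's Inequality
Let $h(X) = (X - \mu)^2$ and $\epsilon = k^2 \sigma^2$ where $E(X) = \mu$, $Var(X) = \sigma^2 < \infty$, and $k > 0$. Then it holds:
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$$

Proof: Since $P(|X - \mu| \ge k\sigma) = P(|X - \mu|^2 \ge k^2 \sigma^2)$ for $k > 0$, it follows using Theorem 3.5.1:
$$P(|X - \mu| \ge k\sigma) = P(|X - \mu|^2 \ge k^2 \sigma^2) \overset{\text{Th. 3.5.1}}{\le} \frac{E(|X - \mu|^2)}{k^2 \sigma^2} = \frac{Var(X)}{k^2 \sigma^2} = \frac{\sigma^2}{k^2 \sigma^2} = \frac{1}{k^2}$$

Note: For $k = 2$, it follows from Corollary 3.5.3 that
$$P(|X - \mu| < 2\sigma) \ge 1 - \frac{1}{2^2} = 0.75,$$
no matter what the distribution of $X$ is. Unfortunately, this is not very precise for many distributions, e.g., the Normal distribution, where it holds that $P(|X - \mu| < 2\sigma) \approx 0.95$.

Theorem 3.5.4: Lyapunov's Inequality
Let $0 < \beta_n = E(|X|^n) < \infty$. For arbitrary $k$ such that $2 \le k \le n$, it holds that
$$(\beta_{k-1})^{\frac{1}{k-1}} \le (\beta_k)^{\frac{1}{k}},$$
i.e., $(E(|X|^{k-1}))^{\frac{1}{k-1}} \le (E(|X|^k))^{\frac{1}{k}}$.

Proof: Continuous case only:

Let $Q(u, v) = E\left( (u |X|^{\frac{j-1}{2}} + v |X|^{\frac{j+1}{2}})^2 \right)$ where $1 \le j \le k - 1$.

Obviously, by construction, $Q(u, v) \ge 0 \ \forall u, v \in \mathbb{R}$. Also,
$$\begin{aligned}
Q(u, v) &= \int_{-\infty}^{\infty} (u |x|^{\frac{j-1}{2}} + v |x|^{\frac{j+1}{2}})^2 f_X(x)\,dx \\
&= u^2 \int_{-\infty}^{\infty} |x|^{j-1} f_X(x)\,dx + 2uv \int_{-\infty}^{\infty} |x|^j f_X(x)\,dx + v^2 \int_{-\infty}^{\infty} |x|^{j+1} f_X(x)\,dx \\
&= u^2 \beta_{j-1} + 2uv \beta_j + v^2 \beta_{j+1} \ge 0 \quad \forall u, v \in \mathbb{R}
\end{aligned}$$

Note that for a binary quadratic form
$$QF(x, y) = (x \ \ y) \begin{pmatrix} A & B \\ B & C \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = Ax^2 + 2Bxy + Cy^2$$
it holds that $QF$ is positive semidefinite, i.e., $QF(x, y) \ge 0 \ \forall x, y \in \mathbb{R}$, iff $A > 0$ and $AC - B^2 \ge 0$.

Here, we have by construction that $Q(u, v) \ge 0$, with $A = \beta_{j-1} > 0$, $B = \beta_j > 0$, and $C = \beta_{j+1} > 0$. Therefore, it must hold:
$$AC - B^2 \ge 0 \ \Rightarrow \ \beta_{j-1} \beta_{j+1} - \beta_j^2 \ge 0 \ \Rightarrow \ \beta_j^2 \le \beta_{j-1} \beta_{j+1} \ \Rightarrow \ \beta_j^{2j} \le \beta_{j-1}^j \beta_{j+1}^j$$

This means that $\beta_1^2 \le \beta_0 \beta_2$, $\beta_2^4 \le \beta_1^2 \beta_3^2$, $\beta_3^6 \le \beta_2^3 \beta_4^3$, and so on until $\beta_{k-1}^{2(k-1)} \le \beta_{k-2}^{k-1} \beta_k^{k-1}$.

Multiplying these $k - 1$ inequalities, we get
$$\prod_{j=1}^{k-1} \beta_j^{2j} \le \prod_{j=1}^{k-1} \beta_{j-1}^j \beta_{j+1}^j = \beta_0\,\beta_{k-1}^{k-2}\,\beta_k^{k-1} \prod_{j=1}^{k-2} \beta_j^{2j}.$$

Dividing both sides by $\prod_{j=1}^{k-2} \beta_j^{2j}$, we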
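get
$$\beta_{k-1}^{2k-2} \le \beta_0\,\beta_k^{k-1}\,\beta_{k-1}^{k-2} \ \overset{(A)}{\Rightarrow} \ \beta_{k-1}^{k} \le \beta_k^{k-1} \ \Rightarrow \ \beta_{k-1}^{\frac{1}{k-1}} \le \beta_k^{\frac{1}{k}}.$$
(A) holds since $\beta_0 = E(|X|^0) = E(1) = 1$.

Note:

- It follows from Theorem 3.5.4 that
$$E(|X|) \le (E(|X|^2))^{1/2} \le (E(|X|^3))^{1/3} \le \ldots \le (E(|X|^n))^{1/n}.$$
- For $X \sim \text{Dirac}(c)$, $c > 0$, with $P(X = c) = 1$, it follows immediately from Theorem 3.3.12 (i) and Theorem 3.3.5 that $m_k = E(X^k) = c^k$. So,
$$E(|X|^k) = E(X^k) = c^k \quad \text{and} \quad (E(|X|^k))^{1/k} = (E(X^k))^{1/k} = (c^k)^{1/k} = c.$$
Therefore, equality holds in Theorem 3.5.4.

4 Random Vectors

4.1 Joint, Marginal, and Conditional Distributions

Based on Casella/Berger, Sections 4.1 & 4.2

Definition 4.1.1: The vector $\underline{X} = (X_1, \ldots, X_n)'$ on $(\Omega, L, P) \to \mathbb{R}^n$ defined by $\underline{X}(\omega) = (X_1(\omega), \ldots, X_n(\omega))'$, $\omega \in \Omega$, is an $n$-dimensional random vector ($n$-rv) if
$$\underline{X}^{-1}(I) = \{\omega : X_1(\omega) \le a_1, \ldots, X_n(\omega) \le a_n\} \in L$$
for all $n$-dimensional intervals $I = \{(x_1, \ldots, x_n) : -\infty < x_i \le a_i, \ a_i \in \mathbb{R} \ \forall i = 1, \ldots, n\}$.

Note: It follows that if $X_1, \ldots, X_n$ are any $n$ rv's on $(\Omega, L, P)$, then $\underline{X} = (X_1, \ldots, X_n)'$ is an $n$-rv on $(\Omega, L, P)$ since for any $I$, it holds:
$$\underline{X}^{-1}(I) = \{\omega : (X_1(\omega), \ldots, X_n(\omega)) \in I\} = \{\omega : X_1(\omega) \le a_1, \ldots, X_n(\omega) \le a_n\} = \bigcap_{k=1}^{n} \underbrace{\{\omega : X_k(\omega) \le a_k\}}_{\in L} \in L$$

Definition 4.1.2: For an $n$-rv $\underline{X}$, a function $F$ defined by
$$F(\underline{x}) = P(\underline{X} \le \underline{x}) = P(X_1 \le x_1, \ldots, X_n \le x_n) \quad \forall \underline{x} \in \mathbb{R}^n$$
is the joint cumulative distribution function (joint cdf) of $\underline{X}$.

Note:

(i) $F$ is non-decreasing and right-continuous in each of its arguments $x_i$.

(ii) $\lim_{\underline{x} \to \infty} F(\underline{x}) = \lim_{x_1 \to \infty, \ldots, x_n \to \infty} F(\underline{x}) = 1$ and $\lim_{x_k \to -\infty} F(\underline{x}) = 0 \ \forall x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n \in \mathbb{R}$.

However, conditions (i) and (ii) together are not sufficient for $F$ to be a joint cdf. Instead, we need the conditions from the next Theorem.

Theorem 4.1.3: A function $F(\underline{x}) = F(x_1, \ldots, x_n)$ is the joint cdf of some $n$-rv $\underline{X}$ iff

(i) $F$ is non-decreasing and right-continuous with respect to each $x_i$,

(ii) $F(-\infty, x_2, \ldots, x_n) = F(x_1, -\infty, x_3, \ldots, x_n) = \ldots = F(x_1, \ldots, x_{n-1}, -\infty) = 0$ and $F(\infty, \ldots, \infty) = 1$, and

(iii) $\forall \underline{x} \in \mathbb{R}^n \ \forall \epsilon_i > 0$, $i = 1, \ldots, n$, the following inequality holds:
$$\begin{aligned}
F(\underline{x} + \underline{\epsilon}) &- \sum_{i=1}^{n} F(x_1 + \epsilon_1, \ldots, x_{i-1} + \epsilon_{i-1}, x_i, x_{i+1} + \epsilon_{i+1}, \ldots, x_n + \epsilon_n) \\
&+ \sum_{1 \le i < j \le n} F(x_1 + \epsilon_1, \ldots, x_{i-1} + \epsilon_{i-1}, x_i, x_{i+1} + \epsilon_{i+1}, \ldots, x_{j-1} + \epsilon_{j-1}, x_j, x_{j+1} + \epsilon_{j+1}, \ldots, x_n + \epsilon_n) \\
&\mp \ldots + (-1)^n F(\underline{x}) \ge 0
\end{aligned}$$

Note: We won't prove this Theorem but just see why we need condition (iii) for $n = 2$:
$$P(x_1 < X \le x_2, \ y_1 < Y \le y_2) = P(X \le x_2, Y \le y_2) - P(X \le x_1, Y \le y_2) - P(X \le x_2, Y \le y_1) + P(X \le x_1, Y \le y_1) \ge 0$$

Note: We will restrict ourselves to $n = 2$ for most of the next Definitions and Theorems, but those can be easily generalized to $n > 2$. The term bivariate rv is often used to refer to a 2-rv, and multivariate rv is used to refer to an $n$-rv, $n \ge 2$.

Definition 4.1.4: A 2-rv $(X, Y)$ is discrete if there exists a countable collection $\mathcal{X}$ of pairs $(x_i, y_j)$ that has probability 1. Let $p_{ij} = P(X = x_i, Y = y_j) > 0 \ \forall (x_i, y_j) \in \mathcal{X}$. Then, $\sum_{i,j} p_{ij} = 1$ and $\{p_{ij}\}$ is the joint probability mass function (joint pmf) of $(X, Y)$.

Definition 4.1.5: Let $(X, Y)$ be a discrete 2-rv with joint pmf $\{p_{ij}\}$. Define
$$p_{i \cdot} = \sum_{j=1}^{\infty} p_{ij} = \sum_{j=1}^{\infty} P(X = x_i, Y = y_j) = P(X = x_i)$$
and
$$p_{\cdot j} = \sum_{i=1}^{\infty} p_{ij} = \sum_{i=1}^{\infty} P(X = x_i, Y = y_j) = P(Y = y_j).$$
Then $\{p_{i \cdot}\}$ is called the marginal probability mass function (marginal pmf) of $X$ and $\{p_{\cdot j}\}$ is called the marginal probability mass function of $Y$.

Definition 4.1.6: A 2-rv $(X, Y)$ is continuous if there exists a non-negative function $f$ such that
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du \quad \forall (x, y) \in \mathbb{R}^2,$$
where $F$ is the joint cdf of $(X, Y)$. We call $f$ the joint probability density function (joint pdf) of $(X, Y)$.

Note: If $F$ is continuous at $(x, y)$, then
$$\frac{\partial^2 F(x, y)}{\partial x\,\partial y} = f(x, y).$$

Definition 4.1.7: Let $(X, Y)$ be a continuous 2-rv with joint pdf $f$. Then $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ is called the marginal probability density function (marginal pdf) of $X$ and $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$ is called the marginal probability density function of $Y$.

Note:

(i) $\int_{-\infty}^{\infty} f_X(x)\,dx = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(x, y)\,dy \right) dx = F(\infty, \infty) = 1 = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(x, y)\,dx \right) dy = \int_{-\infty}^{\infty} f_Y(y)\,dy$, and $f_X(x) \ge 0 \ \forall x \in \mathbb{R}$ and $f_Y(y) \ge 0 \ \forall y \in \mathbb{R}$.

(ii) Given a 2-rv $(X, Y)$ with joint cdf $F(x, y)$, how do we generate a marginal cdf $F_X(x) = P(X \le x)$? The answer is $P(X \le x) = P(X \le x, -\infty < Y < \infty) = F(x, \infty)$.

Definition 4.1.8: If $F_{\underline{X}}(x_1, \ldots, x_n) = F_{\underline{X}}(\underline{x})$ is the joint cdf of an $n$-rv $\underline{X} = (X_1, \ldots, X_n)$, then the marginal cumulative distribution function (marginal cdf) of $(X_{i_1}, \ldots, X_{i_k})$, $1 \le k \le n - 1$, $1 \le i_1 < i_2 < \ldots < i_k \le n$, is given by
$$\lim_{x_i \to \infty, \ i \ne i_1, \ldots, i_k} F_{\underline{X}}(\underline{x}) = F_{\underline{X}}(\infty, \ldots, \infty, x_{i_1}, \infty, \ldots, \infty, x_{i_2}, \infty, \ldots, \infty, x_{i_k}, \infty, \ldots, \infty).$$

Note: In Definition 1.4.1, we defined conditional probability distributions in some probability space $(\Omega, L, P)$. This definition extends to conditional distributions of 2-rv's $(X, Y)$.

Definition 4.1.9: Let $(X, Y)$ be a discrete 2-rv. If $P(Y = y_j) = p_{\cdot j} > 0$, then the conditional probability mass function (conditional pmf) of $X$ given $Y = y_j$ (for fixed $j$) is defined as
$$p_{i|j} = P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{p_{ij}}{p_{\cdot j}}.$$

Note: For a continuous 2-rv $(X, Y)$ with pdf $f$, $P(X \le x \mid Y = y)$ is not defined. Let $\epsilon > 0$ and suppose that $P(y - \epsilon < Y \le y + \epsilon) > 0$. For every $x$ and every interval $(y - \epsilon, y + \epsilon]$, consider the conditional probability of $X \le x$ given $Y \in (y - \epsilon, y + \epsilon]$.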
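We have
$$P(X \le x \mid y - \epsilon < Y \le y + \epsilon) = \frac{P(X \le x, \ y - \epsilon < Y \le y + \epsilon)}{P(y - \epsilon < Y \le y + \epsilon)},$$
which is well-defined if $P(y - \epsilon < Y \le y + \epsilon) > 0$ holds.

So, when does
$$\lim_{\epsilon \to 0^+} P(X \le x \mid Y \in (y - \epsilon, y + \epsilon])$$
exist? See the next definition.

Definition 4.1.10: The conditional cumulative distribution function (conditional cdf) of a rv $X$ given that $Y = y$ is defined to be
$$F_{X|Y}(x \mid y) = \lim_{\epsilon \to 0^+} P(X \le x \mid Y \in (y - \epsilon, y + \epsilon])$$
provided that this limit exists. If it does exist, the conditional probability density function (conditional pdf) of $X$ given that $Y = y$ is any non-negative function $f_{X|Y}(x \mid y)$ satisfying
$$F_{X|Y}(x \mid y) = \int_{-\infty}^{x} f_{X|Y}(t \mid y)\,dt \quad \forall x \in \mathbb{R}.$$

Note: For fixed $y$, $f_{X|Y}(x \mid y) \ge 0$ and $\int_{-\infty}^{\infty} f_{X|Y}(x \mid y)\,dx = 1$. So it is really a pdf.

Theorem 4.1.11: Let $(X, Y)$ be a continuous 2-rv with joint pdf $f_{X,Y}$. It holds that at every point $(x, y)$ where $f$ is continuous and the marginal pdf $f_Y(y) > 0$, we have
$$\begin{aligned}
F_{X|Y}(x \mid y) &= \lim_{\epsilon \to 0^+} \frac{P(X \le x, \ Y \in (y - \epsilon, y + \epsilon])}{P(Y \in (y - \epsilon, y + \epsilon])} \\
&= \lim_{\epsilon \to 0^+} \frac{\frac{1}{2\epsilon} \int_{-\infty}^{x} \int_{y - \epsilon}^{y + \epsilon} f_{X,Y}(u, v)\,dv\,du}{\frac{1}{2\epsilon} \int_{y - \epsilon}^{y + \epsilon} f_Y(v)\,dv} \\
&= \frac{\int_{-\infty}^{x} f_{X,Y}(u, y)\,du}{f_Y(y)} = \int_{-\infty}^{x} \frac{f_{X,Y}(u, y)}{f_Y(y)}\,du.
\end{aligned}$$
Thus, $f_{X|Y}(x \mid y)$ exists and equals $\frac{f_{X,Y}(x, y)}{f_Y(y)}$, provided that $f_Y(y) > 0$. Furthermore, since
$$\int_{-\infty}^{x} f_{X,Y}(u, y)\,du = f_Y(y) F_{X|Y}(x \mid y),$$
we get the following marginal cdf of $X$:
$$F_X(x) = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{x} f_{X,Y}(u, y)\,du \right) dy = \int_{-\infty}^{\infty} f_Y(y) F_{X|Y}(x \mid y)\,dy$$

Example 4.1.12: Consider
$$f_{X,Y}(x, y) = \begin{cases} 2, & 0 < x < y < 1 \\ 0, & \text{otherwise} \end{cases}$$

We calculate the marginal pdf's $f_X(x)$ and $f_Y(y)$ first:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \int_{x}^{1} 2\,dy = 2(1 - x) \quad \text{for } 0 < x < 1$$
and
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx = \int_{0}^{y} 2\,dx = 2y \quad \text{for } 0 < y < 1.$$

The conditional pdf's $f_{Y|X}(y \mid x)$ and $f_{X|Y}(x \mid y)$ are calculated as follows:
$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{2}{2(1 - x)} = \frac{1}{1 - x} \quad \text{for } x < y < 1 \ (\text{where } 0 < x < 1)$$
and
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{2}{2y} = \frac{1}{y} \quad \text{for } 0 < x < y \ (\text{where } 0 < y < 1).$$

Thus, it holds that $Y \mid X = x \sim U(x, 1)$ and $X \mid Y = y \sim U(0, y)$, i.e., both conditional pdf's are related to uniform distributions.

4.2 Independent Random Variables

Based on Casella/Berger, Sections 4.2 & 4.6

Example 4.2.1 (from Rohatgi, page 119, Example 1): Let $f_1, f_2, f_3$ be 3 pdf's with cdf's $F_1, F_2, F_3$ and let $|\alpha| \le 1$. Define
$$f_\alpha(x_1, x_2, x_3) = f_1(x_1) f_2(x_2) f_3(x_3) \cdot (1 + \alpha (2F_1(x_1) - 1)(2F_2(x_2) - 1)(2F_3(x_3) - 1)).$$
We can show

(i) $f_\alpha$ is a pdf for all $\alpha \in [-1, 1]$.

(ii) $\{f_\alpha : -1 \le \alpha \le 1\}$ all have marginal pdf's $f_1, f_2, f_3$.

See book for proof and further discussion, but when do the marginal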
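distributions uniquely determine the joint distribution?

Definition 4.2.2: Let $F_{X,Y}(x, y)$ be the joint cdf and $F_X(x)$ and $F_Y(y)$ be the marginal cdf's of a 2-rv $(X, Y)$. $X$ and $Y$ are independent iff
$$F_{X,Y}(x, y) = F_X(x) F_Y(y) \quad \forall (x, y) \in \mathbb{R}^2.$$

Lemma 4.2.3: If $X$ and $Y$ are independent, $a, b, c, d \in \mathbb{R}$, and $a < b$ and $c < d$, then
$$P(a < X \le b, \ c < Y \le d) = P(a < X \le b)\,P(c < Y \le d).$$

Proof:
$$\begin{aligned}
P(a < X \le b, \ c < Y \le d) &= F_{X,Y}(b, d) - F_{X,Y}(a, d) - F_{X,Y}(b, c) + F_{X,Y}(a, c) \\
&= F_X(b) F_Y(d) - F_X(a) F_Y(d) - F_X(b) F_Y(c) + F_X(a) F_Y(c) \\
&= (F_X(b) - F_X(a))(F_Y(d) - F_Y(c)) \\
&= P(a < X \le b)\,P(c < Y \le d)
\end{aligned}$$

Definition 4.2.4: A collection of rv's $X_1, \ldots, X_n$ with joint cdf $F_{\underline{X}}(\underline{x})$ and marginal cdf's $F_{X_i}(x_i)$ are mutually (or completely) independent iff
$$F_{\underline{X}}(\underline{x}) = \prod_{i=1}^{n} F_{X_i}(x_i) \quad \forall \underline{x} \in \mathbb{R}^n.$$

Note: We often simply say that the rv's $X_1, \ldots, X_n$ are independent when we really mean that they are mutually independent.

Theorem 4.2.5: Factorization Theorem

(i) A necessary and sufficient condition for discrete rv's $X_1, \ldots, X_n$ to be independent is that
$$P(\underline{X} = \underline{x}) = P(X_1 = x_1, \ldots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i) \quad \forall \underline{x} \in \mathcal{X},$$
where $\mathcal{X} \subset \mathbb{R}^n$ is the countable support of $\underline{X}$.

(ii) For an absolutely continuous $n$-rv $\underline{X} = (X_1, \ldots, X_n)$, $X_1, \ldots, X_n$ are independent iff
$$f_{\underline{X}}(\underline{x}) = f_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i),$$
where $f_{\underline{X}}$ is the joint pdf and $f_{X_1}, \ldots, f_{X_n}$ are the marginal pdfs of $\underline{X}$.

Proof:

(i) Discrete case:

"$\Rightarrow$": Let $\underline{X}$ be a random vector whose components are independent random variables of the discrete type with $P(\underline{X} = \underline{b}) > 0$. Lemma 4.2.3 can be extended to:
$$P(\underline{a} < \underline{X} \le \underline{b}) = P(a_1 < X_1 \le b_1, \ldots, a_n < X_n \le b_n) = P(a_1 < X_1 \le b_1) \cdot \ldots \cdot P(a_n < X_n \le b_n)$$
Therefore,
$$\begin{aligned}
P(\underline{X} = \underline{b}) &= \lim_{\underline{a} \uparrow \underline{b}} P(\underline{a} < \underline{X} \le \underline{b}) \\
&= \lim_{a_i \uparrow b_i \ \forall i \in \{1, \ldots, n\}} P(a_1 < X_1 \le b_1, \ldots, a_n < X_n \le b_n) \\
&= \lim_{a_i \uparrow b_i \ \forall i \in \{1, \ldots, n\}} P(a_1 < X_1 \le b_1) \cdot \ldots \cdot P(a_n < X_n \le b_n) \\
&= P(X_1 = b_1) \cdot \ldots \cdot P(X_n = b_n),
\end{aligned}$$
i.e., the factorization in (i) holds.

"$\Leftarrow$": We assume that the factorization in (i) holds. For $n$ dimensions, let $\underline{x} = (x_1, x_2, \ldots, x_n)$, $\underline{x}_i = (x_{i1}, x_{i2}, \ldots, x_{in})$, and $B = \{\underline{x}_i : x_{i1} \le x_1, \ x_{i2} \le x_2, \ \ldots, \ x_{in} \le x_n\}$, $B \subseteq \mathcal{X}$. Then it follows:
$$\begin{aligned}
F_{\underline{X}}(\underline{x}) &= \sum_{\underline{x}_i \in B} P(\underline{X} = \underline{x}_i) = \sum_{\underline{x}_i \in B} P(X_1 = x_{i1}, X_2 = x_{i2}, \ldots, X_n = x_{in}) \\
&= \sum_{\underline{x}_i \in B} \prod_{j=1}^{n} P(X_j = x_{ij}) \\
&= \sum_{x_{i1} \le x_1} \sum_{x_{i2} \le x_2} \cdots \sum_{x_{i,n-1} \le x_{n-1}} \sum_{x_{in} \le x_n} \prod_{j=1}^{n} P(X_j = x_{ij}) \\
&= \sum_{x_{i1} \le x_1} \sum_{x_{i2} \le x_2} \cdots \sum_{x_{i,n-1} \le x_{n-1}} \left[ \prod_{j=1}^{n-1} P(X_j = x_{ij}) \left( \sum_{x_{in} \le x_n} P(X_n = x_{in}) \right) \right] \\
&= F_{X_n}(x_n) \sum_{x_{i1} \le x_1} \sum_{x_{i2} \le x_2} \cdots \sum_{x_{i,n-1} \le x_{n-1}} \prod_{j=1}^{n-1} P(X_j = x_{ij}) \\
&= \ldots = \prod_{j=1}^{n} F_{X_j}(x_j),
\end{aligned}$$
i.e., $X_1, \ldots, X_n$ are mutually independent according to Definition 4.2.4.

(ii) Continuous case: Homework

Theorem 4.2.6: $X_1, \ldots, X_n$ are independent iff $P(X_i \in A_i, \ i = 1, \ldots, n) = \prod_{i=1}^{n} P(X_i \in A_i)$ $\forall$ Borel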
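sets $A_i \in \mathcal{B}$ (i.e., rv's are independent iff all events involving these rv's are independent).

Proof: Lemma 4.2.3 and definition of Borel sets.

Theorem 4.2.7: Let $X_1, \ldots, X_n$ be independent rv's and $g_1, \ldots, g_n$ be Borel-measurable functions. Then $g_1(X_1), g_2(X_2), \ldots, g_n(X_n)$ are independent.

Proof:
$$\begin{aligned}
F_{g_1(X_1), g_2(X_2), \ldots, g_n(X_n)}(h_1, h_2, \ldots, h_n) &= P(g_1(X_1) \le h_1, \ g_2(X_2) \le h_2, \ \ldots, \ g_n(X_n) \le h_n) \\
&\overset{(*)}{=} P(X_1 \in g_1^{-1}((-\infty, h_1]), \ \ldots, \ X_n \in g_n^{-1}((-\infty, h_n])) \\
&\overset{\text{Th. 4.2.6}}{=} \prod_{i=1}^{n} P(X_i \in g_i^{-1}((-\infty, h_i])) \\
&= \prod_{i=1}^{n} P(g_i(X_i) \le h_i) = \prod_{i=1}^{n} F_{g_i(X_i)}(h_i)
\end{aligned}$$
$(*)$ holds since $g_1^{-1}((-\infty, h_1]) \in \mathcal{B}, \ldots, g_n^{-1}((-\infty, h_n]) \in \mathcal{B}$.

Theorem 4.2.8: If $X_1, \ldots, X_n$ are independent, then also every subcollection $X_{i_1}, \ldots, X_{i_k}$, $k = 2, \ldots, n - 1$, $1 \le i_1 < i_2 < \ldots < i_k \le n$, is independent.

Definition 4.2.9: A set (or a sequence) of rv's $\{X_n\}_{n=1}^{\infty}$ is independent iff every finite subcollection is independent.

Note: Recall that $X$ and $Y$ are identically distributed iff $F_X(x) = F_Y(x) \ \forall x \in \mathbb{R}$ according to Definition 2.2.5 and Theorem 2.2.6.

Definition 4.2.10: We say that $\{X_n\}_{n=1}^{\infty}$ is a set (or a sequence) of independent identically distributed (iid) rv's if $\{X_n\}_{n=1}^{\infty}$ is independent and all $X_n$ are identically distributed.

Note: Recall that $X$ and $Y$ being identically distributed does not say that $X = Y$ with probability 1. If this happens, we say that $X$ and $Y$ are equivalent rv's.

Note: We can also extend the definition of independence to 2 random vectors $\underline{X}$ and $\underline{Y}$: $\underline{X}$ and $\underline{Y}$ are independent iff $F_{\underline{X}, \underline{Y}}(\underline{x}, \underline{y}) = F_{\underline{X}}(\underline{x}) F_{\underline{Y}}(\underline{y}) \ \forall \underline{x}, \underline{y} \in \mathbb{R}^n$.

This does not mean that the components $X_i$ of $\underline{X}$ or the components $Y_i$ of $\underline{Y}$ are independent. However, it does mean that each pair of components $(X_i, Y_i)$ are independent, any subcollections $(X_{i_1}, \ldots, X_{i_k})$ and $(Y_{j_1}, \ldots, Y_{j_l})$ are independent, and any Borel-measurable functions $f(\underline{X})$ and $g(\underline{Y})$ are independent.

Corollary 4.2.11 (to Factorization Theorem 4.2.5): If $X$ and $Y$ are independent rv's, then
$$F_{X|Y}(x \mid y) = F_X(x) \ \forall x, \quad \text{and} \quad F_{Y|X}(y \mid x) = F_Y(y) \ \forall y.$$

4.3 Functions of Random Vectors

Based on Casella/Berger, Sections 4.3 & 4.6

Theorem 4.3.1: If $X$ and $Y$ are rv's on $(\Omega, L, P) \to \mathbb{R}$, then

(i) $X \pm Y$ is a rv.

(ii) $XY$ is a rv.

(iii) If $\{\omega : Y(\omega) = 0\} = \emptyset$, then $\frac{X}{Y}$ is a rv.

Theorem 4.3.2: Let $X_1, \ldots, X_n$ be rv's on $(\Omega, L, P) \to \mathbb{R}$. Define
$$MAX_n = \max\{X_1, \ldots, X_n\} = X_{(n)}$$
by $MAX_n(\omega) = \max\{X_1(\omega), \ldots, X_n(\omega)\} \ \forall \omega \in \Omega$, and
$$MIN_n = \min\{X_1, \ldots, X_n\} = X_{(1)} = -\max\{-X_1, \ldots, -X_n\}$$
by $MIN_n(\omega) = \min\{X_1(\omega), \ldots, X_n(\omega)\} \ \forall \omega \in \Omega$. Then,

(i) $MIN_n$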
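and $MAX_n$ are rv's.

(ii) If $X_1, \ldots, X_n$ are independent, then
$$F_{MAX_n}(z) = P(MAX_n \le z) = P(X_i \le z \ \forall i = 1, \ldots, n) = \prod_{i=1}^{n} F_{X_i}(z)$$
and
$$F_{MIN_n}(z) = P(MIN_n \le z) = 1 - P(X_i > z \ \forall i = 1, \ldots, n) = 1 - \prod_{i=1}^{n} (1 - F_{X_i}(z)).$$

(iii) If $\{X_i\}_{i=1}^{n}$ are iid rv's with common cdf $F_X$, then
$$F_{MAX_n}(z) = F_X^n(z)$$
and
$$F_{MIN_n}(z) = 1 - (1 - F_X(z))^n.$$
If $F_X$ is absolutely continuous with pdf $f_X$, then the pdfs of $MAX_n$ and $MIN_n$ are
$$f_{MAX_n}(z) = n \cdot F_X^{n-1}(z) \cdot f_X(z)$$
and
$$f_{MIN_n}(z) = n \cdot (1 - F_X(z))^{n-1} \cdot f_X(z)$$
for all continuity points of $F_X$.

Note: Using Theorem 4.3.2, it is easy to derive the joint cdf and pdf of $MAX_n$ and $MIN_n$ for iid rv's $\{X_1, \ldots, X_n\}$. For example, if the $X_i$'s are iid with cdf $F_X$ and pdf $f_X$, then the joint pdf of $MAX_n$ and $MIN_n$ is
$$f_{MAX_n, MIN_n}(x, y) = \begin{cases} 0, & x \le y \\ n(n - 1) \cdot (F_X(x) - F_X(y))^{n-2} \cdot f_X(x) f_X(y), & x > y \end{cases}$$
However, note that $MAX_n$ and $MIN_n$ are not independent. See Rohatgi, page 129, Corollary, for more details.

Note: The previous transformations are special cases of the following Theorem 4.3.3.

Theorem 4.3.3: If $g : \mathbb{R}^n \to \mathbb{R}^m$ is a Borel-measurable function (i.e., $\forall B \in \mathcal{B}^m : g^{-1}(B) \in \mathcal{B}^n$) and if $\underline{X} = (X_1, \ldots, X_n)$ is an $n$-rv, then $g(\underline{X})$ is an $m$-rv.

Proof: If $B \in \mathcal{B}^m$, then $\{\omega : g(\underline{X}(\omega)) \in B\} = \{\omega : \underline{X}(\omega) \in g^{-1}(B)\} \in L$.

Question: How do we handle more general transformations of $\underline{X}$?

Discrete Case:

Let $\underline{X} = (X_1, \ldots, X_n)$ be a discrete $n$-rv and $\mathcal{X} \subset \mathbb{R}^n$ be the (countable) support of $\underline{X}$, i.e., $P(\underline{X} \in \mathcal{X}) = 1$ and $P(\underline{X} = \underline{x}) > 0 \ \forall \underline{x} \in \mathcal{X}$.

Define $u_i = g_i(x_1, \ldots, x_n)$, $i = 1, \ldots, n$, to be 1-to-1 mappings of $\mathcal{X}$ onto $B$. Let $\underline{u} = (u_1, \ldots, u_n)'$. Then
$$P(\underline{U} = \underline{u}) = P(g_1(\underline{X}) = u_1, \ldots, g_n(\underline{X}) = u_n) = P(X_1 = h_1(\underline{u}), \ldots, X_n = h_n(\underline{u})) \quad \forall \underline{u} \in B,$$
where $x_i = h_i(\underline{u})$, $i = 1, \ldots, n$, is the inverse transformation (and $P(\underline{U} = \underline{u}) = 0 \ \forall \underline{u} \notin B$).

The joint marginal pmf of any subcollection of $u_i$'s is now obtained by summing over the other remaining $u_j$'s.

Example 4.3.4: Let $X, Y$ be iid $\sim \text{Bin}(n, p)$, $0 < p < 1$. Let $U = \frac{X}{Y + 1}$ and $V = Y + 1$. Then $X = U(Y + 1) = UV$ and $Y = V - 1$. So the joint pmf of $U, V$ is
$$\begin{aligned}
P(U = u, V = v) &= \binom{n}{uv} p^{uv} (1 - p)^{n - uv} \binom{n}{v - 1} p^{v - 1} (1 - p)^{n + 1 - v} \\
&= \binom{n}{uv} \binom{n}{v - 1} p^{uv + v - 1} (1 - p)^{2n + 1 - uv - v}
\end{aligned}$$
for $v \in \{1, 2, \ldots, n + 1\}$ and $uv \in \{0, 1, \ldots, n\}$.

Continuous Case:

Let $\underline{X} = (X_1, \ldots, X_n)$ be a continuous $n$-rv with joint cdf $F_{\underline{X}}$ and joint pdf $f_{\underline{X}}$. Let
$$\underline{U} = \begin{pmatrix} U_1 \\ \vdots \\ U_n \end{pmatrix} = g(\underline{X}) = \begin{pmatrix} g_1(\underline{X}) \\ \vdots \\ g_n(\underline{X}) \end{pmatrix},$$
i.e., $U_i = g_i(\underline{X})$, be a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$.

If $B \in \mathcal{B}^n$, then
$$P(\underline{U} \in B) = P(\underline{X} \in g^{-1}(B)) = \int \cdots \int_{g^{-1}(B)} f_{\underline{X}}(\underline{x})\,d(\underline{x}) = \int \cdots \int_{g^{-1}(B)} f_{\underline{X}}(\underline{x}) \prod_{i=1}^{n} dx_i,$$
where $g^{-1}(B) = \{\underline{x} = (x_1, \ldots, x_n) \in \mathbb{R}^n : g(\underline{x}) \in B\}$.

Suppose we define $B$ as the half-infinite $n$-dimensional interval
$$B_{\underline{u}} = \{(u_1', \ldots, u_n') : -\infty < u_i' < u_i \ \forall i = 1, \ldots, n\}$$
for any $\underline{u} \in \mathbb{R}^n$. Then the joint cdf of $\underline{U}$ is $G_{\underline{U}}(\underline{u}) = P(\underline{U} \in B_{\underline{u}}) =$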
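$P(g_1(\underline{X}) \le u_1, \ldots, g_n(\underline{X}) \le u_n) = \int \cdots \int_{g^{-1}(B_{\underline{u}})} f_{\underline{X}}(\underline{x})\,d(\underline{x})$.

If $G$ happens to be absolutely continuous, the joint pdf of $\underline{U}$ will be given by
$$f_{\underline{U}}(\underline{u}) = \frac{\partial^n G(\underline{u})}{\partial u_1\,\partial u_2 \ldots \partial u_n}$$
at every continuity point of $f_{\underline{U}}$.

Under certain conditions, we can write $f_{\underline{U}}$ in terms of the original pdf $f_{\underline{X}}$ of $\underline{X}$, as stated in the next Theorem:

Theorem 4.3.5: Multivariate Transformation
Let $\underline{X} = (X_1, \ldots, X_n)$ be a continuous $n$-rv with joint pdf $f_{\underline{X}}$.

(i) Let
$$\underline{U} = \begin{pmatrix} U_1 \\ \vdots \\ U_n \end{pmatrix} = g(\underline{X}) = \begin{pmatrix} g_1(\underline{X}) \\ \vdots \\ g_n(\underline{X}) \end{pmatrix},$$
i.e., $U_i = g_i(\underline{X})$, be a 1-to-1 mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$, i.e., there exist inverses $h_i$, $i = 1, \ldots, n$, such that $x_i = h_i(\underline{u}) = h_i(u_1, \ldots, u_n)$, $i = 1, \ldots, n$, over the range of the transformation $g$.

(ii) Assume both $g$ and $h$ are continuous.

(iii) Assume partial derivatives $\frac{\partial x_i}{\partial u_j} = \frac{\partial h_i(\underline{u})}{\partial u_j}$, $i, j = 1, \ldots, n$, exist and are continuous.

(iv) Assume that the Jacobian of the inverse transformation
$$J = \det\left( \frac{\partial (x_1, \ldots, x_n)}{\partial (u_1, \ldots, u_n)} \right) = \det \begin{pmatrix} \frac{\partial x_1}{\partial u_1} & \cdots & \frac{\partial x_1}{\partial u_n} \\ \vdots & & \vdots \\ \frac{\partial x_n}{\partial u_1} & \cdots & \frac{\partial x_n}{\partial u_n} \end{pmatrix}$$
is different from 0 for all $\underline{u}$ in the range of $g$.

Then the $n$-rv $\underline{U} = g(\underline{X})$ has a joint absolutely continuous cdf with corresponding joint pdf
$$f_{\underline{U}}(\underline{u}) = |J| \, f_{\underline{X}}(h_1(\underline{u}), \ldots, h_n(\underline{u})).$$

Proof: Let $\underline{u} \in \mathbb{R}^n$ and
$$B_{\underline{u}} = \{(u_1', \ldots, u_n') : -\infty < u_i' < u_i \ \forall i = 1, \ldots, n\}.$$
Then,
$$G_{\underline{U}}(\underline{u}) = \int \cdots \int_{g^{-1}(B_{\underline{u}})} f_{\underline{X}}(\underline{x})\,d(\underline{x}) = \int \cdots \int_{B_{\underline{u}}} f_{\underline{X}}(h_1(\underline{u}), \ldots, h_n(\underline{u})) \, |J| \, d(\underline{u}).$$
The result follows from differentiation of $G_{\underline{U}}$. For additional steps of the proof, see Rohatgi (page 135 and Theorem 17 on page 10) or a book on multivariate calculus.

Theorem 4.3.6: Let $\underline{X} = (X_1, \ldots, X_n)$ be a continuous $n$-rv with joint pdf $f_{\underline{X}}$.

(i) Let
$$\underline{U} = \begin{pmatrix} U_1 \\ \vdots \\ U_n \end{pmatrix} = g(\underline{X}) = \begin{pmatrix} g_1(\underline{X}) \\ \vdots \\ g_n(\underline{X}) \end{pmatrix},$$
i.e., $U_i = g_i(\underline{X})$, be a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$.

(ii) Let $\mathcal{X} = \{\underline{x} : f_{\underline{X}}(\underline{x}) > 0\}$ be the support of $\underline{X}$.

(iii) Suppose that for each $\underline{u} \in B = \{\underline{u} \in \mathbb{R}^n : \underline{u} = g(\underline{x}) \text{ for some } \underline{x} \in \mathcal{X}\}$ there is a finite number $k$ of inverses.

(iv) Suppose we can partition $\mathcal{X}$ into $\mathcal{X}_0, \mathcal{X}_1, \ldots, \mathcal{X}_k$ s.t.

(a) $P(\underline{X} \in \mathcal{X}_0) = 0$.

(b) $\underline{U} = g(\underline{X})$ is a 1-to-1 mapping from $\mathcal{X}_l$ onto $B$ for all $l = 1, \ldots, k$, with inverse transformation $h_l(\underline{u}) = (h_{l1}(\underline{u}), \ldots, h_{ln}(\underline{u}))'$, $\underline{u} \in B$, i.e., for each $\underline{u} \in B$, $h_l(\underline{u})$ is the unique $\underline{x} \in \mathcal{X}_l$ such that $\underline{u} = g(\underline{x})$.

(v) Assume partial derivatives $\frac{\partial x_i}{\partial u_j} = \frac{\partial h_{li}(\underline{u})}{\partial u_j}$, $l = 1, \ldots, k$, $i, j = 1, \ldots, n$, exist and are continuous.

(vi) Assume the Jacobian of each of the inverse transformations
$$J_l = \det \begin{pmatrix} \frac{\partial h_{l1}}{\partial u_1} & \cdots & \frac{\partial h_{l1}}{\partial u_n} \\ \vdots & & \vdots \\ \frac{\partial h_{ln}}{\partial u_1} & \cdots & \frac{\partial h_{ln}}{\partial u_n} \end{pmatrix}, \quad l = 1, \ldots, k,$$
is different from 0 for all $\underline{u}$ in the range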
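of $g$.

Then the joint pdf of $\underline{U}$ is given by
$$f_{\underline{U}}(\underline{u}) = \sum_{l=1}^{k} |J_l| \, f_{\underline{X}}(h_{l1}(\underline{u}), \ldots, h_{ln}(\underline{u})).$$

Example 4.3.7: Let $X, Y$ be iid $\sim N(0, 1)$. Define
$$U = g_1(X, Y) = \begin{cases} \frac{X}{Y}, & Y \ne 0 \\ 0, & Y = 0 \end{cases}$$
and $V = g_2(X, Y) = |Y|$.

$\mathcal{X} = \mathbb{R}^2$, but $U, V$ are not 1-to-1 mappings from $\mathcal{X}$ onto $B$ since $(U, V)(x, y) = (U, V)(-x, -y)$, i.e., conditions do not apply for the use of Theorem 4.3.5. Let
$$\mathcal{X}_0 = \{(x, y) : y = 0\}, \quad \mathcal{X}_1 = \{(x, y) : y > 0\}, \quad \mathcal{X}_2 = \{(x, y) : y < 0\}.$$
Then $P((X, Y) \in \mathcal{X}_0) = 0$.

Let $B = \{(u, v) : v > 0\} = g(\mathcal{X}_1) = g(\mathcal{X}_2)$.

Inverses:

$B \to \mathcal{X}_1$: $x = h_{11}(u, v) = uv$, $y = h_{12}(u, v) = v$

$B \to \mathcal{X}_2$: $x = h_{21}(u, v) = -uv$, $y = h_{22}(u, v) = -v$

$$J_1 = \det \begin{pmatrix} v & u \\ 0 & 1 \end{pmatrix} = v \ \Rightarrow \ |J_1| = |v|$$
$$J_2 = \det \begin{pmatrix} -v & -u \\ 0 & -1 \end{pmatrix} = v \ \Rightarrow \ |J_2| = |v|$$
$$f_{X,Y}(x, y) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2}$$
$$\begin{aligned}
f_{U,V}(u, v) &= |v| \frac{1}{2\pi} e^{-(uv)^2/2} e^{-v^2/2} + |v| \frac{1}{2\pi} e^{-(-uv)^2/2} e^{-(-v)^2/2} \\
&= \frac{v}{\pi} e^{-\frac{(u^2 + 1)v^2}{2}}, \quad -\infty < u < \infty, \ 0 < v < \infty
\end{aligned}$$

Marginal:
$$\begin{aligned}
f_U(u) &= \int_0^{\infty} \frac{v}{\pi} e^{-\frac{(u^2 + 1)v^2}{2}}\,dv \qquad | \ z = \frac{(u^2 + 1)v^2}{2}, \ \frac{dz}{dv} = (u^2 + 1)v \\
&= \int_0^{\infty} \frac{1}{\pi (u^2 + 1)} e^{-z}\,dz = \frac{1}{\pi (u^2 + 1)} \left( -e^{-z} \right) \Big|_0^{\infty} = \frac{1}{\pi (1 + u^2)}, \quad -\infty < u < \infty
\end{aligned}$$
Thus, the ratio of two iid $N(0, 1)$ rv's is a rv that has a Cauchy distribution.

4.4 Order Statistics

Based on Casella/Berger, Section 5.4

Definition 4.4.1: Let $(X_1, \ldots, X_n)$ be an $n$-rv. The $k$th order statistic $X_{(k)}$ is the $k$th smallest of the $X_i$'s, i.e., $X_{(1)} = \min\{X_1, \ldots, X_n\}$, $X_{(2)} = \min\{\{X_1, \ldots, X_n\} - X_{(1)}\}$, $\ldots$, $X_{(n)} = \max\{X_1, \ldots, X_n\}$. It is $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$, and $\{X_{(1)}, X_{(2)}, \ldots, X_{(n)}\}$ is the set of order statistics for $(X_1, \ldots, X_n)$.

Note: As shown in Theorem 4.3.2, $X_{(1)}$ and $X_{(n)}$ are rv's. This result will be extended in the following Theorem:

Theorem 4.4.2: Let $(X_1, \ldots, X_n)$ be an $n$-rv. Then the $k$th order statistic $X_{(k)}$, $k = 1, \ldots, n$, is also a rv.

Theorem 4.4.3: Let $X_1, \ldots, X_n$ be continuous iid rv's with pdf $f_X$. The joint pdf of $X_{(1)}, \ldots, X_{(n)}$ is
$$f_{X_{(1)}, \ldots, X_{(n)}}(x_1, \ldots, x_n) = \begin{cases} n! \prod_{i=1}^{n} f_X(x_i), & x_1 \le x_2 \le \ldots \le x_n \\ 0, & \text{otherwise} \end{cases}$$

Proof: For the case $n = 3$, look at the following scenario how $X_1$, $X_2$, and $X_3$ can be possibly ordered to yield $X_{(1)} < X_{(2)} < X_{(3)}$. Columns represent $X_{(1)}$, $X_{(2)}$, and $X_{(3)}$. Rows represent $X_1$, $X_2$, and $X_3$:

- $k = 1$: $X_1 < X_2 < X_3$: $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
- $k = 2$: $X_1 < X_3 < X_2$: $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$
- $k = 3$: $X_2 < X_1 < X_3$: $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
- $k = 4$: $X_2 < X_3 < X_1$: $\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$
- $k = 5$: $X_3 < X_1 < X_2$: $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$
- $k = 6$: $X_3 < X_2 < X_1$: $\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$

For $n = 3$, there are $3! = 6$ possible arrangements.

For example, if $k = 2$, we have $X_1 < X_3 < X_2$ with corresponding inverse
$$X_1 = X_{(1)}, \quad X_2 = X_{(3)}, \quad X_3 = X_{(2)}$$
and
$$J_2 = \det \begin{pmatrix} \frac{\partial x_1}{\partial x_{(1)}} & \frac{\partial x_1}{\partial x_{(2)}} & \frac{\partial x_1}{\partial x_{(3)}} \\ \frac{\partial x_2}{\partial x_{(1)}} & \frac{\partial x_2}{\partial x_{(2)}} & \frac{\partial x_2}{\partial x_{(3)}} \\ \frac{\partial x_3}{\partial x_{(1)}} & \frac{\partial x_3}{\partial x_{(2)}} & \frac{\partial x_3}{\partial x_{(3)}} \end{pmatrix} = \det \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} = -1$$
with $|J_2| = 1$.

In general, there are $n!$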
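arrangements of $X_1, \ldots, X_n$ for each $(X_{(1)}, \ldots, X_{(n)})$. This mapping is not 1-to-1. For each mapping, we have an $n \times n$ matrix $J_k$ that results from an $n \times n$ identity matrix through the rearrangement of rows. Therefore, $J_k = \pm 1$ and $|J_k| = 1$. By Theorem 4.3.6, we get for $x_1 \le x_2 \le \ldots \le x_n$:
$$\begin{aligned}
f_{X_{(1)}, \ldots, X_{(n)}}(x_1, \ldots, x_n) &= f_{X_{(1)}, \ldots, X_{(n)}}(x_{(1)}, \ldots, x_{(n)}) \\
&= \sum_{k=1}^{n!} |J_k| \, f_{X_1, \ldots, X_n}(x_{k_1}, x_{k_2}, \ldots, x_{k_n}) \\
&= n! \, f_{X_1, \ldots, X_n}(x_{k_1}, x_{k_2}, \ldots, x_{k_n}) \\
&= n! \prod_{i=1}^{n} f_{X_i}(x_{k_i}) = n! \prod_{i=1}^{n} f_X(x_i)
\end{aligned}$$

Theorem 4.4.4: Let $X_1, \ldots, X_n$ be continuous iid rv's with pdf $f_X$ and cdf $F_X$. Then the following holds:

(i) The marginal pdf of $X_{(k)}$, $k = 1, \ldots, n$, is
$$f_{X_{(k)}}(x) = \frac{n!}{(k - 1)!\,(n - k)!} (F_X(x))^{k-1} (1 - F_X(x))^{n-k} f_X(x).$$

(ii) The joint pdf of $X_{(j)}$ and $X_{(k)}$, $1 \le j < k \le n$, is
$$f_{X_{(j)}, X_{(k)}}(x_j, x_k) = \frac{n!}{(j - 1)!\,(k - j - 1)!\,(n - k)!} \times (F_X(x_j))^{j-1} (F_X(x_k) - F_X(x_j))^{k-j-1} (1 - F_X(x_k))^{n-k} f_X(x_j) f_X(x_k)$$
if $x_j < x_k$ and 0 otherwise.

4.5 Multivariate Expectation

Based on Casella/Berger, Sections 4.2, 4.6 & 4.7

In this section, we assume that $\underline{X} = (X_1, \ldots, X_n)$ is an $n$-rv and $g : \mathbb{R}^n \to \mathbb{R}^m$ is a Borel-measurable function.

Definition 4.5.1: If $m = 1$, i.e., $g$ is univariate, we define the following:

(i) Let $\underline{X}$ be discrete with joint pmf $p_{i_1, \ldots, i_n} = P(X_1 = x_{i_1}, \ldots, X_n = x_{i_n})$. If $\sum_{i_1, \ldots, i_n} p_{i_1, \ldots, i_n} \cdot |g(x_{i_1}, \ldots, x_{i_n})| < \infty$, we define
$$E(g(\underline{X})) = \sum_{i_1, \ldots, i_n} p_{i_1, \ldots, i_n} \cdot g(x_{i_1}, \ldots, x_{i_n})$$
and this value exists.

(ii) Let $\underline{X}$ be continuous with joint pdf $f_{\underline{X}}(\underline{x})$. If $\int_{\mathbb{R}^n} |g(\underline{x})| \, f_{\underline{X}}(\underline{x})\,d\underline{x} < \infty$, we define
$$E(g(\underline{X})) = \int_{\mathbb{R}^n} g(\underline{x}) f_{\underline{X}}(\underline{x})\,d\underline{x}$$
and this value exists.

Note: The above can be extended to vector-valued functions $g$ ($m > 1$) in the obvious way. For example, if $g$ is the identity mapping from $\mathbb{R}^n \to \mathbb{R}^n$, then
$$E(\underline{X}) = \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix}$$
provided that $E(|X_i|) < \infty \ \forall i = 1, \ldots, n$.

Similarly, provided that all expectations exist, we get for the variance-covariance matrix:
$$Var(\underline{X}) = \Sigma_{\underline{X}} = E((\underline{X} - E(\underline{X}))(\underline{X} - E(\underline{X}))')$$
with $(i, j)$th component
$$E((X_i - E(X_i))(X_j - E(X_j))) = Cov(X_i, X_j)$$
and with $(i, i)$th component
$$E((X_i - E(X_i))(X_i - E(X_i))) = Var(X_i) = \sigma_i^2.$$
The correlation $\rho_{ij}$ of $X_i$ and $X_j$ is defined as
$$\rho_{ij} = \frac{Cov(X_i, X_j)}{\sigma_i \sigma_j}.$$
Joint higher-order moments can be defined similarly when needed.

Note: We are often interested in (weighted) sums of rv's or products of rv's and their expectations. This will be addressed in the next two Theorems:

Theorem 4.5.2: Let $X_i$, $i = 1, \ldots, n$, be rv's such that $E(|X_i|) < \infty$. Let $a_1, \ldots, a_n \in \mathbb{R}$ and define $S = \sum_{i=1}^{n} a_i X_i$. Then it holds that $E(|S|) < \infty$ and
$$E(S) = \sum_{i=1}^{n} a_i E(X_i).$$

Proof: Continuous case only:
$$E(|S|) = \int_{\mathbb{R}^n} \left| \sum_{i=1}^{n} a_i x_i \right| f_{\underline{X}}(\underline{x})\,d\underline{x}$$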
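$$\le \int_{\mathbb{R}^n} \sum_{i=1}^n |a_i| \, |x_i| \, f_{\underline{X}}(\underline{x}) \, d\underline{x} = \sum_{i=1}^n |a_i| \int_{\mathbb{R}} |x_i| \left( \int_{\mathbb{R}^{n-1}} f_{\underline{X}}(\underline{x}) \, dx_1 \ldots dx_{i-1} \, dx_{i+1} \ldots dx_n \right) dx_i = \sum_{i=1}^n |a_i| \int_{\mathbb{R}} |x_i| f_{X_i}(x_i) \, dx_i = \sum_{i=1}^n |a_i| \, E(|X_i|) < \infty$$

It follows that $E(S) = \sum_{i=1}^n a_i E(X_i)$ by the same argument without using the absolute values. ∎

Note: If $X_i$, $i = 1, \ldots, n$, are iid with $E(X_i) = \mu$, then

$$E(\overline{X}) = E\Big(\frac{1}{n} \sum_{i=1}^n X_i\Big) = \sum_{i=1}^n \frac{1}{n} E(X_i) = \mu.$$

Theorem 4.5.3: Let $X_i$, $i = 1, \ldots, n$, be independent rv's such that $E(|X_i|) < \infty$. Let $g_i$, $i = 1, \ldots, n$, be Borel-measurable functions. Then

$$E\Big(\prod_{i=1}^n g_i(X_i)\Big) = \prod_{i=1}^n E(g_i(X_i))$$

if all expectations exist.

Proof: By Theorem 4.2.5, $f_{\underline{X}}(\underline{x}) = \prod_{i=1}^n f_{X_i}(x_i)$, and by Theorem 4.2.7, $g_i(X_i)$, $i = 1, \ldots, n$, are also independent. Therefore,

$$E\Big(\prod_{i=1}^n g_i(X_i)\Big) = \int_{\mathbb{R}^n} \Big(\prod_{i=1}^n g_i(x_i)\Big) f_{\underline{X}}(\underline{x}) \, d\underline{x} \overset{\text{Th. 4.2.5}}{=} \int_{\mathbb{R}^n} \prod_{i=1}^n \big( g_i(x_i) f_{X_i}(x_i) \, dx_i \big)$$

$$\overset{\text{Th. 4.2.7}}{=} \int_{\mathbb{R}} g_1(x_1) f_{X_1}(x_1) \, dx_1 \int_{\mathbb{R}} g_2(x_2) f_{X_2}(x_2) \, dx_2 \ldots \int_{\mathbb{R}} g_n(x_n) f_{X_n}(x_n) \, dx_n = \prod_{i=1}^n \int_{\mathbb{R}} g_i(x_i) f_{X_i}(x_i) \, dx_i = \prod_{i=1}^n E(g_i(X_i))$$

∎

Corollary 4.5.4: If $X, Y$ are independent, then $Cov(X, Y) = 0$.

Theorem 4.5.5: Two rv's $X, Y$ are independent iff for all pairs of Borel-measurable functions $g_1$ and $g_2$ it holds that $E(g_1(X) \cdot g_2(Y)) = E(g_1(X)) \cdot E(g_2(Y))$ if all expectations exist.

Proof:

"$\Longrightarrow$": It follows from Theorem 4.5.3 and the independence of $X$ and $Y$ that $E(g_1(X) g_2(Y)) = E(g_1(X)) \cdot E(g_2(Y))$.

"$\Longleftarrow$": From Theorem 4.2.6, we know that $X$ and $Y$ are independent iff $P(X \in A_1, Y \in A_2) = P(X \in A_1) \, P(Y \in A_2) \ \forall$ Borel sets $A_1$ and $A_2$.

How do we relate Theorem 4.2.6 to $g_1$ and $g_2$? Let us define two Borel-measurable functions $g_1$ and $g_2$ as

$$g_1(x) = I_{A_1}(x) = \begin{cases} 1, & x \in A_1 \\ 0, & \text{otherwise} \end{cases} \qquad g_2(y) = I_{A_2}(y) = \begin{cases} 1, & y \in A_2 \\ 0, & \text{otherwise} \end{cases}$$

Then,

$$E(g_1(X)) = 0 \cdot P(X \in A_1^c) + 1 \cdot P(X \in A_1) = P(X \in A_1),$$

$$E(g_2(Y)) = 0 \cdot P(Y \in A_2^c) + 1 \cdot P(Y \in A_2) = P(Y \in A_2),$$

and

$$E(g_1(X) \cdot g_2(Y)) = P(X \in A_1, Y \in A_2).$$

$$\implies P(X \in A_1, Y \in A_2) = E(g_1(X) \cdot g_2(Y)) \overset{\text{given}}{=} E(g_1(X)) \cdot E(g_2(Y)) = P(X \in A_1) \, P(Y \in A_2)$$

$\implies X, Y$ independent by Theorem 4.2.6. ∎

Definition 4.5.6: The $(i_1, i_2, \ldots, i_n)$th multi-way moment of $\underline{X} = (X_1, \ldots, X_n)'$ is defined as

$$m_{i_1 i_2 \ldots i_n} = E(X_1^{i_1} X_2^{i_2} \ldots X_n^{i_n})$$

if it exists. The $(i_1, i_2, \ldots, i_n)$th multi-way central moment of $\underline{X} = (X_1, \ldots, X_n)'$ is defined as

$$\mu_{i_1 i_2 \ldots i_n} = E\Big(\prod_{j=1}^n (X_j - E(X_j))^{i_j}\Big)$$

if it exists.

Note: If we set $i_r = i_s = 1$ and $i_j = 0 \ \forall j \neq r, s$ in Definition 4.5.6 for the multi-way central moment, we get

$$\mu_{0 \ldots 0 \, 1 \, 0 \ldots 0 \, 1 \, 0 \ldots 0} = \mu_{rs} = Cov(X_r, X_s),$$

where the two 1s are at positions $r$ and $s$.

Theorem 4.5.7 (Cauchy-Schwarz Inequality): Let $X, Y$ be 2 rv's with finite variance. Then it holds:

(i) $Cov(X, Y)$ exists.

(ii) $(E(XY))^2 \le E(X^2) E(Y^2)$.

(iii) $(E(XY))^2 = E(X^2) E(Y^2)$ iff there exists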
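an $(\alpha, \beta) \in \mathbb{R}^2 - \{(0,0)\}$ such that $P(\alpha X + \beta Y = 0) = 1$.

Proof: Assumptions: $Var(X), Var(Y) < \infty$. Then also $E(X^2), E(X), E(Y^2), E(Y) < \infty$.

Result used in proof:

$0 \le (a - b)^2 = a^2 - 2ab + b^2 \implies ab \le \frac{a^2 + b^2}{2}$

$0 \le (a + b)^2 = a^2 + 2ab + b^2 \implies -ab \le \frac{a^2 + b^2}{2}$

$\implies |ab| \le \frac{a^2 + b^2}{2} \ \forall a, b \in \mathbb{R}$

(i)

$$E(|XY|) = \int_{\mathbb{R}^2} |xy| \, f_{X,Y}(x,y) \, dx \, dy \le \int_{\mathbb{R}^2} \frac{x^2 + y^2}{2} f_{X,Y}(x,y) \, dx \, dy = \int_{\mathbb{R}^2} \frac{x^2}{2} f_{X,Y}(x,y) \, dy \, dx + \int_{\mathbb{R}^2} \frac{y^2}{2} f_{X,Y}(x,y) \, dx \, dy$$

$$= \int_{\mathbb{R}} \frac{x^2}{2} f_X(x) \, dx + \int_{\mathbb{R}} \frac{y^2}{2} f_Y(y) \, dy = \frac{E(X^2) + E(Y^2)}{2} < \infty$$

$\implies E(XY)$ exists $\implies Cov(X, Y) = E(XY) - E(X)E(Y)$ exists.

(ii) $0 \le E((\alpha X + \beta Y)^2) = \alpha^2 E(X^2) + 2\alpha\beta E(XY) + \beta^2 E(Y^2) \ \forall \alpha, \beta \in \mathbb{R}$ (A)

If $E(X^2) = 0$, then $X$ has a degenerate 1-point Dirac(0) distribution and the inequality trivially is true. Therefore, we can assume that $E(X^2) > 0$. As (A) is true for all $\alpha, \beta \in \mathbb{R}$, we can choose $\alpha = -\frac{E(XY)}{E(X^2)}$, $\beta = 1$.

$$\implies \frac{(E(XY))^2}{E(X^2)} - 2 \frac{(E(XY))^2}{E(X^2)} + E(Y^2) \ge 0$$

$$\implies -(E(XY))^2 + E(Y^2) E(X^2) \ge 0$$

$$\implies (E(XY))^2 \le E(X^2) E(Y^2)$$

(iii) When are the left and right sides of the inequality in (ii) equal?

Assume that $E(X^2) > 0$. $(E(XY))^2 = E(X^2) E(Y^2)$ holds iff $E((\alpha X + \beta Y)^2) = 0$ based on (ii). It is therefore sufficient to show that $E((\alpha X + \beta Y)^2) = 0$ iff $P(\alpha X + \beta Y = 0) = 1$.

"$\Longrightarrow$": Let $Z = \alpha X + \beta Y$. Since $E((\alpha X + \beta Y)^2) = E(Z^2) = Var(Z) + (E(Z))^2 = 0$ and $Var(Z) \ge 0$ and $(E(Z))^2 \ge 0$, it follows that $E(Z) = 0$ and $Var(Z) = 0$. This means that $Z$ has a degenerate 1-point Dirac(0) distribution with $P(Z = 0) = P(\alpha X + \beta Y = 0) = 1$.

"$\Longleftarrow$": If $P(\alpha X + \beta Y = 0) = P(Y = -\frac{\alpha X}{\beta}) = 1$ for some $(\alpha, \beta) \in \mathbb{R}^2 - \{(0,0)\}$, i.e., $Y$ is linearly dependent on $X$ with probability 1, this implies

$$(E(XY))^2 = \Big(E\Big(X \cdot \frac{-\alpha X}{\beta}\Big)\Big)^2 = \Big(\frac{\alpha}{\beta}\Big)^2 (E(X^2))^2 = E(X^2) \Big(\frac{\alpha}{\beta}\Big)^2 E(X^2) = E(X^2) E(Y^2).$$

∎

## 4.6 Multivariate Generating Functions

(Based on Casella/Berger, Sections 4.2 & 4.6)

Definition 4.6.1: Let $\underline{X} = (X_1, \ldots, X_n)'$ be an $n$-rv. We define the multivariate moment generating function (mmgf) of $\underline{X}$ as

$$M_{\underline{X}}(\underline{t}) = E(e^{\underline{t}'\underline{X}}) = E\Big(\exp\Big(\sum_{i=1}^n t_i X_i\Big)\Big)$$

if this expectation exists for $|\underline{t}| = \sqrt{\sum_{i=1}^n t_i^2} < h$ for some $h > 0$.

Definition 4.6.2: Let $\underline{X} = (X_1, \ldots, X_n)'$ be an $n$-rv. We define the $n$-dimensional characteristic function $\Phi_{\underline{X}}: \mathbb{R}^n \to \mathbb{C}$ of $\underline{X}$ as

$$\Phi_{\underline{X}}(\underline{t}) = E(e^{i \underline{t}'\underline{X}}) = E\Big(\exp\Big(i \sum_{j=1}^n t_j X_j\Big)\Big).$$

Note:

(i) $\Phi_{\underline{X}}(\underline{t})$ exists for any real-valued $n$-rv.

(ii) If $M_{\underline{X}}(\underline{t})$ exists, then $\Phi_{\underline{X}}(\underline{t}) = M_{\underline{X}}(i\underline{t})$.

Theorem 4.6.3:

(i) If $M_{\underline{X}}(\underline{t})$ exists, it is unique and uniquely determines the joint distribution of $\underline{X}$. $\Phi_{\underline{X}}(\underline{t})$ is also unique and uniquely determines the joint distribution of $\underline{X}$.

(ii) $M_{\underline{X}}(\underline{t})$ (if it exists) and $\Phi_{\underline{X}}(\underline{t})$ uniquely determine all marginal distributions of $\underline{X}$, i.e., $M_{X_i}(t_i) = M_{\underline{X}}(\underline{0}, t_i, \underline{0})$ and $\Phi_{X_i}(t_i) = \Phi_{\underline{X}}(\underline{0}, t_i, \underline{0})$.

(iii) Joint moments of all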
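orders (if they exist) can be obtained as

$$m_{i_1 \ldots i_n} = \frac{\partial^{\, i_1 + i_2 + \ldots + i_n}}{\partial t_1^{i_1} \partial t_2^{i_2} \ldots \partial t_n^{i_n}} M_{\underline{X}}(\underline{t}) \Big|_{\underline{t} = \underline{0}} = E(X_1^{i_1} X_2^{i_2} \ldots X_n^{i_n})$$

if the mmgf exists and

$$m_{i_1 \ldots i_n} = \frac{1}{i^{\, i_1 + i_2 + \ldots + i_n}} \frac{\partial^{\, i_1 + i_2 + \ldots + i_n}}{\partial t_1^{i_1} \partial t_2^{i_2} \ldots \partial t_n^{i_n}} \Phi_{\underline{X}}(\underline{0}) = E(X_1^{i_1} X_2^{i_2} \ldots X_n^{i_n}).$$

(iv) $X_1, \ldots, X_n$ are independent rv's iff

$$M_{\underline{X}}(t_1, \ldots, t_n) = M_{\underline{X}}(t_1, \underline{0}) \cdot M_{\underline{X}}(0, t_2, \underline{0}) \cdot \ldots \cdot M_{\underline{X}}(\underline{0}, t_n) \quad \forall t_1, \ldots, t_n \in \mathbb{R},$$

given that $M_{\underline{X}}(\underline{t})$ exists. Similarly, $X_1, \ldots, X_n$ are independent rv's iff

$$\Phi_{\underline{X}}(t_1, \ldots, t_n) = \Phi_{\underline{X}}(t_1, \underline{0}) \cdot \Phi_{\underline{X}}(0, t_2, \underline{0}) \cdot \ldots \cdot \Phi_{\underline{X}}(\underline{0}, t_n) \quad \forall t_1, \ldots, t_n \in \mathbb{R}.$$

Proof: Rohatgi, page 162: Theorem 7, Corollary, Theorem 8, and Theorem 9 (for mmgf and the case $n = 2$). ∎

Theorem 4.6.4: Let $X_1, \ldots, X_n$ be independent rv's.

(i) If mgf's $M_{X_1}(t), \ldots, M_{X_n}(t)$ exist, then the mgf of $Y = \sum_{i=1}^n a_i X_i$ is

$$M_Y(t) = \prod_{i=1}^n M_{X_i}(a_i t) \quad \text{(note: } t \text{)}$$

on the common interval where all individual mgf's exist.

(ii) The characteristic function of $Y = \sum_{j=1}^n a_j X_j$ is

$$\Phi_Y(t) = \prod_{j=1}^n \Phi_{X_j}(a_j t) \quad \text{(note: } t \text{)}$$

(iii) If mgf's $M_{X_1}(t), \ldots, M_{X_n}(t)$ exist, then the mmgf of $\underline{X}$ is

$$M_{\underline{X}}(\underline{t}) = \prod_{i=1}^n M_{X_i}(t_i) \quad \text{(note: } t_i \text{)}$$

on the common interval where all individual mgf's exist.

(iv) The $n$-dimensional characteristic function of $\underline{X}$ is

$$\Phi_{\underline{X}}(\underline{t}) = \prod_{j=1}^n \Phi_{X_j}(t_j) \quad \text{(note: } t_j \text{)}$$

Proof: Homework (parts (ii) and (iv) only). ∎

Theorem 4.6.5: Let $X_1, \ldots, X_n$ be independent discrete rv's on the non-negative integers with pgf's $G_{X_1}(s), \ldots, G_{X_n}(s)$. The pgf of $Y = \sum_{i=1}^n X_i$ is

$$G_Y(s) = \prod_{i=1}^n G_{X_i}(s).$$

Proof:

Version 1:

$$G_{X_i}(s) = \sum_{k=0}^\infty P(X_i = k) s^k = E(s^{X_i})$$

$$G_Y(s) = E(s^Y) = E\big(s^{\sum_{i=1}^n X_i}\big) = E\Big(\prod_{i=1}^n s^{X_i}\Big) = \prod_{i=1}^n E(s^{X_i}) = \prod_{i=1}^n G_{X_i}(s),$$

where the product of expectations uses the independence of the $X_i$.

Version 2 (case $n = 2$ only):

$$G_Y(s) = P(Y = 0) + P(Y = 1)s + P(Y = 2)s^2 + \ldots$$

$$= P(X_1 = 0, X_2 = 0) + \big(P(X_1 = 1, X_2 = 0) + P(X_1 = 0, X_2 = 1)\big)s + \big(P(X_1 = 2, X_2 = 0) + P(X_1 = 1, X_2 = 1) + P(X_1 = 0, X_2 = 2)\big)s^2 + \ldots$$

$$\overset{\text{indep.}}{=} P(X_1 = 0)P(X_2 = 0) + \big(P(X_1 = 1)P(X_2 = 0) + P(X_1 = 0)P(X_2 = 1)\big)s + \big(P(X_1 = 2)P(X_2 = 0) + P(X_1 = 1)P(X_2 = 1) + P(X_1 = 0)P(X_2 = 2)\big)s^2 + \ldots$$

$$= \big(P(X_1 = 0) + P(X_1 = 1)s + P(X_1 = 2)s^2 + \ldots\big) \cdot \big(P(X_2 = 0) + P(X_2 = 1)s + P(X_2 = 2)s^2 + \ldots\big) = G_{X_1}(s) \cdot G_{X_2}(s)$$

A generalized proof for $n \ge 3$ needs to be done by induction on $n$. ∎

Theorem 4.6.6: Let $X_1, \ldots, X_N$ be iid discrete rv's on the non-negative integers with common pgf $G_X(s)$. Let $N$ be a discrete rv on the non-negative integers with pgf $G_N(s)$. Let $N$ be independent of the $X_i$'s. Define $S_N = \sum_{i=1}^N X_i$. The pgf of $S_N$ is

$$G_{S_N}(s) = G_N(G_X(s)).$$

Proof:

$$P(S_N = k) = \sum_{n=0}^\infty P(S_N = k \mid N = n) \cdot P(N = n)$$

$$G_{S_N}(s) = \sum_{k=0}^\infty P(S_N = k) \, s^k = \sum_{k=0}^\infty \sum_{n=0}^\infty P(S_N = k \mid N = n) \, P(N = n) \, s^k = \sum_{n=0}^\infty P(N = n) \sum_{k=0}^\infty P(S_N = k \mid N = n) \, s^k$$

$$= \sum_{n=0}^\infty P(N = n) \sum_{k=0}^\infty P(S_n = k) \, s^k = \sum_{n=0}^\infty P(N = n) \sum_{k=0}^\infty P\Big(\sum_{i=1}^n X_i = k\Big) s^k \overset{\text{Th. 4.6.5}}{=} \sum_{n=0}^\infty P(N = n) \prod_{i=1}^n G_{X_i}(s)$$

$$\overset{\text{iid}}{=} \sum_{n=0}^\infty P(N = n) \, (G_X(s))^n = G_N(G_X(s))$$

∎

Example 4.6.7: Starting with a single cell at time 0, after one time unit there is probability $p$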
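that the cell will have split (2 cells), probability $q$ that it will survive without splitting (1 cell), and probability $r$ that it will have died (0 cells). It holds that $p, q, r \ge 0$ and $p + q + r = 1$. Any surviving cells have the same probabilities of splitting or dying. What is the pgf for the number of cells at time 2?

Let $N$ be the number of cells at time 1 and let $X_i$ be the number of cells at time 2 that descend from the $i$th cell alive at time 1. Then

$$G_X(s) = G_N(s) = ps^2 + qs + r, \qquad S_N = \sum_{i=1}^N X_i,$$

and by Theorem 4.6.6

$$G_{S_N}(s) = G_N(G_X(s)) = p(ps^2 + qs + r)^2 + q(ps^2 + qs + r) + r.$$

Theorem 4.6.8: Let $X_1, \ldots, X_N$ be iid rv's with common mgf $M_X(t)$. Let $N$ be a discrete rv on the non-negative integers with mgf $M_N(t)$. Let $N$ be independent of the $X_i$'s. Define $S_N = \sum_{i=1}^N X_i$. The mgf of $S_N$ is

$$M_{S_N}(t) = M_N(\ln M_X(t)).$$

Proof: Consider the case that the $X_i$'s are non-negative integers. We know that

$$G_X(s) = E(s^X) = E(e^{\ln s^X}) = E(e^{(\ln s) \cdot X}) = M_X(\ln s)$$

$$\implies M_X(s) = G_X(e^s)$$

$$\implies M_{S_N}(t) = G_{S_N}(e^t) \overset{\text{Th. 4.6.6}}{=} G_N(G_X(e^t)) = G_N(M_X(t)) = M_N(\ln M_X(t))$$

In the general case, i.e., if the $X_i$'s are not non-negative integers, we need results from Section 4.7 (conditional expectation) to prove this Theorem. ∎

## 4.7 Conditional Expectation

(Based on Casella/Berger, Section 4.4)

In Section 4.1, we established that the conditional pmf of $X$ given $Y = y_j$ (for $P(Y = y_j) > 0$) is a pmf. For continuous rv's $X$ and $Y$, when $f_Y(y) > 0$, $f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$, and $f_{X,Y}$ and $f_Y$ are continuous, then $f_{X|Y}(x \mid y)$ is a pdf and it is the conditional pdf of $X$ given $Y = y$.

Definition 4.7.1: Let $X, Y$ be rv's on $(\Omega, L, P)$. Let $h$ be a Borel-measurable function. Then the conditional expectation of $h(X)$ given $Y$, i.e., $E(h(X) \mid Y)$, is a rv that takes the value $E(h(X) \mid y)$. It is defined as

$$E(h(X) \mid y) = \begin{cases} \sum_{x \in \mathcal{X}} h(x) P(X = x \mid Y = y), & \text{if } (X,Y) \text{ is discrete and } P(Y = y) > 0 \\ \int_{-\infty}^{\infty} h(x) f_{X|Y}(x \mid y) \, dx, & \text{if } (X,Y) \text{ is continuous and } f_Y(y) > 0 \end{cases}$$

Note:

(i) The rv $E(h(X) \mid Y) = g(Y)$ is a function of $Y$ as a rv.

(ii) The usual properties of expectations apply to the conditional expectation:

(a) $E(c \mid Y) = c \ \forall c \in \mathbb{R}$.

(b) $E(aX + b \mid Y) = aE(X \mid Y) + b \ \forall a, b \in \mathbb{R}$.

(c) If $g_1, g_2$ are Borel-measurable functions and if $E(g_1(X)), E(g_2(X))$ exist, then $E(a_1 g_1(X) + a_2 g_2(X) \mid Y) = a_1 E(g_1(X) \mid Y) + a_2 E(g_2(X) \mid Y) \ \forall a_1, a_2 \in \mathbb{R}$.

(d) If $X \ge 0$, then $E(X \mid Y) \ge 0$.

(e) If $X_1 \ge X_2$, then $E(X_1 \mid Y) \ge E(X_2 \mid Y)$.

(iii) Moments are defined in the usual way. If $E(|X|^r \mid Y) < \infty$, then $E(X^r \mid Y)$ exists and is the $r$th conditional moment of $X$ given $Y$.

Example 4.7.2: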
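Recall Example 4.1.12:

$$f_{X,Y}(x,y) = \begin{cases} 2, & 0 < x < y < 1 \\ 0, & \text{otherwise} \end{cases}$$

The conditional pdf's $f_{Y|X}(y \mid x)$ and $f_{X|Y}(x \mid y)$ have been calculated as

$$f_{Y|X}(y \mid x) = \frac{1}{1 - x} \text{ for } x < y < 1 \ (\text{where } 0 < x < 1)$$

and

$$f_{X|Y}(x \mid y) = \frac{1}{y} \text{ for } 0 < x < y \ (\text{where } 0 < y < 1).$$

So,

$$E(X \mid y) = \int_0^y \frac{x}{y} \, dx = \frac{x^2}{2y} \Big|_0^y = \frac{y}{2}$$

and

$$E(Y \mid x) = \int_x^1 \frac{1}{1 - x} \, y \, dy = \frac{1}{1 - x} \, \frac{y^2}{2} \Big|_x^1 = \frac{1}{2} \, \frac{1 - x^2}{1 - x} = \frac{1 + x}{2}.$$

Therefore, we get the rv's $E(X \mid Y) = \frac{Y}{2}$ and $E(Y \mid X) = \frac{1 + X}{2}$.

Theorem 4.7.3: If $E(h(X))$ exists, then

$$E_Y(E_{X|Y}(h(X) \mid Y)) = E(h(X)).$$

Proof: Continuous case only:

$$E_Y(E_{X|Y}(h(X) \mid Y)) = \int_{-\infty}^{\infty} E_{X|Y}(h(X) \mid y) f_Y(y) \, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x) f_{X|Y}(x \mid y) f_Y(y) \, dx \, dy$$

$$= \int_{-\infty}^{\infty} h(x) \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dy \, dx = \int_{-\infty}^{\infty} h(x) f_X(x) \, dx = E(h(X))$$

∎

Theorem 4.7.4: If $E(X^2)$ exists, then

$$Var_Y(E(X \mid Y)) + E_Y(Var(X \mid Y)) = Var(X).$$

Proof:

$$Var_Y(E(X \mid Y)) + E_Y(Var(X \mid Y)) = E_Y\big((E(X \mid Y))^2\big) - \big(E_Y(E(X \mid Y))\big)^2 + E_Y\big(E(X^2 \mid Y) - (E(X \mid Y))^2\big)$$

$$= E_Y\big((E(X \mid Y))^2\big) - (E(X))^2 + E(X^2) - E_Y\big((E(X \mid Y))^2\big) = E(X^2) - (E(X))^2 = Var(X)$$

∎

Note: If $E(X^2)$ exists, then $Var(X) \ge Var_Y(E(X \mid Y))$. $Var(X) = Var_Y(E(X \mid Y))$ iff $X = g(Y)$. The inequality directly follows from Theorem 4.7.4. For equality, it is necessary that

$$E_Y(Var(X \mid Y)) = E_Y\big(E((X - E(X \mid Y))^2 \mid Y)\big) = E_Y\big(E(X^2 \mid Y) - (E(X \mid Y))^2\big) = 0,$$

which holds if $X = E(X \mid Y) = g(Y)$.

If $X, Y$ are independent, $F_{X|Y}(x \mid y) = F_X(x) \ \forall x$. Thus, if $E(h(X))$ exists, then $E(h(X) \mid Y) = E(h(X))$.

Proof of Theorem 4.6.8:

$$M_{S_N}(t) \overset{\text{Def.}}{=} E(e^{t S_N}) = E\Big(\exp\Big(t \sum_{i=1}^N X_i\Big)\Big) \overset{\text{Th. 4.7.3}}{=} E_N\Big(E\Big(\exp\Big(t \sum_{i=1}^N X_i\Big) \,\Big|\, N\Big)\Big)$$

First consider

$$E\Big(\exp\Big(t \sum_{i=1}^N X_i\Big) \,\Big|\, N = n\Big) = E\Big(\exp\Big(t \sum_{i=1}^n X_i\Big)\Big) \overset{\text{Th. 4.6.4 (i)}}{=} \prod_{i=1}^n M_{X_i}(t) \overset{\text{iid}}{=} (M_X(t))^n$$

$$\implies M_{S_N}(t) = E_N\big((M_X(t))^N\big) = E_N\big(\exp(N \ln M_X(t))\big) \overset{(*)}{=} M_N(\ln M_X(t))$$

$(*)$ holds since $M_N(k) = E_N(\exp(N \cdot k))$. ∎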
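The composition rule of Theorem 4.6.6 can be checked on the cell-splitting setting of Example 4.6.7. The following is a minimal sketch, not part of the original notes: the helper names `poly_mul`, `poly_compose`, and `brute_force_time2` and the numerical values chosen for $p$, $q$, $r$ are illustrative assumptions. A pgf is stored as a list of coefficients (index = power of $s$), so $G_N(G_X(s))$ becomes a polynomial composition, which is then compared with a direct enumeration of all outcomes over two time units.

```python
from fractions import Fraction
from itertools import product


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of s)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_compose(outer, inner):
    """Return the coefficient list of outer(inner(s)) via Horner's scheme."""
    result = [outer[-1]]
    for coeff in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += coeff
    return result


def brute_force_time2(p, q, r):
    """Enumerate every outcome over two time units and sum the probabilities."""
    offspring = {0: r, 1: q, 2: p}  # cells produced by one cell in one time unit
    dist = [Fraction(0)] * 5        # at most 4 cells can exist at time 2
    for n1, prob_n1 in offspring.items():
        # Each of the n1 cells alive at time 1 reproduces independently.
        for kids in product(offspring, repeat=n1):
            prob = prob_n1
            for k in kids:
                prob *= offspring[k]
            dist[sum(kids)] += prob
    return dist


# Illustrative values only (assumed, not taken from the notes); p + q + r = 1.
p, q, r = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)

g_n = [r, q, p]                        # G_N(s) = r + q s + p s^2
pgf_time2 = poly_compose(g_n, g_n)     # G_N(G_X(s)) with G_X = G_N
print(pgf_time2)
print(brute_force_time2(p, q, r))
```

Exact rational arithmetic is used so that the two coefficient lists can be compared with `==` rather than with a floating-point tolerance.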

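Theorem 4.7.4 can be verified by hand on the density of Example 4.7.2. The sketch below is an illustration under stated assumptions rather than part of the original notes: the marginals $f_Y(y) = 2y$ and $f_X(x) = 2(1 - x)$ on $(0,1)$ are obtained by integrating the joint pdf, and $Var(X \mid Y) = Y^2/12$ uses the standard variance of a uniform distribution on $(0, y)$, which the notes do not state explicitly. The function names `moment_x` and `moment_y` are hypothetical.

```python
from fractions import Fraction


def moment_y(k):
    """E(Y^k) for f_Y(y) = 2y on (0, 1): the integral of 2 y^(k+1) is 2 / (k + 2)."""
    return Fraction(2, k + 2)


def moment_x(k):
    """E(X^k) for f_X(x) = 2(1 - x) on (0, 1): 2 * (1/(k+1) - 1/(k+2))."""
    return 2 * (Fraction(1, k + 1) - Fraction(1, k + 2))


var_x = moment_x(2) - moment_x(1) ** 2
var_y = moment_y(2) - moment_y(1) ** 2

# E(X | Y) = Y / 2 (Example 4.7.2), so Var_Y(E(X | Y)) = Var(Y) / 4.
var_of_cond_mean = var_y / 4

# Assumption: given Y = y, X is uniform on (0, y), so Var(X | Y) = Y^2 / 12
# and E_Y(Var(X | Y)) = E(Y^2) / 12.
mean_of_cond_var = moment_y(2) / 12

print("Var(X)               =", var_x)
print("Var_Y(E(X|Y))        =", var_of_cond_mean)
print("E_Y(Var(X|Y))        =", mean_of_cond_var)
print("sum of the two parts =", var_of_cond_mean + mean_of_cond_var)
```

Both sides of the identity are computed from separate ingredients (the marginal of $X$ on one side, the marginal of $Y$ and the conditional distribution on the other), so agreement is a genuine check rather than a restatement.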