An Introduction to Probability and Markov Chain Models (MATH 331)
This 58-page set of class notes was uploaded by Zechariah Hilpert on Thursday, September 17, 2015. The notes belong to MATH 331 at the University of Wisconsin - Madison, taught by Staff in Fall.
Section 11.1 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 11.1: Moment Generating Functions. HW: pgs. 465-466, #'s 1, 3, 6, 7, 9, 17.

The moments of a random variable tell you things. Ex: the first moment gives the mean, the second gives the spread, the third gives the skewness, etc. Also, if $E[|X|^k] < \infty$, then
$$P(|X| > n) \le \frac{E[|X|^k]}{n^k}.$$
This says that $P(|X| > n)$ approaches zero at least as fast as $1/n^k$ as $n \to \infty$.

Proof:
$$E[|X|^k] = E[|X|^k 1_{\{|X|\le n\}}] + E[|X|^k 1_{\{|X|>n\}}] \ge E[|X|^k 1_{\{|X|>n\}}] \ge n^k E[1_{\{|X|>n\}}] = n^k P(|X|>n),$$
so $P(|X| > n) \le E[|X|^k]/n^k$.

Moment generating functions have two major properties:
1. They allow us to calculate the moments of a RV.
2. No two different RVs have the same moment generating function. Thus, to prove that a RV has a certain distribution, you really only need its moment generating function.

Definition: For a random variable $X$, the moment generating function of $X$ is $M_X(t) = E[e^{tX}]$. Therefore, for a discrete or continuous RV we have
$$M_X(t) = \sum_{x \in R(X)} e^{tx} p_X(x) \ \text{(discrete case)}, \qquad M_X(t) = \int_{-\infty}^\infty e^{tx} f_X(x)\,dx \ \text{(continuous case)}.$$

Example 1: Let $X$ be a Bernoulli RV with parameter $p$. Then
$$M_X(t) = E[e^{tX}] = e^{t\cdot 0}P(X=0) + e^{t\cdot 1}P(X=1) = (1-p)e^{t\cdot 0} + p e^{t\cdot 1} = 1 - p + pe^t.$$

Example 2: Let $X$ be binomial$(n,p)$. Then, letting $q = 1-p$, we have
$$M_X(t) = E[e^{tX}] = \sum_{x=0}^n e^{tx}\binom{n}{x}p^x q^{n-x} = \sum_{x=0}^n \binom{n}{x}(pe^t)^x q^{n-x} = (pe^t + q)^n.$$

Theorem 1: Let $X$ be a RV with moment generating function $M_X(t)$, and let $M_X^{(n)}(0)$ be the $n$th derivative evaluated at $t = 0$. Then $E[X^n] = M_X^{(n)}(0)$.

Proof (will do the discrete case):
$$M_X'(t) = \frac{d}{dt}E[e^{tX}] = \frac{d}{dt}\sum_{x\in R(X)} e^{tx}p_X(x) = \sum_{x\in R(X)} x e^{tx}p_X(x),$$
so $M_X'(0) = \sum_{x\in R(X)} x e^{0\cdot x}p_X(x) = \sum_{x\in R(X)} x\,p_X(x) = E[X]$. In general,
$$M_X^{(n)}(t) = \sum_{x\in R(X)} x^n e^{tx}p_X(x) \ \Rightarrow\ M_X^{(n)}(0) = \sum_{x\in R(X)} x^n p_X(x) = E[X^n].$$

Back to the examples.

Example 3: Let $X$ be Bernoulli with parameter $p$. Then $M_X(t) = 1 - p + pe^t$, so $M_X^{(n)}(t) = pe^t$ for all $n > 0$. Thus $E[X^n] = M_X^{(n)}(0) = p$.

Example 4: Let $X$ be binomial$(n,p)$. Then $M_X(t) = (pe^t + q)^n$, so
$$M_X'(t) = npe^t(pe^t+q)^{n-1}, \qquad M_X''(t) = npe^t(pe^t+q)^{n-1} + n(n-1)p^2e^{2t}(pe^t+q)^{n-2}.$$
Thus $E[X] = M_X'(0) = np$ and $E[X^2] = M_X''(0) = np + n(n-1)p^2$, so
$$\mathrm{Var}(X) = E[X^2] - E[X]^2 = np + n(n-1)p^2 - n^2p^2 = np - np^2 = np(1-p).$$

Theorem 2: Let $X$ and $Y$ be RVs with $M_X(t) = M_Y(t)$ for all $t$. Then $X$ and $Y$ have the same distribution.

Many times a moment generating function can be computed or is known, but the distribution function is not. Knowing which MGFs go with which distributions would therefore be very helpful.
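Theorem 1 can be checked symbolically. The following is a quick sanity check (not part of the original notes) that differentiating the binomial MGF from Example 2 recovers $E[X] = np$ and $\mathrm{Var}(X) = np(1-p)$; it assumes the SymPy library is available.

```python
import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)
q = 1 - p
M = (p*sp.exp(t) + q)**n          # binomial MGF from Example 2

EX = sp.diff(M, t).subs(t, 0)     # first moment, M'(0)
EX2 = sp.diff(M, t, 2).subs(t, 0) # second moment, M''(0)
var = sp.simplify(EX2 - EX**2)

assert sp.simplify(EX - n*p) == 0             # E[X] = np
assert sp.simplify(var - n*p*(1 - p)) == 0    # Var(X) = np(1-p)
```

The same two lines of differentiation reproduce Example 3 if `M` is replaced by the Bernoulli MGF `1 - p + p*sp.exp(t)`.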
Example 5: Let $X$ be a RV with range $R(X) = \{a_1, \dots, a_n\}$ and probability mass function $p_X(a_i) = p_i$ (and $0$ else), where $\sum_{i=1}^n p_i = 1$. Let's find the moment generating function of $X$.

SOLUTION: We have
$$M_X(t) = E[e^{tX}] = \sum_{i=1}^n e^{t a_i}p_X(a_i) = \sum_{i=1}^n p_i e^{a_i t}.$$

Example 6: A random variable has the following MGF. Find $P(X \le 1.75)$.
$$M_X(t) = \frac19 e^{-2t} + \frac{4}{21} + \frac13 e^{t} + \frac29 e^{1.8t} + \frac17 e^{3.4t}.$$
SOLUTION: By the previous example, the range of $X$ is $\{-2, 0, 1, 1.8, 3.4\}$ with associated probabilities $\{1/9, 4/21, 1/3, 2/9, 1/7\}$. This must be the case by the uniqueness of the moment generating function guaranteed by the above theorem. Thus
$$P(X \le 1.75) = P(X=-2) + P(X=0) + P(X=1) = \frac19 + \frac{4}{21} + \frac13 = \frac{40}{63}.$$

Section 3.5 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 3.5: Independence.

We think of $A$ being independent from $B$ if $P(A \mid B) = P(A)$, right? But using that,
$$P(B \mid A) = \frac{P(AB)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A)} = \frac{P(A)P(B)}{P(A)} = P(B).$$
So if $A$ is independent of $B$, then $B$ is independent of $A$: the notion is symmetric.

Definition 1: Two events $A$ and $B$ are called independent if $P(AB) = P(A)P(B)$.

Example 1: A card is drawn at random from a deck of 52. Let $S$ be the event that it is a spade and $A$ the event of an ace. Are these events independent?

Solution: There are 52 possible outcomes. $P(S) = 13/52$, $P(A) = 4/52$, $P(AS) = 1/52$. Finally, $P(S)P(A) = (13\cdot 4)/52^2 = 1/52$. So yes.

Example 2: Consider choosing randomly from $\{1,2,3\}$. We perform the choice twice. The sample space for this experiment is $S = \{(i,j) : i,j \in \{1,2,3\}\}$. Let $E$ be the event of a 1 in the first choice, so $E = \{(1,1), (1,2), (1,3)\}$. Let $F$ be the event that the sum total is 3, and let $G$ be the event that the sum total is 4. Then $F = \{(1,2), (2,1)\}$ and $G = \{(1,3), (2,2), (3,1)\}$. Assuming all outcomes are equiprobable, we have $P(E) = 1/3$, $P(F) = 2/9$, $P(G) = 1/3$, $P(FE) = 1/9$, $P(GE) = 1/9$. So $E$ and $G$ are independent, whereas $E$ and $F$ are not.

Why? Intuitively, when considering whether or not you will sum to 4, the first roll tells you nothing: no matter what you get, the probability is 1/3 that the second roll will be such that the sum is 4. Not so with a sum of 3. Getting a 1 on the first roll makes a sum of 3 possible, whereas getting a 3, for example, makes it already impossible. Thus there is information there.

Theorem 1: If $A$ and $B$ are independent, then $A$ and $B^c$ are independent as well.
Proof: $P(A) = P(AB) + P(AB^c)$. Therefore
$$P(AB^c) = P(A) - P(AB) = P(A) - P(A)P(B) = P(A)(1 - P(B)) = P(A)P(B^c). \qquad \Box$$

Corollary 1: If $A$ and $B$ are independent, then $A^c$ and $B^c$ are independent.

Note: If $A$ and $B$ are mutually exclusive and $P(A) > 0$ and $P(B) > 0$, then they are dependent (counter to intuition), because $P(A \cap B) = 0 \ne P(A)P(B)$.

Definition 2: Multiple events $A_1, \dots, A_n$ are independent if every subset $\{A_{i_1}, \dots, A_{i_k}\}$ satisfies $P(A_{i_1}\cdots A_{i_k}) = P(A_{i_1})\cdots P(A_{i_k})$.

Definition 3: Multiple events $A_1, \dots, A_n$ are called pairwise independent if $P(A_iA_j) = P(A_i)P(A_j)$ for any two.

Example 3: Consider the electrical circuit with 4 switches on page 117 (DRAW). Suppose that each switch, independently, is closed with probability $p$ and open with probability $1 - p$. If a signal is fed in, what is the probability that it gets through?

Solution: What is the sample space for this problem? I would put $S = \{(a_1,a_2,a_3,a_4) : a_i \in \{0,1\}\}$, where 0 represents closed and 1 represents open. Note: this is equivalent to sequences of 0's and 1's of length 4, so there are $2^4 = 16$ total elements. Let $E_i$ be the event that the $i$th switch is closed; for example, $E_1 = \{(0, a_2, a_3, a_4)\}$. Note that we are told the events are independent. For the signal to get through we need $E_1E_2 \cup E_3E_4$:
$$P(E_1E_2 \cup E_3E_4) = P(E_1E_2) + P(E_3E_4) - P(E_1E_2E_3E_4) = p^2 + p^2 - p^4 = 2p^2 - p^4.$$

Example 4: Adam tosses a coin $n+1$ times; Brian tosses a coin $n$ times. What is the probability that Adam gets more heads than Brian?

Solution: Let $H_1$ and $H_2$ be the numbers of heads of each, and $T_1$, $T_2$ the numbers of tails. Then $P(H_1 > H_2) = P(T_1 > T_2)$ because the coin is fair. But
$$P(T_1 > T_2) = P(n + 1 - H_1 > n - H_2) = P(H_1 < H_2 + 1) = P(H_1 \le H_2).$$
Thus $P(H_1 > H_2) = P(H_1 \le H_2)$, and since $P(H_1 > H_2) + P(H_1 \le H_2) = 1$, we get $P(H_1 > H_2) = 1/2$.

HW: pg. 119, #'s 2, 4, 15, 17, 29.

Section 11.5 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 11.5: Central Limit Theorem. HW: pg. 506, #'s 2, 6 (uniform), 8.

Consider $X_1, X_2, X_3, \dots$, which are independent and identically distributed with mean $\mu$ and variance $\sigma^2$. In nature it is observable that no matter what the distribution of the $X_i$ is, $X_1 + X_2 + \cdots + X_n$ looks like a normal distribution if you back far enough away.

Example 1: Consider rolling a die 100 times (each $X_i$ is the output from one roll) and adding the outcomes. You will get a value around 350, plus or minus some.
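As a rough check of Example 3, here is a small Monte Carlo sketch (not from the notes). It assumes the circuit on page 117 consists of two parallel branches, each containing two switches in series, with a closed switch conducting; the function name `signal_through` is made up for illustration.

```python
import random

def signal_through(p, trials=200_000, seed=1):
    """Estimate P(signal passes): two parallel branches, each with two
    switches in series; each switch is closed (conducting) with prob p."""
    random.seed(seed)
    hits = 0
    for _ in range(trials):
        s = [random.random() < p for _ in range(4)]  # True = closed
        if (s[0] and s[1]) or (s[2] and s[3]):
            hits += 1
    return hits / trials

p = 0.6
exact = 2*p**2 - p**4            # formula from Example 3
assert abs(signal_through(p) - exact) < 0.01
```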
Do this experiment 10,000 times and plot the number of times you get each outcome. It will look like a bell curve.

Example 2: Go to a library and go to the stacks. Each row of books is divided into $n \gg 1$ pieces. Let $X_i$ be the number of books on piece $i$. Then $\sum_{i=1}^n X_i$ is the number of books on a given row. Do this for all rows of the same length. You will get a plot that looks like a bell curve.

Do these bell curves have anything in common? Consider $W_n = X_1 + \cdots + X_n$. "How far back" should we go to view this? What does this mean? Scaling. Why not standardize it? Recall that to standardize we use
$$\frac{W_n - E[W_n]}{\sigma_{W_n}}.$$
This has mean zero (shift over to zero) and variance 1. This seems like the right way to "back off." Here $E[W_n] = E[X_1 + \cdots + X_n] = n\mu$ and $\sigma_{W_n} = \sqrt{\mathrm{Var}(X_1 + \cdots + X_n)} = \sqrt{n\sigma^2} = \sigma\sqrt n$. So what does
$$\frac{W_n - n\mu}{\sigma\sqrt n} = \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt n}$$
look like for large $n$?

Theorem 1 (Central Limit Theorem): Let $X_1, X_2, \dots$ be a sequence of independent and identically distributed random variables, each with expectation $\mu$ and variance $\sigma^2$. Then the distribution of
$$Z_n = \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt n}$$
converges to the distribution of a standard normal random variable. That is,
$$\lim_{n\to\infty} P(Z_n \le t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-x^2/2}\,dx.$$
The proof in the book is based on moment generating functions, but I won't spend time on it.

Example 1: Let $X_1, X_2, \dots$ be iid RVs with mean $\mu$ and standard deviation $\sigma$. Set $S_n = X_1 + X_2 + \cdots + X_n$. For large $n$, what is the approximate probability that $S_n$ is between $E[S_n] - k\sigma_{S_n}$ and $E[S_n] + k\sigma_{S_n}$ (the probability of being within $k$ deviations of the mean)?

Solution: We have $E[S_n] = E[X_1 + \cdots + X_n] = n\mu$ and $\mathrm{Var}(S_n) = n\sigma^2$, so $\sigma_{S_n} = \sigma\sqrt n$. Thus, letting $Z \sim N(0,1)$,
$$P(n\mu - k\sigma\sqrt n \le S_n \le n\mu + k\sigma\sqrt n) = P\left(-k \le \frac{S_n - n\mu}{\sigma\sqrt n} \le k\right) \approx P(-k \le Z \le k) = \frac{1}{\sqrt{2\pi}}\int_{-k}^k e^{-x^2/2}\,dx,$$
which is $0.6826$ for $k=1$, $0.9545$ for $k=2$, $0.9973$ for $k=3$, and $0.9999366$ for $k=4$.

Recall that Chebyshev's inequality gave the following:
$$P(|S_n - n\mu| < k\sigma\sqrt n) = 1 - P(|S_n - n\mu| \ge k\sigma\sqrt n) \ge 1 - \frac{n\sigma^2}{k^2\sigma^2 n} = 1 - \frac{1}{k^2},$$
which is $0$ for $k=1$, $0.75$ for $k=2$, $0.8889$ for $k=3$, and $0.9375$ for $k=4$.

Example 2: At a party, each person will independently eat 1, 2, or 3 appetizers with probability 1/4, 1/2, 1/4, respectively.
You know there will be 80 people at this party. You want to buy enough appetizers so that with probability 0.95 you do not run out. How many should you buy?

Solution: Let $X$ be the number of appetizers eaten and $X_i$ the number eaten by the $i$th person. Then
$$X = \sum_{i=1}^{80} X_i.$$
We want to find $n$ so that $P(X > n) \le 0.05$. The $X_i$ are iid, and
$$E[X_i] = 1\cdot\tfrac14 + 2\cdot\tfrac12 + 3\cdot\tfrac14 = 2, \qquad \mathrm{Var}(X_i) = 1\cdot\tfrac14 + 4\cdot\tfrac12 + 9\cdot\tfrac14 - 2^2 = 4.5 - 4 = 0.5.$$
Thus $E[X] = 80\cdot 2 = 160$ and $\mathrm{Var}(X) = 80\cdot 0.5 = 40$. By the Central Limit Theorem, if $Z$ is a standard normal RV,
$$P(X > n) = P\left(\frac{X - 160}{\sqrt{40}} > \frac{n-160}{\sqrt{40}}\right) \approx P\left(Z > \frac{n-160}{\sqrt{40}}\right).$$
Checking a few $n$: $P(X > 160) \approx 1/2$; $P(X > 170) \approx P(Z > 1.58) \approx 0.057$; $P(X > 171) \approx P(Z > 1.74) \approx 0.041$. So we should buy 171 appetizers.

Note that if we had used Markov's inequality:
$$P(X > n) \le \frac{E[X]}{n} = \frac{160}{n} \le 0.05 \iff n \ge 3200.$$
If we had used Chebyshev's inequality:
$$P(X > n) \le P(|X - 160| > n - 160) \le \frac{\mathrm{Var}(X)}{(n-160)^2} = \frac{40}{(n-160)^2} \le 0.05 \iff (n-160)^2 \ge 800,$$
which gives $n \ge 160 + \sqrt{800} \approx 188.3$, i.e. $n = 189$. This is much better than Markov's result, but still not too close to 171.

Sections 1.5, 1.6, and 1.7 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 1.5: Continuity of Probability Functions.

$P$ is a function from the events of $S$ to $\mathbb R$, so what does continuity mean? Recall: a function $f : \mathbb R \to \mathbb R$ is continuous at a point $c$ if for every sequence $x_n \to c$ we have $f(x_n) \to f(c)$. Equivalently, $f$ is continuous on $\mathbb R$ if and only if for every convergent sequence $x_n$, $\lim f(x_n) = f(\lim x_n)$. We will use this as the method to define continuity.

A sequence of subsets is called increasing if $E_1 \subseteq E_2 \subseteq \cdots \subseteq E_n \subseteq E_{n+1} \subseteq \cdots$, and decreasing if $E_1 \supseteq E_2 \supseteq \cdots \supseteq E_n \supseteq E_{n+1} \supseteq \cdots$. Note that for an increasing sequence,
$$\lim_{n\to\infty} E_n = \bigcup_{i=1}^\infty E_i,$$
and for a decreasing sequence,
$$\lim_{n\to\infty} E_n = \bigcap_{i=1}^\infty E_i.$$

Theorem 1 (Continuity of probability functions): For any increasing or decreasing sequence of events $\{E_n\}$,
$$\lim_{n\to\infty} P(E_n) = P\left(\lim_{n\to\infty} E_n\right).$$
Proof: Let $\{E_i\}$ be increasing. Let $F_1 = E_1$ and $F_i = E_i - E_{i-1}$. Then the $F_i$'s are mutually exclusive (why?) and $\bigcup_{i=1}^\infty F_i = \bigcup_{i=1}^\infty E_i$. Thus
$$P\left(\lim_n E_n\right) = P\left(\bigcup_{i=1}^\infty F_i\right) = \sum_{i=1}^\infty P(F_i) = \lim_{n\to\infty}\sum_{i=1}^n P(F_i) = \lim_{n\to\infty} P\left(\bigcup_{i=1}^n F_i\right) = \lim_{n\to\infty} P(E_n). \qquad\Box$$

Section 1.6: Probabilities 0 and 1, and an example for Sec. 1.5.
The whole important point of the section: if $E$ and $F$ are events with probabilities 1 and 0, respectively, it is not correct to say that $E$ is the sample space $S$ and $F$ is the empty set $\emptyset$.

Example 1: Selecting a point at random from the interval $(0,1)$. How do you do it? Every point has a decimal representation, for example $0.4927848548381\ldots$. Each goes on forever (but may repeat). So we can pick a number by choosing each digit at random from $\{0, 1, \dots, 9\}$.

Consider the probability of selecting $1/3 = 0.3333\ldots$. Let $A_n$ be the event that we have chosen a 3 for the first $n$ digits. Then $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$. Also, $P(A_n) = P(A_{n-1})/10$, since there is a 1 in 10 chance of picking a 3 at each step. Because $P(A_1) = 1/10$, we have $P(A_n) = (1/10)^n$. Also, $\bigcap_n A_n = \{1/3\}$. Thus, by the continuity theorem,
$$P(\{1/3\}) = P\left(\lim_n A_n\right) = \lim_n P(A_n) = \lim_n 10^{-n} = 0.$$
Also note that
$$1 = P(S) = P\big((S - \{1/3\}) \cup \{1/3\}\big) = P(S - \{1/3\}) + P(\{1/3\}) = P(S - \{1/3\}).$$
This is trivial to do for any number. Much more surprisingly, if $\{x_1, x_2, \dots\}$ is a countably infinite collection of points,
$$P\left(\bigcup_{i=1}^\infty \{x_i\}\right) = \sum_{i=1}^\infty P(\{x_i\}) = 0.$$

Read Section 1.7 (not Example 1.21, unless you want to). Homework: pg. 34, #'s 1, 3, 4, 5, 10.

Sections 1.1 and 1.2 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 1.1: Non-mathematical definition: In any experiment, an event that may or may not happen is called random. Ex: weather, outcome of a biological experiment, dice, cards. How to make sense of this notion? And how to calculate?

Good idea: relative frequency interpretation. To find the probability that an event $A$ occurs in an experiment, let $n(A)$ be the number of times that $A$ occurs during $n$ performances of the experiment. Finally, define
$$\text{probability}(A) = p(A) = \lim_{n\to\infty}\frac{n(A)}{n}.$$

Example 1: Rolling a fair die. Will find $p(\text{6 is rolled}) = 1/6$.

Example 2: Flipping a fair coin twice. Will find $p(\text{first heads, then tails}) = 1/4$.

Problems: 1. It cannot be computed exactly. 2. There is no reason to believe that $\lim_{n\to\infty} n(A)/n$ always exists. 3. Notions that do not have repeatability do not have meaning (examples: weather, guilt/innocence in criminal cases). More rigor is needed. In fact, we need set theory for an understanding of probability.

Section 1.2: Sample spaces and events.

Definition 1: The sample space of an experiment, $S$, is the set of all possible outcomes of that experiment.
Each individual outcome is called a sample point, or point. Subsets of $S$ are called events.

I have found that confusions in this class typically come from not understanding the sample space.

Example 3: A coin is tossed twice and the outcome of each toss is recorded. Then $S = \{HH, HT, TH, TT\}$. The event that the second toss was a head is $\{HH, TH\}$.

Example 4: Consider 3 lightbulbs. Our experiment consists of finding out which lightbulb burns out first, and how long (in hours) it takes for this to happen. $S = \{(i,t) : i \in \{1,2,3\},\ t \ge 0\}$, where $i$ tells you which one burns out first and $t$ gives how long it lasted in hours. The event that the 2nd bulb burns out first and lasts less than 3 hours is the set $\{(2,t) : t < 3\} \subseteq S$.

Example 5: A bus arrives at a bus stop sometime between 11 PM and 11:30 PM every night. Give a sample space for the arrival time of the bus. $S = \{t : t \in [0,30]\}$, where $t$ is the number of minutes past 11, or $S = \{t : t \in [11{:}00, 11{:}30]\}$, where $t$ is the time.

Example 6: You roll a four-sided die until a 4 comes up. The event you are interested in is getting a three with the first two rolls.
$$S = \{(a_1, \dots, a_n) : n \ge 1,\ a_n = 4,\ a_i \in \{1,2,3\} \text{ for } i \ne n\},$$
$$E = \{(3, 3, a_3, \dots, a_n) \in S : n \ge 3,\ a_n = 4,\ a_i \in \{1,2,3\} \text{ for } i \ne n\}.$$

Definition 2: $E$ is a subset of $F$ if $x \in E$ implies $x \in F$. Notation: $E \subseteq F$.
Definition 3: $E$ and $F$ are equal, denoted $E = F$, if $E \subseteq F$ and $F \subseteq E$.
Definition 4: The intersection of $E$ and $F$, denoted $EF$ or $E \cap F$, is $\{x \in S : x \in E \text{ and } x \in F\}$.
Definition 5: The union of $E$ and $F$, denoted $E \cup F$, is $\{x \in S : x \in E \text{ or } x \in F\}$.
Definition 6: The complement of $E$, denoted $E^c$, is $\{x \in S : x \notin E\}$.
Definition 7: The difference of $E$ and $F$ is $E - F = \{x \in S : x \in E \text{ and } x \notin F\}$.
Definition 8: Two sets are mutually exclusive if $E \cap F = \emptyset$.

The sets $\bigcup_{i=1}^\infty E_i$ and $\bigcap_{i=1}^\infty E_i$ are defined in the obvious way. Discuss Venn diagrams (no rigor); $E^c \cap (G \cup F)$ is a good example.

Important set relations:
Commutative laws: $E \cup F = F \cup E$, $E \cap F = F \cap E$.
Associative laws: $E \cup (F \cup G) = (E \cup F) \cup G$, $E \cap (F \cap G) = (E \cap F) \cap G$.
Distributive laws: $(EF) \cup H = (E \cup H)(F \cup H)$, $(E \cup F)H = (EH) \cup (FH)$.
De Morgan's first law: $\left(\bigcup_{i} E_i\right)^c = \bigcap_{i} E_i^c$. Note that taking $E_i = \emptyset$ for $i \ge 3$, we have $(E \cup F)^c = E^c \cap F^c$.
De Morgan's second law: $\left(\bigcap_{i} E_i\right)^c = \bigcup_{i} E_i^c$. Note that taking $E_i = S$ for $i \ge 3$, we have $(E \cap F)^c = E^c \cup F^c$.
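The finite versions of these identities are easy to check mechanically. A minimal sketch using Python's built-in set operations (the sample space `S` and events `E`, `F`, `H` are arbitrary choices, not from the notes):

```python
S = set(range(10))           # a small sample space
E = {0, 1, 2, 3}
F = {2, 3, 4, 5}
H = {1, 5, 7}

def comp(A):                 # complement within S
    return S - A

# De Morgan's laws
assert comp(E | F) == comp(E) & comp(F)
assert comp(E & F) == comp(E) | comp(F)
# a distributive law
assert (E & F) | H == (E | H) & (F | H)
```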
Proof of De Morgan's second law (elementwise method): Consider events $E_i \subseteq S$. Let $x \in \left(\bigcap_i E_i\right)^c$. Then $x \notin \bigcap_i E_i$, so $x \notin E_i$ for at least one $E_i$. So $x \in E_i^c$ for that $E_i$, and hence $x \in \bigcup_i E_i^c$. Therefore $\left(\bigcap_i E_i\right)^c \subseteq \bigcup_i E_i^c$. Conversely, let $x \in \bigcup_i E_i^c$. Then $x \in E_i^c$ for at least one $E_i$, so $x \notin E_i$ and hence $x \notin \bigcap_i E_i$. Therefore $x \in \left(\bigcap_i E_i\right)^c$, and $\bigcup_i E_i^c \subseteq \left(\bigcap_i E_i\right)^c$. $\Box$

Proof of De Morgan's first law: We have De Morgan's second law. Apply the second law with $E_i^c$ in place of $E_i$ and take complements. $\Box$

Homework (clear answers wanted): pgs. 9-11, #'s 1, 2, 3, 9, 10, 20.

Chapter 9 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 9.1: Joint Distributions of $n > 2$ Random Variables. Everything is pretty much as it should be.

Definition 1: Let $X_1, X_2, \dots, X_n$ be discrete RVs defined on the same sample space, with ranges $A_1, A_2, \dots, A_n$, respectively. Then
$$p(x_1, x_2, \dots, x_n) = P(X_1 = x_1, \dots, X_n = x_n)$$
is called the joint probability mass function of $X_1, \dots, X_n$. We have:
(a) $p(x_1, \dots, x_n) \ge 0$.
(b) If $x_i \notin A_i$ for some $i$, then $p(x_1, \dots, x_n) = 0$.
(c) $\sum_{x_i \in R(X_i),\, 1\le i\le n} p(x_1, \dots, x_n) = 1$.

The marginals are as they should be also:
$$p_{X_i}(x_i) = P(X_i = x_i) = P(X_i = x_i,\ X_j \in A_j,\ j \ne i) = \sum_{x_j \in A_j,\, j \ne i} p(x_1, \dots, x_n).$$
Can also get the marginal of $(X,Y)$ out of $(X,Y,Z)$:
$$p_{X,Y}(x,y) = \sum_{z \in R(Z)} p_{X,Y,Z}(x,y,z).$$
Can easily generalize. Naturally, the joint distribution function is
$$F(t_1, t_2, \dots, t_n) = P(X_1 \le t_1, X_2 \le t_2, \dots, X_n \le t_n) \quad \text{for all } t_i \in \mathbb R,\ i = 1, 2, \dots, n.$$

The RVs $X_1, \dots, X_n$ are independent if for any sets $A_1, \dots, A_n$ we have
$$P(X_1 \in A_1, \dots, X_n \in A_n) = P(X_1 \in A_1)\cdots P(X_n \in A_n).$$
For discrete RVs this again translates to: for any numbers $x_1, \dots, x_n$,
$$P(X_1 = x_1, \dots, X_n = x_n) = p_{X_1}(x_1)\cdots p_{X_n}(x_n).$$
Also, if $X_1, \dots, X_n$ are independent, then $g_1(X_1), \dots, g_n(X_n)$ are independent for any functions $g_1, \dots, g_n$.

Finally, expectations. For any function $h : \mathbb R^n \to \mathbb R$,
$$E[h(X_1, \dots, X_n)] = \sum_{x_1 \in A_1}\cdots\sum_{x_n \in A_n} h(x_1, \dots, x_n)\,p(x_1, \dots, x_n).$$
For independent RVs, we have $E[g_1(X_1)\cdots g_n(X_n)] = E[g_1(X_1)]\cdots E[g_n(X_n)]$.

Example 1: Let $p(x,y,z) = k(x^2 + y^2 + yz)$ for $x \in \{0,1,2\}$, $y \in \{2,3\}$, $z \in \{3,4\}$.

(a) For what value of $k$ is $p$ a pmf?
Solution: We need to solve
$$\sum_{x=0}^2\sum_{y=2}^3\sum_{z=3}^4 k(x^2 + y^2 + yz) = 1.$$
The sum works out to $203k$, so $k = 1/203$.

(b) Find $p_{Y,Z}$ and $p_Z$.
Solution: For $y \in \{2,3\}$ and $z \in \{3,4\}$,
$$p_{Y,Z}(y,z) = \sum_{x=0}^2 k(x^2 + y^2 + yz) = \frac{5 + 3y^2 + 3yz}{203}.$$
For $z \in \{3,4\}$,
$$p_Z(z) = \sum_{y=2}^3 p_{Y,Z}(y,z) = \frac{10 + 39 + 15z}{203} = \frac{49 + 15z}{203}.$$

(c) Find $E[XZ]$. We have
$$E[XZ] = \sum_x\sum_y\sum_z x z\, k(x^2 + y^2 + yz) = \frac{774}{203} \approx 3.81.$$
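Parts (a) and (c) of Example 1 can be verified by brute force with exact arithmetic (a quick check, not part of the notes):

```python
from fractions import Fraction as F
from itertools import product

xs, ys, zs = (0, 1, 2), (2, 3), (3, 4)

# (a): the normalizing constant k
total = sum(x*x + y*y + y*z for x, y, z in product(xs, ys, zs))
k = F(1, total)
assert k == F(1, 203)

# (c): E[XZ] summed over the whole range
EXZ = sum(F(x*z) * k * (x*x + y*y + y*z)
          for x, y, z in product(xs, ys, zs))
assert EXZ == F(774, 203)
```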
HW: pgs. 383-384, #'s 1, 3, 5.

Sections 3.3 and 3.4 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 3.3:

Theorem 1 (Law of total probability): Let $A$ and $B$ be events with $0 < P(B) < 1$. Then
$$P(A) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c).$$
Proof: $P(A \mid B)P(B) + P(A \mid B^c)P(B^c) = P(AB) + P(AB^c) = P(A)$, where the final equality follows from mutual exclusiveness. $\Box$

Example 1: In a hospital, 35% of those with high blood pressure have had strokes, and 20% of those without high blood pressure have had strokes. If 40% of the patients have high blood pressure, what percent of the patients have had strokes?

Solution: Let $A$ be the event of a random patient having had a stroke, and $B$ the event that the person has high blood pressure. We want $P(A)$:
$$P(A) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c) = 0.35\cdot 0.4 + 0.2\cdot 0.6 = 0.14 + 0.12 = 0.26.$$

Definition 1: Let $\{B_1, \dots, B_n\}$ be a set of nonempty subsets of the sample space $S$. If the sets $B_i$ are mutually exclusive and $\bigcup_i B_i = S$, then $\{B_1, \dots, B_n\}$ is called a partition of $S$.

Theorem 2 (Law of total probability): Let $B_1, \dots, B_n$ be a partition of $S$ with $P(B_i) > 0$. Then for any $A$,
$$P(A) = P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + \cdots + P(A \mid B_n)P(B_n) = \sum_{i=1}^n P(A \mid B_i)P(B_i).$$
Proof: $\sum_i P(A \mid B_i)P(B_i) = \sum_i P(AB_i) = P(A)$, where the final equality follows from mutual exclusiveness. (The partition COULD BE INFINITE, too.) $\Box$

Example 2: An army has 4 sharpshooters. The probabilities of each hitting a target at a given distance are 0.4, 0.6, 0.35, and 0.7, respectively. What is the probability that a given target will be hit if the shooter is chosen randomly?

Solution: Let $A$ be the event that the target is hit, and $B_i$ the event that the $i$th shooter is chosen. Then
$$P(A) = \sum_i P(A \mid B_i)P(B_i) = (0.4 + 0.6 + 0.35 + 0.7)\cdot\tfrac14 = \frac{2.05}{4} = 0.5125.$$

HW: pgs. 96-97, #'s 1, 7, 9, 13.

Section 3.4: Bayes' Formula.

Motivation: Suppose that $A$ is an event that typically follows the events $B_1, B_2, \dots, B_n$, which we know a lot about and which form a partition with $P(B_i) > 0$. Suppose we know $P(B_k)$ and $P(A \mid B_k)$ for each $k$, but we want $P(B_k \mid A)$.

Terminology: The $B_i$'s are the hypotheses, $P(B_i)$ is the prior probability of $B_i$, and $P(B_i \mid A)$ is called the posterior probability of $B_i$ given $A$.

What to do?
$$P(B_k \mid A) = \frac{P(AB_k)}{P(A)} = \frac{P(A \mid B_k)P(B_k)}{\sum_i P(A \mid B_i)P(B_i)}.$$
This is Bayes' theorem. (DON'T MEMORIZE; it is SIMPLE TO DERIVE.)
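The sharpshooter computation above, and the posterior it sets up, can be sketched in a few lines (a check, not part of the notes; the variable names are illustrative only):

```python
# hit probabilities and uniform selection probabilities, from Example 2
hit = [0.40, 0.60, 0.35, 0.70]
prior = [0.25] * 4

# law of total probability: P(target hit)
p_hit = sum(h * b for h, b in zip(hit, prior))
assert abs(p_hit - 0.5125) < 1e-12

# Bayes: posterior probability that each shooter fired, given a hit
posterior = [h * b / p_hit for h, b in zip(hit, prior)]
assert abs(sum(posterior) - 1) < 1e-12
```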
Example 3: An army has 4 sharpshooters. The probabilities of each hitting a target at a given distance are 0.4, 0.6, 0.35, and 0.7, respectively. The shooter is always chosen at random. Suppose that the target is hit. What is the probability the shooter was shooter number 2?

Solution: Let $A$ be the event that the target was hit, and $B_i$ the event that the $i$th shooter was chosen. Then
$$P(B_2 \mid A) = \frac{P(A \mid B_2)P(B_2)}{\sum_i P(A \mid B_i)P(B_i)} = \frac{0.6\cdot 0.25}{(0.4 + 0.6 + 0.35 + 0.7)\cdot 0.25} = \frac{0.15}{0.5125} \approx 0.2927.$$

Example 4: Suppose that 5% of the men and 2% of the women working at a company make over $100,000 a year. If 30% of the employees of the company are women, what percent of those who make over $100,000 a year are women?

Solution: Let $A$ be the event that an employee is a woman, and $B$ the event that an employee makes over $100k. We have $P(B \mid A^c) = 0.05$, $P(B \mid A) = 0.02$, $P(A) = 0.3$, $P(A^c) = 0.7$. We want $P(A \mid B)$. Using the theorem,
$$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A^c)P(A^c)} = \frac{0.02\cdot 0.3}{0.02\cdot 0.3 + 0.05\cdot 0.7} = \frac{0.006}{0.041} \approx 0.1463.$$

HW: pgs. 105-106, #'s 1, 28.

Chapter 10 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 10.2: Covariance. HW: pgs. 424-425, #'s 4, 5, 7, 9.

For discrete RVs $X_1, \dots, X_n$ we have
$$E[h(X_1, \dots, X_n)] = \sum_{x_1 \in R(X_1)}\cdots\sum_{x_n \in R(X_n)} h(x_1, \dots, x_n)P(X_1 = x_1, \dots, X_n = x_n).$$

Variance of linear combinations: Suppose that $X = X_1 + X_2 + \cdots + X_n$. We already know
$$E[X] = \sum_{i=1}^n E[X_i],$$
and this doesn't need independence. What about the variance? Consider $n = 2$. We want
$$E[(X - E[X])^2] = E[(X_1 + X_2 - \mu_1 - \mu_2)^2] = E[((X_1 - \mu_1) + (X_2 - \mu_2))^2]$$
$$= E[(X_1-\mu_1)^2] + E[(X_2-\mu_2)^2] + 2E[(X_1-\mu_1)(X_2-\mu_2)] = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2),$$
where the last term is a definition. Also,
$$\mathrm{Cov}(X_1, X_2) = E[(X_1-\mu_1)(X_2-\mu_2)] = E[X_1X_2] - 2\mu_1\mu_2 + \mu_1\mu_2 = E[X_1X_2] - \mu_1\mu_2.$$
Note that if $X_1$ and $X_2$ are independent, then
$$\mathrm{Cov}(X_1, X_2) = E[X_1X_2] - \mu_1\mu_2 = E[X_1]E[X_2] - \mu_1\mu_2 = 0.$$
In general,
$$\mathrm{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathrm{Var}(X_i) + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^n \mathrm{Cov}(X_i, X_j)$$
(or just write $i < j$). Now, if the $X_i$'s are pairwise independent, then
$$\mathrm{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathrm{Var}(X_i).$$

Example: Let $X$ be a binomial RV with parameters $n$ and $p$, so $X$ is the number of successes in $n$ independent trials. Therefore $X = X_1 + \cdots + X_n$, where $X_i$ is 1 if the $i$th trial was a success and zero otherwise. The $X_i$'s are independent Bernoulli RVs with $P(X_i = 1) = p$. Thus $E[X] = \sum_i E[X_i] = np$ (note: this did not use independence). Variance of the binomial: the $X_i$'s are independent.
Therefore
$$\mathrm{Var}(X) = \sum_{i=1}^n \mathrm{Var}(X_i) = \sum_{i=1}^n (p - p^2) = np(1-p).$$
So much easier!

In general:
$$\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X,Y),$$
$$\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X,Y), \qquad \mathrm{Var}(X-Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X,Y),$$
$$\mathrm{Var}\left(\sum_{i=1}^n a_iX_i\right) = \sum_{i=1}^n a_i^2\mathrm{Var}(X_i) + 2\sum_{i<j} a_ia_j\,\mathrm{Cov}(X_i, X_j).$$
Independence implies
$$\mathrm{Var}\left(\sum_{i=1}^n a_iX_i\right) = \sum_{i=1}^n a_i^2\mathrm{Var}(X_i).$$

Discuss what it means for covariances to be positive, negative, and zero in terms of $X, Y$ being positively, negatively, or un-correlated. All follows from the definition $\mathrm{Cov}(X,Y) = E[(X - E[X])(Y - E[Y])]$.

Example 1: Two random variables may be dependent but uncorrelated. Let $R(X) = \{-1, 0, 1\}$ with $P(X = i) = 1/3$ for each $i$, and let $Y = X^2$. Then
$$\mathrm{Cov}(X,Y) = E[XY] - E[X]E[Y] = E[X^3] - E[X]E[X^2] = 0 - 0\cdot E[X^2] = 0,$$
but they are not independent: $P(X = 1, Y = 0) = 0 \ne P(X=1)P(Y=0) = \frac13\cdot\frac13 = \frac19$. So these are just uncorrelated.

Example 2: Suppose there are 100 cards numbered 1 through 100, and we draw a card at random. Let $X$ be the number of digits ($R(X) = \{1,2,3\}$) and $Y$ the number of zeros ($R(Y) = \{0,1,2\}$). The chart giving the joint probabilities is
$$p(1,0) = \tfrac{9}{100},\quad p(2,0) = \tfrac{81}{100},\quad p(2,1) = \tfrac{9}{100},\quad p(3,2) = \tfrac{1}{100},$$
with marginals $p_X = \left(\tfrac{9}{100}, \tfrac{90}{100}, \tfrac{1}{100}\right)$ and $p_Y = \left(\tfrac{90}{100}, \tfrac{9}{100}, \tfrac{1}{100}\right)$. Then
$$E[X] = 1\cdot\tfrac{9}{100} + 2\cdot\tfrac{90}{100} + 3\cdot\tfrac{1}{100} = 1.92, \qquad E[Y] = 0\cdot\tfrac{90}{100} + 1\cdot\tfrac{9}{100} + 2\cdot\tfrac{1}{100} = 0.11,$$
$$E[XY] = (2\cdot 1)\tfrac{9}{100} + (3\cdot 2)\tfrac{1}{100} = 0.24,$$
$$\mathrm{Cov}(X,Y) = E[XY] - E[X]E[Y] = 0.24 - 1.92\times 0.11 = 0.0288,$$
$$\mathrm{Var}(X) = 0.0936, \qquad \mathrm{Var}(Y) = 0.1179, \qquad \rho = \frac{\mathrm{Cov}(X,Y)}{\sigma_X\sigma_Y} = 0.2742.$$
(These tools also apply if you want to calculate, say, the variance of the sum of $n$ rolls of a die.)

Lemma: Let $X_1, \dots, X_n$ be a random sample of size $n$ from a distribution $F$ with mean $\mu$ and variance $\sigma^2$ (just $n$ iid experiments). Let $\bar X = \frac1n(X_1 + \cdots + X_n)$ be the mean of the random sample. Then
$$E[\bar X] = \mu, \qquad \mathrm{Var}(\bar X) = \frac{\sigma^2}{n}.$$
Proof (easy):
$$E[\bar X] = \frac1n E[X_1 + \cdots + X_n] = \frac{n\mu}{n} = \mu, \qquad \mathrm{Var}(\bar X) = \frac{1}{n^2}\mathrm{Var}(X_1 + \cdots + X_n) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$

Section 5.3 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 5.3: Other discrete RVs.

Geometric RV: We've seen this. Consider lots of Bernoulli trials performed UNTIL A SUCCESS HAPPENS. The sample space is $S = \{s, fs, ffs, fffs, \dots\}$. Suppose the probability of success is $p$, and let $X$ be the number of trials until a success. WHAT IS THE RANGE? $R(X) = \{1, 2, 3, \dots\}$. The probability mass function is given by
$$P(X = n) = (1-p)^{n-1}p, \qquad n \in R(X).$$
Note, by the geometric series,
$$\sum_{n=1}^\infty P(X = n) = \sum_{n=1}^\infty (1-p)^{n-1}p = p\sum_{n=0}^\infty (1-p)^n = p\cdot\frac{1}{1-(1-p)} = 1.$$
A random variable with this probability mass function is called GEOMETRIC. Properties:
$$E[X] = \frac1p, \qquad \mathrm{Var}(X) = \frac{1-p}{p^2}.$$
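Example 2's numbers can be reproduced by brute force over the 100 equally likely cards (a quick check, not in the original notes):

```python
cards = range(1, 101)
X = [len(str(c)) for c in cards]          # number of digits
Y = [str(c).count('0') for c in cards]    # number of zeros

def mean(v):
    return sum(v) / len(v)

EX, EY = mean(X), mean(Y)
cov = mean([x*y for x, y in zip(X, Y)]) - EX * EY
var_x = mean([x*x for x in X]) - EX**2
var_y = mean([y*y for y in Y]) - EY**2
rho = cov / (var_x**0.5 * var_y**0.5)

assert abs(EX - 1.92) < 1e-9 and abs(EY - 0.11) < 1e-9
assert abs(cov - 0.0288) < 1e-9
assert round(rho, 4) == 0.2742
```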
Example 1: Drawing cards with replacement, wanting an ace: $p = 1/13$. Then $P(X > n) = (1-p)^n = (12/13)^n$; for example, $P(X > 9) = (12/13)^9 \approx 0.49$.

What is the minimum number of draws required to get an ace with probability 0.95?

Solution: We want the smallest $n$ such that $P(X \le n) \ge 0.95$. Therefore we want $1 - P(X > n) \ge 0.95$, or $P(X > n) \le 0.05$. Since $P(X > n) = (12/13)^n$, we need $(12/13)^n \le 0.05$, i.e. $n\ln(12/13) \le \ln(0.05)$, i.e. $n \ge \ln(0.05)/\ln(12/13) = 37.42$. Thus $n$ is 38.

Memoryless property: Let $X$ be geometric. Then for all $n, m \ge 1$,
$$P(X > n + m \mid X > m) = \frac{P(X > n+m)}{P(X > m)} = \frac{(1-p)^{n+m}}{(1-p)^m} = (1-p)^n = P(X > n).$$
So the probability that the next $n$ trials will be failures, given that the first $m$ were, is the same as the probability that the first $n$ are failures. Intuitive by independence. However, the geometric is the only discrete RV with this property.

Quicker now. Negative binomial (generalized geometric): Have a sequence of Bernoulli trials with probability of success $p$. Let $X$ be the number of experiments until the $r$th success occurs. Then $X$ is negative binomial with parameters $(r, p)$. Note that NegBin$(1,p)$ = Geom$(p)$. The range is $R(X) = \{r, r+1, r+2, \dots\}$, and for $n \in R(X)$, using independence,
$$P(X = n) = P(r-1 \text{ successes in } n-1 \text{ trials})\,P(\text{success on the } n\text{th}) = \binom{n-1}{r-1}p^{r-1}(1-p)^{n-r}\cdot p = \binom{n-1}{r-1}p^r(1-p)^{n-r}.$$
$$E[X] = \frac{r}{p}, \qquad \mathrm{Var}(X) = \frac{r(1-p)}{p^2}.$$

Example 2: The Red Sox and Brewers are playing in the World Series. The probability that the Red Sox win each game is 0.55. What is the probability that the Brewers win in 6?

Solution: Let $X$ be the number of games needed until the Brewers win their 4th. This is NegBin($r = 4$, $p = 0.45$). Therefore
$$P(X = 6) = \binom{5}{3}(0.45)^4(0.55)^2 \approx 0.124.$$

Hypergeometric RV: Suppose we have a box with $N$ total items, $D$ of which are defective and $N - D$ non-defective. Suppose $n$ are drawn WITHOUT REPLACEMENT, and also suppose $n \le \min(D, N-D)$. Let $X$ be the number of defective items chosen. Then $X$ is a discrete RV with range $R(X) = \{0, 1, 2, \dots, n\}$ and probability mass function
$$P(X = x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}, \qquad x \in \{0, 1, \dots, n\}.$$
This is the hypergeometric probability mass function. We have
$$E[X] = \frac{nD}{N}, \qquad \mathrm{Var}(X) = n\frac{D}{N}\left(1 - \frac{D}{N}\right)\frac{N-n}{N-1}.$$
Note that if we had replacement, $X$ would be binomial with parameters $n$ and $p = D/N$. Thus we would get
$$E[X] = np = \frac{nD}{N}, \qquad \mathrm{Var}(X) = np(1-p) = n\frac{D}{N}\left(1 - \frac{D}{N}\right).$$
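The minimum-number-of-draws computation from Example 1 above (the answer 38) can be confirmed directly (a small check, not part of the notes):

```python
import math

p = 1 / 13           # chance a draw (with replacement) is an ace
# smallest n with P(X <= n) >= 0.95, i.e. (1 - p)**n <= 0.05
n = math.ceil(math.log(0.05) / math.log(1 - p))
assert n == 38

# direct check of the two neighboring values
assert (1 - p)**38 <= 0.05 < (1 - p)**37
```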
This is accurate so long as $n \ll N$. Why? Because replacement shouldn't matter in that case. So when $n \ll N$, the binomial is a good approximation to the hypergeometric.

HW: pgs. 224-226, #'s 3, 5, 9, 15, 16.

Chapter 6 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Ch. 6: Continuous RVs. We are going to go quickly, to see the analogy with the discrete case. Not all RVs are discrete: think about waiting times for anything, or the distance a ball is thrown, etc. Many RVs have uncountable ranges.

6.1 Probability Density Functions.

Motivation: Recall that for discrete RVs the probability mass function can be reconstructed from the distribution function, and vice versa. Idea: changes in the distribution function told us about the probability mass function. Now consider a continuous RV with distribution function $F(t) = P(X \le t)$. SUPPOSE that $F(t)$ is differentiable and that $f = F'$. Then for any $a < b \in \mathbb R$,
$$P(X \in [a,b]) = P(X \in (a,b)) = \cdots = F(b) - F(a) = \int_a^b f(t)\,dt.$$
$f$ is called the probability density function, or just density function. In general, we need that for "nice" sets $A \subseteq \mathbb R$ an $f$ exists which satisfies
$$P(X \in A) = \int_A f(x)\,dx.$$
Properties:
(a) $F(x) = \int_{-\infty}^x f(t)\,dt$.
(b) $\int_{-\infty}^\infty f(x)\,dx = 1$.
(c) $F'(x) = f(x)$ if $f$ is continuous at $x$.

Note (making this rigorous): $P(X = a) = \int_a^a f(x)\,dx = 0$. So the density function does not represent a probability. However, its integral gives the probability of being in certain regions of $\mathbb R$. Also, $f(a)$ gives a measure of the likelihood of being around $a$. That is,
$$P(a - \epsilon/2 < X < a + \epsilon/2) = \int_{a-\epsilon/2}^{a+\epsilon/2} f(t)\,dt \approx f(a)\,\epsilon.$$
So if $f(a) < f(b)$, then $P(a - \epsilon/2 < X < a + \epsilon/2) < P(b - \epsilon/2 < X < b + \epsilon/2)$.

Note that a RV does not have to be discrete or continuous; it could have both properties.

Example 1: Consider a RV with density
$$f(x) = \begin{cases} \dfrac12 - \dfrac{|x-3|}{4}, & 1 \le x \le 5, \\[4pt] 0, & \text{otherwise.} \end{cases}$$
(a) Sketch $f$ and show it is a density. (b) Find the distribution function $F$ and show it is continuous. (c) Sketch $F$.

(b) For $t < 1$, $F(t) = \int_{-\infty}^t f(x)\,dx = 0$. For $1 \le t < 3$, $f(t) = \frac12 + \frac{t-3}{4}$, and so
$$F(t) = \int_{-\infty}^t f(x)\,dx = \int_1^t \left(\frac12 + \frac{x-3}{4}\right)dx = \frac{t^2}{8} - \frac{t}{4} + \frac18.$$
For $3 \le t < 5$, $f(t) = \frac12 - \frac{t-3}{4}$, and so
$$F(t) = F(3) + \int_3^t \left(\frac12 - \frac{x-3}{4}\right)dx = -\frac{t^2}{8} + \frac{5t}{4} - \frac{17}{8}.$$
And for $t \ge 5$, $F(t) = 1$. So
$$F(t) = \begin{cases} 0, & t < 1, \\ \frac{t^2}{8} - \frac{t}{4} + \frac18, & 1 \le t < 3, \\ -\frac{t^2}{8} + \frac{5t}{4} - \frac{17}{8}, & 3 \le t < 5, \\ 1, & t \ge 5. \end{cases}$$
It remains to show that $F$ is continuous.
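Before checking continuity, part (b) can be sanity-checked numerically; the midpoint-rule integrator below is a throwaway helper for this check, not anything from the notes:

```python
def f(x):
    # the triangular density from Example 1
    return 0.5 - abs(x - 3) / 4 if 1 <= x <= 5 else 0.0

def F(t):
    # the piecewise distribution function derived above
    if t < 1:
        return 0.0
    if t < 3:
        return t*t/8 - t/4 + 1/8
    if t < 5:
        return -t*t/8 + 5*t/4 - 17/8
    return 1.0

def integral(a, b, n=20_000):
    # simple midpoint rule, exact enough for a piecewise-linear f
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

assert abs(integral(1, 5) - 1) < 1e-6            # f is a density
for t in (1.5, 2.7, 3.0, 4.2):
    assert abs(integral(1, t) - F(t)) < 1e-6     # F matches its integral
```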
Continuity is shown via limits at the breakpoints $t = 1, 3, 5$. (c) Sketch $F$ using part (b).

HW: Section 6.1, pgs. 238-239, #'s 2, 3, 4.

6.3 Expectations and Variances of Continuous RVs. Again moving quickly; read the section. Same idea as the center of mass in physics, where the weight is the area under $f$.

Definition 1: If $X$ is a RV with density function $f$, the expected value is
$$E[X] = \int_{-\infty}^\infty x f(x)\,dx.$$
Not really hard to see why: discretize $X$ into small ranges $(x_{i-1}, x_i]$, where $x_i - x_{i-1} = h$ is small. Now think of $X$ as discrete with only the values $x_i$. Then
$$E[X] \approx \sum_i x_i P(x_{i-1} < X \le x_i) \approx \sum_i x_i f(x_i)h \to \int_{-\infty}^\infty x f(x)\,dx.$$

Theorem 1: Let $X$ be a continuous RV with probability density function $f$. Then for any $h : \mathbb R \to \mathbb R$,
$$E[h(X)] = \int_{-\infty}^\infty h(x)f(x)\,dx.$$

Corollary 1: Let $X$ be a continuous RV with density $f$, let $h_1, \dots, h_n$ be functions, and let $\alpha_1, \dots, \alpha_n$ be numbers. Then
$$E[\alpha_1 h_1(X) + \cdots + \alpha_n h_n(X)] = \sum_{i=1}^n \alpha_i E[h_i(X)].$$
Of course, we then have $E[\alpha X + \beta] = \alpha E[X] + \beta$.

Finally, Definition 2: If $X$ is a RV with mean $\mu$, then the variance and standard deviation are given by
$$\mathrm{Var}(X) = E[(X-\mu)^2] = \int_{-\infty}^\infty (x-\mu)^2 f(x)\,dx, \qquad \sigma_X = \sqrt{\mathrm{Var}(X)}.$$
We also still have
$$\mathrm{Var}(X) = E[X^2] - (E[X])^2, \qquad \mathrm{Var}(aX+b) = a^2\,\mathrm{Var}(X), \qquad \sigma_{aX+b} = |a|\,\sigma_X.$$
The proofs are exactly the same as in the discrete case.

HW: Section 6.3, pgs. 254-255, #'s 1, 2, 4.

Section 4.4 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 4.4: Expectations of discrete RVs. (Mention that some of the examples are looong. Don't get discouraged.)

We care about the expected value because it gives a "prediction about the future." Suppose we repeat an experiment over and over again, measuring the same quantity $X$ each time. Suppose $R(X) = \{x_1, x_2, \dots\}$. The average is then approximately (law of large numbers, to come later)
$$\frac{X_1 + \cdots + X_n}{n} \approx \sum_i x_i\,p(x_i).$$

Definition 1: Let $X$ be a discrete random variable with probability mass function $p$ and range $R(X)$. The expected value of $X$ is
$$E[X] = \sum_{x \in R(X)} x\,p(x).$$
This is also called the mean or expectation of $X$, and is sometimes denoted $\mu$.

Example 1: Consider a random variable taking values in $\{1, \dots, n\}$ with $P(X = i) = 1/n$ for each $i \in \{1, \dots, n\}$. We say that $X$ is distributed uniformly over $\{1, \dots, n\}$. What is the expectation? We have
$$E[X] = \sum_{i=1}^n i\,P(X = i) = \frac1n\sum_{i=1}^n i = \frac1n\cdot\frac{n(n+1)}{2} = \frac{n+1}{2}.$$

Example 2: Consider rolling a die and letting $X$ be the value. Then $X$ is uniformly distributed on $\{1, \dots, 6\}$, so $E[X] = 7/2 = 3.5$. Note this can never really happen.
But if someone offers you the winnings from the roll if you pay $3, you should probably take the bet.

Example 3: Suppose that $X$ takes values in $\{0, 1, 2, \dots\}$ with probability mass function
$$P(X = x) = e^{-\lambda}\frac{\lambda^x}{x!}.$$
Then $X$ has a Poisson distribution with parameter $\lambda$. The expected value is
$$E[X] = \sum_{x=0}^\infty x\,e^{-\lambda}\frac{\lambda^x}{x!} = \lambda e^{-\lambda}\sum_{x=1}^\infty \frac{\lambda^{x-1}}{(x-1)!} = \lambda e^{-\lambda}e^{\lambda} = \lambda.$$

Example 4: Let $I$ be 1 if $A \subseteq S$ occurs and 0 if $A^c$ occurs (maybe $A$ is the event of a 1, 4, or 6 occurring in a roll of a die). This is called the indicator function of $A$. Then $E[I] = 1\cdot P(I = 1) = P(A)$. So the expected value of an indicator function gives the probability that that event occurs.

Expectation of a function of a random variable: Suppose we have $g(X) = g \circ X$ instead of $X$ and want the expectation. Is $g(X)$ a random variable? Why? Call it $Y$. Thus we have
$$E[g(X)] = E[Y] = \sum_y y\,P(Y = y) = \sum_y y\,P(g(X) = y) = \sum_y y\sum_{x_i :\ g(x_i) = y} P(X = x_i) = \sum_{x_i} g(x_i)P(X = x_i).$$
Hence we have the following for discrete $X$ and a real-valued function $g$:
$$E[g(X)] = \sum_{x \in R(X)} g(x)\,p_X(x).$$

Properties of expectations:
1. $E[\alpha_1 g_1(X) + \alpha_2 g_2(X)] = \alpha_1 E[g_1(X)] + \alpha_2 E[g_2(X)]$ (can generalize).
2. Note: for $a, b \in \mathbb R$, $E[aX + b] = aE[X] + b$.
3. If $X \ge 0$, then $E[X] \ge 0$.
4. If $f \le g$, then $E[f(X)] \le E[g(X)]$.

Example 5: Suppose $X$ has $R(X) = \{-2, 4, 6\}$, with $p(-2) = 1/7$, $p(4) = 2/7$, and $p(6) = 4/7$. What is $E[X(X-2)]$?
$$E[X(X-2)] = (-2)(-2-2)\cdot\tfrac17 + 4(4-2)\cdot\tfrac27 + 6(6-2)\cdot\tfrac47 = \tfrac87 + \tfrac{16}7 + \tfrac{96}7 = \tfrac{120}7.$$
Or: $E[X(X-2)] = E[X^2 - 2X] = E[X^2] - 2E[X] = \frac{180}{7} - 2\cdot\frac{30}{7} = \frac{120}{7}$.

HW: pgs. 173-174, #'s 2, 3, 7, 11, 12.

Section 1.3 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 1.3: Foundations.

Definition 1: Let $S$ be a sample space. Suppose that for each $A \subseteq S$ there is a number $P(A)$ that satisfies:
1. $P(A) \ge 0$ for all $A$.
2. $P(S) = 1$.
3. (*) If $A_1, A_2, \dots$ are mutually exclusive ($A_iA_j = \emptyset$ if $i \ne j$), then
$$P\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty P(A_i).$$
Then $P$ is a probability for $S$, and $P(A)$ is the probability of $A$.

Example 1: Consider flipping a coin and recording the outcome. Then $S = \{H, T\}$. We need a probability; assume a fair coin. Then $P(\{H\}) = 1/2$, $P(\{T\}) = 1/2$, $P(\{H,T\}) = 1$. Note that $P(\{H\} \cup \{T\}) = 1 = 1/2 + 1/2 = P(\{H\}) + P(\{T\})$. This is not the only probability possible for this experiment/sample space: we could have an unfair coin, $P(\{H\}) = p$, $P(\{T\}) = 1 - p$, $P(\{H,T\}) = 1$.

Simple theorems.

Theorem 1: $P(\emptyset) = 0$.
Proof: $P(S) = P(S \cup \emptyset \cup \emptyset \cup \cdots) = P(S) + \sum_i P(\emptyset)$. Thus $P(\emptyset) = 0$. $\Box$

Theorem 2: Let $A_1, \dots, A_n$ be mutually exclusive (but finitely many). Then
$$P\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n P(A_i).$$
Proof: Use the previous result and take the sum to infinity, with the rest being the empty set. $\Box$

Example: Rolling a die: $P(\{1,2,3\}) = P(\{1\}) + P(\{2\}) + P(\{3\}) = \frac16 + \frac16 + \frac16 = \frac12$.

Corollary 1: For any set $A$, $P(A) \le 1$.
Proof: $P(A \cup A^c) = P(S) = 1$. But $A$ and $A^c$ are mutually exclusive, so $1 = P(A \cup A^c) = P(A) + P(A^c)$. Non-negativity now shows $P(A) \le 1$. $\Box$

(More in the book, with examples. This will be used throughout the course.)

Theorem 3: Let $S$ be a sample space with $N$ elements that all have the same probability of occurring. Then for any $A \subseteq S$,
$$P(A) = \frac{\#\,\text{elements of } A}{N}.$$
Proof: Let $A$ consist of $j$ elements, $\{\omega_1, \omega_2, \dots, \omega_j\} \subseteq S$. Then, by the sets being mutually exclusive,
$$P(A) = P(\{\omega_1, \omega_2, \dots, \omega_j\}) = P(\{\omega_1\}) + P(\{\omega_2\}) + \cdots + P(\{\omega_j\}) = \frac{j}{N}. \qquad\Box$$

Example 2: A number is selected at random from the numbers $1, \dots, 52$. What is the probability that the number is divisible by 5?
Solution: $S = \{1, \dots, 52\}$ and $N = 52$. Since $52/5 = 10.4$, there are 10 numbers in $S$ that are divisible by 5 (why?). Thus the probability is $10/52$.

Caution: You do not always have sample points with equal probability. This can depend upon your choice of sample space.
Example: Consider all families with two children. You want to describe the genders of the children. One option, which accounts for birth order, is $S = \{bb, bg, gb, gg\}$. Another, which does not account for order, is $W = \{bb, bg, gg\}$. $W$ does not have equiprobable events: $P(\{bg\}) = 0.5$.

Section 4.1 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 4.1: Random Variables.

Example 1: Consider rolling a fair die twice: $S = \{(i,j) : i,j \in \{1,\dots,6\}\}$. Suppose we are interested in computing the sum; let $X$ be the sum. Then $X \in \{2, 3, \dots, 12\}$ is random, as it depends upon the outcome of the experiment. It is a random variable. We can compute probabilities associated with $X$:
$$P(X=2) = P(\{(1,1)\}) = \frac{1}{36}, \quad P(X=3) = P(\{(1,2),(2,1)\}) = \frac{2}{36}, \quad P(X=4) = P(\{(1,3),(2,2),(3,1)\}) = \frac{3}{36}.$$
We can write this succinctly:

sum $i$:     2,    3,    4,    5,    6,    7,    8,    9,    10,   11,   12
$P(X=i)$: 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36

Loose definition: Let $S$ be a sample space. Then a function $X : S \to \mathbb R$ is a random variable.

Technical definition (in the book): Let $S$ be a sample space. Then $X : S \to \mathbb R$ is a random variable if for each interval $I \subseteq \mathbb R$, $\{s \in S : X(s) \in I\}$ is an event. (This has to do with sets with no probability.)

Terminology: Instead of $\{s \in S : X(s) \in I\}$, we usually write $\{X \in I\}$ or $(X \in I)$.
Terminology: instead of {s ∈ S : X(s) ∈ I}, we usually write (X ∈ I) or {X ∈ I}.

Example 2. Consider a bin with 5 white and 4 red chips. Let S be the sample space associated with selecting three chips, with replacement, from the bin. Let X be the number of white chips chosen. So S = {(a, b, c) : a, b, c ∈ {W, R}}, and

    X(R,R,R) = 0,
    X(R,W,R) = X(R,R,W) = X(W,R,R) = 1,
    X(W,W,R) = X(W,R,W) = X(R,W,W) = 2,
    X(W,W,W) = 3.

We want the probabilities:

    P(X = 0) = P(R,R,R) = 4^3/9^3
    P(X = 1) = P(W,R,R) + P(R,W,R) + P(R,R,W) = 3 · (5 · 4^2)/9^3
    P(X = 2) = 3 · (5^2 · 4)/9^3
    P(X = 3) = 5^3/9^3.

Note: 4^3 + 3 · 5 · 4^2 + 3 · 5^2 · 4 + 5^3 = 9^3, so these sum to one.

What if there is no replacement? Then, e.g., P(X = 0) = C(4,3)/C(9,3) and P(X = 3) = C(5,3)/C(9,3). Again, one can show the probabilities sum to one.

Note that if X and Y are random variables on the sample space S, then so are X + Y, X − Y, aX + bY for a, b ∈ R, XY, and X/Y so long as Y ≠ 0. Also, if f: R → R, then f(X) is also a random variable. So X^2, X^3, sin(X), etc. are all random variables. We can also mix the two ideas: X^2 + Y^2, X^2 + Y − 3Z^4, etc.

Example 3. Consider choosing a cube with side length chosen randomly from the interval (0, 1). Let X be the side length of the cube chosen, let Y = X^2 be the surface area of a side of the cube, and let Z = X^3 be the volume of the cube. By assumption, P(X > 1/2) = 1/2. What are P(Y > 1/4) and P(Z > 1/8)? We have

    P(Y > 1/4) = P(X^2 > 1/4) = P(X > 1/2) = 1/2,
    P(Z > 1/8) = P(X^3 > 1/8) = P(X > 1/2) = 1/2.

Also, P(Z < 1/2) = P(X^3 < 1/2) = P(X < (1/2)^{1/3}) ≈ .7937.

Section 11.2 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 11.2: Sums of Independent Random Variables

Hw: pg. 474, #s 2, 6, 8.

Moment generating functions can be used to prove that sums of RVs have certain distributions.

Theorem 1. Let X_1, X_2, ..., X_n be independent RVs with moment generating functions M_{X_1}(t), ..., M_{X_n}(t). Then the moment generating function of X_1 + ··· + X_n is

    M_{X_1 + ··· + X_n}(t) = M_{X_1}(t) ··· M_{X_n}(t).

Proof. Let W = X_1 + ··· + X_n. Then, by definition,

    M_W(t) = E[e^{tW}] = E[e^{tX_1 + ··· + tX_n}] = E[e^{tX_1} ··· e^{tX_n}]
           = E[e^{tX_1}] ··· E[e^{tX_n}]    (independence)
           = M_{X_1}(t) ··· M_{X_n}(t). ∎

Point: we can use moment generating functions to find the distribution of sums of random variables. We need the uniqueness theorem from the previous section.

Example: sums of binomial RVs are binomial RVs.

Theorem 2. Let X_1, X_2, ..., X_r be independent binomial RVs with parameters (n_1, p), (n_2, p), ..., (n_r, p). Then X_1 + X_2 + ··· + X_r is binomial with parameters (n_1 + n_2 + ··· + n_r, p).
Proof. This should be intuitively clear. Let q = 1 − p. Then we know that M_{X_i}(t) = (pe^t + q)^{n_i}. Let W = X_1 + X_2 + ··· + X_r. Then we have

    M_W(t) = M_{X_1}(t) M_{X_2}(t) ··· M_{X_r}(t)
           = (pe^t + q)^{n_1} (pe^t + q)^{n_2} ··· (pe^t + q)^{n_r}
           = (pe^t + q)^{n_1 + n_2 + ··· + n_r}.

But (pe^t + q)^{n_1 + n_2 + ··· + n_r} is the moment generating function of a binomial RV with parameters (n_1 + n_2 + ··· + n_r, p). Now use the uniqueness property of moment generating functions. ∎

Poissons are the limit of binomials, so the following theorem should be true. (This is surprising, though.)

Theorem 3. Let X_1, X_2, ..., X_n be independent Poisson RVs with parameters λ_1, λ_2, ..., λ_n. Then X_1 + X_2 + ··· + X_n is Poisson with parameter λ_1 + λ_2 + ··· + λ_n.

Proof. Let's first find the MGF of a Poisson RV Y with parameter (mean) λ. We have

    M_Y(t) = E[e^{tY}] = Σ_{y=0}^∞ e^{ty} e^{−λ} λ^y / y!
           = e^{−λ} Σ_{y=0}^∞ (λe^t)^y / y!
           = e^{−λ} e^{λe^t}
           = exp(λ(e^t − 1)).

Now let W = X_1 + ··· + X_n. Then

    M_W(t) = M_{X_1}(t) M_{X_2}(t) ··· M_{X_n}(t)
           = exp(λ_1(e^t − 1)) exp(λ_2(e^t − 1)) ··· exp(λ_n(e^t − 1))
           = exp((λ_1 + ··· + λ_n)(e^t − 1)),

which is the moment generating function for a Poisson RV with parameter λ_1 + λ_2 + ··· + λ_n. ∎

Theorem 4. Let X_1, X_2, ..., X_n be a set of independent RVs with X_i ~ N(μ_i, σ_i^2). Then for any constants α_1, ..., α_n,

    Σ_{i=1}^n α_i X_i ~ N( Σ_{i=1}^n α_i μ_i, Σ_{i=1}^n α_i^2 σ_i^2 ).

In particular, if X_1, ..., X_n are independent normal with the same mean, μ, and variance, σ^2, then S_n = X_1 + ··· + X_n is N(nμ, nσ^2), and the sample mean X̄ = (1/n)(X_1 + ··· + X_n) is N(μ, σ^2/n).

Chapter 8 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 8.1: Joint Distributions of Two Random Variables

(WILL ONLY CONSIDER DISCRETE RANDOM VARIABLES FOR NOW.)

Definition 1. Let X and Y be two discrete RVs defined on the same sample space; thus X, Y: S → R. Then

    p(x, y) = P(X = x, Y = y)

is called the joint probability mass function of X and Y.

Note. Let A, B be the ranges of X, Y, respectively. Then:
1. p(x, y) ≥ 0, and if x ∉ A or y ∉ B, then p(x, y) = 0.
2. Σ_{x∈A} Σ_{y∈B} p(x, y) = 1.

Recovering p_X or p_Y from p(x, y): suppose X and Y have joint probability mass function p(x, y), and let p_X be the probability mass function of X. Then

    p_X(x) = P(X = x) = P(X = x, Y ∈ B) = Σ_{y∈B} P(X = x, Y = y) = Σ_{y∈B} p(x, y).

Similarly, p_Y(y) = Σ_{x∈A} p(x, y). p_X and p_Y so defined are called the marginal probability mass functions, or simply the marginals, of X and Y.
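A short Python sketch of recovering marginals from a joint pmf, using two fair coin flips (X the indicator of a head on the first flip, Y the number of heads):

```python
from fractions import Fraction
from itertools import product

# Build the joint pmf p(x, y) over the 4 equally likely flip sequences.
p = {}
for flips in product('HT', repeat=2):
    x = 1 if flips[0] == 'H' else 0
    y = flips.count('H')
    p[(x, y)] = p.get((x, y), Fraction(0)) + Fraction(1, 4)

# Marginals: sum the joint pmf over the other variable.
pX = {x: sum(v for (a, b), v in p.items() if a == x) for x in (0, 1)}
pY = {y: sum(v for (a, b), v in p.items() if b == y) for y in (0, 1, 2)}
print(pX, pY)
```

The marginals come out uniform on {0, 1} for X and (1/4, 1/2, 1/4) for Y, matching a direct computation.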
Example 1. A college has 90 male and 30 female profs. A committee of 5 is selected at random. Let X and Y be the number of men and women selected, respectively. Find the joint pmf p(x, y) and the marginal pmfs p_X(x) and p_Y(y).

Solution. The range for both RVs is {0, 1, 2, 3, 4, 5}. The joint pmf is

    p(x, y) = C(90, x) C(30, y) / C(120, 5)   if x, y ∈ {0, 1, 2, 3, 4, 5} and x + y = 5,
    p(x, y) = 0   else.

To find p_X(x), we note that for x in the range of X, p_X(x) = Σ_{y=0}^5 p(x, y), which is only nonzero when x + y = 5. Thus

    p_X(x) = p(x, 5 − x) = C(90, x) C(30, 5 − x) / C(120, 5),

and for y in the range of Y,

    p_Y(y) = p(5 − y, y) = C(90, 5 − y) C(30, y) / C(120, 5).

Both are hypergeometric.

Example 2. Flip a coin twice. Let X be 1 if there is a head on the first flip, 0 if a tail. Let Y be the number of heads. Find p(x, y) and p_X, p_Y.

Solution. The ranges for X and Y are {0, 1} and {0, 1, 2}, respectively. We have

    p(0, 0) = P(X = 0, Y = 0) = P(X = 0) P(Y = 0 | X = 0) = (1/2)(1/2) = 1/4
    p(0, 1) = P(X = 0, Y = 1) = P(X = 0) P(Y = 1 | X = 0) = (1/2)(1/2) = 1/4
    p(0, 2) = P(X = 0, Y = 2) = 0
    p(1, 0) = P(X = 1, Y = 0) = 0
    p(1, 1) = P(X = 1, Y = 1) = P(X = 1) P(Y = 1 | X = 1) = (1/2)(1/2) = 1/4
    p(1, 2) = P(X = 1, Y = 2) = P(X = 1) P(Y = 2 | X = 1) = (1/2)(1/2) = 1/4.

Further,

    p_X(0) = p(0,0) + p(0,1) + p(0,2) = 1/2
    p_X(1) = p(1,0) + p(1,1) + p(1,2) = 1/2
    p_Y(0) = p(0,0) + p(1,0) = 1/4
    p_Y(1) = p(0,1) + p(1,1) = 1/2
    p_Y(2) = p(0,2) + p(1,2) = 1/4.

We have

    E[X] = Σ_{x∈A} x p_X(x) = Σ_{x∈A} Σ_{y∈B} x p(x, y),
    E[Y] = Σ_{y∈B} y p_Y(y) = Σ_{y∈B} Σ_{x∈A} y p(x, y).

Theorem 1. If h: R^2 → R, then

    E[h(X, Y)] = Σ_{x∈A} Σ_{y∈B} h(x, y) p(x, y).

Corollary 1. For discrete RVs X and Y on the same probability space, E[X + Y] = E[X] + E[Y].

Proof. Let h(x, y) = x + y in the theorem. Then

    E[X + Y] = Σ_x Σ_y (x + y) p(x, y)
             = Σ_x Σ_y x p(x, y) + Σ_x Σ_y y p(x, y)
             = Σ_x x Σ_y p(x, y) + Σ_y y Σ_x p(x, y)
             = Σ_x x p_X(x) + Σ_y y p_Y(y)
             = E[X] + E[Y]. ∎

Finally, we can define the joint probability distribution function F: R^2 → R,

    F(t, u) = P(X ≤ t, Y ≤ u).

The marginal probability distribution functions are

    F_X(t) = P(X ≤ t) = P(X ≤ t, Y < ∞) = lim_{n→∞} P(X ≤ t, Y ≤ n) = lim_{n→∞} F(t, n) = F(t, ∞),
    F_Y(u) = F(∞, u).

Hw: Section 8.1, pgs. 325-326, #s 1, 2, 3, 4, 5, 7.

Section 4.2 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 4.2: Distribution Functions

Typically we want the probabilities associated with a random variable: either P(X = a) for discrete random variables or P(X ≤ a) for continuous random variables. Both can be captured with the distribution function of X.

Definition 1. If X is a random variable, then the function F_X (or F) defined on (−∞, ∞) by F_X(t) = F(t) = P(X ≤ t) is called the distribution function of X. It is sometimes called the cumulative distribution function.
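For a discrete RV the distribution function is a step function; a small sketch for a single fair die roll, using nothing beyond the definition F(t) = P(X ≤ t):

```python
from fractions import Fraction

# F(t) = P(X <= t) for X a fair die roll: a step function with jumps of
# size 1/6 at 1, 2, ..., 6, constant in between.
def F(t):
    return Fraction(sum(1 for v in range(1, 7) if v <= t), 6)

assert F(0) == 0 and F(6) == 1          # limits at the ends of the range
assert F(1) == Fraction(1, 6)
assert F(3.5) == Fraction(3, 6)         # constant between jump points
print([F(t) for t in range(7)])
```

Note F is defined for all real t, not just the values X can take.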
Example 1. Consider rolling a die. Let S = {1, 2, 3, 4, 5, 6}, and let X: S → R be given by X(s) = s^2. Then P(X = i) = 1/6 for each i ∈ {1, 4, 9, 16, 25, 36}. Note, however, that F is defined for all t ∈ R, and the function is constant between these values:

    F(1) = P(X ≤ 1) = P(X = 1) = 1/6,
    F(4) = P(X ≤ 4) = P(X = 1 or X = 4) = P(X = 1) + P(X = 4) = 2/6,

etc. [DRAW PLOT]

Example 2. Reconsider the cube example. We have P(X ≤ t) = t for all t ∈ (0, 1], so

    F_X(t) = 0 for t ≤ 0,   F_X(t) = t for 0 < t ≤ 1,   F_X(t) = 1 for 1 < t.

For Y = X^2,

    F_Y(t) = 0 for t ≤ 0,   F_Y(t) = √t for 0 < t ≤ 1,   F_Y(t) = 1 for 1 < t.

Similarly for the volume.

Properties of the distribution function:

1. F is nondecreasing. If s ≤ t, then F(s) = P(X ≤ s) ≤ P(X ≤ t) = F(t), because {X ≤ s} ⊆ {X ≤ t}.

2. lim_{t→∞} F(t) = 1. Proof by continuity of the probability function, using the sets {X ≤ t_n} with t_n increasing to ∞. Noting that

    1 = P(X < ∞) = P(∪_n {X ≤ t_n}) = lim_n P(X ≤ t_n) = lim_n F(t_n),

we get lim_{t→∞} F(t) = 1.

3. lim_{t→−∞} F(t) = 0. Same proof.

4. F is right continuous, i.e., lim_{h↓0} F(t + h) = F(t) for all t. Proof by continuity of probability functions with t_n ↓ t and ∩_n {X ≤ t_n} = {X ≤ t}.

The distribution function can be used to compute most values of interest:

1. P(X > a) = 1 − P(X ≤ a) = 1 − F(a).
2. Let a < b. We want P(a < X ≤ b). Now {a < X ≤ b} = {X ≤ b} − {X ≤ a} and {X ≤ a} ⊆ {X ≤ b}, and so by a theorem, P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a).
3. By continuity of probability functions, P(X < a) = lim_{h↓0} F(a − h) = F(a−).
4. P(X ≥ a) = 1 − F(a−).
5. {X = a} = {X ≤ a} − {X < a}, and {X < a} ⊆ {X ≤ a}, so P(X = a) = P(X ≤ a) − P(X < a) = F(a) − F(a−).

So a jump in the distribution function corresponds to some event being given positive probability, and if the function is continuous at a point, then that event has zero probability.

(Give an example with a distribution function defined piecewise over different intervals. Maybe a person showing up at a random time between 1 and 2 PM, being held in a room if they arrive from 1:15 to 1:30, and then being let in at 1:30. So 1:30 carries weight.)

Hw: pgs. 150-152, #s 1, 4, 5, 6, 7, 16.

Chapter 8 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 8.3: Conditional Distributions

Let X and Y be two RVs. We want to understand the distribution of X given that Y = y is known. We need the probability mass function of X given y. This is called the conditional probability mass function of X given that Y = y.
The definition makes sense: let p(x, y) be the joint probability mass function of X and Y. Then

    p_{X|Y}(x|y) = P(X = x | Y = y) = P(X = x, Y = y)/P(Y = y) = p(x, y)/p_Y(y).

Recall that p_Y(y) = Σ_{x∈R_X} p(x, y).

Note:

    Σ_{x∈R_X} p_{X|Y}(x|y) = Σ_{x∈R_X} p(x, y)/p_Y(y) = p_Y(y)/p_Y(y) = 1,

so p_{X|Y}(·|y) is itself a probability mass function.

Example 1. Let p(x, y) be given by

    p(x, y) = (1/15)(x + y)   if x ∈ {0, 1, 2}, y ∈ {1, 2},
    p(x, y) = 0   else.

Find p_{X|Y}(x|y) and P(X = 0 | Y = 2).

Solution. We want p(x, y)/p_Y(y), so we need p_Y(y):

    p_Y(y) = Σ_{x=0}^{2} (1/15)(x + y) = (y + (1 + y) + (2 + y))/15 = (3 + 3y)/15 = (1 + y)/5.

Thus

    p_{X|Y}(x|y) = [(1/15)(x + y)] / [(1 + y)/5] = (x + y)/(3(1 + y)),   x ∈ {0, 1, 2}, y ∈ {1, 2}.

Also, P(X = 0 | Y = 2) = p_{X|Y}(0|2) = 2/(3 · 3) = 2/9.

Suppose that X and Y are independent. Then

    p_{X|Y}(x|y) = p(x, y)/p_Y(y) = p_X(x) p_Y(y)/p_Y(y) = p_X(x).

Expected values are as they should be. The conditional expectation of X given Y = y is

    E[X | Y = y] = Σ_{x∈R_X} x p_{X|Y}(x|y).

As before, if h: R → R, we have

    E[h(X) | Y = y] = Σ_{x∈R_X} h(x) p_{X|Y}(x|y).

Example 2. Previous example: we want E[X | Y = 2]. We had

    p_{X|Y}(x|2) = (x + 2)/9,   x ∈ {0, 1, 2}.

So

    E[X | Y = 2] = 0 · (2/9) + 1 · (3/9) + 2 · (4/9) = 11/9.

Hw: Section 8.3, pgs. 353-355, #s 1, 3, 5 (just the discrete case), 13.

Section 2.2 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 2.2: Combinatorial Methods

How to count. Lots of real-world examples have equiprobable sample points. In this case we know that P(A) = N(A)/N, where N(A) is the number of points contained in A and N is the (finite) size of the sample space. So counting to get N(A) is important.

Theorem 1 (Generalized counting principle). Let E_1, E_2, ..., E_k be sets with n_1, n_2, ..., n_k elements, respectively. Then there are n_1 × n_2 × ··· × n_k ways in which we can first choose an element of E_1, then an element of E_2, then an element of E_3, etc., until E_k.

Proof. Case k = 2: consider E_1 = {a_1, ..., a_n} and E_2 = {b_1, ..., b_m}. We can simply write down all possible ways: for each i ≤ n we have (a_i, b_1), (a_i, b_2), ..., (a_i, b_m), which gives m terms, and there are n of these lists. Multiplying gives the answer. For k ≥ 3, simply consider the choices from the first k − 1 sets as one large choice set with n_1 × n_2 × ··· × n_{k−1} elements, together with the leftover E_k with n_k elements, and multiply. Done by induction (explain). ∎
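The k = 2 case of the counting principle can be checked directly by listing all pairs; a small sketch (the element names here are just placeholders):

```python
from itertools import product

# Counting-principle check, k = 2: choosing first from E1 and then from E2
# can be done in |E1| * |E2| ways.
E1 = ['a1', 'a2', 'a3']
E2 = ['b1', 'b2']
pairs = list(product(E1, E2))
assert len(pairs) == len(E1) * len(E2) == 6
print(pairs)
```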
Example 1. Consider rolling 7 dice. How many possible outcomes are there? What is the probability of rolling at least one three?

Solution. For i ≤ 7, let E_i be the set of possible outcomes of the i-th die, {1, 2, ..., 6}. By the theorem, the number of ways to throw the 7 dice is the number of ways we can choose sequentially from E_1, E_2, ..., E_7. Thus there are 6 × 6 × 6 × 6 × 6 × 6 × 6 = 6^7 ways. Let A be the event of rolling at least one 3. Then we know that P(A) = 1 − P(A^c), so we want the probability of rolling no threes. There are 5^7 ways of doing this. Thus

    P(A) = 1 − P(A^c) = 1 − 5^7/6^7 ≈ 1 − .279 = .721.

Example 2. Suppose a state's license plate consists of 3 numbers followed by 3 letters. Suppose this state has 24 million people in it. Are there enough license plates?

Solution. Let E_1, E_2, E_3 consist of the different possibilities for the first, second, and third number, respectively. Let E_4, E_5, E_6 consist of the different possibilities for the letters. Then E_1, E_2, E_3 consist of the numbers 0-9 and have 10 elements each, and the others consist of the 26 letters of the alphabet. Therefore there are 10^3 × 26^3 = 17,576,000 plates. Not enough.

Example 3. At a party of n people, everyone is expected to give a gift to everyone else. How many gifts are given?

Solution. Let E_1 be the set of people giving a gift and E_2 be the set of people receiving a gift. Each consists of n elements, so there are n^2 ways to choose from givers and takers. However, we have overcounted: we allowed a person to give a gift to himself, and there were n ways for this to occur. Thus n^2 − n gifts are given.

Example 4. At the same party, everyone shakes everyone else's hand. How many handshakes took place?

Solution. Again consider the two sets representing the people shaking hands, E_1 and E_2. There are still n^2 combinations. But again we cannot count a self-handshake, so there are n^2 − n pairings of different people. However, we have now double counted: (A, B) vs. (B, A), for example. Thus the real number is (n^2 − n)/2 = n(n − 1)/2. (Is this actually divisible by 2? Yes: one of n and n − 1 is even.)
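The three computations above are easy to check numerically; a short sketch:

```python
from itertools import product

# Example 1: P(at least one 3 in 7 dice) = 1 - 5^7/6^7.
p = 1 - (5 / 6) ** 7
assert abs(p - 0.721) < 0.001

# Example 2: license plates, 3 digits then 3 letters.
plates = 10 ** 3 * 26 ** 3
assert plates == 17_576_000            # fewer than 24 million

# Example 4, brute force for a small n: handshakes = n(n-1)/2.
n = 6
pairs = sum(1 for a, b in product(range(n), repeat=2) if a < b)
assert pairs == n * (n - 1) // 2
print(p, plates, pairs)
```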
(In the book: the nice birthday problem. Very counterintuitive: in a class with 23 students there is a greater than 50% chance of two people having the same birthday; in a class of 60, the chance is 99.5%.)

How large is the power set of S?

Theorem 2. A set with n elements has 2^n subsets.

Proof. Let A = {a_1, ..., a_n} have n ordered elements. To each subset B ⊆ A we associate a sequence of 1's and 0's in the following way: B ↦ b_1 b_2 ··· b_n, with b_i = 0 if a_i ∉ B and b_i = 1 if a_i ∈ B. (E.g., if n = 3, the empty set is 000; the set {a_1, a_3} is 101; etc.) Now the question is how many sequences of length n there are consisting only of 1's and 0's. By the counting principle, there are 2^n. ∎

Homework: pgs. 44-45, #s 1, 3, 5, 14, 15, 17.

Chapter 8 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 8.2: Independent Random Variables

Definition 1. Let X and Y be two RVs. X and Y are independent if, for any sets of real numbers A and B,

    P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

One can show that this condition is equivalent to

    P(X ≤ a, Y ≤ b) = P(X ≤ a) P(Y ≤ b)

holding for any real numbers a, b. Therefore:

Theorem 1. If F is the joint probability distribution function for X and Y, then X and Y are independent random variables if and only if, for all t, u ∈ R,

    F(t, u) = F_X(t) F_Y(u).

Now we focus on discrete random variables. Let X, Y be discrete RVs with ranges E and F. Then X and Y are independent if, for all x ∈ E and y ∈ F,

    P(X = x, Y = y) = P(X = x) P(Y = y).

This says something about the probability mass functions:

Theorem 2. Let X and Y be two discrete RVs on the same sample space. If p(x, y) is the joint pmf of X and Y, then they are independent if and only if, for all real x and y,

    p(x, y) = p_X(x) p_Y(y).

Note that if x, y are in the ranges of X, Y, respectively, then

    P(X = x, Y = y) = P(X = x | Y = y) P(Y = y).

Not surprisingly, functions of independent RVs are independent:

Theorem 3. Let X and Y be independent RVs, and let g: R → R and h: R → R be real-valued functions. Then g(X) and h(Y) are also independent random variables.

Proof. We will show that P(g(X) ≤ a, h(Y) ≤ b) = P(g(X) ≤ a) P(h(Y) ≤ b). Let A = {x : g(x) ≤ a} and B = {y : h(y) ≤ b}. Then

    P(g(X) ≤ a, h(Y) ≤ b) = P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) = P(g(X) ≤ a) P(h(Y) ≤ b). ∎

Theorem 4. Let X and Y be independent RVs. Then for all real-valued functions g and h,

    E[g(X) h(Y)] = E[g(X)] E[h(Y)].

Proof. Let A, B be the ranges of X, Y, respectively, and let p(x, y) be the joint pmf.
Then

    E[g(X) h(Y)] = Σ_{x∈A} Σ_{y∈B} g(x) h(y) p(x, y)
                 = Σ_{x∈A} g(x) Σ_{y∈B} h(y) p_X(x) p_Y(y)    (independence)
                 = ( Σ_{x∈A} g(x) p_X(x) ) ( Σ_{y∈B} h(y) p_Y(y) )
                 = E[g(X)] E[h(Y)]. ∎

An important application: E[XY] = E[X] E[Y] if X and Y are independent. THE CONVERSE IS IN GENERAL FALSE.

Example 1. Let R_X = {−1, 0, 1} with p(−1) = p(0) = p(1) = 1/3, and let Y = X^2. Then

    E[X] = (−1 + 0 + 1)(1/3) = 0,
    E[Y] = 1 · (1/3) + 0 · (1/3) + 1 · (1/3) = 2/3,
    E[XY] = E[X^3] = (−1)(1/3) + 0 · (1/3) + 1 · (1/3) = 0,

so E[XY] = E[X] E[Y]. But X and Y are clearly not independent:

    P(X = 1, Y = 1) = P(Y = 1 | X = 1) P(X = 1) = 1 · (1/3) = 1/3,   while   P(X = 1) P(Y = 1) = (1/3)(2/3) = 2/9.

Hw: Section 8.2, pgs. 339-340, #s 1, 2, 3, 5, 7, 8.

Chapter 4 note, a new game, Math 331, Fall 2008. Instructor: David Anderson.

Chapter 4: Understanding a game

The owner of a casino has been presented a new game and wants to understand it better before putting it in his casino. He wants a game in which he will win money (and so the player will lose money) in the long run, but in which the players have the potential for big wins. Would the following game be what he is looking for?

Name of the game: Hearts On Fire.

Rules: The player pays $1 to play. The player is then dealt 4 cards. If the player is dealt one heart, the player loses and wins nothing. If two hearts are dealt, the player wins $2; if 3 hearts are dealt, the player wins $5; and if 4 hearts are dealt, the player wins $50 (and sirens and whistles go off). If the player is dealt no hearts, the player is dealt a completely new hand with a new full deck of cards, where the payouts for this hand are the same as before, except that now, if no hearts are dealt a second time, the player wins nothing.

What we need: compute the expected value, variance, and standard deviation of the earnings the casino makes on each play of this game. Should the casino pick up the game?

Advice: give a good sample space, come up with a relevant random variable, consider its range and pmf, define the sets of interest, and proceed in an algorithmic manner.

Solution. A good sample space for this experiment is

    S = {(a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4) | a_i, b_j ∈ {the 52 cards}},

with order not mattering. It is understood that the a's in the above sample space represent the cards from the first hand and the b's represent the cards from the second hand, which, for the purposes of our computation, we assume is dealt even if we don't see it.
Let X be the net earnings (in dollars) of the casino in one instance of the game. Then the range of X is

    R_X = {1, 1 − 2, 1 − 5, 1 − 50} = {1, −1, −4, −49}.

We need the probability mass function of X. To do this algorithmically, I will define the following events: let A_i and B_j denote the events of i hearts in the first four cards and j hearts in the second four cards, respectively. Note that for each i, j, A_i and B_j are independent, and each probability can be computed: P(A_i) = C(13, i) C(39, 4 − i)/C(52, 4), and similarly for the B_j. We have that

    {X = 1}   (player loses)     = A_1 ∪ (A_0 ∩ (B_0 ∪ B_1))
    {X = −1}  (player wins $2)   = A_2 ∪ (A_0 ∩ B_2)
    {X = −4}  (player wins $5)   = A_3 ∪ (A_0 ∩ B_3)
    {X = −49} (player wins $50)  = A_4 ∪ (A_0 ∩ B_4).

Note that the sets A_0, A_1, A_2, A_3, A_4 are mutually exclusive, and the sets B_0, B_1, B_2, B_3, B_4 are mutually exclusive as well. Using that fact with the independence of the A's and B's, we can compute the desired probabilities to get the probability mass function:

    P(X = 1) = P(A_1 ∪ (A_0 ∩ (B_0 ∪ B_1)))
             = P(A_1) + P(A_0 ∩ (B_0 ∪ B_1))          (mutually exclusive)
             = P(A_1) + P(A_0) P(B_0 ∪ B_1)           (independence)
             = P(A_1) + P(A_0) (P(B_0) + P(B_1))      (mutually exclusive)
             ≈ 0.66448,

    P(X = −1) = P(A_2 ∪ (A_0 ∩ B_2)) = P(A_2) + P(A_0) P(B_2) ≈ 0.27836,

    P(X = −4) = P(A_3 ∪ (A_0 ∩ B_3)) = P(A_3) + P(A_0) P(B_3) ≈ 0.0537179,

    P(X = −49) = P(A_4 ∪ (A_0 ∩ B_4)) = P(A_4) + P(A_0) P(B_4) ≈ 0.003443.
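These four probabilities can be verified numerically; a sketch using the hypergeometric form P(A_i) = C(13, i)C(39, 4 − i)/C(52, 4):

```python
from fractions import Fraction
from math import comb

# A(i) = P(i hearts among 4 cards from a fresh 52-card deck).
def A(i):
    return Fraction(comb(13, i) * comb(39, 4 - i), comb(52, 4))

# pmf of the casino's net earnings X; the two hands are independent.
pmf = {
     1: A(1) + A(0) * (A(0) + A(1)),   # casino wins $1
    -1: A(2) + A(0) * A(2),            # player wins $2
    -4: A(3) + A(0) * A(3),            # player wins $5
   -49: A(4) + A(0) * A(4),            # player wins $50
}
assert sum(pmf.values()) == 1
assert abs(float(pmf[1]) - 0.66448) < 1e-4
assert abs(float(pmf[-49]) - 0.003443) < 1e-5
print({k: float(v) for k, v in pmf.items()})
```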
on how it got there This contrasts with the independent trials models we have considered in the law of large numbers and the central limit theorem For independent trial processes the possible outcomes of each trial of the experiment are the same and occur with the same probability Furthermore what happens on any trial is not affected by what happens on any other trial With Markov chain models we can generalize this to the extent that we allow the future to depend on the present We formulate this notion precisely in the following de nition De nition 1 Let Xn n 2 0 be a sequence of random variables with values in a nite or countably in nite set S Furthermore assume PXn1 y l X0 07X1 hwana Mian PXn1 len 107y for all n gt 0 and all states xy0x1xn1 in S Then Xn l n gt 0 is called a Markov chain with state space S and transition matrix P The property is called the Markov property 0 Therefore Markov chains are processes whose future is dependent only upon the present state Such processes arise abundantly in the natural mathematical and social sciences Some examples Position of a random walker number of each species of an ecosystem in a given year The initial distribution for the Markov chain is the sequence of probabilities 7Tx PX0 z z E S or in vector form 7139 l x E S Remarks i can be interpreted as stating that the conditional distribution of the random future state Xn1 depends only on the present state Xn and is independent of the past states X0 Xnil ii One can think of a Markov chain as a model for jumping from state to state of S and all jumps are governed by the jump probabilities pz39j A frog jumping from one pad to the next in a pond where the pads consist of the states of S and the jumps are taken according to the transition probabilities is a good picture to keep in mind iii Observe that py 2 0 for all my 6 S and 21196711 1 yES iv If S has 7 elements7 we will frequently denote S by 1727 77 or 0717 7 r 1 This appears in the notation used7 sometimes without 
Example 1. A colleague travels between four coffee shops, connected by paths. Assume he/she chooses among the paths departing from a shop by treating each path as equally likely. If we model our colleague's journey as a Markov chain, then a suitable state space is S = {1, 2, 3, 4}, and (for the path layout drawn in lecture) the transition matrix is easily seen to be

    P = [  0    1    0    0  ]
        [ 1/3   0   1/3  1/3 ]
        [  0   1/2   0   1/2 ]
        [  0   1/2  1/2   0  ].

Suppose this person has an initial distribution among the shops of π(x_0) = 1/4 for x_0 = 1, 2, 3, 4, or, in vector form, π = (1/4, 1/4, 1/4, 1/4).

Show that P(X_0 = 1, X_1 = 2, X_2 = 4, X_3 = 3) = π(1) p(1,2) p(2,4) p(4,3) = 1/24. Using conditioning and the Markov property, we see that

    P(X_0 = 1, X_1 = 2, X_2 = 4, X_3 = 3)
      = P(X_3 = 3 | X_0 = 1, X_1 = 2, X_2 = 4) P(X_0 = 1, X_1 = 2, X_2 = 4)
      = p(4,3) P(X_2 = 4 | X_0 = 1, X_1 = 2) P(X_0 = 1, X_1 = 2)
      = p(4,3) p(2,4) P(X_1 = 2 | X_0 = 1) P(X_0 = 1)
      = p(4,3) p(2,4) p(1,2) π(1)
      = (1/4)(1)(1/3)(1/2) = 1/24.

Using similar reasoning, we have the general fact that

    P(X_0 = x_0, X_1 = x_1, ..., X_{n−1} = x_{n−1}, X_n = x_n) = π(x_0) p(x_0, x_1) p(x_1, x_2) ··· p(x_{n−1}, x_n),

which holds for all n ≥ 0 and all choices of states in S. We also have

    P(X_1 = y) = Σ_x P(X_0 = x, X_1 = y) = Σ_x π(x) p(x, y),

and hence (P(X_1 = 1), ..., P(X_1 = 4)) = πP. Similarly,

    P(X_2 = y) = Σ_x P(X_1 = x, X_2 = y) = Σ_x P(X_1 = x) p(x, y),

so (P(X_2 = 1), ..., P(X_2 = 4)) = (πP)P = πP^2. Continuing in this fashion, we get the following general fact.

Theorem 1. Let P denote the transition matrix of a Markov chain {X_n | n ≥ 0} with initial distribution π. Then

    P(X_n = y) = the y-th entry of πP^n,   and   P(X_n = y | X_0 = x) = the (x, y)-th entry of P^n.

Proof. Assume that S = {1, 2, ..., r}. Let v_0 = π, v_1 = (P(X_1 = 1), ..., P(X_1 = r)), ..., v_n = (P(X_n = 1), ..., P(X_n = r)). Then v_1 = πP, and similarly v_2 = πP^2, ..., v_n = πP^n. Hence P(X_n = y) is the y-th entry of πP^n. If π = (π(1), ..., π(r)) with π(x) = 1 and π(y) = 0 for y ≠ x, then

    πP^n = (p_n(x, 1), ..., p_n(x, r)),   where p_n(x, y) = P(X_n = y | X_0 = x).

But note that, because of the special form of π, πP^n is simply the x-th row of P^n, and so p_n(x, y) is the (x, y) entry of P^n. ∎
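Both the path probability and the distribution πP from Example 1 can be checked with exact arithmetic; a sketch:

```python
from fractions import Fraction

# Coffee-shop chain from Example 1.
F = Fraction
P = [[F(0),    F(1),    F(0),    F(0)],
     [F(1, 3), F(0),    F(1, 3), F(1, 3)],
     [F(0),    F(1, 2), F(0),    F(1, 2)],
     [F(0),    F(1, 2), F(1, 2), F(0)]]
pi = [F(1, 4)] * 4

def vec_mat(v, M):
    # row vector times matrix
    return [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]

# P(X0=1, X1=2, X2=4, X3=3) = pi(1) p(1,2) p(2,4) p(4,3)
path = pi[0] * P[0][1] * P[1][3] * P[3][2]
assert path == F(1, 24)

dist1 = vec_mat(pi, P)             # distribution of X1 = pi P
assert sum(dist1) == 1
dist2 = vec_mat(dist1, P)          # distribution of X2 = pi P^2
print(dist1, dist2)
```

The rows and columns are 0-indexed, so state i lives at index i − 1.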
Exercises.

1. Suppose there are three white and three black balls in two urns, distributed so that each urn contains three balls. We say the system is in state i, i = 0, 1, 2, 3, if there are i white balls in urn one. At each stage, one ball is drawn at random from each urn and the two are interchanged. Let X_n denote the state of the system after the n-th draw, and compute the transition matrix for the Markov chain {X_n : n ≥ 0}.

2. Suppose that whether or not it rains tomorrow depends on previous weather conditions only through whether or not it is raining today. Assume that the probability it will rain tomorrow given that it rains today is α, and the probability it will rain tomorrow given that it is not raining today is β. If the state space is S = {0, 1}, where state 0 means it rains and state 1 means it does not rain on a given day, find the transition matrix when we model this situation with a Markov chain. If there is a 50-50 chance of rain today, compute the probability it will rain three days from now if α = 7/10 and β = 3/10.

Section 3.1 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 3.1: Conditional probability

We want the probability of an event A given that B has occurred. Consider Venn diagrams: they show that

    P(A|B) = P(A ∩ B)/P(B).

However, this is a definition, and it requires that P(B) > 0.

Example 1. The probability that a person lives to 80 is .64. The probability that a person lives until 90 is .51. What is the probability that a person will live to 90 given that that person is 80?

Solution. Let A be the event of living until 90 and B be the event of living until 80. Then A ⊆ B, so

    P(A|B) = P(A ∩ B)/P(B) = P(A)/P(B) = .51/.64 ≈ .797.

Example 2. In a class, everyone flips a fair coin twice. Suppose that you choose a student and see that that person had at least one heads. What is the probability that the other flip for that person is also a heads?

Solution. Let A be the event of someone having at least one heads and B be the event of having two. We want P(B | A). We see

    P(B|A) = P(B ∩ A)/P(A) = P(B)/P(A) = (1/4)/(3/4) = 1/3.

Why? The new sample space is {HT, TH, HH}, and these outcomes are equally likely, so P(B|A) = 1/3.
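Example 2 can be checked by enumerating the four equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

# Flip a fair coin twice; condition on "at least one head".
outcomes = list(product('HT', repeat=2))            # 4 equally likely points
A = [w for w in outcomes if 'H' in w]               # at least one head
B = [w for w in outcomes if w == ('H', 'H')]        # two heads

# P(B|A) = |B n A| / |A| since the points are equiprobable.
P_B_given_A = Fraction(len([w for w in B if w in A]), len(A))
assert P_B_given_A == Fraction(1, 3)
print(P_B_given_A)
```

The conditional sample space has three points, not two, which is why the answer is 1/3 rather than 1/2.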
Theorem 1. Let S be a sample space with B ⊆ S and P(B) > 0. Then:
(a) P(A|B) ≥ 0 for any event A of S;
(b) P(S|B) = 1;
(c) if A_1, A_2, ... is a sequence of mutually exclusive events, then

    P( ∪_{i=1}^∞ A_i | B ) = Σ_{i=1}^∞ P(A_i | B).

Proof. Prove if time. ∎

Important: we can think of conditional probabilities as their own probability function from 𝒫(B) to R. For E ⊆ B, define Q(E) = P(E | B). By the theorem above, this function satisfies all the axioms of a probability, and so all of the theorems we have proved hold for it.

Example 3. In the Senate there are 53 liberals and 47 conservatives. A committee of 9 senators is composed randomly. We know that the committee does not contain more than 3 conservative senators. What is the probability that there are exactly 7 liberal senators?

Solution. Let A be the event of exactly 7 liberal senators. Let B be the event of 0, 1, 2, or 3 conservative senators. We want P(A|B) = P(A ∩ B)/P(B). Firstly,

    P(B) = Σ_{i=0}^{3} C(53, 9 − i) C(47, i) / C(100, 9) ≈ 0.307519.

Second, A ∩ B = A = {7 liberal and 2 conservative senators}. Thus

    P(A ∩ B) = P(A) = C(53, 7) C(47, 2) / C(100, 9) ≈ 0.087596.

Therefore the probability of interest is

    P(A|B) ≈ 0.087596/0.307519 ≈ 0.28485.

Homework (HARD): pgs. 82-84, #s 2, 3, 8, 9, 18.

Section 3.2: Law of Multiplication

We have

    P(A|B) = P(A ∩ B)/P(B),   P(B|A) = P(A ∩ B)/P(A).

Therefore,

    P(A ∩ B) = P(B) P(A|B) = P(A) P(B|A).

Example 4. Suppose that in a group of 10 people, two are criminals who cannot lie. You're a cop: what are the odds that the first two people you ask admit to being criminals?

Solution. Let A_1 and A_2 be the events of finding a criminal on the first and second question, respectively. Then

    P(A_1 ∩ A_2) = P(A_1) P(A_2 | A_1) = (2/10)(1/9) = 1/45.

Theorem 2 (good for sequential things). If P(A_1 A_2 ··· A_{n−1}) > 0, then

    P(A_1 A_2 ··· A_n) = P(A_1) P(A_2 | A_1) P(A_3 | A_1 A_2) P(A_4 | A_1 A_2 A_3) ··· P(A_n | A_1 A_2 ··· A_{n−1}).

Proof.

    P(A_1) P(A_2 | A_1) P(A_3 | A_1 A_2) ··· P(A_n | A_1 A_2 ··· A_{n−1})
      = P(A_1 A_2) P(A_3 | A_1 A_2) P(A_4 | A_1 A_2 A_3) ··· P(A_n | A_1 ··· A_{n−1})
      = P(A_1 A_2 A_3) P(A_4 | A_1 A_2 A_3) ··· P(A_n | A_1 ··· A_{n−1})
      = ··· = P(A_1 A_2 ··· A_n). ∎

Homework (really is cumulative): pgs. 87-88, #s 1, 6, 10.

Section 2.4 lecture notes, Math 331, Fall 2008. Instructor: David Anderson.

Section 2.4: Combinations

Combinations are unordered subsets of a set.

Definition 1. An unordered arrangement of r objects from a set A with n objects (r ≤ n) is called an r-element combination of A.

Or, I think more clearly:

Definition 2. Suppose A is a set with n elements. A combination of size r of elements of A is a subset of A with r elements. So two combinations differ only if they differ in composition.
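A brute-force check that the r-element subsets of an n-element set number n!/((n − r)! r!):

```python
from itertools import combinations, permutations
from math import comb, factorial

# Each r-element subset gives r! ordered r-element arrangements, so
# (#subsets) * r! = n!/(n-r)!, i.e. #subsets = n!/((n-r)! r!).
n, r = 5, 3
subsets = list(combinations(range(n), r))
arrangements = list(permutations(range(n), r))

assert len(subsets) * factorial(r) == len(arrangements)
assert len(subsets) == factorial(n) // (factorial(n - r) * factorial(r)) == comb(n, r)
print(len(subsets))
```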
Example 1. Toppings on a pizza: it doesn't matter if the pizza has (pepperoni, ham, cheese) or (cheese, pepperoni, ham); both are the same.

How many combinations are there? Let A be a set with n objects, and let x be the number of r-element combinations (ways to choose r elements from A). Suppose that for a given combination of size r we found all r-element permutations of that combination. If we did that for all combinations, we would have all r-element permutations of the set A. We know there are r! permutations of each individual combination. Thus

    x · r! = n!/(n − r)!,   and so   x = n!/((n − r)! r!) = C(n, r),

which is read "n choose r". Note that C(n, r) is the number of subsets of size r that can be constructed from a set of size n, a set which has a total of 2^n subsets.

Example 2. In how many ways can 3 boys and 2 girls be selected from 10 boys and 5 girls? What is the probability that, after selecting 5 people at random, you end up with 3 boys and 2 girls?

Solution. There are C(10, 3) ways of selecting the three boys and C(5, 2) ways of selecting the girls. Thus, by the counting principle, the number of combinations is

    C(10, 3) · C(5, 2) = (10!/(3! 7!)) · (5!/(2! 3!)) = 120 · 10 = 1200.

For the second question, there are C(15, 5) ways of selecting 5 people from the class of 15. Thus the probability that we get 3 boys and 2 girls is

    1200/C(15, 5) = 1200/3003 ≈ .3996.

Theorem 1 (Binomial expansion). For any integer n ≥ 0,

    (x + y)^n = Σ_{i=0}^{n} C(n, i) x^{n−i} y^i.

Proof (idea of proof). Think about how to get terms of the form x^{n−i} y^i (why must we only have terms of this form?). We need to choose n − i x's and i y's. This is equivalent to simply choosing the i y's, for then the leftovers are given to x. There are C(n, i) ways of doing this. That's the proof. ∎

Example 3. What is the coefficient of x^2 y^3 in the expansion of (2x + 3y)^5?

Solution. Let u = 2x and w = 3y. The coefficient of u^2 w^3 in the expansion of (u + w)^5 is C(5, 3), and u^2 w^3 = 2^2 3^3 x^2 y^3. Thus the coefficient is C(5, 3) · 2^2 · 3^3 = 1080.

Note: C(n, i) is the number of subsets of size i of a set with n elements. Therefore, because Σ_{i=0}^{n} C(n, i) = (1 + 1)^n, such a set has 2^n total subsets.
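The binomial expansion (and Example 3's coefficient) can be checked by multiplying polynomials directly; a sketch:

```python
from math import comb

def poly_mult(p, q):
    # p, q: dicts mapping (i, j) -> coefficient of x^i y^j
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return out

n = 5

# (x + y)^5: coefficients should be C(5, i).
expansion = {(0, 0): 1}
for _ in range(n):
    expansion = poly_mult(expansion, {(1, 0): 1, (0, 1): 1})
assert all(expansion[(n - i, i)] == comb(n, i) for i in range(n + 1))

# Example 3: coefficient of x^2 y^3 in (2x + 3y)^5 is C(5,3) * 2^2 * 3^3.
poly = {(0, 0): 1}
for _ in range(n):
    poly = poly_mult(poly, {(1, 0): 2, (0, 1): 3})
assert poly[(2, 3)] == comb(5, 3) * 2**2 * 3**3 == 1080
print(poly[(2, 3)])
```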
Ex. What is the probability of getting a flush dealt to you in a game of poker (allowing straight and royal flushes)?

Solution. There are C(52, 5) possible poker hands. To get a flush you need five cards of the same suit. For each suit there are C(13, 5) such hands, and there are four suits. Therefore the probability of a flush is

    4 · C(13, 5)/C(52, 5) = 5148/2598960 ≈ .002.

Student ex. What is the probability of getting a four of a kind dealt to you?

Solution. Again, there are C(52, 5) possible poker hands. We need to count the number of ways to get dealt four of the same card. For each number or face (of which there are 13), the number of ways to get dealt all four of those cards is C(4, 4) = 1, and the number of ways to be dealt one of the remaining cards is C(48, 1) = 48. Therefore the probability of being dealt a four of a kind is

    13 × C(4, 4) × C(48, 1)/C(52, 5) = 13 · 48/C(52, 5) = 1/4165 ≈ .00024.

Stirling's formula:

Theorem 2. For any n,

    n! ~ √(2πn) (n/e)^n,

where ~ means

    lim_{n→∞} n! / ( √(2πn) (n/e)^n ) = 1.

Homework: pgs. 63-66, #s 1, 2, 7, 13, 19, 30.
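The poker probabilities and Stirling's formula can be checked numerically:

```python
from math import comb, factorial, pi, sqrt, e

# Poker checks: P(flush) and P(four of a kind).
hands = comb(52, 5)
p_flush = 4 * comb(13, 5) / hands
p_quads = 13 * comb(48, 1) / hands
assert abs(p_flush - 0.002) < 1e-4
assert abs(p_quads - 1 / 4165) < 1e-12

# Stirling: n! ~ sqrt(2*pi*n) * (n/e)^n; the ratio approaches 1 from above
# (roughly like 1 + 1/(12n)).
for n in (5, 20, 80):
    ratio = factorial(n) / (sqrt(2 * pi * n) * (n / e) ** n)
    assert 1 < ratio < 1.02
print(p_flush, p_quads)
```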