This 272-page set of class notes was uploaded by Cassidy Grimes on Monday, October 26, 2015. The notes belong to MATH 511 at the University of South Carolina - Columbia, taught by J. Tebbs in Fall. Since its upload, it has received 23 views. For similar materials, see /class/229537/math-511-university-of-south-carolina-columbia in Mathematics (M) at the University of South Carolina - Columbia.

# STAT/MATH 511 PROBABILITY

Fall 2009 Lecture Notes

Joshua M. Tebbs, Department of Statistics, University of South Carolina

## Table of Contents

- 2 Probability
  - 2.1 Introduction
  - 2.2 Sample spaces
  - 2.3 Basic set theory
  - 2.4 Properties of probability
  - 2.5 Discrete probability models and events
  - 2.6 Tools for counting sample points
    - 2.6.1 The multiplication rule
    - 2.6.2 Permutations
    - 2.6.3 Combinations
  - 2.7 Conditional probability
  - 2.8 Independence
  - 2.9 Law of Total Probability and Bayes' Rule
- 3 Discrete Distributions
  - 3.1 Random variables
  - 3.2 Probability distributions for discrete random variables
  - 3.3 Mathematical expectation
  - 3.4 Variance
  - 3.5 Moment generating functions
  - 3.6 Binomial distribution
  - 3.7 Geometric distribution
  - 3.8 Negative binomial distribution
  - 3.9 Hypergeometric distribution
  - 3.10 Poisson distribution
- 4 Continuous Distributions
  - 4.1 Introduction
  - 4.2 Cumulative distribution functions
  - 4.3 Continuous random variables
  - 4.4 Mathematical expectation
    - 4.4.1 Expected value
    - 4.4.2 Variance
    - 4.4.3 Moment generating functions
  - 4.5 Uniform distribution
  - 4.6 Normal distribution
  - 4.7 The gamma family of distributions
    - 4.7.1 Exponential distribution
    - 4.7.2 Gamma distribution
    - 4.7.3 $\chi^2$ distribution
  - 4.8 Beta distribution
  - 4.9 Chebyshev's Inequality
  - 4.10 Expectations of piecewise functions and mixed distributions
    - 4.10.1 Expectations of piecewise functions
    - 4.10.2 Mixed distributions
- 5 Multivariate Distributions
  - 5.1 Introduction
  - 5.2 Discrete random vectors
  - 5.3 Continuous random vectors
  - 5.4 Marginal distributions
  - 5.5 Conditional distributions
  - 5.6 Independent random variables
  - 5.7 Expectations of functions of random variables
  - 5.8 Covariance and correlation
    - 5.8.1 Covariance
    - 5.8.2 Correlation
  - 5.9 Expectations and variances of linear functions of random variables
  - 5.10 The multinomial model
  - 5.11 The
bivariate normal distribution
  - 5.12 Conditional expectation
    - 5.12.1 Conditional means and curves of regression
    - 5.12.2 Iterated means and variances

# 2 Probability

Complementary reading: Chapter 2 (WMS).

## 2.1 Introduction

TERMINOLOGY: The text defines probability as a measure of one's belief in the occurrence of a future random event. Probability is also known as "the mathematics of uncertainty."

REAL-LIFE EVENTS: Here are some events we may wish to assign probabilities to:

- tomorrow's temperature exceeding 80 degrees
- getting a flat tire on my way home today
- a new policy holder making a claim in the next year
- the NASDAQ losing 5 percent of its value this week
- you being diagnosed with prostate/cervical cancer in the next 20 years.

ASSIGNING PROBABILITIES: How do we assign probabilities to events? There are three general approaches.

1. Subjective approach: this approach is based on feeling and may not even be scientific.
2. Relative frequency approach: this approach can be used when some random phenomenon is observed repeatedly under identical conditions.
3. Axiomatic/model-based approach: this is the approach we will take in this course.

[Figure 2.1: The relative frequency of die rolls which result in a "2"; each plot represents 1000 simulated rolls of a fair die.]

Example 2.1 (Relative frequency approach). Suppose that we roll a die 1000 times and record the number of times we observe a "2." Let $A$ denote this event. The relative frequency approach says that
$$P(A) \approx \frac{n_A}{n},$$
where $n_A$ denotes the frequency of the event and $n$ denotes the number of trials performed. The proportion $n_A/n$ is called the relative frequency. The symbol $P(A)$ is shorthand for "the probability that $A$ occurs."

RELATIVE FREQUENCY APPROACH: Continuing with our
example, suppose that $n_A = 158$. We would then estimate $P(A)$ by $158/1000 = 0.158$. If we performed the experiment of rolling a die repeatedly, the relative frequency approach says that
$$\frac{n_A}{n} \to P(A), \quad \text{as } n \to \infty.$$
Of course, if the die is fair, then $P(A) = 1/6$. $\square$

## 2.2 Sample spaces

TERMINOLOGY: Suppose that a random experiment is performed and that we observe an outcome from the experiment (e.g., rolling a die). The set of all possible outcomes for an experiment is called the sample space and is denoted by $S$.

Example 2.2. In each of the following random experiments, we write out a corresponding sample space.

(a) The Michigan state lottery calls for a three-digit integer to be selected: $S = \{000, 001, 002, \ldots, 998, 999\}$.

(b) A USC student is tested for chlamydia (0 = negative, 1 = positive): $S = \{0, 1\}$.

(c) An industrial experiment consists of observing the lifetime of a battery, measured in hours. Different sample spaces are
$$S_1 = \{w : w \geq 0\}, \quad S_2 = \{0, 1, 2, 3, \ldots\}, \quad S_3 = \{\text{defective}, \text{not defective}\}.$$
Sample spaces are not unique; in fact, how we describe the sample space has a direct influence on how we assign probabilities to outcomes in this space. $\square$

## 2.3 Basic set theory

TERMINOLOGY: A countable set $A$ is a set whose elements can be put into a one-to-one correspondence with $\mathbb{N} = \{1, 2, \ldots\}$, the set of natural numbers. A set that is not countable is said to be uncountable.

TERMINOLOGY: Countable sets can be further divided up into two types:

- A countably infinite set has an infinite number of elements.
- A countably finite set has a finite number of elements.

Example 2.3. Say whether the following sets are countable (and, furthermore, finite or infinite) or uncountable.

(a) $A = \{0, 1, 2, \ldots, 10\}$

(b) $B = \{1, 2, 3, \ldots\}$

(c) $C = \{x : 0 < x < 2\}$

TERMINOLOGY: Suppose that $A$ and $B$ are sets (events). We say that $A$ is a subset of $B$ if every outcome in $A$ is also in $B$, written $A \subset B$ or $A \subseteq B$.

- IMPLICATION: In a random experiment, if the event $A$ occurs, then so does $B$. The converse is not necessarily true.

TERMINOLOGY: The null set, denoted by $\emptyset$, is the set that contains no
elements.

TERMINOLOGY: The union of two sets $A$ and $B$ is the set of all elements in either $A$ or $B$ (or both), written $A \cup B$. The intersection of two sets $A$ and $B$ is the set of all elements in both $A$ and $B$, written $A \cap B$. Note that $A \cap B \subseteq A \cup B$.

- REMEMBER: Union means "or"; intersection means "and."

EXTENSION: We extend the notion of unions and intersections to more than two sets. Suppose that $A_1, A_2, \ldots, A_n$ is a finite sequence of sets. The union of $A_1, A_2, \ldots, A_n$ is
$$\bigcup_{j=1}^{n} A_j = A_1 \cup A_2 \cup \cdots \cup A_n,$$
that is, the set of all elements contained in at least one $A_j$. The intersection of $A_1, A_2, \ldots, A_n$ is
$$\bigcap_{j=1}^{n} A_j = A_1 \cap A_2 \cap \cdots \cap A_n,$$
the set of all elements contained in each of the sets $A_j$, $j = 1, 2, \ldots, n$.

EXTENSION: Suppose that $A_1, A_2, \ldots$ is a countable sequence of sets. The union and intersection of this infinite collection of sets are denoted by
$$\bigcup_{j=1}^{\infty} A_j \quad \text{and} \quad \bigcap_{j=1}^{\infty} A_j,$$
respectively. The interpretation is the same as before.

Example 2.4. Define the sequence of sets $A_j = (1 - 1/j, \, 1 + 1/j)$, for $j = 1, 2, \ldots$. Then
$$\bigcup_{j=1}^{\infty} A_j = (0, 2) \quad \text{and} \quad \bigcap_{j=1}^{\infty} A_j = \{1\}. \;\square$$

TERMINOLOGY: Suppose that $A$ is a subset of $S$, the sample space. The complement of a set $A$ is the set of all elements not in $A$ but still in $S$. We denote the complement by $\overline{A}$.

Distributive Laws:

1. $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
2. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

DeMorgan's Laws:

1. $\overline{A \cap B} = \overline{A} \cup \overline{B}$
2. $\overline{A \cup B} = \overline{A} \cap \overline{B}$

TERMINOLOGY: We call two events $A$ and $B$ mutually exclusive, or disjoint, if $A \cap B = \emptyset$; that is, $A$ and $B$ have no common elements.

Example 2.5. Suppose that a fair die is rolled. A sample space for this random experiment is $S = \{1, 2, 3, 4, 5, 6\}$.

(a) If $A = \{1, 2, 3\}$, then $\overline{A} = \{4, 5, 6\}$.

(b) If $A = \{1, 2, 3\}$, $B = \{4, 5\}$, and $C = \{2, 3, 6\}$, then $A \cap B = \emptyset$ and $B \cap C = \emptyset$. Note that $A \cap C = \{2, 3\}$. Note also that $A \cap B \cap C = \emptyset$ and $A \cup B \cup C = S$. $\square$

## 2.4 Properties of probability

KOLMOGOROV AXIOMS OF PROBABILITY: Given a nonempty sample space $S$, the measure $P(A)$ is a set function satisfying three axioms:

1. $P(A) \geq 0$ for every $A \subseteq S$.
2. $P(S) = 1$.
3. If $A_1, A_2, \ldots$ is a countable sequence of pairwise disjoint events (i.e., $A_i \cap A_j = \emptyset$ for $i \neq j$) in $S$, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$

RESULTS: The following results are important properties of the probability set function $P$, and each one follows from the
Kolmogorov axioms just stated. All events below are assumed to be subsets of a nonempty sample space $S$.

1. Complement rule: For any event $A$,
$$P(\overline{A}) = 1 - P(A).$$
Proof. Note that $S = A \cup \overline{A}$. Thus, since $A$ and $\overline{A}$ are disjoint, $P(A \cup \overline{A}) = P(A) + P(\overline{A})$, by Axiom 3. By Axiom 2, $P(S) = 1$. Thus, $1 = P(A \cup \overline{A}) = P(A) + P(\overline{A})$. $\square$

2. $P(\emptyset) = 0$.
Proof. Take $A = \emptyset$ and $\overline{A} = S$. Use the last result and Axiom 2. $\square$

3. Monotonicity property: Suppose that $A$ and $B$ are two events such that $A \subset B$. Then $P(A) \leq P(B)$.
Proof. Write $B = A \cup (\overline{A} \cap B)$. Clearly, $A$ and $\overline{A} \cap B$ are disjoint. Thus, by Axiom 3, $P(B) = P(A) + P(\overline{A} \cap B)$. Because $P(\overline{A} \cap B) \geq 0$, we are done. $\square$

4. For any event $A$, $P(A) \leq 1$.
Proof. Since $A \subset S$, this follows from the monotonicity property and Axiom 2. $\square$

5. Inclusion-exclusion: Suppose that $A$ and $B$ are two events. Then
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$
Proof. Write $A \cup B = A \cup (\overline{A} \cap B)$. Then, since $A$ and $\overline{A} \cap B$ are disjoint, by Axiom 3, $P(A \cup B) = P(A) + P(\overline{A} \cap B)$. Now, write $B = (A \cap B) \cup (\overline{A} \cap B)$. Clearly, $A \cap B$ and $\overline{A} \cap B$ are disjoint. Thus, again by Axiom 3, $P(B) = P(A \cap B) + P(\overline{A} \cap B)$. Combining the last expressions for $P(A \cup B)$ and $P(B)$ gives the result. $\square$

Example 2.6. The probability that train 1 is on time is 0.95, and the probability that train 2 is on time is 0.93. The probability that both are on time is 0.90.

(a) What is the probability that at least one train is on time?
SOLUTION: Denote by $A_i$ the event that train $i$ is on time, for $i = 1, 2$. Then
$$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) = 0.95 + 0.93 - 0.90 = 0.98.$$

(b) What is the probability that neither train is on time?
SOLUTION: By DeMorgan's Law,
$$P(\overline{A_1} \cap \overline{A_2}) = P(\overline{A_1 \cup A_2}) = 1 - P(A_1 \cup A_2) = 1 - 0.98 = 0.02. \;\square$$

EXTENSION: The inclusion-exclusion formula can be extended to any finite sequence of sets $A_1, A_2, \ldots, A_n$. For example, if $n = 3$,
$$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3).$$
In general, the inclusion-exclusion formula can be written for any finite sequence:
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i_1 < i_2} P(A_{i_1} \cap A_{i_2}) + \sum_{i_1 < i_2 < i_3} P(A_{i_1} \cap A_{i_2} \cap A_{i_3}) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n).$$
Of course, if the sets $A_1, A_2, \ldots, A_n$ are pairwise disjoint, then we arrive back at
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i),$$
a result implied by Axiom 3 by taking $A_{n+1} = A_{n+2} = \cdots = \emptyset$.

## 2.5 Discrete probability models and events

TERMINOLOGY: If a
sample space for an experiment contains a finite or countable number of sample points, we call it a discrete sample space.

- Finite number of sample points: $N < \infty$.
- Countable number of sample points: may equal $\infty$, but can be counted (i.e., sample points may be put into a 1-1 correspondence with $\mathbb{N} = \{1, 2, \ldots\}$).

Example 2.7. A standard roulette wheel contains an array of numbered compartments referred to as "pockets." The pockets are either red, black, or green. The numbers 1 through 36 are evenly split between red and black, while 0 and 00 are green pockets. On the next play, we are interested in the following events:
$$A_1 = \{13\}, \quad A_2 = \{\text{red}\}, \quad A_3 = \{0, 00\}.$$

TERMINOLOGY: A simple event is an event that cannot be decomposed; that is, a simple event corresponds to exactly one sample point. Compound events are those events that contain more than one sample point. In Example 2.7, because $A_1$ contains only one sample point, it is a simple event. The events $A_2$ and $A_3$ contain more than one sample point; thus, they are compound events.

STRATEGY: Computing the probability of a compound event can be done by

1. counting up all sample points associated with the event (this can be very easy or very difficult), and
2. adding up the probabilities associated with each sample point.

NOTATION: Your authors use the symbol $E_i$ to denote the $i$th sample point (i.e., the $i$th simple event). Thus, adopting the aforementioned strategy, if $A$ denotes any compound event,
$$P(A) = \sum_{i: E_i \in A} P(E_i).$$
We simply sum up the simple event probabilities for all $i$ such that $E_i \in A$.

Example 2.8 (An equiprobability model). Suppose that a discrete sample space $S$ contains $N < \infty$ sample points, each of which are equally likely. If the event $A$ consists of $n_a$ sample points, then $P(A) = n_a/N$.
Proof. Write $S = E_1 \cup E_2 \cup \cdots \cup E_N$, where $E_i$ corresponds to the $i$th sample point, $i = 1, 2, \ldots, N$. Then
$$1 = P(S) = P(E_1 \cup E_2 \cup \cdots \cup E_N) = \sum_{i=1}^{N} P(E_i).$$
Now, as $P(E_1) = P(E_2) = \cdots = P(E_N)$, we have that
$$1 = \sum_{i=1}^{N} P(E_i) = N P(E_1),$$
and thus $P(E_1) = P(E_2) = \cdots = P(E_N) = 1/N$. Without loss of generality, take $A = E_1 \cup E_2 \cup \cdots \cup E_{n_a}$. Then
$$P(A) = P(E_1 \cup E_2 \cup \cdots \cup E_{n_a}) = \sum_{i=1}^{n_a} P(E_i) = n_a/N. \;\square$$
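The equiprobability model $P(A) = n_a/N$ lends itself to a quick computational check. The sketch below is not from the notes: it enumerates the 36 equally likely outcomes of rolling two fair dice and counts the sample points favorable to an event chosen here purely for illustration, "the faces sum to 7."

```python
# Sanity check of the equiprobability model P(A) = n_a / N by brute-force
# enumeration. The two-dice experiment and the "sum is 7" event are
# illustrative assumptions, not taken from the course notes.
from fractions import Fraction
from itertools import product

# Sample space S: all 36 equally likely (die 1, die 2) outcomes.
S = list(product(range(1, 7), repeat=2))
N = len(S)

# Compound event A: the faces sum to 7.
n_a = sum(1 for outcome in S if sum(outcome) == 7)
prob_A = Fraction(n_a, N)

print(prob_A)  # 1/6
```

Exact rational arithmetic (`Fraction`) avoids floating-point noise, so the computed probability can be compared directly against the hand-derived value $6/36 = 1/6$.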
Example 2.9. Two jurors are needed from a pool of 2 men and 2 women. The jurors are randomly selected from the 4 individuals. A sample space for this experiment is
$$S = \{(M_1, M_2), (M_1, W_1), (M_1, W_2), (M_2, W_1), (M_2, W_2), (W_1, W_2)\}.$$
What is the probability that the two jurors chosen consist of 1 male and 1 female?
SOLUTION: There are $N = 6$ sample points, denoted in order by $E_1, E_2, \ldots, E_6$. Let the event
$$A = \{\text{one male, one female}\} = \{(M_1, W_1), (M_1, W_2), (M_2, W_1), (M_2, W_2)\},$$
so that $n_a = 4$. If the sample points are equally likely (probably true if the jurors are randomly selected), then $P(A) = 4/6$. $\square$

## 2.6 Tools for counting sample points

### 2.6.1 The multiplication rule

MULTIPLICATION RULE: Consider an experiment consisting of $k \geq 2$ stages, where

- $n_1$ = number of ways stage 1 can occur,
- $n_2$ = number of ways stage 2 can occur, ...,
- $n_k$ = number of ways stage $k$ can occur.

Then there are
$$\prod_{i=1}^{k} n_i = n_1 \times n_2 \times \cdots \times n_k$$
different outcomes in the experiment.

Example 2.10. An experiment consists of rolling two dice. Envision stage 1 as rolling the first and stage 2 as rolling the second. Here, $n_1 = 6$ and $n_2 = 6$. By the multiplication rule, there are $n_1 \times n_2 = 6 \times 6 = 36$ different outcomes. $\square$

Example 2.11. In a controlled field experiment, I want to form all possible treatment combinations among the three factors:

- Factor 1: Fertilizer (60 kg, 80 kg, 100 kg; 3 levels)
- Factor 2: Insects (infected/not infected; 2 levels)
- Factor 3: Precipitation level (low, high; 2 levels)

Here, $n_1 = 3$, $n_2 = 2$, and $n_3 = 2$. Thus, by the multiplication rule, there are $n_1 \times n_2 \times n_3 = 12$ different treatment combinations. $\square$

Example 2.12. Suppose that an Iowa license plate consists of seven places; the first three are occupied by letters, the remaining four with numbers. Compute the total number of possible orderings if

(a) there are no letter/number restrictions,
(b) repetition of letters is prohibited,
(c) repetition of numbers is prohibited,
(d) repetitions of numbers and letters are prohibited.

ANSWERS:
(a) $26 \times 26 \times 26 \times 10 \times 10 \times 10 \times 10 = 175{,}760{,}000$
(b) $26 \times 25 \times 24 \times 10 \times 10 \times$
$10 \times 10 = 156{,}000{,}000$
(c) $26 \times 26 \times 26 \times 10 \times 9 \times 8 \times 7 = 88{,}583{,}040$
(d) $26 \times 25 \times 24 \times 10 \times 9 \times 8 \times 7 = 78{,}624{,}000$ $\square$

### 2.6.2 Permutations

TERMINOLOGY: A permutation is an arrangement of distinct objects in a particular order. Order is important.

PROBLEM: Suppose that we have $n$ distinct objects and we want to order (or permute) these objects. Thinking of $n$ slots, we will put one object in each slot. There are

- $n$ different ways to choose the object for slot 1,
- $n - 1$ different ways to choose the object for slot 2,
- $n - 2$ different ways to choose the object for slot 3, and so on, down to
- 2 different ways to choose the object for slot $n - 1$, and
- 1 way to choose for the last slot.

IMPLICATION: By the multiplication rule, there are
$$n(n-1)(n-2) \cdots 2 \cdot 1 = n!$$
different ways to order (permute) the $n$ distinct objects.

Example 2.13. My bookshelf has 10 books on it. How many ways can I permute the 10 books on the shelf? ANSWER: $10! = 3{,}628{,}800$. $\square$

Example 2.14. Now suppose that, in Example 2.13, there are 4 math books, 2 chemistry books, 3 physics books, and 1 statistics book. I want to order the 10 books so that all books of the same subject are together. How many ways can I do this?
SOLUTION: Use the multiplication rule.

- Stage 1: Permute the 4 math books: $4!$
- Stage 2: Permute the 2 chemistry books: $2!$
- Stage 3: Permute the 3 physics books: $3!$
- Stage 4: Permute the 1 statistics book: $1!$
- Stage 5: Permute the 4 subjects (m, c, p, s): $4!$

Thus, there are $4! \times 2! \times 3! \times 1! \times 4! = 6912$ different orderings. $\square$

PERMUTATIONS: With a collection of $n$ distinct objects, we now want to choose and permute $r$ of them ($r \leq n$). The number of ways to do this is
$$P_{n,r} \equiv \frac{n!}{(n-r)!}.$$
The symbol $P_{n,r}$ is read "the permutation of $n$ things taken $r$ at a time."
Proof. Envision $r$ slots. There are $n$ ways to fill the first slot, $n - 1$ ways to fill the second slot, and so on, until we get to the $r$th slot, in which case there are $n - r + 1$ ways to fill it. Thus, by the multiplication rule, there are
$$n(n-1) \cdots (n-r+1) = \frac{n!}{(n-r)!}$$
different
permutations. $\square$

Example 2.15. With a group of 5 people, I want to choose a committee with three members: a president, a vice president, and a secretary. There are
$$P_{5,3} = \frac{5!}{(5-3)!} = \frac{120}{2} = 60$$
different committees possible. Here, note that order is important. $\square$

Example 2.16. What happens if the objects to permute are not distinct? Consider the word PEPPER. How many permutations of the letters are possible?
TRICK: Initially, treat all letters as distinct objects by writing, say, $P_1 E_1 P_2 P_3 E_2 R$. There are $6! = 720$ different orderings of these distinct objects. Now, there are

- $3!$ ways to permute the P's,
- $2!$ ways to permute the E's,
- $1!$ ways to permute the R's.

So, $6!$ is actually $3! \times 2! \times 1!$ times too large. That is, there are
$$\frac{6!}{3! \, 2! \, 1!} = 60$$
possible permutations. $\square$

MULTINOMIAL COEFFICIENTS: Suppose that in a set of $n$ objects, there are $n_1$ that are similar, $n_2$ that are similar, ..., $n_k$ that are similar, where $n_1 + n_2 + \cdots + n_k = n$. The number of permutations (i.e., distinguishable permutations) of the $n$ objects is given by the multinomial coefficient
$$\binom{n}{n_1 \, n_2 \, \cdots \, n_k} = \frac{n!}{n_1! \, n_2! \cdots n_k!}.$$

NOTE: Multinomial coefficients arise in the algebraic expansion of the multinomial expression $(x_1 + x_2 + \cdots + x_k)^n$; i.e.,
$$(x_1 + x_2 + \cdots + x_k)^n = \sum_{D} \binom{n}{n_1 \, n_2 \, \cdots \, n_k} x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k},$$
where
$$D = \left\{ (n_1, n_2, \ldots, n_k) : \sum_{j=1}^{k} n_j = n \right\}.$$

Example 2.17. How many signals, each consisting of 9 flags in a line, can be made from 4 white flags, 2 blue flags, and 3 yellow flags?
ANSWER:
$$\binom{9}{4 \, 2 \, 3} = \frac{9!}{4! \, 2! \, 3!} = 1260. \;\square$$

Example 2.18. In Example 2.17, assuming all permutations are equally likely, what is the probability that all of the white flags are grouped together? We offer two solutions. The solutions differ in the way we construct the sample space. Define
$$A = \{\text{all four white flags are grouped together}\}.$$
SOLUTION 1: Work with a sample space that does not treat the flags as distinct objects, but merely considers color. Then, we know from Example 2.17 that there are 1260 different orderings. Thus, $N$ = number of sample points in $S$ = 1260. Let $n_a$ denote the number of ways that $A$ can occur. We find $n_a$ by using the multiplication rule.
- Stage 1: Pick four adjacent slots: $n_1 = 6$.
- Stage 2: With the remaining 5 slots, permute the 2 blues and 3 yellows: $n_2 = 10$.

Thus, $n_a = 6 \times 10 = 60$. Finally, since we have equally likely outcomes, $P(A) = n_a/N = 60/1260 \approx 0.0476$. $\square$

SOLUTION 2: Initially treat all 9 flags as distinct objects, i.e., $W_1, W_2, W_3, W_4, B_1, B_2, Y_1, Y_2, Y_3$, and consider the sample space consisting of the $9!$ different permutations of these 9 distinct objects. Then, $N$ = number of sample points in $S$ = $9!$. Let $n_a$ denote the number of ways that $A$ can occur. We find $n_a$, again, by using the multiplication rule.

- Stage 1: Pick adjacent slots for $W_1, W_2, W_3, W_4$: $n_1 = 6$.
- Stage 2: With the four chosen slots, permute $W_1, W_2, W_3, W_4$: $n_2 = 4!$.
- Stage 3: With the remaining 5 slots, permute $B_1, B_2, Y_1, Y_2, Y_3$: $n_3 = 5!$.

Thus, $n_a = 6 \times 4! \times 5! = 17{,}280$. Finally, since we have equally likely outcomes, $P(A) = n_a/N = 17{,}280/9! \approx 0.0476$. $\square$

### 2.6.3 Combinations

COMBINATIONS: Given $n$ distinct objects, the number of ways to choose $r$ of them ($r \leq n$), without regard to order, is given by
$$C_{n,r} = \frac{n!}{r! \, (n-r)!}.$$
The symbol $C_{n,r}$ is read "the combination of $n$ things taken $r$ at a time." By convention, we take $0! = 1$.
Proof. Choosing $r$ objects is equivalent to breaking the $n$ objects into two distinguishable groups:

- Group 1: the $r$ chosen;
- Group 2: the $n - r$ not chosen.

There are
$$\binom{n}{r \;\; n-r} = \frac{n!}{r! \, (n-r)!}$$
ways to do this. $\square$

REMARK: We will adopt the notation $\binom{n}{r}$, read "$n$ choose $r$," as the symbol for $C_{n,r}$. The terms $\binom{n}{r}$ are called binomial coefficients, since they arise in the algebraic expansion of a binomial; viz.,
$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}.$$

Example 2.19. Return to Example 2.15. Now, suppose that we only want to choose 3 committee members from 5, without designations for president, vice president, and secretary. Then, there are
$$\binom{5}{3} = \frac{5!}{3! \, (5-3)!} = \frac{5 \times 4 \times 3!}{3! \times 2} = 10$$
different committees. $\square$

NOTE: From Examples 2.15 and 2.19, one should note that $P_{n,r} = r! \times C_{n,r}$. Recall that combinations do not regard order as important. Thus, once we have chosen our $r$ objects (there are $C_{n,r}$ ways to do this), there are then $r!$ ways to permute those $r$ chosen objects. Thus, we can think of a permutation as simply
a combination times the number of ways to permute the $r$ chosen objects.

Example 2.20. A company receives 20 hard drives. Five of the drives will be randomly selected and tested. If all five are satisfactory, the entire lot will be accepted; otherwise, the entire lot is rejected. If there are really 3 defectives in the lot, what is the probability of accepting the lot?
SOLUTION: First, the number of sample points in $S$ is given by
$$N = \binom{20}{5} = \frac{20!}{5! \, (20-5)!} = 15{,}504.$$
Let $A$ denote the event that the lot is accepted. How many ways can $A$ occur? Use the multiplication rule:

- Stage 1: Choose 5 good drives from 17: $\binom{17}{5}$.
- Stage 2: Choose 0 bad drives from 3: $\binom{3}{0}$.

By the multiplication rule, there are $n_a = \binom{17}{5} \times \binom{3}{0} = 6188$ different ways $A$ can occur. Assuming an equiprobability model (i.e., each outcome is equally likely),
$$P(A) = n_a/N = 6188/15{,}504 \approx 0.399. \;\square$$

## 2.7 Conditional probability

MOTIVATION: In some problems, we may be fortunate enough to have prior knowledge about the likelihood of events related to the event of interest. We may want to incorporate this information into a probability calculation.

TERMINOLOGY: Let $A$ and $B$ be events in a nonempty sample space $S$. The conditional probability of $A$, given that $B$ has occurred, is given by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
provided that $P(B) > 0$.

Example 2.21. A couple has two children.

(a) What is the probability that both are girls?

(b) What is the probability that both are girls, if the eldest is a girl?

SOLUTION: (a) The sample space is given by $S = \{(M, M), (M, F), (F, M), (F, F)\}$ and $N = 4$, the number of sample points in $S$. Define

- $A_1$ = {1st born child is a girl}
- $A_2$ = {2nd born child is a girl}

Clearly, $A_1 \cap A_2 = \{(F, F)\}$ and $P(A_1 \cap A_2) = 1/4$, assuming that the four outcomes in $S$ are equally likely.

SOLUTION: (b) Now, we want $P(A_2 \mid A_1)$. Applying the definition of conditional probability, we get
$$P(A_2 \mid A_1) = \frac{P(A_1 \cap A_2)}{P(A_1)} = \frac{1/4}{2/4} = \frac{1}{2}. \;\square$$

Example 2.22. In a certain community, 36 percent of the families own a dog, 22 percent of the families that own a dog also own a cat, and 30 percent of the families own a cat. A family is
selected at random.

(a) Compute the probability that the family owns both a cat and dog.

(b) Compute the probability that the family owns a dog, given that it owns a cat.

SOLUTION: Let $C$ = {family owns a cat} and $D$ = {family owns a dog}. From the problem, we are given that $P(D) = 0.36$, $P(C \mid D) = 0.22$, and $P(C) = 0.30$. In (a), we want $P(C \cap D)$. We have
$$0.22 = P(C \mid D) = \frac{P(C \cap D)}{P(D)}.$$
Thus, $P(C \cap D) = 0.36 \times 0.22 = 0.0792$. For (b), we want $P(D \mid C)$. Simply use the definition of conditional probability:
$$P(D \mid C) = \frac{P(C \cap D)}{P(C)} = \frac{0.0792}{0.30} = 0.264. \;\square$$

RESULTS: It is interesting to note that the conditional probability $P(\cdot \mid B)$ satisfies the axioms for a probability set function, when $P(B) > 0$. In particular,

1. $P(A \mid B) \geq 0$,
2. $P(B \mid B) = 1$,
3. if $A_1, A_2, \ldots$ is a countable sequence of pairwise mutually exclusive events (i.e., $A_i \cap A_j = \emptyset$ for $i \neq j$) in $S$, then
$$P\left(\bigcup_{i=1}^{\infty} A_i \,\Big|\, B\right) = \sum_{i=1}^{\infty} P(A_i \mid B).$$

EXERCISE: Show that the measure $P(\cdot \mid B)$ satisfies the Kolmogorov axioms when $P(B) > 0$; i.e., establish the results above.

MULTIPLICATION LAW OF PROBABILITY: Suppose $A$ and $B$ are events in a nonempty sample space $S$. Then,
$$P(A \cap B) = P(B \mid A) P(A) = P(A \mid B) P(B).$$
Proof. As long as $P(A)$ and $P(B)$ are strictly positive, this follows directly from the definition of conditional probability. $\square$

EXTENSION: The multiplication law of probability can be extended to more than 2 events. For example,
$$P(A_1 \cap A_2 \cap A_3) = P(A_3 \mid A_1 \cap A_2) \times P(A_1 \cap A_2) = P(A_3 \mid A_1 \cap A_2) \times P(A_2 \mid A_1) \times P(A_1).$$
NOTE: This suggests that we can compute probabilities like $P(A_1 \cap A_2 \cap A_3)$ "sequentially," by first computing $P(A_1)$, then $P(A_2 \mid A_1)$, then $P(A_3 \mid A_1 \cap A_2)$. The probability of a $k$-fold intersection can be computed similarly; i.e.,
$$P\left(\bigcap_{i=1}^{k} A_i\right) = P(A_1) \times P(A_2 \mid A_1) \times P(A_3 \mid A_1 \cap A_2) \times \cdots \times P\left(A_k \,\Big|\, \bigcap_{i=1}^{k-1} A_i\right).$$

Example 2.23. I am dealt a hand of 5 cards. What is the probability that they are all spades?
SOLUTION: Define $A_i$ to be the event that card $i$ is a spade, $i = 1, 2, 3, 4, 5$. Then
$$P(A_1) = \frac{13}{52}, \quad P(A_2 \mid A_1) = \frac{12}{51}, \quad P(A_3 \mid A_1 \cap A_2) = \frac{11}{50}, \quad P(A_4 \mid A_1 \cap A_2 \cap A_3) = \frac{10}{49}, \quad P(A_5 \mid A_1 \cap A_2 \cap A_3 \cap A_4) = \frac{9}{48},$$
so that
$$P(A_1 \cap A_2 \cap A_3 \cap A_4 \cap A_5) = \frac{13}{52} \times \frac{12}{51} \times \frac{11}{50} \times \frac{10}{49} \times \frac{9}{48} \approx 0.0005.$$
NOTE: As another way to solve this problem, a student
recently pointed out that we could simply regard the cards as belonging to two groups: spades and non-spades. There are $\binom{13}{5}$ ways to draw 5 spades from 13. There are $\binom{52}{5}$ possible hands. Thus, the probability of drawing 5 spades, assuming that each hand is equally likely, is
$$\binom{13}{5} \Big/ \binom{52}{5} \approx 0.0005. \;\square$$

## 2.8 Independence

TERMINOLOGY: When the occurrence or non-occurrence of $A$ has no effect on whether or not $B$ occurs, and vice versa, we say that the events $A$ and $B$ are independent. Mathematically, we define $A$ and $B$ to be independent iff
$$P(A \cap B) = P(A) P(B).$$
Otherwise, $A$ and $B$ are called dependent events. Note that if $A$ and $B$ are independent,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) P(B)}{P(B)} = P(A)$$
and
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A) P(B)}{P(A)} = P(B).$$

Example 2.24. A red die and a white die are rolled. Let $A$ = {4 on red die} and $B$ = {sum is odd}. Of the 36 outcomes in $S$, 6 are favorable to $A$, 18 are favorable to $B$, and 3 are favorable to $A \cap B$. Assuming the outcomes are equally likely,
$$P(A \cap B) = \frac{3}{36} = \frac{6}{36} \times \frac{18}{36} = P(A) P(B),$$
and the events $A$ and $B$ are independent. $\square$

Example 2.25. In an engineering system, two components are placed in a series; that is, the system is functional as long as both components are. Let $A_i$, $i = 1, 2$, denote the event that component $i$ is functional. Assuming independence, the probability the system is functional is then $P(A_1 \cap A_2) = P(A_1) P(A_2)$. If $P(A_i) = 0.95$, for example, then $P(A_1 \cap A_2) = 0.95 \times 0.95 = 0.9025$. If the events $A_1$ and $A_2$ are not independent, we do not have enough information to compute $P(A_1 \cap A_2)$. $\square$

INDEPENDENCE OF COMPLEMENTS: If $A$ and $B$ are independent events, so are

(a) $\overline{A}$ and $B$, (b) $A$ and $\overline{B}$, and (c) $\overline{A}$ and $\overline{B}$.

Proof. We will only prove (a); the other parts follow similarly.
$$P(\overline{A} \cap B) = P(B) - P(A \cap B) = P(B) - P(A) P(B) = [1 - P(A)] P(B) = P(\overline{A}) P(B). \;\square$$

EXTENSION: The concept of independence (and independence of complements) can be extended to any finite number of events in $S$.

TERMINOLOGY: Let $A_1, A_2, \ldots, A_n$ denote a collection of $n \geq 2$ events in a nonempty sample space $S$. The events $A_1, A_2, \ldots, A_n$ are said to be mutually independent if, for any subcollection of events, say, $A_{i_1}, A_{i_2}, \ldots, A_{i_k}$, $2 \leq k \leq n$, we have
$$P\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} P(A_{i_j}).$$
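The independence definition can be checked by enumeration. The following sketch (mine, not the authors') verifies $P(A \cap B) = P(A)P(B)$ for the red-die/white-die experiment of Example 2.24, with $A$ = {4 on the red die} and $B$ = {sum is odd}.

```python
# Brute-force verification of the independence definition for Example 2.24.
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (red, white), equally likely
N = len(outcomes)

def prob(event):
    """Probability of an event (a predicate on outcomes) under equal likelihood."""
    return Fraction(sum(1 for o in outcomes if event(o)), N)

A = lambda o: o[0] == 4                # 4 appears on the red die
B = lambda o: (o[0] + o[1]) % 2 == 1   # the sum is odd

p_A = prob(A)                          # 6/36
p_B = prob(B)                          # 18/36
p_AB = prob(lambda o: A(o) and B(o))   # 3/36

assert p_AB == p_A * p_B  # A and B are independent
```

The assertion reproduces the counts quoted in the example: 6 outcomes favorable to $A$, 18 to $B$, and 3 to $A \cap B$, with $3/36 = (6/36)(18/36)$.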
CHALLENGE: Come up with a random experiment and three events which are pairwise independent, but not mutually independent.

COMMON SETTING: Many experiments consist of a sequence of $n$ trials that are viewed as independent (e.g., flipping a coin 10 times). If $A_i$ denotes the event associated with the $i$th trial, and the trials are independent, then
$$P\left(\bigcap_{i=1}^{n} A_i\right) = \prod_{i=1}^{n} P(A_i).$$

Example 2.26. An unbiased die is rolled six times. Let $A_i$ = {$i$ appears on roll $i$}, for $i = 1, 2, \ldots, 6$. Then $P(A_i) = 1/6$ and, assuming independence,
$$P(A_1 \cap A_2 \cap A_3 \cap A_4 \cap A_5 \cap A_6) = \prod_{i=1}^{6} P(A_i) = \left(\frac{1}{6}\right)^6.$$
Suppose that, if $A_i$ occurs, we will call it "a match." What is the probability of at least one match in the six rolls?
SOLUTION: Let $B$ denote the event that there is at least one match. Then, $\overline{B}$ denotes the event that there are no matches. Now,
$$P(\overline{B}) = P(\overline{A_1} \cap \overline{A_2} \cap \overline{A_3} \cap \overline{A_4} \cap \overline{A_5} \cap \overline{A_6}) = \prod_{i=1}^{6} P(\overline{A_i}) = \left(\frac{5}{6}\right)^6 \approx 0.335.$$
Thus, $P(B) = 1 - P(\overline{B}) = 1 - 0.335 = 0.665$, by the complement rule.
EXERCISE: Generalize this result to an $n$-sided die. What does this probability converge to as $n \to \infty$? $\square$

## 2.9 Law of Total Probability and Bayes' Rule

SETTING: Suppose $A$ and $B$ are events in a nonempty sample space $S$. We can express the event $A$ as follows:
$$A = (A \cap B) \cup (A \cap \overline{B}), \quad \text{a union of disjoint events}.$$
By the third Kolmogorov axiom,
$$P(A) = P(A \cap B) + P(A \cap \overline{B}) = P(A \mid B) P(B) + P(A \mid \overline{B}) P(\overline{B}),$$
where the last step follows from the multiplication law of probability. This is called the Law of Total Probability (LOTP). The LOTP is helpful: sometimes $P(A \mid B)$, $P(A \mid \overline{B})$, and $P(B)$ may be easily computed with available information, whereas computing $P(A)$ directly may be difficult.

NOTE: The LOTP follows from the fact that $B$ and $\overline{B}$ partition $S$; that is,

(a) $B$ and $\overline{B}$ are disjoint, and
(b) $B \cup \overline{B} = S$.

Example 2.27. An insurance company classifies people as "accident-prone" and "non-accident-prone." For a fixed year, the probability that an accident-prone person has an accident is 0.4, and the probability that a non-accident-prone person has an accident is 0.2. The population is estimated to be 30 percent accident-prone.

(a) What is the probability that a new policy holder will have an accident?
SOLUTION: Define $A$ = {policy holder
has an accident} and $B$ = {policy holder is accident-prone}. Then, $P(B) = 0.3$, $P(A \mid B) = 0.4$, $P(\overline{B}) = 0.7$, and $P(A \mid \overline{B}) = 0.2$. By the LOTP,
$$P(A) = P(A \mid B) P(B) + P(A \mid \overline{B}) P(\overline{B}) = 0.4(0.3) + 0.2(0.7) = 0.26. \;\square$$

(b) Now, suppose that the policy holder does have an accident. What is the probability that he was accident-prone?
SOLUTION: We want $P(B \mid A)$. Note that
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B) P(B)}{P(A)} = \frac{0.4(0.3)}{0.26} \approx 0.46. \;\square$$

NOTE: From this last part, we see that, in general,
$$P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A)} = \frac{P(A \mid B) P(B)}{P(A \mid B) P(B) + P(A \mid \overline{B}) P(\overline{B})}.$$
This is a form of Bayes' Rule.

Example 2.28. A lab test is 95 percent effective at detecting a certain disease when it is present (sensitivity). When the disease is not present, the test is 99 percent effective at declaring the subject negative (specificity). If 8 percent of the population has the disease (prevalence), what is the probability that a subject has the disease, given that (a) his test is positive? (b) his test is negative?
SOLUTION: Let $D$ = {disease is present} and $+$ = {test is positive}. We are given that $P(D) = 0.08$ (prevalence), $P(+ \mid D) = 0.95$ (sensitivity), and $P(- \mid \overline{D}) = 0.99$ (specificity). In part (a), we want to compute $P(D \mid +)$. By Bayes' Rule,
$$P(D \mid +) = \frac{P(+ \mid D) P(D)}{P(+ \mid D) P(D) + P(+ \mid \overline{D}) P(\overline{D})} = \frac{0.95(0.08)}{0.95(0.08) + 0.01(0.92)} \approx 0.892.$$
In part (b), we want $P(D \mid -)$. By Bayes' Rule,
$$P(D \mid -) = \frac{P(- \mid D) P(D)}{P(- \mid D) P(D) + P(- \mid \overline{D}) P(\overline{D})} = \frac{0.05(0.08)}{0.05(0.08) + 0.99(0.92)} \approx 0.004. \;\square$$

Table 2.1: The general Bayesian scheme.

| Measure before test | Result | Updated measure |
|---|---|---|
| $P(D) = 0.08$ | $+$ | $P(D \mid +) \approx 0.892$ |
| $P(D) = 0.08$ | $-$ | $P(D \mid -) \approx 0.004$ |

NOTE: We have discussed the LOTP and Bayes' Rule in the case of the partition $\{B, \overline{B}\}$. However, these rules hold for any partition of $S$.

TERMINOLOGY: A sequence of sets $B_1, B_2, \ldots, B_k$ is said to form a partition of the sample space $S$ if

(a) $B_1 \cup B_2 \cup \cdots \cup B_k = S$ (exhaustive condition), and
(b) $B_i \cap B_j = \emptyset$, for all $i \neq j$ (disjoint condition).

LAW OF TOTAL PROBABILITY (restated): Suppose that $B_1, B_2, \ldots, B_k$ form a partition of $S$, and suppose $P(B_i) > 0$ for all $i = 1, 2, \ldots, k$. Then
$$P(A) = \sum_{i=1}^{k} P(A \mid B_i) P(B_i).$$
Proof. Write
$$A = A \cap S = A \cap (B_1 \cup B_2 \cup \cdots \cup B_k) = \bigcup_{i=1}^{k} (A \cap B_i).$$
Thus,
$$P(A) = P\left(\bigcup_{i=1}^{k} (A \cap B_i)\right) = \sum_{i=1}^{k} P(A \cap B_i) = \sum_{i=1}^{k} P(A \mid B_i) P(B_i). \;\square$$

BAYES' RULE (restated): Suppose that $B_1, B_2, \ldots, B_k$ form a partition of $S$, and suppose
that $P(A) > 0$ and $P(B_i) > 0$ for all $i = 1, 2, \ldots, k$. Then
$$P(B_j \mid A) = \frac{P(A \mid B_j) P(B_j)}{\sum_{i=1}^{k} P(A \mid B_i) P(B_i)}.$$
Proof. Simply apply the definition of conditional probability and the multiplication law of probability to get
$$P(B_j \mid A) = \frac{P(A \mid B_j) P(B_j)}{P(A)}.$$
Then, just apply the LOTP to $P(A)$ in the denominator to get the result. $\square$

REMARK: Bayesians will call $P(B_j)$ the prior probability for the event $B_j$; they call $P(B_j \mid A)$ the posterior probability of $B_j$, given the information in $A$.

Example 2.29. Suppose that a manufacturer buys approximately 60 percent of a raw material (in boxes) from Supplier 1, 30 percent from Supplier 2, and 10 percent from
R = {y : -infinity < y < infinity}. That is, Y : S -> R takes sample points in S and assigns them a real number.

WORKING DEFINITION: In simpler terms, a random variable is a variable whose observed value is determined by chance.

Example 3.1. Suppose that an experiment consists of flipping two fair coins. The sample space is S = {(H,H), (H,T), (T,H), (T,T)}. Let Y denote the number of heads observed. Before we perform the experiment, we do not know, with certainty, the value of Y. We can, however, list out the possible values of Y corresponding to each sample point E:

E       Y(E) = y
(H,H)   2
(T,H)   1
(H,T)   1
(T,T)   0

For each sample point E, Y takes on a numerical value specific to E. This is precisely why we can think of Y as a function; i.e.,

Y[(H,H)] = 2, Y[(H,T)] = 1, Y[(T,H)] = 1, Y[(T,T)] = 0,

so that

P(Y = 2) = P({(H,H)}) = 1/4
P(Y = 1) = P({(H,T)}) + P({(T,H)}) = 1/4 + 1/4 = 1/2
P(Y = 0) = P({(T,T)}) = 1/4.

NOTE: From these probability calculations, note that we can
- work on the sample space S and compute probabilities from S, or
- work on R and compute probabilities for events {Y in B}, where B is a subset of R.

NOTATION: We denote a random variable Y using a capital letter; we denote an observed value of Y by y, a lowercase letter. This is standard notation. For example, if Y denotes the weight (in ounces) of the next newborn boy in Columbia, SC, then Y is a random variable. After the baby is born, we observe that the baby weighs, say, y = 128 oz.

3.2 Probability distributions for discrete random variables

TERMINOLOGY: The support of a random variable Y is the set of all possible values that Y can assume. We will denote the support set by R.

TERMINOLOGY: If the random variable Y has a support set R that is countable (finitely or infinitely), we call Y a discrete random variable.

Example 3.2. An experiment consists of rolling an unbiased die. Consider the two random variables

X = face value on the first roll
Y = number of rolls needed to observe a six.

The support of X is R_X = {1, 2, 3, 4, 5, 6}. The support of Y is R_Y = {1, 2, 3, ...}. R_X is finitely countable and R_Y is infinitely countable; thus, both X and Y are discrete. □

GOAL: For a discrete random variable
Y, we would like to find P(Y = y) for any y in R. Mathematically,

P(Y = y) = sum over {all E in S : Y(E) = y} of P(E), for all y in R.

TERMINOLOGY: Suppose that Y is a discrete random variable. The function p_Y(y) = P(Y = y) is called the probability mass function (pmf) for Y. The pmf p_Y(y) consists of two parts:

(a) R, the support set of Y;
(b) a probability assignment P(Y = y), for all y in R.

PROPERTIES: A pmf p_Y(y) for a discrete random variable Y satisfies the following:

(1) p_Y(y) > 0 for all y in R. NOTE: if y is not in R, then p_Y(y) = 0.
(2) The sum of the probabilities, taken over all support points, must equal one; i.e.,

sum_{y in R} p_Y(y) = 1.

IMPORTANT: Suppose that Y is a discrete random variable. The probability of an event {Y in B} is computed by adding the probabilities p_Y(y) for all y in B; i.e.,

P(Y in B) = sum_{y in B} p_Y(y).

Example 3.3. An experiment consists of rolling two fair dice and observing the face on each. The sample space consists of 6 x 6 = 36 sample points. Let the random variable Y record the sum of the two faces. Note that R = {2, 3, ..., 12}. We now compute the probability associated with each support point y in R:

P(Y = 2) = P(all E in S where y = 2) = P({(1,1)}) = 1/36
P(Y = 3) = P(all E in S where y = 3) = P({(1,2), (2,1)}) = 2/36.

The calculation P(Y = y) is performed similarly for y = 4, 5, ..., 12. The pmf for Y can be given as a formula, a table, or a graph. In tabular form, the pmf of Y is given by

y       2     3     4     5     6     7     8     9     10    11    12
p_Y(y)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

A probability histogram is a display which depicts a pmf in graphical form. [Figure omitted: probability histogram of p_Y(y) versus y, rising to a peak of 6/36 at y = 7.]

A closed-form formula for the pmf exists and is given by

p_Y(y) = (6 - |7 - y|)/36, y = 2, 3, ..., 12;
p_Y(y) = 0, otherwise.

Define the event B = {3, 5, 7, 9, 11}; i.e., the sum Y is odd. We have

P(Y in B) = sum_{y in B} p_Y(y) = P(Y = 3) + P(Y = 5) + P(Y = 7) + P(Y = 9) + P(Y = 11)
          = 2/36 + 4/36 + 6/36 + 4/36 + 2/36 = 1/2. □

Example 3.4. An experiment consists of rolling an unbiased die until the first "6" is observed. Let Y denote the number of rolls needed. The support is R = {1, 2, ...}. Assuming independent trials, we have

P(Y = 1) = 1/6
P(Y = 2) = (5/6)(1/6)
P(Y = 3) = (5/6)(5/6)(1/6).

Recognizing the pattern, we see that the pmf for Y is given by

p_Y(y) = (5/6)^(y-1) (1/6), y = 1, 2, ...;
p_Y(y) = 0, otherwise.

[Figure omitted: probability histogram of this pmf; the bars decrease geometrically from p_Y(1) = 1/6.]

QUESTION: Is this a valid pmf; i.e., do the probabilities p_Y(y) sum to one? Note that

sum_{y in R} p_Y(y) = sum_{y=1}^{infinity} (5/6)^(y-1) (1/6) = (1/6) x 1/(1 - 5/6) = 1.

IMPORTANT: In the last calculation, we have used an important fact concerning infinite geometric series; namely, if a is any real number and |r| < 1, then

sum_{m=0}^{infinity} a r^m = a/(1 - r).

We will use this fact many times in this course.

EXERCISE: Find the probability that the first "6" is observed on (a) an odd-numbered roll, (b) an even-numbered roll. Which event is more likely? □

3.3 Mathematical expectation

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y) and support R. The expected value of Y is given by

E(Y) = sum_{y in R} y p_Y(y).

The expected value of a discrete random variable Y is simply a weighted average of the possible values of Y. Each support point y is weighted by the probability p_Y(y).

ASIDE: When R is a countably infinite set, the sum sum_{y in R} y p_Y(y) may not exist (not surprising, since infinite series do sometimes diverge). Mathematically, we require the sum above to be absolutely convergent; i.e.,

sum_{y in R} |y| p_Y(y) < infinity.

If this is true, we say that E(Y) exists. If this is not true, then we say that E(Y) does not exist.

NOTE: If R is a finite set, then E(Y) always exists, because a finite sum of finite quantities is always finite.

Example 3.5. Let the random variable Y have pmf

p_Y(y) = (5 - y)/10, y = 1, 2, 3, 4;
p_Y(y) = 0, otherwise.

The expected value of Y is given by

E(Y) = sum_{y in R} y p_Y(y) = 1(4/10) + 2(3/10) + 3(2/10) + 4(1/10) = 2. □

INTERPRETATION: The quantity E(Y) has many interpretations:

(a) the "center of gravity" of a probability distribution;
(b) a long-run average;
(c) the first moment of the random variable;
(d) the mean of a population.

FUNCTIONS OF Y: Let Y be a discrete random variable with pmf p_Y(y) and support R. Suppose that g is a real-valued function. Then, g(Y) is a random variable and

E[g(Y)] = sum_{y in R} g(y) p_Y(y).

The proof of this result is given on pp. 93 (WMS). Again, we require that

sum_{y in R} |g(y)| p_Y(y) < infinity.

If this is not true, then E[g(Y)] does not exist.

Example 3.6. In Example 3.5, find E(Y^2) and E(e^Y).

SOLUTION: The functions g1(Y) = Y^2 and g2(Y) = e^Y are real functions of Y. From the definition, we have

E(Y^2) = sum_{y in R} y^2 p_Y(y) = 1^2(4/10) + 2^2(3/10) + 3^2(2/10) + 4^2(1/10) = 5

E(e^Y) = sum_{y in R} e^y p_Y(y) = e^1(4/10) + e^2(3/10) + e^3(2/10) + e^4(1/10) = 12.78 (approximately). □

Example 3.7. The discrete uniform distribution. Suppose that the random variable X has pmf

p_X(x) = 1/m, x = 1, 2, ..., m;
p_X(x) = 0, otherwise,

where m is a positive integer larger than 1. Find the expected value of X.

SOLUTION: The expected value of X is given by

E(X) = sum_{x in R} x p_X(x) = sum_{x=1}^{m} x (1/m) = (1/m) sum_{x=1}^{m} x = (1/m) x m(m+1)/2 = (m+1)/2.

We have used the well-known fact that sum_{x=1}^{m} x = m(m+1)/2; this can be proven by induction. If m = 6, then the discrete uniform distribution serves as a probability model for the outcome of an unbiased die:

x       1    2    3    4    5    6
p_X(x)  1/6  1/6  1/6  1/6  1/6  1/6

The expected value of X is E(X) = (6+1)/2 = 3.5. □

PROPERTIES OF EXPECTATIONS: Let Y be a discrete random variable with pmf p_Y(y) and support R. Suppose that g, g1, g2, ..., gk are real-valued functions, and let c be any real constant. Expectations satisfy the following (linearity) properties:

(a) E(c) = c
(b) E[c g(Y)] = c E[g(Y)]
(c) E[sum_{j=1}^{k} gj(Y)] = sum_{j=1}^{k} E[gj(Y)].

Example 3.8. In a one-hour period, the number of gallons of a certain toxic chemical that is produced at a local plant, say Y, has the following pmf:

y       0    1    2    3
p_Y(y)  0.2  0.3  0.3  0.2

(a) Compute the expected number of gallons produced during a one-hour period.
(b) The cost (in hundreds of dollars) to produce Y gallons is given by the cost function C(Y) = 3 + 12Y + 2Y^2. What is the expected cost in a one-hour period?

SOLUTION: (a) The expected value of Y is

E(Y) = sum_{y in R} y p_Y(y) = 0(0.2) + 1(0.3) + 2(0.3) + 3(0.2) = 1.5.

That is, we would expect 1.5 gallons of the toxic chemical to be produced per hour. For (b), we first compute E(Y^2):

E(Y^2) = sum_{y in R} y^2 p_Y(y) = 0^2(0.2) + 1^2(0.3) + 2^2(0.3) + 3^2(0.2) = 3.3.

Finally,

E[C(Y)] = E(3 + 12Y + 2Y^2) = 3 + 12 E(Y) + 2 E(Y^2) = 3 + 12(1.5) + 2(3.3) = 27.6.

The expected hourly cost is $2,760.00. □

3.4
Variance

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y), support R, and expected value E(Y) = mu. The variance of Y is given by

sigma^2 = V(Y) = E[(Y - mu)^2] = sum_{y in R} (y - mu)^2 p_Y(y).

The standard deviation of Y is given by the positive square root of the variance; i.e.,

sigma = sqrt(sigma^2) = sqrt(V(Y)).

FACTS: The variance sigma^2 satisfies the following:

(a) sigma^2 >= 0.
(b) sigma^2 = 0 if and only if the random variable Y has a degenerate distribution; i.e., all the probability mass is located at one support point.
(c) The larger (smaller) sigma^2 is, the more (less) spread in the possible values of Y about the mean mu.
(d) sigma^2 is measured in (units)^2, and sigma is measured in the original units.

VARIANCE COMPUTING FORMULA: Let Y be a random variable with finite mean E(Y) = mu. Then

V(Y) = E[(Y - mu)^2] = E(Y^2) - [E(Y)]^2.

Proof. Expand the (Y - mu)^2 term and distribute the expectation operator as follows:

E[(Y - mu)^2] = E(Y^2 - 2 mu Y + mu^2) = E(Y^2) - 2 mu E(Y) + mu^2 = E(Y^2) - 2 mu^2 + mu^2 = E(Y^2) - mu^2. □

Example 3.9. The discrete uniform distribution. Suppose that the random variable X has pmf

p_X(x) = 1/m, x = 1, 2, ..., m;
p_X(x) = 0, otherwise,

where m is a positive integer larger than 1. Find the variance of X.

SOLUTION: We find sigma^2 = V(X) using the variance computing formula. In Example 3.7, we computed mu = E(X) = (m+1)/2. We first find E(X^2):

E(X^2) = sum_{x in R} x^2 p_X(x) = sum_{x=1}^{m} x^2 (1/m) = (1/m) sum_{x=1}^{m} x^2 = (1/m) x m(m+1)(2m+1)/6 = (m+1)(2m+1)/6.

We have used the well-known fact that sum_{x=1}^{m} x^2 = m(m+1)(2m+1)/6; this can be proven by induction. The variance of X is equal to

sigma^2 = E(X^2) - [E(X)]^2 = (m+1)(2m+1)/6 - [(m+1)/2]^2 = (m^2 - 1)/12. □

EXERCISE: Find sigma^2 = V(Y) in Examples 3.5 and 3.8 (notes).

IMPORTANT RESULT: Let Y be a random variable (not necessarily a discrete random variable). Suppose that a and b are fixed constants. Then

V(a + bY) = b^2 V(Y).

REMARK: Taking b = 0 above, we see that V(a) = 0, for any constant a. This makes sense intuitively: the variance is a measure of variability for a random variable; a constant, such as a, does not vary. Also, by taking a = 0, we see that V(bY) = b^2 V(Y).

3.5 Moment generating functions

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y) and support R. The moment generating function (mgf) for Y, denoted by m_Y(t), is
given by

m_Y(t) = E(e^{tY}) = sum_{y in R} e^{ty} p_Y(y),

provided E(e^{tY}) < infinity for all t in an open neighborhood about 0; i.e., there exists some h > 0 such that E(e^{tY}) < infinity for all t in (-h, h). If E(e^{tY}) does not exist in an open neighborhood of 0, we say that the moment generating function does not exist.

TERMINOLOGY: We call mu_k' = E(Y^k) the kth moment of the random variable Y:

E(Y)    1st moment (the mean!)
E(Y^2)  2nd moment
E(Y^3)  3rd moment
E(Y^4)  4th moment

REMARK: The moment generating function (mgf) can be used to generate moments. In fact, from the theory of Laplace transforms, it follows that if the mgf exists, it characterizes an infinite set of moments. So, how do we generate moments?

RESULT: Let Y denote a random variable (not necessarily a discrete random variable) with support R and mgf m_Y(t). Then

E(Y^k) = d^k m_Y(t)/dt^k, evaluated at t = 0.

Note that derivatives are taken with respect to t.

Proof. Assume, without loss, that Y is discrete. With k = 1, we have

d m_Y(t)/dt = d/dt [ sum_{y in R} e^{ty} p_Y(y) ] = sum_{y in R} d/dt e^{ty} p_Y(y) = sum_{y in R} y e^{ty} p_Y(y) = E(Y e^{tY}).

Thus,

d m_Y(t)/dt, evaluated at t = 0, equals E(Y e^{0}) = E(Y).

Continuing to take higher-order derivatives, we can prove that

d^k m_Y(t)/dt^k, evaluated at t = 0, equals E(Y^k),

for any integer k >= 1. See pp. 139-140 (WMS) for a slightly different proof. □

ASIDE: In the proof of the last result, we interchanged the derivative and (possibly infinite) sum. This is permitted as long as m_Y(t) = E(e^{tY}) exists.

MEANS AND VARIANCES: Suppose that Y is a random variable (not necessarily a discrete random variable) with mgf m_Y(t). We know that

E(Y) = d m_Y(t)/dt at t = 0, and E(Y^2) = d^2 m_Y(t)/dt^2 at t = 0.

We can get V(Y) using V(Y) = E(Y^2) - [E(Y)]^2.

REMARK: Being able to find means and variances is important in mathematical statistics. Thus, we can use the mgf as a tool to do this. This is helpful because sometimes computing

E(Y) = sum_{y in R} y p_Y(y)

directly (or even higher-order moments) may be extremely difficult, depending on the form of p_Y(y).

Example 3.10. Suppose that Y is a random variable with pmf

p_Y(y) = (1/2)^y, y = 1, 2, 3, ...;
p_Y(y) = 0, otherwise.

Find the mean of Y.

SOLUTION: Using the definition of expected value, the mean of Y is
given by

E(Y) = sum_{y=1}^{infinity} y (1/2)^y.

Finding this infinite sum is not obvious (at least, this sum is not a geometric sum). Another option is to use moment generating functions. The mgf of Y is given by

m_Y(t) = E(e^{tY}) = sum_{y=1}^{infinity} e^{ty} (1/2)^y = sum_{y=1}^{infinity} (e^t/2)^y.

The series sum_{y=1}^{infinity} (e^t/2)^y is an infinite geometric sum with common ratio r = e^t/2. This series converges as long as e^t/2 < 1, in which case

m_Y(t) = (e^t/2)/(1 - e^t/2) = e^t/(2 - e^t),

for e^t/2 < 1, i.e., t < ln 2. Note that (-h, h), with h = ln 2, is an open neighborhood around zero for which m_Y(t) exists. Now,

d m_Y(t)/dt = [e^t(2 - e^t) - e^t(-e^t)]/(2 - e^t)^2 = 2e^t/(2 - e^t)^2,

so that

E(Y) = d m_Y(t)/dt at t = 0 = 2e^0/(2 - e^0)^2 = 2. □

Example 3.11. Let the random variable Y have pmf p_Y(y) given by

p_Y(y) = (3 - y)/6, y = 0, 1, 2;
p_Y(y) = 0, otherwise.

Simple calculations show that E(Y) = 2/3 and V(Y) = 5/9 (verify!). Let's "check" these calculations using the mgf of Y. It is given by

m_Y(t) = E(e^{tY}) = sum_{y in R} e^{ty} p_Y(y) = (3/6)e^{0} + (2/6)e^{t} + (1/6)e^{2t} = 3/6 + (2/6)e^t + (1/6)e^{2t}.

Taking derivatives of m_Y(t) with respect to t, we get

d m_Y(t)/dt = (2/6)e^t + (2/6)e^{2t}
d^2 m_Y(t)/dt^2 = (2/6)e^t + (4/6)e^{2t}.

Thus,

E(Y) = d m_Y(t)/dt at t = 0 = 2/6 + 2/6 = 2/3
E(Y^2) = d^2 m_Y(t)/dt^2 at t = 0 = 2/6 + 4/6 = 1,

so that

V(Y) = E(Y^2) - [E(Y)]^2 = 1 - (2/3)^2 = 5/9.

In this example, it is easier to compute E(Y) and V(Y) directly using the definition. However, it is nice to see that we get the same answer using the mgf approach. □

REMARK: Not only is the mgf a tool for computing moments, but it also helps us to characterize a probability distribution. How? When an mgf exists, it happens to be unique. This means that if two random variables have the same mgf, then they have the same probability distribution. This is called the uniqueness property of mgfs; it is based on the uniqueness of Laplace transforms. For now, however, it suffices to envision the mgf as a "special expectation" that generates moments. This, in turn, helps us to compute means and variances of random variables.

3.6 Binomial distribution

BERNOULLI TRIALS: Many processes can be envisioned as consisting of a sequence of trials, where

(i) each trial results in a "success" or a "failure",
(ii) the trials are
independent and iii the probability of success denoted by 10 0 lt 10 lt 1 is the same on every trial TERMINOLOGY In a sequence of n Bernoulli trials denote by Y the number of suc cesses out of n where n is xed We say that Y has a binomial distribution with number of trials 71 and success probability 10 Shorthand notation is Y N bn10 Example 312 Each of the following situations could be conceptualized as a binomial experiment Are you satis ed with the Bernoulli assumptions in each instance a We ip a fair coin 10 times and let Y denote the number of tails in 10 ips Here Y N 0n 1010 05 b Forty percent of all plots of land respond to a certain treatment I have four plots to be treated If Y is the number of plots that respond to the treatment then Y N 0n 410 04 c In rural Kenya the prevalence rate for HIV is estimated to be around 8 percent Let Y denote the number of HIV infecteds in a sample of 740 individuals Here Y N bn 74010 008 d Parts produced by a certain company do not meet speci cations ie are defective with probability 0001 Let Y denote the number of defective parts in a package of 40 Then Y N bn 4010 0001 D DERIVATION We now derive the pmf of a binomial random variable The support of Y is R y y 0 1 2 We need to nd an expression for 10yy PY y for each value of y E R PAGE 41 CHAPTER 3 STATMATH 5117 J TEBBS QUESTION In a sequence of 71 trials how can we get exactly y successes Denoting success and failure77 by S and F respectively one possible sample point might be SSFSFSFFS FSF Because the trials are independent the probability that we get a particular ordering of n y successes and n 7 y failures is gag1 7 p y Furthermore there are y sample points that contain exactly y successes Thus we add the term gag1 7p y a total of times to get PY The pmf for Y is for 0 lt p lt1 Zpy1 M y 0 1 2 0 pyy otherwise Example 313 In Example 312b assume that Y N bn 410 04 Here are the probability calculations for this binomial model PY 0 py0 30 4017 04 1 gtlt 04 gtlt 0 6 01296 PY 1 
pyl110 4117 044i1 4 gtlt 041 gtlt 0 63 0 3456 PY 2 py2 0 4217 0 4 6 gtlt 04 gtlt 0 62 0 3456 PY 3 py3 0 4314 044i3 4 gtlt 043 gtlt 0 61 01536 PY 4 py4 0 4417 0 444 1 gtlt 044 gtlt 0 60 0 0256 EXERCISE What is the probability that at least 2 plots respond at most one What are EY and VY D Example 314 In a small clinical trial with 20 patients let Y denote the number of patients that respond to a new skin rash treatment The physicians assume that a binomial model is appropriate and that Y N bn 2010 04 Under this model compute a PY 5 b PY 2 5 and c PY lt10 a PY 5 py5 2500450620 5 00746 b PY 2 5 Z PY y Z 04yo62oy PAGE 42 CHAPTER 3 STATMATH 5117 J TEBBS 015 010 005 I ll 1 l l 5 10 15 PM PlYVl 000 1 V Figure 32 Probability histogram for the number of patierits responding to treatmerit This represerits the bn 20p 04 model iri Epample 314 This calculation involves using the binomial pmf 16 times and adding the results TRICK Instead of computing the sum 225 0 04y0620 y directly7 we can write HY2317HY30 by the complement rule We do this because WMS7s Appendix 111 Table 17 pp 839 841 contains binomial probability calculations of the form a n PltY a pyu why for different it and p With it 20 and p 047 we see from Table 1 that PY 4 0051 Thus PY 2 5 1 i 0051 0949 c PY lt10 PY S 9 07557 from Table 1 D PAGE 43 CHAPTER 3 STATMATH 5117 J TEBBS REMARK The function PY S y is called the cumulative distribution function of a random variable Y well talk more about this function in the next chapter RECALL The binomial expansion of a b is given by a b i n anikbk k k0 CURIOSITY ls the binomial pmf a valid pmf Clearly pyy gt 0 for all y To check that the pmf sums to one consider the binomial expansion n n 7 17 p pl Zlt py 7p y y0 y The LHS clearly equals 1 and the RHS is the bnp pmf Thus pyy is valid D BINOMIAL MGF Suppose that Y N bnp The mgf of Y is given by n n 7 n n n7 n mylttgt me Z W 7p y Zlt ltpetgtylt1ipgt y q we 210 y 210 y where q 1 7 p The last step follows from noting that 230 
(pe^t)^y (1 - p)^{n-y} is the binomial expansion of (q + pe^t)^n. □

MEAN AND VARIANCE: We want to compute E(Y) and V(Y), where Y ~ b(n, p). We will use the mgf. Taking the derivative of m_Y(t) with respect to t, we get

d m_Y(t)/dt = d/dt (q + pe^t)^n = n(q + pe^t)^{n-1} pe^t.

Thus,

E(Y) = d m_Y(t)/dt at t = 0 = n(q + pe^0)^{n-1} pe^0 = np,

since q + p = 1. Now we need to find the second moment. By using the product rule for derivatives, we have

d^2 m_Y(t)/dt^2 = d/dt [n(q + pe^t)^{n-1} pe^t] = n(n-1)(q + pe^t)^{n-2}(pe^t)^2 + n(q + pe^t)^{n-1} pe^t.

Thus,

E(Y^2) = d^2 m_Y(t)/dt^2 at t = 0 = n(n-1)(q + pe^0)^{n-2}(pe^0)^2 + n(q + pe^0)^{n-1} pe^0 = n(n-1)p^2 + np.

Appealing to the variance computing formula, we have

V(Y) = E(Y^2) - [E(Y)]^2 = n(n-1)p^2 + np - (np)^2 = np - np^2 = np(1 - p).

NOTE: WMS derive the binomial mean and variance using a different approach (not using the mgf). See pp. 107-108. □

Example 3.15. Artichokes are a marine-climate vegetable and thrive in the cooler coastal climates. Most will grow in a wide range of soils, but produce best on a deep, fertile, well-drained soil. Suppose that 15 artichoke seeds are planted in identical soils and temperatures, and let Y denote the number of seeds that germinate. If 60 percent of all seeds germinate (on average), and we assume a b(15, 0.6) probability model for Y, the mean number of seeds that will germinate is

E(Y) = mu = np = 15(0.6) = 9.

The variance of Y is

V(Y) = sigma^2 = np(1 - p) = 15(0.6)(0.4) = 3.6 (seeds)^2.

The standard deviation of Y is sigma = sqrt(3.6) = 1.9 seeds (approximately). □

BERNOULLI DISTRIBUTION: In the b(n, p) family, when n = 1, the binomial pmf reduces to

p_Y(y) = p^y (1 - p)^{1-y}, y = 0, 1;
p_Y(y) = 0, otherwise.

This is called the Bernoulli distribution. Shorthand notation is Y ~ b(1, p) or Y ~ Bern(p).

3.7 Geometric distribution

TERMINOLOGY: Envision an experiment where Bernoulli trials are observed. If Y denotes the trial on which the first success occurs, then Y is said to follow a geometric distribution with parameter p, where p is the probability of success on any one trial.

GEOMETRIC PMF: The pmf for Y ~ geom(p) is given by

p_Y(y) = (1 - p)^{y-1} p, y = 1, 2, 3, ...;
p_Y(y) = 0, otherwise.

RATIONALE: The form of this pmf makes intuitive sense: we first need y - 1 failures, each of
which occurs with probability 1 7 p7 and then a success on the yth trial this occurs with probability p By independence7 we multiply 171 X 171 X X 17pXp17py 1p y71 failures NOTE Clearly pyy gt 0 for all y Does pyy sum to one Note that 217 W419 719230712 1 y1 z0 In the last step7 we realized that 2201 7pw is an in nite geometric sum with common ratio 1 7 p D Example 316 Biology students are checking the eye color of fruit ies For each y7 the probability of observing white eyes is p 025 What is the probability the rst white eyed y will be observed among the rst ve ies that are checked SOLUTION Let Y denote the number of ies needed to observe the rst white eyed y We can envision each y as a Bernoulli trial each y either has white eyes or not If we assume that the ies are independent7 then a geometric model is appropriate ie7 Y N geomp 025 We want to compute PY S 5 We use the pmf to compute 025 019 Adding these probabilities7 we get PY 5 m 077 The pmf for the geomp 025 model is depicted in Figure 33 D PAGE 46 CHAPTER 3 STATMATH 5117 J TEBBS 025 020 0107 0057 l l I I I I I 5 10 000 l 39 l 0 15 20 pMPlYyl 0 a l V Figure 33 Probability histogram for the number of ies rieeded to rid the rst white eyed fly This represerits the geomp 025 model iri Example 316 GEOMETRIC MGF Suppose that Y N geomp The mgf of Y is given by 196 717qe 7 My t where q 1 7p for t lt 7lnq Proof Exercise D MEAN AND VARIANCE Differentiating the rhgf7 we get d 7 d pet 7 MO 7 get 7p6 7q6 gnut i E 17 get i 17 Jet2 39 Thus7 7 d 7 196017 160 7p607q60 7 p17 Q 7197Q 1 my 7 3mm M 7 17 160 7 17 12 7 5 Similar but lengthier calculations show EY2 771140 PAGE 47 CHAPTER 3 STATMATH 5117 J TEBBS Finally7 2 2 1 q 1 2 1 VY EY 7 7 7 D NOTE WMS derive the geometric mean and variance using a different approach not using the mgf See pp 116 117 D Example 317 At an orchard in Maine7 20 lb bags of apples are weighed Suppose that four percent of the bags are underweight and that each bag weighed is independent If Y denotes 
the the number of bags observed to nd the rst underweight bag7 then Y N geomp 004 The mean of Y is 1 1 EY E 25 bags The variance of Y is i q i 096 i 2 VY 7 pz 7 00402 7 600 bags D 38 Negative binomial distribution NOTE The negative binomial distribution can be motivated from two perspectives o as a generalization of the geometric o as an inverse77 version of the binomial TERMINOLOGY Imagine an experiment where Bernoulli trials are observed If Y denotes the trial on which the rth success occurs7 r 2 17 then Y has a negative binomial distribution with waiting parameter r and probability of success p NEGATIVE BINOMIAL PMF The pmf for Y N nibrp is given by iilpT17pyT y 7 m 17 2 0 pyy otherwise 7 Of course7 when r 17 the nibrp pmf reduces to the geomp pmf PAGE 48 CHAPTER 3 STATMATH 5117 J TEBBS RATIONALE The form of pyy can be explained intuitively If the rth success occurs on the yth trial7 then r 71 successes must have occurred during the 1st y 71 trials The total number of sample points in the underlying sarnple space S where this occurs is given by the binomial coef cient 11117 which counts the number of ways you can choose the locations of r 7 1 successes in a string of the 1st y 7 1 trials The probability of any particular such ordering7 by independence7 is given by p7 11 7 my Thus7 the probability of getting exactly 7 7 1 successes in the y 7 1 trials is 31p7 11 7 my On the yth trial7 we observe the rth success this occurs with probability p Because the yth trial is independent of the previous y 7 1 trials7 we have 130 y y 71 10719 Xp y 71p 17py r 7 1 r 7 1 pertains to lst y71 trials Example 318 A botanist is observing oak trees for the presence of a certain disease P rorn past experience7 it is known that 30 percent of all trees are infected p 030 Treating each tree as a Bernoulli trial ie7 each tree is infectednot7 what is the proba bility that she will observe the 3rd infected tree 7quot 3 on the 6th or 7th observed tree SOLUTION Let Y denote the tree on which 
she observes the 3rd infected tree Then7 Y N nibr 310 03 We want to compute PY 6 or Y 7 The nib37 03 prnf gives py6 PY 6 0331 i 03 3 00926 py7 PY 7 10331 i 037 3 00972 Thus7 PY 6 or Y 7 PY 6 PY 7 00926 00972 01898 D RELATIONSHIP WITH THE BINOMIAL Recall that in a binomial experirnent7 we x the number of Bernoulli trials7 n and we observe the number of successes In a negative binornial experirnent7 we x the number of successes we are to observe7 r and we continue to observe Bernoulli trials until we reach that nurnbered success In this sense7 the negative binomial distribution is the inverse of the binomial distribution PAGE 49 CHAPTER 3 STATMATH 5117 J TEBBS RECALL Suppose that the real function f is in nitely differentiable at z a The Taylor series expansion of f about the point z a is given by ma we 2f yew n fltagtifflltz7agt1lltz7agt2 When a 07 this is called the McLaur39in series expansion of NEGATIVE BINOMIAL MGF Suppose that Y N nibrp The mgf of Y is given by t 7 p6 mYt lt17 get 7 where q 1 7 p7 for all t lt 7 ln q Before we prove this7 let7s state and prove a lemma LEMMA Suppose that r is a nonnegative integer Then7 7 t 1177 1 7 t 7 2T71ltqegt lt w y39r Proof of lemma Consider the function fw 1 7 w 7 where r is a nonnegative integer It is easy to show that W M 7 WW f w rltr 1gtlt17 WW ln general7 fzw rr 1 7quot z 7 11 7 w TZ7 where fzw denotes the 2th derivative of f with respect to w Note that fzw rr1rz71 w0 Now7 consider writing the McLaurin Series expansion of fw ie7 a Taylor Series ex pansion of fw about w 0 this expansion is given by 001 wz 007quotquot r 27 00 z r7 Msz 0 Z1 1w20 1 z z r 7 1 20 20 Letting w get and z y 7 r the lemma is proven for 0 lt q lt 1 D PAGE 50 CHAPTER 3 STATMATH 5117 J TEBBS Now that we are nished with the lemma7 let7s nd the mgf of Y N nibrp With q17pwe have 117quot i 1 Z 6tyir6tr prqyir 217 7 7 1 W Z i Duet ltpetgtrlt1 7 get a 117 REMARK Showing that the nib7 7 p pmf sums to one can be done by using a similar series expansion 
as above We omit it for brevity MEAN AND VARIANCE For Y N nib7quotp7 with q 1 7p EY and VY7g 39 Hypergeornetric distribution SETTING Consider a collection of N objects eg7 people7 poker chips7 plots of land7 etc and suppose that we have two dichotomous classes7 Class 1 and Class 2 For example7 the objects and classes might be Poker chips redblue People infected not infected Plots of land respond to treatmentnot From the collection of N objects7 we sample n ofthem without replacement7 and record Y7 the number of objects in Class 1 REMARK This sounds like a binomial setup However7 the difference here is that N7 the population size7 is nite the population size7 theoretically7 is assumed to be in nite in the binomial model Thus7 if we sample from a population of objects Without replace merit7 the success probability changes from trial to trial This7 Violates the binomial PAGE 51 CHAPTER 3 STATMATH 5117 J TEBBS model assumptions If N is large ie in a very large population the hypergeometric and binomial models will be similar because the change in the probability of success from trial to trial will be small maybe so small that it is not of practical concern HYPERGEOMETRIC DISTRIBUTION Envision a collection of 71 objects sampled at random and without replacement from a population of size N where 7 denotes the size of Class 1 and N 7 7 denotes the size of Class 2 Let Y denote the number of objects in the sample that belong to Class 1 Then Y has a hypergeometric distribution written Y N hyperN n r where N total number of objects 7 number of the 1st class eg success N 7 r number of the 2nd class eg failure 71 number of objects sampled HYPERGEOMETRIC PMF The pmf for Y N hyperN n r is given by T N7T 0 y E R pyy n 0 otherwise where the support set R y E N max0n 7 N r S y S minnr BREAKDOWN In the hyperNnr pmf we have three parts number of ways to choose y Class 1 objects from r number of ways to choose 71 7 y Class 2 objects from N 7 r number of sample points REMARK The 
hypergeometric pmf pyy does sum to 1 over the support R but we omit this proof for brevity see Exercise 3216 pp 156 WMS Example 319 In my sh tank at home there are 50 sh Ten have been tagged lfl catch 7 sh and random and without replacement what is the probability that exactly two are tagged SOLUTION Here N 50 total number of sh n 7 sample size 7 10 tagged PAGE 52 CHAPTER 3 STATMATH 5117 J TEBBS sh Class 17 N 7 r 40 untagged sh Class 27 and y 2 number of tagged sh caught Thus7 10 40 PY 2 py2 02964 7 What about the probability that my catch contains at most two tagged sh SOLUTION Here7 we want PY2 PY0PY1PY2 100 470 110 460 120 450 50 50 50 7 7 7 01867 03843 02964 08674 D Example 320 A supplier ships parts to a company in lots of 25 parts The company has an acceptance sampling plan which adopts the following acceptance rule sample 5 parts at random and without replacement If there are no de fectives in the sample7 accept the entire lot otherwise7 reject the entire lot77 Let Y denote the number of defectives in the sample Then7 Y N hyper257 57quot7 where 7 denotes the number defectives in the lot in real life7 7 would be unknown De ne 6 25 001 PY 0 v 5 where p r25 denotes the true proportion of defectives in the lot The symbol 001 denotes the probability of accepting the lot which is a function of p Consider the following table7 whose entries are computed using the above probability expression 7 p 0009 0 0 100 1 004 080 2 008 063 3 012 050 4 016 038 5 020 029 10 040 006 15 060 001 PAGE 53 CHAPTER 3 STATMATH 5117 J TEBBS REMARK The graph of 001 versus p is called an operating characteristic curve For sensible sampling plans 001 is a decreasing function of p Acceptance sampling is an important part of statistical process control which is used in engineering and manufacturing settings D MEAN AND VARIANCE If Y N hyperNnr then EY 74 Ni ii RELATIONSHIP WITH THE BINOMIAL The binomial and hypergeometric models S 5 H are similar The key difference is that in a binomial 
experiment p does not change from trial to trial but it does in the hypergeometric setting However it can be shown that for y xed N 2 nil n n y10y1 10 y7 z 170 pmf as rN a p The upshot is this if N is large ie the population size is large a binomial probability calculation with p rN closely approximates the corresponding hypergeometric probability calculation Example 321 In a small town there are 900 right handed individuals and 100 left handed individuals We take a sample of size n 20 individuals from this town at random and without replacement What is the probability that 4 or more people in the sample are left handed SOLUTION Let X denote the number of left handed individuals in our sample We compute the probability PX 2 4 using both the binomial and hypergeometric models 0 Hypergeometric Here N 1000 r 100 N 7 r 900 and n 20 Thus 3 100 900 PX2417PX 3170130947 10 20 PAGE 54 CHAPTER 3 STATMATH 5117 J TEBBS o Binomial Here7 n 20 and p rN 010 Thus7 3 20 PX 2 4 17 PX 3 1 i Z 01w0920w m 0132953 D x m0 REMARK Of course7 since the binomial and hypergeometric models are similar when N is large7 their means and variances are similar too Note the similarities recall that the quantity rN a p7 as N a 00 and 310 Poisson distribution TERMINOLOGY Let the number of occurrences in a given continuous interval of time or space be counted A Poisson process enjoys the following properties 1 the number of occurrences in non overlapping intervals are independent random variables 2 The probability of an occurrence in a suf ciently short interval is proportional to the length of the interval 3 The probability of 2 or more occurrences in a suf ciently short interval is zero GOAL Suppose that a process satis es the above three conditions7 and let Y denote the number of occurrences in an interval of length one Our goal is to nd an expression for pyy PY y7 the pmf of Y APPROACH Envision partitioning the unit interval 01 into n subintervals7 each of size 171 Now7 if n is suf ciently large ie7 
much larger than y), then we can approximate the probability that y events occur in this unit interval by finding the probability that exactly one event occurs in each of exactly y of the subintervals.

- By Property 2, we know that the probability of one event in any one subinterval is proportional to the subinterval's length, say lambda/n, where lambda is the proportionality constant.
- By Property 3, the probability of more than one occurrence in any subinterval is zero (for n large).
- Consider the occurrence/non-occurrence of an event in each subinterval as a Bernoulli trial. Then, by Property 1, we have a sequence of n Bernoulli trials, each with probability of "success" p = lambda/n. Thus, a binomial calculation gives the approximation

P(Y = y) = (approx.) C(n, y) (lambda/n)^y (1 - lambda/n)^{n-y}.

To improve the approximation for P(Y = y), we let n get large without bound. Then,

lim_{n->infinity} P(Y = y) = lim_{n->infinity} [n!/(y!(n-y)!)] (lambda/n)^y (1 - lambda/n)^{n-y}
= lim_{n->infinity} [n(n-1)...(n-y+1)/n^y] x [lambda^y/y!] x [(1 - lambda/n)^n] x [(1 - lambda/n)^{-y}]
= lim_{n->infinity} a_n b_n c_n d_n.

Now, the limit of the product is the product of the limits:

lim a_n = lim n(n-1)...(n-y+1)/n^y = 1
lim b_n = lambda^y/y!
lim c_n = lim (1 - lambda/n)^n = e^{-lambda}
lim d_n = lim (1 - lambda/n)^{-y} = 1.

We have shown that

lim_{n->infinity} P(Y = y) = lambda^y e^{-lambda}/y!.

POISSON PMF: A discrete random variable Y is said to follow a Poisson distribution with rate lambda if the pmf of Y is given by

p_Y(y) = lambda^y e^{-lambda}/y!, y = 0, 1, 2, ...;
p_Y(y) = 0, otherwise.

We write Y ~ Poisson(lambda).

NOTE: Clearly, p_Y(y) > 0 for all y in R. That p_Y(y) sums to one is easily seen, as

sum_{y in R} p_Y(y) = sum_{y=0}^{infinity} lambda^y e^{-lambda}/y! = e^{-lambda} sum_{y=0}^{infinity} lambda^y/y! = e^{-lambda} e^{lambda} = 1,

since sum_{y=0}^{infinity} lambda^y/y! is the McLaurin series expansion of e^{lambda}. □

EXAMPLES: Discrete random variables that might be modeled using a Poisson distribution include

1. the number of customers entering a post office in a given day
2. the number of alpha-particles discharged from a radioactive substance in one second
3. the number of machine breakdowns per month
4. the number of blemishes on a piece of artificial turf
5. the number of
chocolate chips in a Chips Ahoy! cookie.

Example 3.22. The number of cars Y abandoned weekly on a highway is modeled using a Poisson distribution with λ = 2.2. In a given week, what is the probability that
(a) no cars are abandoned?
(b) exactly one car is abandoned?
(c) at most one car is abandoned?
(d) at least one car is abandoned?

SOLUTIONS: We have Y ∼ Poisson(2.2).
(a) P(Y = 0) = p_Y(0) = (2.2)^0 e^{-2.2}/0! = e^{-2.2} ≈ 0.1108.
(b) P(Y = 1) = p_Y(1) = (2.2)^1 e^{-2.2}/1! = 2.2e^{-2.2} ≈ 0.2438.
(c) P(Y ≤ 1) = P(Y = 0) + P(Y = 1) = p_Y(0) + p_Y(1) ≈ 0.1108 + 0.2438 = 0.3546.
(d) P(Y ≥ 1) = 1 - P(Y = 0) = 1 - p_Y(0) ≈ 1 - 0.1108 = 0.8892. □

Figure 3.4: Probability histogram for the number of abandoned cars. This represents the Poisson(2.2) model in Example 3.22.

REMARK: WMS's Appendix III, Table 3 (pp. 843-847) includes an impressive table for Poisson probabilities of the form

F_Y(a) = P(Y ≤ a) = Σ_{y=0}^{a} λ^y e^{-λ}/y!.

Recall that this function is called the cumulative distribution function of Y. This makes computing compound event probabilities much easier.

POISSON MGF: Suppose that Y ∼ Poisson(λ). The mgf of Y, for all t, is given by

m_Y(t) = E(e^{tY}) = Σ_{y=0}^{∞} e^{ty} λ^y e^{-λ}/y! = e^{-λ} Σ_{y=0}^{∞} (λe^t)^y/y! = e^{-λ} e^{λe^t} = exp[λ(e^t - 1)]. □

MEAN AND VARIANCE: With the mgf, we can derive the mean and variance. Differentiating the mgf, we get

m'_Y(t) = λe^t exp[λ(e^t - 1)].

Thus, E(Y) = m'_Y(0) = λe^0 exp[λ(e^0 - 1)] = λ. Now, we need to find the second moment. Using the product rule, we have

m''_Y(t) = λe^t exp[λ(e^t - 1)] + (λe^t)^2 exp[λ(e^t - 1)],

so that E(Y²) = m''_Y(0) = λ + λ². The variance computing formula gives

V(Y) = E(Y²) - [E(Y)]² = (λ + λ²) - λ² = λ. □

REVELATION: The mean and variance of a Poisson random variable are always equal.

Example 3.23. Suppose that Y denotes the number of defects observed in one month at an automotive plant. From past experience, engineers believe that a Poisson model is appropriate, and that E(Y) = λ = 7 defects/month.

QUESTION 1: What is the probability that, in a given month, we observe 11 or more defects?
SOLUTION: We want to compute P(Y ≥ 11) = 1 - P(Y ≤ 10) = 1 - 0.901 = 0.099 (Table 3).
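The Poisson calculations above (the pmf values of Example 3.22 and the identity E(Y) = V(Y) = λ) can be verified directly from the pmf; a small sketch, not part of the original notes:

```python
from math import exp, factorial

def pois_pmf(y, lam):
    # Poisson(lam) pmf: p(y) = lam^y e^(-lam) / y!
    return lam**y * exp(-lam) / factorial(y)

lam = 2.2                                  # Example 3.22
p0, p1 = pois_pmf(0, lam), pois_pmf(1, lam)
print(round(p0, 4), round(p1, 4), round(p0 + p1, 4), round(1 - p0, 4))

# mean and variance by (truncated) summation: both equal lambda
mean = sum(y * pois_pmf(y, lam) for y in range(100))
var = sum((y - mean)**2 * pois_pmf(y, lam) for y in range(100))
```

Truncating the sums at y = 100 is harmless here since the Poisson(2.2) tail beyond that point is negligible.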
QUESTION 2: What is the probability that, in a given year, we have two or more months with 11 or more defects?
SOLUTION: First, we assume that the 12 months are independent (is this reasonable?), and call the event A = {11 or more defects in a month} a "success." Thus, under our independence assumptions, and viewing each month as a trial, we have a sequence of 12 Bernoulli trials with success probability p = P(A) = 0.099. Let X denote the number of months where we observe 11 or more defects. Then, X ∼ b(12, 0.099) and

P(X ≥ 2) = 1 - P(X = 0) - P(X = 1)
= 1 - C(12, 0)(0.099)^0(0.901)^12 - C(12, 1)(0.099)^1(0.901)^11
≈ 1 - 0.2862 - 0.3774 = 0.3364. □

POISSON PROCESSES OF ARBITRARY LENGTH: If events (or occurrences) in a Poisson process occur at a rate of λ per unit time (or space), then the number of occurrences in an interval of length t follows a Poisson distribution with mean λt.

Example 3.24. Phone calls arrive at a call center according to a Poisson process, at a rate of λ = 3 per minute. If Y represents the number of calls received in 5 minutes, then Y ∼ Poisson(15). The probability that 8 or fewer calls come in during a 5-minute span is

P(Y ≤ 8) = Σ_{y=0}^{8} 15^y e^{-15}/y! ≈ 0.037,

using Table 3. □

POISSON-BINOMIAL LINK: We have seen that the hypergeometric and binomial models are related; as it turns out, so are the Poisson and binomial models. This should not be surprising, because we derived the Poisson pmf by appealing to a binomial approximation.

RELATIONSHIP: Suppose that Y ∼ b(n, p). If n is large and p is small, then

C(n, y) p^y (1 - p)^(n-y) ≈ λ^y e^{-λ}/y!, for y ∈ R = {0, 1, 2, ..., n},

where λ = np.
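To see this relationship numerically, compare a few pmf values for a b(100, 0.03) distribution against its Poisson(3) approximation (a quick sketch; the particular n and p here are illustrative choices, not from the notes):

```python
from math import comb, exp, factorial

n, p = 100, 0.03        # n large, p small
lam = n * p             # lambda = np = 3

# pmf values for y = 0, 1, ..., 4 under each model
binom = [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(5)]
pois = [lam**y * exp(-lam) / factorial(y) for y in range(5)]
for y in range(5):
    print(y, round(binom[y], 4), round(pois[y], 4))
```

The two columns agree to within a few thousandths, and the agreement improves as n grows with np held fixed.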
Example 3.25. Hepatitis C (HCV) is a viral infection that causes cirrhosis and cancer of the liver. Since HCV is transmitted through contact with infectious blood, screening donors is important to prevent further transmission. The World Health Organization has projected that HCV will be a major burden on the US health care system before the year 2020. For public health reasons, researchers take a sample of n = 1875 blood donors and screen each individual for HCV. If 3 percent of the entire population is infected, what is the probability that 50 or more are HCV-positive?
SOLUTION: Let Y denote the number of HCV-infected individuals in our sample. We compute the probability P(Y ≥ 50) using both the binomial and Poisson models.
- Binomial: Here, n = 1875 and p = 0.03. Thus,

P(Y ≥ 50) = Σ_{y=50}^{1875} C(1875, y)(0.03)^y(0.97)^(1875-y) ≈ 0.818783.

- Poisson: Here, λ = np = 1875(0.03) = 56.25. Thus,

P(Y ≥ 50) = Σ_{y=50}^{∞} (56.25)^y e^{-56.25}/y! ≈ 0.814932.

As we can see, the Poisson approximation is quite good. □

RELATIONSHIP: One can see that the hypergeometric, binomial, and Poisson models are related in the following way:

hyper(N, n, r) ⟷ b(n, p) ⟷ Poisson(λ).

The first link results when N is large and r/N → p. The second link results when n is large and p is small, with np → λ. When these situations are combined, as you might suspect, one can approximate the hypergeometric model with a Poisson model!

4 Continuous Distributions

Complementary reading from WMS: Chapter 4.

4.1 Introduction

RECALL: In Chapter 3, we focused on discrete random variables. A discrete random variable Y can assume a finite (or, at most, a countable) number of values. We also learned about probability mass functions (pmfs). These functions tell us what probabilities to assign to each of the support points in R, a countable set.

PREVIEW: Continuous random variables have support sets that are not countable. In fact, most often, the support set for a continuous random variable Y is an interval of real numbers; e.g., R = {y : 0 ≤ y ≤ 1}, R = {y : 0 < y < ∞}, R = {y : -∞ < y < ∞}, etc. Thus, probabilities of events involving continuous random variables must be assigned in a different way.

4.2 Cumulative distribution functions

TERMINOLOGY: The cumulative distribution function (cdf) of a random variable Y, denoted by F_Y(y), is given by the probability

F_Y(y) = P(Y ≤ y), for all -∞ < y < ∞.

Note that the cdf is defined for all y ∈ ℝ (the set of all real numbers), not just for those values of y ∈ R (the support of Y). Every random variable, discrete or continuous, has a cdf.
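For a discrete random variable, the cdf can be tabulated directly by accumulating the pmf over the support; a small sketch (the pmf used here is a made-up illustration, not from the notes):

```python
def cdf_from_pmf(pmf):
    # accumulate a discrete pmf (dict: support point -> probability) into a
    # right-continuous step cdf, returned as sorted (y, F(y)) pairs
    total, steps = 0.0, []
    for y in sorted(pmf):
        total += pmf[y]
        steps.append((y, total))
    return steps

pmf = {0: 0.2, 1: 0.5, 2: 0.3}     # hypothetical pmf
steps = cdf_from_pmf(pmf)
print(steps)                        # F jumps by p(y) at each support point
```

The last accumulated value is always 1, reflecting that the pmf sums to one.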
Example 4.1. Suppose that the random variable Y has pmf

p_Y(y) = (y + 1)/6, y = 0, 1, 2;  p_Y(y) = 0, otherwise.

We now compute probabilities of the form P(Y ≤ y):
- for y < 0, F_Y(y) = P(Y ≤ y) = 0;
- for 0 ≤ y < 1, F_Y(y) = P(Y ≤ y) = P(Y = 0) = 1/6;
- for 1 ≤ y < 2, F_Y(y) = P(Y ≤ y) = P(Y = 0) + P(Y = 1) = 1/6 + 2/6 = 1/2;
- for y ≥ 2, F_Y(y) = P(Y ≤ y) = P(Y = 0) + P(Y = 1) + P(Y = 2) = 1/6 + 2/6 + 3/6 = 1.

Putting this all together, we have the cdf for Y:

F_Y(y) = 0, y < 0;  1/6, 0 ≤ y < 1;  1/2, 1 ≤ y < 2;  1, y ≥ 2.

It is instructive to plot the pmf of Y and the cdf of Y side by side.

Figure 4.5: Probability mass function p_Y(y) and cumulative distribution function F_Y(y) in Example 4.1.

- PMF: The height of the bar above y is the probability that Y assumes that value. For any y not equal to 0, 1, or 2, p_Y(y) = 0.
- CDF: F_Y(y) is a nondecreasing function, with 0 ≤ F_Y(y) ≤ 1; this makes sense, since F_Y(y) = P(Y ≤ y) is a probability. The cdf F_Y(y) in this example takes a "step" at the support points and stays constant otherwise. The height of the step at a particular point is equal to the probability associated with that point. □

CDF PROPERTIES: Let Y be a random variable (discrete or continuous), and suppose that F_Y(y) is the cdf for Y. Then
(i) F_Y(y) satisfies lim_{y→-∞} F_Y(y) = 0 and lim_{y→∞} F_Y(y) = 1;
(ii) F_Y(y) is a right-continuous function; that is, for any real a, lim_{y→a+} F_Y(y) = F_Y(a);
(iii) F_Y(y) is a non-decreasing function; that is, y₁ ≤ y₂ ⟹ F_Y(y₁) ≤ F_Y(y₂).

EXERCISE: Graph the cdf for (a) Y ∼ b(5, 0.2) and (b) Y ∼ Poisson(2).

4.3 Continuous random variables

TERMINOLOGY: A random variable Y is said to be continuous if its cdf F_Y(y) is a continuous function of y.

REMARK: The cdfs associated with discrete random variables are step functions (see Example 4.1). Such functions are not continuous; however, they are still right-continuous.

OBSERVATION: We can immediately deduce that if Y is a continuous random variable, then P(Y = y) = 0 for all y. That is, specific points are assigned zero probability in continuous probability models. This must be true: if it were not, and P(Y = y) = p₀ > 0, then F_Y(y) would take a step of height p₀ at the
point y. This would then imply that F_Y(y) is not a continuous function.

TERMINOLOGY: Let Y be a continuous random variable with cdf F_Y(y). The probability density function (pdf) for Y, denoted by f_Y(y), is given by

f_Y(y) = (d/dy) F_Y(y),

provided that this derivative exists. Appealing to the Fundamental Theorem of Calculus, we know that

F_Y(y) = ∫_{-∞}^{y} f_Y(t) dt.

These are important facts that describe how the pdf and cdf of a continuous random variable are related. Because F_Y(y) = P(Y ≤ y), it should be clear that probabilities in continuous models are found by integration (compare this with how probabilities are obtained in discrete models).

PROPERTIES OF CONTINUOUS PDFs: Suppose that Y is a continuous random variable with pdf f_Y(y) and support R. Then
(1) f_Y(y) > 0, for all y ∈ R;
(2) the function f_Y(y) satisfies ∫_R f_Y(y) dy = 1.

CONTINUOUS MODELS: Probability density functions serve as theoretical models for continuous data, just as probability mass functions serve as models for discrete data. These models can be used to find probabilities associated with future random events.

Figure 4.6: Canadian male birth weight data. The histogram (left) is constructed from a sample of n = 1250 subjects. A normal probability density function has been fit to the empirical distribution (right). Male birth weights are in lbs.

Example 4.2. A team of Montreal researchers who studied the birth weights of five million Canadian babies born between 1981 and 2003 say environmental contaminants may be to blame for a drop in the size of newborn baby boys. A subset (n = 1250 subjects) of the birth weights, measured in lbs, is given in Figure 4.6. □

IMPORTANT: Suppose Y is a continuous random variable with pdf f_Y(y) and cdf F_Y(y). The probability of an event {Y ∈ B} is computed by integrating f_Y(y) over B; that is,

P(Y ∈ B) = ∫_B f_Y(y) dy,

for any B ⊂ ℝ. If B = {y : a ≤ y ≤ b} (i.e., B = [a, b]), then

P(Y ∈ B) = P(a ≤ Y ≤ b) = ∫_a^b f_Y(y) dy = F_Y(b) - F_Y(a).

Compare these to the analogous results for the discrete case (see page 29 in the notes). In the continuous
case, f_Y(y) replaces p_Y(y), and integrals replace sums.

RECALL: We have already discovered that if Y is a continuous random variable, then P(Y = a) = 0 for any constant a. This can also be seen by writing

P(Y = a) = P(a ≤ Y ≤ a) = ∫_a^a f_Y(y) dy = 0,

where f_Y(y) is the pdf of Y. An immediate consequence of this is that if Y is continuous,

P(a ≤ Y ≤ b) = P(a ≤ Y < b) = P(a < Y ≤ b) = P(a < Y < b) = ∫_a^b f_Y(y) dy.

Example 4.3. Suppose that Y has the pdf

f_Y(y) = 2y, 0 < y < 1;  f_Y(y) = 0, otherwise.

Find the cdf of Y.
SOLUTION: We need to compute F_Y(y) = P(Y ≤ y) for all y ∈ ℝ. There are three cases to consider.
- When y ≤ 0,

F_Y(y) = ∫_{-∞}^{y} f_Y(t) dt = ∫_{-∞}^{y} 0 dt = 0.

- When 0 < y < 1,

F_Y(y) = ∫_{-∞}^{y} f_Y(t) dt = ∫_{-∞}^{0} 0 dt + ∫_0^y 2t dt = 0 + t²|_{t=0}^{y} = y².

- When y ≥ 1,

F_Y(y) = ∫_{-∞}^{0} 0 dt + ∫_0^1 2t dt + ∫_1^y 0 dt = 0 + 1 + 0 = 1.

Putting this all together, we have

F_Y(y) = 0, y ≤ 0;  y², 0 < y < 1;  1, y ≥ 1.

The pdf f_Y(y) and the cdf F_Y(y) are plotted side by side in Figure 4.7.

Figure 4.7: Probability density function f_Y(y) and cumulative distribution function F_Y(y) in Example 4.3.

EXERCISE: Find (a) P(0.3 < Y < 0.7), (b) P(Y = 0.3), and (c) P(Y > 0.7). □

Example 4.4. From the onset of infection, the survival time Y (measured in years) of patients with chronic active hepatitis receiving prednisolone is modeled with the pdf

f_Y(y) = (1/10)e^{-y/10}, y > 0;  f_Y(y) = 0, otherwise.

Find the cdf of Y.
SOLUTION: We need to compute F_Y(y) = P(Y ≤ y) for all y ∈ ℝ. There are two cases to consider.
- When y ≤ 0,

F_Y(y) = ∫_{-∞}^{y} f_Y(t) dt = ∫_{-∞}^{y} 0 dt = 0.

- When y > 0,

F_Y(y) = ∫_{-∞}^{0} 0 dt + ∫_0^y (1/10)e^{-t/10} dt = -e^{-t/10}|_{t=0}^{y} = 1 - e^{-y/10}.

Putting this all together, we have

F_Y(y) = 0, y ≤ 0;  1 - e^{-y/10}, y > 0.

The pdf f_Y(y) and the cdf F_Y(y) are plotted side by side in Figure 4.8.

Figure 4.8: Probability density function f_Y(y) and cumulative distribution function F_Y(y) in Example 4.4.

EXERCISE: What is the probability a patient survives 15 years after being diagnosed? Less than 5 years? Between 10 and 20 years? □
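The two cdfs derived in Examples 4.3 and 4.4 can be spot-checked by numerically integrating the pdfs (a sketch, not part of the original notes; the midpoint-rule helper is a hypothetical name):

```python
from math import exp

def cdf_numeric(f, y, lo=0.0, n=20_000):
    # F_Y(y) ≈ ∫_lo^y f(t) dt via the midpoint rule (support starts at lo)
    if y <= lo:
        return 0.0
    h = (y - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

F1 = cdf_numeric(lambda t: 2 * t, 0.5)                # Example 4.3: F(0.5) = 0.25
F2 = cdf_numeric(lambda t: exp(-t / 10) / 10, 15.0)   # Example 4.4: F(15) = 1 - e^(-1.5)
print(round(F1, 4), round(F2, 4))
```

Both values match the closed-form cdfs y² and 1 - e^{-y/10} evaluated at the same points.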
Example 4.5. Suppose that Y has the pdf

f_Y(y) = cye^{-y/2}, y > 0;  f_Y(y) = 0, otherwise.

Find the value of c that makes this a valid pdf.
SOLUTION: Because f_Y(y) is a pdf, we know that

∫_{-∞}^{∞} f_Y(y) dy = ∫_0^{∞} cye^{-y/2} dy = 1.

Using integration by parts with u = cy and dv = e^{-y/2} dy, we have

∫_0^{∞} cye^{-y/2} dy = -2cye^{-y/2}|_{y=0}^{∞} + ∫_0^{∞} 2ce^{-y/2} dy = 0 + 4c.

Solving 4c = 1 for c, we get c = 1/4. □

QUANTILES: Suppose that Y is a continuous random variable with cdf F_Y(y), and let 0 < p < 1. The pth quantile of the distribution of Y, denoted by φ_p, solves

F_Y(φ_p) = P(Y ≤ φ_p) = ∫_{-∞}^{φ_p} f_Y(y) dy = p.

The median of the distribution of Y is the p = 0.5 quantile. That is, the median φ_{0.5} solves

0.5 = F_Y(φ_{0.5}) = P(Y ≤ φ_{0.5}) = ∫_{-∞}^{φ_{0.5}} f_Y(y) dy.

Another name for the pth quantile is the 100pth percentile.

EXERCISE: Find the median of Y in Examples 4.3, 4.4, and 4.5.

REMARK: For Y discrete, there are some potential problems with the definition that φ_p solves F_Y(φ_p) = P(Y ≤ φ_p) = p. The reason is that there may be many values of φ_p that satisfy this equation. For example, in Example 4.1, it is easy to see that the median φ_{0.5} = 1, because F_Y(1) = P(Y ≤ 1) = 0.5. However, φ_{0.5} = 1.5 also satisfies F_Y(φ_{0.5}) = 0.5. By convention, in discrete distributions, the pth quantile φ_p is taken to be the smallest value satisfying F_Y(φ_p) = P(Y ≤ φ_p) ≥ p.

4.4 Mathematical expectation

4.4.1 Expected value

TERMINOLOGY: Let Y be a continuous random variable with pdf f_Y(y) and support R. The expected value of Y is given by

E(Y) = ∫_R y f_Y(y) dy.

Mathematically, we require that

∫_R |y| f_Y(y) dy < ∞.

If this is not true, we say that E(Y) does not exist. If g is a real-valued function, then g(Y) is a random variable, and

E[g(Y)] = ∫_R g(y) f_Y(y) dy,

provided that this integral exists.

Example 4.6. Suppose that Y has pdf given by

f_Y(y) = 2y, 0 < y < 1;  f_Y(y) = 0, otherwise.

Find E(Y), E(Y²), and E(ln Y).
SOLUTION: The expected value of Y is given by

E(Y) = ∫_0^1 y f_Y(y) dy = ∫_0^1 y(2y) dy = ∫_0^1 2y² dy = (2y³/3)|_{y=0}^{1} = 2/3.

The second moment is

E(Y²) = ∫_0^1 y²(2y) dy = ∫_0^1 2y³ dy = (y⁴/2)|_{y=0}^{1} = 1/2.

Finally,

E(ln Y) = ∫_0^1 (ln y)(2y) dy.

To solve this integral, use integration by parts with u = ln y and dv = 2y dy:

E(ln Y) = y² ln y|_{y=0}^{1} - ∫_0^1 y dy = 0 - (y²/2)|_{y=0}^{1} = -1/2. □
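The three expectations in Example 4.6 can be confirmed by numerical integration against the pdf f_Y(y) = 2y (a sketch, not part of the original notes):

```python
from math import log

def expect(g, n=200_000):
    # E[g(Y)] = ∫_0^1 g(y) * 2y dy for the pdf f(y) = 2y, via the midpoint rule
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) * 2 * ((i + 0.5) * h) for i in range(n))

print(round(expect(lambda y: y), 4))        # E(Y)    = 2/3
print(round(expect(lambda y: y * y), 4))    # E(Y^2)  = 1/2
print(round(expect(lambda y: log(y)), 4))   # E(ln Y) = -1/2
```

Note that E(ln Y) converges even though ln y is unbounded near 0, because the pdf vanishes there fast enough.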
PROPERTIES OF EXPECTATIONS: Let Y be a continuous random variable with pdf f_Y(y) and support R, suppose that g, g₁, g₂, ..., g_k are real-valued functions, and let c be any real constant. Then,
(a) E(c) = c;
(b) E[cg(Y)] = cE[g(Y)];
(c) E[Σ_{j=1}^{k} g_j(Y)] = Σ_{j=1}^{k} E[g_j(Y)].
These properties are identical to those we discussed in the discrete case.

4.4.2 Variance

TERMINOLOGY: Let Y be a continuous random variable with pdf f_Y(y), support R, and mean E(Y) = μ. The variance of Y is given by

σ² = V(Y) = E[(Y - μ)²] = ∫_R (y - μ)² f_Y(y) dy.

The variance computing formula still applies in the continuous case; that is,

V(Y) = E(Y²) - [E(Y)]².

Example 4.7. Suppose that Y has pdf given by

f_Y(y) = 2y, 0 < y < 1;  f_Y(y) = 0, otherwise.

Find σ² = V(Y).
SOLUTION: We computed E(Y) = μ = 2/3 in Example 4.6. Using the definition above,

V(Y) = ∫_0^1 (y - 2/3)²(2y) dy.

Instead of doing this integral, it is easier to use the variance computing formula V(Y) = E(Y²) - [E(Y)]². In Example 4.6, we computed the second moment E(Y²) = 1/2. Thus,

V(Y) = E(Y²) - [E(Y)]² = 1/2 - (2/3)² = 1/18. □

4.4.3 Moment generating functions

TERMINOLOGY: Let Y be a continuous random variable with pdf f_Y(y) and support R. The moment generating function (mgf) for Y, denoted by m_Y(t), is given by

m_Y(t) = E(e^{tY}) = ∫_R e^{ty} f_Y(y) dy,

provided E(e^{tY}) < ∞ for all t in an open neighborhood about 0; i.e., there exists some h > 0 such that E(e^{tY}) < ∞ for all t ∈ (-h, h). If E(e^{tY}) does not exist in an open neighborhood of 0, we say that the moment generating function does not exist.

Example 4.8. Suppose that the pdf of Y is given by

f_Y(y) = e^{-y}, y > 0;  f_Y(y) = 0, otherwise.

Find the mgf of Y, and use it to compute E(Y) and V(Y).
SOLUTION:

m_Y(t) = E(e^{tY}) = ∫_0^{∞} e^{ty} e^{-y} dy = ∫_0^{∞} e^{-y(1-t)} dy = [-e^{-y(1-t)}/(1 - t)]|_{y=0}^{∞}.

In the last expression, note that

lim_{y→∞} -e^{-y(1-t)}/(1 - t) < ∞

if and only if 1 - t > 0; i.e., t < 1. Thus, for t < 1, we have

m_Y(t) = 0 - [-1/(1 - t)] = 1/(1 - t).

Note that (-h, h), with h = 1, is an open neighborhood around zero for which m_Y(t) exists. With the mgf, we can calculate the mean and variance.
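As a quick check on the mgf just derived, the mean and second moment of Y can be recovered by numerically differentiating m_Y(t) = 1/(1 - t) at t = 0 (a sketch, not part of the original notes):

```python
def m(t):
    # mgf from Example 4.8: m_Y(t) = 1/(1 - t), valid for t < 1
    return 1.0 / (1.0 - t)

h = 1e-5
m1 = (m(h) - m(-h)) / (2 * h)             # central difference: E(Y) = 1
m2 = (m(h) - 2 * m(0.0) + m(-h)) / h**2   # second difference: E(Y^2) = 2
print(round(m1, 4), round(m2, 3))         # variance: m2 - m1**2 = 1
```

The finite-difference estimates mirror the analytic derivatives computed next.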
mytamp1it 217t39 W m y t 1 3 47 2 t0 1 0 WY EY2 ECl2 2 i 12 1 The second moment is EY2 d2 EWY t The computing formula gives EXERCISE Find EY and VY without using the mgf D 45 Uniform distribution TERMINOLOGY A random variable Y is said to have a uniform distribution from 91 to 02 if its pdf is given by 1 fYl m7 0 01 lt y lt 62 otherwise 7 Shorthand notation is Y 10 62 Note that this is a valid density because fyy gt 0 for allyERy01 lty lt62 and 92 92 1 y d 7 d 7 Alt911 A 62761 1 62761 STANDARD UNIFORM A popular member of the U6162 family is the 107 1 dis 9262761139 91 62761 tribution ie7 a uniform distribution with parameters 91 0 and 02 1 This model is used extensively in computer programs to simulate random numbers PAGE 74 CHAPTER 4 STATMATH 5117 J TEBBS Example 49 Derive the Cdf of Y 10 62 SOLUTION We need to compute Fyy PY S y for all y E R There are three cases to consider 0 when y S 01 Fyy fytdt Odt 0 y l 7 dt 192491 9 91 91 627617 0 when 91 lty lt02 FYyOfytdt Odt N d t 0 62 61 0 when y 2 02 y 91 92 1 y Fyy fytdt 0dt 7 dt Odt0101 foo foo 91 62761 92 Putting this all together7 we have 07 y S 91 Fyy 3299 01 ltylt02 17 y 2 92 The 107 1 pdf fyy and Cdf Fyy are plotted side by side in Figure 49 EXERCISE If Y N U01 nd a P02 lt Y lt 04 and b PY gt 075 D MEAN AND VARIANCE If Y 210 62 then 92 i 902 06 my 12 2 12 and VY UNIFORM MGF If Y N 10 62 then 92 91 e e 7 t 0 mYt 92141 t 7g 0 EXERCISE Derive the formulas for EY and VY PAGE 75 CHAPTER 4 STATMATH 5117 J TEBBS pdf fad cdi Fad Figure 49 The 107 1 probability density function and cumulative distribution function 46 Normal distribution TERMINOLOGY A random variable Y is said to have a normal distribution if its pdf is given by 1 new fyy We 7 700 lt y lt 00 0 otherwise Shorthand notation is Y N NW 02 There are two parameters in the normal distribu tion the mean EY u and the variance VY 02 FACTS a The Nuoz pdf is symmetric about u that is7 for any a E R MM 7 a MM a b The Nuoz pdf has points of in ection located at y u i 
σ (verify!).
(c) lim_{y→±∞} f_Y(y) = 0.

TERMINOLOGY: A normal distribution with mean μ = 0 and variance σ² = 1 is called the standard normal distribution. It is conventional to let Z denote a random variable that follows a standard normal distribution; we write Z ∼ N(0, 1).

IMPORTANT: Tabled values of the standard normal probabilities are given in Appendix III, Table 4, pp. 848 of WMS. This table turns out to be helpful, since the integral

F_Y(y) = P(Y ≤ y) = ∫_{-∞}^{y} [1/(√(2π)σ)] e^{-(t-μ)²/(2σ²)} dt

does not exist in closed form. Specifically, the table provides values of

1 - F_Z(z) = P(Z > z) = ∫_z^{∞} f_Z(u) du,

where f_Z(u) denotes the nonzero part of the standard normal pdf; i.e.,

f_Z(u) = (1/√(2π)) e^{-u²/2}.

To use the table, we need to first prove that any N(μ, σ²) distribution can be "transformed" to the standard N(0, 1) distribution (we'll see how to do this later). Once we do this, we will see that there is only a need for one table of probabilities. Of course, probabilities like F_Y(y) = P(Y ≤ y) can be obtained using software too.

Example 4.10. Show that the N(μ, σ²) pdf integrates to 1.
Proof. Let z = (y - μ)/σ, so that dz = dy/σ and dy = σ dz. Define

I = ∫_{-∞}^{∞} [1/(√(2π)σ)] e^{-(y-μ)²/(2σ²)} dy = ∫_{-∞}^{∞} (1/√(2π)) e^{-z²/2} dz.

We want to show that I = 1. Since I > 0, it suffices to show that I² = 1. Note that

I² = [∫_{-∞}^{∞} (1/√(2π)) e^{-x²/2} dx][∫_{-∞}^{∞} (1/√(2π)) e^{-y²/2} dy] = (1/2π) ∫_{-∞}^{∞} ∫_{-∞}^{∞} exp[-(x² + y²)/2] dx dy.

Switching to polar coordinates (i.e., letting x = r cos θ and y = r sin θ), we get x² + y² = r²(cos² θ + sin² θ) = r², and dx dy = r dr dθ (i.e., r is the Jacobian of the transformation from (x, y)-space to (r, θ)-space). Thus, we write

I² = (1/2π) ∫_{θ=0}^{2π} ∫_{r=0}^{∞} e^{-r²/2} r dr dθ = (1/2π) ∫_{θ=0}^{2π} [∫_{r=0}^{∞} re^{-r²/2} dr] dθ = (1/2π) ∫_{θ=0}^{2π} 1 dθ = 2π/(2π) = 1. □

NORMAL MGF: Suppose that Y ∼ N(μ, σ²). The mgf of Y is

m_Y(t) = exp(μt + σ²t²/2).

Proof. Using the definition of the mgf, we have

m_Y(t) = E(e^{tY}) = ∫_{-∞}^{∞} e^{ty} [1/(√(2π)σ)] e^{-(y-μ)²/(2σ²)} dy = [1/(√(2π)σ)] ∫_{-∞}^{∞} e^{ty - (y-μ)²/(2σ²)} dy.

Define b = ty - (y - μ)²/(2σ²), the exponent in the last integral. We are going to rewrite b in the following way:

b = ty - (1/2σ²)(y² - 2μy + μ²) = -(1/2σ²)(y² - 2μy - 2σ²ty + μ²)
= -(1/2σ²)[y² - 2(μ + σ²t)y + μ²]
= -(1/2σ²)[y² - 2(μ + σ²t)y + (μ + σ²t)² - (μ + σ²t)² + μ²]  (add and subtract to complete the square)
= -(1/2σ²)[y - (μ + σ²t)]² + (1/2σ²)[(μ + σ²t)² - μ²]
= -(1/2σ²)(y - a)² + μt + σ²t²/2
= -(1/2σ²)(y - a)² + c, say,

where a = μ + σ²t. Noting that c = μt + σ²t²/2 is free of y, we have

m_Y(t) = [1/(√(2π)σ)] ∫_{-∞}^{∞} e^b dy = e^c ∫_{-∞}^{∞} [1/(√(2π)σ)] e^{-(y-a)²/(2σ²)} dy = e^c,

since the last integrand is the N(a, σ²) density, which integrates to 1. Now, finally, note that

e^c = exp(c) = exp(μt + σ²t²/2). □

EXERCISE: Use the mgf to verify that E(Y) = μ and V(Y) = σ².

IMPORTANT: Suppose that Y ∼ N(μ, σ²). Then the random variable

Z = (Y - μ)/σ ∼ N(0, 1).

Proof. Let Z = (Y - μ)/σ. The mgf of Z is given by

m_Z(t) = E(e^{tZ}) = E[e^{t(Y-μ)/σ}] = e^{-μt/σ} E(e^{(t/σ)Y}) = e^{-μt/σ} m_Y(t/σ) = e^{-μt/σ} exp[μ(t/σ) + σ²(t/σ)²/2] = e^{t²/2},

which is the mgf of a N(0, 1) random variable. Thus, by the uniqueness of moment generating functions, we know that Z ∼ N(0, 1). □

USEFULNESS: From the last result, we know that if Y ∼ N(μ, σ²), then the event

{y₁ < Y < y₂} = {(y₁ - μ)/σ < (Y - μ)/σ < (y₂ - μ)/σ} = {(y₁ - μ)/σ < Z < (y₂ - μ)/σ}.

As a result,

P(y₁ < Y < y₂) = P[(y₁ - μ)/σ < Z < (y₂ - μ)/σ] = F_Z[(y₂ - μ)/σ] - F_Z[(y₁ - μ)/σ],

where F_Z(·) is the cdf of the N(0, 1) distribution. Note also that F_Z(-z) = 1 - F_Z(z), for z > 0 (verify!). The standard normal table (Table 4, pp. 848) gives values of 1 - F_Z(z), for z > 0.

Figure 4.10: Probability density function f_Y(y) in Example 4.11. A model for mercury contamination in large-mouth bass; mercury levels in parts per million (ppm).

Example 4.11. Young large-mouth bass were studied to examine the level of mercury contamination, Y, measured in parts per million, which varies according to a normal distribution with mean μ = 18 and variance σ² = 16, depicted in Figure 4.10.
(a) What proportion of contamination levels are between 11 and 21 parts per million?
SOLUTION: We want P(11 < Y < 21). By standardizing, we see that

P(11 < Y < 21) = P[(11 - 18)/4 < (Y - 18)/4 < (21 - 18)/4] = P(-1.75 < Z < 0.75) = F_Z(0.75) - F_Z(-1.75) = 0.7734 - 0.0401 = 0.7333.

(b) For this model, ninety percent of all contamination levels are above what mercury level?
SOLUTION: We want to find φ_{0.10}, the 10th percentile of Y ∼ N(18, 16); i.e., φ_{0.10} solves

F_Y(φ_{0.10}) = P(Y ≤ φ_{0.10}) = 0.10.

We'll start by finding φ*_{0.10}, the 10th percentile of Z ∼ N(0, 1); i.e., φ*_{0.10} solves F_Z(φ*_{0.10}) = P(Z ≤ φ*_{0.10}) = 0.10. From the standard normal table (Table 4), we see that φ*_{0.10} ≈ -1.28. We are left to solve the equation

(φ_{0.10} - 18)/4 ≈ -1.28 ⟹ φ_{0.10} ≈ -1.28(4) + 18 = 12.88.

Thus, 90 percent of all contamination levels are greater than 12.88 parts per million. □

4.7 The gamma family of distributions

INTRODUCTION: In this section, we examine an important family of probability distributions; namely, those in the gamma family. There are three well-known "named" distributions in this family:
- the exponential distribution,
- the gamma distribution,
- the χ² distribution.

NOTE: The exponential and gamma distributions are popular models for lifetime random variables; i.e., random variables that record "time to event" measurements, such as the lifetimes of an electrical component, death times for human subjects, waiting times in Poisson processes, etc. Other lifetime distributions include the lognormal, Weibull, and log-gamma, among others.

4.7.1 Exponential distribution

TERMINOLOGY: A random variable Y is said to have an exponential distribution with parameter β > 0 if its pdf is given by

f_Y(y) = (1/β)e^{-y/β}, y > 0;  f_Y(y) = 0, otherwise.

Shorthand notation is Y ∼ exponential(β). The value of β determines the scale of the distribution, so it is called a scale parameter.

EXERCISE: Show that the exponential pdf integrates to 1.

EXPONENTIAL MGF: Suppose that Y ∼ exponential(β). The mgf of Y is given by

m_Y(t) = 1/(1 - βt), for t < 1/β.

Proof. From the definition of the mgf, we have

m_Y(t) = E(e^{tY}) = ∫_0^{∞} e^{ty} (1/β)e^{-y/β} dy = (1/β) ∫_0^{∞} e^{-y(1/β - t)} dy = (1/β) [-e^{-y(1/β-t)}/(1/β - t)]|_{y=0}^{∞}.

In the last expression, note that

lim_{y→∞} -e^{-y(1/β-t)}/(1/β - t) < ∞

if and only if 1/β - t > 0; i.e., t < 1/β. Thus, for t < 1/β, we have

m_Y(t) = (1/β) × 1/(1/β - t) = 1/(1 - βt).

Note that (-h, h), with h = 1/β, is an open neighborhood around 0 for which m_Y(t) exists. □
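Circling back to Example 4.11 for a moment: both answers can be reproduced in software with the error function, using the identity F_Z(z) = [1 + erf(z/√2)]/2 (a sketch, not part of the original notes; the bisection inverse is a hypothetical helper, not a method from the notes):

```python
from math import erf, sqrt

def Phi(z):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 18.0, 4.0
part_a = Phi((21 - mu) / sigma) - Phi((11 - mu) / sigma)   # ≈ 0.7333

# part (b): invert F_Y by bisection to find the 10th percentile
lo, hi = 0.0, 36.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if Phi((mid - mu) / sigma) < 0.10 else (lo, mid)
part_b = (lo + hi) / 2
print(round(part_a, 4), round(part_b, 2))
```

The percentile comes out near 12.87; the notes' 12.88 reflects rounding the table value -1.28 for the standard normal 10th percentile.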
Figure 4.11: The probability density function f_Y(y) in Example 4.12. A model for electrical component lifetimes (in hours).

MEAN AND VARIANCE: Suppose that Y ∼ exponential(β). The mean and variance of Y are given by

E(Y) = β and V(Y) = β².

Proof. Exercise. □

Example 4.12. The lifetime of an electrical component has an exponential distribution with mean β = 500 hours. What is the probability that a randomly selected component (a) fails before 100 hours? (b) lasts between 250 and 750 hours?
SOLUTION: With β = 500, the pdf for Y is given by

f_Y(y) = (1/500)e^{-y/500}, y > 0;  f_Y(y) = 0, otherwise.

This pdf is depicted in Figure 4.11. Thus, the probability of failing before 100 hours is

P(Y < 100) = ∫_0^{100} (1/500)e^{-y/500} dy ≈ 0.181.

Similarly, the probability of failing between 250 and 750 hours is

P(250 < Y < 750) = ∫_{250}^{750} (1/500)e^{-y/500} dy ≈ 0.383. □

EXPONENTIAL CDF: Suppose that Y ∼ exponential(β). Then, the cdf of Y exists in closed form and is given by

F_Y(y) = 0, y ≤ 0;  F_Y(y) = 1 - e^{-y/β}, y > 0.

Proof. Exercise. □

THE MEMORYLESS PROPERTY: Suppose that Y ∼ exponential(β), and let r and s be positive constants. Then

P(Y > r + s | Y > r) = P(Y > s).

That is, given that the lifetime Y has exceeded r, the probability that Y exceeds r + s (i.e., an additional s units) is the same as if we were to look at Y unconditionally lasting until time s. Put another way, that Y has actually "made it" to time r has been forgotten. The exponential random variable is the only continuous random variable that possesses the memoryless property.

RELATIONSHIP WITH A POISSON PROCESS: Suppose that we are observing events according to a Poisson process with rate λ = 1/β, and let the random variable W denote the time until the first occurrence. Then, W ∼ exponential(β).
Proof. Clearly, W is a continuous random variable with nonnegative support. Thus, for w ≥ 0, we have

F_W(w) = P(W ≤ w) = 1 - P(W > w) = 1 - P(no events in (0, w]) = 1 - e^{-λw}(λw)⁰/0! = 1 - e^{-λw}.

Substituting λ = 1/β, we have F_W(w) = 1 - e^{-w/β}, the cdf of an exponential random
variable with mean β. Thus, the result follows. □

Example 4.13. Suppose that customers arrive at a check-out according to a Poisson process with mean λ = 12 per hour. What is the probability that we will have to wait longer than 10 minutes to see the first customer? (NOTE: 10 minutes is 1/6th of an hour.)
SOLUTION: The time until the first arrival, say W, follows an exponential distribution with mean β = 1/λ = 1/12, so that the cdf of W, for w > 0, is F_W(w) = 1 - e^{-12w}. Thus, the desired probability is

P(W > 1/6) = 1 - P(W ≤ 1/6) = 1 - F_W(1/6) = 1 - (1 - e^{-12(1/6)}) = e^{-2} ≈ 0.135. □

4.7.2 Gamma distribution

TERMINOLOGY: The gamma function is a real function of t, defined by

Γ(t) = ∫_0^{∞} y^{t-1}e^{-y} dy, for all t > 0.

The gamma function satisfies the recursive relationship

Γ(α) = (α - 1)Γ(α - 1), for α > 1.

From this fact, we can deduce that if α is an integer, then Γ(α) = (α - 1)!. For example, Γ(5) = 4! = 24.

TERMINOLOGY: A random variable Y is said to have a gamma distribution with parameters α > 0 and β > 0 if its pdf is given by

f_Y(y) = [1/(Γ(α)β^α)] y^{α-1} e^{-y/β}, y > 0;  f_Y(y) = 0, otherwise.

Shorthand notation is Y ∼ gamma(α, β). The gamma distribution is indexed by two parameters: α, the shape parameter, and β, the scale parameter.

Figure 4.12: Four gamma pdfs. Upper left: α = 4. Upper right: α = 2, β = 1. Lower left: α = 3, β = 4. Lower right: α = 6, β = 3.

REMARK: By changing the values of α and β, the gamma pdf can assume many shapes. This makes the gamma distribution popular for modeling lifetime data. Note that when α = 1, the gamma pdf reduces to the exponential(β) pdf; that is, the exponential pdf is a "special" gamma pdf.

Example 4.14. Show that the gamma(α, β) pdf integrates to 1.
SOLUTION: Change the variable of integration to u = y/β, so that du = dy/β and dy = β du. We have

∫_0^{∞} f_Y(y) dy = [1/(Γ(α)β^α)] ∫_0^{∞} y^{α-1}e^{-y/β} dy = [1/(Γ(α)β^α)] ∫_0^{∞} (βu)^{α-1}e^{-u} β du = [1/Γ(α)] ∫_0^{∞} u^{α-1}e^{-u} du = Γ(α)/Γ(α) = 1. □

GAMMA MGF: Suppose that Y ∼ gamma(α, β). The mgf of Y is

m_Y(t) = [1/(1 - βt)]^α, for t < 1/β.

Proof. From the definition of the mgf, we have
m_Y(t) = E(e^{tY}) = ∫_0^{∞} e^{ty} [1/(Γ(α)β^α)] y^{α-1}e^{-y/β} dy = [1/(Γ(α)β^α)] ∫_0^{∞} y^{α-1}e^{-y(1/β - t)} dy = [1/(Γ(α)β^α)] ∫_0^{∞} y^{α-1}e^{-y/η} dy,

where η = (1/β - t)^{-1}. If η > 0 ⟺ t < 1/β, then the last integral equals Γ(α)η^α, because [1/(Γ(α)η^α)] y^{α-1}e^{-y/η} is the gamma(α, η) pdf, and integration is over R = {y : 0 < y < ∞}. Thus,

m_Y(t) = Γ(α)η^α/[Γ(α)β^α] = (η/β)^α = [1/(1 - βt)]^α.

Note that (-h, h), with h = 1/β, is an open neighborhood around 0 for which m_Y(t) exists. □

MEAN AND VARIANCE: If Y ∼ gamma(α, β), then

E(Y) = αβ and V(Y) = αβ².

NOTE: Upon closer inspection, we see that the nonzero part of the gamma(α, β) pdf

f_Y(y) = [1/(Γ(α)β^α)] y^{α-1}e^{-y/β}

consists of two parts:
- the kernel of the pdf, y^{α-1}e^{-y/β}, and
- a constant out front, 1/(Γ(α)β^α).

The kernel is the "guts" of the formula, while the constant out front is simply the "right quantity" that makes f_Y(y) a valid pdf; i.e., the constant which makes f_Y(y) integrate to 1. Note that because

∫_0^{∞} [1/(Γ(α)β^α)] y^{α-1}e^{-y/β} dy = 1,

it follows immediately that

∫_0^{∞} y^{α-1}e^{-y/β} dy = Γ(α)β^α.

This fact is extremely fascinating in its own right, and it is very helpful too; we will use it repeatedly.

Example 4.15. Suppose that Y has pdf given by

f_Y(y) = cy²e^{-y/4}, y > 0;  f_Y(y) = 0, otherwise.

(a) What is the value of c that makes this a valid pdf?
(b) What is the mgf of Y?
(c) What are the mean and variance of Y?
SOLUTIONS: Note that y²e^{-y/4} is a gamma kernel with α = 3 and β = 4. Thus, the constant out front is

c = 1/(Γ(α)β^α) = 1/[Γ(3)4³] = 1/128.

The mgf of Y is

m_Y(t) = [1/(1 - 4t)]³, for t < 1/4.

Finally, E(Y) = αβ = 3(4) = 12 and V(Y) = αβ² = 3(4)² = 48. □

RELATIONSHIP WITH A POISSON PROCESS: Suppose that we are observing events according to a Poisson process with rate λ = 1/β, and let the random variable W denote the time until the αth occurrence. Then, W ∼ gamma(α, β).
Proof. Clearly, W is a continuous random variable with nonnegative support. Thus, for w ≥ 0, we have

F_W(w) = P(W ≤ w) = 1 - P(W > w) = 1 - P(fewer than α events in (0, w]) = 1 - Σ_{j=0}^{α-1} e^{-λw}(λw)^j/j!.

The pdf of W, f_W(w), is equal to F'_W(w), provided that this derivative exists. For w > 0,

f_W(w) = F'_W(w) = λe^{-λw} Σ_{j=0}^{α-1} (λw)^j/j! - e^{-λw} Σ_{j=1}^{α-1} jλ(λw)^{j-1}/j! = λe^{-λw}(λw)^{α-1}/(α - 1)!  (telescoping sum)
= [λ^α/Γ(α)] w^{α-1}e^{-λw}.

Substituting λ = 1/β,

f_W(w) = [1/(Γ(α)β^α)] w^{α-1}e^{-w/β}, for w > 0,

which is the pdf for the gamma(α, β) distribution. □

Example 4.16. Suppose that customers arrive at a check-out according to a Poisson process with mean λ = 12 per hour. What is the probability that we will have to wait longer than 10 minutes to see the third customer? (NOTE: 10 minutes is 1/6th of an hour.)
SOLUTION: The time until the third arrival, say W, follows a gamma distribution with parameters α = 3 and β = 1/λ = 1/12, so that the pdf of W, for w > 0, is f_W(w) = 864w²e^{-12w}. Thus, the desired probability is

P(W > 1/6) = 1 - P(W ≤ 1/6) = 1 - ∫_0^{1/6} 864w²e^{-12w} dw = 1 - 0.323 ≈ 0.677. □

4.7.3 χ² distribution

TERMINOLOGY: Let ν be a positive integer. In the gamma(α, β) family, when α = ν/2 and β = 2, we call the resulting distribution a χ² distribution with ν degrees of freedom. We write Y ∼ χ²(ν).

NOTE: At this point, it suffices to accept the fact that the χ² distribution is simply a "special" gamma distribution. However, it should be noted that the χ² distribution is used extensively in applied statistics. In fact, many statistical procedures used in practice are valid because of this model.

χ² PDF: If Y ∼ χ²(ν), then the pdf of Y is

f_Y(y) = [1/(Γ(ν/2)2^{ν/2})] y^{ν/2-1}e^{-y/2}, y > 0;  f_Y(y) = 0, otherwise.

χ² MGF: Suppose that Y ∼ χ²(ν). The mgf of Y is

m_Y(t) = [1/(1 - 2t)]^{ν/2}, for t < 1/2.

MEAN AND VARIANCE: If Y ∼ χ²(ν), then E(Y) = ν and V(Y) = 2ν.

TABLED VALUES FOR CDF: Because the χ² distribution is so pervasive in applied statistics, tables of probabilities are common. Appendix III, Table 6 (WMS, pp. 850-851) provides the upper-α quantiles χ²_α, which satisfy

α = P(Y > χ²_α) = ∫_{χ²_α}^{∞} [1/(Γ(ν/2)2^{ν/2})] y^{ν/2-1}e^{-y/2} dy,

for different values of α and degrees of freedom ν.
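The waiting-time calculation in Example 4.16 can be cross-checked two ways: through the Poisson-count identity P(W > w) = P(fewer than 3 events in (0, w]) and by numerically integrating the gamma pdf (a sketch, not part of the original notes):

```python
from math import exp, factorial

lam, alpha, w = 12.0, 3, 1.0 / 6.0
# identity: P(W > w) equals a Poisson(lam * w) left tail
tail = sum(exp(-lam * w) * (lam * w)**j / factorial(j) for j in range(alpha))

# numerical check: integrate f_W(w) = 864 w^2 e^(-12w) over (0, 1/6]
n = 20_000
h = w / n
integral = h * sum(864 * ((i + 0.5) * h)**2 * exp(-12 * (i + 0.5) * h) for i in range(n))
print(round(tail, 4), round(1 - integral, 4))   # both ≈ 0.677
```

The integral over (0, 1/6] comes out near 0.323, confirming P(W > 1/6) = 5e^{-2} ≈ 0.677.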
4.8 Beta distribution

TERMINOLOGY: A random variable Y is said to have a beta distribution with parameters α > 0 and β > 0 if its pdf is given by

f_Y(y) = [Γ(α + β)/(Γ(α)Γ(β))] y^{α-1}(1 - y)^{β-1}, 0 < y < 1;  f_Y(y) = 0, otherwise.

Since the support of Y is R = {y : 0 < y < 1}, the beta distribution is a popular probability model for proportions. Shorthand notation is Y ∼ beta(α, β).

NOTE: Upon closer inspection, we see that the nonzero part of the beta(α, β) pdf

f_Y(y) = [Γ(α + β)/(Γ(α)Γ(β))] y^{α-1}(1 - y)^{β-1}

consists of two parts:
- the kernel of the pdf, y^{α-1}(1 - y)^{β-1}, and
- a constant out front, Γ(α + β)/(Γ(α)Γ(β)).

Again, the kernel is the "guts" of the formula, while the constant out front is simply the "right quantity" that makes f_Y(y) a valid pdf; i.e., the constant which makes f_Y(y) integrate to 1. Note that because

∫_0^1 [Γ(α + β)/(Γ(α)Γ(β))] y^{α-1}(1 - y)^{β-1} dy = 1,

it follows immediately that

∫_0^1 y^{α-1}(1 - y)^{β-1} dy = Γ(α)Γ(β)/Γ(α + β).

BETA PDF SHAPES: The beta pdf is very flexible; that is, by changing the values of α and β, we can come up with many different pdf shapes. See Figure 4.13 for examples.
- When α = β, the pdf is symmetric about the line y = 1/2.
- When α < β, the pdf is skewed right (i.e., smaller values of y are more likely).
- When α > β, the pdf is skewed left (i.e., larger values of y are more likely).
- When α = β = 1, the beta pdf reduces to the U(0, 1) pdf!

Figure 4.13: Four beta pdfs. Upper left: α = 2, β = 1. Upper right: α = 2, β = 2. Lower left: α = 3, β = 2. Lower right: α = 1, β = 1/4.

BETA MGF: The beta(α, β) mgf exists, but not in closed form. Hence, we'll compute moments directly.
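Since the mgf is unavailable, beta moments come straight from integrals of the kernel. As a quick numerical illustration (the beta(2, 2) choice here is a hypothetical example, not from the notes), the mean of a beta(2, 2) random variable can be computed by integrating y·f_Y(y) directly:

```python
# beta(2, 2): constant = Gamma(4)/(Gamma(2)Gamma(2)) = 3!/(1!*1!) = 6,
# so the pdf is f(y) = 6y(1 - y) on (0, 1)
n = 100_000
h = 1.0 / n
mean = h * sum(((i + 0.5) * h) * 6 * ((i + 0.5) * h) * (1 - (i + 0.5) * h)
               for i in range(n))
print(round(mean, 4))
```

The result, 1/2, matches the symmetry of the beta(2, 2) pdf about y = 1/2 and the general formula derived next.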
according to a beta distribution with pdf
f_Y(y) = 20(1−y)^{19}, for 0 < y < 1, and f_Y(y) = 0 otherwise.
This distribution is displayed in Figure 4.14.
QUESTIONS:
(a) What are the parameters in this distribution; i.e., what are α and β?
(b) What is the mean proportion of individuals infected?
(c) Find φ0.95, the 95th percentile of this distribution.
(d) Treating daily infection counts as independent from day to day, what is the probability that, during any given 5-day span, there are at least 2 days where the infection proportion is above 10 percent?
SOLUTIONS:
(a) α = 1 and β = 20.
(b) E(Y) = α/(α+β) = 1/(1+20) ≈ 0.048.
(c) The 95th percentile φ0.95 solves
0.95 = P(Y ≤ φ0.95) = ∫₀^{φ0.95} 20(1−y)^{19} dy.

[Figure 4.14. The probability density function f_Y(y) in Example 4.17, a model for the proportion of infected individuals (plotted for 0.00 ≤ y ≤ 0.30).]

Let u = 1 − y, so that du = −dy. The limits on the integral must change: y = 0 ↦ u = 1 and y = φ0.95 ↦ u = 1 − φ0.95. Thus, we are left to solve
0.95 = ∫_{1−φ0.95}^{1} 20u^{19} du = u^{20} |_{1−φ0.95}^{1} = 1 − (1 − φ0.95)^{20}
for φ0.95. We get φ0.95 = 1 − (0.05)^{1/20} ≈ 0.139.
(d) First, we compute
P(Y > 0.1) = ∫_{0.1}^{1} 20(1−y)^{19} dy = ∫₀^{0.9} 20u^{19} du = (0.9)^{20} ≈ 0.122.
This is the probability that the infection proportion exceeds 0.10 on any given day. Now, we treat each day as a trial, and let X denote the number of days where the infection proportion is "above 10 percent"; i.e., a "success." Because days are assumed independent, X ~ b(5, 0.122) and
P(X ≥ 2) = 1 − P(X = 0) − P(X = 1) = 1 − (1 − 0.122)^5 − 5(0.122)(1 − 0.122)^4 ≈ 0.116. □

4.9 Chebyshev's Inequality

MARKOV'S INEQUALITY: Suppose that X is a nonnegative random variable with pdf (pmf) f_X(x), and let c be a positive constant. Markov's Inequality puts a bound on the upper tail probability P(X > c); that is,
P(X > c) ≤ E(X)/c.
Proof. First define the event B = {x : x > c}. We know that
E(X) = ∫₀^{∞} x f_X(x) dx = ∫_B x f_X(x) dx + ∫_{B̄} x f_X(x) dx ≥ ∫_B x f_X(x) dx ≥ ∫_B c f_X(x) dx = c P(X > c).
Dividing both sides by c gives the result. □

CHEBYSHEV'S INEQUALITY: Let Y be any random variable (discrete or continuous) with mean μ and variance σ² < ∞. For k > 0,
P(|Y − μ| > kσ) ≤ 1/k².
Proof.
Applying Markov's Inequality with X = (Y − μ)² and c = k²σ², we have
P(|Y − μ| > kσ) = P[(Y − μ)² > k²σ²] ≤ E[(Y − μ)²]/(k²σ²) = σ²/(k²σ²) = 1/k². □

REMARK: The beauty of Chebyshev's result is that it applies to any random variable Y. In words, P(|Y − μ| > kσ) is the probability that the random variable Y will differ from the mean μ by more than k standard deviations. If we do not know how Y is distributed, we can not compute P(|Y − μ| > kσ) exactly, but, at least, we can put an upper bound on this probability; this is what Chebyshev's result allows us to do. Note that
P(|Y − μ| > kσ) = 1 − P(|Y − μ| ≤ kσ) = 1 − P(μ − kσ ≤ Y ≤ μ + kσ).
Thus, it must be the case that
P(|Y − μ| ≤ kσ) = P(μ − kσ ≤ Y ≤ μ + kσ) ≥ 1 − 1/k².

Example 4.18. Suppose that Y represents the amount of precipitation (in inches) observed annually in Barrow, AK. The exact probability distribution for Y is unknown, but, from historical information, it is posited that μ = 4.5 and σ = 1. What is a lower bound on the probability that there will be between 2.5 and 6.5 inches of precipitation during the next year?
SOLUTION. We want to compute a lower bound for P(2.5 ≤ Y ≤ 6.5). Note that
P(2.5 ≤ Y ≤ 6.5) = P(|Y − μ| ≤ 2σ) ≥ 1 − 1/2² = 0.75.
Thus, we know that P(2.5 ≤ Y ≤ 6.5) ≥ 0.75. The chances are good that the annual precipitation will be between 2.5 and 6.5 inches. □

4.10 Expectations of piecewise functions and mixed distributions

4.10.1 Expectations of piecewise functions

RECALL: Suppose that Y is a continuous random variable with pdf f_Y(y) and support R. Let g(Y) be a function of Y. The expected value of g(Y) is given by
E[g(Y)] = ∫_R g(y) f_Y(y) dy,
provided that this integral exists.

REMARK: In mathematical expectation examples up until now, we have always considered functions g which were continuous and differentiable everywhere (e.g., g(y) = y², g(y) = e^{ty}, g(y) = ln y, etc.). We now extend the notion of mathematical expectation to handle piecewise functions, which may not even be continuous.

EXTENSION: Suppose that Y is a continuous random variable with pdf f_Y(y) and support R, where R can be expressed as the union
of k disjoint sets; i.e.,
R = B₁ ∪ B₂ ∪ ⋯ ∪ B_k,
where B_i ⊂ ℝ and B_i ∩ B_j = ∅ for i ≠ j. Let g : R → ℝ be a function which can be written as
g(y) = Σ_{i=1}^{k} g_i(y) I_{B_i}(y),
where g_i : B_i → ℝ is a continuous function and I_{B_i} : B_i → {0, 1} is the indicator function that y ∈ B_i; i.e.,
I_{B_i}(y) = 1, if y ∈ B_i, and I_{B_i}(y) = 0, otherwise.
The expected value of the function g(Y) = Σ_i g_i(Y) I_{B_i}(Y) is equal to
E[g(Y)] = ∫_R g(y) f_Y(y) dy = ∫_R Σ_{i=1}^{k} g_i(y) I_{B_i}(y) f_Y(y) dy
= Σ_{i=1}^{k} ∫_R g_i(y) I_{B_i}(y) f_Y(y) dy = Σ_{i=1}^{k} ∫_{B_i} g_i(y) f_Y(y) dy.
That is, to compute the expectation for a piecewise function g(Y), we simply compute the expectation of each g_i(Y) over each set B_i and add up the results. Note that if Y is discrete, the same formula applies, except integration is replaced by summation and the pdf f_Y(y) is replaced by a pmf p_Y(y).

Example 4.19. An insurance policy reimburses a policy holder's loss up to a limit of 10,000 dollars. If the loss exceeds 10,000 dollars, the insurance company will pay 10,000 dollars plus 80 percent of the loss that exceeds 10,000 dollars. Suppose that the policy holder's loss Y (measured in 1000s) is a random variable with pdf
f_Y(y) = 2/y³, for y > 1, and f_Y(y) = 0 otherwise.
This pdf is plotted in Figure 4.15 (left). What is the expected value of the benefit paid to the policy holder?

[Figure 4.15. Left: The probability density function for the loss incurred in Example 4.19. Right: The function which describes the amount of benefit paid.]

SOLUTION. Let g(Y) denote the benefit paid to the policy holder; i.e.,
g(Y) = Y, for 1 < Y < 10, and g(Y) = 10 + 0.8(Y − 10), for Y ≥ 10.
This function is plotted in Figure 4.15 (right). We have
E[g(Y)] = ∫_R g(y) f_Y(y) dy = ∫₁^{10} y (2/y³) dy + ∫_{10}^{∞} [10 + 0.8(y − 10)] (2/y³) dy = 1.8 + 0.18 = 1.98.
Thus, the expected benefit paid to the policy holder is $1,980.
EXERCISE: Find V[g(Y)], the variance of the benefit paid to the policy holder.

4.10.2 Mixed distributions

TERMINOLOGY: We define a mixed distribution as one with discrete and continuous parts (more general definitions are available). In particular, suppose that Y₁ is a discrete random variable with cdf F_{Y₁}(y) and that Y₂ is a continuous random variable with cdf F_{Y₂}(y). A mixed random variable Y has cdf F_Y(y) =
c₁F_{Y₁}(y) + c₂F_{Y₂}(y), for all y ∈ ℝ, where the constants c₁ and c₂ satisfy c₁ + c₂ = 1. These constants are called mixing constants. It is straightforward to show that the function F_Y(y) satisfies the cdf requirements (see pp. 64, notes).

RESULT: Let Y have the mixed distribution F_Y(y) = c₁F_{Y₁}(y) + c₂F_{Y₂}(y), where Y₁ is a discrete random variable with cdf F_{Y₁}(y) and Y₂ is a continuous random variable with cdf F_{Y₂}(y). Let g(Y) be a function of Y. Then,
E[g(Y)] = c₁E[g(Y₁)] + c₂E[g(Y₂)],
where each expectation is taken with respect to the appropriate distribution.

Example 4.20. A standard experiment in the investigation of carcinogenic substances is one in which laboratory animals (e.g., rats, etc.) are exposed to a toxic substance. Suppose that the time from exposure until death follows an exponential distribution with mean β = 10 hours. Suppose additionally that the animal is sacrificed after 24 hours if death has not been observed. Let Y denote the death time for an animal in this experiment. Find the cdf for Y and compute E(Y).
SOLUTION. Let Y₂ denote the time until death for animals who die before 24 hours. We are given that Y₂ ~ exponential(10); the cdf of Y₂ is
F_{Y₂}(y) = 1 − e^{−y/10}, for y > 0, and F_{Y₂}(y) = 0, for y ≤ 0.
The probability that an animal has died before 24 hours is
P(Y₂ < 24) = F_{Y₂}(24) = 1 − e^{−24/10} ≈ 0.909.
There is one discrete point in the distribution for Y, namely, at the value y = 24, which occurs with probability 1 − P(Y₂ < 24) = 1 − 0.909 = 0.091. Define Y₁ to be a random variable with cdf
F_{Y₁}(y) = 0, for y < 24, and F_{Y₁}(y) = 1, for y ≥ 24;
that is, Y₁ has a degenerate distribution at the value y = 24. Here are the cdfs of Y₁ and Y₂, plotted side by side. The cdf of Y is
F_Y(y) = c₁F_{Y₁}(y) + c₂F_{Y₂}(y),
where c₁ = 0.091 and c₂ = 0.909. The mean of Y is
E(Y) = 0.091 E(Y₁) + 0.909 E(Y₂) = 0.091(24) + 0.909(10) = 11.274 hours. □
EXERCISE: Find V(Y).

5 Multivariate Distributions

Complementary reading from WMS: Chapter 5.

5.1 Introduction

REMARK: Up until now, we have discussed univariate random variables and their associated probability distributions, moment generating functions, means, variances,
etc In practice however one is often interested in multiple random variables Consider the following examples 0 In an educational assessment program we want to predict a student7s posttest score Y2 from her pretest score In a clinical trial physicians want to characterize the concentration of a drug Y in ones body as a function of the time X from injection 0 An insurance company wants to estimate the amount of loss related to collisions Y1 and liability Y2 both measured in 1000s of dollars 0 Agronomists want to understand the relationship between yield Y measured in bushelsacre and the nitrogen content of the soil In a marketing study the goal is to forecast next months sales say Y based on sales gures from the previous 71 7 1 periods say Y1Y2 Yn1 NOTE In each ofthese examples it is natural to posit a relationship between or among the random variables that are involved This relationship can be described mathemati cally through a probabilistic model This model in turn allows us to make probability statements involving the random variables just as univariate models allow us to do this with a single random variable PAGE 101 CHAPTER 5 STATMATH 5117 J TEBBS TERMINOLOGY lf Y1 and Y2 are random variables then Y Y1 Y2 is called a bivariate random vector lf Y1Y2 Yn denote 71 random variables then Y Y112Yn is called an n Variate random vector 52 Discrete random vectors TERMINOLOGY Let Y1 and Y2 be discrete random variables Then Y1Y2 is called a discrete random vector and the joint probability mass function pmf of Y1 and Y2 is given by PY1Y2l17yz PY1 91752 112 for all y1y2 E R The set R Q R2 is the two dimensional support of The function pyy2y1y2 has the following properties 1 PiaY2y17y2 gt 0 for all 111112 6 R 2 ZRPY1Y291792 1 RESULT Suppose that Y1 Y2 is a discrete random vector with prnf pyy2y1 yg Then PKYIsz 6 Bl ZPHY2917927 B for any set E C R2 That is the probability of the event E B is obtained by adding up the probability mass associated with each support point in B If 
B fooy1 gtlt fooy2 then PY1Y2e B PltY1 1th 112 FY1Y291792 Z Z PY1i2t17t2 t1 y1t2Sy2 is called the joint cumulative distribution function cdf of PAGE 102 CHAPTER 5 STATMATH 5117 J TEBBS Example 51 Tornados are natural disasters that cause millions of dollars in damage each year An actuary determines that the annual numbers of tornadoes in two lowa counties Lee and Van Buren are jointly distributed as indicated in the table below Let Y1 and Y2 denote the number of tornados seen each year in Lee and Van Buren counties7 respectively payM1792 2120 2121 1122 1123 yl 0 012 006 005 002 yl 1 013 015 012 003 yl 2 005 015 010 002 a What is the probability that there is no more than one tornado seen in the two counties combined SOLUTION We want to compute PY1 Y2 S 1 Note that the support points which correspond to the event Y1 Y2 S 1 are 007 01 and 17 0 Thus7 Y2 1 pY1Y2070 pY1Y2170 pY1Y2071 012 013 006 031 b What is the probability that there are two tornadoes in Lee County SOLUTION We want to compute PY1 2 Note that the support points which correspond to the event Y1 2 are 27 07 27 17 27 2 and 23 Thus7 2 pY1Y227 0 pY1Y2271 pY1Y227 2 pY1Y227 3 i 005 015 010 002 032 D 53 Continuous random vectors TERMINOLOGY Let Y1 and Y2 be continuous random variables Then7 Y17 Y2 is called a continuous random vector7 and the joint probability density function pdf of Y1 and Y2 is denoted by fy1y2y1y2 The joint pdf fy1y2y1y2 is a three dimensional function whose domain is R7 the two dimensional support of Y17 PAGE 103 CHAPTER 5 STATMATH 5117 J TEBBS PROPERTIES The function fy1y2y1y2 has the following properties 1 fyy2y1y2 gt 0 for all y1y2 E R 2 The function fyy2y1y2 integrates to 1 over its support R ie fY1Y2l17112dy1dy2 1 R We realize this is a double integral since R is a two dimensional set RESULT Suppose that Y1 Y2 is a continuous random vector with joint pdf fy1y2y1 yg Then PlY17Y2 Bl BfY1Y2l17yzdyidyz7 for any set E C R2 We realize that this is a double integral since B is a two 
dimensional set in the y1y2 plane Therefore geometrically PY1Y2 E B is the volume under the three dimensional function fyy2y1y2 over the two dimensional set E TERMINOLOGY Suppose that Y1Y2 is a continuous random vector with joint pdf fybyz y1y2 The joint cumulative distribution function cdf for Y1Y2 is given by 22 21 FHA 91792 E 1301 S 91752 S 921m1m fY1Y2t17t2dt1dt27 for all y1y2 E R2 It follows upon differentiation that the joint pdf is given by 32 leY2l17yz mFY1Y2l17yz7 wherever these mixed partial derivatives are de ned Example 52 A bank operates with a drive up facility and a walk up window On a randomly selected day let Y1 proportion of time the drive up facility is in use Y2 proportion of time the walk up facility is in use Suppose that the joint pdf of Y1 Y2 is given by 2y1y 7 0lty1lt170lty2lt1 0 huf ng2 otherwise 7 PAGE 104 CHAPTER 5 STATMATH 5117 J TEBBS Note that the support in this example is Ry1y20lty1 lt17 0lty2 lt1 It is very helpful to plot the support of YhYZ in the yhyg plane a What is the probability that neither facility is busy more than 14 of the day That is7 what is PY1 147Y2 314 SOLUTION Here7 we want to integrate the joint pdf fy1y2y17y2 over the set E y1y2 0 lty1 lt147 0 lty2 lt14 The desired probability is 14 14 6 PY1 14Y2 314 341 y gtdy2dy1 y 2120 14 5191 0 6 14 ya 3 11112 1110 32 6 14 yl 1 7 7 7 d 5y104 192 yl 2 14 y1 y1 6 1 1 7 7 7 7 7 00109 8 192 5128768 1110 b Find the probability that the proportion of time the drive up facility is in use is less 10 5 than the proportion of time the walk up facility is in use ie7 cornpute PY1 lt SOLUTION Here7 we want to integrate the joint pdf fy1y2y17y2 over the set B 1117112 0 lty1 lty2 lt1 The desired probability is 1 112 6 1311 lt Y2 541 y dy1dy2 2120 2110 6 1 yZ 92 g 1119 dyz 1120 1110 61 lt93 a 7 7 yz dyg 5 1120 2 3 4 1 yiyl 1105D 6 4 70 5 6 4 1127 5 PAGE 105 CHAPTER 5 STATMATH 5117 J TEBBS 54 Marginal distributions DISCRETE CASE The joint pmf of Y17 Y2 in Example 51 is depicted below in the 
in ner rectangular part of the table The marginal distributions of Y1 and Y2 are catalogued in the margins of the table Pin 91792 920 1121 1122 1123 pY1y1 yl 0 012 006 005 002 025 yl 1 013 015 012 003 043 yl 2 005 015 010 002 032 py2y2 030 036 027 007 1 TERMINOLOGY Let Y17 Y2 be a discrete random vector with pmf py1y2y1y2 The marginal pmf of Y1 is 1011 11 21011 12 117 12 12 and the marginal pmf of Y2 is PY2 12 ZPY1Y291792 11 MAIN POINT In the two dimensional discrete case7 marginal pmfs are obtained by summing over77 the other variable TERMINOLOGY Let YhYZ be a continuous random vector with pdf fy1y2y17y2 Then the marginal pdf of Y1 is fY1l1 fY1Y2yl7yZdyZ and the marginal pdf of Y2 is fY2yz fY1Y2l1792dy1 MAIN POINT In the two dimensional continuous case7 marginal pdfs are obtained by integrating over77 the other variable PAGE 106 CHAPTER 5 STATMATH 5117 J TEBBS Example 53 In a simple genetics rnodel7 the proportion7 say Y1 of a population with trait 1 is always less than the proportion7 say Y2 of a population with trait 2 Suppose that the random vector YhYZ has joint pdf 6117 0ltyiltyzlt1 Jig 91792 07 otherwise a Find the marginal distributions fy1y1 and fy2 SOLUTION To nd fy1y17 we integrate fy1y2y1y2 over yg For 0 lt yl lt 17 1 fy1y1 62161112 611117 21 11121111 Thus7 the marginal distribution of Y1 is given by 6911 yl7 0 lt 91 lt1 0 fY1y1 otherwise That is7 Y1 beta27 2 To nd fy2 yg we integrate fyby2y1y2 over yl For values of 0 lt yz lt 1 12 2 12 2 fy2y2 62161211 311 0 312 1110 Thus7 the marginal distribution of Y2 is given by 3 0 lt yg lt1 fY2yz 07 otherwise That is7 Y2 beta31 b Find the probability that the proportion of individuals with trait 2 exceeds 12 SOLUTION Here7 we want to nd PB7 where the set B 1117112 3 0 lt 91 lt 127 12 gt12 This probability can be computed two different ways i using the joint distribution fy1y2y1y2 and computing 1 12 13mm 6 B 6y1dy1dy2 11212 1110 PAGE 107 CHAPTER 5 STATMATH 5117 J TEBBS Y1 beta27 2 Y2 beta31 Figure 516 Marginal 
distributions in Example 53 ii using the marginal distribution fy2y2 and computing 1 Pmgtua 3 m y212 Either way7 you will get the same answer Notice that in i7 you are computing the volume under fy1y2y1y2 over the set B In ii7 you are nding the area under fy2y2 over the set yg yg gt 12 c Find the probability that the proportion of individuals with trait 2 is at least twice that of the proportion of individuals with trait 1 SOLUTION Here7 we want to compute PY2 Z 2Y1 ie7 we want to compute PD7 where the set D 91792 392 Z 291 This equals 1 ygZ Hmnmm WWMM5 2120 2110 This is the volume under fy1y2y1y2 over the set D D PAGE 108 CHAPTER 5 STATMATH 5117 J TEBBS 55 Conditional distributions RECALL For events A and B in a non empty sample space S we de ned PA B MB W for PB gt 0 Now7 suppose that YhYZ is a discrete random vector If we let B Y2 yg and A Y1 yl we obtain PY1 0156 02 pay20102 PY2 12 193202 P A13 This leads to the de nition of a discrete conditional distribution TERMINOLOGY Suppose that Y1Y2 is a discrete random vector with joint pmf py1y2y1y2 We de ne the conditional probability mass function pmf of Y1 given Y2 yg as pY1Y2 11792 1032012 whenever py2y2 gt 0 Similarly7 the conditional probability mass function of Y2 given PYJYXMWZ 7 Y1 117 is Wig20102 PYY 291 2 1 my whenever py1y1 gt 0 Example 54 The joint pmf of YhYZ in Example 51 is depicted below in the inner rectangular part of the table The marginal distributions of Y1 and Y2 are catalogued in the margins of the table Mai 91792 12 0 12 1 12 2 12 3 1031011 yl 0 012 006 005 002 025 yl 1 013 015 012 003 043 yl 2 005 015 010 002 032 py2y2 030 036 027 007 1 QUESTION What is the conditional pmf of Y1 given Y2 1 PAGE 109 CHAPTER 5 STATMATH 511 J TEBBS SOLUTION Straightforward calculations show that Plan2 002 1 7 006 10 0 1 2 12 YinMl lyz 12202 1 036 p 7 7 PK Y2y1 1712 1 015 1 7 1 7 7 5 12 YinMl lyz 12202 i 1 036 p 7 PK Y2y1 212 1 015 2 1 7512 YinMl lyz 12202 i 1 036 Thus the conditional pmf of Y1 given Y2 1 is given 
by 91 l 0 1 2 Py11y2y1ly21 212 512 512 This conditional pmf tells us how Y1 is distributed if we are given that Y2 1 EXERCISE Find the conditional pmf of Y2 given Y1 0 D TERMINOLOGY Suppose that Y1Y2 is a continuous random vector with joint pdf fy1y2y1y2 We de ne the conditional probability density function pdf of Y1 given Y2 yg as fY1Y2yi792 ng 12 39 Similarly the conditional probability density function of Y2 given Y1 yl is fY1112yilyz flan2 11702 le 11 Example 55 Consider the bivariate pdf in Example 53 nglYl yzlyl 6117 0ltyiltyzlt1 0 fY1Y2y17y2 otherwise This model describes the distribution of the random vector Y1Y2 where Y1 the pro portion of a population with trait 1 is always less than Y2 the proportion of a population with trait 2 Derive the conditional distributions fmy2y1ly2 and fy2 y1y2ly1 SOLUTION ln Example 53 we derived the marginal pdfs to be iyl 0 lt yl lt1 0 le 11 otherwise 7 PAGE 110 CHAPTER 5 STATMATH 5117 J TEBBS and 3yg 0 lt yg lt1 0 ng 112 otherwise First we derive fy1 y2ylly2 so x Y2 yg Remember once we condition on Y2 yg ie once we x Y2 yg we then regard yg as simply a constant This is an important point to understand For values of 0 lt yl lt yg it follows that fny2yly2 7 7 fYY y y i i 7 1 2 ll 2 fy2y2 313 yr and thus this is the value of fy y2ylly2 when 0 lt yl lt yg For values of y1 0y2 the conditional density fy1 y2ylly2 0 Summarizing 2211137 0 lt 211 lt yz 0 I m2 91112 otherwise 7 To reiterate in this conditional pdf the value of y2 is xed and known It is Y1 that is varying This function describes how Y1 is distributed for yg xed Now to derive the conditional pdf of Y2 given Y1 we x Y1 ylg then for all values of y1 lt yg lt 1 we have ay2 1117 112 611 1 fY1yi T 62107111 17y139 This is the value of fmyygly1 when yl lt yg lt 1 When yg yl 1 the conditional nglY1yZlyl pdf is fyZ y1ygly1 0 Remember once we condition on Y1 yl then we regard yl simply as a constant Summarizing 1 Q7 y1lty2lt1 0 fy2iy1y2ly1 otherwise That is conditional on Y1 yl 
Y2 Uy1 1 Again in this conditional pdf the value of y1 is xed and known It is Y2 that is varying This function describes how Y2 is distributed for yl xed D RESULT The use of conditional distributions allows us to de ne conditional probabilities of events associated with one random variable when we know the value of another random PAGE 111 CHAPTER 5 STATMATH 5117 J TEBBS variable If Y1 and Y2 are jointly discrete7 then for any set E C R 1301 E BlYZ 92 ZPY11Y2l1lyz B 1302 E BlYI 91 ZPY21Y1Q2l91l B If Y1 and Y2 are jointly continuous7 then for any set E C R MK 6 Ble yz Bfmy2ylly2dy1 1302 E Blyl 91 fY2Y1yZlyldyZ B Example 56 A health food store stocks two different brands of grain Let Y1 denote the amount of brand 1 in stock and let Y2 denote the amount of brand 2 in stock both Y1 and Y2 are measured in 100s of lbs The joint distribution of Y1 and Y2 is given by 2411927 91gt07 92gt07 0ltyiyzlt1 huf ng2 07 otherwise a Find the conditional pdf fy1 y2y1ly2 b Compute PY1 gt 051Y2 03 c Find PY1 gt 05 SOLUTIONS a To nd the conditional pdf fy1 y2ylly2 we rst need to nd the marginal pdf of Y2 The marginal pdf of Y2 for 0 lt yg lt 17 is 1 92 yZ lit2 fY2yz 249192 dyl 2492 31 1110 0 gt 12921 92 and 07 otherwise We recognize this as a beta23 pdf ie7 Y2 beta23 The conditional pdf of Y1 given Y2 yg is le Y2 11792 2411212 f y y W 1 2 My 1211207112 n gi 1 12 for 0 lt yl lt 1 7 yg and 07 otherwise Summarizing7 4EL70ltmlt17m leli2yilyz 173W otherwise 7 PAGE 112 CHAPTER 5 STATMATH 5117 J TEBBS b To compute PY1 gt 05lY2 037 we work with the conditional pdf fy1 y2y1ly2 which for yg 037 is given by 200 i y10lty1 lt07 fY1lY2 11lyz 49 07 otherwise Thus7 07 200 PY1 gt 05lY2 03 yldyl 0489 05 49 c To compute PY1 gt 057 we can either use the marginal pdf fy1y1 or the joint pdf fy1y2y1y2 Marginally7 it turns out that Y1 beta23 as well verifyl Thus7 1 PY1 gt 05 121107 yl2dy1 0313 05 REMARK Notice how PY1 gt 05lY2 03 7 PY1 gt 05 that is7 knowledge of the value of Y2 has affected the way that we 
assign probability to events involving Y1 Of course7 one might expect this because of the support in the joint pdf fy1y2y1y2 D 56 Independent random variables TERMINOLOGY Suppose Y17Y2 is a random vector discrete or continuous with joint cdf FKY2y17 yg7 and denote the marginal cdfs of Y1 and Y2 by Fy1y1 and Fy2y27 respectively We say the random variables Y1 and Y2 are independent if and only if FHA211712 FY1 11FY212 for all values of y1 and y2 Otherwise7 we say that Y1 and Y2 are dependent RESULT Suppose that YhYZ is a random vector discrete or continuous with joint pdf pmf fy1y2y1y2 and denote the marginal pdfs pmfs of Y1 and Y2 by fy1y1 and fy2y27 respectively Then7 Y1 and Y2 are independent if and only if ay2 117 12 fY111fY212 for all values of y1 and y2 Otherwise7 Y1 and Y2 are dependent Proof Exercise D PAGE 113 CHAPTER 5 STATMATH 5117 J TEBBS Example 57 Suppose that the pmf for the discrete random vector Y17 Y2 is given by 041 212 211 1727212 17 0 pY1Y2 117 12 otherwise 7 The marginal distribution of Y1 for values of y1 17 27 is given by 2 2 1 1 1011 11 Z mach1112 Z Em 212 Beg1 6 2 1121 and py1y1 07 otherwise Similarly7 the marginal distribution of Y2 for values of y2 127 is given by 2 2 l l PY2yz E PEA917112 E 1 EQU 212 E3 412 11 1111 and py2 yg 07 otherwise Note that7 for example7 3 8 7 14 i 11 1 1 i 77 18 PymA 7 7 pY1pY2 18 x18 81 thus7 the random variables Y1 and Y2 are dependent D Example 58 Let Y1 and Y2 denote the proportions oftime out of one workday during which employees I and ll7 respectively7 perform their assigned tasks Suppose that the random vector Y1Y2 has joint pdf yl927 0lty1lt1 0lty2lt1 0 huf ng2 otherwise 7 It is straightforward to show verify that 111 0lty1lt1 0 22 l 0 lt 22 lt1 fY1l1 and fY2 12 2 7 otherwise 07 otherwise Thus7 since fY1Y2ylyyZ 11 112 7 11 12 fY1ylfY2yZ7 for 0 lt yl lt1 and 0 lt yg lt 17 Y1 and Y2 are dependent D PAGE 114 CHAPTER 5 STATMATH 5117 J TEBBS A CONVENIENT RESULT Let Y1Y2 be a random vector discrete or 
continuous with pdf pmf fybyz y1y2 If the support set B does not constrain yl by yg or yg by yl and additionally we can factor the joint pdf pmf fybyz y1y2 into two nonnegative expressions fmy2y1y2 9y1hyz7 then Y1 and Y2 are independent Note that 9y1 and hy2 are simply functions they need not be pdfs pmfs although they sometimes are The only requirement is that gy1 is a function of y1 only hy2 is a function of y2 only and that both are nonnegative If the support involves a constraint the random variables are automatically dependent Example 59 In Example 56 Y1 denoted the amount of brand 1 grain in stock and Y2 denoted the amount of brand 2 grain in stock Recall that the joint pdf of Y1Y2 was given by 2411927 91 gt 0712 gt 07 0 lt91 112 lt1 0 fmY2i1y2 otherwise Here the support is R y1y2 y1 gt 0 yg gt 0 0 lt yl y2 lt 1 Since knowledge of y1 affects the value of y2 and vice versa the support involves a constraint and Y1 and Y2 are dependent D Example 510 Suppose that the random vector X Y has joint pdf PozP 1Ae MD 1y 11 7 y 1 z gt 00 lt y lt 1 0 Jew9671 otherwise 7 for A gt 0 04 gt 0 and B gt 0 Since R z gt 0 0 lt y lt 1 does not involve a constraint it follows immediately that X and Y are independent since we can write mm X WHO 7 a Poms 7 hy fxy y Ae M w 9w where 9a and My are nonnegative functions Note that we are not saying that 9a and My are marginal distributions of X and Y respectively in fact they are not the marginal distributions although they are proportional to the marginals D PAGE 115 CHAPTER 5 STATMATH 5117 J TEBBS EXTENSION We generalize the notion of independence to n variate random vectors We use the conventional notation Y Y1Y2Yn and y 211212 We denote the joint cdf of Y by Fyy and the joint pdf pmf of Y by fyy TERMINOLOGY Suppose that the random vector Y Y1Y2 has joint cdf Fyy and suppose that the random variable Y has cdf Fy for 239 1 2 n Then Y1 Y2 Yn are independent random variables if and only if FY3 HFiQQi i1 that is the joint cdf can be 
factored into the product of the marginal cdfs Alternatively Y1 Y2 Yn are independent random variables if and only if fYy H inyi i1 that is the joint pdf pmf can be factored into the product of the marginals Example 511 In a small clinical trial 71 20 patients are treated with a new drug Suppose that the response from each patient is a measurement Y N NM02 Denot ing the 20 responses by Y Y1Y2 YZO then assuming independence the joint distribution of the 20 responses is for y 6 R20 fYy H f lz lt i 1 I 20 7 2 mum x27T039 g 21239 27m What is the probability that at least one patient7s response is greater than M 20 SOLUTION De ne the event B at least one patient7s response exceeds M 20 PAGE 116 CHAPTER 5 STATMATH 5117 J TEBBS We want to compute PB Note that F all 20 responses are less than M 20 and recall that PB 1 7 P We will compute PB because it is easier The probability that the rst patient7s response Y1 is less than M 20 is given by F3401 2a 1351 lt u 2a PZ lt 2 FZ2 09772 where Z N N0 1 and denotes the standard normal cdf This probability is same for each patient because each patient7s response follows the same NW 02 distribution Because the patients7 responses are independent random variables HF PY1 ltM207Y2 ltM2UW7Y20 lt M20 20 H FYM 2a i1 FZ220 m 063 Finally PB17 PB 17 063 037 D 57 Expectations of functions of random variables RESULT Suppose that Y Y1 Y2 Y has joint pdf fyy or joint pmf pyy and suppose that gY gY1Y2 Yn is a real vector valued function of Y1Y2 Yn ie g R 7 R Then 0 if Y is discrete ElgYl Z 2 2 9ypyy7 11 12 yn o and if Y is continuous ElgYl 1 9yfyydy If these quantities are not nite then we say that EgY does not exist PAGE 117 CHAPTER 5 STATMATH 5117 J TEBBS PROPERTIES OF EXPECTATIONS Let Y Y17Y2 Yn be a discrete or contin uous random vector7 suppose that 99192 gk are real vector valued functions from R a R and let 0 be any real constant Then7 a EC C b El09Yl CElgYl C E EL 9jYl 21 E l9jYl Example 512 In Example 567 Y1 denotes the amount of grain 1 
in stock and Y2 denotes the amount of grain 2 in stock Both Y1 and Y2 are measured in 100s of lbs The joint distribution of Y1 and Y2 is 2411927 11 gt07 92gt07 0lty1yzlt1 0 fiaY2y1y2 otherw1se What is the expected total amount of grain Y1 Y2 in stock SOLUTION Let the function g R2 a R be de ned by 9y17y2 yl yg We would like to compute EgY1Y2 EY1 From the last result7 we know that 1 1 111 EY1 Y2 91 92 gtlt 249192 Ell26191 2110 2120 1 1 111 2492 24919 dyzdy1 1110 1120 1211 y 1211 241172 dyl 3 0 1 1 12911 y12dy1 82111 13dy1 1110 1110 1 1 7 12 931 y12dy1 8 911 y13dy1 70 10 r3r3 mud 12l M l8l M The expected total amount of grain in stock is 80 lbs 45 REMARK In the calculation above7 we twice used the fact that 0 ya l 7 w ldy PAGE 118 CHAPTER 5 STATMATH 5117 J TEBBS ANOTHER SOLUTION To compute EY1 Y27 we could have taken a different route In Example 567 we discovered that the marginal distributions were Y1 beta23 Y2 beta23 so that EY EY 72 i 2 1 7 2 7 2 3 7 5 Because expectations are linear7 we have 2 2 4 E Y Y 7 7 7 D 1 2 5 5 5 RESULT Suppose that Y1 and Y2 are independent random variables Let gY1 be a function of Y1 only7 and let hY2 be a function of Y2 only Then7 El9Y1hY2l El9Y1lElhY2l7 provided that all expectations exist Proof Without loss7 assume that Y17Y2 is a continuous random vector the discrete case is analogous Suppose that Y17Y2 has joint pdf fy1y2y1y2 with support R C R2 Note that ElgY1hY2l 7 Wgunwarmghyadyzdyl 7 9y1hy2fy1y1fy2y2dy2dy1 R R 7 9y1fy1y1dy1 hy2fy2y2dy2 R R 7 Emma1a D COROLLARY lf Y1 and Y2 are independent random variables7 then EY1Y2 EY1EY2 This is a special case of the previous result obtained by taking gY1 Y1 and hY2 Y2 PAGE 119 CHAPTER 5 STATMATH 5117 J TEBBS 58 Covariance and correlation 581 Covariance TERMINOLOGY Suppose that Y1 and Y2 are random variables discrete or continuous with means M1 and 2 respectively The covariance between Y1 and Y2 is given by CovY1Y2 E M1Y2 M2l EY1Y2 EY1EY2 The latter expression is often easier to work with 
and is called the covariance computing formula. The covariance is a numerical measure that describes how two variables are linearly related:
• If Cov(Y₁, Y₂) > 0, then Y₁ and Y₂ are positively linearly related.
• If Cov(Y₁, Y₂) < 0, then Y₁ and Y₂ are negatively linearly related.
• If Cov(Y₁, Y₂) = 0, then Y₁ and Y₂ are not linearly related.

RESULT: If Y₁ and Y₂ are independent, then Cov(Y₁, Y₂) = 0.
Proof. Suppose that Y₁ and Y₂ are independent. Using the covariance computing formula,
Cov(Y₁, Y₂) = E(Y₁Y₂) − E(Y₁)E(Y₂) = E(Y₁)E(Y₂) − E(Y₁)E(Y₂) = 0. □

IMPORTANT: If two random variables are independent, then they have zero covariance. However, zero covariance does not necessarily imply independence, as we see now.

Example 5.13. An example of two dependent variables with zero covariance. Suppose that Y₁ ∈ {−1, 0, 1}, with each value occurring with probability 1/3, and let Y₂ = Y₁². It is straightforward to show that
E(Y₁) = 0, E(Y₂) = E(Y₁²) = V(Y₁) = 2/3, and E(Y₁Y₂) = E(Y₁³) = 0.
Thus,
Cov(Y₁, Y₂) = E(Y₁Y₂) − E(Y₁)E(Y₂) = 0 − 0(2/3) = 0.
However, clearly Y₁ and Y₂ are not independent; in fact, they are perfectly related. It is just that the relationship is not linear; it is quadratic. The covariance only measures linear relationships. □

Example 5.14. Gasoline is stocked in a bulk tank once at the beginning of each week and then sold to customers. Let Y₁ denote the proportion of the capacity of the tank that is available after it is stocked. Let Y₂ denote the proportion of the capacity of the bulk tank that is sold during the week. Suppose that the random vector (Y₁, Y₂) has joint pdf
f_{Y₁,Y₂}(y₁, y₂) = 3y₁, for 0 < y₂ < y₁ < 1, and f_{Y₁,Y₂}(y₁, y₂) = 0 otherwise.
Compute Cov(Y₁, Y₂).
SOLUTION. It is perhaps easiest to use the covariance computing formula
Cov(Y₁, Y₂) = E(Y₁Y₂) − E(Y₁)E(Y₂).
The marginal distribution of Y₁ is beta(3, 1). The marginal distribution of Y₂ is
f_{Y₂}(y₂) = (3/2)(1 − y₂²), for 0 < y₂ < 1, and f_{Y₂}(y₂) = 0 otherwise.
Thus, the marginal first moments are
E(Y₁) = 3/4 = 0.75 and E(Y₂) = ∫₀¹ y₂ × (3/2)(1 − y₂²) dy₂ = 3/8 = 0.375.
Now, we need to compute E(Y₁Y₂). This is given by
E(Y₁Y₂) = ∫₀¹ ∫₀^{y₁} y₁y₂ × 3y₁ dy₂ dy₁ = 0.30.
Thus, the covariance is
Cov(Y₁, Y₂) = E(Y₁Y₂) − E(Y₁)E(Y₂) = 0.30 − (0.75)(0.375) = 0.01875. □

PAGE 121. CHAPTER 5: STAT/MATH 511, J.
TEBBS.

IMPORTANT: Suppose that Y₁ and Y₂ are random variables (discrete or continuous). Then
V(Y₁ + Y₂) = V(Y₁) + V(Y₂) + 2Cov(Y₁, Y₂)
V(Y₁ − Y₂) = V(Y₁) + V(Y₂) − 2Cov(Y₁, Y₂).
Proof. Suppose that Y₁ and Y₂ are random variables with means μ₁ and μ₂, respectively. Let Z = Y₁ + Y₂. From the definition of variance, we have
V(Z) = E[(Z − μ_Z)²] = E{[Y₁ + Y₂ − E(Y₁ + Y₂)]²} = E{[(Y₁ − μ₁) + (Y₂ − μ₂)]²}
= E[(Y₁ − μ₁)²] + E[(Y₂ − μ₂)²] + 2E[(Y₁ − μ₁)(Y₂ − μ₂)]   (the last term is the cross product)
= V(Y₁) + V(Y₂) + 2Cov(Y₁, Y₂).
That V(Y₁ − Y₂) = V(Y₁) + V(Y₂) − 2Cov(Y₁, Y₂) is shown similarly. □

RESULT: Suppose that Y₁ and Y₂ are independent random variables (discrete or continuous). Then
V(Y₁ + Y₂) = V(Y₁) + V(Y₂) and V(Y₁ − Y₂) = V(Y₁) + V(Y₂).
Proof. In the light of the last result, this is obvious. □

Example 5.15. A small health food store stocks two different brands of grain. Let Y₁ denote the amount of brand 1 in stock and let Y₂ denote the amount of brand 2 in stock; both Y₁ and Y₂ are measured in 100s of lbs. The joint distribution of Y₁ and Y₂ is
f_{Y₁,Y₂}(y₁, y₂) = 24y₁y₂, for y₁ > 0, y₂ > 0, 0 < y₁ + y₂ < 1, and f_{Y₁,Y₂}(y₁, y₂) = 0 otherwise.
What is the variance for the total amount of grain in stock? That is, find V(Y₁ + Y₂).
SOLUTION. We know that
V(Y₁ + Y₂) = V(Y₁) + V(Y₂) + 2Cov(Y₁, Y₂).
Marginally, Y₁ and Y₂ are both beta(2, 3) (see Example 5.6). Thus,
E(Y₁) = E(Y₂) = 2/(2+3) = 2/5
and
V(Y₁) = V(Y₂) = (2 × 3)/[(2+3)²(2+3+1)] = 1/25.
We need to compute Cov(Y₁, Y₂). Note that
E(Y₁Y₂) = ∫₀¹ ∫₀^{1−y₁} y₁y₂ × 24y₁y₂ dy₂ dy₁ = 2/15.
Thus,
Cov(Y₁, Y₂) = E(Y₁Y₂) − E(Y₁)E(Y₂) = 2/15 − (2/5)(2/5) = −2/75 ≈ −0.027.
Finally,
V(Y₁ + Y₂) = V(Y₁) + V(Y₂) + 2Cov(Y₁, Y₂) = 1/25 + 1/25 + 2(−2/75) = 2/75 ≈ 0.027. □

RESULTS: Suppose that Y₁ and Y₂ are random variables (discrete or continuous). The covariance function satisfies the following:
(a) Cov(Y₁, Y₂) = Cov(Y₂, Y₁)
(b) Cov(Y₁, Y₁) = V(Y₁)
(c) Cov(a + bY₁, c + dY₂) = bd Cov(Y₁, Y₂), for any constants a, b, c, and d.
Proof. Exercise. □

5.8.2 Correlation

GENERAL PROBLEM: Suppose that X and Y are random variables and that we want to predict Y as a linear function of X. That is, we want to consider functions of the form Y = β₀ + β₁X, for fixed constants β₀ and β₁. In this situation, the "error in prediction" is
given by Y − (β0 + β1X). This error can be positive or negative, so, in developing a measure of prediction error, we want one that maintains the magnitude of the error but ignores the sign. Thus, we define the mean squared error of prediction, given by

Q(β0, β1) = E{[Y − (β0 + β1X)]²}.

A two-variable calculus argument shows that the mean squared error of prediction Q(β0, β1) is minimized when

β1 = Cov(X, Y)/V(X) and β0 = E(Y) − β1E(X).

Note that the value of β1, algebraically, is equal to

β1 = Cov(X, Y)/V(X) = [Cov(X, Y)/(σX σY)] × (σY/σX) = ρ(σY/σX),

where

ρ = Cov(X, Y)/(σX σY).

The quantity ρ is called the correlation coefficient between X and Y.

SUMMARY: The best linear predictor of Y, given X, is Y = β0 + β1X, where

β1 = ρ(σY/σX) and β0 = E(Y) − β1E(X).

NOTES ON THE CORRELATION COEFFICIENT:
1. −1 ≤ ρ ≤ 1 (this can be proven using the Cauchy-Schwarz Inequality from calculus).
2. If ρ = 1, then Y = β0 + β1X, where β1 > 0. That is, X and Y are perfectly positively linearly related; i.e., the bivariate probability distribution of (X, Y) lies entirely on a straight line with positive slope.
3. If ρ = −1, then Y = β0 + β1X, where β1 < 0. That is, X and Y are perfectly negatively linearly related; i.e., the bivariate probability distribution of (X, Y) lies entirely on a straight line with negative slope.
4. If ρ = 0, then X and Y are not linearly related.

NOTE: If X and Y are independent random variables, then ρ = 0. However, again, the implication does not go the other way; that is, if ρ = 0, this does not necessarily mean that X and Y are independent.

NOTE: In assessing the strength of the linear relationship between X and Y, the correlation coefficient is often preferred over the covariance, since ρ is measured on a bounded, unitless scale. On the other hand, Cov(X, Y) can be any real number and its units may not even make practical sense.

Example 5.16. In Example 5.14, we considered the bivariate model

f(y1, y2) = 3y1, 0 < y2 < y1 < 1; 0, otherwise,

for Y1, the proportion of the capacity of the tank after being stocked, and Y2, the proportion of the capacity of the tank that is sold. Compute the
correlation ρ between Y1 and Y2.
SOLUTION: In Example 5.14, we computed Cov(Y1, Y2) = 0.01875, so all we need is σY1 and σY2, the marginal standard deviations. In Example 5.14, we also found that Y1 ~ beta(3, 1) and

f(y2) = (3/2)(1 − y2²), 0 < y2 < 1; 0, otherwise.

The variance of Y1 is

V(Y1) = (3 × 1)/[(3+1)²(3+1+1)] = 3/80, so σY1 = √(3/80) ≈ 0.194.

Simple calculations using f(y2) show that E(Y2²) = 1/5 and E(Y2) = 3/8, so that

V(Y2) = 1/5 − (3/8)² ≈ 0.059, and σY2 = √0.059 ≈ 0.244.

Finally, the correlation is

ρ = Cov(Y1, Y2)/(σY1 σY2) ≈ 0.01875/[(0.194)(0.244)] ≈ 0.40. □

5.9 Expectations and variances of linear functions of random variables

TERMINOLOGY: Suppose that Y1, Y2, ..., Yn are random variables and that a1, a2, ..., an are constants. The function

U = Σ_{i=1}^n aiYi = a1Y1 + a2Y2 + ··· + anYn

is called a linear combination of the random variables Y1, Y2, ..., Yn.

EXPECTED VALUE OF A LINEAR COMBINATION:

E(U) = Σ_{i=1}^n aiE(Yi).

VARIANCE OF A LINEAR COMBINATION:

V(U) = V(Σ_{i=1}^n aiYi) = Σ_{i=1}^n ai²V(Yi) + 2 Σ_{i<j} aiajCov(Yi, Yj) = Σ_{i=1}^n ai²V(Yi) + Σ_{i≠j} aiajCov(Yi, Yj).

Example 5.17. Achievement tests are commonly seen in educational or employment settings. For a large population of test takers, let Y1, Y2, and Y3 represent scores for different parts of an exam. Suppose that Y1 ~ N(12, 4), Y2 ~ N(16, 9), and Y3 ~ N(20, 16). Suppose additionally that Y1 and Y2 are independent, Cov(Y1, Y3) = 0.8, and Cov(Y2, Y3) = −6.7. Two different summary measures are computed to assess a subject's performance:

U1 = 0.5Y1 − 2Y2 + Y3 and U2 = 3Y1 − 2Y2 − Y3.

Find E(U1) and V(U1).
SOLUTION: The expected value of U1 is

E(U1) = E(0.5Y1 − 2Y2 + Y3) = 0.5E(Y1) − 2E(Y2) + E(Y3) = 0.5(12) − 2(16) + 20 = −6.

The variance of U1 is

V(U1) = V(0.5Y1 − 2Y2 + Y3)
      = (0.5)²V(Y1) + (−2)²V(Y2) + (1)²V(Y3) + 2(0.5)(−2)Cov(Y1, Y2) + 2(0.5)(1)Cov(Y1, Y3) + 2(−2)(1)Cov(Y2, Y3)
      = (0.25)(4) + 4(9) + 16 + 2(0.5)(−2)(0) + 2(0.5)(1)(0.8) + 2(−2)(1)(−6.7) = 80.6.

EXERCISE: Find E(U2) and V(U2). □

COVARIANCE BETWEEN TWO LINEAR COMBINATIONS: Suppose that

U1 = Σ_{i=1}^n aiYi = a1Y1 + a2Y2 + ··· + anYn
U2 = Σ_{j=1}^m bjXj = b1X1 + b2X2 + ··· + bmXm.

Then,

Cov(U1, U2) = Σ_{i=1}^n Σ_{j=1}^m aibjCov(Yi, Xj).

EXERCISE: In Example 5.17, compute Cov(U1, U2).

5.10 The multinomial model

RECALL: When we
discussed the binomial model in Chapter 3, each Bernoulli trial resulted in either a "success" or a "failure"; that is, on each trial, there were only two possible outcomes (e.g., infected/not, germinated/not, defective/not, etc.).

TERMINOLOGY: A multinomial experiment is simply a generalization of a binomial experiment. In particular, consider an experiment where
- the experiment consists of n trials (n is fixed),
- the outcome for any trial belongs to exactly one of k ≥ 2 categories,
- the probability that an outcome for a single trial falls into category i is pi, for i = 1, 2, ..., k, where each pi remains constant from trial to trial, and
- the trials are independent.

DEFINITION: In a multinomial experiment, define

Y1 = number of outcomes in category 1,
Y2 = number of outcomes in category 2,
...
Yk = number of outcomes in category k,

so that Y1 + Y2 + ··· + Yk = n, and denote Y = (Y1, Y2, ..., Yk). We call Y a multinomial random vector and write Y ~ mult(n, p1, p2, ..., pk).

NOTE: When there are k = 2 categories (e.g., success/failure), the multinomial model reduces to a binomial model. When k = 3, Y is said to have a trinomial distribution.

JOINT PMF: In general, if Y ~ mult(n, p1, p2, ..., pk), the pmf for Y is given by

p(y1, y2, ..., yk) = [n!/(y1! y2! ··· yk!)] p1^{y1} p2^{y2} ··· pk^{yk}, for yi = 0, 1, ..., n with Σ_i yi = n; 0, otherwise.

Example 5.18. At a number of clinic sites throughout Nebraska, chlamydia and gonorrhea testing is performed on individuals using urine or swab specimens. Define the following categories:

Category 1: subjects with neither chlamydia nor gonorrhea
Category 2: subjects with chlamydia but not gonorrhea
Category 3: subjects with gonorrhea but not chlamydia
Category 4: subjects with both chlamydia and gonorrhea

For these k = 4 categories, empirical evidence suggests that p1 = 0.90, p2 = 0.06, p3 = 0.01, and p4 = 0.03. At one site, suppose that n = 20 individuals are tested on a given day. What is the probability exactly 16 are disease free, 2 are chlamydia positive but gonorrhea negative, and the remaining 2 are positive for both infections?
SOLUTION: Define Y = (Y1, Y2, Y3, Y4), where Yi counts the number of
subjects in category i. Assuming that subjects are independent,

Y ~ mult(n = 20, p1 = 0.90, p2 = 0.06, p3 = 0.01, p4 = 0.03).

We want to compute

P(Y1 = 16, Y2 = 2, Y3 = 0, Y4 = 2) = [20!/(16! 2! 0! 2!)] (0.90)^16 (0.06)^2 (0.01)^0 (0.03)^2 ≈ 0.017.

FACTS: If Y = (Y1, Y2, ..., Yk) ~ mult(n, p1, p2, ..., pk), then
- the marginal distribution of Yi is b(n, pi), for i = 1, 2, ..., k,
- E(Yi) = npi, for i = 1, 2, ..., k,
- V(Yi) = npi(1 − pi), for i = 1, 2, ..., k,
- the joint distribution of (Yi, Yj) is trinomial(n, pi, pj, 1 − pi − pj), and
- Cov(Yi, Yj) = −npipj, for i ≠ j.

5.11 The bivariate normal distribution

TERMINOLOGY: The random vector (Y1, Y2) has a bivariate normal distribution if its joint pdf is given by

f(y1, y2) = [1/(2πσ1σ2√(1 − ρ²))] e^{−Q/2}, (y1, y2) ∈ R²; 0, otherwise,

where

Q = [1/(1 − ρ²)] { [(y1 − μ1)/σ1]² − 2ρ[(y1 − μ1)/σ1][(y2 − μ2)/σ2] + [(y2 − μ2)/σ2]² }.

We write (Y1, Y2) ~ N2(μ1, μ2, σ1², σ2², ρ). There are 5 parameters associated with this bivariate distribution: the marginal means μ1 and μ2, the marginal variances σ1² and σ2², and the correlation ρ.

FACTS ABOUT THE BIVARIATE NORMAL DISTRIBUTION:
1. Marginally, Y1 ~ N(μ1, σ1²) and Y2 ~ N(μ2, σ2²).
2. Y1 and Y2 are independent ⟺ ρ = 0. (This is only true for the bivariate normal distribution; remember, this does not hold in general.)
3. The conditional distribution

Y1|Y2 = y2 ~ N(μ1 + ρ(σ1/σ2)(y2 − μ2), σ1²(1 − ρ²)).

4. The conditional distribution

Y2|Y1 = y1 ~ N(μ2 + ρ(σ2/σ1)(y1 − μ1), σ2²(1 − ρ²)).

EXERCISE: Suppose that (Y1, Y2) ~ N2(0, 0, 1, 1, 0.5). What is P(Y2 > 0.5 | Y1 = 0.2)?
ANSWER: Conditional on Y1 = y1 = 0.2, Y2 ~ N(0.1, 0.75). Thus, P(Y2 > 0.5 | Y1 = 0.2) = P(Z > 0.46) = 0.3228.

5.12 Conditional expectation

5.12.1 Conditional means and curves of regression

TERMINOLOGY: Suppose that X and Y are continuous random variables and that g(X) and h(Y) are functions of X and Y, respectively. The conditional expectation of g(X), given Y = y, is

E[g(X)|Y = y] = ∫ g(x) f_{X|Y}(x|y) dx.

Similarly, the conditional expectation of h(Y), given X = x, is

E[h(Y)|X = x] = ∫ h(y) f_{Y|X}(y|x) dy.

If X and Y are discrete, then sums replace integrals.

IMPORTANT: It is important to see that, in general,
- E[g(X)|Y = y] is a function of y, and
- E[h(Y)|X = x] is a function of x.

CONDITIONAL MEANS: In the definition above, if g(X) = X and h(Y) = Y, we get, in the
continuous case,

E(X|Y = y) = ∫ x f_{X|Y}(x|y) dx and E(Y|X = x) = ∫ y f_{Y|X}(y|x) dy.

E(X|Y = y) is called the conditional mean of X, given Y = y. E(Y|X = x) is the conditional mean of Y, given X = x.

Example 5.19. In a simple genetics model, the proportion, say X, of a population with Trait 1 is always less than the proportion, say Y, of a population with Trait 2. In Example 5.3, we saw that the random vector (X, Y) has joint pdf

f(x, y) = 6x, 0 < x < y < 1; 0, otherwise.

In Example 5.5, we derived the conditional distributions

f_{X|Y}(x|y) = 2x/y², 0 < x < y; 0, otherwise,

and

f_{Y|X}(y|x) = 1/(1 − x), x < y < 1; 0, otherwise.

Thus, the conditional mean of X, given Y = y, is

E(X|Y = y) = ∫₀^y x f_{X|Y}(x|y) dx = ∫₀^y x(2x/y²) dx = 2y/3.

Similarly, the conditional mean of Y, given X = x, is

E(Y|X = x) = ∫_x^1 y f_{Y|X}(y|x) dy = ∫_x^1 y/(1 − x) dy = [1/(1 − x)] × (1 − x²)/2 = (1 + x)/2.

That E(Y|X = x) = (1 + x)/2 is not surprising, because Y|X = x ~ U(x, 1). □

TERMINOLOGY: Suppose that (X, Y) is a bivariate random vector.
- The graph of E(X|Y = y) versus y is called the curve of regression of X on Y.
- The graph of E(Y|X = x) versus x is the curve of regression of Y on X.
The curve of regression of Y on X from Example 5.19 is depicted in Figure 5.17.

[Figure 5.17: The curve of regression E(Y|X = x) versus x in Example 5.19.]

5.12.2 Iterated means and variances

REMARK: In general, E(X|Y = y) is a function of y, and y is fixed (not random). Thus, E(X|Y = y) is a fixed number. However, E(X|Y) is a function of Y; thus, E(X|Y) is a random variable! Furthermore, as with any random variable, it has a mean and variance associated with it!!

ITERATED LAWS: Suppose that X and Y are random variables. Then, the laws of iterated expectation and variance, respectively, are given by

E(X) = E[E(X|Y)] and V(X) = E[V(X|Y)] + V[E(X|Y)].

NOTE: When considering the quantity E[E(X|Y)], the inner expectation is taken with respect to the conditional distribution f_{X|Y}(x|y). However, since E(X|Y) is a function of Y, the outer expectation is taken with respect to the marginal distribution fY(y).

Proof. We will prove that E(X) = E[E(X|Y)] for the continuous case. Note that

E(X) = ∫∫ x f_{X,Y}(x, y) dx dy
     = ∫∫ x f_{X|Y}(x|y) fY(y) dx dy
     = ∫ [∫ x f_{X|Y}(x|y) dx] fY(y) dy
     = ∫ E(X|Y = y) fY(y) dy
     = E[E(X|Y)]. □

Example 5.20. Suppose that, in a field experiment, we observe Y, the number of plots, out of n, that respond to a treatment. However, we don't know the value of p, the probability of response, and, furthermore, we think that it may be a function of location, temperature, precipitation, etc. In this situation, it might be appropriate to regard p as a random variable! Specifically, suppose that the random variable P varies according to a beta(α, β) distribution. That is, we assume a hierarchical structure:

Y|P = p ~ binomial(n, p)
P ~ beta(α, β).

The unconditional mean of Y can be computed using the iterated expectation rule:

E(Y) = E[E(Y|P)] = E(nP) = nE(P) = n × α/(α + β).

The unconditional variance of Y is given by

V(Y) = E[V(Y|P)] + V[E(Y|P)]
     = E[nP(1 − P)] + V(nP)
     = nE(P) − nE(P²) + n²V(P)
     = nE(P) − n{V(P) + [E(P)]²} + n²V(P)
     = [nαβ/(α + β)²] × [(α + β + n)/(α + β + 1)].

Unconditionally, the random variable Y follows a beta-binomial distribution. This is a popular probability model for situations wherein one observes binomial-type responses, but where the variance is suspected to be larger than the usual binomial variance ("extra variation"). □

BETA-BINOMIAL PMF: The probability mass function for a beta-binomial random variable Y is given by

pY(y) = ∫₀¹ pY|P(y|p) fP(p) dp
      = ∫₀¹ (n choose y) p^y (1 − p)^{n−y} × [Γ(α + β)/(Γ(α)Γ(β))] p^{α−1} (1 − p)^{β−1} dp
      = (n choose y) × [Γ(α + β)/(Γ(α)Γ(β))] × [Γ(y + α)Γ(n − y + β)/Γ(n + α + β)],

for y = 0, 1, ..., n, and pY(y) = 0, otherwise.

STAT/MATH 511 PROBABILITY
Fall 2007
Lecture Notes
Joshua M. Tebbs
Department of Statistics
University of South Carolina

TABLE OF CONTENTS

Contents
1 Probability: 1.1 Introduction, 1.2 Sample spaces, 1.3 Basic set theory, 1.4 Properties of probability, 1.5 Discrete probability models and events, 1.6 Tools for counting sample points (1.6.1 The multiplication rule, 1.6.2 Permutations, 1.6.3 Combinations), 1.7 Conditional probability, 1.8 Independence, 1.9 Law of Total Probability and Bayes Rule.
2 Discrete Distributions: 2.1 Random variables, 2.2
Probability distributions for discrete random variables, 2.3 Mathematical expectation, 2.4 Variance, 2.5 Moment generating functions, 2.6 Binomial distribution, 2.7 Geometric distribution, 2.8 Negative binomial distribution, 2.9 Hypergeometric distribution, 2.10 Poisson distribution.
3 Continuous Distributions: 3.1 Introduction, 3.2 Cumulative distribution functions, 3.3 Continuous random variables, 3.4 Mathematical expectation (3.4.1 Expected values, 3.4.2 Variance, 3.4.3 Moment generating functions), 3.5 Uniform distribution, 3.6 Normal distribution, 3.7 The gamma family of pdfs (3.7.1 Exponential distribution, 3.7.2 Gamma distribution, 3.7.3 χ² distribution), 3.8 Beta distribution, 3.9 Chebyshev's Inequality.
4 Multivariate Distributions: 4.1 Introduction, 4.2 Discrete random vectors, 4.3 Continuous random vectors, 4.4 Marginal distributions, 4.5 Conditional distributions, 4.6 Independent random variables, 4.7 Expectations of functions of random variables, 4.8 Covariance and correlation (4.8.1 Covariance, 4.8.2 Correlation), 4.9 Expectations and variances of linear functions of random variables, 4.10 The multinomial model, 4.11 The bivariate normal distribution, 4.12 Conditional expectation (4.12.1 Conditional means and curves of regression, 4.12.2 Iterated means and variances).

1 Probability

Complementary reading: Chapter 2.

1.1 Introduction

TERMINOLOGY: The text defines probability as "a measure of one's belief in the occurrence of a future event." It is also sometimes called "the mathematics of uncertainty."

EVENTS: Here are some events we may wish to assign probabilities to:
- tomorrow's temperature exceeding 80 degrees
- manufacturing a defective part
- concluding one fertilizer is superior to another when it isn't
- the NASDAQ losing 5 percent of its value
- you earning a "B" or better in this course.

ASSIGNING
PROBABILITIES TO EVENTS: How do we assign probabilities to events? There are three general approaches:
1. Subjective approach: this is based on "feeling" and may not even be scientific.
2. Relative frequency approach: this approach can be used when some random phenomenon is observed repeatedly under identical conditions.
3. Axiomatic approach: this is the approach we will take in this course.

[Figure 1.1: The proportion of tosses which result in a "2"; each plot represents 1000 rolls of a fair die.]

Example 1.1. An example illustrating the relative frequency approach to probability. Suppose we roll a die 1000 times and record the number of times we observe a "2." Let A denote this event. The relative frequency approach says that

P(A) ≈ n(A)/n,

where n(A) denotes the frequency of the event, and n denotes the number of trials performed. The ratio n(A)/n is sometimes called the relative frequency. The symbol P(A) is shorthand for "the probability that A occurs."

RELATIVE FREQUENCY APPROACH: Continuing with our example, suppose that n(A) = 158. Then, we would estimate P(A) with 158/1000 = 0.158. If we performed this experiment repeatedly, the relative frequency approach says that n(A)/n → P(A), as n → ∞. Of course, if the die is unbiased, n(A)/n → P(A) = 1/6. □

1.2 Sample spaces

TERMINOLOGY: In probability applications, it is common to perform some random experiment and then observe an outcome. The set of all possible outcomes for an experiment is called the sample space, hereafter denoted by S.

Example 1.2. The Michigan state lottery calls for a three-digit integer to be selected:

S = {000, 001, 002, ..., 998, 999}. □

Example 1.3. An industrial experiment consists of observing the lifetime of a certain battery. If lifetimes are measured in hours, the sample space could be any one of

S1 = {w : w ≥ 0}, S2 = {0, 1, 2, 3, ...}, S3 = {defective, not defective}. □

MORAL: Sample spaces are not unique; in fact, how we define the sample space has a direct influence on how we assign probabilities to events.

1.3 Basic set theory

TERMINOLOGY: A countable set A is one whose elements can be put into a one-to-one correspondence with N = {1, 2, ...}, the set of natural numbers; i.e., there exists an injection with domain A and range contained in N. A set that is not countable is called an uncountable set.

TERMINOLOGY: Countable sets can be further divided up into two types. A countably infinite set has an infinite number of elements. A countably finite set has a finite number of elements.

TERMINOLOGY: Suppose that S is a nonempty set. We say that A is a subset of S, and write A ⊂ S or A ⊆ S, if

w ∈ A ⟹ w ∈ S.

In probability applications, S will denote a sample space, A will represent an event to which we wish to assign a probability, and w usually denotes a possible experimental outcome. If w ∈ A, we would say that "the event A has occurred."

TERMINOLOGY: The null set, denoted as ∅, is the set that contains no elements.

TERMINOLOGY: The union of two sets is the set of all elements in either set or both. We denote the union of two sets A and B as A ∪ B. In set notation,

A ∪ B = {w : w ∈ A or w ∈ B}.

TERMINOLOGY: The intersection of two sets A and B is the set containing those elements which are in both sets. We denote the intersection of two sets A and B as A ∩ B. In set notation,

A ∩ B = {w : w ∈ A and w ∈ B}.

EXTENSION: We can extend the notion of unions and intersections to more than two sets. Suppose that A1, A2, ..., An is a finite sequence of sets. The union of these n sets is

∪_{j=1}^n Aj = A1 ∪ A2 ∪ ··· ∪ An = {w : w ∈ Aj for at least one j},

and the intersection of the n sets is

∩_{j=1}^n Aj = A1 ∩ A2 ∩ ··· ∩ An = {w : w ∈ Aj for all j}.

EXTENSION: Suppose that A1, A2, ... is a countable sequence of sets. The union and intersection of this infinite collection of sets is

∪_{j=1}^∞ Aj = {w : w ∈ Aj for at least one j} and ∩_{j=1}^∞ Aj = {w : w ∈ Aj for all j}.

Example 1.4. Define the sequence of sets Aj = [1, 1 + 1/j), for j = 1, 2, .... Then,

∪_{j=1}^∞ Aj = [1, 2) and ∩_{j=1}^∞ Aj = {1}.

TERMINOLOGY: The complement of a set A is the set of all elements not in A but still
in S. We denote the complement as Aᶜ. In set notation,

Aᶜ = {w ∈ S : w ∉ A}.

TERMINOLOGY: We say that A is a subset of B, and write A ⊂ B or A ⊆ B, if

w ∈ A ⟹ w ∈ B.

Thus, if A and B are events in an experiment and A ⊂ B, then if A occurs, B must occur as well.

Distributive Laws:
1. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
2. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

DeMorgan's Laws:
1. (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
2. (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ

TERMINOLOGY: We call two events A and B mutually exclusive, or disjoint, if A ∩ B = ∅. Extending this definition to a finite or countable collection of sets is obvious.

1.4 Properties of probability

THE THREE AXIOMS OF PROBABILITY: Given a nonempty sample space S, the measure P(A) is a set function satisfying three axioms:
1. P(A) ≥ 0, for every A ⊆ S.
2. P(S) = 1.
3. If A1, A2, ... is a countable sequence of pairwise mutually exclusive events (i.e., Ai ∩ Aj = ∅, for i ≠ j) in S, then

P(∪_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai).

IMPORTANT RESULTS: The following results are important properties of the probability set function P, and each follows from the Kolmogorov Axioms (those just stated). All events below are assumed to be subsets of S.

1. Complement rule: For any event A, P(A) = 1 − P(Aᶜ).
Proof. Note that S = A ∪ Aᶜ. Thus, since A and Aᶜ are disjoint, P(A ∪ Aᶜ) = P(A) + P(Aᶜ), by Axiom 3. By Axiom 2, P(S) = 1. Thus, 1 = P(S) = P(A ∪ Aᶜ) = P(A) + P(Aᶜ). □

2. P(∅) = 0.
Proof. Take A = ∅, so that Aᶜ = S. Use the last result and Axiom 2. □

3. Monotonicity property: Suppose that A and B are two events such that A ⊂ B. Then, P(A) ≤ P(B).
Proof. Write B = A ∪ (B ∩ Aᶜ). Clearly, A and B ∩ Aᶜ are disjoint. Thus, by Axiom 3, P(B) = P(A) + P(B ∩ Aᶜ). Because P(B ∩ Aᶜ) ≥ 0, we are done. □

4. For any event A, P(A) ≤ 1.
Proof. Since A ⊂ S, this follows from the monotonicity property and Axiom 2. □

5. Inclusion-exclusion: Suppose that A and B are two events. Then,

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof. Write A ∪ B = A ∪ (Aᶜ ∩ B). Then, since A and Aᶜ ∩ B are disjoint, by Axiom 3, P(A ∪ B) = P(A) + P(Aᶜ ∩ B). Now, write B = (A ∩ B) ∪ (Aᶜ ∩ B). Clearly, A ∩ B and Aᶜ ∩ B are disjoint. Thus, again by Axiom 3, P(B) = P(A ∩ B) + P(Aᶜ ∩ B). Combining the last two statements gives the result. □

Example 1.5. The probability that train 1 is on time is 0.95, and the probability that train 2 is on time is 0.93.
The probability that both are on time is 0.90.
(a) What is the probability that at least one train is on time?
SOLUTION: Denote by Ai the event that train i is on time, for i = 1, 2. Then,

P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2) = 0.95 + 0.93 − 0.90 = 0.98. □

(b) What is the probability that neither train is on time?
SOLUTION: By DeMorgan's Law,

P(A1ᶜ ∩ A2ᶜ) = P[(A1 ∪ A2)ᶜ] = 1 − P(A1 ∪ A2) = 1 − 0.98 = 0.02. □

EXTENSION: The inclusion-exclusion formula can be extended to any finite sequence of sets A1, A2, ..., An. For example, if n = 3,

P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A1 ∩ A3) − P(A2 ∩ A3) + P(A1 ∩ A2 ∩ A3).

In general, the inclusion-exclusion formula can be written for any finite sequence:

P(∪_{i=1}^n Ai) = Σ_i P(Ai) − Σ_{i1<i2} P(Ai1 ∩ Ai2) + Σ_{i1<i2<i3} P(Ai1 ∩ Ai2 ∩ Ai3) − ··· + (−1)^{n+1} P(A1 ∩ A2 ∩ ··· ∩ An).

Of course, if the sets A1, A2, ..., An are disjoint, then we arrive back at

P(∪_{i=1}^n Ai) = Σ_{i=1}^n P(Ai),

a result implied by Axiom 3.

1.5 Discrete probability models and events

TERMINOLOGY: If a sample space for an experiment contains a finite or countable number of sample points, we call it a discrete sample space.
- Finite: "number of sample points < ∞"
- Countable: "number of sample points may equal ∞, but can be counted"; i.e., sample points may be put into a 1:1 correspondence with N = {1, 2, ...}.

Example 1.6. A standard roulette wheel contains an array of numbered compartments referred to as "pockets." The pockets are either red, black, or green. The numbers 1 through 36 are evenly split between red and black, while 0 and 00 are green pockets. On the next play, one may be interested in the following events:

A1 = {13}, A2 = {red}, A3 = {0, 00}.

TERMINOLOGY: A simple event is one that can not be decomposed; that is, a simple event corresponds to exactly one sample point w. Compound events are those events that contain more than one sample point. In Example 1.6, because A1 only contains one sample point, it is a simple event. The events A2 and A3 contain more than one sample point; thus, they are compound events.

STRATEGY: Computing the probability of a compound event can be done by (1) identifying all sample points associated with the
event, and (2) adding up the probabilities associated with each sample point.

NOTATION: We have used w to denote an element in a set (i.e., a sample point in an event). In a more probabilistic spirit, your authors use the symbol Ei to denote the ith sample point (i.e., simple event). Thus, if A denotes any compound event,

P(A) = Σ_{i : Ei ∈ A} P(Ei).

We simply sum up the simple event probabilities for all i such that Ei ∈ A.

RESULT: Suppose a discrete sample space S contains N < ∞ sample points, each of which are equally likely. If the event A consists of na sample points, then P(A) = na/N.
Proof. Write S = E1 ∪ E2 ∪ ··· ∪ EN, where Ei corresponds to the ith sample point, i = 1, 2, ..., N. Then,

1 = P(S) = P(E1 ∪ E2 ∪ ··· ∪ EN) = Σ_{i=1}^N P(Ei).

Now, as P(E1) = P(E2) = ··· = P(EN), we have that 1 = Σ_{i=1}^N P(Ei) = N P(E1), and thus

P(E1) = P(E2) = ··· = P(EN) = 1/N.

Without loss of generality, take A = E1 ∪ E2 ∪ ··· ∪ E_na. Then,

P(A) = P(E1 ∪ E2 ∪ ··· ∪ E_na) = Σ_{i=1}^{na} P(Ei) = Σ_{i=1}^{na} 1/N = na/N. □

1.6 Tools for counting sample points

1.6.1 The multiplication rule

MULTIPLICATION RULE: Consider an experiment consisting of k ≥ 2 "stages," where

n1 = number of ways stage 1 can occur,
n2 = number of ways stage 2 can occur,
...
nk = number of ways stage k can occur.

Then, there are

Π_{i=1}^k ni = n1 × n2 × ··· × nk

different outcomes in the experiment.

Example 1.7. An experiment consists of rolling two dice. Envision stage 1 as rolling the first and stage 2 as rolling the second. Here, n1 = 6 and n2 = 6. By the multiplication rule, there are n1 × n2 = 6 × 6 = 36 different outcomes. □

Example 1.8. In a field experiment, I want to form all possible treatment combinations among the three factors:

Factor 1: Fertilizer (60 kg, 80 kg, 100 kg; 3 levels)
Factor 2: Insects (infected/not infected; 2 levels)
Factor 3: Temperature (70F, 90F; 2 levels).

Here, n1 = 3, n2 = 2, and n3 = 2. Thus, by the multiplication rule, there are n1 × n2 × n3 = 12 different treatment combinations. □

Example 1.9. Suppose that an Iowa license plate consists of seven places; the first three are occupied by letters, the remaining four with numbers. Compute the
total number of possible orderings if
(a) there are no letter/number restrictions,
(b) repetition of letters is prohibited,
(c) repetition of numbers is prohibited,
(d) repetitions of numbers and letters are prohibited.
ANSWERS:
(a) 26 × 26 × 26 × 10 × 10 × 10 × 10 = 175,760,000
(b) 26 × 25 × 24 × 10 × 10 × 10 × 10 = 156,000,000
(c) 26 × 26 × 26 × 10 × 9 × 8 × 7 = 88,583,040
(d) 26 × 25 × 24 × 10 × 9 × 8 × 7 = 78,624,000

1.6.2 Permutations

TERMINOLOGY: A permutation is an arrangement of distinct objects in a particular order. Order is important!

PROBLEM: Suppose that we have n distinct objects and we want to order (or permute) these objects. Thinking of n slots, we will put one object in each slot. There are
- n different ways to choose the object for slot 1,
- n − 1 different ways to choose the object for slot 2,
- n − 2 different ways to choose the object for slot 3, and so on, down to
- 2 different ways to choose the object for slot n − 1, and
- 1 way to choose for the last slot.

PUNCHLINE: By the multiplication rule, there are

n(n − 1)(n − 2) ··· (2)(1) = n!

different ways to order (permute) the n distinct objects.

Example 1.10. My bookshelf has 10 books on it. How many ways can I permute the 10 books on the shelf? ANSWER: 10! = 3,628,800. □

Example 1.11. Now suppose that in Example 1.10, there are 4 math books, 2 chemistry books, 3 physics books, and 1 statistics book. I want to order the 10 books so that all books of the same subject are together. How many ways can I do this?
SOLUTION: Use the multiplication rule:
Stage 1: Permute the 4 math books: 4!
Stage 2: Permute the 2 chemistry books: 2!
Stage 3: Permute the 3 physics books: 3!
Stage 4: Permute the 1 statistics book: 1!
Stage 5: Permute the 4 subjects (m, c, p, s): 4!
Thus, there are 4! × 2! × 3! × 1! × 4! = 6,912 different orderings. □

PERMUTATIONS: With a collection of n distinct objects, we want to choose and permute r of them (r ≤ n). The number of ways to do this is

P(n, r) ≡ n!/(n − r)!.

The symbol P(n, r) is read "the permutation of n things
taken r at a time."
Proof. Envision r slots. There are n ways to fill the first slot, n − 1 ways to fill the second slot, and so on, until we get to the rth slot, in which case there are n − r + 1 ways to fill it. Thus, by the multiplication rule, there are

n(n − 1) ··· (n − r + 1) = n!/(n − r)!

different permutations. □

Example 1.12. With a group of 5 people, I want to choose a committee with three members: a president, a vice president, and a secretary. There are

P(5, 3) = 5!/(5 − 3)! = 120/2 = 60

different committees possible. Here, note that order is important! For any 3 people selected, there are 3! = 6 different committees possible. □

Example 1.13. In an agricultural experiment, we are examining 10 plots of land; however, only four can be used in an experiment run to test four new different fertilizers. How many ways can I choose these four plots and then assign fertilizers?
SOLUTION: There are

P(10, 4) = 10!/(10 − 4)! = 5040

different permutations. Here, we are assuming fertilizer order is important.
(a) What is the probability of observing the permutation (7, 4, 2, 6)?
(b) What is the probability of observing a permutation with only even-numbered plots?
ANSWERS: (a) 1/5040. (b) 120/5040.

CURIOSITY: What happens if the objects to permute are not distinct?

Example 1.14. Consider the word PEPPER. How many permutations of the letters are possible?
TRICK: Initially, treat all letters as distinct objects by writing, say, P1E1P2P3E2R. With P1E1P2P3E2R, there are 6! = 720 different orderings of these distinct objects. Now, we recognize that there are

3! ways to permute the Ps,
2! ways to permute the Es,
1! ways to permute the Rs.

Thus, 6! is 3! × 2! × 1! times too large, so we need to divide 6! by 3! × 2! × 1!; i.e., there are

6!/(3! × 2! × 1!) = 60

possible permutations. □

MULTINOMIAL COEFFICIENTS: Suppose that in a set of n objects, there are n1 that are similar, n2 that are similar, ..., nk that are similar, where n1 + n2 + ··· + nk = n. The number of permutations, i.e., distinguishable permutations, in the sense that the objects are put into distinct groups
of the n objects, is given by the multinomial coefficient

(n; n1, n2, ..., nk) = n!/(n1! n2! ··· nk!).

NOTE: Multinomial coefficients arise in the algebraic expansion of the multinomial expression x1 + x2 + ··· + xk; i.e.,

(x1 + x2 + ··· + xk)^n = Σ [n!/(n1! n2! ··· nk!)] x1^{n1} x2^{n2} ··· xk^{nk},

where the sum is over all nonnegative integers n1, n2, ..., nk such that n1 + n2 + ··· + nk = n.

Example 1.15. How many signals, each consisting of 9 flags in a line, can be made from 4 white flags, 2 blue flags, and 3 yellow flags?
ANSWER:

9!/(4! × 2! × 3!) = 1260. □

Example 1.16. In Example 1.15, assuming all permutations are equally likely, what is the probability that all of the white flags are grouped together? I will offer two solutions. The solutions differ in the way I construct the sample space. Define

A = {all four white flags are grouped together}.

SOLUTION 1: Work with a sample space that does not treat the flags as distinct objects, but merely considers color. Then, we know from Example 1.15 that there are 1260 different orderings. Thus,

N = number of sample points in S = 1260.

Let na denote the number of ways that A can occur. We find na by using the multiplication rule:
Stage 1: Pick four adjacent slots: n1 = 6.
Stage 2: With the remaining 5 slots, permute the 2 blues and 3 yellows: n2 = 10.
Thus, na = 6 × 10 = 60. Finally, since we have equally likely outcomes,

P(A) = na/N = 60/1260 ≈ 0.0476. □

SOLUTION 2: Initially, treat all 9 flags as distinct objects (i.e., W1, W2, W3, W4, B1, B2, Y1, Y2, Y3) and consider the sample space consisting of the 9! different permutations of these 9 distinct objects. Then,

N = number of sample points in S = 9!.

Let na denote the number of ways that A can occur. We find na, again, by using the multiplication rule:
Stage 1: Pick adjacent slots for W1, W2, W3, W4: n1 = 6.
Stage 2: With the four chosen slots, permute W1, W2, W3, W4: n2 = 4!.
Stage 3: With the remaining 5 slots, permute B1, B2, Y1, Y2, Y3: n3 = 5!.
Thus, na = 6 × 4! × 5! = 17,280. Finally, since we have equally likely outcomes,

P(A) = na/N = 17280/9! ≈ 0.0476. □

1.6.3 Combinations

COMBINATIONS: Given n distinct objects, the number of ways to choose r of them (r ≤ n), without regard to order, is given by

C(n, r) ≡ (n choose r) = n!/(r!(n − r)!).
The symbol C(n, r) is read "the combination of n things, taken r at a time." By convention, 0! = 1.
Proof. Choosing r objects is equivalent to breaking the n objects into two distinguishable groups: Group 1, the r chosen, and Group 2, the n − r not chosen. There are n!/(r!(n − r)!) ways to do this. □

REMARK: We will adopt the notation (n choose r) as the symbol for C(n, r). These terms are often called binomial coefficients, since they arise in the algebraic expansion of a binomial; viz.,

(x + y)^n = Σ_{r=0}^n (n choose r) x^r y^{n−r}.

Example 1.17. Return to Example 1.12. Now, suppose that we only want to choose 3 committee members from 5, without designations for president, vice president, and secretary. Then, there are

(5 choose 3) = 5!/(3!(5 − 3)!) = (5 × 4 × 3!)/(3! × 2!) = 10

different committees. □

NOTE: From Examples 1.12 and 1.17, one should note that P(n, r) = r! × C(n, r). Recall that combinations do not regard order as important. Thus, once we have chosen our r objects (there are C(n, r) ways to do this), there are then r! ways to permute those r chosen objects. Thus, we can think of a permutation as simply a combination times the number of ways to permute the r chosen objects.

Example 1.18. A company receives 20 hard drives. Five of the drives will be randomly selected and tested. If all five are satisfactory, the entire lot will be accepted. Otherwise, the entire lot is rejected. If there are really 3 defectives in the lot, what is the probability of accepting the lot?
SOLUTION: First, the number of sample points in S is given by

N = (20 choose 5) = 20!/(5! × 15!) = 15,504.

Let A denote the event that the lot is accepted. How many ways can A occur? Use the multiplication rule:
Stage 1: Choose 5 good drives from 17: (17 choose 5).
Stage 2: Choose 0 bad drives from 3: (3 choose 0).
By the multiplication rule, there are na = (17 choose 5) × (3 choose 0) = 6188 different ways A can occur. Assuming an equiprobability model (i.e., each outcome is equally likely),

P(A) = na/N = 6188/15504 ≈ 0.399. □

1.7 Conditional probability

MOTIVATION: In some problems, we may be fortunate enough to have prior knowledge about the likelihood of events related to the event of interest. It may be of interest to incorporate this information into a probability calculation.

TERMINOLOGY: Let A and B be events in a nonempty sample space S. The conditional probability of A, given that B has occurred, is given by

P(A|B) = P(A ∩ B)/P(B),

provided that P(B) > 0.

Example 1.19. A couple has two children.
(a) What is the probability that both are girls?
(b) What is the probability that both are girls, if the eldest is a girl?
SOLUTION: (a) The sample space is given by S = {(M, M), (M, F), (F, M), (F, F)}, and N = 4, the number of sample points in S. Define

A1 = {1st-born child is a girl} and A2 = {2nd-born child is a girl}.

Clearly, A1 ∩ A2 = {(F, F)}, and P(A1 ∩ A2) = 1/4, assuming that the four outcomes in S are equally likely. □
SOLUTION: (b) Now, we want P(A2|A1). Applying the definition of conditional probability, we get

P(A2|A1) = P(A1 ∩ A2)/P(A1) = (1/4)/(1/2) = 1/2. □

REMARK: In a profound sense, the "new information" in Example 1.19 (i.e., that the eldest is a girl) induces a new, or restricted, sample space, given by

S* = {(F, M), (F, F)}.

On this space, note that P(A2) = 1/2, computed with respect to S*. Also note that whether you compute P(A2|A1) with the original sample space S, or compute P(A2) with the restricted space S*, you will get the same answer.

Example 1.20. In a certain community, 36 percent of the families own a dog, 22 percent of the families that own a dog also own a cat, and 30 percent of the families own a cat. A family is selected at random.
(a) Compute the probability that the family owns both a cat and dog.
(b) Compute the probability that the family owns a dog, given that it owns a cat.
SOLUTION: Let C = {family owns a cat} and D = {family owns a dog}. In (a), we want P(C ∩ D). But,

P(C|D) = P(C ∩ D)/P(D) = 0.22 and P(D) = 0.36.

Thus, P(C ∩ D) = 0.36 × 0.22 = 0.0792. For (b), simply use the definition of conditional probability:

P(D|C) = P(C ∩ D)/P(C) = 0.0792/0.30 = 0.264. □

PROBABILITY AXIOMS: It is interesting to note that conditional probability satisfies the axioms for a probability set function, when P(B) > 0. In particular,
1. P(A|B) ≥ 0.
2. P(B|B) = 1.
3. If A1, A2, ... is a countable sequence of pairwise mutually exclusive events (i.e., Ai ∩ Aj = ∅, for i ≠ j) in S, then

P(∪_{i=1}^∞ Ai | B) = Σ_{i=1}^∞ P(Ai|B).

MULTIPLICATION LAW OF PROBABILITY: Suppose A and B are events in a non-empty sample space S. Then

P(A ∩ B) = P(B|A)P(A) = P(A|B)P(B).

Proof: As long as P(A) and P(B) are strictly positive, this follows directly from the definition of conditional probability. □

EXTENSION: The multiplication law of probability can be extended to more than 2 events. For example,

P(A1 ∩ A2 ∩ A3) = P(A3|A1 ∩ A2) × P(A1 ∩ A2) = P(A3|A1 ∩ A2) × P(A2|A1) × P(A1).

NOTE: This suggests that we can compute probabilities like P(A1 ∩ A2 ∩ A3) "sequentially" by first computing P(A1), then P(A2|A1), then P(A3|A1 ∩ A2). The probability of a k-fold intersection can be computed similarly; i.e.,

P(∩_{i=1}^k Ai) = P(A1) × P(A2|A1) × P(A3|A1 ∩ A2) × ··· × P(Ak|A1 ∩ ··· ∩ A_{k−1}).

Example 1.21. I am dealt a hand of 5 cards. What is the probability that they are all spades?

SOLUTION: Define Ai to be the event that card i is a spade, i = 1, 2, 3, 4, 5. Then

P(A1) = 13/52
P(A2|A1) = 12/51
P(A3|A1 ∩ A2) = 11/50
P(A4|A1 ∩ A2 ∩ A3) = 10/49
P(A5|A1 ∩ A2 ∩ A3 ∩ A4) = 9/48,

so that

P(A1 ∩ A2 ∩ A3 ∩ A4 ∩ A5) = (13/52)(12/51)(11/50)(10/49)(9/48) ≈ 0.0005. □

1.8 Independence

TERMINOLOGY: When the occurrence or non-occurrence of A has no effect on whether or not B occurs, and vice versa, we say that the events A and B are independent. Mathematically, we define A and B to be independent iff

P(A ∩ B) = P(A)P(B).

Otherwise, A and B are called dependent events. Note that if A and B are independent,

P(A|B) = P(A ∩ B)/P(B) = P(A)P(B)/P(B) = P(A)

and

P(B|A) = P(A ∩ B)/P(A) = P(A)P(B)/P(A) = P(B).

Example 1.22. A red die and a white die are rolled. Let A = {4 on red die} and B = {sum is odd}. Of the 36 outcomes in S, 6 are favorable to A, 18 are favorable to B, and 3 are favorable to A ∩ B. Thus, since outcomes are assumed to be equally likely,

P(A ∩ B) = 3/36 = (6/36)(18/36) = P(A)P(B),

and the events A and B are independent. □

Example 1.23. In an engineering system, two components are placed in series; that is, the system is functional as long as both components are. Let Ai, i = 1, 2, denote the event that component i is
functional. Assuming independence, the probability the system is functional is then

P(A1 ∩ A2) = P(A1)P(A2).

If P(Ai) = 0.95, for example, then P(A1 ∩ A2) = (0.95)² = 0.9025. □

INDEPENDENCE OF COMPLEMENTS: If A and B are independent events, so are (a) Ā and B, (b) A and B̄, and (c) Ā and B̄.

Proof: We will only prove (a); the other parts follow similarly.

P(Ā ∩ B) = P(Ā|B)P(B) = [1 − P(A|B)]P(B) = [1 − P(A)]P(B) = P(Ā)P(B). □

EXTENSION: The concept of independence (and independence of complements) can be extended to any finite number of events in S.

TERMINOLOGY: Let A1, A2, ..., An denote a collection of n ≥ 2 events in a non-empty sample space S. The events A1, A2, ..., An are said to be mutually independent if, for any subcollection of events, say A_{i1}, A_{i2}, ..., A_{ik}, 2 ≤ k ≤ n, we have

P(∩_{j=1}^k A_{ij}) = ∏_{j=1}^k P(A_{ij}).

CHALLENGE: Come up with three events which are pairwise independent, but not mutually independent.

COMMON SETTING: Many experiments consist of a sequence of n trials that are independent (e.g., flipping a coin 10 times). If Ai denotes the event associated with the ith trial, and the trials are independent, then

P(∩_{i=1}^n Ai) = ∏_{i=1}^n P(Ai).

Example 1.24. An unbiased die is rolled six times. Let Ai = {i appears on roll i}, for i = 1, 2, ..., 6. Then P(Ai) = 1/6 and, assuming independence,

P(A1 ∩ A2 ∩ A3 ∩ A4 ∩ A5 ∩ A6) = ∏_{i=1}^6 P(Ai) = (1/6)⁶.

Suppose that if Ai occurs, we will call it "a match." What is the probability of at least one match in the six rolls?

SOLUTION: Let B denote the event that there is at least one match. Then B̄ denotes the event that there are no matches. Now,

P(B̄) = P(Ā1 ∩ Ā2 ∩ ··· ∩ Ā6) = ∏_{i=1}^6 P(Āi) = (5/6)⁶ ≈ 0.335.

Thus, P(B) = 1 − P(B̄) = 1 − 0.335 = 0.665, by the complement rule.

EXERCISE: Generalize this result to an n-sided die. What does this probability converge to as n → ∞? □

1.9 Law of Total Probability and Bayes' Rule

SETTING: Suppose A and B are events in a non-empty sample space S. We can easily express the event A as follows:

A = (A ∩ B) ∪ (A ∩ B̄),

a union of disjoint events. Thus, by Axiom 3,

P(A) = P(A ∩ B) + P(A ∩ B̄) = P(A|B)P(B) + P(A|B̄)P(B̄),

where the last step follows from the multiplication law of probability. This is called the Law of Total Probability (LOTP).
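The LOTP identity is easy to verify numerically. Below is a minimal Python sketch using the accident-prone numbers from the insurance example in these notes (P(B) = 0.3, P(A|B) = 0.4, P(A|B̄) = 0.2); the variable names are illustrative only:

```python
# Law of Total Probability: P(A) = P(A|B)P(B) + P(A|B-bar)P(B-bar).
# Numbers taken from the accident-prone insurance example in these notes.
p_B = 0.3            # P(policyholder is accident-prone)
p_A_given_B = 0.4    # P(accident | accident-prone)
p_A_given_Bc = 0.2   # P(accident | not accident-prone)

p_A = p_A_given_B * p_B + p_A_given_Bc * (1 - p_B)
print(round(p_A, 2))   # 0.26
```

Partitioning on {B, B̄} turns one hard marginal probability into two easy conditional ones.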
The LOTP can be very helpful: sometimes P(A|B), P(A|B̄), and P(B) may be easily computed with available information, whereas computing P(A) directly may be difficult.

NOTE: The LOTP follows from the fact that B and B̄ partition S; that is, (a) B and B̄ are disjoint, and (b) B ∪ B̄ = S.

Example 1.25. An insurance company classifies people as "accident-prone" and "non-accident-prone." For a fixed year, the probability that an accident-prone person has an accident is 0.4, and the probability that a non-accident-prone person has an accident is 0.2. The population is estimated to be 30 percent accident-prone. (a) What is the probability that a new policyholder will have an accident?

SOLUTION: Define A = {policyholder has an accident} and B = {policyholder is accident-prone}. Then P(B) = 0.3, P(A|B) = 0.4, P(B̄) = 0.7, and P(A|B̄) = 0.2. By the LOTP,

P(A) = P(A|B)P(B) + P(A|B̄)P(B̄) = 0.4(0.3) + 0.2(0.7) = 0.26. □

(b) Now suppose that the policyholder does have an accident. What is the probability that he was accident-prone?

SOLUTION: We want P(B|A). Note that

P(B|A) = P(A ∩ B)/P(A) = P(A|B)P(B)/P(A) = 0.4(0.3)/0.26 ≈ 0.46. □

NOTE: From this last part, we see that, in general,

P(B|A) = P(A|B)P(B)/P(A) = P(A|B)P(B)/[P(A|B)P(B) + P(A|B̄)P(B̄)].

This is a form of Bayes' Rule.

Example 1.26. A lab test is 95 percent effective in detecting a certain disease when it is present (the sensitivity). However, there is a one percent false-positive rate; that is, the test says that one percent of healthy persons have the disease (so the specificity is 0.99). If 0.5 percent of the population truly has the disease, what is the probability that a person has the disease given that (a) his test is positive? (b) his test is negative?

SOLUTION: Let D = {disease is present} and + = {test is positive}. We are given that P(D) = 0.005, P(+|D) = 0.95 (sensitivity), and P(+|D̄) = 0.01. For (a), we want to compute P(D|+). By Bayes' Rule,

P(D|+) = P(+|D)P(D)/[P(+|D)P(D) + P(+|D̄)P(D̄)] = 0.95(0.005)/[0.95(0.005) + 0.01(0.995)] ≈ 0.323.

The reason this is so low is that P(+|D̄) is large relative to P(D). In (b), we want P(D|−). By Bayes' Rule,

P(D|−) = P(−|D)P(D)/[P(−|D)P(D) + P(−|D̄)P(D̄)] = 0.05(0.005)/[0.05(0.005) + 0.99(0.995)]
≈ 0.00025. □

Table 1.1: The general Bayesian scheme.

Measure before test | Test result | Updated measure
P(D) = 0.005        | +           | P(D|+) ≈ 0.323
P(D) = 0.005        | −           | P(D|−) ≈ 0.00025

NOTE: We have discussed the LOTP and Bayes' Rule in the case of the partition {B, B̄}. However, these rules hold for any partition.

TERMINOLOGY: A sequence of sets B1, B2, ..., Bk is said to form a partition of the sample space S if (a) B1 ∪ B2 ∪ ··· ∪ Bk = S (exhaustive condition), and (b) Bi ∩ Bj = ∅ for all i ≠ j (disjoint condition).

LAW OF TOTAL PROBABILITY (restated): Suppose that B1, B2, ..., Bk forms a partition of S, and suppose P(Bi) > 0 for all i = 1, 2, ..., k. Then

P(A) = Σ_{i=1}^k P(A|Bi)P(Bi).

Proof: Write

A = A ∩ S = A ∩ (B1 ∪ B2 ∪ ··· ∪ Bk) = (A ∩ B1) ∪ (A ∩ B2) ∪ ··· ∪ (A ∩ Bk),

a union of disjoint events. Thus,

P(A) = P[∪_{i=1}^k (A ∩ Bi)] = Σ_{i=1}^k P(A ∩ Bi) = Σ_{i=1}^k P(A|Bi)P(Bi). □

BAYES' RULE (restated): Suppose that B1, B2, ..., Bk forms a partition of S, and suppose that P(A) > 0 and P(Bi) > 0 for all i = 1, 2, ..., k. Then

P(Bj|A) = P(A|Bj)P(Bj) / Σ_{i=1}^k P(A|Bi)P(Bi).

Proof: Simply apply the definition of conditional probability and the multiplication law of probability to get

P(Bj|A) = P(A|Bj)P(Bj)/P(A).

Then, just apply the LOTP to P(A) in the denominator to get the result. □

REMARK: Bayesians will call P(Bj) the prior probability for the event Bj; they call P(Bj|A) the posterior probability of Bj.

Example 1.27. Suppose that a manufacturer buys approximately 60 percent of a raw material (in boxes) from Supplier 1, 30 percent from Supplier 2, and 10 percent from Supplier 3 (these are the prior probabilities). For each supplier, defective rates are as follows: Supplier 1: 0.01, Supplier 2: 0.02, and Supplier 3: 0.03. Suppose that the manufacturer observes a defective box of raw material. (a) What is the probability that it came from Supplier 2? (b) What is the probability that the defective box did not come from Supplier 3?

SOLUTION (a): Let A = {observe defective}, and let B1, B2, and B3, respectively, denote the events that the box comes from Supplier 1, 2, and 3. Note that {B1, B2, B3} partitions the space of possible suppliers. Thus, by Bayes' Rule, we have

P(B2|A) = P(A|B2)P(B2)/[P(A|B1)P(B1) + P(A|B2)P(B2) + P(A|B3)P(B3)] = 0.02(0.3)/[0.01(0.6) + 0.02(0.3) +
0.03(0.1)] = 0.006/0.015 = 0.40.

SOLUTION (b): First, compute the posterior probability P(B3|A). By Bayes' Rule,

P(B3|A) = P(A|B3)P(B3)/[P(A|B1)P(B1) + P(A|B2)P(B2) + P(A|B3)P(B3)] = 0.03(0.1)/[0.01(0.6) + 0.02(0.3) + 0.03(0.1)] = 0.20.

Thus, P(B̄3|A) = 1 − P(B3|A) = 1 − 0.20 = 0.80, by the complement rule. □

2 Discrete Distributions

Complementary reading: Chapter 3 (WMS), except §§3.10 and 3.11.

2.1 Random variables

MATHEMATICAL DEFINITION: A random variable Y is a function whose domain is the sample space S and whose range is the set of real numbers R = {y : −∞ < y < ∞}.

WORKING DEFINITION: A random variable is a variable whose observed value is determined by chance.

Example 2.1. Suppose that our experiment consists of flipping two fair coins. The sample space consists of four sample points:

S = {(H,H), (H,T), (T,H), (T,T)}.

Now, let Y denote the number of heads observed. Before we perform the experiment, we do not know, with certainty, the value of Y. What are the possible values of Y?

Sample point Ei | y
(H,H) | 2
(H,T) | 1
(T,H) | 1
(T,T) | 0

In a profound sense, a random variable Y takes sample points Ei ∈ S and assigns them a real number. This is precisely why we can think of Y as a function; i.e.,

Y[(H,H)] = 2, Y[(H,T)] = 1, Y[(T,H)] = 1, Y[(T,T)] = 0,

so that

P(Y = 2) = P[(H,H)] = 1/4
P(Y = 1) = P[(H,T)] + P[(T,H)] = 1/4 + 1/4 = 1/2
P(Y = 0) = P[(T,T)] = 1/4.

NOTE: From these probability calculations, note that we can

- work on the sample space S and compute probabilities from S, or
- work on R and compute probabilities for events {Y ∈ B}, where B ⊂ R.

NOTATION: We denote a random variable Y with a capital letter; we denote an observed value of Y as y, a lowercase letter. This is standard notation.

Example 2.2. Let Y denote the weight, in ounces, of the next newborn boy in Columbia, SC. Here, Y is a random variable. After the baby is born, we observe y = 128. □

2.2 Probability distributions for discrete random variables

TERMINOLOGY: The support of a random variable Y is the set of all possible values that Y can assume. We will often denote the support set as R. If the random variable Y has a support set R that is either finite or
countable, we call Y a discrete random variable.

Example 2.3. Suppose that in rolling an unbiased die, we record two random variables:

X = face value on the first roll
Y = number of rolls needed to observe a six.

The support of X is RX = {1, 2, 3, 4, 5, 6}. The support of Y is RY = {1, 2, 3, ...}. RX is finite and RY is countable; thus, both random variables X and Y are discrete. □

GOAL: With discrete random variables, we would like to assign probabilities to events of the form {Y = y}. That is, we would like to compute P(Y = y) for any y ∈ R. To do this, one approach is to determine all sample points Ei ∈ S such that Y(Ei) = y, and then compute

p_Y(y) = P(Y = y) = Σ_{Ei ∈ S : Y(Ei) = y} P(Ei),

for all y ∈ R. However, as we will see, this approach is often unnecessary.

TERMINOLOGY: The function p_Y(y) = P(Y = y) is called the probability mass function (pmf) for the discrete random variable Y.

FACTS: The pmf p_Y(y) for a discrete random variable Y consists of two parts:

(a) R, the support set of Y
(b) a probability assignment P(Y = y), for all y ∈ R.

PROPERTIES: The pmf p_Y(y) for a discrete random variable Y satisfies the following:

1. p_Y(y) > 0, for all y ∈ R.
2. The sum of the probabilities, taken over all support points, must equal one; i.e., Σ_{y∈R} p_Y(y) = 1.
3. The probability of an event B is computed by adding the probabilities p_Y(y) for all y ∈ B; i.e., P(Y ∈ B) = Σ_{y∈B} p_Y(y).

Example 2.4. Suppose that we roll an unbiased die twice and observe the face on each roll. Here, the sample space S consists of the 36 equally likely ordered pairs (i, j), for i, j = 1, 2, ..., 6. Let the random variable Y record the sum of the two faces. Here, R = {2, 3, ..., 12}, and

P(Y = 2) = P(all Ei ∈ S where y = 2) = P[(1,1)] = 1/36
P(Y = 3) = P(all Ei ∈ S where y = 3) = P[(1,2)] + P[(2,1)] = 2/36.

The calculation of P(Y = y) is performed similarly for y = 4, 5, ..., 12. The pmf for Y can be given as a formula, table, or graph. In tabular form, the pmf of Y is given by

y:      2     3     4     5     6     7     8     9     10    11    12
p_Y(y): 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

A probability histogram is a display which depicts a pmf in graphical form. The probability histogram for the pmf in Example 2.4 is given in Figure 2.2.
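As a quick cross-check of the table above, the entire pmf can be built by brute-force counting over the 36 sample points. A small Python sketch (exact arithmetic via fractions):

```python
from itertools import product
from fractions import Fraction

# Build the pmf of Y = sum of two fair dice by counting equally likely sample points.
rolls = list(product(range(1, 7), repeat=2))   # all 36 ordered pairs (i, j)
pmf = {y: Fraction(sum(1 for a, b in rolls if a + b == y), 36)
       for y in range(2, 13)}

print(pmf[2], pmf[7], pmf[12])      # 1/36 1/6 1/36
print(sum(pmf.values()))            # 1  (a valid pmf sums to one)

# The counts also match the closed form p(y) = (6 - |7 - y|)/36.
assert all(pmf[y] == Fraction(6 - abs(7 - y), 36) for y in pmf)
```

Counting sample points this way is exactly the "determine all Ei ∈ S with Y(Ei) = y" approach described above; the closed form makes it unnecessary.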
Figure 2.2: Probability histogram for the pmf in Example 2.4.

The astute reader will note that a closed-form formula for the pmf exists; i.e.,

p_Y(y) = (6 − |7 − y|)/36, y = 2, 3, ..., 12; p_Y(y) = 0, otherwise.

Is p_Y(y) valid? Yes, since p_Y(y) > 0 for all support points y = 2, 3, ..., 12, and

Σ_{y∈R} p_Y(y) = Σ_{y=2}^{12} (6 − |7 − y|)/36 = 1.

QUESTION: Define the events B1 = {the sum is 3} and B2 = {the sum is odd}. In Example 2.4,

P(B1) = p_Y(3) = 2/36

and

P(B2) = Σ_{y∈B2} p_Y(y) = P(Y = 3) + P(Y = 5) + P(Y = 7) + P(Y = 9) + P(Y = 11) = 2/36 + 4/36 + 6/36 + 4/36 + 2/36 = 18/36 = 1/2.

Example 2.5. An experiment consists of rolling an unbiased die until the first "6" is observed. Let Y denote the number of rolls needed. Here, the support set is R = {1, 2, ...}. Assuming independent trials, we have

P(Y = 1) = 1/6
P(Y = 2) = (5/6) × (1/6)
P(Y = 3) = (5/6) × (5/6) × (1/6);

in general, the probability that y rolls are needed to observe the first "6" is given by

p_Y(y) = (5/6)^{y−1}(1/6), for y = 1, 2, ...; p_Y(y) = 0, otherwise.

Is this a valid pmf? Clearly, p_Y(y) > 0 for all y ∈ R, and

Σ_{y∈R} p_Y(y) = Σ_{y=1}^∞ (1/6)(5/6)^{y−1} = (1/6) × 1/(1 − 5/6) = 1.

Figure 2.3: Probability histogram for the pmf in Example 2.5.

IMPORTANT: In the last calculation, we have used an important fact concerning infinite geometric series; namely, if a is any real number and |r| < 1, then

Σ_{m=0}^∞ a r^m = a/(1 − r).

The proof of this fact can be found in any standard calculus text. We will use this fact many times in this course.

EXERCISE: In Example 2.5, find P(B), where B = {the first "6" is observed on an odd-numbered roll}.

2.3 Mathematical expectation

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y) and support R. The expected value of Y is given by

E(Y) = Σ_{y∈R} y p_Y(y).

DESCRIPTION: In words, the expected value of a discrete random variable is a weighted average of the possible values the variable can assume, each value y being weighted with the probability, p_Y(y), that the random variable assumes the corresponding value.

MATHEMATICAL ASIDE: For the expected value E(Y) to
exist, the sum above must be absolutely convergent; i.e., we need

Σ_{y∈R} |y| p_Y(y) < ∞.

If E(Y) is not finite (i.e., if E(Y) = ∞), we say that E(Y) does not exist.

Example 2.6. Let the random variable Y have pmf

p_Y(y) = (5 − y)/10, y = 1, 2, 3, 4; p_Y(y) = 0, otherwise.

Figure 2.4: Probability histogram for the pmf in Example 2.6.

The pmf for Y is depicted in Figure 2.4. The expected value of Y is given by

E(Y) = Σ_{y∈R} y p_Y(y) = 1(4/10) + 2(3/10) + 3(2/10) + 4(1/10) = 2. □

Example 2.7 (A random variable whose expected value does not exist). Suppose that the random variable Y has pmf

p_Y(y) = 1/y, y ∈ R; p_Y(y) = 0, otherwise,

where the support set is R = {2^i : i = 1, 2, 3, ...}. It is easy to see that p_Y(y) is a valid pmf, since

Σ_{y∈R} p_Y(y) = Σ_{i=1}^∞ (1/2)^i = 1.

However,

E(Y) = Σ_{y∈R} y p_Y(y) = Σ_{y∈R} y(1/y) = Σ_{y∈R} 1 = ∞,

since R, the support set, is countably infinite. □

INTERPRETATION: How is E(Y) interpreted?

(a) the "center of gravity" of a probability distribution
(b) a long-run average
(c) the first moment of the random variable.

STATISTICAL CONNECTION: When used in a statistical context, the expected value E(Y) is sometimes called the mean of Y, and we might use the symbol μ or μY when discussing it; that is, μ = μY = E(Y). In statistical settings, μ denotes a population parameter.

EXPECTATIONS OF FUNCTIONS OF Y: Let Y be a discrete random variable with pmf p_Y(y) and support R, and suppose that g is a real-valued function. Then, g(Y) is a random variable and

E[g(Y)] = Σ_{y∈R} g(y) p_Y(y).

The proof of this result is given on pp. 90 (WMS). □

MATHEMATICAL ASIDE: For the expected value E[g(Y)] to exist, the sum above must be absolutely convergent; i.e.,

Σ_{y∈R} |g(y)| p_Y(y) < ∞.

If E[g(Y)] is not finite (i.e., if E[g(Y)] = ∞), we say that E[g(Y)] does not exist.

Example 2.8. In Example 2.6, find E(Y²) and E(e^Y).

SOLUTION: The functions g1(Y) = Y² and g2(Y) = e^Y are real functions of Y. From the definition,

E(Y²) = Σ_{y∈R} y² p_Y(y) = 1²(4/10) + 2²(3/10) + 3²(2/10) + 4²(1/10) = 5

and

E(e^Y) = Σ_{y∈R} e^y p_Y(y) = e¹(4/10) + e²(3/10) + e³(2/10) + e⁴(1/10) ≈ 12.78. □

Example 2.9 (The discrete uniform distribution). Suppose that the
random variable X has pmf

p_X(x) = 1/m, x = 1, 2, ..., m; p_X(x) = 0, otherwise,

where m is a fixed positive integer larger than 1. Find the expected value of X.

SOLUTION: The expected value of X is given by

E(X) = Σ_{x∈R} x p_X(x) = Σ_{x=1}^m x(1/m) = (1/m) Σ_{x=1}^m x = (1/m) × m(m+1)/2 = (m+1)/2.

In this calculation, we have used the fact that Σ_{x=1}^m x, the sum of the first m integers, equals m(m+1)/2; this fact can be proven by mathematical induction.

REMARK: If m = 6, then the discrete uniform distribution serves as a probability model for the outcome of an unbiased die:

x:      1    2    3    4    5    6
p_X(x): 1/6  1/6  1/6  1/6  1/6  1/6

The expected outcome is E(X) = (6+1)/2 = 3.5. □

PROPERTIES OF EXPECTATIONS: Let Y be a discrete random variable with pmf p_Y(y) and support R, suppose that g, g1, g2, ..., gk are real-valued functions, and let c be any real constant. Then

(a) E(c) = c
(b) E[c g(Y)] = c E[g(Y)]
(c) E[Σ_{j=1}^k gj(Y)] = Σ_{j=1}^k E[gj(Y)].

Since E(·) enjoys these above-mentioned properties, we sometimes call E(·) a linear operator. Proofs of these facts are easy and are left as exercises.

Example 2.10. In a one-hour period, the number of gallons of a certain toxic chemical that is produced at a local plant, say Y, has the pmf

y:      0    1    2    3
p_Y(y): 0.2  0.3  0.3  0.2

(a) Compute the expected number of gallons produced during a one-hour period. (b) The cost (in tens of dollars) to produce Y gallons is given by the cost function C(Y) = 3 + 12Y + 2Y². What is the expected cost in a one-hour period?

SOLUTION (a): We have that

E(Y) = Σ_{y∈R} y p_Y(y) = 0(0.2) + 1(0.3) + 2(0.3) + 3(0.2) = 1.5.

Thus, we would expect 1.5 gallons of the toxic chemical to be produced per hour. For (b), first compute E(Y²):

E(Y²) = Σ_{y∈R} y² p_Y(y) = 0²(0.2) + 1²(0.3) + 2²(0.3) + 3²(0.2) = 3.3.

Now, we use the aforementioned linearity properties to compute

E[C(Y)] = E(3 + 12Y + 2Y²) = 3 + 12E(Y) + 2E(Y²) = 3 + 12(1.5) + 2(3.3) = 27.6.

Thus, the expected hourly cost is $276.00. □

2.4 Variance

REMARK: We have learned that E(Y) is a measure of the center of a probability distribution. Now, we turn our attention to quantifying the variability in the distribution.

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y), support R, and mean μ. The
variance of Y is given by

σ² = V(Y) = E[(Y − μ)²] = Σ_{y∈R} (y − μ)² p_Y(y).

The standard deviation of Y is given by the positive square root of the variance; i.e., σ = √V(Y).

FACTS ABOUT THE VARIANCE:

(a) σ² ≥ 0.
(b) σ² = 0 if and only if the random variable Y has a degenerate distribution; i.e., all the probability mass is at one point.
(c) The larger (smaller) σ² is, the more (less) spread in the possible values of Y about the mean μ.
(d) σ² is measured in (units)², and σ is measured in the original units.

NOTE: Facts (a), (b), and (c) above remain true if we replace σ² with σ.

THE VARIANCE COMPUTING FORMULA: Let Y be a random variable (not necessarily a discrete random variable) with pmf p_Y(y) and mean E(Y) = μ. Then

V(Y) = E[(Y − μ)²] = E(Y²) − μ².

The formula V(Y) = E(Y²) − μ² is called the variance computing formula.

Proof: Expand the (Y − μ)² term and distribute the expectation operator as follows:

E[(Y − μ)²] = E(Y² − 2μY + μ²) = E(Y²) − 2μE(Y) + μ² = E(Y²) − 2μ² + μ² = E(Y²) − μ². □

Example 2.11 (The discrete uniform distribution). Suppose that the random variable X has pmf

p_X(x) = 1/m, x = 1, 2, ..., m; p_X(x) = 0, otherwise,

where m is a fixed positive integer larger than 1. Find the variance of X.

SOLUTION: We will find σ² = V(X) by using the variance computing formula. In Example 2.9, we computed E(X) = (m+1)/2. We first find E(X²); note that

E(X²) = Σ_{x∈R} x² p_X(x) = (1/m) Σ_{x=1}^m x² = (1/m) × m(m+1)(2m+1)/6 = (m+1)(2m+1)/6.

Above, we have used the fact that Σ_{x=1}^m x², the sum of the first m squared integers, equals m(m+1)(2m+1)/6; this fact can be proven by mathematical induction. The variance of X is equal to

σ² = E(X²) − μ² = (m+1)(2m+1)/6 − [(m+1)/2]² = (m² − 1)/12.

Note that if m = 6, as for our unbiased die example, σ² = 35/12. □

EXERCISE: Find σ² for the pmf in Example 2.6 (notes).

IMPORTANT RESULT: Let Y be a random variable (not necessarily a discrete random variable), and suppose that a and b are real constants. Then

V(a + bY) = b²V(Y).

Proof: Exercise. □

REMARK: Taking b = 0 above, we see that V(a) = 0, for any constant a. This makes sense intuitively: the variance is a measure of variability for a random variable; a constant (such as a) does not vary. Also, by
taking a = 0, we see that V(bY) = b²V(Y). Both of these facts are important, and we will use them repeatedly.

2.5 Moment generating functions

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y) and support R. The moment generating function (mgf) for Y, denoted by m_Y(t), is given by

m_Y(t) = E(e^{tY}) = Σ_{y∈R} e^{ty} p_Y(y),

provided E(e^{tY}) < ∞ for t in an open neighborhood about 0; i.e., there exists some h > 0 such that E(e^{tY}) < ∞ for all t ∈ (−h, h). If E(e^{tY}) does not exist in an open neighborhood of 0, we say that the moment generating function does not exist.

TERMINOLOGY: We call E(Y^k) the kth moment of the random variable Y:

E(Y)  = 1st moment ("mean")
E(Y²) = 2nd moment
E(Y³) = 3rd moment
...

NOTATION: WMS use the notation μ'_k to denote the kth moment; i.e., E(Y^k) = μ'_k. This is common notation in statistics applications, but I rarely use it.

REMARK: The moment generating function (mgf) can be used to generate moments. In fact, from the theory of Laplace transforms, it follows that if the mgf exists, it characterizes an infinite set of moments. So, how do we generate moments?

RESULT: Let Y denote a random variable (not necessarily a discrete random variable) with support R and mgf m_Y(t). Then

E(Y^k) = d^k m_Y(t)/dt^k |_{t=0}.

Note that derivatives are taken with respect to t.

Proof: Assume, without loss, that Y is discrete. With k = 1, we have

d m_Y(t)/dt = d/dt Σ_{y∈R} e^{ty} p_Y(y) = Σ_{y∈R} d/dt e^{ty} p_Y(y) = Σ_{y∈R} y e^{ty} p_Y(y) = E(Y e^{tY}).

Thus, it follows that

d m_Y(t)/dt |_{t=0} = E(Y e^{0·Y}) = E(Y).

Continuing to take higher-order derivatives, we can prove that

d^k m_Y(t)/dt^k |_{t=0} = E(Y^k),

for any integer k ≥ 1. Thus, the result follows. □

MATHEMATICAL ASIDE: In the second line of the proof of the last result, we interchanged the derivative and a (possibly infinite) sum. This is permitted as long as m_Y(t) = E(e^{tY}) exists.

COMPUTING MEANS AND VARIANCES: Let Y denote a random variable (not necessarily a discrete random variable) with mgf m_Y(t). Then, we know that

E(Y) = d m_Y(t)/dt |_{t=0} and E(Y²) = d² m_Y(t)/dt² |_{t=0}.

Thus,

V(Y) = E(Y²) − [E(Y)]² = d² m_Y(t)/dt² |_{t=0} − [d m_Y(t)/dt |_{t=0}]² = m''_Y(0) − [m'_Y(0)]².

REMARK: In many applications, being
able to compute means and variances is important. Thus, we can use the mgf as a tool to do this. This is helpful because sometimes computing

E(Y) = Σ_{y∈R} y p_Y(y)

directly (or even higher-order moments) may be extremely difficult, depending on the form of p_Y(y).

Example 2.12. Suppose that Y is a random variable with pmf

p_Y(y) = (1/2)^y, y = 1, 2, 3, ...; p_Y(y) = 0, otherwise.

Find the mean of Y.

SOLUTION: Using the definition of expected value, the mean of Y is given by

E(Y) = Σ_{y∈R} y p_Y(y) = Σ_{y=1}^∞ y(1/2)^y.

Finding this infinite sum directly is quite difficult (at least, this sum is not a geometric sum). It is easier to use moment generating functions. The mgf of Y is given by

m_Y(t) = E(e^{tY}) = Σ_{y∈R} e^{ty} p_Y(y) = Σ_{y=1}^∞ e^{ty}(1/2)^y = Σ_{y=1}^∞ (e^t/2)^y = (e^t/2)/(1 − e^t/2) = e^t/(2 − e^t),

for values of t < ln 2 (why?). Thus,

d m_Y(t)/dt = [e^t(2 − e^t) − e^t(−e^t)]/(2 − e^t)² = 2e^t/(2 − e^t)²,

so that

E(Y) = d m_Y(t)/dt |_{t=0} = 2(1)/(2 − 1)² = 2. □

Example 2.13. Let the random variable Y have pmf p_Y(y) given by

p_Y(y) = (3 − y)/6, y = 0, 1, 2; p_Y(y) = 0, otherwise.

For this probability distribution, simple calculations show that E(Y) = 2/3 and V(Y) = 5/9. Let's "check" these calculations using the mgf. It is given by

m_Y(t) = E(e^{tY}) = Σ_{y∈R} e^{ty} p_Y(y) = (3/6)e^{t(0)} + (2/6)e^{t(1)} + (1/6)e^{t(2)} = 3/6 + (2/6)e^t + (1/6)e^{2t}.

Taking derivatives of m_Y(t) with respect to t, we get

d m_Y(t)/dt = (2/6)e^t + (2/6)e^{2t}

and

d² m_Y(t)/dt² = (2/6)e^t + (4/6)e^{2t}.

Thus,

E(Y) = d m_Y(t)/dt |_{t=0} = 2/6 + 2/6 = 2/3

and

E(Y²) = d² m_Y(t)/dt² |_{t=0} = 2/6 + 4/6 = 1,

so that

V(Y) = E(Y²) − [E(Y)]² = 1 − (2/3)² = 5/9.

So, in this example, we can use the mgf to get E(Y) and V(Y), or we can compute E(Y) and V(Y) directly. We get the same answer, as we should. □

REMARK: Not only is the mgf a tool for computing moments, but it also helps us to characterize a probability distribution. How? When an mgf exists, it happens to be unique. Thus, if two random variables have the same mgf, then they have the same probability distribution. Sometimes this is referred to as the uniqueness property of mgfs; it is based on the uniqueness of Laplace transforms. For now, however, it suffices to envision the mgf as a special expectation that generates moments. This, in
turn, helps us to compute means and variances of random variables.

2.6 Binomial distribution

BERNOULLI TRIALS: Many experiments consist of a sequence of trials, where

(i) each trial results in a "success" or a "failure,"
(ii) the trials are independent, and
(iii) the probability of success, denoted by p (0 < p < 1), is the same on every trial.

TERMINOLOGY: In a sequence of n Bernoulli trials, denote by Y the number of successes out of n, where n is fixed. We call Y a binomial random variable, and say that "Y has a binomial distribution with parameters n and success probability p." Shorthand notation is Y ~ b(n, p).

Example 2.14. Each of the following situations represents a binomial experiment. Are you satisfied with the Bernoulli assumptions in each instance?

(a) Suppose we flip a fair coin 10 times and let Y denote the number of tails in 10 flips. Here, Y ~ b(n = 10, p = 0.5).
(b) In an agricultural experiment, forty percent of all plots respond to a certain treatment. I have four plots of land to be treated. If Y is the number of plots that respond to the treatment, then Y ~ b(n = 4, p = 0.4).
(c) In rural Kenya, the prevalence rate for HIV is estimated to be around 8 percent. Let Y denote the number of HIV-infected individuals in a sample of 740. Here, Y ~ b(n = 740, p = 0.08).
(d) It is known that screws produced by a certain company do not meet specifications (i.e., are defective) with probability 0.001. Let Y denote the number of defectives in a package of 40. Then, Y ~ b(n = 40, p = 0.001). □

DERIVATION: We now derive the pmf of a binomial random variable. That is, we need to compute p_Y(y) = P(Y = y), for each possible value of y ∈ R. Recall that Y is the number of "successes" in n Bernoulli trials, so the support set is R = {y : y = 0, 1, 2, ..., n}.

QUESTION: In a sequence of n trials, how can we get exactly y successes? Denoting S = success and F = failure, a possible sample point may be

SSFSFSFFSF···SF.

Because the trials are independent, the probability that we get any particular ordering of y successes
and n − y failures is p^y(1 − p)^{n−y}. Now, how many ways are there to choose y successes from n trials? We know that there are (n choose y) ways to do this. Thus, the pmf for Y is, for 0 < p < 1,

p_Y(y) = (n choose y) p^y (1 − p)^{n−y}, y = 0, 1, 2, ..., n; p_Y(y) = 0, otherwise.

Figure 2.5: Probability histogram for the number of plots which respond to treatment. This represents the b(n = 4, p = 0.4) model in Example 2.14(b).

Example 2.15. In Example 2.14(b), assume that Y ~ b(n = 4, p = 0.4). Here are the probability calculations for this binomial model:

P(Y = 0) = p_Y(0) = (4 choose 0)(0.4)⁰(0.6)⁴ = 1 × (0.4)⁰ × (0.6)⁴ = 0.1296
P(Y = 1) = p_Y(1) = (4 choose 1)(0.4)¹(0.6)³ = 4 × (0.4)¹ × (0.6)³ = 0.3456
P(Y = 2) = p_Y(2) = (4 choose 2)(0.4)²(0.6)² = 6 × (0.4)² × (0.6)² = 0.3456
P(Y = 3) = p_Y(3) = (4 choose 3)(0.4)³(0.6)¹ = 4 × (0.4)³ × (0.6)¹ = 0.1536
P(Y = 4) = p_Y(4) = (4 choose 4)(0.4)⁴(0.6)⁰ = 1 × (0.4)⁴ × (0.6)⁰ = 0.0256

The probability histogram is depicted in Figure 2.5. □

Example 2.16. In a small clinical trial with 20 patients, let Y denote the number of patients that respond to a new skin rash treatment. The physicians assume that a binomial model is appropriate, so that Y ~ b(n = 20, p), where p denotes the probability of response to the treatment. (In a statistical setting, p would be an unknown parameter that we desire to estimate.) For this problem, we'll assume that p = 0.4. Compute (a) P(Y = 5), (b) P(Y ≥ 5), and (c) P(Y < 10).

(a) P(Y = 5) = p_Y(5) = (20 choose 5)(0.4)⁵(0.6)¹⁵ ≈ 0.0746.

(b) P(Y ≥ 5) = Σ_{y=5}^{20} (20 choose y)(0.4)^y(0.6)^{20−y}. This computation involves using the binomial pmf 16 times and adding the results.

TRICK: Instead of computing the sum Σ_{y=5}^{20} (20 choose y)(0.4)^y(0.6)^{20−y} directly, we can write P(Y ≥ 5) = 1 − P(Y ≤ 4), by the complement rule. We do this because WMS's Appendix III, Table 1 (pp. 783-785) contains binomial probability calculations of the form

F_Y(a) = P(Y ≤ a) = Σ_{y=0}^a (n choose y) p^y(1 − p)^{n−y},

for different n and p. With n = 20 and p = 0.4, we see from Table 1 that P(Y ≤ 4) = 0.051. Thus, P(Y ≥ 5) = 1 − 0.051 = 0.949.

(c) P(Y < 10) = P(Y ≤ 9) = 0.755, from Table 1. □

REMARK: The function F_Y(y) = P(Y ≤ y) is called the cumulative distribution function; we'll talk more about this function in the next chapter.
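The calculations in Examples 2.15 and 2.16 can be reproduced directly from the pmf, without the appendix table. A short Python sketch (the function names here are mine, not from WMS):

```python
from math import comb

def binom_pmf(y, n, p):
    # b(n, p) pmf: C(n, y) p^y (1 - p)^(n - y)
    return comb(n, y) * p**y * (1 - p) ** (n - y)

# Example 2.15: the b(4, 0.4) model
probs = [round(binom_pmf(y, 4, 0.4), 4) for y in range(5)]
print(probs)    # [0.1296, 0.3456, 0.3456, 0.1536, 0.0256]

# Example 2.16: the b(20, 0.4) model, using the complement trick
n, p = 20, 0.4

def cdf(a):
    # F_Y(a) = P(Y <= a), the quantity tabulated in WMS's Table 1
    return sum(binom_pmf(y, n, p) for y in range(a + 1))

print(round(binom_pmf(5, n, p), 4))   # (a) P(Y = 5)  ~ 0.0746
print(round(1 - cdf(4), 3))           # (b) P(Y >= 5) ~ 0.949
print(round(cdf(9), 3))               # (c) P(Y < 10) ~ 0.755
```

Summing the pmf for the cdf is exactly what the printed table does; the complement rule simply trades a 16-term sum for a 5-term one.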
Figure 2.6: Probability histogram for the number of patients responding to treatment. This represents the b(n = 20, p = 0.4) model in Example 2.16.

CURIOSITY: Is the binomial pmf a valid pmf? Clearly, p_Y(y) > 0 for all y. To check that the pmf sums to one, consider the binomial expansion

[p + (1 − p)]^n = Σ_{y=0}^n (n choose y) p^y (1 − p)^{n−y}.

The LHS clearly equals 1, and the RHS represents the sum of the b(n, p) pmf over its support. Thus, p_Y(y) is valid.

MGF FOR THE BINOMIAL DISTRIBUTION: Suppose that Y ~ b(n, p). Then the mgf of Y is given by

m_Y(t) = E(e^{tY}) = Σ_{y=0}^n e^{ty} (n choose y) p^y q^{n−y} = Σ_{y=0}^n (n choose y)(pe^t)^y q^{n−y} = (q + pe^t)^n,

where q = 1 − p. The last step follows from noting that Σ_{y=0}^n (n choose y)(pe^t)^y q^{n−y} is the binomial expansion of (q + pe^t)^n. □

MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION: We want to compute E(Y) and V(Y), where Y ~ b(n, p). To do this, we will use the mgf. Taking the derivative of m_Y(t) with respect to t, we get

d m_Y(t)/dt = n(q + pe^t)^{n−1} pe^t.

Thus,

E(Y) = d m_Y(t)/dt |_{t=0} = n(q + p)^{n−1} p = np,

since q + p = 1. Now, we need to find the second moment. By using the product rule for derivatives, we have

d² m_Y(t)/dt² = n(n − 1)(q + pe^t)^{n−2}(pe^t)² + n(q + pe^t)^{n−1} pe^t.

Thus,

E(Y²) = d² m_Y(t)/dt² |_{t=0} = n(n − 1)(q + p)^{n−2} p² + n(q + p)^{n−1} p = n(n − 1)p² + np.

Finally, the variance is calculated by appealing to the variance computing formula; i.e.,

V(Y) = E(Y²) − [E(Y)]² = n(n − 1)p² + np − (np)² = np − np² = np(1 − p). □

Example 2.17. Artichokes are a marine-climate vegetable and thrive in the cooler coastal climates. Most will grow on a wide range of soils, but produce best on a deep, fertile, well-drained soil. Suppose that 15 artichoke seeds are planted in identical soils and temperatures, and let Y denote the number of seeds that germinate. If 60 percent of all seeds germinate, on average, and we assume a b(15, 0.6) probability model for Y, the mean number of seeds that will germinate is

E(Y) = np = 15(0.6) = 9.

The variance is σ² = np(1 − p) = 15(0.6)(0.4) = 3.6 (seeds²). The standard deviation is σ = √3.6 ≈ 1.9 seeds. □

SPECIAL BINOMIAL DISTRIBUTION: In the b(n, p) family, when
n = 1, the binomial pmf reduces to

p_Y(y) = p^y(1 − p)^{1−y}, y = 0, 1; p_Y(y) = 0, otherwise.

This is sometimes called the Bernoulli distribution. Shorthand notation is Y ~ b(1, p). The sum of n independent b(1, p) random variables actually follows a b(n, p) distribution.

2.7 Geometric distribution

TERMINOLOGY: Imagine an experiment where Bernoulli trials are observed. If Y denotes the trial on which the first success occurs, then Y is said to follow a geometric distribution with parameter p, the probability of success on any one trial (0 < p < 1). This is sometimes written as Y ~ geom(p). The pmf for Y is given by

p_Y(y) = (1 − p)^{y−1} p, y = 1, 2, 3, ...; p_Y(y) = 0, otherwise.

RATIONALE: The form of this pmf makes intuitive sense: we need y − 1 failures, each of which occurs with probability 1 − p, and then a success on the yth trial (this occurs with probability p). By independence, we multiply

(1 − p) × (1 − p) × ··· × (1 − p) × p = (1 − p)^{y−1} p,

where the product contains y − 1 failure terms.

NOTE: Clearly, p_Y(y) > 0 for all y. Does p_Y(y) sum to one? Note that

Σ_{y=1}^∞ (1 − p)^{y−1} p = p Σ_{x=0}^∞ (1 − p)^x = p × 1/[1 − (1 − p)] = 1.

In the last step, we recognized that Σ_{x=0}^∞ (1 − p)^x is an infinite geometric sum with common ratio 1 − p. □

Example 2.18. Biology students are checking the eye color of fruit flies. For each fly, the probability of observing white eyes is p = 0.25. What is the probability the first white-eyed fly will be observed among the first five flies that we check?

SOLUTION: Let Y denote the number of flies needed to observe the first white-eyed fly. We can envision each fly as a Bernoulli trial (each fly either has white eyes or it does not). If we assume that the flies are independent, then a geometric model is appropriate; i.e., Y ~ geom(p = 0.25), so that, for example, p_Y(5) = (0.75)⁴(0.25) ≈ 0.08. We need to compute

P(Y ≤ 5) = Σ_{y=1}^5 P(Y = y) ≈ 0.76.

The pmf for the geom(p = 0.25) model is depicted in Figure 2.7. □

MGF FOR THE GEOMETRIC DISTRIBUTION: Suppose that Y ~ geom(p). Then the mgf of Y is given by

m_Y(t) = pe^t/(1 − qe^t),

where q = 1 − p, for t < −ln q. Proof: Exercise. □

MEAN AND VARIANCE OF THE GEOMETRIC DISTRIBUTION: With the mgf, we can derive the mean and variance. Differentiating the mgf, we get
d m_Y(t)/dt = d/dt [pe^t/(1 − qe^t)] = [pe^t(1 − qe^t) + pe^t(qe^t)]/(1 − qe^t)² = pe^t/(1 − qe^t)².

Thus,

E(Y) = d m_Y(t)/dt |_{t=0} = p/(1 − q)² = p/p² = 1/p.

Similar calculations show that

E(Y²) = (1 + q)/p².

Figure 2.7: Probability histogram for the number of flies needed to find the first white-eyed fly. This represents the geom(p = 0.25) model in Example 2.18.

Finally,

V(Y) = E(Y²) − [E(Y)]² = (1 + q)/p² − (1/p)² = q/p².

Example 2.19. At an apple orchard in Maine, bags of "20 lbs" are continually observed until the first underweight bag is discovered. Suppose that four percent of bags are under-filled. If we assume that the bags are independent, and if Y denotes the number of bags observed, then Y ~ geom(p = 0.04). The mean number of bags we will observe is

E(Y) = 1/p = 1/0.04 = 25 bags.

The variance is

V(Y) = q/p² = 0.96/(0.04)² = 600 (bags²). □

2.8 Negative binomial distribution

NOTE: The negative binomial distribution can be motivated from two perspectives:

- as a generalization of the geometric, or
- as a "reversal" of the binomial.

Recall that the geometric random variable was defined to be the number of trials needed to observe the first success in a sequence of Bernoulli trials.

TERMINOLOGY: Imagine an experiment where Bernoulli trials are observed. If Y denotes the trial on which the rth success occurs (r ≥ 1), then Y has a negative binomial distribution with parameters r and p, where p denotes the probability of success on any one trial (0 < p < 1). This is sometimes written as Y ~ nib(r, p).

PMF FOR THE NEGATIVE BINOMIAL: The pmf for Y ~ nib(r, p) is given by

p_Y(y) = (y−1 choose r−1) p^r (1 − p)^{y−r}, y = r, r + 1, r + 2, ...; p_Y(y) = 0, otherwise.

Of course, when r = 1, the nib(r, p) pmf reduces to the geom(p) pmf.

RATIONALE: The logic behind the form of p_Y(y) is as follows. If the rth success occurs on the yth trial, then r − 1 successes must have occurred during the 1st y − 1 trials. The total number of sample points in the underlying sample space S where this is the case is given by the binomial coefficient (y−1 choose r−1), which counts the number of ways
to order r − 1 successes and y − r failures in the first y − 1 trials. The probability of any particular such ordering is, by independence, p^{r−1} q^{y−r}, where q = 1 − p. Now, on the yth trial, we observe the rth success; this occurs with probability p. Thus, putting it all together, we get

  p_Y(y) = C(y−1, r−1) p^{r−1} q^{y−r} × p = C(y−1, r−1) p^r q^{y−r},

where the first factor pertains to the first y − 1 trials.

Example 2.20. A botanist in Iowa City is observing oak trees for the presence of a certain disease. From past experience, it is known that 30 percent of all trees are infected (p = 0.30). Treating each tree as a Bernoulli trial (i.e., each tree is infected/not), what is the probability that she will observe the 3rd infected tree (r = 3) on the 6th or 7th observed tree?

SOLUTION: Let Y denote the tree on which she observes the 3rd infected tree. Then Y ~ nib(r = 3, p = 0.3). We want to compute P(Y = 6 or Y = 7):

  P(Y = 6) = C(5, 2)(0.3)^3(1 − 0.3)^{6−3} ≈ 0.0926
  P(Y = 7) = C(6, 2)(0.3)^3(1 − 0.3)^{7−3} ≈ 0.0972.

Thus, P(Y = 6 or Y = 7) = P(Y = 6) + P(Y = 7) ≈ 0.0926 + 0.0972 = 0.1898. □

RELATIONSHIP WITH THE BINOMIAL: Recall that in a binomial experiment, we fix the number of Bernoulli trials, n, and we observe the number of successes. However, in a negative binomial experiment, we fix the number of successes we are to observe, r, and we continue to observe Bernoulli trials until we reach that success. This is another way to think about the negative binomial model.

MGF FOR THE NEGATIVE BINOMIAL DISTRIBUTION: Suppose that Y ~ nib(r, p). The mgf of Y is given by

  m_Y(t) = [ pe^t / (1 − qe^t) ]^r,

where q = 1 − p, for all t < −ln q. Before we prove this, let's state and prove a lemma.

LEMMA: Suppose that r is a nonnegative integer. Then,

  Σ_{y=r}^∞ C(y−1, r−1) (qe^t)^{y−r} = (1 − qe^t)^{−r}.

Proof of lemma. Consider the function f(w) = (1 − w)^{−r}, where r is a nonnegative integer. It is easy to show that

  f'(w) = r(1 − w)^{−(r+1)}
  f''(w) = r(r + 1)(1 − w)^{−(r+2)}.

In general,

  f^{(z)}(w) = r(r + 1)···(r + z − 1)(1 − w)^{−(r+z)},

where f^{(z)}(w) denotes the zth derivative of f with respect to w. Note that

  f^{(z)}(w) |_{w=0} = r(r + 1)···(r + z − 1).

Now, consider writing the Maclaurin series expansion of f(w); i.e., a Taylor series
expansion of f(w) about w = 0. This expansion is given by

  f(w) = Σ_{z=0}^∞ f^{(z)}(0) w^z / z!
       = Σ_{z=0}^∞ [ r(r + 1)···(r + z − 1) / z! ] w^z
       = Σ_{z=0}^∞ C(z + r − 1, r − 1) w^z.

Now, letting w = qe^t and z = y − r, the lemma is proven for 0 < qe^t < 1. □

MGF: Now that we are finished with the lemma, let's find the mgf of the nib(r, p) random variable. With q = 1 − p,

  m_Y(t) = E(e^{tY}) = Σ_{y=r}^∞ e^{ty} C(y−1, r−1) p^r q^{y−r}
         = (pe^t)^r Σ_{y=r}^∞ C(y−1, r−1) (qe^t)^{y−r}
         = (pe^t)^r (1 − qe^t)^{−r}
         = [ pe^t / (1 − qe^t) ]^r,

for t < −ln q, where the penultimate step follows from the lemma. □

REMARK: Showing that the nib(r, p) distribution sums to one can be done by using a similar series expansion as above. We omit it for brevity.

MEAN AND VARIANCE OF THE NEGATIVE BINOMIAL DISTRIBUTION: For a nib(r, p) random variable, with q = 1 − p,

  E(Y) = r/p  and  V(Y) = rq/p^2.

Proof. Exercise. □

2.9 Hypergeometric distribution

SETTING: Consider a collection of N objects (e.g., people, poker chips, plots of land, etc.) and suppose that we have two dichotomous classes, Class 1 and Class 2. For example, the objects and classes might be

  Poker chips: red/blue
  People: infected/not infected
  Plots of land: respond to treatment/not.

From the collection of N objects, we observe a sample of n < N of them, and record Y, the number of objects in Class 1 (i.e., the number of "successes").

REMARK: This sounds like a binomial setup! However, the difference is that, here, N, the population size, is finite (the population size, theoretically, is assumed to be infinite in the binomial model). Thus, if we sample from a population of objects without replacement, then the success probability changes from trial to trial. This violates the binomial model assumptions! Of course, if N is large (i.e., in very large populations), the two models will be similar, because the change in the probability of success from trial to trial will be small (maybe so small that it is not of practical concern).

HYPERGEOMETRIC DISTRIBUTION: Envision a collection of n objects sampled at random and without replacement from a population of size N, where r denotes the size of
Class 1 and N − r denotes the size of Class 2. Let Y denote the number of objects in the sample that belong to Class 1. Then Y has a hypergeometric distribution, written Y ~ hyper(N, n, r), where

  N = total number of objects
  r = number in the 1st class (e.g., "success")
  N − r = number in the 2nd class (e.g., "failure")
  n = number of objects sampled.

HYPERGEOMETRIC PMF: The pmf for Y ~ hyper(N, n, r) is given by

  p_Y(y) = C(r, y) C(N−r, n−y) / C(N, n),  y ∈ R,

and p_Y(y) = 0 otherwise, where the support set is R = { y : max(0, n − (N − r)) ≤ y ≤ min(n, r) }.

BREAKDOWN: In the hyper(N, n, r) pmf, we have three parts:

  C(r, y) = number of ways to choose y Class 1 objects from r
  C(N−r, n−y) = number of ways to choose n − y Class 2 objects from N − r
  C(N, n) = number of sample points.

REMARK: In the hypergeometric model, it follows that p_Y(y) sums to 1 over the support R, but we omit this proof for brevity (see Exercise 3.176, pp. 148, WMS).

Example 2.21. In my fish tank at home, there are 50 fish. Ten have been tagged. If I catch 7 fish at random and without replacement, what is the probability that exactly two are tagged?

SOLUTION: Here, N = 50 (total number of fish), n = 7 (sample size), r = 10 (tagged fish; Class 1), N − r = 40 (untagged fish; Class 2), and y = 2 (number of tagged fish caught). Thus,

  P(Y = 2) = p_Y(2) = C(10, 2) C(40, 5) / C(50, 7) ≈ 0.2964.

What about the probability that my catch contains at most two tagged fish?

SOLUTION: Here, we want

  P(Y ≤ 2) = P(Y = 0) + P(Y = 1) + P(Y = 2)
           = [C(10,0)C(40,7) + C(10,1)C(40,6) + C(10,2)C(40,5)] / C(50,7)
           ≈ 0.1867 + 0.3843 + 0.2964 = 0.8674. □

Example 2.22. A supplier ships parts to another company in lots of 25 parts. The receiving company has an acceptance sampling plan which adopts the following acceptance rule: "sample 5 parts at random and without replacement. If there are no defectives in the sample, accept the entire lot; otherwise, reject the entire lot." Let Y denote the number of defectives in the sampled parts (i.e., out of 5). Then Y ~ hyper(25, 5, r), where r denotes the number of defectives in the lot (in real life, r is unknown). Define

  OC(p) ≡ P(Y = 0),

where p = r/25 denotes the true proportion of defectives in the lot. The symbol OC(p) denotes the probability of accepting the lot, which is a function of p. Consider
the following table, whose entries are computed using the above probability expression:

  r     p = r/25    OC(p)
  0     0.00        1.00
  1     0.04        0.80
  2     0.08        0.63
  3     0.12        0.50
  4     0.16        0.38
  5     0.20        0.29
  10    0.40        0.06
  15    0.60        0.005

REMARK: The graph of OC(p) versus p is sometimes called an operating characteristic curve. Of course, as r (or, equivalently, p) increases, the probability of accepting the lot decreases. Acceptance sampling is important in statistical process control, used in engineering and manufacturing settings. In practice, lot sizes may be very large (e.g., N = 1000, etc.), and developing sound sampling plans is crucial in order to avoid using defective parts in finished products. □

MEAN AND VARIANCE OF THE HYPERGEOMETRIC DISTRIBUTION: If Y ~ hyper(N, n, r), then

  E(Y) = n(r/N)

and

  V(Y) = n (r/N) (1 − r/N) [ (N − n)/(N − 1) ].

We will prove this result later in the course.

RELATIONSHIP WITH THE BINOMIAL: As noted earlier, the binomial and hypergeometric models are similar. The key difference is that in a binomial experiment, p does not change from trial to trial, but it does in the hypergeometric setting (noticeably, if N is small). However, one can show that, for y fixed,

  C(r, y) C(N−r, n−y) / C(N, n)  →  C(n, y) p^y (1−p)^{n−y},

as N → ∞ with r/N → p; i.e., the hypergeometric pmf converges to the b(n, p) pmf. The upshot is this: if N is large (i.e., the population size is large), a binomial probability calculation with p = r/N closely approximates the corresponding hypergeometric probability calculation. See pp. 123 (WMS).

Example 2.23. In a small town, there are 900 right-handed individuals and 100 left-handed individuals. We take a sample of size n = 20 individuals from this town, at random and without replacement. What is the probability that 4 or more people in the sample are left-handed?

SOLUTION: Let X denote the number of left-handed individuals in our sample. Let's compute the probability P(X ≥ 4) using both the binomial and hypergeometric models.

- Hypergeometric: Here, N = 1000, r = 100, N − r = 900, and n = 20. Thus,

    P(X ≥ 4) = 1 − P(X ≤ 3) = 1 − Σ_{x=0}^{3} C(100, x) C(900, 20−x) / C(1000, 20) ≈ 0.130947.

- Binomial: Here, n = 20 and p = r/N = 0.10. Thus,

    P(X ≥ 4) = 1 − P(X ≤ 3) = 1 − Σ_{x=0}^{3} C(20, x)(0.1)^x(0.9)^{20−x} ≈ 0.132953. □

REMARK: Of
course, since the binomial and hypergeometric models are similar when N is large, their means and variances are similar too. Note the similarities:

  Binomial:        E(X) = np,       V(X) = np(1 − p)
  Hypergeometric:  E(X) = n(r/N),   V(X) = n(r/N)(1 − r/N)[(N − n)/(N − 1)];

recall that the quantity r/N → p as N → ∞, and the correction factor (N − n)/(N − 1) → 1.

2.10 Poisson distribution

TERMINOLOGY: Let the number of occurrences in a given continuous interval of time (or space) be counted. A Poisson process enjoys the following properties:

1. the numbers of occurrences in non-overlapping intervals are independent random variables;
2. the probability of an occurrence in a sufficiently short interval is proportional to the length of the interval;
3. the probability of 2 or more occurrences in a sufficiently short interval is zero.

GOAL: Suppose that an experiment satisfies the above three conditions, and let Y denote the number of occurrences in an interval of length one. Our goal is to find an expression for p_Y(y) = P(Y = y), the pmf of Y.

APPROACH: Envision partitioning the unit interval [0, 1] into n subintervals, each of size 1/n. Now, if n is sufficiently large (i.e., much larger than y), then we can approximate the probability that y events occur in this unit interval by finding the probability that exactly one event (occurrence) occurs in exactly y of the subintervals.

- By Property 2, we know that the probability of one event in any one subinterval is proportional to the subinterval's length, say λ/n, where λ is the proportionality constant.
- By Property 3, the probability of more than one occurrence in any subinterval is zero (for n large).
- Consider the occurrence/non-occurrence of an event in each subinterval as a Bernoulli trial. Then, by Property 1, we have a sequence of n Bernoulli trials, each with probability of success p = λ/n. Thus, a binomial calculation gives

    P(Y = y) ≈ C(n, y) (λ/n)^y (1 − λ/n)^{n−y}.

Now, to get a better approximation, we let n grow without bound. Then,

  lim_{n→∞} P(Y = y) = lim_{n→∞} C(n, y) (λ/n)^y (1 − λ/n)^{n−y}
    = lim_{n→∞} [ n(n−1)···(n−y+1)/n^y ] [ λ^y/y! ] (1 − λ/n)^n (1 − λ/n)^{−y}
    ≡ lim_{n→∞} a_n b_n c_n d_n.

Now, the limit of the product is the product of the limits. Thus,

  lim_{n→∞} a_n = lim_{n→∞} n(n−1)···(n−y+1)/n^y = 1,
  lim_{n→∞}
b_n = λ^y/y!,
  lim_{n→∞} c_n = lim_{n→∞} (1 − λ/n)^n = e^{−λ},
  lim_{n→∞} d_n = lim_{n→∞} (1 − λ/n)^{−y} = 1.

Thus,

  p_Y(y) = λ^y e^{−λ}/y!,  y = 0, 1, 2, ...,

and p_Y(y) = 0 otherwise. This is the pmf of a Poisson random variable with parameter λ. We sometimes write Y ~ Poisson(λ). That p_Y(y) sums to one is easily seen as

  Σ_{y∈R} p_Y(y) = Σ_{y=0}^∞ λ^y e^{−λ}/y! = e^{−λ} Σ_{y=0}^∞ λ^y/y! = e^{−λ} e^{λ} = 1,

since e^{λ} = Σ_{y=0}^∞ λ^y/y!, the Maclaurin series expansion of e^{λ}. □

EXAMPLES OF POISSON PROCESSES:

(a) counting the number of people in a certain community living to 100 years of age
(b) counting the number of customers entering a post office in a given day
(c) counting the number of α-particles discharged from a radioactive substance in a fixed period of time
(d) counting the number of blemishes on a piece of artificial turf
(e) counting the number of chocolate chips in a Chips Ahoy cookie.

Example 2.24. The number of cars abandoned weekly on a certain highway is modeled using a Poisson distribution with λ = 2.2. In a given week, what is the probability that (a) no cars are abandoned? (b) exactly one car is abandoned? (c) at most one car is abandoned? (d) at least one car is abandoned?

SOLUTIONS: Let Y denote the number of cars abandoned weekly.

(a) P(Y = 0) = p_Y(0) = (2.2)^0 e^{−2.2}/0! ≈ 0.1108
(b) P(Y = 1) = p_Y(1) = (2.2)^1 e^{−2.2}/1! ≈ 0.2438
(c) P(Y ≤ 1) = P(Y = 0) + P(Y = 1) = p_Y(0) + p_Y(1) ≈ 0.1108 + 0.2438 = 0.3456
(d) P(Y ≥ 1) = 1 − P(Y = 0) = 1 − p_Y(0) ≈ 1 − 0.1108 = 0.8892. □

Figure 2.8: Probability histogram for the number of abandoned cars. This represents the Poisson(2.2) model in Example 2.24.

REMARK: WMS's Appendix III, Table 3 (pp. 787-791) includes an impressive table for Poisson probabilities of the form

  F_Y(a) = P(Y ≤ a) = Σ_{y=0}^{a} λ^y e^{−λ}/y!.

Recall that this function is called the cumulative distribution function of Y. This makes computing compound event probabilities much easier.

MGF FOR THE POISSON DISTRIBUTION: Suppose that Y ~ Poisson(λ). The mgf of Y, for all t, is given by

  m_Y(t) = E(e^{tY}) = Σ_{y=0}^∞ e^{ty} λ^y e^{−λ}/y!
         = e^{−λ} Σ_{y=0}^∞ (λe^t)^y/y!
         = e^{−λ} e^{λe^t} = exp[λ(e^t − 1)].

MEAN
AND VARIANCE OF THE POISSON DISTRIBUTION: With the mgf, we can derive the mean and variance. Differentiating the mgf, we get

  (d/dt) m_Y(t) = (d/dt) exp[λ(e^t − 1)] = λe^t exp[λ(e^t − 1)].

Thus,

  E(Y) = (d/dt) m_Y(t) |_{t=0} = λe^0 exp[λ(e^0 − 1)] = λ.

Now we need to find the second moment. By using the product rule for derivatives, we have

  (d²/dt²) m_Y(t) = (d/dt) λe^t exp[λ(e^t − 1)]
                  = λe^t exp[λ(e^t − 1)] + (λe^t)² exp[λ(e^t − 1)].

Thus, E(Y²) = λ + λ², and

  V(Y) = E(Y²) − [E(Y)]² = λ + λ² − λ² = λ.

REVELATION: With a Poisson model, the mean and variance are always equal. □

Example 2.25. Suppose that Y denotes the number of monthly defects observed at an automotive plant. From past experience, engineers believe the Poisson model is appropriate and that Y ~ Poisson(7).

QUESTION 1: What is the probability that, in any given month, we observe 11 or more defectives?

SOLUTION: We want to compute

  P(Y ≥ 11) = 1 − P(Y ≤ 10) = 1 − 0.901 = 0.099.  (Table 3)

QUESTION 2: What about the probability that, in a given year, we have two or more months with 11 or more defectives?

SOLUTION: First, we assume that the 12 months are independent (is this reasonable?), and call the event B = {11 or more defects in a month} a "success." Thus, under our independence assumptions, and viewing each month as a trial, we have a sequence of 12 Bernoulli trials with success probability p = P(B) = 0.099. Let X denote the number of months where we observe 11 or more defects. Then X ~ b(12, 0.099), and

  P(X ≥ 2) = 1 − P(X = 0) − P(X = 1)
           = 1 − C(12, 0)(0.099)^0(1 − 0.099)^{12} − C(12, 1)(0.099)^1(1 − 0.099)^{11}
           ≈ 1 − 0.2862 − 0.3774 = 0.3364. □

POISSON PROCESSES OF ARBITRARY LENGTH: If events (occurrences) in a Poisson process occur at a rate of λ per unit time (or space), then the number of occurrences in an interval of length t also follows a Poisson distribution, with mean λt.

Example 2.26. Phone calls arrive at a switchboard according to a Poisson process, at a rate of λ = 3 per minute. Thus, if Y represents the number of calls received in 5 minutes, we have that Y ~ Poisson(15). The probability that 8 or fewer calls come in during a 5-minute span is given by

  P(Y ≤ 8) = Σ_{y=0}^{8} (15)^y e^{−15}/y! ≈ 0.0374,

from Table 3. □

POISSON–
BINOMIAL LINK: We have seen that the hypergeometric and binomial models are related; as it turns out, so are the Poisson and binomial models. This should not be surprising, because we derived the Poisson pmf by appealing to a binomial approximation.

RELATIONSHIP: Suppose that Y ~ b(n, p). If n is large and p is small, then

  C(n, y) p^y (1 − p)^{n−y} ≈ λ^y e^{−λ}/y!,  for y ∈ R = {0, 1, 2, ..., n},

where λ = np.

Example 2.27. Hepatitis C (HCV) is a viral infection that causes cirrhosis and cancer of the liver. Since HCV is transmitted through contact with infectious blood, screening donors is important to prevent further transmission. The World Health Organization has projected that HCV will be a major burden on the US health care system before the year 2020. For public health reasons, researchers take a sample of n = 1875 blood donors and screen each individual for HCV. If 3 percent of the entire population is infected, what is the probability that 50 or more are HCV-positive?

SOLUTION: Let Y denote the number of HCV-infected individuals in our sample. We compute the probability P(Y ≥ 50) using both the binomial and Poisson models.

- Binomial: Here, n = 1875 and p = 0.03. Thus,

    P(Y ≥ 50) = Σ_{y=50}^{1875} C(1875, y)(0.03)^y(0.97)^{1875−y} ≈ 0.818783.

- Poisson: Here, λ = np = 1875(0.03) = 56.25. Thus,

    P(Y ≥ 50) = Σ_{y=50}^{∞} (56.25)^y e^{−56.25}/y! ≈ 0.814932.

As we can see, the Poisson approximation is quite good. □

RELATIONSHIP: One can see that the hypergeometric, binomial, and Poisson models are related in the following way:

  hyper(N, n, r)  ↔  b(n, p)  ↔  Poisson(λ).

The first link results when N is large and r/N → p. The second link results when n is large and p is small, so that np → λ. When these situations are combined, as you might suspect, one can approximate the hypergeometric model with a Poisson model!

3 Continuous Distributions

Complementary reading from WMS: Chapter 4 (omit 4.11).

3.1 Introduction

RECALL: In the last chapter, we focused on discrete random variables. Recall that a discrete random variable is one that can assume
only a finite or countable number of values. We also learned about probability mass functions (pmfs). Loosely speaking, these were functions that told us how to assign probabilities and to which points we assign probabilities.

TERMINOLOGY: A random variable is said to be continuous if its support set is uncountable; i.e., the random variable can assume an uncountably infinite number of values. We will present an alternate definition shortly.

3.2 Cumulative distribution functions

NEW: We now introduce a new function associated with any random variable (discrete or continuous).

TERMINOLOGY: The cumulative distribution function (cdf) of a random variable Y, denoted by F_Y(y), is given by

  F_Y(y) = P(Y ≤ y), for all real y.

Note that the cdf is defined for all real y, not just for those values of y in R, the support set of Y.

REMARK: Every random variable, discrete or continuous, has a cdf. We'll start by computing some cdfs for discrete random variables.

Example 3.1. Let the random variable Y have pmf p_Y(y) > 0 for y = 0, 1, 2, and p_Y(y) = 0 otherwise. Consider the following probability calculations:

  F_Y(y) = 0, for any y < 0
  F_Y(0) = P(Y ≤ 0) = P(Y = 0)
  F_Y(1) = P(Y ≤ 1) = P(Y = 0) + P(Y = 1)
  F_Y(2) = P(Y ≤ 2) = P(Y = 0) + P(Y = 1) + P(Y = 2) = 1.

Furthermore,

- for any y < 0, P(Y ≤ y) = 0;
- for any 0 < y < 1, P(Y ≤ y) = P(Y = 0);
- for any 1 < y < 2, P(Y ≤ y) = P(Y = 0) + P(Y = 1).

Thus, the cdf is the step function

  F_Y(y) = 0,                       y < 0
         = P(Y = 0),                0 ≤ y < 1
         = P(Y = 0) + P(Y = 1),     1 ≤ y < 2
         = 1,                       y ≥ 2.

Note that we have defined F_Y(y) for all real y. Some points are worth mentioning concerning the graphs of the pmf and cdf:

- PMF: The height of the bar above y is the probability that Y assumes that value. For any y not equal to 0, 1, or 2, p_Y(y) = 0.
- CDF: F_Y(y) is a nondecreasing function (see the theoretical properties below), and 0 ≤ F_Y(y) ≤ 1; this makes sense since F_Y(y) is a probability! The height of the "jump" at a particular point is equal to the probability associated with that point.

THEORETICAL PROPERTIES: Let Y be a random variable (discrete or continuous), and suppose that F_Y(y) is the cdf for Y. Then,

(i) lim_{y→−∞} F_Y(y) = 0;
(ii) lim_{y→+∞} F_Y(y) = 1;
(iii) F_Y(y) is a right-continuous function; that is, for any real a, lim_{y→a+} F_Y(y) = F_Y(a);
and (iv) F_Y(y) is a nondecreasing function; that is, for any y1 ≤ y2, F_Y(y1) ≤ F_Y(y2).

EXERCISE: Graph the cdf for the b(5, 0.2) and Poisson(2) distributions.

3.3 Continuous random variables

ALTERNATE DEFINITION: A random variable is said to be continuous if its cdf F_Y(y) is a continuous function of y.

RECALL: The cdfs associated with discrete random variables are step functions. Such functions are certainly not continuous; however, they are still right-continuous.

TERMINOLOGY: Let Y be a continuous random variable with cdf F_Y(y). The probability density function (pdf) for Y, denoted by f_Y(y), is given by

  f_Y(y) = (d/dy) F_Y(y),

provided that (d/dy) F_Y(y) ≡ F'_Y(y) exists. Furthermore, appealing to the Fundamental Theorem of Calculus, we know that

  F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt.

REMARK: These equations illustrate the key relationships linking pdfs and cdfs for continuous random variables!

PROPERTIES OF CONTINUOUS PDFs: Suppose that Y is a continuous random variable with pdf f_Y(y) and support R. Then,

1. f_Y(y) > 0, for all y ∈ R;
2. ∫_R f_Y(y) dy = 1; i.e., the total area under the pdf equals one;
3. the probability of an event B is computed by integrating the pdf f_Y(y) over B; i.e., P(Y ∈ B) = ∫_B f_Y(y) dy, for any B ⊂ R.

REMARK: Compare these to the analogous results for the discrete case (see page 28 in the notes). The only difference is that, in the continuous case, integrals replace sums.

Example 3.2. Suppose that Y has the pdf

  f_Y(y) = 1/2,  0 < y < 2,

and f_Y(y) = 0 otherwise. This pdf is depicted in Figure 3.9. We want to find the cdf. To do this, we need to compute F_Y(y) = P(Y ≤ y) for all real y. There are three cases:

- when y ≤ 0, we have

    F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt = ∫_{−∞}^{y} 0 dt = 0;

- when 0 < y < 2, we have

    F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt = ∫_{−∞}^{0} 0 dt + ∫_{0}^{y} (1/2) dt = y/2;

Figure 3.9: Probability density function f_Y(y) in Example 3.2.

- when y ≥ 2, we have

    F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt = ∫_{−∞}^{0} 0 dt + ∫_{0}^{2} (1/2) dt + ∫_{2}^{y} 0 dt = 1.

Putting it all together, we have

  F_Y(y) = 0,    y < 0
         = y/2,  0 ≤ y < 2
         = 1,    y ≥ 2.

Example 3.3. The remission time for a certain group of leukemia patients, Y, measured in months, has cdf

  F_Y(y) = 1 − e^{−y/3},  y ≥ 0,

and F_Y(y) = 0 for y < 0.
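As a quick numerical sketch (not in the original notes), the remission-time cdf F_Y(y) = 1 − e^{−y/3} from Example 3.3 can be checked against the theoretical cdf properties: it tends to 0 on the left, tends to 1 on the right, and is nondecreasing.

```python
import math

def F(y):
    # cdf of the remission-time model: F(y) = 1 - exp(-y/3) for y >= 0, else 0
    return 0.0 if y < 0 else 1.0 - math.exp(-y / 3)

# property (i): F(y) = 0 for y far to the left
print(F(-100.0))            # 0.0

# property (ii): F(y) -> 1 as y grows
print(round(F(100.0), 6))   # 1.0 (to six decimal places)

# property (iv): nondecreasing on a grid of points
ys = [y / 10 for y in range(-50, 301)]
print(all(F(a) <= F(b) for a, b in zip(ys, ys[1:])))  # True
```

The same three checks apply verbatim to any cdf, including the step-function cdfs of the discrete examples above.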
Figure 3.10: Cumulative distribution function F_Y(y) in Example 3.3.

This cdf is depicted in Figure 3.10. Let's calculate the pdf of Y. Again, we need to consider all possible cases:

- when y < 0,

    f_Y(y) = (d/dy) F_Y(y) = (d/dy) 0 = 0;

- when y ≥ 0,

    f_Y(y) = (d/dy) F_Y(y) = (d/dy) (1 − e^{−y/3}) = (1/3) e^{−y/3}.

Thus, putting it all together, we get

  f_Y(y) = (1/3) e^{−y/3},  y ≥ 0,

and f_Y(y) = 0 otherwise. This pdf is depicted in Figure 3.11. □

EXERCISE: For the cdfs in Examples 3.2 and 3.3, verify that these functions satisfy the four theoretical properties for any cdf.

Figure 3.11: Probability density function f_Y(y) in Example 3.3. This is a probability model for leukemia remission times.

UBIQUITOUS RESULT: Recall that one of the properties of a continuous pdf is that

  P(Y ∈ B) = ∫_B f_Y(y) dy, for any B ⊂ R.

If B = {y : a ≤ y ≤ b}; i.e., B = [a, b], then

  P(a ≤ Y ≤ b) = ∫_a^b f_Y(y) dy = F_Y(b) − F_Y(a).

Example 3.4. In Example 3.3, what is the probability that a randomly selected patient will have a remission time between 2 and 5 months? That is, what is P(2 ≤ Y ≤ 5)?

SOLUTION: We can attack this two ways: one using the cdf, one with the pdf.

- CDF (refer to Figure 3.10):

    P(2 ≤ Y ≤ 5) = F_Y(5) − F_Y(2) = (1 − e^{−5/3}) − (1 − e^{−2/3}) = e^{−2/3} − e^{−5/3} ≈ 0.325.

- PDF (refer to Figure 3.11):

    P(2 ≤ Y ≤ 5) = ∫_2^5 (1/3) e^{−y/3} dy = [ −e^{−y/3} ]_2^5 = e^{−2/3} − e^{−5/3} ≈ 0.325. □

FACT: If Y is a continuous random variable with pdf f_Y(y), then P(Y = a) = 0 for any real constant a. This follows since

  P(Y = a) = P(a ≤ Y ≤ a) = ∫_a^a f_Y(y) dy = 0.

Thus, for continuous random variables, single points are assigned zero probability. This is a key difference between discrete and continuous random variables. An immediate consequence of the above fact is that, for any continuous random variable Y,

  P(a ≤ Y ≤ b) = P(a ≤ Y < b) = P(a < Y ≤ b) = P(a < Y < b),

and the common value is ∫_a^b f_Y(y) dy.

Example 3.5. Suppose that Y represents the time (in seconds) until a certain chemical reaction takes place in a manufacturing process, say, and varies according to the pdf

  f_Y(y) = c y e^{−y/2},  y ≥ 0,

and f_Y(y) = 0 otherwise.

(a) Find the c that makes this a valid pdf.
(b) Compute P(3.5 ≤
Y < 4.5).

SOLUTION: (a) To find c, recall that ∫_0^∞ f_Y(y) dy = 1. Thus,

  c ∫_0^∞ y e^{−y/2} dy = 1.

Figure 3.12: Probability density function f_Y(y) in Example 3.5. This is a probability model for chemical reaction times.

Using an integration-by-parts argument, with u = y and dv = e^{−y/2} dy, we have that

  ∫_0^∞ y e^{−y/2} dy = [ −2y e^{−y/2} ]_0^∞ + 2 ∫_0^∞ e^{−y/2} dy = 0 + 2(2) = 4.

Solving for c, we get c = 1/4. This pdf is depicted in Figure 3.12.

(b) Using integration by parts again, we get

  P(3.5 ≤ Y < 4.5) = ∫_{3.5}^{4.5} (1/4) y e^{−y/2} dy ≈ 0.135.

Thus, the probability that the chemical reaction takes place between 3.5 and 4.5 seconds is about 0.14. □

DISCLAIMER: We will use integration by parts repeatedly in this course!

3.4 Mathematical expectation

3.4.1 Expected values

TERMINOLOGY: Let Y be a continuous random variable with pdf f_Y(y) and support R. The expected value (or mean) of Y is given by

  E(Y) = ∫_R y f_Y(y) dy.

If E(|Y|) = ∞, we say that the expected value does not exist.

RECALL: When Y is a discrete random variable with pmf p_Y(y), the expected value of Y is

  E(Y) = Σ_{y∈R} y p_Y(y).

So, again, we have the obvious similarities between the continuous and discrete cases.

Example 3.6. Suppose that Y has pdf given by

  f_Y(y) = 2y,  0 < y < 1,

and f_Y(y) = 0 otherwise. This pdf is depicted in Figure 3.13. Here, the expected value of Y is given by

  E(Y) = ∫_R y f_Y(y) dy = ∫_0^1 2y² dy = (2y³/3) |_0^1 = 2/3. □

EXPECTATIONS OF FUNCTIONS OF Y: Let Y be a continuous random variable with pdf f_Y(y) and support R, and suppose that g is a real-valued function. Then, g(Y) is a random variable and

  E[g(Y)] = ∫_R g(y) f_Y(y) dy.

If E[|g(Y)|] = ∞, we say that the expected value does not exist.
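The integrals above can also be checked numerically. Here is a minimal Python sketch (not in the original notes) that approximates E[g(Y)] = ∫ g(y) f_Y(y) dy with a midpoint Riemann sum for the pdf f_Y(y) = 2y on (0, 1) from Example 3.6; the grid size n is an arbitrary choice.

```python
def expect(g, f, a, b, n=20_000):
    # midpoint Riemann sum approximating the integral of g(y) * f(y) over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) * f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda y: 2 * y  # pdf from Example 3.6

print(round(expect(lambda y: 1.0, f, 0.0, 1.0), 4))    # total area = 1.0
print(round(expect(lambda y: y, f, 0.0, 1.0), 4))      # E(Y)   = 0.6667
print(round(expect(lambda y: y * y, f, 0.0, 1.0), 4))  # E(Y^2) = 0.5
```

The same routine, with a different g, reproduces any of the expectations computed in this section.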
ylnydy 2ltiy21ny 0 fyy and support R7 suppose that 97 9192 gk are real valued functions7 and let 0 be any real constant Then a Ec c b E10900 0E19Y1 C 19122191501 2271 E19700 These properties are identical to those we discussed in the discrete case PAGE 75 CHAPTER 3 STATMATH 5117 J TEBBS 342 Variance A SPECIAL EXPECTATION Let Y be a continuous random variable with pdf fyy7 support R7 and mean u The variance of Y is given by awszwgtm Awwmhmm Example 38 With the pdf in Example 367 2y 0 lty lt1 fYl 07 otherwise7 compute 02 VY SOLUTIONS Recall that M EY 237 from Example 36 Using the de nition above7 1 22 1 2VY quot 2d7 U 0 y 3 ny 18 Alternatively7 we could use the variance computing formula ie7 the variance of Y is WY EY2 ECNZ We know EY 23 and EY2 12 from Example 37 Thus7 g2 VY 12 7 23 118 D 343 Moment generating functions ANOTHER SPECIAL EXPECTATION Let Y be a continuous random variable with pdf fyy and support R The moment generating function mgf for Y7 denoted by m t7 is given by mmEwoAwnm provided Eety lt 00 for t in an open neighborhood about 0 ie7 there exists some h gt 0 such that Eety lt 00 for all t E flu1 lf Eety does not exist in an open neighborhood of 07 we say that the moment generating function does not exist PAGE 76 CHAPTER 3 STATMATH 5117 J TEBBS Example 39 Suppose that Y has a pdf given by 0 My 5 ygt 07 otherwise Find the moment generating function of Y and use it to compute EY and VY SOLUTION mylttgtElte gt etyfyltygtdy 0 ewww 0 Cjifw4gt lit 110 7 1 7 171 for values oft lt 1 With the mgf7 we can calculate the mean and variance Differentiating MW 2 gm 1 2 f w 170 To nd the variance7 we rst nd the second moment f aid 1 272 1 3 d my m 14 lit 39 MW the mgf7 we get Thus7 Thus7 the second moment is The computing formula gives 02 VY EY2 7 EY2 2 712 1 D EXERCISE Find EY and VY directly ie7 do not use the mgf Are your answers the same as above PAGE 77 CHAPTER 3 STATMATH 5117 J TEBBS 35 Uniform distribution TERMINOLOGY A random variable Y is said to have a 
uniform distribution from θ1 to θ2 (θ1 < θ2) if its pdf is given by

  f_Y(y) = 1/(θ2 − θ1),  θ1 < y < θ2,

and f_Y(y) = 0 otherwise. Shorthand notation is Y ~ U(θ1, θ2). That the U(θ1, θ2) pdf integrates to one is obvious, since

  ∫_{θ1}^{θ2} dy/(θ2 − θ1) = y/(θ2 − θ1) |_{θ1}^{θ2} = (θ2 − θ1)/(θ2 − θ1) = 1.

REMARKS: Sometimes we call θ1 and θ2 the model parameters. A popular member of the U(θ1, θ2) family is the U(0, 1) distribution; i.e., a uniform distribution with θ1 = 0 and θ2 = 1; this model is used extensively in computer programs to simulate random numbers. The pdf for a U(0, 2) random variable is depicted in Figure 3.9.

UNIFORM CDF: The cdf F_Y(y) for a U(θ1, θ2) distribution is given by

  F_Y(y) = 0,                      y ≤ θ1
         = (y − θ1)/(θ2 − θ1),     θ1 < y < θ2
         = 1,                      y ≥ θ2.

Example 3.10. In a sedimentation experiment, the sizes of the particles studied are uniformly distributed between 0.1 and 0.5 millimeters. What proportion of particles are less than 0.4 millimeters?

SOLUTION: Let Y denote the size of a randomly selected particle. Then Y ~ U(0.1, 0.5) and

  P(Y < 0.4) = ∫_{0.1}^{0.4} dy/(0.5 − 0.1) = (0.4 − 0.1)/(0.5 − 0.1) = 0.75. □

MEAN AND VARIANCE: If Y ~ U(θ1, θ2), then

  μ = E(Y) = (θ1 + θ2)/2

and

  V(Y) = (θ2 − θ1)²/12.

These values can be computed using the pdf directly (try it!) or by using the mgf (below).

MOMENT GENERATING FUNCTION: Suppose that Y ~ U(θ1, θ2). The mgf of Y is given by

  m_Y(t) = (e^{tθ2} − e^{tθ1}) / [ t(θ2 − θ1) ],  t ≠ 0,

and m_Y(0) = 1.

3.6 Normal distribution

TERMINOLOGY: A random variable Y is said to have a normal distribution if its pdf is given by

  f_Y(y) = [ 1/(σ√(2π)) ] exp[ −(y − μ)²/(2σ²) ],  −∞ < y < ∞.

Shorthand notation is Y ~ N(μ, σ²). There are two parameters in the normal distribution: the mean μ and the variance σ².

FACTS ABOUT ANY NORMAL DISTRIBUTION:

(a) The pdf is symmetric about μ; that is, for any a, f_Y(μ − a) = f_Y(μ + a).
(b) The points of inflection are located at y = μ ± σ.
(c) Any normal distribution can be transformed to a "standard" normal distribution.
(d) lim_{|y|→∞} f_Y(y) = 0.

TERMINOLOGY: A normal distribution with mean μ = 0 and variance σ² = 1 is called the standard normal distribution. It is conventional to let Z denote a random variable that follows a standard normal distribution; we often write Z ~ N(0, 1).

IMPORTANT: Tabled values of the standard normal probabilities are given in Appendix III, Table 4 (pp. 792, WMS). This table turns out to be very helpful, since the
often write Z N N01 IMPORTANT Tabled values of the standard normal probabilities are given in Ap pendix lll Table 4 pp 792 of WMS This table turns out to be very helpful since the PAGE 79 CHAPTER 3 STATMATH 5117 J TEBBS integral y 1 t 2 Fyy 2WU6 TH dt does not exist in closed form Speci cally the table provides values of 1FZ2PZgt2OO 1 e q Zdu 27m As mentioned any normal distribution can be transformed to a standard77 normal dis tribution we7ll see how later so there is only a need for one table of probabilities Of course probabilities like PZ gt 2 can be obtained using software too VALIDITY To show that the NM02 pdf integrates to one let 2 dz ldy and dy Udz Now de ne in 3771 Then 00 2 1 67y7 Cl 00 27W 00 1 2 7 1 2d 6 2 00 x 27139 Since I gt 0 it suf ces to show that I2 1 However note that 2 oo 1 eigds gtlt 00 1 eigd 00 x27T 00 x 27139 y 1 oo 00 2 y2 7 exp 7 27139 700 700 2 Now switching to polar coordinates ie letting z rcos0 and y rsin 0 we get 2 dxdy yz 7quot2cos2 6 sin2 9 r2 and dxdy rdrd ie the Jacobian of the transformation from zy space to 736 space Thus we write 2 27r co 1 7722 I 76 rdrd 90 70 2 1 27r 7 re r22dr d0 2 90 70 1 27r 7722 00 7 5 d9 27139 90 70 1 27r 27f 7 d6 i l D 27139 90 27139 90 PAGE 80 CHAPTER 3 STATMATH 5117 J TEBBS MOMENT GENERATING FUNCTION Suppose that Y N Np02 The rngf of Y de ned for all t is given by Uztz myt exp Mt Proof 1 00 gty y 2dy 27m 700 De ne b ty 7 y the exponent in the last integral Then 7 1y7M2 Md 0 1 2 2 t if 2 1 2029 M9M 1 7gb 7 2W 7 202w 2 12 7 2W 02m le complete the square 1 1 7 7 22 7 2m 02m u 0202 7 u 0202 le V add and subtract 1 2 2 1 2 2 2 1ly atl 1 at7M1 7 1 2 1 2 2 42 2 7 1 ya w 2M010tiltgt7 csay where a M Uzt Thus the last integral above is equal to 00 21 e 22lty7agt2dygt x go 700 7TH Na72 density Now nally note 60 E expc expmt 02t22 Thus the result follows E EXERCISE Use the rngf to verify that EY M and VY 02 IMPORTANT Suppose that Y N NW 02 Then the random variable Y 7 M 2 has a normal distribution with mean 0 
and variance 1; that is, Z ~ N(0, 1).

Proof. Let Z = (Y − μ)/σ. The mgf of Z is given by

  m_Z(t) = E(e^{tZ}) = E{ exp[ t(Y − μ)/σ ] }
         = e^{−μt/σ} E( e^{(t/σ)Y} )
         = e^{−μt/σ} m_Y(t/σ)
         = e^{−μt/σ} exp[ μ(t/σ) + σ²(t/σ)²/2 ] = e^{t²/2},

which is the mgf of a N(0, 1) random variable. Thus, by the uniqueness of moment generating functions, we know that Z ~ N(0, 1). □

USEFULNESS: From the last result, we know that if Y ~ N(μ, σ²), then

  P(y1 < Y < y2) = P( (y1 − μ)/σ < Z < (y2 − μ)/σ ),

where Z = (Y − μ)/σ ~ N(0, 1). As a result,

  P(y1 < Y < y2) = Φ( (y2 − μ)/σ ) − Φ( (y1 − μ)/σ ),

where Φ(·) denotes the cdf of the N(0, 1) distribution. Note also that Φ(−z) = 1 − Φ(z), for z > 0.

Example 3.11. In Florida, young large-mouth bass were studied to examine the level of mercury contamination, Y, measured in parts per million, which varies according to a normal distribution with mean μ = 18 and variance σ² = 16. This model is depicted in Figure 3.14. (a) What proportion of contamination levels are between 11 and 21 parts per million? (b) For this model, ninety percent of all contamination levels will be above what mercury level?

Figure 3.14: Probability density function f_Y(y) in Example 3.11. A model for mercury contamination (y = mercury level, ppm) in large-mouth bass.

SOLUTIONS: (a) In this part, we want P(11 < Y < 21). By standardizing, we see that

  P(11 < Y < 21) = P( (11 − 18)/4 < (Y − 18)/4 < (21 − 18)/4 )
                 = P(−1.75 < Z < 0.75)
                 = Φ(0.75) − Φ(−1.75)
                 = 0.7734 − 0.0401 = 0.7333.

(b) We want to find the 10th percentile of the Y ~ N(18, 16) distribution; i.e., the value y such that 0.90 = P(Y > y). To find y, first we find the z so that 0.90 = P(Z > z); then, we "unstandardize" via y = μ + σz. From Table 4, we see z = −1.28, so that

  y = 18 + 4(−1.28) = 12.88.

Thus, 90 percent of all contamination levels are larger than 12.88 parts per million. □

3.7 The gamma family of pdfs

THE GAMMA FAMILY: In this section, we examine an important family of probability distributions; namely, those in the gamma family. There are three
"named distributions," in particular:

- exponential distribution
- gamma distribution
- χ² distribution.

NOTE: The exponential and gamma distributions are popular models for lifetime random variables; i.e., random variables that record "time to event" measurements, such as the lifetimes of an electrical component, death times for human subjects, etc. Other lifetime distributions include the lognormal, Weibull, and log-gamma probability models.

3.7.1 Exponential distribution

TERMINOLOGY: A random variable Y is said to have an exponential distribution with parameter β > 0 if its pdf is given by

  f_Y(y) = (1/β) e^{−y/β},  y > 0,

and f_Y(y) = 0 otherwise.

NOTATION: Shorthand notation is Y ~ exponential(β). The value β determines the scale of the distribution; it is sometimes called the scale parameter. That the exponential density function integrates to one is easily shown (verify!).

MOMENT GENERATING FUNCTION: Suppose that Y ~ exponential(β). The mgf of Y is given by

  m_Y(t) = 1/(1 − βt),

for values of t < 1/β.

Proof. Let η = β/(1 − βt), so that ty − y/β = −y/η. Then,

  m_Y(t) = E(e^{tY}) = ∫_0^∞ e^{ty} (1/β) e^{−y/β} dy
         = (1/β) ∫_0^∞ e^{−y/η} dy
         = (1/β) [ −η e^{−y/η} ]_0^∞ = η/β = 1/(1 − βt).

Note that, for the last expression to be correct, we need η > 0; i.e., we need t < 1/β. □

MEAN AND VARIANCE: Suppose that Y ~ exponential(β). The mean and variance of Y are given by

  E(Y) = β  and  V(Y) = β².

Proof. Exercise. □

Example 3.12. The lifetime of a certain electrical component has an exponential distribution with mean β = 500 hours. Engineers using this component are particularly interested in the time until failure. What is the probability that a randomly selected component (a) fails before 100 hours? (b) lasts between 250 and 750 hours?

SOLUTION: With β = 500, the pdf for Y is given by

  f_Y(y) = (1/500) e^{−y/500},  y > 0,

and f_Y(y) = 0 otherwise. This pdf is depicted in Figure 3.15. Thus, the probability of failing before 100 hours is given by

  P(Y < 100) = ∫_0^{100} (1/500) e^{−y/500} dy ≈ 0.181.

Similarly, the probability of failing between 250 and 750 hours is

  P(250 < Y < 750) = ∫_{250}^{750} (1/500) e^{−y/500} dy ≈ 0.383. □
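Both probabilities in Example 3.12 follow immediately from the closed-form exponential cdf (derived just below), so they can be sketched in a few lines of Python (the numbers match the integrals above):

```python
import math

theta = 500.0  # mean lifetime (hours) from Example 3.12

def F(y):
    # exponential cdf: F(y) = 1 - exp(-y/theta) for y >= 0, else 0
    return 0.0 if y < 0 else 1.0 - math.exp(-y / theta)

print(round(F(100), 3))           # P(Y < 100)       -> 0.181
print(round(F(750) - F(250), 3))  # P(250 < Y < 750) -> 0.383
```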
[Figure 3.15: Probability density function $f_Y(y)$ in Example 3.12. A model for electrical component lifetimes.]

CUMULATIVE DISTRIBUTION FUNCTION: Suppose that $Y \sim \text{exponential}(\theta)$. Then, the cdf of $Y$ exists in closed form and is given by
$$F_Y(y) = \begin{cases} 0, & y \le 0 \\ 1 - e^{-y/\theta}, & y > 0. \end{cases}$$
The cdf for the exponential random variable in Example 3.12 is depicted in Figure 3.16.

THE MEMORYLESS PROPERTY: Suppose that $Y \sim \text{exponential}(\theta)$, and suppose that $r$ and $s$ are both positive constants. Then,
$$P(Y > r + s \mid Y > r) = P(Y > s).$$
That is, given that the lifetime $Y$ has exceeded $r$, the probability that $Y$ exceeds $r + s$ (i.e., an additional $s$ units) is the same as if we were to look at $Y$ unconditionally lasting until time $s$. Put another way, that $Y$ has actually "made it" to time $r$ has been forgotten! The exponential random variable is the only continuous random variable that enjoys the memoryless property.

[Figure 3.16: Cumulative distribution function $F_Y(y)$ in Example 3.12. A model for electrical component lifetimes.]

RELATIONSHIP WITH A POISSON PROCESS: Suppose that we are observing events according to a Poisson process with rate $\lambda = 1/\theta$, and let the random variable $W$ denote the time until the first occurrence. Then, $W \sim \text{exponential}(\theta)$.
Proof. Clearly, $W$ is a continuous random variable with nonnegative support. Thus, for $w \ge 0$, we have
$$F_W(w) = P(W \le w) = 1 - P(W > w) = 1 - P(\text{no events in } (0, w]) = 1 - e^{-\lambda w}.$$
Substituting $\lambda = 1/\theta$, we find that $F_W(w) = 1 - e^{-w/\theta}$, the cdf of an exponential random variable with mean $\theta$. Thus, the result follows. $\Box$

3.7.2 Gamma distribution

THE GAMMA FUNCTION: The gamma function is a function of $t$, defined for all $t > 0$ as
$$\Gamma(t) = \int_0^\infty y^{t-1}e^{-y}\,dy.$$

FACTS ABOUT THE GAMMA FUNCTION:
1. A simple argument shows that $\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1)$, for all $\alpha > 1$.
2. If $\alpha$ is an integer, $\Gamma(\alpha) = (\alpha-1)!$. For example, $\Gamma(5) = 4! = 24$.

TERMINOLOGY: A random variable $Y$ is said to have a gamma
distribution with parameters $\alpha > 0$ and $\beta > 0$ if its pdf is given by
$$f_Y(y) = \begin{cases} \dfrac{1}{\Gamma(\alpha)\beta^\alpha}\,y^{\alpha-1}e^{-y/\beta}, & y > 0 \\ 0, & \text{otherwise.} \end{cases}$$
Shorthand notation is $Y \sim \text{gamma}(\alpha, \beta)$.

REMARK: This model is indexed by two parameters. We call $\alpha$ the shape parameter and $\beta$ the scale parameter. The gamma probability model is extremely flexible! By changing the values of $\alpha$ and $\beta$, the gamma pdf can assume many shapes. Thus, the gamma model is very popular for modeling lifetime data.

IMPORTANT NOTE: When $\alpha = 1$, the gamma pdf reduces to the exponential($\beta$) pdf!

REMARK: To see that the gamma pdf integrates to one, consider the change of variable $u = y/\beta$. Then, $du = dy/\beta$ and
$$\int_0^\infty \frac{1}{\Gamma(\alpha)\beta^\alpha}\,y^{\alpha-1}e^{-y/\beta}\,dy = \frac{1}{\Gamma(\alpha)}\int_0^\infty u^{\alpha-1}e^{-u}\,du = \frac{\Gamma(\alpha)}{\Gamma(\alpha)} = 1.$$

MGF FOR THE GAMMA DISTRIBUTION: Suppose that $Y \sim \text{gamma}(\alpha, \beta)$. Then, for values of $t < 1/\beta$, the mgf of $Y$ is given by
$$m_Y(t) = \left(\frac{1}{1-\beta t}\right)^\alpha.$$
Proof. Let $\eta = \beta(1-\beta t)^{-1}$, so that $\eta^{-1} = \beta^{-1} - t$ and $ty - y/\beta = -y/\eta$. Then,
$$m_Y(t) = E(e^{tY}) = \int_0^\infty e^{ty}\,\frac{1}{\Gamma(\alpha)\beta^\alpha}\,y^{\alpha-1}e^{-y/\beta}\,dy = \frac{\eta^\alpha}{\beta^\alpha}\underbrace{\int_0^\infty \frac{1}{\Gamma(\alpha)\eta^\alpha}\,y^{\alpha-1}e^{-y/\eta}\,dy}_{\text{gamma}(\alpha,\,\eta)\ \text{density}} = \left(\frac{1}{1-\beta t}\right)^\alpha. \quad \Box$$

MEAN AND VARIANCE: If $Y \sim \text{gamma}(\alpha, \beta)$, then $E(Y) = \alpha\beta$ and $V(Y) = \alpha\beta^2$. Proof. Exercise. $\Box$

TERMINOLOGY: When talking about the gamma($\alpha, \beta$) density function, it is often helpful to think of the formula in two parts:
- the kernel: $y^{\alpha-1}e^{-y/\beta}$
- the constant: $\dfrac{1}{\Gamma(\alpha)\beta^\alpha}$.

Example 3.13. Suppose that $Y$ has pdf given by
$$f_Y(y) = \begin{cases} c\,y^2 e^{-y/4}, & y > 0 \\ 0, & \text{otherwise.} \end{cases}$$
(a) What is the value of $c$ that makes this a valid pdf? (b) Give an integral expression that equals $P(Y < 8)$. How could we solve this equation? (c) What is the mgf of $Y$? (d) What are the mean and standard deviation of $Y$?

[Figure 3.17: Probability density function $f_Y(y)$ in Example 3.13.]

RELATIONSHIP WITH A POISSON PROCESS: Suppose that we are observing events according to a Poisson process with rate $\lambda = 1/\beta$, and let the random variable $W$ denote the time until the $\alpha$th occurrence. Then, $W \sim \text{gamma}(\alpha, \beta)$.
Proof. Clearly, $W$ is continuous with nonnegative support. Thus, for $w \ge 0$, we have
$$F_W(w) = P(W \le w) = 1 - P(W > w) = 1 - P(\text{fewer than } \alpha \text{ events in } (0, w]) = 1 - \sum_{j=0}^{\alpha-1}\frac{e^{-\lambda w}(\lambda w)^j}{j!}.$$
The pdf of
$W$, $f_W(w)$, is equal to $F_W'(w)$, provided that this derivative exists. For $w > 0$,
$$f_W(w) = F_W'(w) = \lambda e^{-\lambda w} + \sum_{j=1}^{\alpha-1}\left[\frac{\lambda e^{-\lambda w}(\lambda w)^j}{j!} - \frac{\lambda e^{-\lambda w}(\lambda w)^{j-1}}{(j-1)!}\right] = \frac{\lambda e^{-\lambda w}(\lambda w)^{\alpha-1}}{(\alpha-1)!} \quad \text{(telescoping sum)}.$$
Substituting $\lambda = 1/\beta$, for $w > 0$,
$$f_W(w) = \frac{1}{\Gamma(\alpha)\beta^\alpha}\,w^{\alpha-1}e^{-w/\beta},$$
which is the pdf for the gamma($\alpha, \beta$) distribution. $\Box$

3.7.3 $\chi^2$ distribution

TERMINOLOGY: In the gamma($\alpha, \beta$) family, when $\alpha = \nu/2$, for any integer $\nu$, and $\beta = 2$, we call the resulting distribution a $\chi^2$ distribution with $\nu$ degrees of freedom. If $Y$ has a $\chi^2$ distribution with $\nu$ degrees of freedom, we write $Y \sim \chi^2(\nu)$.

NOTE: At this point, it suffices to know that the $\chi^2$ distribution is really just a "special" gamma distribution. However, it should be noted that the $\chi^2$ distribution is used extensively in applied statistics! Many statistical procedures used in the literature are valid because of this model.

PROBABILITY DENSITY FUNCTION: If $Y \sim \chi^2(\nu)$, then the pdf of $Y$ is given by
$$f_Y(y) = \begin{cases} \dfrac{1}{\Gamma(\nu/2)2^{\nu/2}}\,y^{\nu/2-1}e^{-y/2}, & y > 0 \\ 0, & \text{otherwise.} \end{cases}$$

MOMENT GENERATING FUNCTION: Suppose that $Y \sim \chi^2(\nu)$. Then, for values of $t < 1/2$, the mgf of $Y$ is given by
$$m_Y(t) = \left(\frac{1}{1-2t}\right)^{\nu/2}.$$
Proof. Take the gamma($\alpha, \beta$) mgf and put in $\alpha = \nu/2$ and $\beta = 2$. $\Box$

MEAN AND VARIANCE OF THE $\chi^2$ DISTRIBUTION: If $Y \sim \chi^2(\nu)$, then $E(Y) = \nu$ and $V(Y) = 2\nu$.
Proof. Take the gamma($\alpha, \beta$) formulae and substitute $\alpha = \nu/2$ and $\beta = 2$. $\Box$

TABLED VALUES FOR CDF: Because the $\chi^2$ distribution is so pervasive in applied statistics, tables of probabilities are common. Table 6 (WMS, pp. 794-5) provides values of $\chi^2_a$ which satisfy
$$a = P(Y > \chi^2_a) = \int_{\chi^2_a}^\infty \frac{1}{\Gamma(\nu/2)2^{\nu/2}}\,u^{\nu/2-1}e^{-u/2}\,du,$$
for different values of $a$ and degrees of freedom $\nu$.

3.8 Beta distribution

TERMINOLOGY: A random variable $Y$ is said to have a beta distribution with parameters $\alpha > 0$ and $\beta > 0$ if its pdf is given by
$$f_Y(y) = \begin{cases} \dfrac{1}{B(\alpha,\beta)}\,y^{\alpha-1}(1-y)^{\beta-1}, & 0 < y < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Since the support of $Y$ is $0 < y < 1$, the beta distribution is a popular probability model for proportions. Shorthand notation is $Y \sim \text{beta}(\alpha, \beta)$. The constant $B(\alpha, \beta)$ is given by
$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$

TERMINOLOGY: When talking about the beta($\alpha, \beta$) density function
it is often helpful to think of the formula in two parts:
- the kernel: $y^{\alpha-1}(1-y)^{\beta-1}$
- the constant: $\dfrac{1}{B(\alpha,\beta)}$.

THE SHAPE OF THE BETA PDF: The beta pdf is very flexible. That is, by changing the values of $\alpha$ and $\beta$, we can come up with many different pdf shapes. See Figure 3.18 for examples.
- When $\alpha = \beta$, the pdf is symmetric about the line $y = 1/2$.
- When $\alpha < \beta$, the pdf is skewed right (i.e., smaller values of $y$ are more likely).
- When $\alpha > \beta$, the pdf is skewed left (i.e., larger values of $y$ are more likely).
- When $\alpha = \beta = 1$, the beta pdf reduces to the $U(0,1)$ pdf!

MOMENT GENERATING FUNCTION: The mgf of a beta($\alpha, \beta$) random variable exists, but not in a nice compact formula. Hence, we'll compute moments directly.

[Figure 3.18: Four different beta probability models.]

MEAN AND VARIANCE OF THE BETA DISTRIBUTION: If $Y \sim \text{beta}(\alpha, \beta)$, then
$$E(Y) = \frac{\alpha}{\alpha+\beta} \quad \text{and} \quad V(Y) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}.$$
Proof. Exercise. $\Box$

Example 3.14. A small filling station is supplied with premium gasoline once per day and can supply at most 1000 gallons. Its daily volume of sales (in 1000s of gallons) is a random variable, say $Y$, which has the beta distribution
$$f_Y(y) = \begin{cases} 5(1-y)^4, & 0 < y < 1 \\ 0, & \text{otherwise.} \end{cases}$$
(a) What are the parameters in this distribution; i.e., what are $\alpha$ and $\beta$? (b) What is the average daily sales? (c) What need the capacity of the tank be so that the probability of the supply being exhausted in any day is 0.01? (d) Treating daily sales as independent from day to day, what is the probability that, during any given 7-day span, there are exactly 2 days where sales exceed 200 gallons?

SOLUTIONS: (a) $\alpha = 1$ and $\beta = 5$. (b) $E(Y) = \frac{1}{1+5} = \frac{1}{6}$. Thus, the average daily sales is about 166.66 gallons. (c) We want to find the capacity, say $c$, such that $P(Y > c) = 0.01$. This means that
$$P(Y > c) = \int_c^1 5(1-y)^4\,dy = 0.01,$$
and we need to solve this equation for $c$. Using the change of variable $u = 1 - y$,
$$\int_c^1 5(1-y)^4\,dy = \int_0^{1-c} 5u^4\,du = (1-c)^5.$$
Thus, we have $(1-c)^5 = 0.01 \Rightarrow 1 - c = (0.01)^{1/5} \Rightarrow c = 1 - (0.01)^{1/5} \approx 0.602,$
and so there must be about 602 gallons in the tank. (d) First, we compute
$$P(Y > 0.2) = \int_{0.2}^1 5(1-y)^4\,dy = \int_0^{0.8} 5u^4\,du = (0.8)^5 \approx 0.328.$$
This is the probability that sales exceed 200 gallons on any given day. Now, treat each day as a trial and let $X$ denote the number of days where sales exceed 200 gallons (i.e., a "success"). Because days are assumed independent, $X \sim b(7, 0.328)$ and
$$P(X = 2) = \binom{7}{2}(0.328)^2(1-0.328)^5 \approx 0.310. \quad \Box$$

3.9 Chebyshev's Inequality

MARKOV'S INEQUALITY: Suppose that $X$ is a nonnegative random variable with pdf (pmf) $f_X(x)$, and let $c$ be any positive constant. Then,
$$P(X > c) \le \frac{E(X)}{c}.$$
Proof. First, define the event $B = \{x : x > c\}$. We know that
$$E(X) = \int_0^\infty x f_X(x)\,dx = \int_B x f_X(x)\,dx + \int_{\overline{B}} x f_X(x)\,dx \ge \int_B x f_X(x)\,dx \ge \int_B c f_X(x)\,dx = c\,P(X > c). \quad \Box$$

SPECIAL CASE: Let $Y$ be any random variable (discrete or continuous) with mean $\mu$ and variance $\sigma^2 < \infty$. Then, for $k > 0$,
$$P(|Y - \mu| > k\sigma) \le \frac{1}{k^2}.$$
This is known as Chebyshev's Inequality.
Proof. Apply Markov's Inequality with $X = (Y - \mu)^2$ and $c = k^2\sigma^2$. With these substitutions, we have
$$P(|Y - \mu| > k\sigma) = P[(Y-\mu)^2 > k^2\sigma^2] \le \frac{E[(Y-\mu)^2]}{k^2\sigma^2} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}. \quad \Box$$

REMARK: The beauty of Chebyshev's result is that it applies to any random variable $Y$. In words, $P(|Y - \mu| > k\sigma)$ is the probability that the random variable $Y$ will differ from the mean $\mu$ by more than $k$ standard deviations. If we do not know how $Y$ is distributed, we can not compute $P(|Y - \mu| > k\sigma)$ exactly; but, at least we can put an upper bound on this probability; this is what Chebyshev's result allows us to do. Note that
$$P(|Y - \mu| > k\sigma) = 1 - P(|Y - \mu| \le k\sigma) = 1 - P(\mu - k\sigma \le Y \le \mu + k\sigma).$$
Thus, it must be the case that
$$P(|Y - \mu| \le k\sigma) = P(\mu - k\sigma \le Y \le \mu + k\sigma) \ge 1 - \frac{1}{k^2}.$$

Example 3.15. Suppose that $Y$ represents the amount of precipitation (in inches) observed annually in Barrow, AK. The exact probability distribution for $Y$ is unknown, but, from historical information, it is posited that $\mu = 4.5$ and $\sigma = 1$. What is a lower bound on the probability that there will be between 2.5 and 6.5 inches of precipitation during the next year?
SOLUTION: We want to compute a lower bound for $P(2.5 \le Y \le 6.5)$. Note that, with $k = 2$,
$$P(2.5 \le Y \le 6.5) = P(|Y - \mu| \le 2\sigma) \ge 1 - \frac{1}{2^2} = 0.75.$$
Thus, we know that $P(2.5 \le Y \le 6.5) \ge 0.75$. The
chances are good that, in fact, $Y$ will be between 2.5 and 6.5 inches.

4 Multivariate Distributions

Complementary reading from WMS: Chapter 5.

4.1 Introduction

REMARK: So far, we have only discussed univariate (single) random variables: their probability distributions, moment generating functions, means and variances, etc. In practice, however, investigators are often interested in probability statements concerning two or more random variables. Consider the following examples:
- In an agricultural field trial, we might want to understand the relationship between yield $Y$ (measured in bushels/acre) and the nitrogen content of the soil.
- In an educational assessment program, we might want to predict a student's posttest score from her pretest score.
- In a clinical trial, physicians might want to characterize the concentration of a drug $Y$ in one's body as a function of the time $X$ from injection.
- In a marketing study, the goal is to forecast next month's sales, say $Y_n$, based on sales figures from the previous $n-1$ periods, say $Y_1, Y_2, \ldots, Y_{n-1}$.

GOAL: In each of these examples, our goal is to describe the relationship between (or among) the random variables that are recorded. As it turns out, these relationships can be described mathematically through a probabilistic model.

TERMINOLOGY: If $Y_1$ and $Y_2$ are random variables, then $(Y_1, Y_2)$ is called a bivariate random vector. If $Y_1, Y_2, \ldots, Y_n$ denote $n$ random variables, then $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_n)$ is called an $n$-variate random vector. For much of this chapter, we will consider the $n = 2$ bivariate case. However, all ideas discussed herein extend naturally to higher dimensional settings.

4.2 Discrete random vectors

TERMINOLOGY: Let $Y_1$ and $Y_2$ be discrete random variables. Then, $(Y_1, Y_2)$ is called a discrete random vector, and the joint probability mass function (pmf) of $Y_1$ and $Y_2$ is given by
$$p_{Y_1,Y_2}(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2),$$
for all $(y_1, y_2) \in R_{Y_1,Y_2}$. The set $R_{Y_1,Y_2} \subseteq \mathbb{R}^2$ is the two-dimensional support of $(Y_1, Y_2)$. The function $p_{Y_1,Y_2}(y_1, y_2)$ has the following properties:
1. $0 \le$
$p_{Y_1,Y_2}(y_1, y_2) \le 1$, for all $(y_1, y_2) \in R_{Y_1,Y_2}$;
2. $\sum_{R_{Y_1,Y_2}} p_{Y_1,Y_2}(y_1, y_2) = 1$;
3. $P[(Y_1, Y_2) \in B] = \sum_B p_{Y_1,Y_2}(y_1, y_2)$, for any set $B \subset \mathbb{R}^2$.

Example 4.1. An urn contains 3 red balls, 4 white balls, and 5 green balls. Let $(Y_1, Y_2)$ denote the bivariate random vector where, out of 3 randomly selected balls,
$$Y_1 = \text{number of red balls}, \qquad Y_2 = \text{number of white balls}.$$
Consider the following calculations:
$$p_{Y_1,Y_2}(0,0) = \frac{\binom{3}{0}\binom{4}{0}\binom{5}{3}}{\binom{12}{3}} = \frac{10}{220}, \quad p_{Y_1,Y_2}(0,1) = \frac{\binom{3}{0}\binom{4}{1}\binom{5}{2}}{\binom{12}{3}} = \frac{40}{220}, \quad p_{Y_1,Y_2}(0,2) = \frac{\binom{3}{0}\binom{4}{2}\binom{5}{1}}{\binom{12}{3}} = \frac{30}{220}, \quad p_{Y_1,Y_2}(0,3) = \frac{\binom{4}{3}}{\binom{12}{3}} = \frac{4}{220},$$
$$p_{Y_1,Y_2}(1,0) = \frac{\binom{3}{1}\binom{5}{2}}{\binom{12}{3}} = \frac{30}{220}, \quad p_{Y_1,Y_2}(1,1) = \frac{\binom{3}{1}\binom{4}{1}\binom{5}{1}}{\binom{12}{3}} = \frac{60}{220},$$
and, similarly,
$$p_{Y_1,Y_2}(1,2) = \frac{18}{220}, \quad p_{Y_1,Y_2}(2,0) = \frac{15}{220}, \quad p_{Y_1,Y_2}(2,1) = \frac{12}{220}, \quad p_{Y_1,Y_2}(3,0) = \frac{1}{220}.$$
Here, the support is $R_{Y_1,Y_2} = \{(0,0), (0,1), (0,2), (0,3), (1,0), (1,1), (1,2), (2,0), (2,1), (3,0)\}$. Table 4.2 depicts the joint pmf. It is straightforward to see that $\sum_{R_{Y_1,Y_2}} p_{Y_1,Y_2}(y_1, y_2) = 1$.

Table 4.2: Joint pmf $p_{Y_1,Y_2}(y_1, y_2)$ for Example 4.1, displayed in tabular form.

    p(y1,y2)     y2=0      y2=1      y2=2      y2=3
    y1=0        10/220    40/220    30/220     4/220
    y1=1        30/220    60/220    18/220
    y1=2        15/220    12/220
    y1=3         1/220

QUESTION: What is the probability that, among the three balls chosen, there is at most 1 red ball and at most 1 white ball? That is, what is $P(Y_1 \le 1, Y_2 \le 1)$?
SOLUTION: Here, we want to compute $P(B)$, where the set $B = \{(0,0), (0,1), (1,0), (1,1)\}$. From the properties associated with the joint pmf, this calculation is given by
$$P(Y_1 \le 1, Y_2 \le 1) = p_{Y_1,Y_2}(0,0) + p_{Y_1,Y_2}(0,1) + p_{Y_1,Y_2}(1,0) + p_{Y_1,Y_2}(1,1) = \frac{10}{220} + \frac{40}{220} + \frac{30}{220} + \frac{60}{220} = \frac{140}{220}.$$
QUESTION: What is the probability that, among the three balls chosen, there are at least 2 red balls? That is, what is $P(Y_1 \ge 2)$?

4.3 Continuous random vectors

TERMINOLOGY: Let $Y_1$ and $Y_2$ be continuous random variables. Then, $(Y_1, Y_2)$ is called a continuous random vector, and the joint probability density function (pdf) of $Y_1$ and $Y_2$ is denoted by $f_{Y_1,Y_2}(y_1, y_2)$. The function $f_{Y_1,Y_2}(y_1, y_2)$ has the following properties:
1. $f_{Y_1,Y_2}(y_1, y_2) > 0$, for all $(y_1, y_2) \in R_{Y_1,Y_2}$, the two-dimensional support set;
2. $\iint_{R_{Y_1,Y_2}} f_{Y_1,Y_2}(y_1, y_2)\,dy_1\,dy_2 = 1$;
3. $P[(Y_1, Y_2) \in B] = \iint_B f_{Y_1,Y_2}(y_1, y_2)\,dy_1\,dy_2$, for any set $B \subset \mathbb{R}^2$.

REMARK: Of course, we realize that
$$P[(Y_1, Y_2) \in B] = \iint_B f_{Y_1,Y_2}(y_1, y_2)\,dy_1\,dy_2$$
is really a double integral, since $B$ is a two-dimensional set in the $(y_1, y_2)$ plane; thus, $P[(Y_1, Y_2) \in B]$ represents the volume under $f_{Y_1,Y_2}(y_1, y_2)$ over $B$.

TERMINOLOGY:
Suppose that $(Y_1, Y_2)$ is a continuous random vector with joint pdf $f_{Y_1,Y_2}(y_1, y_2)$. The joint cumulative distribution function (cdf) for $(Y_1, Y_2)$ is given by
$$F_{Y_1,Y_2}(y_1, y_2) \equiv P(Y_1 \le y_1, Y_2 \le y_2) = \int_{-\infty}^{y_2}\int_{-\infty}^{y_1} f_{Y_1,Y_2}(r, s)\,dr\,ds,$$
for all $(y_1, y_2) \in \mathbb{R}^2$. It follows, upon differentiation, that the joint pdf is given by
$$f_{Y_1,Y_2}(y_1, y_2) = \frac{\partial^2}{\partial y_1\,\partial y_2}F_{Y_1,Y_2}(y_1, y_2),$$
wherever these mixed partial derivatives are defined.

Example 4.2. Suppose that, in a controlled agricultural experiment, we observe the random vector $(Y_1, Y_2)$, where $Y_1$ = temperature (in Celsius) and $Y_2$ = precipitation level (in inches), and suppose that the joint pdf of $(Y_1, Y_2)$ is given by
$$f_{Y_1,Y_2}(y_1, y_2) = \begin{cases} c\,y_1 y_2, & 10 < y_1 < 20,\ 0 < y_2 < 3 \\ 0, & \text{otherwise.} \end{cases}$$
(a) What is the value of $c$? (b) Compute $P(Y_1 > 15, Y_2 < 1)$. (c) Compute $P(Y_2 > Y_1/5)$.

SOLUTIONS: (a) We know that
$$\int_{y_1=10}^{20}\int_{y_2=0}^{3} c\,y_1 y_2\,dy_2\,dy_1 = 1,$$
since $f_{Y_1,Y_2}(y_1, y_2)$ must integrate to 1 over $R_{Y_1,Y_2} = \{(y_1, y_2) : 10 < y_1 < 20,\ 0 < y_2 < 3\}$; i.e.,
$$c\int_{y_1=10}^{20} y_1\left(\int_{y_2=0}^{3} y_2\,dy_2\right)dy_1 = c \times \frac{9}{2}\times\left[\frac{y_1^2}{2}\right]_{10}^{20} = c \times \frac{9}{2}\times 150 = 675c = 1.$$
Thus, $c = 1/675$.
(b) Let $B = \{(y_1, y_2) : y_1 > 15,\ y_2 < 1\}$. The value $P[(Y_1, Y_2) \in B] = P(Y_1 > 15, Y_2 < 1)$ represents the volume under $f_{Y_1,Y_2}(y_1, y_2)$ over the set $B$; i.e.,
$$P(Y_1 > 15, Y_2 < 1) = \int_{y_1=15}^{20}\int_{y_2=0}^{1}\frac{y_1 y_2}{675}\,dy_2\,dy_1 = \frac{1}{1350}\int_{15}^{20} y_1\,dy_1 = \frac{1}{1350}\left(200 - \frac{225}{2}\right) \approx 0.065.$$
(c) Let $D = \{(y_1, y_2) : y_2 > y_1/5\}$. The quantity $P[(Y_1, Y_2) \in D] = P(Y_2 > Y_1/5)$ represents the volume under $f_{Y_1,Y_2}(y_1, y_2)$ over the set $D$; i.e.,
$$P(Y_2 > Y_1/5) = \int_{y_2=2}^{3}\int_{y_1=10}^{5y_2}\frac{y_1 y_2}{675}\,dy_1\,dy_2 = \frac{1}{1350}\int_{2}^{3}(25y_2^3 - 100y_2)\,dy_2 = \frac{1}{1350}\left[\frac{25y_2^4}{4} - 50y_2^2\right]_2^3 \approx 0.116.$$
NOTE: The key thing to remember is that, in parts (b) and (c), the probability is simply the volume under the density $f_{Y_1,Y_2}(y_1, y_2)$ over a particular set. It is helpful to draw a picture to get the limits of integration correct!

4.4 Marginal distributions

RECALL: The joint pmf of $(Y_1, Y_2)$ in Example 4.1 is depicted below in Table 4.3. You see that, by summing out over the values of $y_2$ in Table 4.3, we obtain the row sums
$$P(Y_1 = 0) = \frac{84}{220}, \quad P(Y_1 = 1) = \frac{108}{220}, \quad P(Y_1 = 2) = \frac{27}{220}, \quad P(Y_1 = 3) = \frac{1}{220}.$$
This represents the marginal
distribution of $Y_1$. Similarly, by summing out over the values of $y_1$, we obtain the column sums
$$P(Y_2 = 0) = \frac{56}{220}, \quad P(Y_2 = 1) = \frac{112}{220}, \quad P(Y_2 = 2) = \frac{48}{220}, \quad P(Y_2 = 3) = \frac{4}{220}.$$
This represents the marginal distribution of $Y_2$.

Table 4.3: Joint pmf $p_{Y_1,Y_2}(y_1, y_2)$ displayed in tabular form.

    p(y1,y2)       y2=0      y2=1      y2=2      y2=3     Row sum
    y1=0          10/220    40/220    30/220     4/220     84/220
    y1=1          30/220    60/220    18/220              108/220
    y1=2          15/220    12/220                         27/220
    y1=3           1/220                                    1/220
    Column sum    56/220   112/220    48/220     4/220          1

TERMINOLOGY: Let $(Y_1, Y_2)$ be a discrete random vector with pmf $p_{Y_1,Y_2}(y_1, y_2)$. Then, the marginal pmf of $Y_1$ is
$$p_{Y_1}(y_1) = \sum_{\text{all } y_2} p_{Y_1,Y_2}(y_1, y_2),$$
and the marginal pmf of $Y_2$ is
$$p_{Y_2}(y_2) = \sum_{\text{all } y_1} p_{Y_1,Y_2}(y_1, y_2).$$

MAIN POINT: In the two-dimensional discrete case, marginal pmfs are obtained by "summing out" over the other variable.

TERMINOLOGY: Let $(Y_1, Y_2)$ be a continuous random vector with pdf $f_{Y_1,Y_2}(y_1, y_2)$. Then, the marginal pdf of $Y_1$ is
$$f_{Y_1}(y_1) = \int_{-\infty}^{\infty} f_{Y_1,Y_2}(y_1, y_2)\,dy_2,$$
and the marginal pdf of $Y_2$ is
$$f_{Y_2}(y_2) = \int_{-\infty}^{\infty} f_{Y_1,Y_2}(y_1, y_2)\,dy_1.$$

MAIN POINT: In the two-dimensional continuous case, marginal pdfs are obtained by "integrating out" over the other variable.

Example 4.3. In a simple genetics model, the proportion, say $Y_1$, of a population with trait 1 is always less than the proportion, say $Y_2$, of a population with trait 2, and the random vector $(Y_1, Y_2)$ has joint pdf
$$f_{Y_1,Y_2}(y_1, y_2) = \begin{cases} 6y_1, & 0 < y_1 < y_2 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the marginal distributions $f_{Y_1}(y_1)$ and $f_{Y_2}(y_2)$. (b) Find the probability that the proportion of individuals with trait 2 exceeds 1/2. (c) Find the probability that the proportion of individuals with trait 2 is at least twice that of the proportion of individuals with trait 1.
SOLUTIONS: (a) To find the marginal distribution of $Y_1$, i.e., $f_{Y_1}(y_1)$, we integrate out over $y_2$. For values of $0 < y_1 < 1$, we have
$$f_{Y_1}(y_1) = \int_{y_2=y_1}^{1} 6y_1\,dy_2 = 6y_1(1 - y_1).$$
Thus, the marginal distribution of $Y_1$ is given by
$$f_{Y_1}(y_1) = \begin{cases} 6y_1(1-y_1), & 0 < y_1 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Y2 ie fy2y2 we integrate out over yl For values of 0 S yg S 1 we have 12 2 12 2 fy2y2 62161211 311 312 1110 0 Thus the marginal distribution of Y2 is given by 3 0 lt yg lt1 fY2yz 0 otherwise Of course we recognize this as a beta distribution with 04 3 and B 1 That is marginally Y2 beta3 1 b Here we want to nd PB where the set E y1y2 0 lt yl lt y2y2 gt 12 This probability can be computed two different ways i using the joint distribution fy1y2y1y2 and computing 1 12 PmYz e B 6y1dy1dy2 21205 2110 ii using the marginal distribution fy2y2 and computing 1 Mngtua iz y205 Either way you will get the same answer Notice that in i you are computing the volume under fyy2y1y2 over the set B In ii you are nding the area under fy2y2 over the set yg yg gt 12 c Here we want to compute PY2 2 2Y1 ie we want to compute PD where the set D y1y2 yg 2 2m This equals 1 ygZ Hmamm WWMM5 2120 2110 This is the volume under fyy2y1y2 over the set D D PAGE 103 CHAPTER 4 STATMATH 5117 J TEBBS 45 Conditional distributions RECALL For events A and B in a non empty sample space S we de ned PA B P A B lt l gt RE for PB gt 0 Now7 suppose that YhYZ is a discrete random vector If we let B Y2 yg and A Y1 yl we obtain PWB PY1 y1Y2 yz Mai 91112 PY2 12 1032 yz TERMINOLOGY Suppose that Y1Y2 is a discrete random vector with joint pmf py1y2y1y2 We de ne the conditional probability mass function pmf of Y1 given Y2 yg as pY1Y2 11792 1032 12 whenever py2y2 gt 0 Similarly7 the conditional probability mass function of Y2 given PYJYXMWZ Y1 11 as PY1Y291792 PYY 12111 2 mm 7 whenever py1y1 gt 0 Example 44 In Example 417 we computed the joint pmf for Y17 The table below depicts this joint pmf as well as the marginal pmfs Table 44 Joint pmfpy1y2y1y2 displayed in tabular form PY1Y291792 22 0 22 1 yz 2 yz 3 Row Sum 0 g E 2 7 91 i 220 220 220 220 220 1 m m g m 91 i 220 220 220 220 g g E 91 i 2 220 220 220 7 m m yl 3 220 220 56 112 48 4 Column sum m m E m 1 QUESTION What is the conditional pmf of Y1 given Y2 1 PAGE 104 
SOLUTION: Straightforward calculations show that
$$p_{Y_1|Y_2}(0 \mid 1) = \frac{p_{Y_1,Y_2}(0,1)}{p_{Y_2}(1)} = \frac{40/220}{112/220} = \frac{40}{112}, \quad p_{Y_1|Y_2}(1 \mid 1) = \frac{p_{Y_1,Y_2}(1,1)}{p_{Y_2}(1)} = \frac{60/220}{112/220} = \frac{60}{112}, \quad p_{Y_1|Y_2}(2 \mid 1) = \frac{p_{Y_1,Y_2}(2,1)}{p_{Y_2}(1)} = \frac{12/220}{112/220} = \frac{12}{112}.$$
Thus, the conditional pmf of $Y_1$, given $Y_2 = 1$, is given by

    y1                          0         1         2
    p_{Y1|Y2}(y1 | y2 = 1)   40/112    60/112    12/112

This conditional pmf tells us how $Y_1$ is distributed if we are given that $Y_2 = 1$.
EXERCISE: Find the conditional pmf of $Y_2$, given $Y_1 = 0$. $\Box$

THE CONTINUOUS CASE: When $(Y_1, Y_2)$ is a continuous random vector, we have to be careful how we define conditional distributions, since the quantity
$$\frac{P(Y_1 = y_1, Y_2 = y_2)}{P(Y_2 = y_2)}$$
has a zero denominator. As it turns out, the expression
$$f_{Y_1|Y_2}(y_1 \mid y_2) = \frac{f_{Y_1,Y_2}(y_1, y_2)}{f_{Y_2}(y_2)}$$
is the correct formula for the continuous case; however, we have to motivate its construction in a slightly different way.

ALTERNATE MOTIVATION: Suppose that $(Y_1, Y_2)$ is a continuous random vector. For $dy_1$ and $dy_2$ small,
$$\frac{f_{Y_1,Y_2}(y_1, y_2)\,dy_1\,dy_2}{f_{Y_2}(y_2)\,dy_2} \approx \frac{P(y_1 \le Y_1 \le y_1 + dy_1,\ y_2 \le Y_2 \le y_2 + dy_2)}{P(y_2 \le Y_2 \le y_2 + dy_2)} = P(y_1 \le Y_1 \le y_1 + dy_1 \mid y_2 \le Y_2 \le y_2 + dy_2) \approx f_{Y_1|Y_2}(y_1 \mid y_2)\,dy_1.$$
Thus, we can think of $f_{Y_1|Y_2}(y_1 \mid y_2)$ in this way; i.e., for small values of $dy_1$ and $dy_2$, $f_{Y_1|Y_2}(y_1 \mid y_2)\,dy_1$ represents the conditional probability that $Y_1$ is between $y_1$ and $y_1 + dy_1$, given that $Y_2$ is between $y_2$ and $y_2 + dy_2$.

TERMINOLOGY: Suppose that $(Y_1, Y_2)$ is a continuous random vector with joint pdf $f_{Y_1,Y_2}(y_1, y_2)$. We define the conditional probability density function (pdf) of $Y_1$, given $Y_2 = y_2$, as
$$f_{Y_1|Y_2}(y_1 \mid y_2) = \frac{f_{Y_1,Y_2}(y_1, y_2)}{f_{Y_2}(y_2)}.$$
Similarly, the conditional probability density function of $Y_2$, given $Y_1 = y_1$, is
$$f_{Y_2|Y_1}(y_2 \mid y_1) = \frac{f_{Y_1,Y_2}(y_1, y_2)}{f_{Y_1}(y_1)}.$$

Example 4.5. Consider the bivariate pdf in Example 4.3:
$$f_{Y_1,Y_2}(y_1, y_2) = \begin{cases} 6y_1, & 0 < y_1 < y_2 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Recall that this probabilistic model summarized the random vector $(Y_1, Y_2)$, where $Y_1$, the proportion of a population with trait 1, is always less than $Y_2$, the proportion of a population with trait 2. Derive the conditional distributions $f_{Y_1|Y_2}(y_1 \mid y_2)$ and $f_{Y_2|Y_1}(y_2 \mid y_1)$.
SOLUTION: In Example 4.3, we derived the marginal pdfs to be $f_{Y_1}(y_1) = 6y_1(1-y_1)$ for $0 < y_1 < 1$, and $0$
otherwise 7 and 3 0 lt yg lt1 0 ng 12 7 otherwise First7 we derive fy1 y2y1y2 so x Y2 yg Remember7 once we condition on Y2 yg ie7 once we x Y2 yg we then regard yg as simply some constant This is an important point to understand Then7 for values of 0 lt yl lt yg it follows that fY1Y2yi792 i fy2y2 313 9 7 he m 21 92 and7 thus7 this is the value of fy1 y2y1y2 when 0 lt yl lt yg Of course7 for values of y1 0y27 the conditional density fy1 y2y1 y2 0 Summarizing7 the conditional pdf of Y1 given Y2 yg is given by 2211137 0 lt 211 lt yz 0 me2ylil2 otherwise 7 PAGE 106 CHAPTER 4 STATMATH 5117 J TEBBS Now to derive the conditional pdf of Y2 given Y1 we x Y1 ylg then for all values of y1 lt yg lt 1 we have le Y2 117 12 611 1 f y y 7 Y2mlt 2 1 fY1yi 62111 11 1 91 This is the value of fy2 yy2ly1 when yl lt yg lt 1 When yg yl 1 the conditional pdf is fwy1 yglyl 0 Remember once we condition on Y1 yl then we regard yl simply as some constant Thus the conditional pdf of Y2 given Y1 yl is given by 1 Q7 y1lty2lt1 nglY1yZlyl 0 otherwise 7 That is conditional on Y1 yl Y2 Uy1 1 D RESULT The use of conditional densities allows us to de ne conditional probabilities of events associated with one random variable when we know the value of another random variable If Y1 and Y2 are jointly discrete then for any set E C R 1301 E BlYZ 12 ZPY132yilyz B If Y1 and Y2 are jointly continuous then for any set E C R MK 6 Ble yz fymy1ly2dy1 B Example 46 A small health food store stocks two different brands of grain Let Y1 denote the amount of brand 1 in stock and let Y2 denote the amount of brand 2 in stock both Y1 and Y2 are measured in 100s of lbs The joint distribution of Y1 and Y2 is given by 2411112 yl gt 0112 gt 0 0 lty1 112 lt1 0 fiaY2yly2 otherwise a Find the conditional pdf fy1 y2y1ly2 b Compute PY1 gt 05lY2 03 c Find PY1 gt 05 PAGE 107 CHAPTER 4 STATMATH 5117 J TEBBS SOLUTIONS a To nd the conditional pdf fy1 y2y1ly2 we rst need to nd the mar ginal pdf of Y2 The marginal pdf of Y2 for 0 lt yg 
lt 17 is 1722 y2 1212 My 2421212 dy124y2 31 12y217y22 1110 0 and 07 otherwise Of course7 we recognize this as a beta23 pdf ie7 Y2 beta23 The conditional pdf of Y1 given Y2 yg is le Y2 11792 2411212 f y y W 1 2 My 1211207112 i 1 12 for 0 lt yl lt 1 7 yg and 07 otherwise Surnrnarizing7 723 0 lt y lt 1 i 7 27 1 12 leli2yilyz 1 22 07 otherwise b To compute PY1 gt 05lY2 037 we work with the conditional pdf fy1 y2y1ly2 which for yg 037 is given by y1 0 lt yl lt 07 fY1lY2lty1ly2 0 otherwise 7 Thus7 07 200 1301 gt 0502 03 E yidyl 05 0489 22 c To compute PY1 gt 057 we can either use the marginal pdf fy1y1 or the joint pdf fy1y2y17y2 Marginally7 it turns out that Y1 beta23 as well verifyl Thus7 1 PY1 gt 05 121107 yi2dy1 0313 05 REMARK Notice how PY1 gt 05lY2 03 31 PY1 gt 05 that is7 knowledge of the value of Y2 has affected the way that we assign probability to events involving Y1 Of course7 one might expect this because of the support in the joint pdf fy1y2y17y2 D PAGE 108 CHAPTER 4 STATMATH 5117 J TEBBS 46 Independent random variables TERMINOLOGY Suppose that Y1 Y2 is a random vector discrete or continuous with joint cdf Fy1y2y1 yg and denote the marginal cdfs of Y1 and Y2 by Fyy1 and Fy2y2 respectively We say that the random variables Y1 and Y2 are independent if and only if FHA 9171 FY1 1FY2l2 for all values of y1 and y2 Otherwise we say that Y1 and Y2 are dependent RESULT Suppose that Y1Y2 is a random vector discrete or continuous with joint pdf pmf fybyz y1y2 and denote the marginal pdfs pmfs of Y1 and Y2 by fyy1 and fy2y2 respectively Then Y1 and Y2 are independent if and only if fY1Y2yl7yZ fY1l1fY2l2 for all values of y1 and y2 Otherwise Y1 and Y2 are dependent Example 47 Suppose that the pmf for the discrete random vector Y1 Y2 is given by 041 212 211 1727212 172 0 10th 117 12 otherwise 7 The marginal distribution of Y1 for values of y1 12 is given by 2 2 1 1 10y1 11 Z py1y2y17y2 Z Em 212 Beg1 6 2 1121 and pyy1 0 otherwise Similarly the marginal distribution of Y2 
for values of y2 12 is given by 2 2 l l PY292 ZPYQQWhW 2 Run 212 E3 4732 1111 y11 and 10y2 yg 0 otherwise Note that for example 3 8 7 14 E pY1Y21717 pY11pY21 E E g thus the random variables Y1 and Y2 are dependent D PAGE 109 CHAPTER 4 STATMATH 5117 J TEBBS Example 48 Let Y1 and Y2 denote the proportions oftime out of one workday during which employees I and H7 respectively7 perform their assigned tasks Suppose that the random vector Y1Y2 has joint pdf yl927 0lty1lt1 0lty2lt1 0 fiaY2yly2 otherwise 7 It is straightforward to show verify that 111 1 0 lty1 lt1 fyly1 2 07 otherwise 112 0lty2lt1 0 ng 92 otherwise ThUS since fmy2y1y2 11 yz 7e 21 y2 i fy1y1fy2y27 for 0 lt 211 lt 1 and 0 lt yg lt 17 Y1 and Y2 are dependent D Example 49 Suppose that Y1 and Y2 represent the death times in hours for rats treated with a certain toxin Marginally7 each death time follows an exponential distrib ution with mean 0 and Y1 and Y2 are independent a Write out the joint pdf of b Compute PY1 S 152 3 1 SOLUTIONS a Because Y1 and Y2 are independent7 the joint pdf of Y17Y27 for yl gt 0 and y2 gt 07 is given by l l l fY1Y2l17112 fY1ylfY2yZ 56721 X 56722 fe1M2 and fY1Y2 91712 0 otherwise b Because Y1 and Y2 are independent7 PY1 31752 31 FY1Y2171 FY11FY21 17 671917 6719 17649 D PAGE 110 CHAPTER 4 STATMATH 5117 J TEBBS A CONVENIENT RESULT Let Y1Y2 be a random vector discrete or continuous with pdf pmf fy1y2y1 yg lfthe support set RWY2 does not constrain yl by yg or yg by yl and additionally we can factor the joint pdf pmf fybyz y1y2 into two nonnegative expressions fmy2y1y2 9y1hyz then Y1 and Y2 are independent Note that 9y1 and hy2 are simply functions they need not be pdfs pmfs although they sometimes are The only requirement is that gy1 is a function of y1 only hy2 is a function of y2 only and that both are nonnegative If the support involves a constraint the random variables are automatically dependent Example 410 In Example 46 Y1 denoted the amount of brand 1 grain in stock and Y2 denoted 
the amount of brand 2 grain in stock Recall that the joint pdf of Y1Y2 was given by 2411927 91 gt07 92gt07 0ltyiyzlt1 0 fmY2i1y2 otherw1se 7 Here the support is Ryby2 y1y2 yl gt 0 yg gt 0 0 lt yl y2 lt 1 Since knowledge of y1 y2 affects the value of y2 yl the support involves a constraint and Y1 and Y2 are dependent D Example 411 Suppose that the random vector X Y has joint pdf PozP 1Ae MD 1y 11 7 y 1 z gt 00 lt y lt 1 0 Jew9671 otherw1se 7 for A gt 0 04 gt 0 and B gt 0 Since the support RXy z gt 0 0 lt y lt 1 does not involve a constraint it follows immediately that X and Y are independent since we can write mm X W107 a WWW hy Note that we are not saying that 9a and My are marginal distributions of X and Y fxy y Ae M As 9w respectively in fact they are not the marginal distributions D PAGE 111 CHAPTER 4 STATMATH 5117 J TEBBS EXTENSION We generalize the notion of independence to n variate random vectors We use the conventional notation Y Y1Y2Yn and y y1y2 Also we will denote the joint cdf of Y by Fyy and the joint pdf pmf of Y by fyy TERMINOLOGY Suppose that the random vector Y Y1Y2 has joint cdf Fyy and suppose that the random variable Y has cdf Fy for 239 1 2 n Then Y1 Y2 Yn are independent random variables if and only if RHampW i1 that is the joint cdf can be factored into the product of the marginal cdfs Alternatively Y1 Y2 Yn are independent random variables if and only if fYy H inyi i1 that is the joint pdf pmf can be factored into the product of the marginals Example 412 In a small clinical trial 71 20 patients are treated with a new drug Suppose that the response from each patient is a measurement Y N NM02 Denot ing the 20 responses by Y Y1Y2 YZO then assuming independence the joint distribution of the 20 responses is for y 6 R20 20 20 l 1 yru 2 l 1 20 yru 2 fYy 5 olt g ii1o39 N27T039 27m inZi What is the probability that every patient7s response is less than M 20 SOLUTION The probability that Y1 is less than M 20 is given by 1311 lt u 2a PZ lt 2 2 09772 
where Z N N01 and denotes the standard normal cdf Because the patients7 responses are independent random variables to 0 PY1ltM20756ltM2077Y oltM2U PYiltM20 1 ltIgt 2200630 D PAGE 112 CHAPTER 4 STATMATH 5117 J TEBBS 47 Expectations of functions of random variables RESULT Suppose that Y Y1 Y2 Y has joint pdf fyy or joint pmf pyy and suppose that gY gY1Y2 Yn is any real vector valued function of Y1Y2 Yn ie g R a R Then 0 if Y is discrete E9Yl Z 2 2 9ypyy all 11811 2 allyn o and if Y is continuous E9Yl 1 9yfyydy If these quantities are not nite then we say that EgY does not exist Example 413 In Example 46 Y1 denotes the amount of grain 1 in stock and Y2 denotes the amount of grain 2 in stock The joint distribution of Y1 and Y2 was given by 2411927 91 gt07 92gt07 0ltyiyzlt1 0 flan2yhy2 otherwise 7 What is the expected total amount of grain Y1 Y2 in stock SOLUTION Let the function g R2 a R be de ned by gyly2 yl yg We would like to compute EgY1Y2 EY1 From the last result we know that 1 1 111 EY1 Y2 11 yz24yiyz dyzdyi 2110 2120 1 yz 1721 ya 111 24yfl 24113 dyl 1110 2 0 3 0 1 1 12911 110261111 8911 yi3dyl 1110 0 The expected amount of grain in stock is 80 lbs Recall that marginally Y1 beta23 and Y2 beta23 so that Z and EY1 Y2 2 g g D 5 8 5 PAGE 113 CHAPTER 4 STATMATH 5117 J TEBBS Example 414 A process for producing an industrial chemical yields a product con taining two types of impurities Type I and Type ll From a speci ed sample from this process7 let Y1 denote the proportion of impurities in the sample of both types and let Y2 denote the proportion of Type I impurities among all impurities found Suppose that the joint pdf of the random vector Y17 Y2 is given by 217y1 0lty1 lt1 0lty2lt1 fax 111112 07 otherwise Find the expected value of the proportion of Type I impurities in the sample SOLUTION Because Y1 is the proportion of impurities in the sample and Y2 is the proportion of Type I impurities among the sample impurities7 it follows that Y1Y2 is the proportion of Type I 
impurities in the sample taken Let the function g R2 a R be de ned by 9y17y2 ylyg We would like to compute EgY1Y2 This is given by 1 1 l 111221yidyidyz8 D 0 0 PROPERTIES OF EXPECTATIONS Let Y Y1Y2Yn be a discrete or con tinuous random vector with pdf pmf fyy and support R C R suppose that 97919279k are real vector valued functions from R a R and let 0 be any real constant Then7 a Ec c b E69Yl 0E9Yl C E 221 9jYl 2221 E 9jYl RESULT Suppose that Y1 and Y2 are independent random variables7 and consider the functions 9Y1 and hltY2 where 9Y1 is a function of Y1 only7 and hY2 is a function of Y2 only Then7 E9Y1hY2l E9Y1lEhY2L provided that all expectations exist Proof Without loss7 we will assume that Y1Y2 is a continuous random vector the PAGE 114 CHAPTER 4 STATMATH 5117 J TEBBS discrete case is analogous Suppose that Y17 Y2 has joint pdf fy1y2y1y2 with support R C R2 Note that Emma2 R 2glty1gthlty2fy1y2lty1yadyzdyl 9y1hy2fy1y1fy2y2dy2dy1 R R R9ylfY1yldyl Ahyzfygyzdyz Emu2 Aglty1gtfyllty1gtdy1 ElhY2lElgY1l D Example 415 A point YhYZ E R2 is selected at random7 where Y1 N NM102 Y2 N NM2702 and Y1 and Y2 are independent De ne the random variables T Y1Y2 U Y1Y2 z Y12Y22 Find ET EU and EZ SOLUTIONS a Because is linear7 we know ET EY1 Y2 1901 5706 1 2 Because Y1 and Y2 are independent7 we know that EU EY1Y2 EY1EY2 mm To compute EZ7 rst note that 137012 VY1 lEY1l2 02 M and EY22 VY2 lEY2l2 02 3 so that EZ EYf Y Em E0 02 u 02 3 202M M D EXERCISE Compute ETU7 ETZ7 and EUZ PAGE 115 CHAPTER 4 STATMATH 5117 J TEBBS 48 Covariance and correlation 481 Covariance TERMINOLOGY Suppose that Y1 and Y2 are random variables with means Myl and 1132 respectively The covariance between Y1 and Y2 is given by COVY17 Y2 EKYJL MY1Y2 MY2l39 The covariance gives us information about how Y1 and Y2 are linearly related THE OOVARIANC39E COMPUTING FORMULA It is easy to show that COVY1Y2 E EY1 MOO2 MYM EY1Y2 7 mm This latter expression is sometimes easier to work with and is called the covariance computing 
Example 4.16. Gasoline is stocked in a bulk tank once at the beginning of each week and then sold to customers. Let Y1 denote the proportion of the capacity of the tank that is available after it is stocked. Let Y2 denote the proportion of the capacity of the bulk tank that is sold during the week. Suppose that the random vector (Y1, Y2) has joint pdf

f_{Y1,Y2}(y1, y2) = 3y1, for 0 < y2 < y1 < 1; 0, otherwise.

To compute the covariance, first note that Y1 ~ beta(3, 1) and that Y2 has marginal pdf

f_{Y2}(y2) = (3/2)(1 − y2²), for 0 < y2 < 1; 0, otherwise.

Thus,
E(Y1) = 3/(3 + 1) = 0.75
and
E(Y2) = ∫₀¹ y2 × (3/2)(1 − y2²) dy2 = 0.375.
Also,
E(Y1Y2) = ∫_{y1=0}^{1} ∫_{y2=0}^{y1} y1y2 × 3y1 dy2 dy1 = 0.30.
Thus, the covariance is
Cov(Y1, Y2) = E(Y1Y2) − μ_{Y1}μ_{Y2} = 0.30 − (0.75)(0.375) = 0.01875. □

NOTES ON THE COVARIANCE:
• If Cov(Y1, Y2) > 0, then Y1 and Y2 are positively linearly related.
• If Cov(Y1, Y2) < 0, then Y1 and Y2 are negatively linearly related.
• If Cov(Y1, Y2) = 0, then Y1 and Y2 are not linearly related. This does not necessarily mean that Y1 and Y2 are independent.

RESULT: If Y1 and Y2 are independent, then Cov(Y1, Y2) = 0.
Proof. Using the covariance computing formula and independence, we have
Cov(Y1, Y2) = E(Y1Y2) − μ_{Y1}μ_{Y2} = E(Y1)E(Y2) − μ_{Y1}μ_{Y2} = 0. □

MAIN POINT: If two random variables are independent, then they have zero covariance; however, zero covariance does not necessarily imply independence.

Example 4.17. An example of two dependent variables with zero covariance. Suppose that Y1 ~ U(−1, 1) and let Y2 = Y1². It is straightforward to show that

E(Y1) = E(Y1³) = 0, E(Y1Y2) = E(Y1³) = 0, and E(Y2) = E(Y1²) = V(Y1) = 1/3.

Thus,
Cov(Y1, Y2) = E(Y1Y2) − μ_{Y1}μ_{Y2} = 0 − 0(1/3) = 0.
However, not only are Y1 and Y2 related, they are perfectly related! But the relationship is not linear; it is quadratic. The covariance only assesses linear relationships. □

IMPORTANT RESULT: Suppose that Y1 and Y2 are random variables. Then

V(Y1 + Y2) = V(Y1) + V(Y2) + 2Cov(Y1, Y2)
V(Y1 − Y2) = V(Y1) + V(Y2) − 2Cov(Y1, Y2).

Proof. Let Z = Y1 + Y2. Using the definition of variance, we have

V(Z) = E[(Z − μ_Z)²]
= E{[Y1 + Y2 − E(Y1 + Y2)]²}
= E{[Y1 + Y2 − (μ_{Y1} + μ_{Y2})]²}
= E{[(Y1 − μ_{Y1}) + (Y2 − μ_{Y2})]²}
= E[(Y1 − μ_{Y1})² + (Y2 − μ_{Y2})² + 2(Y1 − μ_{Y1})(Y2 − μ_{Y2})]
= E[(Y1 − μ_{Y1})²] + E[(Y2 − μ_{Y2})²] + 2E[(Y1 − μ_{Y1})(Y2 − μ_{Y2})]
= V(Y1) + V(Y2) + 2Cov(Y1, Y2).

That V(Y1 − Y2) = V(Y1) + V(Y2) − 2Cov(Y1, Y2) is shown similarly. □

Example 4.18. A small health food store stocks two different brands of grain. Let Y1 denote the amount of brand 1 in stock and let Y2 denote the amount of brand 2 in stock (both Y1 and Y2 are measured in 100s of lbs). In Example 4.6, we saw that the joint distribution of Y1 and Y2 was given by

f_{Y1,Y2}(y1, y2) = 24y1y2, for y1 > 0, y2 > 0, 0 < y1 + y2 < 1; 0, otherwise.

What is the variance for the total amount of grain in stock? That is, what is V(Y1 + Y2)?
SOLUTION. Using the last result, we know that
V(Y1 + Y2) = V(Y1) + V(Y2) + 2Cov(Y1, Y2).
Marginally, Y1 and Y2 both have beta(2, 3) distributions (see Example 4.6). Thus,
E(Y1) = E(Y2) = 2/(2 + 3) = 2/5
and
V(Y1) = V(Y2) = (2)(3)/[(2 + 3)²(2 + 3 + 1)] = 1/25.
Recall that Cov(Y1, Y2) = E(Y1Y2) − μ_{Y1}μ_{Y2}, so we need to first compute
E(Y1Y2) = ∫_{y1=0}^{1} ∫_{y2=0}^{1−y1} y1y2 × 24y1y2 dy2 dy1 = 2/15.
Thus,
Cov(Y1, Y2) = E(Y1Y2) − E(Y1)E(Y2) = 2/15 − (2/5)² = −2/75 ≈ −0.027.
Finally, the variance of Y1 + Y2 is given by
V(Y1 + Y2) = 1/25 + 1/25 + 2(−2/75) = 2/75 ≈ 0.027. □

RESULT: Suppose that Y1 and Y2 are independent random variables. Then
V(Y1 ± Y2) = V(Y1) + V(Y2).
Proof. In general, V(Y1 ± Y2) = V(Y1) + V(Y2) ± 2Cov(Y1, Y2). Since Y1 and Y2 are independent, Cov(Y1, Y2) = 0. Thus, the result follows immediately. □

LEMMA: Suppose that Y1 and Y2 are random variables with means μ_{Y1} and μ_{Y2}, respectively. Then
(a) Cov(Y1, Y2) = Cov(Y2, Y1)
(b) Cov(Y1, Y1) = V(Y1)
(c) Cov(a + bY1, c + dY2) = bd Cov(Y1, Y2), for constants a, b, c, and d.
Proof. Exercise. □

4.8.2 Correlation

GENERAL PROBLEM: Suppose that X and Y are random variables and that we want to predict Y as a linear function of X. That is, we want to consider functions of the form Y = β0 + β1X, for constants β0 and β1. In this situation, the "error in prediction" is given by
Y − (β0 + β1X).
This error can be positive or negative, so in developing a "goodness measure" of prediction error, we want one that maintains the magnitude of error but ignores the sign. Thus, consider the mean squared error of prediction, given by

Q(β0, β1) = E{[Y − (β0 + β1X)]²}.
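The moments in Example 4.18 can be checked by numerical integration. The sketch below assumes scipy is available; note that `dblquad` integrates the inner variable (here y2, over (0, 1 − y1)) first:

```python
# Numerical check of Example 4.18:
# f(y1, y2) = 24*y1*y2 on the triangle y1 > 0, y2 > 0, y1 + y2 < 1.
from scipy.integrate import dblquad

def E(g):
    # E[g(Y1, Y2)]; dblquad's integrand takes the inner variable (y2) first
    val, _ = dblquad(lambda y2, y1: g(y1, y2) * 24 * y1 * y2,
                     0, 1, 0, lambda y1: 1 - y1)
    return val

E1, E2 = E(lambda y1, y2: y1), E(lambda y1, y2: y2)   # both 2/5
V1 = E(lambda y1, y2: y1**2) - E1**2                  # 1/25
V2 = E(lambda y1, y2: y2**2) - E2**2                  # 1/25
cov = E(lambda y1, y2: y1 * y2) - E1 * E2             # -2/75
var_sum = V1 + V2 + 2 * cov                           # 2/75 ≈ 0.027
```

The computed values match Cov(Y1, Y2) = −2/75 and V(Y1 + Y2) = 2/75 from the example.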
A two-variable calculus argument shows that the mean squared error of prediction Q(β0, β1) is minimized when

β1 = Cov(X, Y)/V(X) and β0 = E(Y) − β1E(X).

However, note that the value of β1, algebraically, is equal to

β1 = Cov(X, Y)/σ_X² = [Cov(X, Y)/(σ_X σ_Y)] × (σ_Y/σ_X) = ρ_{XY} (σ_Y/σ_X),

where
ρ_{XY} = Cov(X, Y)/(σ_X σ_Y).
The quantity ρ_{XY} is called the correlation coefficient between X and Y.

SUMMARY: The best linear predictor of Y, given X, is Ŷ = β0 + β1X, where
β1 = ρ_{XY} (σ_Y/σ_X) and β0 = E(Y) − β1E(X).

NOTES ON THE CORRELATION COEFFICIENT:
1. −1 ≤ ρ_{XY} ≤ 1; this can be proven using the Cauchy-Schwarz Inequality from calculus.
2. If ρ_{XY} = 1, then Y = β0 + β1X, where β1 > 0. That is, X and Y are perfectly positively linearly related; i.e., the bivariate probability distribution of (X, Y) lies entirely on a straight line with positive slope.
3. If ρ_{XY} = −1, then Y = β0 + β1X, where β1 < 0. That is, X and Y are perfectly negatively linearly related; i.e., the bivariate probability distribution of (X, Y) lies entirely on a straight line with negative slope.
4. If ρ_{XY} = 0, then X and Y are not linearly related.

NOTE: If X and Y are independent random variables, then ρ_{XY} = 0. However, again, the implication does not go the other way; that is, if ρ_{XY} = 0, this does not necessarily mean that X and Y are independent.

NOTE: In assessing the strength of the linear relationship between X and Y, the correlation coefficient is often preferred over the covariance, since ρ_{XY} is measured on a bounded, unitless scale; on the other hand, Cov(X, Y) can be any real number.

Example 4.19. In Example 4.16, we considered the bivariate model

f_{Y1,Y2}(y1, y2) = 3y1, for 0 < y2 < y1 < 1; 0, otherwise,

for Y1, the proportion of the capacity of the tank after being stocked, and Y2, the proportion of the capacity of the tank that is sold. What is ρ_{Y1,Y2}?
SOLUTION. In Example 4.16, we computed Cov(Y1, Y2) = 0.01875, so all we need is σ_{Y1} and σ_{Y2}. We also found that Y1 ~ beta(3, 1) and that Y2 has pdf

f_{Y2}(y2) = (3/2)(1 − y2²), for 0 < y2 < 1; 0, otherwise.

The variance of Y1 is
V(Y1) = (3)(1)/[(3 + 1)²(3 + 1 + 1)] = 3/80 ⟹ σ_{Y1} = √(3/80) ≈ 0.194.
Simple calculations using f_{Y2}(y2) show that E(Y2) = 3/8 and E(Y2²) = 1/5, so that
V(Y2) = 1/5 − (3/8)² ≈ 0.059 ⟹ σ_{Y2} ≈ √0.059 ≈ 0.244.
Thus,
ρ_{Y1,Y2} = Cov(Y1, Y2)/(σ_{Y1}σ_{Y2}) ≈ 0.01875/(0.194 × 0.244) ≈ 0.40. □
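The decimal arithmetic in Example 4.19 can also be done exactly. A sketch with symbolic integration, assuming sympy is available:

```python
# Exact moments for the model in Example 4.19: f(y1, y2) = 3*y1 on 0 < y2 < y1 < 1.
import sympy as sp

y1, y2 = sp.symbols("y1 y2", positive=True)
f = 3 * y1

def E(g):
    # E[g(Y1, Y2)]: integrate y2 over (0, y1) first, then y1 over (0, 1)
    return sp.integrate(g * f, (y2, 0, y1), (y1, 0, 1))

cov = E(y1 * y2) - E(y1) * E(y2)          # exactly 3/160 = 0.01875
rho = cov / sp.sqrt((E(y1**2) - E(y1)**2) * (E(y2**2) - E(y2)**2))
# rho simplifies to 3/sqrt(57) ≈ 0.397, matching the rounded 0.40 above
```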
4.9 Expectations and variances of linear functions of random variables

TERMINOLOGY: Suppose that Y1, Y2, ..., Yn are random variables and that a1, a2, ..., an are constants. The function

U = Σ_{i=1}^{n} a_iY_i = a1Y1 + a2Y2 + ··· + anYn

is called a linear combination of the random variables Y1, Y2, ..., Yn.

EXPECTED VALUE OF A LINEAR COMBINATION:

E(U) = E(Σ_{i=1}^{n} a_iY_i) = Σ_{i=1}^{n} a_iE(Y_i).

VARIANCE OF A LINEAR COMBINATION:

V(U) = V(Σ_{i=1}^{n} a_iY_i) = Σ_{i=1}^{n} a_i²V(Y_i) + 2Σ_{i<j} a_ia_jCov(Y_i, Y_j)
= Σ_{i=1}^{n} a_i²V(Y_i) + Σ_{i≠j} a_ia_jCov(Y_i, Y_j).

COVARIANCE BETWEEN TWO LINEAR COMBINATIONS: Suppose that

U1 = Σ_{i=1}^{n} a_iY_i = a1Y1 + a2Y2 + ··· + anYn
U2 = Σ_{j=1}^{m} b_jX_j = b1X1 + b2X2 + ··· + bmXm.

Then, it follows that

Cov(U1, U2) = Σ_{i=1}^{n} Σ_{j=1}^{m} a_ib_jCov(Y_i, X_j).

BIVARIATE CASE: Interest will often focus on situations wherein we have a linear combination of n = 2 random variables. In this setting,

E(a1Y1 + a2Y2) = a1E(Y1) + a2E(Y2)
V(a1Y1 + a2Y2) = a1²V(Y1) + a2²V(Y2) + 2a1a2Cov(Y1, Y2).

Similarly, when n = m = 2,

Cov(a1Y1 + a2Y2, b1X1 + b2X2) = a1b1Cov(Y1, X1) + a1b2Cov(Y1, X2) + a2b1Cov(Y2, X1) + a2b2Cov(Y2, X2).

Example 4.20. Achievement tests are usually seen in educational or employment settings; these tests attempt to measure how much you know about a certain topic in a particular area. Suppose that Y1, Y2, and Y3 represent scores for different parts of an exam. It is posited that Y1 has mean 12 and variance 4, Y2 has mean 16 and variance 9, Y3 has mean 20 and variance 16, Y1 and Y2 are independent, Cov(Y1, Y3) = 0.8, and Cov(Y2, Y3) = −6.7. Two different summary measures are computed to assess a subject's performance:

U1 = 0.5Y1 − 2Y2 + Y3 and U2 = 3Y1 − 2Y2 − Y3.

(a) Find E(U1) and V(U1).
(b) Find Cov(U1, U2).
SOLUTIONS. The mean of U1 is

E(U1) = E(0.5Y1 − 2Y2 + Y3) = 0.5E(Y1) − 2E(Y2) + E(Y3) = 0.5(12) − 2(16) + 20 = −6.

The variance of U1 is

V(U1) = V(0.5Y1 − 2Y2 + Y3)
= (0.5)²V(Y1) + (−2)²V(Y2) + V(Y3) + 2(0.5)(−2)Cov(Y1, Y2) + 2(0.5)(1)Cov(Y1, Y3) + 2(−2)(1)Cov(Y2, Y3)
= 0.25(4) + 4(9) + 16 + 2(0.5)(−2)(0) + 2(0.5)(0.8) + 2(−2)(−6.7) = 80.6.

The covariance between U1 and U2 is

Cov(U1, U2) = Cov(0.5Y1 − 2Y2 + Y3, 3Y1 − 2Y2 − Y3)
= (0.5)(3)V(Y1) + (0.5)(−2)Cov(Y1, Y2) + (0.5)(−1)Cov(Y1, Y3)
+ (−2)(3)Cov(Y2, Y1) + (−2)(−2)V(Y2) + (−2)(−1)Cov(Y2, Y3)
+ (1)(3)Cov(Y3, Y1) + (1)(−2)Cov(Y3, Y2) + (1)(−1)V(Y3)
= 1.5(4) + 0 − 0.5(0.8) + 0 + 4(9) + 2(−6.7) + 3(0.8) − 2(−6.7) − 16 = 28. □
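The bookkeeping in Example 4.20 is often easier in matrix form: stacking Y = (Y1, Y2, Y3)' with mean vector μ and covariance matrix Σ, the formulas above become E(a'Y) = a'μ, V(a'Y) = a'Σa, and Cov(a'Y, b'Y) = a'Σb. A sketch with the moments posited in the example, assuming numpy is available:

```python
# Example 4.20 recast with a mean vector and covariance matrix.
import numpy as np

mu = np.array([12.0, 16.0, 20.0])
Sigma = np.array([[4.0,  0.0,  0.8],    # Cov(Y1, Y2) = 0 by independence
                  [0.0,  9.0, -6.7],
                  [0.8, -6.7, 16.0]])
a = np.array([0.5, -2.0, 1.0])          # U1 = 0.5*Y1 - 2*Y2 + Y3
b = np.array([3.0, -2.0, -1.0])         # U2 = 3*Y1 - 2*Y2 - Y3

EU1 = a @ mu             # -6.0
VU1 = a @ Sigma @ a      # 80.6
cov12 = a @ Sigma @ b    # 28.0
```

The quadratic form a'Σa expands into exactly the nine-term sum written out above.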
4.10 The multinomial model

RECALL: When we discussed the binomial model in Chapter 2, each Bernoulli trial resulted in either a "success" or a "failure"; that is, on each trial, there were only two outcomes possible (e.g., infected/not, germinated/not, defective/not, etc.).

TERMINOLOGY: A multinomial experiment is simply a generalization of a binomial experiment. In particular, consider an experiment where
• the experiment consists of n trials (n is fixed),
• the outcome for any trial belongs to exactly one of k ≥ 2 classes,
• the probability that an outcome for a single trial falls into class i is given by p_i, for i = 1, 2, ..., k, where each p_i remains constant from trial to trial, and
• trials are independent.

DEFINITION: In a multinomial experiment, let Y_i denote the number of outcomes in class i, so that Y1 + Y2 + ··· + Yk = n, and denote Y = (Y1, Y2, ..., Yk). We call Y a multinomial random vector and write Y ~ mult(n, p1, p2, ..., pk), Σ_i p_i = 1.

NOTE: When k = 2, the multinomial random vector reduces to our well-known binomial situation. When k = 3, Y would be called a trinomial random vector.

JOINT PMF: If Y ~ mult(n, p1, p2, ..., pk), Σ_i p_i = 1, the pmf for Y is given by

p_Y(y1, y2, ..., yk) = [n!/(y1! y2! ··· yk!)] p1^{y1} p2^{y2} ··· pk^{yk},

for y_i = 0, 1, ..., n with Σ_i y_i = n; p_Y(y1, y2, ..., yk) = 0, otherwise.

Example 4.21. In a manufacturing experiment, we observe n = 10 parts, each of which can be classified as non-defective, defective, or reworkable. Define

Y1 = number of non-defective parts
Y2 = number of defective parts
Y3 = number of reworkable parts.

Assuming that each part (i.e., trial) is independent of other parts, a multinomial model applies and Y = (Y1, Y2, Y3) ~ mult(10, p1, p2, p3), Σ_i p_i = 1. Suppose that p1 = 0.90, p2 = 0.03, and p3 = 0.07. What is the probability that a sample of 10 contains 8 non-defective parts, 1 defective part, and 1 reworkable part?
SOLUTION. We want to compute p_{Y1,Y2,Y3}(8, 1, 1). This equals

p_{Y1,Y2,Y3}(8, 1, 1) = [10!/(8! 1! 1!)] (0.90)^8 (0.03)^1 (0.07)^1 ≈ 0.081. □

Example 4.22. At a number of clinic sites throughout Nebraska, chlamydia and gonorrhea testing is performed on individuals using urine or cervical-swab specimens. More than 30,000 of these tests are done annually by the Nebraska Public Health Laboratory. Suppose that on a given day, there are n = 280 subjects tested, and define
p1 = proportion of subjects with neither chlamydia nor gonorrhea
p2 = proportion of subjects with chlamydia but not gonorrhea
p3 = proportion of subjects with gonorrhea but not chlamydia
p4 = proportion of subjects with both chlamydia and gonorrhea.

Define Y = (Y1, Y2, Y3, Y4), where Y_i counts the number of subjects in category i. Assuming that subjects are independent, Y ~ mult(280, p1, p2, p3, p4), Σ_i p_i = 1. The pmf of Y is given by

p_Y(y1, y2, y3, y4) = [280!/(y1! y2! y3! y4!)] p1^{y1} p2^{y2} p3^{y3} p4^{y4},

for y_i = 0, 1, ..., 280 with Σ_i y_i = 280; p_Y(y1, y2, y3, y4) = 0, otherwise.

FACTS: If Y = (Y1, Y2, ..., Yk) ~ mult(n, p1, p2, ..., pk), Σ_i p_i = 1, then
• The marginal distribution of Y_i is b(n, p_i), for i = 1, 2, ..., k.
• E(Y_i) = np_i, for i = 1, 2, ..., k.
• V(Y_i) = np_i(1 − p_i), for i = 1, 2, ..., k.
• The joint distribution of (Y_i, Y_j) is trinomial(n, p_i, p_j, 1 − p_i − p_j).
• Cov(Y_i, Y_j) = −np_ip_j, for i ≠ j.

4.11 The bivariate normal distribution

TERMINOLOGY: The random vector (Y1, Y2) has a bivariate normal distribution if its joint pdf is given by

f_{Y1,Y2}(y1, y2) = [1/(2πσ1σ2√(1 − ρ²))] e^{−Q/2}, for (y1, y2) ∈ R²; 0, otherwise,

where

Q = [1/(1 − ρ²)] { [(y1 − μ1)/σ1]² − 2ρ[(y1 − μ1)/σ1][(y2 − μ2)/σ2] + [(y2 − μ2)/σ2]² }.

We write (Y1, Y2) ~ N2(μ1, μ2, σ1², σ2², ρ). There are 5 parameters associated with this bivariate distribution: the marginal means μ1 and μ2, the marginal variances σ1² and σ2², and the correlation ρ ≡ ρ_{Y1,Y2}.

FACTS ABOUT THE BIVARIATE NORMAL DISTRIBUTION:
1. Marginally, Y1 ~ N(μ1, σ1²) and Y2 ~ N(μ2, σ2²).
2. Y1 and Y2 are independent ⟺ ρ = 0. (This is only true for the bivariate normal distribution; remember, this does not hold in general.)
3. The conditional distribution
Y1 | {Y2 = y2} ~ N(μ1 + ρ(σ1/σ2)(y2 − μ2), σ1²(1 − ρ²)).
4. The conditional distribution
Y2 | {Y1 = y1} ~ N(μ2 + ρ(σ2/σ1)(y1 − μ1), σ2²(1 − ρ²)).

EXERCISE: Suppose that (Y1, Y2) ~ N2(0, 0, 1, 1, 0.5). What is P(Y2 > 0.85 | Y1 = 0.2)?
ANSWER: From the last result, note that, conditional on Y1 = y1 = 0.2, Y2 ~ N(0.1, 0.75). Thus,
P(Y2 > 0.85 | Y1 = 0.2) = P(Z > (0.85 − 0.1)/√0.75) = P(Z > 0.87) ≈ 0.193.
Interpret this value as an area!
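The bivariate normal exercise above reduces to a single univariate normal tail probability once we condition. A numerical sketch, assuming scipy is available:

```python
# Check of the exercise: (Y1, Y2) ~ N2(0, 0, 1, 1, rho = 0.5),
# so Y2 | {Y1 = 0.2} ~ N(0.1, 0.75) by fact 4 above.
from math import sqrt
from scipy.stats import norm

rho, y1 = 0.5, 0.2
cond_mean = rho * y1          # mu2 + rho*(sigma2/sigma1)*(y1 - mu1) = 0.1
cond_var = 1 - rho**2         # sigma2^2 * (1 - rho^2) = 0.75

p = norm.sf((0.85 - cond_mean) / sqrt(cond_var))   # upper-tail probability
print(round(p, 4))            # ≈ 0.1932
```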
4.12 Conditional expectation

4.12.1 Conditional means and curves of regression

TERMINOLOGY: Suppose that X and Y are continuous random variables and that g(X) and h(Y) are functions of X and Y, respectively. Recall that the conditional distributions are denoted by f_{X|Y}(x|y) and f_{Y|X}(y|x). Then,

E[g(X) | Y = y] = ∫_R g(x) f_{X|Y}(x|y) dx
E[h(Y) | X = x] = ∫_R h(y) f_{Y|X}(y|x) dy.

If X and Y are discrete, then sums replace integrals.

IMPORTANT: It is important to see that, in general,
• E[g(X) | Y = y] is a function of y, and
• E[h(Y) | X = x] is a function of x.

CONDITIONAL MEANS: In the definition above, if g(X) = X and h(Y) = Y, we get, in the continuous case,

E(X | Y = y) = ∫_R x f_{X|Y}(x|y) dx
E(Y | X = x) = ∫_R y f_{Y|X}(y|x) dy.

E(X | Y = y) is called the conditional mean of X, given Y = y; it is the mean of the conditional distribution f_{X|Y}(x|y). On the other hand, E(Y | X = x) is the conditional mean of Y, given X = x; it is the mean of the conditional distribution f_{Y|X}(y|x).

Example 4.23. In a simple genetics model, the proportion, say X, of a population with trait 1 is always less than the proportion, say Y, of a population with trait 2. In Example 4.3, we saw that the random vector (X, Y) has joint pdf

f_{X,Y}(x, y) = 6x, for 0 < x < y < 1; 0, otherwise.

In Example 4.5, we derived the conditional distributions

f_{X|Y}(x|y) = 2x/y², for 0 < x < y; 0, otherwise,
and
f_{Y|X}(y|x) = 1/(1 − x), for x < y < 1; 0, otherwise.

Thus, the conditional mean of X, given Y = y, is

E(X | Y = y) = ∫₀^y x f_{X|Y}(x|y) dx = ∫₀^y x (2x/y²) dx = 2y/3.

Similarly, the conditional mean of Y, given X = x, is

E(Y | X = x) = ∫_x^1 y f_{Y|X}(y|x) dy = ∫_x^1 y [1/(1 − x)] dy = (1 + x)/2.

That E(Y | X = x) = (1 + x)/2 is not surprising, because Y | {X = x} ~ U(x, 1). □

TERMINOLOGY: Suppose that (X, Y) is a bivariate random vector.
• The graph of E(X | Y = y) versus y is called the curve of regression of X on Y.
• The graph of E(Y | X = x) versus x is called the curve of regression of Y on X.
The curve of regression of Y on X from Example 4.23 is depicted in Figure 4.19.

[Figure 4.19: The curve of regression E(Y | X = x) versus x in Example 4.23.]

4.12.2 Iterated means and variances

REMARK: In general, E(X | Y = y) is a function of y, and y is fixed (not random); thus, E(X | Y = y) is a fixed number. However, E(X | Y) is a function of Y; thus, E(X | Y) is a random variable! Furthermore, as with any random variable, it has a mean and variance associated with it!!

ITERATED LAWS: Suppose that X and Y are random variables. Then the laws of iterated expectation and variance, respectively, are given by

E(X) = E[E(X | Y)]
and
V(X) = E[V(X | Y)] + V[E(X | Y)].

NOTE: When considering the quantity E[E(X | Y)], the inner expectation is taken with respect to the conditional distribution f_{X|Y}(x|y); however, since E(X | Y) is a function of Y, the outer expectation is taken with respect to the marginal distribution f_Y(y).
Proof. We will prove that E(X) = E[E(X | Y)] for the continuous case. Note that

E(X) = ∫_R ∫_R x f_{X,Y}(x, y) dx dy
= ∫_R ∫_R x f_{X|Y}(x|y) f_Y(y) dx dy
= ∫_R [∫_R x f_{X|Y}(x|y) dx] f_Y(y) dy
= ∫_R E(X | Y = y) f_Y(y) dy = E[E(X | Y)]. □

Example 4.24. Suppose that, in a field experiment, we observe Y, the number of plots, out of n, that respond to a treatment. However, we don't know the value of p, the probability of response, and, furthermore, we think that it may be a function of location, temperature, precipitation, etc. In this situation, it might be appropriate to regard p as a random variable! Specifically, suppose that the random variable P varies according to a beta(α, β) distribution. That is, we assume the hierarchical structure

Y | {P = p} ~ binomial(n, p)
P ~ beta(α, β).

The unconditional mean of Y can be computed using the iterated expectation rule:

E(Y) = E[E(Y | P)] = E(nP) = nE(P) = nα/(α + β).

The unconditional variance of Y is given by

V(Y) = E[V(Y | P)] + V[E(Y | P)]
= E[nP(1 − P)] + V(nP)
= n[E(P) − E(P²)] + n²V(P)
= nE(P) − n{V(P) + [E(P)]²} + n²V(P)
= nE(P)[1 − E(P)] + n(n − 1)V(P)
= nαβ/(α + β)² + n(n − 1)αβ/[(α + β)²(α + β + 1)],

where the second term represents the "extra variation." Unconditionally, the random variable Y follows a beta-binomial distribution. This is a popular probability model for situations wherein one observes binomial-type responses, but where the variance is suspected to be larger than the usual binomial variance. □

BETA-BINOMIAL PMF: The probability mass function for a beta-binomial random variable Y is given by

p_Y(y) = ∫₀¹ p_{Y|P}(y|p) f_P(p) dp
= ∫₀¹ (n choose y) p^y (1 − p)^{n−y} × [Γ(α + β)/(Γ(α)Γ(β))] p^{α−1}(1 − p)^{β−1} dp
= (n choose y) [Γ(α + β)/(Γ(α)Γ(β))] × [Γ(y + α)Γ(n − y + β)/Γ(n + α + β)],

for y = 0, 1, ..., n, and p_Y(y) = 0, otherwise.

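The beta-binomial mean and variance derived in Example 4.24 can be verified by brute-force summation of the pmf. A sketch using only the standard library; the values n = 10, α = 2, β = 3 are hypothetical, chosen for illustration:

```python
# Verify the beta-binomial mean/variance formulas by summing the pmf.
# Uses p_Y(y) = C(n, y) * B(y + a, n - y + b) / B(a, b), computed via log-gammas.
from math import comb, lgamma, exp

def betabin_pmf(y, n, a, b):
    logB = lambda u, v: lgamma(u) + lgamma(v) - lgamma(u + v)  # log Beta function
    return comb(n, y) * exp(logB(y + a, n - y + b) - logB(a, b))

n, a, b = 10, 2.0, 3.0                   # hypothetical illustration values
pmf = [betabin_pmf(y, n, a, b) for y in range(n + 1)]

total = sum(pmf)                         # should be 1
mean = sum(y * p for y, p in zip(range(n + 1), pmf))
var = sum(y * y * p for y, p in zip(range(n + 1), pmf)) - mean**2

# Closed forms from Example 4.24:
mean_formula = n * a / (a + b)
var_formula = (n * a * b / (a + b)**2
               + n * (n - 1) * a * b / ((a + b)**2 * (a + b + 1)))
```

Note that the variance formula exceeds the ordinary binomial variance by the "extra variation" term, as the example claims.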