# PROBABILITY STAT 511


This 134 page Class Notes was uploaded by Shane Marks on Monday October 26, 2015. The Class Notes belongs to STAT 511 at University of South Carolina - Columbia taught by J. Tebbs in Fall. Since its upload, it has received 23 views. For similar materials see /class/229665/stat-511-university-of-south-carolina-columbia in Statistics at University of South Carolina - Columbia.


STAT/MATH 511: PROBABILITY
Fall 2007 Lecture Notes
Joshua M. Tebbs
Department of Statistics, University of South Carolina

## Contents

**1 Probability**
- 1.1 Introduction
- 1.2 Sample spaces
- 1.3 Basic set theory
- 1.4 Properties of probability
- 1.5 Discrete probability models and events
- 1.6 Tools for counting sample points
  - 1.6.1 The multiplication rule
  - 1.6.2 Permutations
  - 1.6.3 Combinations
- 1.7 Conditional probability
- 1.8 Independence
- 1.9 Law of Total Probability and Bayes' Rule

**2 Discrete Distributions**
- 2.1 Random variables
- 2.2 Probability distributions for discrete random variables
- 2.3 Mathematical expectation
- 2.4 Variance
- 2.5 Moment generating functions
- 2.6 Binomial distribution
- 2.7 Geometric distribution
- 2.8 Negative binomial distribution
- 2.9 Hypergeometric distribution
- 2.10 Poisson distribution

**3 Continuous Distributions**
- 3.1 Introduction
- 3.2 Cumulative distribution functions
- 3.3 Continuous random variables
- 3.4 Mathematical expectation
  - 3.4.1 Expected values
  - 3.4.2 Variance
  - 3.4.3 Moment generating functions
- 3.5 Uniform distribution
- 3.6 Normal distribution
- 3.7 The gamma family of pdfs
  - 3.7.1 Exponential distribution
  - 3.7.2 Gamma distribution
  - 3.7.3 Chi-square distribution
- 3.8 Beta distribution
- 3.9 Chebyshev's Inequality

**4 Multivariate Distributions**
- 4.1 Introduction
- 4.2 Discrete random vectors
- 4.3 Continuous random vectors
- 4.4 Marginal distributions
- 4.5 Conditional distributions
- 4.6 Independent random variables
- 4.7 Expectations of functions of random variables
- 4.8 Covariance and correlation
  - 4.8.1 Covariance
  - 4.8.2 Correlation
- 4.9 Expectations and variances of linear functions of random variables
- 4.10 The multinomial model
- 4.11 The bivariate normal distribution
- 4.12 Conditional expectation
  - 4.12.1 Conditional means and curves of regression
  - 4.12.2 Iterated means and variances
## 1 Probability

Complementary reading: Chapter 2 (WMS).

### 1.1 Introduction

TERMINOLOGY: The text defines **probability** as a measure of one's belief in the occurrence of a future event. It is also sometimes called "the mathematics of uncertainty."

EVENTS: Here are some events we may wish to assign probabilities to:

- tomorrow's temperature exceeding 80 degrees
- manufacturing a defective part
- concluding one fertilizer is superior to another when it isn't
- the NASDAQ losing 5 percent of its value
- you earning a "B" or better in this course

ASSIGNING PROBABILITIES TO EVENTS: How do we assign probabilities to events? There are three general approaches:

1. **Subjective approach**: this is based on feeling and may not even be scientific.
2. **Relative frequency approach**: this approach can be used when some random phenomenon is observed repeatedly under identical conditions.
3. **Axiomatic approach**: this is the approach we will take in this course.

*Figure 1.1. The proportion of tosses which result in a "2"; each plot represents 1000 rolls of a fair die.*

**Example 1.1.** *An example illustrating the relative frequency approach to probability.* Suppose we roll a die 1000 times and record the number of times we observe a "2." Let $A$ denote this event. The relative frequency approach says that

$$P(A) \approx \frac{n_A}{n},$$

where $n_A$ denotes the frequency of the event, and $n$ denotes the number of trials performed. The ratio $n_A/n$ is sometimes called the relative frequency. The symbol $P(A)$ is shorthand for "the probability that $A$ occurs."

RELATIVE FREQUENCY APPROACH: Continuing with our example, suppose that $n_A = 158$. Then, we would estimate $P(A)$ with $158/1000 = 0.158$. If we performed this experiment repeatedly, the relative frequency approach says that

$$\frac{n_A}{n} \to P(A), \quad \text{as } n \to \infty.$$

Of course, if the die is unbiased, $n_A/n \approx P(A) = 1/6$. □
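The notes contain no code, but the relative frequency approach is easy to simulate. Here is a minimal Python sketch (the function name, seed, and roll counts are our own choices, not part of the notes): for a fair die, the relative frequency of a "2" should drift toward $1/6 \approx 0.167$ as the number of rolls grows.

```python
import random

random.seed(511)

def relative_frequency(n_rolls: int, face: int = 2) -> float:
    """Roll a fair die n_rolls times; return the relative frequency of `face`."""
    hits = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == face)
    return hits / n_rolls

# The estimate stabilizes near 1/6 as n grows
for n in (100, 1000, 100000):
    print(n, relative_frequency(n))
```

Re-running with different seeds changes the early estimates much more than the later ones, which is exactly the convergence the relative frequency approach relies on.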
### 1.2 Sample spaces

TERMINOLOGY: In probability applications, it is common to perform some random experiment and then observe an outcome. The set of all possible outcomes for an experiment is called the **sample space**, hereafter denoted by $S$.

**Example 1.2.** The Michigan state lottery calls for a three-digit integer to be selected:
$$S = \{000, 001, 002, \ldots, 998, 999\}. \ \square$$

**Example 1.3.** An industrial experiment consists of observing the lifetime of a certain battery. If lifetimes are measured in hours, the sample space could be any one of
$$S_1 = \{w : w \ge 0\}$$
$$S_2 = \{0, 1, 2, 3, \ldots\}$$
$$S_3 = \{\text{defective}, \text{not defective}\}. \ \square$$

MORAL: Sample spaces are not unique; in fact, how we define the sample space has a direct influence on how we assign probabilities to events.

### 1.3 Basic set theory

TERMINOLOGY: A **countable** set $A$ is one whose elements can be put into a one-to-one correspondence with $\mathbb{N} = \{1, 2, \ldots\}$, the set of natural numbers (i.e., there exists an injection with domain $A$ and range contained in $\mathbb{N}$). A set that is not countable is called an **uncountable** set.

TERMINOLOGY: Countable sets can be further divided up into two types. A **countably infinite** set has an infinite number of elements. A **countably finite** set has a finite number of elements.

TERMINOLOGY: Suppose that $S$ is a nonempty set. We say that $A$ is a **subset** of $S$, and write $A \subset S$ or $A \subseteq S$, if $\omega \in A \Rightarrow \omega \in S$.

In probability applications, $S$ will denote a sample space, $A$ will represent an event to which we wish to assign a probability, and $\omega$ usually denotes a possible experimental outcome. If $\omega \in A$, we would say that "the event $A$ has occurred."

TERMINOLOGY: The **null set**, denoted as $\emptyset$, is the set that contains no elements.

TERMINOLOGY: The **union** of two sets is the set of all elements in either set or both. We denote the union of two sets $A$ and $B$ as $A \cup B$. In set notation,
$$A \cup B = \{\omega : \omega \in A \text{ or } \omega \in B\}.$$

TERMINOLOGY: The **intersection** of two sets $A$ and $B$ is the set containing those elements which are in both sets. We denote the intersection of two sets $A$ and $B$ as $A \cap B$. In set notation,
$$A \cap B = \{\omega : \omega \in A \text{ and } \omega \in B\}.$$
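As an illustrative aside (the events chosen are ours, not the text's), these set operations map directly onto Python's built-in `set` type, which can be handy for checking small examples by hand:

```python
# Sample space for one roll of a die, and two events
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "roll is even"
B = {4, 5, 6}   # "roll is at least 4"

union = A | B          # elements in A, B, or both
intersection = A & B   # elements in both A and B
complement_A = S - A   # elements of S not in A

print(union)         # {2, 4, 5, 6}
print(intersection)  # {4, 6}
print(complement_A)  # {1, 3, 5}
```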
EXTENSION: We can extend the notion of unions and intersections to more than two sets. Suppose that $A_1, A_2, \ldots, A_n$ is a finite sequence of sets. The union of these $n$ sets is
$$\bigcup_{j=1}^{n} A_j = A_1 \cup A_2 \cup \cdots \cup A_n = \{\omega : \omega \in A_j \text{ for at least one } j = 1, \ldots, n\},$$
and the intersection of the $n$ sets is
$$\bigcap_{j=1}^{n} A_j = A_1 \cap A_2 \cap \cdots \cap A_n = \{\omega : \omega \in A_j \text{ for all } j = 1, \ldots, n\}.$$

EXTENSION: Suppose that $A_1, A_2, \ldots$ is a countable sequence of sets. The union and intersection of this infinite collection of sets is
$$\bigcup_{j=1}^{\infty} A_j = \{\omega : \omega \in A_j \text{ for at least one } j\} \quad \text{and} \quad \bigcap_{j=1}^{\infty} A_j = \{\omega : \omega \in A_j \text{ for all } j\}.$$

**Example 1.4.** Define the sequence of sets $A_j = [1, 1 + 1/j]$ for $j = 1, 2, \ldots$. Then
$$\bigcup_{j=1}^{\infty} A_j = [1, 2] \quad \text{and} \quad \bigcap_{j=1}^{\infty} A_j = \{1\}. \ \square$$

TERMINOLOGY: The **complement** of a set $A$ is the set of all elements not in $A$ but still in $S$. We denote the complement as $\bar{A}$. In set notation,
$$\bar{A} = \{\omega \in S : \omega \notin A\}.$$

TERMINOLOGY: We say that $A$ is a subset of $B$, and write $A \subset B$ or $A \subseteq B$, if $\omega \in A \Rightarrow \omega \in B$. Thus, if $A$ and $B$ are events in an experiment and $A \subset B$, then if $A$ occurs, $B$ must occur as well.

**Distributive Laws:**
1. $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
2. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

**DeMorgan's Laws:**
1. $\overline{A \cap B} = \bar{A} \cup \bar{B}$
2. $\overline{A \cup B} = \bar{A} \cap \bar{B}$

TERMINOLOGY: We call two events $A$ and $B$ **mutually exclusive**, or **disjoint**, if $A \cap B = \emptyset$. Extending this definition to a finite or countable collection of sets is obvious.

### 1.4 Properties of probability

THE THREE AXIOMS OF PROBABILITY: Given a nonempty sample space $S$, the measure $P(A)$ is a set function satisfying three axioms:
1. $P(A) \ge 0$ for every $A \subseteq S$.
2. $P(S) = 1$.
3. If $A_1, A_2, \ldots$ is a countable sequence of pairwise mutually exclusive events (i.e., $A_i \cap A_j = \emptyset$ for $i \ne j$) in $S$, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$

IMPORTANT RESULTS: The following results are important properties of the probability set function $P$, and each follows from the Kolmogorov axioms (those just stated). All events below are assumed to be subsets of $S$.

1. **Complement rule.** For any event $A$, $P(A) = 1 - P(\bar{A})$.
*Proof.* Note that $S = A \cup \bar{A}$. Thus, since $A$ and $\bar{A}$ are disjoint, $P(A \cup \bar{A}) = P(A) + P(\bar{A})$ by Axiom 3. By Axiom 2, $P(S) = 1$. Thus, $1 = P(S) = P(A \cup \bar{A}) = P(A) + P(\bar{A})$. □

2. $P(\emptyset) = 0$.
*Proof.* Take $A = \emptyset$ and $\bar{A} = S$. Use the last result and Axiom 2. □

3. **Monotonicity property.** Suppose that $A$ and $B$ are two events such that $A \subset B$. Then $P(A) \le P(B)$.
*Proof.* Write $B = A \cup (B \cap \bar{A})$. Clearly, $A$ and $B \cap \bar{A}$ are disjoint. Thus, by Axiom 3, $P(B) = P(A) + P(B \cap \bar{A})$. Because $P(B \cap \bar{A}) \ge 0$, we
are done. □

4. For any event $A$, $P(A) \le 1$.
*Proof.* Since $A \subset S$, this follows from the monotonicity property and Axiom 2. □

5. **Inclusion-exclusion.** Suppose that $A$ and $B$ are two events. Then
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$
*Proof.* Write $A \cup B = A \cup (\bar{A} \cap B)$. Then, since $A$ and $\bar{A} \cap B$ are disjoint, by Axiom 3, $P(A \cup B) = P(A) + P(\bar{A} \cap B)$. Now write $B = (A \cap B) \cup (\bar{A} \cap B)$. Clearly, $A \cap B$ and $\bar{A} \cap B$ are disjoint. Thus, again by Axiom 3, $P(B) = P(A \cap B) + P(\bar{A} \cap B)$. Combining the last two statements gives the result. □

**Example 1.5.** The probability that train 1 is on time is 0.95, and the probability that train 2 is on time is 0.93. The probability that both are on time is 0.90.
(a) What is the probability that at least one train is on time?
SOLUTION: Denote by $A_i$ the event that train $i$ is on time, for $i = 1, 2$. Then
$$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) = 0.95 + 0.93 - 0.90 = 0.98. \ \square$$
(b) What is the probability that neither train is on time?
SOLUTION: By DeMorgan's Law,
$$P(\bar{A}_1 \cap \bar{A}_2) = P(\overline{A_1 \cup A_2}) = 1 - P(A_1 \cup A_2) = 1 - 0.98 = 0.02. \ \square$$

EXTENSION: The inclusion-exclusion formula can be extended to any finite sequence of sets $A_1, A_2, \ldots, A_n$. For example, if $n = 3$,
$$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3).$$
In general, the inclusion-exclusion formula can be written for any finite sequence:
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i_1 < i_2} P(A_{i_1} \cap A_{i_2}) + \sum_{i_1 < i_2 < i_3} P(A_{i_1} \cap A_{i_2} \cap A_{i_3}) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n).$$
Of course, if the sets $A_1, A_2, \ldots, A_n$ are disjoint, then we arrive back at
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i),$$
a result implied by Axiom 3.

### 1.5 Discrete probability models and events

TERMINOLOGY: If a sample space for an experiment contains a finite or countable number of sample points, we call it a **discrete sample space**.

- Finite: "number of sample points $< \infty$"
- Countable: "number of sample points may equal $\infty$, but can be counted" (i.e., sample points may be put into a 1:1 correspondence with $\mathbb{N} = \{1, 2, \ldots\}$)

**Example 1.6.** A standard roulette wheel contains an array of numbered compartments referred to as "pockets." The pockets are either red, black, or green. The numbers 1 through 36 are evenly split between red and black, while 0 and 00 are green pockets. On
the next play, one may be interested in the following events:
$$A_1 = \{13\}, \quad A_2 = \{\text{red}\}, \quad A_3 = \{0, 00\}. \ \square$$

TERMINOLOGY: A **simple event** is one that cannot be decomposed; that is, a simple event corresponds to exactly one sample point $\omega$. **Compound events** are those events that contain more than one sample point. In Example 1.6, because $A_1$ contains only one sample point, it is a simple event. The events $A_2$ and $A_3$ contain more than one sample point; thus, they are compound events.

STRATEGY: Computing the probability of a compound event can be done by
1. identifying all sample points associated with the event;
2. adding up the probabilities associated with each sample point.

NOTATION: We have used $\omega$ to denote an element in a set (i.e., a sample point in an event). In a more probabilistic spirit, your authors use the symbol $E_i$ to denote the $i$th sample point (i.e., simple event). Thus, if $A$ denotes any compound event,
$$P(A) = \sum_{i : E_i \in A} P(E_i).$$
We simply sum up the simple-event probabilities for all $i$ such that $E_i \in A$.

RESULT: Suppose a discrete sample space $S$ contains $N < \infty$ sample points, each of which is equally likely. If the event $A$ consists of $n_a$ sample points, then $P(A) = n_a/N$.
*Proof.* Write $S = E_1 \cup E_2 \cup \cdots \cup E_N$, where $E_i$ corresponds to the $i$th sample point, $i = 1, 2, \ldots, N$. Then
$$1 = P(S) = P(E_1 \cup E_2 \cup \cdots \cup E_N) = \sum_{i=1}^{N} P(E_i).$$
Now, as $P(E_1) = P(E_2) = \cdots = P(E_N)$, we have that
$$1 = \sum_{i=1}^{N} P(E_i) = N \, P(E_1),$$
and thus $P(E_1) = P(E_2) = \cdots = P(E_N) = 1/N$. Without loss of generality, take $A = E_1 \cup E_2 \cup \cdots \cup E_{n_a}$. Then
$$P(A) = P(E_1 \cup E_2 \cup \cdots \cup E_{n_a}) = \sum_{i=1}^{n_a} P(E_i) = \sum_{i=1}^{n_a} \frac{1}{N} = \frac{n_a}{N}. \ \square$$

### 1.6 Tools for counting sample points

#### 1.6.1 The multiplication rule

MULTIPLICATION RULE: Consider an experiment consisting of $k \ge 2$ "stages," where
$$n_1 = \text{number of ways stage 1 can occur}$$
$$n_2 = \text{number of ways stage 2 can occur}$$
$$\vdots$$
$$n_k = \text{number of ways stage } k \text{ can occur}.$$
Then there are
$$\prod_{j=1}^{k} n_j = n_1 \times n_2 \times \cdots \times n_k$$
different outcomes in the experiment.

**Example 1.7.** An experiment consists of rolling two dice. Envision stage 1 as rolling the first and stage 2 as rolling the second. Here, $n_1 = 6$ and $n_2 = 6$. By the multiplication rule, there are $n_1 \times n_2 = 6 \times 6 = 36$ different outcomes. □
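The multiplication rule is exactly what a Cartesian product computes, so small counting claims like Example 1.7 can be checked mechanically. A short illustrative sketch (our own, not from the notes) using `itertools.product`:

```python
from itertools import product

# Stage 1 x stage 2: every ordered outcome from rolling two dice
outcomes = list(product(range(1, 7), repeat=2))

print(len(outcomes))  # 36, matching n1 * n2 = 6 * 6
print(outcomes[:3])   # the enumeration starts (1, 1), (1, 2), (1, 3)
```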
**Example 1.8.** In a field experiment, I want to form all possible treatment combinations among the three factors:

- Factor 1: Fertilizer (60 kg, 80 kg, 100 kg): 3 levels
- Factor 2: Insects (infected/not infected): 2 levels
- Factor 3: Temperature (70°F, 90°F): 2 levels

Here, $n_1 = 3$, $n_2 = 2$, and $n_3 = 2$. Thus, by the multiplication rule, there are $n_1 \times n_2 \times n_3 = 12$ different treatment combinations. □

**Example 1.9.** Suppose that an Iowa license plate consists of seven places; the first three are occupied by letters, the remaining four with numbers. Compute the total number of possible orderings if
(a) there are no letter/number restrictions;
(b) repetition of letters is prohibited;
(c) repetition of numbers is prohibited;
(d) repetition of numbers and letters is prohibited.
ANSWERS:
(a) $26 \times 26 \times 26 \times 10 \times 10 \times 10 \times 10 = 175{,}760{,}000$
(b) $26 \times 25 \times 24 \times 10 \times 10 \times 10 \times 10 = 156{,}000{,}000$
(c) $26 \times 26 \times 26 \times 10 \times 9 \times 8 \times 7 = 88{,}583{,}040$
(d) $26 \times 25 \times 24 \times 10 \times 9 \times 8 \times 7 = 78{,}624{,}000$ □

#### 1.6.2 Permutations

TERMINOLOGY: A **permutation** is an arrangement of distinct objects in a particular order. Order is important.

PROBLEM: Suppose that we have $n$ distinct objects and we want to order (or permute) these objects. Thinking of $n$ slots, we will put one object in each slot. There are

- $n$ different ways to choose the object for slot 1,
- $n - 1$ different ways to choose the object for slot 2,
- $n - 2$ different ways to choose the object for slot 3, and so on, down to
- 2 different ways to choose the object for slot $n - 1$, and
- 1 way to choose for the last slot.

PUNCHLINE: By the multiplication rule, there are
$$n(n-1)(n-2) \cdots 2 \cdot 1 = n!$$
different ways to order (permute) the $n$ distinct objects.

**Example 1.10.** My bookshelf has 10 books on it. How many ways can I permute the 10 books on the shelf? ANSWER: $10! = 3{,}628{,}800$. □
**Example 1.11.** Now suppose that in Example 1.10 there are 4 math books, 2 chemistry books, 3 physics books, and 1 statistics book. I want to order the 10 books so that all books of the same subject are together. How many ways can I do this?
SOLUTION: Use the multiplication rule:

- Stage 1: Permute the 4 math books: $4!$
- Stage 2: Permute the 2 chemistry books: $2!$
- Stage 3: Permute the 3 physics books: $3!$
- Stage 4: Permute the 1 statistics book: $1!$
- Stage 5: Permute the 4 subjects (m, c, p, s): $4!$

Thus, there are $4! \times 2! \times 3! \times 1! \times 4! = 6912$ different orderings. □

PERMUTATIONS: With a collection of $n$ distinct objects, we want to choose and permute $r$ of them ($r \le n$). The number of ways to do this is
$$P_{n,r} \equiv \frac{n!}{(n-r)!}.$$
The symbol $P_{n,r}$ is read "the permutation of $n$ things taken $r$ at a time."
*Proof.* Envision $r$ slots. There are $n$ ways to fill the first slot, $n - 1$ ways to fill the second slot, and so on, until we get to the $r$th slot, in which case there are $n - r + 1$ ways to fill it. Thus, by the multiplication rule, there are
$$n(n-1) \cdots (n-r+1) = \frac{n!}{(n-r)!}$$
different permutations. □

**Example 1.12.** With a group of 5 people, I want to choose a committee with three members: a president, a vice president, and a secretary. There are
$$P_{5,3} = \frac{5!}{(5-3)!} = \frac{120}{2} = 60$$
different committees possible. Here, note that order is important: for any 3 people selected, there are $3! = 6$ different committees possible. □

**Example 1.13.** In an agricultural experiment, we are examining 10 plots of land; however, only four can be used in an experiment run to test four new different fertilizers. How many ways can I choose these four plots and then assign fertilizers?
SOLUTION: There are
$$P_{10,4} = \frac{10!}{(10-4)!} = 5040$$
different permutations. Here, we are assuming fertilizer order is important.
(a) What is the probability of observing the permutation (7, 4, 2, 6)?
(b) What is the probability of observing a permutation with only even-numbered plots?
ANSWERS: (a) $1/5040$. (b) $120/5040$. □
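Since Python 3.8, the standard library exposes these counts directly, so the committee and plot examples above can be verified in a couple of lines. An illustrative sketch (the variable names are ours):

```python
from math import perm

# P(n, r) = n! / (n - r)!: ordered selections of r objects from n
print(perm(5, 3))    # 60 committees of (president, VP, secretary) from 5 people
print(perm(10, 4))   # 5040 ordered fertilizer assignments to 4 of 10 plots

# Probability of one particular ordered outcome under equally likely permutations
p_specific = 1 / perm(10, 4)
print(p_specific)
```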
CURIOSITY: What happens if the objects to permute are not distinct?

**Example 1.14.** Consider the word PEPPER. How many permutations of the letters are possible?
TRICK: Initially, treat all letters as distinct objects by writing, say, $P_1 E_1 P_2 P_3 E_2 R$. With $P_1 E_1 P_2 P_3 E_2 R$, there are $6! = 720$ different orderings of these distinct objects. Now, we recognize that there are

- $3!$ ways to permute the P's,
- $2!$ ways to permute the E's,
- $1!$ ways to permute the R's.

Thus, $6!$ is $3! \times 2! \times 1!$ times too large, so we need to divide $6!$ by $3! \times 2! \times 1!$; i.e., there are
$$\frac{6!}{3! \, 2! \, 1!} = 60$$
possible permutations. □

MULTINOMIAL COEFFICIENTS: Suppose that in a set of $n$ objects, there are $n_1$ that are similar, $n_2$ that are similar, ..., $n_k$ that are similar, where $n_1 + n_2 + \cdots + n_k = n$. The number of permutations (i.e., distinguishable permutations, in the sense that the objects are put into distinct groups) of the $n$ objects is given by the **multinomial coefficient**
$$\binom{n}{n_1, n_2, \ldots, n_k} = \frac{n!}{n_1! \, n_2! \cdots n_k!}.$$

NOTE: Multinomial coefficients arise in the algebraic expansion of the multinomial expression $(x_1 + x_2 + \cdots + x_k)^n$; i.e.,
$$(x_1 + x_2 + \cdots + x_k)^n = \sum \binom{n}{n_1, n_2, \ldots, n_k} x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k},$$
where the sum is over all nonnegative integers $n_1, n_2, \ldots, n_k$ with $n_1 + n_2 + \cdots + n_k = n$.

**Example 1.15.** How many signals, each consisting of 9 flags in a line, can be made from 4 white flags, 2 blue flags, and 3 yellow flags?
ANSWER:
$$\binom{9}{4, 2, 3} = \frac{9!}{4! \, 2! \, 3!} = 1260. \ \square$$

**Example 1.16.** In Example 1.15, assuming all permutations are equally likely, what is the probability that all of the white flags are grouped together? I will offer two solutions; the solutions differ in the way I construct the sample space. Define
$$A = \{\text{all four white flags are grouped together}\}.$$
SOLUTION 1: Work with a sample space that does not treat the flags as distinct objects, but merely considers color. Then, we know from Example 1.15 that there are 1260 different orderings. Thus, $N$ = number of sample points in $S$ = 1260. Let $n_a$ denote the number of ways that $A$ can occur. We find $n_a$ by using the multiplication rule:

- Stage 1: Pick four adjacent slots: $n_1 = 6$.
- Stage 2: With the remaining 5 slots, permute the 2 blues and 3 yellows: $n_2 = 10$.

Thus, $n_a = 6 \times 10 = 60$. Finally, since we have equally likely outcomes, $P(A) = n_a/N = 60/1260 \approx 0.0476$. □
SOLUTION 2: Initially, treat all 9 flags as distinct objects, i.e., $W_1 W_2 W_3 W_4 B_1 B_2 Y_1 Y_2 Y_3$, and consider the sample space consisting of the $9!$ different permutations of these 9 distinct objects. Then, $N$ = number of sample points in
$S$ = $9!$. Let $n_a$ denote the number of ways that $A$ can occur. We find $n_a$, again, by using the multiplication rule:

- Stage 1: Pick adjacent slots for $W_1, W_2, W_3, W_4$: $n_1 = 6$.
- Stage 2: With the four chosen slots, permute $W_1, W_2, W_3, W_4$: $n_2 = 4!$.
- Stage 3: With the remaining 5 slots, permute $B_1, B_2, Y_1, Y_2, Y_3$: $n_3 = 5!$.

Thus, $n_a = 6 \times 4! \times 5! = 17{,}280$. Finally, since we have equally likely outcomes, $P(A) = n_a/N = 17{,}280/9! \approx 0.0476$. □

#### 1.6.3 Combinations

COMBINATIONS: Given $n$ distinct objects, the number of ways to choose $r$ of them ($r \le n$), without regard to order, is given by
$$C_{n,r} \equiv \binom{n}{r} = \frac{n!}{r!(n-r)!}.$$
The symbol $C_{n,r}$ is read "the combination of $n$ things taken $r$ at a time." By convention, $0! = 1$.
*Proof.* Choosing $r$ objects is equivalent to breaking the $n$ objects into two distinguishable groups:
Group 1: the $r$ chosen; Group 2: the $n - r$ not chosen.
There are $n!/[r!(n-r)!] = C_{n,r}$ ways to do this. □

REMARK: We will adopt the notation $\binom{n}{r}$, read "$n$ choose $r$," as the symbol for $C_{n,r}$. The terms $\binom{n}{r}$ are often called **binomial coefficients**, since they arise in the algebraic expansion of a binomial; viz.,
$$(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r}.$$

**Example 1.17.** Return to Example 1.12. Now, suppose that we only want to choose 3 committee members from 5, without designations for president, vice president, and secretary. Then, there are
$$\binom{5}{3} = \frac{5!}{3!(5-3)!} = \frac{5 \times 4 \times 3!}{3! \times 2} = 10$$
different committees. □

NOTE: From Examples 1.12 and 1.17, one should note that $P_{n,r} = r! \times C_{n,r}$. Recall that combinations do not regard order as important. Thus, once we have chosen our $r$ objects (there are $C_{n,r}$ ways to do this), there are then $r!$ ways to permute those $r$ chosen objects. Thus, we can think of a permutation as simply a combination times the number of ways to permute the $r$ chosen objects.
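The identity $P_{n,r} = r! \times C_{n,r}$ is easy to spot-check numerically. A brief illustrative sketch using the standard library (again our own aside, not part of the notes):

```python
from math import comb, perm, factorial

# "n choose r": unordered selections
print(comb(5, 3))  # 10 committees of 3 from 5, with no titles

# Relationship between permutations and combinations: P(n, r) = r! * C(n, r)
n, r = 10, 4
print(perm(n, r) == factorial(r) * comb(n, r))  # True
```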
**Example 1.18.** A company receives 20 hard drives. Five of the drives will be randomly selected and tested. If all five are satisfactory, the entire lot will be accepted; otherwise, the entire lot is rejected. If there are really 3 defectives in the lot, what is the probability of accepting the lot?
SOLUTION: First, the number of sample points in $S$ is given by
$$N = \binom{20}{5} = \frac{20!}{5!(20-5)!} = 15{,}504.$$
Let $A$ denote the event that the lot is accepted. How many ways can $A$ occur? Use the multiplication rule:

- Stage 1: Choose 5 good drives from 17: $\binom{17}{5} = 6188$.
- Stage 2: Choose 0 bad drives from 3: $\binom{3}{0} = 1$.

By the multiplication rule, there are $n_a = 6188 \times 1 = 6188$ different ways $A$ can occur. Assuming an equiprobability model (i.e., each outcome is equally likely),
$$P(A) = n_a/N = 6188/15{,}504 \approx 0.399. \ \square$$

### 1.7 Conditional probability

MOTIVATION: In some problems, we may be fortunate enough to have prior knowledge about the likelihood of events related to the event of interest. It may be of interest to incorporate this information into a probability calculation.

TERMINOLOGY: Let $A$ and $B$ be events in a nonempty sample space $S$. The **conditional probability** of $A$, given that $B$ has occurred, is given by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
provided that $P(B) > 0$.

**Example 1.19.** A couple has two children.
(a) What is the probability that both are girls?
(b) What is the probability that both are girls, if the eldest is a girl?
SOLUTION (a): The sample space is given by $S = \{(M,M), (M,F), (F,M), (F,F)\}$ and $N = 4$, the number of sample points in $S$. Define
$$A_1 = \{\text{1st-born child is a girl}\}, \quad A_2 = \{\text{2nd-born child is a girl}\}.$$
Clearly, $A_1 \cap A_2 = \{(F,F)\}$, and $P(A_1 \cap A_2) = 1/4$, assuming that the four outcomes in $S$ are equally likely. □
SOLUTION (b): Now, we want $P(A_2 \mid A_1)$. Applying the definition of conditional probability, we get
$$P(A_2 \mid A_1) = \frac{P(A_1 \cap A_2)}{P(A_1)} = \frac{1/4}{1/2} = \frac{1}{2}. \ \square$$

REMARK: In a profound sense, the "new information" in Example 1.19 (i.e., that the eldest is a girl) induces a new, or restricted, sample space given by
$$S^* = \{(F,M), (F,F)\}.$$
On this space, note that $P(A_2) = 1/2$, computed with respect to $S^*$. Also note that whether you compute $P(A_2 \mid A_1)$ with the original sample space $S$ or compute $P(A_2)$ with the restricted space $S^*$, you will get the same answer.
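For equally likely sample points, conditional probability is just a ratio of counts, so Example 1.19 can be checked by brute-force enumeration. An illustrative sketch (our own; the first coordinate is taken to be the eldest child):

```python
from itertools import product
from fractions import Fraction

# Equally likely sample space for two children; index 0 = eldest
S = list(product("MF", repeat=2))

A1 = [w for w in S if w[0] == "F"]        # eldest is a girl
both = [w for w in S if w == ("F", "F")]  # both are girls

# P(A2 | A1) = P(A1 and A2) / P(A1), as a ratio of equally likely counts
p_cond = Fraction(len(both), len(S)) / Fraction(len(A1), len(S))
print(p_cond)  # 1/2
```

Counting over the restricted space $\{(F,M), (F,F)\}$ gives the same 1/2, matching the remark above.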
**Example 1.20.** In a certain community, 36 percent of the families own a dog, 22 percent of the families that own a dog also own a cat, and 30 percent of the families own a cat. A family is selected at random.
(a) Compute the probability that the family owns both a cat and a dog.
(b) Compute the probability that the family owns a dog, given that it owns a cat.
SOLUTION: Let $C = \{\text{family owns a cat}\}$ and $D = \{\text{family owns a dog}\}$. In (a), we want $P(C \cap D)$. But
$$P(C \mid D) = \frac{P(C \cap D)}{P(D)} = 0.22, \quad P(D) = 0.36.$$
Thus, $P(C \cap D) = 0.36 \times 0.22 = 0.0792$. For (b), simply use the definition of conditional probability:
$$P(D \mid C) = \frac{P(C \cap D)}{P(C)} = \frac{0.0792}{0.30} = 0.264. \ \square$$

PROBABILITY AXIOMS: It is interesting to note that conditional probability satisfies the axioms for a probability set function, when $P(B) > 0$. In particular,
1. $P(A \mid B) \ge 0$;
2. $P(B \mid B) = 1$;
3. if $A_1, A_2, \ldots$ is a countable sequence of pairwise mutually exclusive events (i.e., $A_i \cap A_j = \emptyset$ for $i \ne j$) in $S$, then
$$P\left(\bigcup_{i=1}^{\infty} A_i \,\Big|\, B\right) = \sum_{i=1}^{\infty} P(A_i \mid B).$$

MULTIPLICATION LAW OF PROBABILITY: Suppose $A$ and $B$ are events in a nonempty sample space $S$. Then
$$P(A \cap B) = P(B \mid A) P(A) = P(A \mid B) P(B).$$
*Proof.* As long as $P(A)$ and $P(B)$ are strictly positive, this follows directly from the definition of conditional probability. □

EXTENSION: The multiplication law of probability can be extended to more than 2 events. For example,
$$P(A_1 \cap A_2 \cap A_3) = P(A_3 \mid A_1 \cap A_2) \times P(A_1 \cap A_2) = P(A_3 \mid A_1 \cap A_2) \times P(A_2 \mid A_1) \times P(A_1).$$

NOTE: This suggests that we can compute probabilities like $P(A_1 \cap A_2 \cap A_3)$ "sequentially," by first computing $P(A_1)$, then $P(A_2 \mid A_1)$, then $P(A_3 \mid A_1 \cap A_2)$. The probability of a $k$-fold intersection can be computed similarly; i.e.,
$$P\left(\bigcap_{i=1}^{k} A_i\right) = P(A_1) \times P(A_2 \mid A_1) \times P(A_3 \mid A_1 \cap A_2) \times \cdots \times P(A_k \mid A_1 \cap \cdots \cap A_{k-1}).$$

**Example 1.21.** I am dealt a hand of 5 cards. What is the probability that they are all spades?
SOLUTION: Define $A_i$ to be the event that card $i$ is a spade, $i = 1, 2, 3, 4, 5$. Then
$$P(A_1) = \frac{13}{52}, \quad P(A_2 \mid A_1) = \frac{12}{51}, \quad P(A_3 \mid A_1 \cap A_2) = \frac{11}{50},$$
$$P(A_4 \mid A_1 \cap A_2 \cap A_3) = \frac{10}{49}, \quad P(A_5 \mid A_1 \cap A_2 \cap A_3 \cap A_4) = \frac{9}{48},$$
so that
$$P\left(\bigcap_{i=1}^{5} A_i\right) = \frac{13}{52} \times \frac{12}{51} \times \frac{11}{50} \times \frac{10}{49} \times \frac{9}{48} \approx 0.0005. \ \square$$

### 1.8 Independence

TERMINOLOGY: When the occurrence or non-occurrence of $A$ has no effect on whether or not $B$ occurs, and vice versa, we say that the events $A$ and $B$ are **independent**. Mathematically, we define $A$ and $B$ to be independent iff
$$P(A \cap B) = P(A) P(B).$$
Otherwise, $A$ and $B$ are called **dependent** events. Note
that if $A$ and $B$ are independent,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A)$$
and
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A)P(B)}{P(A)} = P(B).$$

**Example 1.22.** A red die and a white die are rolled. Let $A = \{4 \text{ on red die}\}$ and $B = \{\text{sum is odd}\}$. Of the 36 outcomes in $S$, 6 are favorable to $A$, 18 are favorable to $B$, and 3 are favorable to $A \cap B$. Thus, since outcomes are assumed to be equally likely,
$$P(A \cap B) = \frac{3}{36} = \frac{6}{36} \times \frac{18}{36} = P(A)P(B),$$
and the events $A$ and $B$ are independent. □

**Example 1.23.** In an engineering system, two components are placed in a series; that is, the system is functional as long as both components are. Let $A_i$, $i = 1, 2$, denote the event that component $i$ is functional. Assuming independence, the probability the system is functional is then $P(A_1 \cap A_2) = P(A_1)P(A_2)$. If $P(A_i) = 0.95$, for example, then $P(A_1 \cap A_2) = 0.95^2 = 0.9025$. □

INDEPENDENCE OF COMPLEMENTS: If $A$ and $B$ are independent events, so are
(a) $\bar{A}$ and $B$; (b) $A$ and $\bar{B}$; (c) $\bar{A}$ and $\bar{B}$.
*Proof.* We will only prove (a); the other parts follow similarly.
$$P(\bar{A} \cap B) = P(\bar{A} \mid B)P(B) = [1 - P(A \mid B)]P(B) = [1 - P(A)]P(B) = P(\bar{A})P(B). \ \square$$

EXTENSION: The concept of independence (and independence of complements) can be extended to any finite number of events in $S$.

TERMINOLOGY: Let $A_1, A_2, \ldots, A_n$ denote a collection of $n \ge 2$ events in a nonempty sample space $S$. The events $A_1, A_2, \ldots, A_n$ are said to be **mutually independent** if, for any subcollection of events, say $A_{i_1}, A_{i_2}, \ldots, A_{i_k}$, $2 \le k \le n$, we have
$$P\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} P(A_{i_j}).$$

CHALLENGE: Come up with three events which are pairwise independent, but not mutually independent.

COMMON SETTING: Many experiments consist of a sequence of $n$ trials that are independent (e.g., flipping a coin 10 times). If $A_i$ denotes the event associated with the $i$th trial, and the trials are independent, then
$$P\left(\bigcap_{i=1}^{n} A_i\right) = \prod_{i=1}^{n} P(A_i).$$

**Example 1.24.** An unbiased die is rolled six times. Let $A_i = \{i \text{ appears on roll } i\}$, for $i = 1, 2, \ldots, 6$. Then $P(A_i) = 1/6$ and, assuming independence,
$$P(A_1 \cap A_2 \cap A_3 \cap A_4 \cap A_5 \cap A_6) = \prod_{i=1}^{6} P(A_i) = \left(\frac{1}{6}\right)^6.$$
Suppose that if $A_i$ occurs, we will call it "a match." What is the probability of at least one match in the six rolls?
SOLUTION: Let $B$ denote the event that there is at least one match. Then $\bar{B}$ denotes the event that there are no matches. Now,
$$P(\bar{B}) = P\left(\bigcap_{i=1}^{6} \bar{A}_i\right) = \prod_{i=1}^{6} P(\bar{A}_i) = \left(\frac{5}{6}\right)^6 \approx 0.335.$$
Thus, $P(B) = 1 - P(\bar{B}) = 1 - 0.335 = 0.665$, by the complement rule.
EXERCISE: Generalize this result to an $n$-sided die. What does this probability converge to as $n \to \infty$? □

### 1.9 Law of Total Probability and Bayes' Rule

SETTING: Suppose $A$ and $B$ are events in a nonempty sample space $S$. We can easily express the event $A$ as follows:
$$A = (A \cap B) \cup (A \cap \bar{B}) \quad \text{(union of disjoint events)}.$$
Thus, by Axiom 3,
$$P(A) = P(A \cap B) + P(A \cap \bar{B}) = P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B}),$$
where the last step follows from the multiplication law of probability. This is called the **Law of Total Probability (LOTP)**. The LOTP can be very helpful. Sometimes $P(A \mid B)$, $P(A \mid \bar{B})$, and $P(B)$ may be easily computed with available information, whereas computing $P(A)$ directly may be difficult.

NOTE: The LOTP follows from the fact that $B$ and $\bar{B}$ **partition** $S$; that is,
(a) $B$ and $\bar{B}$ are disjoint, and
(b) $B \cup \bar{B} = S$.

**Example 1.25.** An insurance company classifies people as "accident-prone" and "non-accident-prone." For a fixed year, the probability that an accident-prone person has an accident is 0.4, and the probability that a non-accident-prone person has an accident is 0.2. The population is estimated to be 30 percent accident-prone.
(a) What is the probability that a new policyholder will have an accident?
SOLUTION: Define $A = \{\text{policyholder has an accident}\}$ and $B = \{\text{policyholder is accident-prone}\}$. Then, $P(B) = 0.3$, $P(A \mid B) = 0.4$, $P(\bar{B}) = 0.7$, and $P(A \mid \bar{B}) = 0.2$. By the LOTP,
$$P(A) = P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B}) = 0.4(0.3) + 0.2(0.7) = 0.26. \ \square$$
(b) Now suppose that the policyholder does have an accident. What is the probability that he was accident-prone?
SOLUTION: We want $P(B \mid A)$. Note that
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A)} = \frac{0.4(0.3)}{0.26} \approx 0.46. \ \square$$

NOTE: From this last part, we see that, in general,
$$P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B})}.$$
This is a form of **Bayes' Rule**.
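The LOTP and Bayes' Rule calculations in Example 1.25 are just a few multiplications and one division, so they translate directly into code. A short illustrative sketch (variable names are ours):

```python
# Known quantities from the insurance example
p_B = 0.3             # P(accident-prone)
p_A_given_B = 0.4     # P(accident | accident-prone)
p_A_given_notB = 0.2  # P(accident | not accident-prone)

# Law of Total Probability: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)
print(round(p_A, 2))          # 0.26

# Bayes' Rule: posterior probability of being accident-prone given an accident
p_B_given_A = p_A_given_B * p_B / p_A
print(round(p_B_given_A, 2))  # 0.46
```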
**Example 1.26.** A lab test is 95 percent effective in detecting a certain disease when it is present (sensitivity). However, there is a one percent false positive rate; that is, the test says that one percent of healthy persons have the disease (specificity). If 0.5 percent of the population truly has the disease, what is the probability that a person has the disease given that (a) his test is positive? (b) his test is negative?
SOLUTION: Let $D = \{\text{disease is present}\}$ and $+ = \{\text{test is positive}\}$. We are given that $P(D) = 0.005$, $P(+ \mid D) = 0.95$ (sensitivity), and $P(+ \mid \bar{D}) = 0.01$ (specificity). For (a), we want to compute $P(D \mid +)$. By Bayes' Rule,
$$P(D \mid +) = \frac{P(+ \mid D)P(D)}{P(+ \mid D)P(D) + P(+ \mid \bar{D})P(\bar{D})} = \frac{0.95(0.005)}{0.95(0.005) + 0.01(0.995)} \approx 0.323.$$
The reason this is so low is that $P(\bar{D})$ is high relative to $P(D)$. In (b), we want $P(D \mid -)$. By Bayes' Rule,
$$P(D \mid -) = \frac{P(- \mid D)P(D)}{P(- \mid D)P(D) + P(- \mid \bar{D})P(\bar{D})} = \frac{0.05(0.005)}{0.05(0.005) + 0.99(0.995)} \approx 0.00025. \ \square$$

Table 1.1: The general Bayesian scheme.

| Measure before test | Result | Updated measure |
|---|---|---|
| $P(D) = 0.005$ | $+$ | $P(D \mid +) \approx 0.323$ |
| $P(D) = 0.005$ | $-$ | $P(D \mid -) \approx 0.00025$ |

NOTE: We have discussed the LOTP and Bayes' Rule in the case of the partition $\{B, \bar{B}\}$. However, these rules hold for any partition.

TERMINOLOGY: A sequence of sets $B_1, B_2, \ldots, B_k$ is said to form a **partition** of the sample space $S$ if
(a) $B_1 \cup B_2 \cup \cdots \cup B_k = S$ (exhaustive condition), and
(b) $B_i \cap B_j = \emptyset$ for all $i \ne j$ (disjoint condition).

LAW OF TOTAL PROBABILITY (restated): Suppose that $B_1, B_2, \ldots, B_k$ forms a partition of $S$, and suppose $P(B_i) > 0$ for all $i = 1, 2, \ldots, k$. Then
$$P(A) = \sum_{i=1}^{k} P(A \mid B_i) P(B_i).$$
*Proof.* Write
$$A = A \cap S = A \cap (B_1 \cup B_2 \cup \cdots \cup B_k) = \bigcup_{i=1}^{k} (A \cap B_i).$$
Thus,
$$P(A) = P\left(\bigcup_{i=1}^{k} (A \cap B_i)\right) = \sum_{i=1}^{k} P(A \cap B_i) = \sum_{i=1}^{k} P(A \mid B_i)P(B_i). \ \square$$

BAYES' RULE (restated): Suppose that $B_1, B_2, \ldots, B_k$ forms a partition of $S$, and suppose that $P(A) > 0$ and $P(B_i) > 0$ for all $i = 1, 2, \ldots, k$. Then
$$P(B_j \mid A) = \frac{P(A \mid B_j)P(B_j)}{\sum_{i=1}^{k} P(A \mid B_i)P(B_i)}.$$
*Proof.* Simply apply the definition of conditional probability and the multiplication law of probability to get
$$P(B_j \mid A) = \frac{P(A \mid B_j)P(B_j)}{P(A)}.$$
Then, just apply the LOTP to $P(A)$ in the denominator to get the result. □

REMARK: Bayesians will call $P(B_j)$ the **prior** probability for the event $B_j$; they call $P(B_j \mid A)$ the **posterior** probability of $B_j$.

**Example 1.27.** Suppose that a manufacturer buys approximately 60 percent of a raw material (in boxes) from Supplier 1, 30 percent from Supplier 2, and 10 percent from Supplier 3; these are the prior probabilities. For each
supplier, defective rates are as follows: Supplier 1: 0.01; Supplier 2: 0.02; and Supplier 3: 0.03. Suppose that the manufacturer observes a defective box of raw material.
(a) What is the probability that it came from Supplier 2?
(b) What is the probability that the defective box did not come from Supplier 3?
SOLUTION (a): Let $A = \{\text{observe defective}\}$, and let $B_1$, $B_2$, and $B_3$, respectively, denote the events that the box comes from Supplier 1, 2, and 3. Note that $\{B_1, B_2, B_3\}$ partitions the space of possible suppliers. Thus, by Bayes' Rule, we have
$$P(B_2 \mid A) = \frac{P(A \mid B_2)P(B_2)}{P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + P(A \mid B_3)P(B_3)} = \frac{0.02(0.3)}{0.01(0.6) + 0.02(0.3) + 0.03(0.1)} = 0.40.$$
SOLUTION (b): First, compute the posterior probability $P(B_3 \mid A)$. By Bayes' Rule,
$$P(B_3 \mid A) = \frac{P(A \mid B_3)P(B_3)}{P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + P(A \mid B_3)P(B_3)} = \frac{0.03(0.1)}{0.01(0.6) + 0.02(0.3) + 0.03(0.1)} = 0.20.$$
Thus, $P(\bar{B}_3 \mid A) = 1 - P(B_3 \mid A) = 1 - 0.20 = 0.80$, by the complement rule. □

## 2 Discrete Distributions

Complementary reading: Chapter 3 (WMS), except 3.10 and 3.11.

### 2.1 Random variables

MATHEMATICAL DEFINITION: A **random variable** $Y$ is a function whose domain is the sample space $S$ and whose range is the set of real numbers $R = \{y : -\infty < y < \infty\}$.

WORKING DEFINITION: A random variable is a variable whose observed value is determined by chance.

**Example 2.1.** Suppose that our experiment consists of flipping two fair coins. The sample space consists of four sample points:
$$S = \{(H,H), (H,T), (T,H), (T,T)\}.$$
Now, let $Y$ denote the number of heads observed. Before we perform the experiment, we do not know, with certainty, the value of $Y$. What are the possible values of $Y$?

| Sample point, $E_i$ | $y$ |
|---|---|
| $(H,H)$ | 2 |
| $(H,T)$ | 1 |
| $(T,H)$ | 1 |
| $(T,T)$ | 0 |

In a profound sense, a random variable $Y$ takes sample points $E_i \in S$ and assigns them a real number. This is precisely why we can think of $Y$ as a function; i.e.,
$$Y[(H,H)] = 2, \quad Y[(H,T)] = 1, \quad Y[(T,H)] = 1, \quad Y[(T,T)] = 0,$$
so that
$$P(Y = 2) = P[(H,H)] = 1/4$$
$$P(Y = 1) = P[(H,T)] + P[(T,H)] = 1/4 + 1/4 = 1/2$$
$$P(Y = 0) = P[(T,T)] = 1/4.$$
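The "random variable as a function on $S$" viewpoint in Example 2.1 can be made literal in code: build the sample space, map each sample point to a number, and sum the point probabilities by value. An illustrative sketch (our own construction):

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Two fair coin flips; Y maps each sample point to its number of heads
S = list(product("HT", repeat=2))
Y = {outcome: outcome.count("H") for outcome in S}

# Each sample point has probability 1/len(S); accumulate by value of Y
pmf = Counter()
for outcome, y in Y.items():
    pmf[y] += Fraction(1, len(S))

print(pmf[2], pmf[1], pmf[0])  # 1/4 1/2 1/4
```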
B ⊂ R.

NOTATION: We denote a random variable Y with a capital letter; we denote an observed value of Y as y, a lowercase letter. This is standard notation.

Example 2.2. Let Y denote the weight, in ounces, of the next newborn boy in Columbia, SC. Here, Y is a random variable. After the baby is born, we observe y = 128. □

2.2 Probability distributions for discrete random variables

TERMINOLOGY: The support of a random variable Y is the set of all possible values that Y can assume. We will often denote the support set as R. If the random variable Y has a support set R that is either finite or countable, we call Y a discrete random variable.

Example 2.3. Suppose that, in rolling an unbiased die, we record two random variables:

X = face value on the first roll
Y = number of rolls needed to observe a six.

The support of X is R_X = {1, 2, 3, 4, 5, 6}. The support of Y is R_Y = {1, 2, 3, ...}. R_X is finite and R_Y is countable; thus, both random variables X and Y are discrete. □

GOAL: With discrete random variables, we would like to assign probabilities to events of the form {Y = y}. That is, we would like to compute P(Y = y) for any y ∈ R. To do this, one approach is to determine all sample points E ∈ S such that Y(E) = y and then compute

p_Y(y) ≡ P(Y = y) = Σ P(E_i), the sum taken over all E_i ∈ S such that Y(E_i) = y,

for all y ∈ R. However, as we will see, this approach is often unnecessary.

TERMINOLOGY: The function p_Y(y) ≡ P(Y = y) is called the probability mass function (pmf) for the discrete random variable Y.

FACTS: The pmf p_Y(y) for a discrete random variable Y consists of two parts:
(a) R, the support set of Y
(b) a probability assignment P(Y = y), for all y ∈ R.

PROPERTIES: The pmf p_Y(y) for a discrete random variable Y satisfies the following:
1. p_Y(y) > 0, for all y ∈ R.
2. The sum of the probabilities, taken over all support points, must equal one; i.e.,

Σ_{y ∈ R} p_Y(y) = 1.

3. The probability of an event B is computed by adding the probabilities p_Y(y) for all y ∈ B; i.e.,

P(Y ∈ B) = Σ_{y ∈ B} p_Y(y).

Example 2.4. Suppose that we roll an unbiased die twice and observe the face on each roll. Here, the sample space S consists of the 36 ordered pairs (1,1), (1,2), ..., (6,6). Let the random
variable Y record the sum of the two faces. Here, R = {2, 3, ..., 12}, and

P(Y = 2) = P(all E_i ∈ S where Y(E_i) = 2) = P[(1,1)] = 1/36
P(Y = 3) = P(all E_i ∈ S where Y(E_i) = 3) = P[(1,2)] + P[(2,1)] = 2/36.

The calculation of P(Y = y) is performed similarly for y = 4, 5, ..., 12. The pmf for Y can be given as a formula, table, or graph. In tabular form, the pmf of Y is given by

y      | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
p_Y(y) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36

A probability histogram is a display which depicts a pmf in graphical form. The probability histogram for the pmf in Example 2.4 is given in Figure 2.2.

Figure 2.2: Probability histogram for the pmf in Example 2.4.

The astute reader will note that a closed-form formula for the pmf exists; i.e.,

p_Y(y) = (6 − |7 − y|)/36, for y = 2, 3, ..., 12;
p_Y(y) = 0, otherwise.

Is p_Y(y) valid? Yes, since p_Y(y) > 0 for all support points y = 2, 3, ..., 12, and

Σ_{y ∈ R} p_Y(y) = Σ_{y=2}^{12} (6 − |7 − y|)/36 = 1.

QUESTION: Define the events B1 = {the sum is 3} and B2 = {the sum is odd}. In Example 2.4,

P(B1) = p_Y(3) = 2/36

and

P(B2) = Σ_{y ∈ B2} p_Y(y) = P(Y = 3) + P(Y = 5) + P(Y = 7) + P(Y = 9) + P(Y = 11)
      = 2/36 + 4/36 + 6/36 + 4/36 + 2/36 = 1/2.

Example 2.5. An experiment consists of rolling an unbiased die until the first "6" is observed. Let Y denote the number of rolls needed. Here, the support set is R = {1, 2, ...}. Assuming independent trials, we have

P(Y = 1) = 1/6
P(Y = 2) = (5/6) × (1/6)
P(Y = 3) = (5/6) × (5/6) × (1/6);

in general, the probability that y rolls are needed to observe the first "6" is given by

p_Y(y) = (5/6)^(y−1) (1/6), for all y = 1, 2, ....

Thus, the pmf for Y is given by

p_Y(y) = (5/6)^(y−1) (1/6), for y = 1, 2, ...;
p_Y(y) = 0, otherwise.

Is this a valid pmf? Clearly, p_Y(y) > 0 for all y ∈ R, and

Σ_{y ∈ R} p_Y(y) = Σ_{y=1}^{∞} (5/6)^(y−1) (1/6) = (1/6) × 1/(1 − 5/6) = 1.

Figure 2.3: Probability histogram for the pmf in Example 2.5.

IMPORTANT: In the last calculation, we have used an important fact concerning infinite geometric series; namely, if a is any real number and |r| < 1, then

Σ_{m=0}^{∞} a r^m = a/(1 − r).

The proof of this fact can be found in any standard calculus text. We will use this fact many times in this course.

EXERCISE: In
Example 2.5, find P(B), where B = {the first "6" is observed on an odd-numbered roll}.

2.3 Mathematical expectation

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y) and support R. The expected value of Y is given by

E(Y) = Σ_{y ∈ R} y p_Y(y).

DESCRIPTION: In words, the expected value for a discrete random variable is a weighted average of the possible values the variable can assume, each value y being weighted with the probability, p_Y(y), that the random variable assumes the corresponding value.

MATHEMATICAL ASIDE: For the expected value E(Y) to exist, the sum above must be absolutely convergent; i.e., we need

Σ_{y ∈ R} |y| p_Y(y) < ∞.

If E(Y) is not finite (i.e., if E(Y) = ∞), we say that E(Y) does not exist.

Example 2.6. Let the random variable Y have pmf

p_Y(y) = (5 − y)/10, for y = 1, 2, 3, 4;
p_Y(y) = 0, otherwise.

Figure 2.4: Probability histogram for the pmf in Example 2.6.

The pmf for Y is depicted in Figure 2.4. The expected value of Y is given by

E(Y) = Σ_{y ∈ R} y p_Y(y) = 1(4/10) + 2(3/10) + 3(2/10) + 4(1/10) = 2. □

Example 2.7. A random variable whose expected value does not exist. Suppose that the random variable Y has pmf

p_Y(y) = 1/y, for y ∈ R;
p_Y(y) = 0, otherwise,

where the support set R = {2, 2^2, 2^3, ...}. It is easy to see that p_Y(y) is a valid pmf, since

Σ_{y ∈ R} p_Y(y) = Σ_{k=1}^{∞} (1/2)^k = 1.

However,

E(Y) = Σ_{y ∈ R} y p_Y(y) = Σ_{y ∈ R} y(1/y) = Σ_{y ∈ R} 1 = ∞,

since R, the support set, is countably infinite. □

INTERPRETATION: How is E(Y) interpreted?
(a) the "center of gravity" of a probability distribution
(b) a long-run average
(c) the first moment of the random variable.

STATISTICAL CONNECTION: When used in a statistical context, the expected value E(Y) is sometimes called the mean of Y, and we might use the symbol μ or μ_Y when discussing it; that is, μ = μ_Y = E(Y). In statistical settings, μ denotes a population parameter.

EXPECTATIONS OF FUNCTIONS OF Y: Let Y be a discrete random variable with pmf p_Y(y) and support R, and suppose that g is a real-valued function. Then, g(Y) is a random variable and

E[g(Y)] = Σ_{y ∈ R} g(y) p_Y(y).

The proof of
this result is given on pp. 90 (WMS). □

MATHEMATICAL ASIDE: For the expected value E[g(Y)] to exist, the sum above must be absolutely convergent; i.e.,

Σ_{y ∈ R} |g(y)| p_Y(y) < ∞.

If E[g(Y)] is not finite (i.e., if E[g(Y)] = ∞), we say that E[g(Y)] does not exist.

Example 2.8. In Example 2.6, find E(Y^2) and E(e^Y).
SOLUTION: The functions g_1(Y) = Y^2 and g_2(Y) = e^Y are real functions of Y. From the definition,

E(Y^2) = Σ_{y ∈ R} y^2 p_Y(y) = 1^2(4/10) + 2^2(3/10) + 3^2(2/10) + 4^2(1/10) = 5

and

E(e^Y) = Σ_{y ∈ R} e^y p_Y(y) = e^1(4/10) + e^2(3/10) + e^3(2/10) + e^4(1/10) ≈ 12.78. □

Example 2.9. The discrete uniform distribution. Suppose that the random variable X has pmf

p_X(x) = 1/m, for x = 1, 2, ..., m;
p_X(x) = 0, otherwise,

where m is a fixed positive integer larger than 1. Find the expected value of X.
SOLUTION: The expected value of X is given by

E(X) = Σ_{x ∈ R} x p_X(x) = Σ_{x=1}^{m} x(1/m) = (1/m) Σ_{x=1}^{m} x = (1/m) × m(m + 1)/2 = (m + 1)/2.

In this calculation, we have used the fact that Σ_{x=1}^{m} x, the sum of the first m integers, equals m(m + 1)/2; this fact can be proven by mathematical induction.

REMARK: If m = 6, then the discrete uniform distribution serves as a probability model for the outcome of an unbiased die:

x      | 1   | 2   | 3   | 4   | 5   | 6
p_X(x) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6

The expected outcome is E(X) = (6 + 1)/2 = 3.5. □

PROPERTIES OF EXPECTATIONS: Let Y be a discrete random variable with pmf p_Y(y) and support R, suppose that g, g_1, g_2, ..., g_k are real-valued functions, and let c be any real constant. Then,
(a) E(c) = c
(b) E[c g(Y)] = c E[g(Y)]
(c) E[Σ_{j=1}^{k} g_j(Y)] = Σ_{j=1}^{k} E[g_j(Y)].

Since E(·) enjoys these above-mentioned properties, we sometimes call E(·) a linear operator. Proofs of these facts are easy and are left as exercises.

Example 2.10. In a one-hour period, the number of gallons of a certain toxic chemical that is produced at a local plant, say Y, has the pmf

y      | 0   | 1   | 2   | 3
p_Y(y) | 0.2 | 0.3 | 0.3 | 0.2

(a) Compute the expected number of gallons produced during a one-hour period.
(b) The cost (in tens of dollars) to produce Y gallons is given by the cost function C(Y) = 3 + 12Y + 2Y^2. What is the expected cost in a one-hour period?
SOLUTION (a): We have that

E(Y) = Σ_{y ∈ R} y p_Y(y) = 0(0.2) + 1(0.3) + 2(0.3) + 3(0.2) = 1.5.
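The weighted-average computations above (Examples 2.9 and 2.10a) are easy to check numerically. The following is a quick sketch in plain Python; the function and variable names are my own, not part of the original notes:

```python
# Check E(X) = (m + 1)/2 for the discrete uniform pmf p_X(x) = 1/m (Example 2.9).
def discrete_uniform_mean(m):
    # E(X) as a direct weighted average: sum of x * (1/m) over x = 1, ..., m
    return sum(x / m for x in range(1, m + 1))

for m in (2, 6, 10):
    assert abs(discrete_uniform_mean(m) - (m + 1) / 2) < 1e-12

# Check Solution (a) of Example 2.10: the weighted average of the chemical pmf.
pmf = {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}
mean = sum(y * p for y, p in pmf.items())
print(round(mean, 10))  # 1.5
```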
Thus, we would expect 1.5 gallons of the toxic chemical to be produced per hour.
SOLUTION (b): First, compute E(Y^2):

E(Y^2) = Σ_{y ∈ R} y^2 p_Y(y) = 0^2(0.2) + 1^2(0.3) + 2^2(0.3) + 3^2(0.2) = 3.3.

Now, we use the aforementioned linearity properties to compute

E[C(Y)] = E(3 + 12Y + 2Y^2) = 3 + 12E(Y) + 2E(Y^2) = 3 + 12(1.5) + 2(3.3) = 27.6.

Thus, the expected hourly cost is $276.00. □

2.4 Variance

REMARK: We have learned that E(Y) is a measure of the center of a probability distribution. Now, we turn our attention to quantifying the variability in the distribution.

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y), support R, and mean μ. The variance of Y is given by

σ^2 ≡ V(Y) ≡ E[(Y − μ)^2] = Σ_{y ∈ R} (y − μ)^2 p_Y(y).

The standard deviation of Y is given by the positive square root of the variance; i.e., σ = √V(Y).

FACTS ABOUT THE VARIANCE:
(a) σ^2 ≥ 0.
(b) σ^2 = 0 if and only if the random variable Y has a degenerate distribution; i.e., all the probability mass is at one point.
(c) The larger (smaller) σ^2 is, the more (less) spread in the possible values of Y about the mean μ.
(d) σ^2 is measured in (units)^2, and σ is measured in the original units.

NOTE: Facts (a), (b), and (c) above are true if we replace σ^2 with σ.

THE VARIANCE COMPUTING FORMULA: Let Y be a random variable (not necessarily a discrete random variable) with pmf p_Y(y) and mean E(Y) = μ. Then

V(Y) = E[(Y − μ)^2] = E(Y^2) − μ^2.

The formula V(Y) = E(Y^2) − μ^2 is called the variance computing formula.
Proof. Expand the (Y − μ)^2 term and distribute the expectation operator as follows:

E[(Y − μ)^2] = E(Y^2 − 2μY + μ^2) = E(Y^2) − 2μE(Y) + μ^2 = E(Y^2) − 2μ^2 + μ^2 = E(Y^2) − μ^2. □

Example 2.11. The discrete uniform distribution. Suppose that the random variable X has pmf

p_X(x) = 1/m, for x = 1, 2, ..., m;
p_X(x) = 0, otherwise,

where m is a fixed positive integer larger than 1. Find the variance of X.
SOLUTION: We will find σ^2 = V(X) by using the variance computing formula. In Example 2.9, we computed E(X) = (m + 1)/2. We first find E(X^2); note that

E(X^2) = Σ_{x ∈ R} x^2 p_X(x) = Σ_{x=1}^{m} x^2 (1/m) = (1/m) × m(m + 1)(2m + 1)/6 = (m + 1)(2m + 1)/6.

Above, we have used the fact that Σ_{x=1}^{m} x^2, the sum of the first m squared integers, equals m(m + 1)(2m + 1)/6; this fact can be proven by
mathematical induction. The variance of X is equal to

σ^2 = E(X^2) − μ^2 = (m + 1)(2m + 1)/6 − [(m + 1)/2]^2 = (m^2 − 1)/12.

Note that, if m = 6, as for our unbiased die example, σ^2 = 35/12. □

EXERCISE: Find σ^2 for the pmf in Example 2.6 (notes).

IMPORTANT RESULT: Let Y be a random variable (not necessarily a discrete random variable), and suppose that a and b are real constants. Then

V(a + bY) = b^2 V(Y).

Proof. Exercise. □

REMARK: Taking b = 0 above, we see that V(a) = 0, for any constant a. This makes sense intuitively; the variance is a measure of variability for a random variable, and a constant (such as a) does not vary. Also, by taking a = 0, we see that V(bY) = b^2 V(Y). Both of these facts are important, and we will use them repeatedly.

2.5 Moment generating functions

TERMINOLOGY: Let Y be a discrete random variable with pmf p_Y(y) and support R. The moment generating function (mgf) for Y, denoted by m_Y(t), is given by

m_Y(t) = E(e^{tY}) = Σ_{y ∈ R} e^{ty} p_Y(y),

provided E(e^{tY}) < ∞ for t in an open neighborhood about 0; i.e., there exists some h > 0 such that E(e^{tY}) < ∞ for all t ∈ (−h, h). If E(e^{tY}) does not exist in an open neighborhood of 0, we say that the moment generating function does not exist.

TERMINOLOGY: We call E(Y^k) the kth moment of the random variable Y:

E(Y)   = 1st moment ("mean")
E(Y^2) = 2nd moment
E(Y^3) = 3rd moment
...

NOTATION: WMS use the notation μ'_k to denote the kth moment; i.e., E(Y^k) = μ'_k. This is common notation in statistics applications, but I rarely use it.

REMARK: The moment generating function (mgf) can be used to generate moments. In fact, from the theory of Laplace transforms, it follows that, if the mgf exists, it characterizes an infinite set of moments. So, how do we generate moments?

RESULT: Let Y denote a random variable (not necessarily a discrete random variable) with support R and mgf m_Y(t). Then

E(Y^k) = d^k m_Y(t)/dt^k, evaluated at t = 0.

Note that derivatives are taken with respect to t.
Proof. Assume, without loss, that Y is discrete. With k = 1, we have

(d/dt) m_Y(t) = (d/dt) Σ_{y ∈ R} e^{ty} p_Y(y) = Σ_{y ∈ R} (d/dt) e^{ty} p_Y(y) = Σ_{y ∈ R} y e^{ty} p_Y(y) = E(Y e^{tY}).

Thus, it follows that

(d/dt) m_Y(t) |_{t=0} = E(Y e^{0Y}) = E(Y).

Continuing to take higher-order derivatives,
we can prove that

d^k m_Y(t)/dt^k |_{t=0} = E(Y^k),

for any integer k ≥ 1. Thus, the result follows. □

MATHEMATICAL ASIDE: In the second line of the proof of the last result, we interchanged the derivative and the (possibly infinite) sum. This is permitted as long as m_Y(t) = E(e^{tY}) exists.

COMPUTING MEANS AND VARIANCES: Let Y denote a random variable (not necessarily a discrete random variable) with mgf m_Y(t). Then, we know that

E(Y) = (d/dt) m_Y(t) |_{t=0} and E(Y^2) = (d^2/dt^2) m_Y(t) |_{t=0}.

Thus,

V(Y) = E(Y^2) − [E(Y)]^2 = m''_Y(0) − [m'_Y(0)]^2.

REMARK: In many applications, being able to compute means and variances is important. Thus, we can use the mgf as a tool to do this. This is helpful because sometimes computing

E(Y) = Σ_{y ∈ R} y p_Y(y)

directly (or even higher-order moments) may be extremely difficult, depending on the form of p_Y(y).

Example 2.12. Suppose that Y is a random variable with pmf

p_Y(y) = (1/2)^y, for y = 1, 2, 3, ...;
p_Y(y) = 0, otherwise.

Find the mean of Y.
SOLUTION: Using the definition of expected values, the mean of Y is given by

E(Y) = Σ_{y ∈ R} y p_Y(y) = Σ_{y=1}^{∞} y (1/2)^y.

Finding this infinite sum is quite difficult (at least, this sum is not a geometric sum). It is easier to use moment generating functions. The mgf of Y is given by

m_Y(t) = E(e^{tY}) = Σ_{y ∈ R} e^{ty} p_Y(y) = Σ_{y=1}^{∞} e^{ty}(1/2)^y = Σ_{y=1}^{∞} (e^t/2)^y = (e^t/2)/(1 − e^t/2) = e^t/(2 − e^t),

for values of t < ln 2 (why?). Thus,

(d/dt) m_Y(t) = [e^t(2 − e^t) − e^t(−e^t)]/(2 − e^t)^2 = 2e^t/(2 − e^t)^2,

so that

E(Y) = (d/dt) m_Y(t) |_{t=0} = 2e^0/(2 − e^0)^2 = 2. □

Example 2.13. Let the random variable Y have pmf p_Y(y) given by

p_Y(y) = (3 − y)/6, for y = 0, 1, 2;
p_Y(y) = 0, otherwise.

For this probability distribution, simple calculations verify (show!) that E(Y) = 2/3 and V(Y) = 5/9. Let's "check" these calculations using the mgf. It is given by

m_Y(t) = E(e^{tY}) = Σ_{y ∈ R} e^{ty} p_Y(y) = (3/6)e^{t(0)} + (2/6)e^{t(1)} + (1/6)e^{t(2)} = 3/6 + (2/6)e^t + (1/6)e^{2t}.

Taking derivatives of m_Y(t) with respect to t, we get

(d/dt) m_Y(t) = (2/6)e^t + (2/6)e^{2t}

and

(d^2/dt^2) m_Y(t) = (2/6)e^t + (4/6)e^{2t}.

Thus,

E(Y) = (d/dt) m_Y(t) |_{t=0} = 2/6 + 2/6 = 2/3
E(Y^2) = (d^2/dt^2) m_Y(t) |_{t=0} = 2/6 + 4/6 = 1,

so that

V(Y) = E(Y^2) − [E(Y)]^2 = 1 − (2/3)^2 = 5/9.

So, in this example, we can use the mgf to get E(Y)
and V(Y), or we can compute E(Y) and V(Y) directly. We get the same answer (as we should). □

REMARK: Not only is the mgf a tool for computing moments, but it also helps us to characterize a probability distribution. How? When an mgf exists, it happens to be unique. Thus, if two random variables have the same mgf, then they have the same probability distribution! Sometimes this is referred to as the uniqueness property of mgfs; it is based on the uniqueness of Laplace transforms. For now, however, it suffices to envision the mgf as a special expectation that generates moments. This, in turn, helps us to compute means and variances of random variables.

2.6 Binomial distribution

BERNOULLI TRIALS: Many experiments consist of a sequence of trials, where
(i) each trial results in a "success" or a "failure,"
(ii) the trials are independent, and
(iii) the probability of "success," denoted by p, 0 < p < 1, is the same on every trial.

TERMINOLOGY: In a sequence of n Bernoulli trials, denote by Y the number of successes (out of n, where n is fixed). We call Y a binomial random variable and say that "Y has a binomial distribution with parameters n and success probability p." Shorthand notation is Y ~ b(n, p).

Example 2.14. Each of the following situations represents a binomial experiment. Are you satisfied with the Bernoulli assumptions in each instance?
(a) Suppose we flip a fair coin 10 times and let Y denote the number of tails in 10 flips. Here, Y ~ b(n = 10, p = 0.5).
(b) In an agricultural experiment, forty percent of all plots respond to a certain treatment. I have four plots of land to be treated. If Y is the number of plots that respond to the treatment, then Y ~ b(n = 4, p = 0.4).
(c) In rural Kenya, the prevalence rate for HIV is estimated to be around 8 percent. Let Y denote the number of HIV infecteds in a sample of 740 individuals. Here, Y ~ b(n = 740, p = 0.08).
(d) It is known that screws produced by a certain company do not meet specifications (i.e., are defective) with probability 0.001. Let Y denote the number of defectives in a
package of 40. Then, Y ~ b(n = 40, p = 0.001). □

DERIVATION: We now derive the pmf of a binomial random variable. That is, we need to compute p_Y(y) = P(Y = y), for each possible value of y ∈ R. Recall that Y is the number of "successes" in n Bernoulli trials, so the support set is R = {y : y = 0, 1, 2, ..., n}.

QUESTION: In a sequence of n trials, how can we get exactly y successes? Denoting S = success and F = failure, a possible sample point may be

SSFSFSFFSFSF....

Because the trials are independent, the probability that we get any particular ordering of y successes and n − y failures is p^y(1 − p)^{n−y}. Now, how many ways are there to choose y successes from n trials? We know that there are C(n, y) = n!/[y!(n − y)!] ways to do this. Thus, the pmf for Y is, for 0 < p < 1,

p_Y(y) = C(n, y) p^y (1 − p)^{n−y}, for y = 0, 1, 2, ..., n;
p_Y(y) = 0, otherwise.

Figure 2.5: Probability histogram for the number of plots which respond to treatment. This represents the b(n = 4, p = 0.4) model in Example 2.14(b).

Example 2.15. In Example 2.14(b), assume that Y ~ b(n = 4, p = 0.4). Here are the probability calculations for this binomial model:

P(Y = 0) = p_Y(0) = C(4, 0)(0.4)^0(0.6)^4 = 1 × (0.4)^0 × (0.6)^4 = 0.1296
P(Y = 1) = p_Y(1) = C(4, 1)(0.4)^1(0.6)^3 = 4 × (0.4)^1 × (0.6)^3 = 0.3456
P(Y = 2) = p_Y(2) = C(4, 2)(0.4)^2(0.6)^2 = 6 × (0.4)^2 × (0.6)^2 = 0.3456
P(Y = 3) = p_Y(3) = C(4, 3)(0.4)^3(0.6)^1 = 4 × (0.4)^3 × (0.6)^1 = 0.1536
P(Y = 4) = p_Y(4) = C(4, 4)(0.4)^4(0.6)^0 = 1 × (0.4)^4 × (0.6)^0 = 0.0256.

The probability histogram is depicted in Figure 2.5. □

Example 2.16. In a small clinical trial with 20 patients, let Y denote the number of patients that respond to a new skin rash treatment. The physicians assume that a binomial model is appropriate, so that Y ~ b(n = 20, p), where p denotes the probability of response to the treatment. In a statistical setting, p would be an unknown parameter that we desire to estimate. For this problem, we'll assume that p = 0.4. Compute (a) P(Y = 5), (b) P(Y ≥ 5), and (c) P(Y < 10).
SOLUTION (a):

P(Y = 5) = p_Y(5) = C(20, 5)(0.4)^5(0.6)^{15} ≈ 0.0746.

SOLUTION (b):

P(Y ≥ 5) = Σ_{y=5}^{20} C(20, y)(0.4)^y(0.6)^{20−y}.

This computation involves using the binomial pmf 16 times and adding the results.
TRICK: Instead of
computing the sum Σ_{y=5}^{20} C(20, y)(0.4)^y(0.6)^{20−y} directly, we can write

P(Y ≥ 5) = 1 − P(Y ≤ 4),

by the complement rule. We do this because WMS's Appendix III (Table 1, pp. 783-785) contains binomial probability calculations of the form

F_Y(a) ≡ P(Y ≤ a) = Σ_{y=0}^{a} C(n, y) p^y (1 − p)^{n−y},

for different n and p. With n = 20 and p = 0.4, we see from Table 1 that P(Y ≤ 4) = 0.051. Thus,

P(Y ≥ 5) = 1 − 0.051 = 0.949.

SOLUTION (c): P(Y < 10) = P(Y ≤ 9) = 0.755, from Table 1. □

REMARK: The function F_Y(y) ≡ P(Y ≤ y) is called the cumulative distribution function; we'll talk more about this function in the next chapter.

Figure 2.6: Probability histogram for the number of patients responding to treatment. This represents the b(n = 20, p = 0.4) model in Example 2.16.

CURIOSITY: Is the binomial pmf a valid pmf? Clearly, p_Y(y) > 0 for all y. To check that the pmf sums to one, consider the binomial expansion

[p + (1 − p)]^n = Σ_{y=0}^{n} C(n, y) p^y (1 − p)^{n−y}.

The LHS clearly equals 1, and the RHS represents the sum of the b(n, p) pmf over its support. Thus, p_Y(y) is valid.

MGF FOR THE BINOMIAL DISTRIBUTION: Suppose that Y ~ b(n, p). Then, the mgf of Y is given by

m_Y(t) = E(e^{tY}) = Σ_{y=0}^{n} e^{ty} C(n, y) p^y (1 − p)^{n−y} = Σ_{y=0}^{n} C(n, y)(pe^t)^y (1 − p)^{n−y} = (q + pe^t)^n,

where q = 1 − p. The last step follows from noting that Σ_{y=0}^{n} C(n, y)(pe^t)^y(1 − p)^{n−y} is the binomial expansion of (q + pe^t)^n. □

MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION: We want to compute E(Y) and V(Y), where Y ~ b(n, p). To do this, we will use the mgf. Taking the derivative of m_Y(t) with respect to t, we get

m'_Y(t) = (d/dt)(q + pe^t)^n = n(q + pe^t)^{n−1} pe^t.

Thus,

E(Y) = m'_Y(0) = n(q + pe^0)^{n−1} pe^0 = n(q + p)^{n−1} p = np,

since q + p = 1. Now, we need to find the second moment. By using the product rule for derivatives, we have

m''_Y(t) = (d/dt)[n(q + pe^t)^{n−1} pe^t] = n(n − 1)(q + pe^t)^{n−2}(pe^t)^2 + n(q + pe^t)^{n−1} pe^t.

Thus,

E(Y^2) = m''_Y(0) = n(n − 1)p^2 + np.

Finally, the variance is calculated by appealing to the variance computing formula; i.e.,

V(Y) = E(Y^2) − [E(Y)]^2 = n(n − 1)p^2 + np − (np)^2 = np(1 − p). □

Example 2.17. Artichokes are a marine-climate vegetable and thrive in the cooler coastal climates.
Most will grow on a wide range of soils, but produce best on a deep, fertile, well-drained soil. Suppose that 15 artichoke seeds are planted in identical soils and temperatures, and let Y denote the number of seeds that germinate. If 60 percent of all seeds germinate (on average), and we assume a b(15, 0.6) probability model for Y, the mean number of seeds that will germinate is

E(Y) = np = 15(0.6) = 9.

The variance is σ^2 = np(1 − p) = 15(0.6)(0.4) = 3.6 (seeds)^2. The standard deviation is σ = √3.6 ≈ 1.9 seeds. □

SPECIAL BINOMIAL DISTRIBUTION: In the b(n, p) family, when n = 1, the binomial pmf reduces to

p_Y(y) = p^y(1 − p)^{1−y}, for y = 0, 1;
p_Y(y) = 0, otherwise.

This is sometimes called the Bernoulli distribution. Shorthand notation is Y ~ b(1, p). The sum of n independent b(1, p) random variables actually follows a b(n, p) distribution!

2.7 Geometric distribution

TERMINOLOGY: Imagine an experiment where Bernoulli trials are observed. If Y denotes the trial on which the first success occurs, then Y is said to follow a geometric distribution with parameter p, the probability of success on any one trial, 0 < p < 1. This is sometimes written as Y ~ geom(p). The pmf for Y is given by

p_Y(y) = (1 − p)^{y−1} p, for y = 1, 2, 3, ...;
p_Y(y) = 0, otherwise.

RATIONALE: The form of this pmf makes intuitive sense; we need y − 1 failures (each of which occurs with probability 1 − p), and then a success on the yth trial (this occurs with probability p). By independence, we multiply

(1 − p) × (1 − p) × ⋯ × (1 − p) × p = (1 − p)^{y−1} p,

where the product contains y − 1 failure terms.

NOTE: Clearly, p_Y(y) > 0 for all y. Does p_Y(y) sum to one? Note that

Σ_{y=1}^{∞} (1 − p)^{y−1} p = p Σ_{w=0}^{∞} (1 − p)^w = p × 1/[1 − (1 − p)] = 1.

In the last step, we realized that Σ_{w=0}^{∞} (1 − p)^w is an infinite geometric sum with common ratio 1 − p. □

Example 2.18. Biology students are checking the eye color of fruit flies. For each fly, the probability of observing white eyes is p = 0.25. What is the probability the first white-eyed fly will be observed among the first five flies that we check?
SOLUTION: Let Y denote the number of flies needed to observe the first white-eyed fly. We need to compute P(Y ≤ 5). We can
envision each fly as a Bernoulli trial (each fly either has white eyes or not). If we assume that the flies are independent, then a geometric model is appropriate; i.e., Y ~ geom(p = 0.25), so that, for example, p_Y(5) = (0.75)^4(0.25) ≈ 0.08. Thus,

P(Y ≤ 5) = Σ_{y=1}^{5} P(Y = y) = 1 − (0.75)^5 ≈ 0.76.

The pmf for the geom(p = 0.25) model is depicted in Figure 2.7. □

MGF FOR THE GEOMETRIC DISTRIBUTION: Suppose that Y ~ geom(p). Then, the mgf of Y is given by

m_Y(t) = pe^t/(1 − qe^t),

where q = 1 − p, for t < −ln q. Proof. Exercise. □

MEAN AND VARIANCE OF THE GEOMETRIC DISTRIBUTION: With the mgf, we can derive the mean and variance. Differentiating the mgf, we get

m'_Y(t) = (d/dt)[pe^t/(1 − qe^t)] = [pe^t(1 − qe^t) + pe^t(qe^t)]/(1 − qe^t)^2 = pe^t/(1 − qe^t)^2.

Thus,

E(Y) = m'_Y(0) = pe^0/(1 − qe^0)^2 = p/(1 − q)^2 = p/p^2 = 1/p.

Similar calculations show that

E(Y^2) = m''_Y(0) = (1 + q)/p^2.

Figure 2.7: Probability histogram for the number of flies needed to find the first white-eyed fly. This represents the geom(p = 0.25) model in Example 2.18.

Finally,

V(Y) = E(Y^2) − [E(Y)]^2 = (1 + q)/p^2 − (1/p)^2 = q/p^2.

Example 2.19. At an apple orchard in Maine, "bags of 20 lbs" are continually observed until the first underweight bag is discovered. Suppose that four percent of bags are under-filled. If we assume that the bags are independent, and if Y denotes the number of bags observed, then Y ~ geom(p = 0.04). The mean number of bags we will observe is

E(Y) = 1/p = 1/0.04 = 25 bags.

The variance is

V(Y) = q/p^2 = 0.96/(0.04)^2 = 600 (bags)^2. □

2.8 Negative binomial distribution

NOTE: The negative binomial distribution can be motivated from two perspectives:
- as a generalization of the geometric
- as a "reversal" of the binomial.

Recall that the geometric random variable was defined to be the number of trials needed to observe the first success in a sequence of Bernoulli trials.

TERMINOLOGY: Imagine an experiment where Bernoulli trials are observed. If Y denotes the trial on which the rth success occurs (r ≥ 1), then Y has a negative binomial distribution with parameters r and p, where p denotes the probability of
success on any one trial, 0 < p < 1. This is sometimes written as Y ~ nib(r, p).

PMF FOR THE NEGATIVE BINOMIAL: The pmf for Y ~ nib(r, p) is given by

p_Y(y) = C(y − 1, r − 1) p^r (1 − p)^{y−r}, for y = r, r + 1, r + 2, ...;
p_Y(y) = 0, otherwise.

Of course, when r = 1, the nib(r, p) pmf reduces to the geom(p) pmf.

RATIONALE: The logic behind the form of p_Y(y) is as follows. If the rth success occurs on the yth trial, then r − 1 successes must have occurred during the first y − 1 trials. The total number of sample points in the underlying sample space S where this is the case is given by the binomial coefficient C(y − 1, r − 1), which counts the number of ways to order r − 1 successes and y − r failures in the first y − 1 trials. The probability of any particular ordering, by independence, is given by p^{r−1}(1 − p)^{y−r}. Now, on the yth trial, we observe the rth success; this occurs with probability p. Thus, putting it all together, we get

p_Y(y) = C(y − 1, r − 1) p^{r−1}(1 − p)^{y−r} × p = C(y − 1, r − 1) p^r (1 − p)^{y−r},

where the first factor pertains to the first y − 1 trials.

Example 2.20. A botanist in Iowa City is observing oak trees for the presence of a certain disease. From past experience, it is known that 30 percent of all trees are infected (p = 0.30). Treating each tree as a Bernoulli trial (i.e., each tree is infected or not), what is the probability that she will observe the 3rd infected tree (r = 3) on the 6th or 7th observed tree?
SOLUTION: Let Y denote the tree on which she observes the 3rd infected tree. Then, Y ~ nib(r = 3, p = 0.3). We want to compute P(Y = 6 or Y = 7):

P(Y = 6) = C(6 − 1, 3 − 1)(0.3)^3(1 − 0.3)^{6−3} ≈ 0.0926
P(Y = 7) = C(7 − 1, 3 − 1)(0.3)^3(1 − 0.3)^{7−3} ≈ 0.0972.

Thus,

P(Y = 6 or Y = 7) = P(Y = 6) + P(Y = 7) = 0.0926 + 0.0972 = 0.1898. □

RELATIONSHIP WITH THE BINOMIAL: Recall that, in a binomial experiment, we fix the number of Bernoulli trials (n) and we observe the number of successes. However, in a negative binomial experiment, we fix the number of successes we are to observe (r), and we continue to observe Bernoulli trials until we reach that success. This is another way to think about the negative binomial model.

MGF FOR THE NEGATIVE BINOMIAL DISTRIBUTION: Suppose that Y ~ nib(r, p). The mgf of Y is given by

m_Y(t) = [pe^t/(1 − qe^t)]^r, where
q = 1 − p, for all t < −ln q. Before we prove this, let's state and prove a lemma.

LEMMA: Suppose that r is a nonnegative integer. Then,

Σ_{y=r}^{∞} C(y − 1, r − 1)(qe^t)^{y−r} = (1 − qe^t)^{−r}.
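As a quick numerical aside (my addition, not part of the original notes): before proving the lemma, it can be sanity-checked in Python. For any 0 < w < 1 (here w plays the role of qe^t), the partial sums of the left-hand side should approach (1 − w)^{−r}:

```python
from math import comb

def lhs_partial(r, w, terms=2000):
    # partial sum of C(y-1, r-1) * w^(y-r) over y = r, r+1, ..., r+terms-1
    return sum(comb(y - 1, r - 1) * w ** (y - r) for y in range(r, r + terms))

r, w = 3, 0.4
print(round(lhs_partial(r, w), 6), round((1 - w) ** (-r), 6))  # both ≈ 4.62963
```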
population of objects Without replacement7 then the success probability changes trial to trial This7 violates the binomial model assumptionsll Of course7 if N is large ie7 in very large populations7 the two models will be similar7 because the change in the probability of success from trial to trial will be small maybe so small that it is not of practical concern PAGE 54 CHAPTER 2 STATMATH 5117 J TEBBS HYPERGEOMETRIC DISTRIBUTION Envision a collection of 71 objects sampled at random and without replacement from a population of size N where 7 denotes the size of Class 1 and N 7 7 denotes the size of Class 2 Let Y denote the number of objects in the sample that belong to Class 1 Then Y has a hypergeometric distribution written Y N hyperN n r where N total number of objects 7 number of the 1st class eg success N 7 r number of the 2nd class eg failure 71 number of objects sampled HYPERGEOMETRIC PMF The pmf for Y N hyperN n r is given by T N7T mfg y E R WW 0 otherwise where the support set R y E N max0n 7 N r S y S minnr BREAKDOWN In the hyperNnr pmf we have three parts number of ways to choose y Class 1 objects from r number of ways to choose 71 7 y Class 2 objects from N 7 r number of sample points REMARK In the hypergeometric model it follows that pyy sums to 1 over the support R but we omit this proof for brevity see Exercise 3176 pp 148 WMS Example 221 In my sh tank at home there are 50 sh Ten have been tagged lf I catch 7 sh and random and without replacement what is the probability that exactly two are tagged SOLUTION Here N 50 total number of sh n 7 sample size 7 10 tagged sh Class 1 N 7 r 40 untagged sh Class 2 and y 2 number of tagged sh caught Thus PWWPWDWW PAGE 55 CHAPTER 2 STATMATH 5117 J TEBBS What about the probability that my catch contains at most two tagged sh SOLUTION Here7 we want PY2 PY0PY1PY2 10015470 11015460 12015450 7 7 7 01867 03843 02964 08674 D Example 222 A supplier ships parts to another company in lots of 25 parts The receiving company has 
an acceptance sampling plan which adopts the following ac ceptance rule sample 5 parts at random and without replacement If there are no de fectives in the sample7 accept the entire lot otherwise7 reject the entire lot77 Let Y denote the number of defectives in the sampled parts ie7 out of 5 Then7 Y N hyper2557 7 where 7 denotes the number defectives in the lot in real life7 r is unknown De ne 002 PY 0 where p r25 denotes the true proportion of defectives in the lot The symbol 001 denotes the probability of accepting the lot which is a function of p Consider the following table7 whose entries are computed using the above probability expression 7 p 0009 0 0 100 1 004 080 2 008 063 3 012 050 4 016 038 5 020 029 10 040 006 15 060 001 PAGE 56 CHAPTER 2 STATMATH 5117 J TEBBS REMARK The graph of 001 versus p is sometimes called an operating character istic curve Of course as r or equivalently p increases the probability of accepting the lot decreases Acceptance sampling is important in statistical process control used in engineering and manufacturing settings In practice lot sizes may be very large eg N 1000 etc and developing sound sampling plans is crucial in order to avoid using defective parts in nished products D MEAN AND VARIANCE OF THE HYPERGEOMETRIC DISTRIBUTION lf Y N hyperNnr then and vltYgt NNT K We will prove this result later in the course RELATIONSHIP WITH THE BINOMIAL As noted earlier the binomial and hyperge ometric models are similar The key difference is that in a binomial experiment p does not change from trial to trial but it does in the hypergeometric setting noticeably if N is small However one can show that for y xed T if n WM pmf as rN a p The upshot is this if N is large ie the population size is large a binomial probability calculation withp rN closely approximates the corresponding hypergeometric probability calculation See pp 123 Example 223 In a small town there are 900 right handed individuals and 100 left handed individuals We take a sample 
of size n 20 individuals from this town at random and without replacement What is the probability that 4 or more people in the sample are left handed SOLUTION Let X denote the number of left handed individuals in our sample Let7s compute this probability PX 2 4 using both the binomial and hypergeometric models PAGE 57 CHAPTER 2 STATMATH 5117 J TEBBS o Hypergeometric Here N 1000 r 100 N 7 r 900 and n 20 Thus 3 100 900 PX24 17PX3 1iZmT020 wm0130947 0 m0 20 o Binomial Here n 20 and p rN 010 Thus 3 20 PX 2 4 17 PX 3 1 i Z 01w0920w m 0132953 D x m0 REMARK Of course since the binomial and hypergeometric models are similar when N is large their means and variances are similar too Note the similarities recall that the quantity rN a p as N a 00 and 210 Poisson distribution TERMINOLOGY Let the number of occurrences in a given continuous interval of time or space be counted A Poisson process enjoys the following properties 1 the number of occurrences in non overlapping intervals are independent random variables 2 The probability of an occurrence in a suf ciently short interval is proportional to the length of the interval 3 The probability of 2 or more occurrences in a suf ciently short interval is zero GOAL Suppose that an experiment satis es the above three conditions and let Y denote the number of occurrences in an interval of length one Our goal is to nd an expression for pyy PY y the pmf of Y PAGE 58 CHAPTER 2 STATMATH 5117 J TEBBS APPROACH Envision partitioning the unit interval 01 into n subintervals each of size Now if n is suf ciently large ie much larger than y then we can approximate the probability that y events occur in this unit interval by nding the probability that exactly one event occurrence occurs in exactly y of the subintervals o By Property 2 we know that the probability of one event in any one subinterval is proportional to the subinterval7s length say An where A is the proportionality constant 0 By Property 3 the probability of more than one occurrence 
in any subinterval is zero (for n large).
- Consider the occurrence/non-occurrence of an event in each subinterval as a Bernoulli trial. Then, by Property 1, we have a sequence of n Bernoulli trials, each with probability of success p = λ/n. Thus, a binomial calculation gives

  P(Y = y) ≈ C(n, y) (λ/n)^y (1 − λ/n)^{n−y}.

Now, to get a better approximation, we let n grow without bound. Then,

  lim_{n→∞} P(Y = y) = lim_{n→∞} C(n, y) (λ/n)^y (1 − λ/n)^{n−y}
                     = lim_{n→∞} [n(n−1)···(n−y+1)/n^y] × (λ^y/y!) × (1 − λ/n)^n × (1 − λ/n)^{−y}
                     = lim_{n→∞} a_n b_n c_n d_n.

Now, the limit of the product is the product of the limits. Thus,

  lim_{n→∞} a_n = lim_{n→∞} n(n−1)···(n−y+1)/n^y = 1,
  lim_{n→∞} b_n = λ^y/y!,
  lim_{n→∞} c_n = lim_{n→∞} (1 − λ/n)^n = e^{−λ},
  lim_{n→∞} d_n = lim_{n→∞} (1 − λ/n)^{−y} = 1.

Thus,

  p_Y(y) = λ^y e^{−λ}/y!, for y = 0, 1, 2, ...; and p_Y(y) = 0, otherwise.

This is the pmf of a Poisson random variable with parameter λ. We sometimes write Y ~ Poisson(λ). That p_Y(y) sums to one is easily seen as

  Σ_{y∈R} p_Y(y) = Σ_{y=0}^{∞} λ^y e^{−λ}/y! = e^{−λ} Σ_{y=0}^{∞} λ^y/y! = e^{−λ} e^{λ} = 1,

since e^{λ} = Σ_{y=0}^{∞} λ^y/y!, the Maclaurin series expansion of e^{λ}. □

EXAMPLES OF POISSON PROCESSES:

(a) counting the number of people in a certain community living to 100 years of age,
(b) counting the number of customers entering a post office in a given day,
(c) counting the number of α-particles discharged from a radioactive substance in a fixed period of time,
(d) counting the number of blemishes on a piece of artificial turf,
(e) counting the number of chocolate chips in a Chips Ahoy cookie.

Example 2.24. The number of cars abandoned weekly on a certain highway is modeled using a Poisson distribution with λ = 2.2. In a given week, what is the probability that (a) no cars are abandoned? (b) exactly one car is abandoned? (c) at most one car is abandoned? (d) at least one car is abandoned?

SOLUTIONS: Let Y denote the number of cars abandoned weekly.

(a) P(Y = 0) = p_Y(0) = (2.2)^0 e^{−2.2}/0! ≈ 0.1108
(b) P(Y = 1) = p_Y(1) = (2.2)^1 e^{−2.2}/1! ≈ 0.2438
(c) P(Y ≤ 1) = P(Y = 0) + P(Y = 1) = p_Y(0) + p_Y(1) ≈ 0.1108 + 0.2438 = 0.3546
(d) P(Y ≥ 1) = 1 − P(Y = 0) = 1 − p_Y(0) ≈ 1 − 0.1108 = 0.8892 □

Figure 2.8: Probability
histogram for the number of abandoned cars. This represents the Poisson(2.2) model in Example 2.24.

REMARK: WMS's Appendix III, Table 3 (pp. 787-791), includes an impressive table for Poisson probabilities of the form

  F_Y(a) = P(Y ≤ a) = Σ_{y=0}^{a} λ^y e^{−λ}/y!.

Recall that this function is called the cumulative distribution function of Y. This makes computing compound event probabilities much easier.

MGF FOR THE POISSON DISTRIBUTION: Suppose that Y ~ Poisson(λ). The mgf of Y, for all t, is given by

  m_Y(t) = E(e^{tY}) = Σ_{y=0}^{∞} e^{ty} λ^y e^{−λ}/y! = e^{−λ} Σ_{y=0}^{∞} (λe^t)^y/y! = e^{−λ} e^{λe^t} = exp[λ(e^t − 1)].

MEAN AND VARIANCE OF THE POISSON DISTRIBUTION: With the mgf, we can derive the mean and variance. Differentiating the mgf, we get

  (d/dt) m_Y(t) = (d/dt) exp[λ(e^t − 1)] = λe^t exp[λ(e^t − 1)].

Thus,

  E(Y) = (d/dt) m_Y(t) |_{t=0} = λe^0 exp[λ(e^0 − 1)] = λ.

Now, we need to find the second moment. By using the product rule for derivatives, we have

  (d²/dt²) m_Y(t) = (d/dt) λe^t exp[λ(e^t − 1)] = λe^t exp[λ(e^t − 1)] + (λe^t)² exp[λ(e^t − 1)].

Thus, E(Y²) = λ + λ², and

  V(Y) = E(Y²) − [E(Y)]² = λ + λ² − λ² = λ.

REVELATION: With a Poisson model, the mean and variance are always equal. □

Example 2.25. Suppose that Y denotes the number of monthly defects observed at an automotive plant. From past experience, engineers believe the Poisson model is appropriate and that Y ~ Poisson(7).

QUESTION 1: What is the probability that, in any given month, we observe 11 or more defectives?

SOLUTION: We want to compute

  P(Y ≥ 11) = 1 − P(Y ≤ 10) = 1 − 0.901 = 0.099 (Table 3).

QUESTION 2: What about the probability that, in a given year, we have two or more months with 11 or more defectives?

SOLUTION: First, we assume that the 12 months are independent (is this reasonable?) and call the event B = {11 or more defects in a month} a "success." Thus, under our independence assumptions, and viewing each month as a trial, we have a sequence of 12 Bernoulli trials with success probability p = P(B) = 0.099. Let X denote the number of months where we observe 11 or more defects. Then, X ~ b(12, 0.099), and

  P(X ≥ 2) = 1 − P(X = 0) − P(X = 1)
           = 1 − C(12, 0)(0.099)^0(0.901)^{12} − C(12, 1)(0.099)^1(0.901)^{11}
           ≈ 1 − 0.2862 − 0.3774 = 0.3364. □
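The Table 3 lookups and hand calculations in Examples 2.24 and 2.25 can be reproduced directly from the Poisson pmf. Below is a short Python sketch; the helper names (`poisson_pmf`, `poisson_cdf`) are ours, not from WMS.

```python
from math import comb, exp, factorial

def poisson_pmf(y, lam):
    """p_Y(y) = lam^y e^(-lam) / y!  for Y ~ Poisson(lam)."""
    return lam**y * exp(-lam) / factorial(y)

def poisson_cdf(y, lam):
    """F_Y(y) = P(Y <= y), summing the pmf from 0 to y."""
    return sum(poisson_pmf(k, lam) for k in range(y + 1))

# Example 2.24: abandoned cars, lam = 2.2
print(round(poisson_pmf(0, 2.2), 4))      # P(Y = 0)  ~ 0.1108
print(round(poisson_pmf(1, 2.2), 4))      # P(Y = 1)  ~ 0.2438
print(round(poisson_cdf(1, 2.2), 4))      # P(Y <= 1) ~ 0.3546
print(round(1 - poisson_pmf(0, 2.2), 4))  # P(Y >= 1) ~ 0.8892

# Example 2.25: 12 independent months, success prob p = P(Y >= 11), Y ~ Poisson(7)
p = 1 - poisson_cdf(10, 7)
prob = 1 - comb(12, 0) * (1 - p)**12 - comb(12, 1) * p * (1 - p)**11
```

Using the exact p ≈ 0.0985 (rather than the table-rounded 0.099) gives P(X ≥ 2) ≈ 0.334, in close agreement with the 0.3364 obtained above.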
POISSON PROCESSES OF ARBITRARY LENGTH: If events (or "occurrences") in a Poisson process occur at a rate of λ per unit time (or space), then the number of occurrences in an interval of length t also follows a Poisson distribution, with mean λt.

Example 2.26. Phone calls arrive at a switchboard according to a Poisson process, at a rate of λ = 3 per minute. Thus, if Y represents the number of calls received in 5 minutes, we have that Y ~ Poisson(15). The probability that 8 or fewer calls come in during a 5-minute span is given by

  P(Y ≤ 8) = Σ_{y=0}^{8} 15^y e^{−15}/y! ≈ 0.0377,

from Table 3. □

POISSON-BINOMIAL LINK: We have seen that the hypergeometric and binomial models are related; as it turns out, so are the Poisson and binomial models. This should not be surprising, because we derived the Poisson pmf by appealing to a binomial approximation.

RELATIONSHIP: Suppose that Y ~ b(n, p). If n is large and p is small, then

  C(n, y) p^y (1 − p)^{n−y} ≈ λ^y e^{−λ}/y!, for y ∈ R = {0, 1, 2, ..., n},

where λ = np.

Example 2.27. Hepatitis C (HCV) is a viral infection that causes cirrhosis and cancer of the liver. Since HCV is transmitted through contact with infectious blood, screening donors is important to prevent further transmission. The World Health Organization has projected that HCV will be a major burden on the US health care system before the year 2020. For public health reasons, researchers take a sample of n = 1875 blood donors and screen each individual for HCV. If 3 percent of the entire population is infected, what is the probability that 50 or more are HCV positive?

SOLUTION: Let Y denote the number of HCV-infected individuals in our sample. We compute this probability, P(Y ≥ 50), using both the binomial and Poisson models.

- Binomial: Here, n = 1875 and p = 0.03. Thus,

  P(Y ≥ 50) = Σ_{y=50}^{1875} C(1875, y)(0.03)^y(0.97)^{1875−y} ≈ 0.818783.

- Poisson: Here, λ = np = 1875(0.03) = 56.25. Thus,

  P(Y ≥ 50) = Σ_{y=50}^{∞} (56.25)^y e^{−56.25}/y! ≈ 0.814932.

As we can see, the Poisson approximation is quite good. □

RELATIONSHIP: One can see that the hypergeometric, binomial, and Poisson models are related
in the following way hyperNnr lt gt bnp lt gt Poisson The rst link results when N is large and rN a p The second link results when n is large and p is small so that An a p When these situations are combined7 as you might suspect7 one can approximate the hypergeometric model with a Poisson modelll PAGE 64 CHAPTER 3 STATMATH 5117 J TEBBS 3 Continuous Distributions Complementary reading from WMS Chapter 4 omit 411 31 Introduction RECALL In the last chapter7 we focused on discrete random variables Recall that a discrete random variable is one that can assume only a nite or countable number of values We also learned about probability mass functions pmfs Loosely speaking7 these were functions that told us how to assign probabilities and to which points we assign probabilities TERMINOLOGY A random variable is said to be continuous if its support set is uncountable ie7 the random variable can assume an uncountably in nite number of values We will present an alternate de nition shortly 32 Cumulative distribution functions NEW We now introduce a new function associated with any random variable discrete or continuous TERMINOLOGY The cumulative distribution function cdf of a random variable Y7 denoted by Fyy7 is given by Fyy PY S y fOr all y E R Note that the cdf is de ned for all y E R not just for those values of y E R7 the support set of Y REMARK Every random variable7 discrete or continuous7 has a cdf We7ll start by computing some cdfs for discrete random variables PAGE 65 CHAPTER 3 STATMATH 5117 J TEBBS Example 31 Let the random variable Y have prnf any y012 07 otherwise Consider the following probability calculations we My lt 0 My 0 3 8 Fy1 PY1PY0PY1g Fy2 PY2PY0PY1PY2g1 Furtherrnore7 o foranyylt0PYSy0 o forany0ltylt1PY yPY0 o foranylltylt213032PY0PY1 O ylt0 0 ylt1 1 ylt2 H CMUI GNU 122 7 Note that we have de ned Fyy for all y E R Sorne points are worth mentioning concerning the graphs of the pmf and cdf o PMF The height of the bar above y is the probability that Y 
assumes that value 7 For any y not equal to 017 or 27 pyy 0 PAGE 66 CHAPTER 3 STATMATH 5117 J TEBBS o CDF Fyy is a nondecreasing function see theoretical properties below i 0 S Fyy S 1 this makes sense since Fyy is a probabilityll The height of the jump77 at a particular point is equal to the probability associated with that point THEORETICAL PROPERTIES Let Y be a random variable discrete or continuous and suppose that Fyy is the cdf for Y Then i hmyaioo FYl 07 ii limyndr00 Fyy 17 iii Fyy is a right continuous function that is7 for any real 1 limyna Fyy Fy a7 and iv Fyy is a nondecreasing function that is7 for any yl S yg Fyy1 S Fyy2 EXERCISE Graph the cdf for the b57 02 and Poisson2 distributions 33 Continuous random variables ALTERNATE DEFINITION A random variable is said to be continuous if its cdf Fyy is a continuous function of y RECALL The cdfs associated with discrete random variables are stepfunctions Such functions are certainly not continuous however7 they are still right continuous TERMINOLOGY Let Y be a continuous random variable with cdf The prob ability density function pdf for Y7 denoted by jfyy7 is given by d fYl digEdy PAGE 67 CHAPTER 3 STATMATH 511 J TEBBS provided that iFy y E F y exists Furthermore appealing to the Fundamental dy Y Theorem of Calculus we know that y wa now REMARK These equations illustrate key relationships linking pdfs and cdfs for con tinuous random variablesll PROPERTIES OF CONTINUOUS PDFs Suppose that Y is a continuous random vari able with pdf fyy and support R Then 1 fyy gt 0 for all y E R 2 fR fyydy 1 ie the total area under the pdf equals one 3 The probability of an event B is computed by integrating the pdf fyy over B ie PY E B fB fyydy for any B C R REMARK Compare these to the analogous results for the discrete case see page 28 in the notes The only difference is that in the continuous case integrals replace sums Example 32 Suppose that Y has the pdf 0 2 ltylt O Uh A otherwise 7 This pdf is depicted in Figure 39 We 
want to find the cdf. To do this, we need to compute F_Y(y) = P(Y ≤ y) for all y ∈ R. There are three cases:

- when y ≤ 0, we have F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt = 0;
- when 0 < y < 2, we have F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt = ∫_{−∞}^{0} 0 dt + ∫_{0}^{y} (t/2) dt = y²/4;
- when y ≥ 2, we have F_Y(y) = ∫_{−∞}^{y} f_Y(t) dt = ∫_{−∞}^{0} 0 dt + ∫_{0}^{2} (t/2) dt + ∫_{2}^{y} 0 dt = 1.

Putting it all together, we have

  F_Y(y) = 0, for y ≤ 0; y²/4, for 0 < y < 2; 1, for y ≥ 2. □

Figure 3.9: Probability density function f_Y(y) in Example 3.2.

Example 3.3. Remission times for a certain group of leukemia patients, Y, measured in months, has cdf

  F_Y(y) = 0, for y < 0; 1 − e^{−y/3}, for y ≥ 0.

This cdf is depicted in Figure 3.10.

Figure 3.10: Cumulative distribution function F_Y(y) in Example 3.3.

Let's calculate the pdf of Y. Again, we need to consider all possible cases:

- when y < 0, f_Y(y) = (d/dy) F_Y(y) = (d/dy) 0 = 0;
- when y ≥ 0, f_Y(y) = (d/dy) F_Y(y) = (d/dy)(1 − e^{−y/3}) = (1/3)e^{−y/3}.

Thus, putting it all together, we get

  f_Y(y) = (1/3)e^{−y/3}, for y ≥ 0; 0, otherwise.

This pdf is depicted in Figure 3.11. □

EXERCISE: For the cdfs in Examples 3.2 and 3.3, verify that these functions satisfy the four theoretical properties for any cdf.

Figure 3.11: Probability density function f_Y(y) in Example 3.3. This is a probability model for leukemia remission times.

UBIQUITOUS RESULT: Recall that one of the properties of a continuous pdf is that

  P(Y ∈ B) = ∫_B f_Y(y) dy, for any B ⊂ R.

If B = {y : a ≤ y ≤ b} (i.e., B = [a, b]), then

  P(a ≤ Y ≤ b) = ∫_a^b f_Y(y) dy = F_Y(b) − F_Y(a).

Example 3.4. In Example 3.3, what is the probability that a randomly selected patient will have a remission time between 2 and 5 months? That is, what is P(2 ≤ Y ≤ 5)?

SOLUTION: We can attack this two ways: one using the cdf, one with the pdf.

- CDF (refer to Figure 3.10):

  P(2 ≤ Y ≤ 5) = F_Y(5) − F_Y(2) = (1 − e^{−5/3}) − (1 − e^{−2/3}) = e^{−2/3} − e^{−5/3} ≈ 0.325.

- PDF (refer to Figure 3.11):

  P(2 ≤ Y ≤ 5) = ∫_2^5 (1/3)e^{−y/3} dy = −e^{−y/3} |_2^5 = e^{−2/3} − e^{−5/3} ≈ 0.325. □

FACT: If Y is a continuous random variable with pdf f_Y(y), then P(Y = a) = 0 for any real constant a. This follows since

  P(Y = a) = P(a ≤ Y ≤ a) = ∫_a^a f_Y(y) dy = 0.

Thus, for continuous random variables, probabilities are assigned to single points with zero probability. This is the key difference between discrete and continuous random variables. An immediate consequence of the above fact is that, for any continuous random variable Y,

  P(a ≤ Y ≤ b) = P(a ≤ Y < b) = P(a < Y ≤ b) = P(a < Y < b),

and the common value is ∫_a^b f_Y(y) dy.

Example 3.5. Suppose that Y represents the time (in seconds) until a certain chemical reaction takes place in a manufacturing process, say, and varies according to the pdf

  f_Y(y) = c y e^{−y/2}, for y ≥ 0; 0, otherwise.

(a) Find the c that makes this a valid pdf. (b) Compute P(3.5 ≤ Y < 4.5).

SOLUTION: (a) To find c, recall that ∫_0^∞ f_Y(y) dy = 1. Thus,

  c ∫_0^∞ y e^{−y/2} dy = 1.

Figure 3.12: Probability density function f_Y(y) in Example 3.5. This is a probability model for chemical reaction times.

Using an integration by parts argument, with u = y and dv = e^{−y/2} dy, we have that

  ∫_0^∞ y e^{−y/2} dy = −2y e^{−y/2} |_0^∞ + ∫_0^∞ 2e^{−y/2} dy = 0 + (−4e^{−y/2}) |_0^∞ = 4.

Solving for c, we get c = 1/4.

(b) Using integration by parts again, we get

  P(3.5 ≤ Y < 4.5) = ∫_{3.5}^{4.5} (1/4) y e^{−y/2} dy ≈ 0.135.

Thus, the probability that the chemical reaction takes place between 3.5 and 4.5 seconds is about 0.14. □

DISCLAIMER: We will use integration by parts repeatedly in this course!

3.4 Mathematical expectation

3.4.1 Expected values

TERMINOLOGY: Let Y be a continuous random variable with pdf f_Y(y) and support R. The expected value (or mean) of Y is given by

  E(Y) = ∫_R y f_Y(y) dy.

If E(|Y|) = ∞, we say that the expected value does not exist.

RECALL: When Y is a discrete random variable with pmf p_Y(y), the expected value of Y is

  E(Y) = Σ_{y∈R} y p_Y(y).

So, again, we have the obvious similarities between the continuous and discrete cases.

Example 3.6. Suppose that Y has a pdf given by

  f_Y(y) = 2y, for 0 < y < 1; 0, otherwise.

This pdf is depicted in Figure 3.13. Here, the expected value of Y is given by

  E(Y) = ∫_R y f_Y(y) dy = ∫_0^1 2y² dy = (2y³/3) |_0^1 = 2/3. □

EXPECTATIONS OF FUNCTIONS OF Y: Let Y be a continuous random
variable with pdf fyy and support R7 and suppose that g is a real valued function Then7 gY is a random variable and El9Yl R9yfyydy lf EgY 00 we say that the expected value does not exist PAGE 74 CHAPTER 3 STATMATH 5117 J TEBBS fY 1 0 1 Figure 313 Probability density function fyy in Example 36 Example 37 With the pdf in Example 367 compute EY2 and Eln Y 1 734 1 EY2 2y3dy2 Z 12 0 0 Using integration by parts7 with u lny and d1 ydy 1 1 21 1 1 1 1 7 iyzxidy 727 y 77D 0 02 y 2 20 2 PROPERTIES OF EYPECTATIONS Let Y be a continuous random variable with pdf SOLUTIONS 1 ElnY 2 ylnydy 2ltiy21ny 0 fyy and support R7 suppose that 97 9192 gk are real valued functions7 and let 0 be any real constant Then a Ec c b E10900 0E19Y1 C 19122191501 2271 E19700 These properties are identical to those we discussed in the discrete case PAGE 75 CHAPTER 3 STATMATH 5117 J TEBBS 342 Variance A SPECIAL EXPECTATION Let Y be a continuous random variable with pdf fyy7 support R7 and mean u The variance of Y is given by awszwgtm Awwmhmm Example 38 With the pdf in Example 367 2y 0 lty lt1 fYl 07 otherwise7 compute 02 VY SOLUTIONS Recall that M EY 237 from Example 36 Using the de nition above7 1 22 1 2VY quot 2d7 U 0 y 3 ny 18 Alternatively7 we could use the variance computing formula ie7 the variance of Y is WY EY2 ECNZ We know EY 23 and EY2 12 from Example 37 Thus7 g2 VY 12 7 23 118 D 343 Moment generating functions ANOTHER SPECIAL EXPECTATION Let Y be a continuous random variable with pdf fyy and support R The moment generating function mgf for Y7 denoted by m t7 is given by mmEwoAwnm provided Eety lt 00 for t in an open neighborhood about 0 ie7 there exists some h gt 0 such that Eety lt 00 for all t E flu1 lf Eety does not exist in an open neighborhood of 07 we say that the moment generating function does not exist PAGE 76 CHAPTER 3 STATMATH 5117 J TEBBS Example 39 Suppose that Y has a pdf given by 0 My 5 ygt 07 otherwise Find the moment generating function of Y and use it to compute EY and VY 
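Before working out the mgf analytically, one can sanity-check the claim numerically: for f_Y(y) = e^{−y}, y > 0, the integral defining E(e^{tY}) should come out to 1/(1 − t) for t < 1. The sketch below (our own, not from WMS) approximates the integral with composite Simpson's rule, truncating at y = 60, where the integrand e^{(t−1)y} is negligible for t ≤ 0.5.

```python
from math import exp

def mgf_numeric(t, upper=60.0, n=20000):
    """Approximate E(e^{tY}) = integral of e^{ty} e^{-y} dy over (0, upper)
    by composite Simpson's rule with n (even) subintervals."""
    h = upper / n
    # Simpson weights: endpoints 1, odd interior nodes 4, even interior nodes 2
    s = exp((t - 1) * 0.0) + exp((t - 1) * upper)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * exp((t - 1) * (k * h))
    return s * h / 3

print(mgf_numeric(0.5))  # should be close to 1/(1 - 0.5) = 2
print(mgf_numeric(0.0))  # m_Y(0) = E(1), should be close to 1
```

The same numerical check works for any candidate mgf formula, which makes it a useful habit when derivations get long.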
SOLUTION mylttgtElte gt etyfyltygtdy 0 ewww 0 Cjifw4gt lit 110 7 1 7 171 for values oft lt 1 With the mgf7 we can calculate the mean and variance Differentiating MW 2 gm 1 2 f w 170 To nd the variance7 we rst nd the second moment f aid 1 272 1 3 d my m 14 lit 39 MW the mgf7 we get Thus7 Thus7 the second moment is The computing formula gives 02 VY EY2 7 EY2 2 712 1 D EXERCISE Find EY and VY directly ie7 do not use the mgf Are your answers the same as above PAGE 77 CHAPTER 3 STATMATH 5117 J TEBBS 35 Uniform distribution TERMINOLOGY A random variable Y is said to have a uniform distribution from 91 to 02 61 lt 02 if its pdf is given by 1 fYl m7 0 01 lt y lt 62 otherwise 7 Shorthand notation is Y 1091 62 That the 1091 02 pdf integrates to one is obvious 92 92 7 71 will 62 9 62761 62761 9 62761 REMARKS Sometimes we call 91 and 02 the model parameters A popular member since of the 1091 02 family is the U0 1 distribution ie a uniform distribution with 91 0 and 02 1 this model is used extensively in computer programs to simulate random numbers The pdf for a U0 2 random variable is depicted in Figure 39 UNIFORM CDF The cdf Fyy for a U6162 distribution is given by 07 y S 91 Till 5221 01 lt y lt 62 17 y 2 92 Example 310 In a sedimentation experiment the size of particles studied are uniformly distributed between 01 and 05 millimeters What proportion of particles are less than 04 millimeters SOLUTION Let Y denote the size of a randomly selected particle Then Y N U01 05 and 04 0 3 PY lt 04 OV4dy i 4 075 D 0 05 i 01 04 01 04 MEAN AND VARIANCE If Y 210 62 then M EY 2 PAGE 78 CHAPTER 3 STATMATH 5117 J TEBBS and 92 i 902 12 39 These values can be computed using the pdf directly try it or by using the mgf below WY MOMENT GENERATING FUNCTION Suppose that Y N U6162 The mgf of Y is given by 92 91 e e 7 t 0 quotWW 92791 7g 1 t 0 7 36 Normal distribution TERMINOLOGY A random variable Y is said to have a normal distribution if its pdf is given by 1 2 lt lt 76 0 foo y 00 fYy m otherwise 
Shorthand notation is Y N NM702 There are two parameters in the normal distrib ution the mean M and the variance 02 FACTS ABOUT ANY NORMAL DISTRIBUTION a The pdf is symmetric about M that is7 for any a E R fyM 7 a fyM a b The points of in ection are located at y M i 039 c Any normal distribution can be transformed to a standard77 normal distribution llIIly icO 0 TERMINOLOGY A normal distribution with mean M 0 and variance 02 1 is called the standard normal distribution It is conventional to let Z denote a random variable that follows a standard normal distribution we often write Z N N01 IMPORTANT Tabled values of the standard normal probabilities are given in Ap pendix lll Table 4 pp 792 of WMS This table turns out to be very helpful since the PAGE 79 CHAPTER 3 STATMATH 5117 J TEBBS integral y 1 t 2 Fyy 2WU6 TH dt does not exist in closed form Speci cally the table provides values of 1FZ2PZgt2OO 1 e q Zdu 27m As mentioned any normal distribution can be transformed to a standard77 normal dis tribution we7ll see how later so there is only a need for one table of probabilities Of course probabilities like PZ gt 2 can be obtained using software too VALIDITY To show that the NM02 pdf integrates to one let 2 dz ldy and dy Udz Now de ne in 3771 Then 00 2 1 67y7 Cl 00 27W 00 1 2 7 1 2d 6 2 00 x 27139 Since I gt 0 it suf ces to show that I2 1 However note that 2 oo 1 eigds gtlt 00 1 eigd 00 x27T 00 x 27139 y 1 oo 00 2 y2 7 exp 7 27139 700 700 2 Now switching to polar coordinates ie letting z rcos0 and y rsin 0 we get 2 dxdy yz 7quot2cos2 6 sin2 9 r2 and dxdy rdrd ie the Jacobian of the transformation from zy space to 736 space Thus we write 2 27r co 1 7722 I 76 rdrd 90 70 2 1 27r 7 re r22dr d0 2 90 70 1 27r 7722 00 7 5 d9 27139 90 70 1 27r 27f 7 d6 i l D 27139 90 27139 90 PAGE 80 CHAPTER 3 STATMATH 5117 J TEBBS MOMENT GENERATING FUNCTION Suppose that Y N Np02 The rngf of Y de ned for all t is given by Uztz myt exp Mt Proof 1 00 gty y 2dy 27m 700 De ne b ty 7 y the exponent 
in the last integral Then 7 1y7M2 Md 0 1 2 2 t if 2 1 2029 M9M 1 7gb 7 2W 7 202w 2 12 7 2W 02m le complete the square 1 1 7 7 22 7 2m 02m u 0202 7 u 0202 le V add and subtract 1 2 2 1 2 2 2 1ly atl 1 at7M1 7 1 2 1 2 2 42 2 7 1 ya w 2M010tiltgt7 csay where a M Uzt Thus the last integral above is equal to 00 21 e 22lty7agt2dygt x go 700 7TH Na72 density Now nally note 60 E expc expmt 02t22 Thus the result follows E EXERCISE Use the rngf to verify that EY M and VY 02 IMPORTANT Suppose that Y N NW 02 Then the random variable Y 7 M 2 has a normal distribution with mean 0 and variance 1 That is Z N N0 1 PAGE 81 CHAPTER 3 STATMATH 5117 J TEBBS Proof Let Z Y 7 p The mgf of Z is given by 771200 EWZ EleXptZl y 7 E exp t M a E l eXp Mt0 eXp examam mytc7 expwta x exp mta 7 vata 622 2 7 which is the mgf of a N0 1 random variable Thus by the uniqueness of moment generating functions we know that Z N N0 1 D USEFULNESS From the last result we know that if Y N NW 02 then 7 Y7 7 7 7 ylltylty2y1 Mlt Mlty2 My1 MltZlty2 M 039 039 039 039 039 As a result 039 039 I 112 i M 7 I 91 i M 039 039 7 where denotes the cdf of the N0 1 distribution Note also that 72 17 lt1gtz for z gt 0 Example 311 In Florida young large mouth bass were studied to examine the level of mercury contamination Y measured in parts per million which varies according to a normal distribution with mean M 18 and variance 02 16 This model is depicted in Figure 314 a What proportion of contamination levels are between 11 and 21 parts per million b For this model ninety percent of all contamination levels will be above what mercury level PAGE 82 CHAPTER 3 STATMATH 5117 J TEBBS W 006 000 010 l l l 004 l 002 l 000 y mercury levels ppm Figure 314 Probability density function fyy in Example 311 A modelfoiquot mercury contamination in large mouth bass SOLUTIONS a In this part7 we want P11 lt Y lt 21 By standardizing7 we see that 11718 Y718 21718 P11ltYlt21 Plt lt lt gt 4 4 4 11718 21718 P Z P7175 lt Z lt 075 lt1gt075 4 4175 07734 
4 00401 07333 For b7 we want to nd the 10th percentile of the Y N N1816 distribution ie7 we want the value y such that 090 PY gt y 1 7 To nd y rst well nd the 2 so that 090 PZ gt z 1 7 1327 then well unstandardize y From Table 47 we see 2 7128 so that y 718 4 7128 gt y 1288 Thus7 90 percent of all contarnination levels are larger that 1288 parts per million D PAGE 83 CHAPTER 3 STATMATH 5117 J TEBBS 37 The gamma family of pdfs THE GAMMA FAMILY In this section we examine an important family of probability distributions namely those in the gamma family There are three named distribu tions77 in particular 0 exponential distribution 0 gamma distribution 0 X2 distribution NOTE The exponential and gamma distributions are popular models for lifetime ran dom variables ie random variables that record time to event77 measurements such as the lifetimes of an electrical component death times for human subjects etc Other life time distributions include the lognormal Weibull and loggamma probability models 371 Exponential distribution TERMINOLOGY A random variable Y is said to have an exponential distribution with parameter 6 gt 0 if its pdf is given by Ten97 y gt 0 fYl 0 otherwise NOTATION Shorthand notation is Y N exponential The value 6 determines the scale of the distribution it is sometimes called the scale parameter That the expo nential density function integrates to one is easily shown verifyl MOMENT GENERATING FUNCTION Suppose that Y N exponential The mgf of Yis given by 1 t 7 mY 1731 for values of t lt 16 PAGE 84 CHAPTER 3 STATMATH 5117 J TEBBS Proof Let B 771 nt 1 so that 77 617 604 and ty 7 y iyn Then7 mytEetY etye y dy 0 1 00 eiyndy 3 7 Eeiyn 00 110 1 17m Note that for the last expression to be correct7 we need 77 gt 0 ie7 we need t lt D MEAN AND VARIANCE Suppose that Y N exponential The mean and variance of Y are given by and Proof Exercise D Example 312 The lifetime of a certain electrical component has an exponential dis tribution with mean 6 500 hours Engineers 
using this component are particularly interested in the probability until failure What is the probability that a randomly se lected component fails before 100 hours lasts between 250 and 750 hours SOLUTION With 6 5007 the pdf for Y is given by 1 731500 6 7 y gt 0 fyy 500 07 otherwise This pdf is depicted in Figure 315 Thus7 the probability of failing before 100 hours is given by 100 1 woo P Y lt 100 7 y d m 0181 lt gt 5006 y Similarly7 the probability of failing between 250 and 750 hours is 750 1 P250 lt Y lt 750 few Ody m 0383 D 250 500 PAGE 85 CHAPTER 3 STATMATH 5117 J TEBBS fY 0 0015 00020 l l 00010 l 00005 l l l l l l 0 500 1000 1500 2000 2500 00000 y component lifetimes hours Figure 315 Probability density function fyy in Example 312 A model for electrical component lifetimes CUMULATIVE DISTRIBUTION FUNCTION Suppose that Y N exponential Then7 the cdf of Y exists in closed form and is given by 0 y 0 7 Till 17 e y 7 y gt 0 The cdf for the exponential random variable in Example 312 is depicted in Figure 316 THE MEMORYLESS PROPERTY Suppose that Y N exponential 7 and suppose that r and s are both positive constants Then PYgtrlegtr PYgts That is7 given that the lifetime Y has exceeded 7 the probability that Y exceeds rs ie7 an additional 5 units is the same as if we were to look at Y unconditionally lasting until time 5 Put another way7 that Y has actually made it77 to time 7 has been forgotten The exponential random variable is the only continuous random variable that enjoys the rnernoryless property PAGE 86 CHAPTER 3 STATMATH 5117 J TEBBS FY l l l l l 0 500 1000 1500 2000 2500 y component lifetimes hours Figure 316 Cumulative distribution function Fyy in Example 312 A model for electrical component lifetimes RELATIONSHIP WITH A POISSON PROCESS Suppose that we are observing events according to a Poisson process with rate A 16 and let the random variable W denote the time until the rst occurrence Then W N exponential Proof Clearly W is a continuous random variable 
with nonnegative support Thus for w 2 0 we have FwwPW3w 1 PWgtw 1 7 Pno events in 0wl Ol 7 1 5 Substituting A 16 we nd that FWw 17 e w the cdf of an exponential random variable with mean 6 Thus the result follows E PAGE 87 CHAPTER 3 STATMATH 5117 J TEBBS 372 Gamma distribution THE GAMMA FUNCTION The gamma function is a function of t de ned for all t gt 0 as M 0 FACTS ABOUT THE GAMMA FUNCTION 1 A simple argument shows that Na a 7 1Pa 717 for all a gt 1 2 If a is an integer7 Na a 71l For example7 P5 4 24 TERMINOLOGY A random variable Y is said to have a gamma distribution with parameters a gt 0 and B gt 0 if its pdf is given by 04 15 21197 y gt 0 0 1 a my fad m m otherwise 7 Shorthand notation is Y N gammaa REMARK This model is indexed by two parameters We call a the shape parameter and B the scale parameter The gamma probability model is extremely exible By changing the values of a and B the gamma pdf can assume many shapes Thus7 the gamma model is very popular for modeling lifetime data IMPORTANT NOTE When a 17 the gamma pdf reduces to the exponential pde REMARK To see that the gamma pdf integrates to one7 consider the change of variable u y Then7 du dy and 00 1 041 y d 00 041 ud E D Amway 5 y mew0 5 u a 139 MGF FOR THE GAMMA DISTRIBUTION Suppose that Y N gammaa Then7 for values of t lt 167 the mgf of Y is given by PAGE 88 CHAPTER 3 STATMATH 5117 J TEBBS Proof Let B 771 nt 1 so that 77 617 604 and ty 7 y iyn 0 1 myt Eety ety yaile y dy 0 WW i 00 041 ill71d 6040 My 6 y 7 77a 00 1 0471 7117 7 7 7y e dy 50 0 WWW gamaoz7 density 9 1 D MEAN AND VARIANCE If Y N gammaoz 7 then EY a6 and VY 0462 Proof Exercise D TERMINOLOGY When talking about the gammaoz density function7 it is often helpful to think of the formula in two parts 0 the kernel ya le y o the constant Namal l Example 313 Suppose that Y has pdf given by cyze y l y gt 0 0 fYl otherwise a What is the value of c that makes this a valid pdf b Give an integral expression that equals PY lt 8 How could we solve this 
equation c What is the mgf of Y d What is the mean and standard deviation of Y RELATIONSHIP WITH A POISSON PROOESS Suppose that we are observing events according to a Poisson process with rate A 167 and let the random variable W denote the time until the ath occurrence Then7 W N gammaoz PAGE 89 CHAPTER 3 STATMATH 5117 J TEBBS fY 006 l 004 l Figure 317 Probability density function fyy in Example 313 Proof Clearly7 W is continuous with nonnegative support Thus7 for w 2 07 we have FWw PW S w 17PWgtw 1 7 Pfewer than 04 events in 0w 0471 e Awwj 1 i Z j0 The pdf of lV7 j xxw7 is equal to F CVw7 provided that this derivative exists For it gt 07 W F w Mm 7 Mimi A 7 Awa telescoping sum Substituting A 157 for w gt 07 which is the pdf for the garnrnaoz distribution 7 4w 7 4w 7 Awyil 7 A6 6 A a 71 J w quot16 w A DH Aw mini mmw 539 f 1071 wrle w W WW D PAGE 90 CHAPTER 3 STATMATH 511 J TEBBS 373 X2 distribution TERMINOLOGY In the gammaoz family when 04 V2 for any integer V and B 2 we call the resulting distribution a X2 distribution with V degrees of freedom If Y has a X2 distribution with V degrees of freedom we write Y N X2V NOTE At this point it suf ces to know that the X2 distribution is really just a spe cial77 gamma distribution However it should be noted that the X2 distribution is used extensively in applied statistics Many statistical procedures used in the literature are valid because of this model PROBABILITY DENSITY FUNCTION If Y N X2V then the pdf of Y is given by 1 Ix2H 7212 g p y 6 7 y gt 0 fyy PM 2 0 otherwise 7 MOMENT GENERATING FUNCTION Suppose that Y N X2V Then for values of tlt 12 the mgf of Y is given by me y2 Proof Take the gammaoz mgf and put in Oz V2 and B 2 D MEAN AND VARIANCE OF THE X2 DISTRIBUTION If Y N X2V then EY V and VY 2V Proof Take the gammaoz formulae and substitute Oz V2 and B 2 D TABLED VALUES FOR CDF Because the X2 distribution is so pervasive in applied statistics tables of probabilities are common Table 6 WMS pp 794 5 provides values of y 
which satisfy 1 py gt y Viuwzkieiuzdu y many2 for different values of y and degrees of freedom V PAGE 91 CHAPTER 3 STATMATH 5117 J TEBBS 38 Beta distribution TERMINOLOGY A random variable Y is said to have a beta distribution with parameters 04 gt 0 and B gt 0 if its pdf is given by mya l 711quot 0 lt y lt1 0 fYl otherwise 7 Since the support of Y is 0 lt y lt 1 the beta distribution is a popular probability model for proportions Shorthand notation is Y N betaoz The constant Ba is given by mam Na 6 39 TERMINOLOGY When talking about the betaoz density function it is often helpful 3amp6 to think of the formula in two parts 0 the kernel y 11 y 71 o the constant m THE SHAPE OF THE BETA PDF The beta pdf is very exible That is by changing the values of Oz and B we can come up with many different pdf shapes See Figure 318 for examples 0 When 04 B the pdf is symmetric about the line y 0 When 04 lt B the pdf is skewed right ie smaller values of y are more likely 0 When 04 gt B the pdf is skewed left ie larger values of y are more likely 0 When 04 B 1 the beta pdf reduces to the U0 1 pdf MOMENT GENERATING FUNCTION The mgf of a betaa 6 random variable exists but not in a nice compact formula Hence we7ll compute moments directly PAGE 92 CHAPTER 3 STATMATH 5117 J TEBBS o o o 2 o o a o a 10 o o o 2 4 o a o a 10 Beta Betaaz 00 02 04 06 as 10 00 02 04 06 as 10 Beta3 2 Betamm Figure 318 Four di erent beta probability models MEAN AND VARIANCE OF THE BETA DISTRIBUTION lfY betaoz then 04 045 EY 7 and VY m Proof Exercise D Example 314 A small lling station is supplied with premium gasoline once per day and can supply at most 1000 gallons lts daily volume of sales in 1000s of gallons is a random variable7 say Y7 which has the beta distribution 517y47 0lty lt1 0 fYl otherwise a What is are the parameters in this distribution ie7 what are 04 and B b What is the average daily sales c What need the capacity of the tank be so that the probability of the supply being exhausted in any day is 
0.01? (d) Treating daily sales as independent from day to day, what is the probability that, during any given 7-day span, there are exactly 2 days where sales exceed 200 gallons?

SOLUTIONS: (a) α = 1 and β = 5.

(b) E(Y) = 1/(1 + 5) = 1/6. Thus, the average daily sales is about 166.66 gallons.

(c) We want to find the capacity, say c, such that P(Y > c) = 0.01. This means that

  P(Y > c) = ∫_c^1 5(1 − y)^4 dy = 0.01,

and we need to solve this equation for c. Using a change of variable u = 1 − y,

  ∫_c^1 5(1 − y)^4 dy = ∫_0^{1−c} 5u^4 du = u^5 |_0^{1−c} = (1 − c)^5.

Thus, we have (1 − c)^5 = 0.01 ⟹ 1 − c = (0.01)^{1/5} ⟹ c = 1 − (0.01)^{1/5} ≈ 0.6019, and so there must be about 602 gallons in the tank.

(d) First, we compute

  P(Y > 0.2) = ∫_{0.2}^1 5(1 − y)^4 dy = ∫_0^{0.8} 5u^4 du = (0.8)^5 ≈ 0.328.

This is the probability that sales exceed 200 gallons on any given day. Now, treat each day as a trial and let X denote the number of days where "sales exceed 200 gallons" (i.e., a "success"). Because days are assumed independent, X ~ b(7, 0.328), and

  P(X = 2) = C(7, 2)(0.328)²(1 − 0.328)^5 ≈ 0.310. □

3.9 Chebyshev's Inequality

MARKOV'S INEQUALITY: Suppose that X is a nonnegative random variable with pdf (pmf) f_X(x), and let c be any positive constant. Then,

  P(X > c) ≤ E(X)/c.

Proof. First, define the event B = {x : x > c}. We know that

  E(X) = ∫_0^∞ x f_X(x) dx = ∫_B x f_X(x) dx + ∫_{B̄} x f_X(x) dx ≥ ∫_B x f_X(x) dx ≥ ∫_B c f_X(x) dx = c P(X > c).

Dividing both sides by c gives the result. □

SPECIAL CASE: Let Y be any random variable, discrete or continuous, with mean μ and variance σ² < ∞. Then, for k > 0,

  P(|Y − μ| > kσ) ≤ 1/k².

This is known as Chebyshev's Inequality.

Proof. Apply Markov's Inequality with X = (Y − μ)² and c = k²σ². With these substitutions, we have

  P(|Y − μ| > kσ) = P[(Y − μ)² > k²σ²] ≤ E[(Y − μ)²]/(k²σ²) = σ²/(k²σ²) = 1/k². □

REMARK: The beauty of Chebyshev's result is that it applies to any random variable Y. In words, P(|Y − μ| > kσ) is the probability that the random variable Y will differ from the mean μ by more than k standard deviations. If we do not know how Y is distributed, we can not compute P(|Y − μ| > kσ) exactly, but at least we can put an upper bound on this probability; this is what Chebyshev's result allows us to do. Note that
P(|Y − μ| > kσ) = 1 − P(|Y − μ| ≤ kσ) = 1 − P(μ − kσ ≤ Y ≤ μ + kσ). Thus, it must be the case that

P(|Y − μ| ≤ kσ) = P(μ − kσ ≤ Y ≤ μ + kσ) ≥ 1 − 1/k².

Example 3.15. Suppose that Y represents the amount of precipitation (in inches) observed annually in Barrow, AK. The exact probability distribution for Y is unknown, but, from historical information, it is posited that μ = 4.5 and σ = 1. What is a lower bound on the probability that there will be between 2.5 and 6.5 inches of precipitation during the next year?

SOLUTION: We want to compute a lower bound for P(2.5 ≤ Y ≤ 6.5). Note that

P(2.5 ≤ Y ≤ 6.5) = P(|Y − μ| ≤ 2σ) ≥ 1 − 1/2² = 0.75.

Thus, we know that P(2.5 ≤ Y ≤ 6.5) ≥ 0.75. The chances are good that, in fact, Y will be between 2.5 and 6.5 inches.

4 Multivariate Distributions

Complementary reading from WMS: Chapter 5.

4.1 Introduction

REMARK: So far, we have only discussed univariate (single) random variables: their probability distributions, moment generating functions, means and variances, etc. In practice, however, investigators are often interested in probability statements concerning two or more random variables. Consider the following examples:

• In an agricultural field trial, we might want to understand the relationship between yield (Y, measured in bushels/acre) and the nitrogen content of the soil.
• In an educational assessment program, we might want to predict a student's posttest score from her pretest score.
• In a clinical trial, physicians might want to characterize the concentration of a drug (Y) in one's body as a function of the time (X) from injection.
• In a marketing study, the goal is to forecast next month's sales, say Y, based on sales figures from the previous n − 1 periods, say Y1, Y2, ..., Y(n−1).

GOAL: In each of these examples, our goal is to describe the relationship between (or among) the random variables that are recorded. As it turns out, these relationships can be described mathematically through a probabilistic model.

TERMINOLOGY: If Y1 and Y2 are random variables, then (Y1, Y2) is called a bivariate random vector. If Y1, Y2, ..., Yn denote n random variables, then Y = (Y1, Y2, ..., Yn) is
called an n-variate random vector. For much of this chapter, we will consider the n = 2 bivariate case. However, all ideas discussed herein extend naturally to higher-dimensional settings.

4.2 Discrete random vectors

TERMINOLOGY: Let Y1 and Y2 be discrete random variables. Then, (Y1, Y2) is called a discrete random vector, and the joint probability mass function (pmf) of Y1 and Y2 is given by

p_{Y1,Y2}(y1, y2) = P(Y1 = y1, Y2 = y2),

for all (y1, y2) ∈ R_{Y1,Y2}. The set R_{Y1,Y2} ⊆ R² is the two-dimensional support of (Y1, Y2). The function p_{Y1,Y2}(y1, y2) has the following properties:

1. 0 ≤ p_{Y1,Y2}(y1, y2) ≤ 1, for all (y1, y2) ∈ R_{Y1,Y2}
2. Σ_{R_{Y1,Y2}} p_{Y1,Y2}(y1, y2) = 1
3. P[(Y1, Y2) ∈ B] = Σ_B p_{Y1,Y2}(y1, y2), for any set B ⊂ R².

Example 4.1. An urn contains 3 red balls, 4 white balls, and 5 green balls. Let (Y1, Y2) denote the bivariate random vector where, out of 3 randomly selected balls,

Y1 = number of red balls
Y2 = number of white balls.

Consider the following calculations; each probability is computed by counting, as p_{Y1,Y2}(y1, y2) = C(3, y1)C(4, y2)C(5, 3 − y1 − y2)/C(12, 3), where C(12, 3) = 220:

p_{Y1,Y2}(0, 0) = 10/220, p_{Y1,Y2}(0, 1) = 40/220, p_{Y1,Y2}(0, 2) = 30/220, p_{Y1,Y2}(0, 3) = 4/220, p_{Y1,Y2}(1, 0) = 30/220, p_{Y1,Y2}(1, 1) = 60/220,

and, similarly,

p_{Y1,Y2}(1, 2) = 18/220, p_{Y1,Y2}(2, 0) = 15/220, p_{Y1,Y2}(2, 1) = 12/220, p_{Y1,Y2}(3, 0) = 1/220.

Here, the support is R_{Y1,Y2} = {(0,0), (0,1), (0,2), (0,3), (1,0), (1,1), (1,2), (2,0), (2,1), (3,0)}. Table 4.2 depicts the joint pmf. It is straightforward to see that Σ_{R_{Y1,Y2}} p_{Y1,Y2}(y1, y2) = 1.

Table 4.2: Joint pmf p_{Y1,Y2}(y1, y2) for Example 4.1, displayed in tabular form.

p(y1, y2)    y2 = 0    y2 = 1    y2 = 2    y2 = 3
y1 = 0       10/220    40/220    30/220     4/220
y1 = 1       30/220    60/220    18/220
y1 = 2       15/220    12/220
y1 = 3        1/220

QUESTION: What is the probability that, among the three balls chosen, there is at most 1 red ball and at most 1 white ball? That is, what is P(Y1 ≤ 1, Y2 ≤ 1)?

SOLUTION: Here, we want to compute P(B), where the set B = {(0,0), (0,1), (1,0), (1,1)}. From the properties associated with the joint pmf, this calculation is given by

P(Y1 ≤ 1, Y2 ≤ 1) = p_{Y1,Y2}(0, 0) + p_{Y1,Y2}(0, 1) + p_{Y1,Y2}(1, 0) + p_{Y1,Y2}(1, 1) = 10/220 + 40/220 + 30/220 + 60/220 = 140/220.

QUESTION: What is the probability that, among the three balls chosen, there are at least 2 red balls? That is, what is P(Y1 ≥ 2)?

4.3 Continuous random vectors

TERMINOLOGY: Let Y1
and Y2 be continuous random variables. Then, (Y1, Y2) is called a continuous random vector, and the joint probability density function (pdf) of Y1 and Y2 is denoted by f_{Y1,Y2}(y1, y2). The function f_{Y1,Y2}(y1, y2) has the following properties:

1. f_{Y1,Y2}(y1, y2) > 0, for all (y1, y2) ∈ R_{Y1,Y2}, the two-dimensional support set
2. ∫∫_{R²} f_{Y1,Y2}(y1, y2) dy1 dy2 = 1
3. P[(Y1, Y2) ∈ B] = ∫∫_B f_{Y1,Y2}(y1, y2) dy1 dy2, for any set B ⊂ R².

REMARK: Of course, we realize that

P[(Y1, Y2) ∈ B] = ∫∫_B f_{Y1,Y2}(y1, y2) dy1 dy2

is really a double integral, since B is a two-dimensional set in the (y1, y2) plane; thus, P[(Y1, Y2) ∈ B] represents the volume under f_{Y1,Y2}(y1, y2) over B.

TERMINOLOGY: Suppose that (Y1, Y2) is a continuous random vector with joint pdf f_{Y1,Y2}(y1, y2). The joint cumulative distribution function (cdf) for (Y1, Y2) is given by

F_{Y1,Y2}(y1, y2) ≡ P(Y1 ≤ y1, Y2 ≤ y2) = ∫_{−∞}^{y2} ∫_{−∞}^{y1} f_{Y1,Y2}(r, s) dr ds,

for all (y1, y2) ∈ R². It follows, upon differentiation, that the joint pdf is given by

f_{Y1,Y2}(y1, y2) = ∂²F_{Y1,Y2}(y1, y2)/(∂y1 ∂y2),

wherever these mixed partial derivatives are defined.

Example 4.2. Suppose that, in a controlled agricultural experiment, we observe the random vector (Y1, Y2), where Y1 = temperature (in Celsius) and Y2 = precipitation level (in inches), and suppose that the joint pdf of (Y1, Y2) is given by

f_{Y1,Y2}(y1, y2) = c y1 y2, for 10 < y1 < 20, 0 < y2 < 3; f_{Y1,Y2}(y1, y2) = 0, otherwise.

(a) What is the value of c? (b) Compute P(Y1 > 15, Y2 < 1). (c) Compute P(Y2 > Y1/5).

SOLUTIONS: (a) We know that

∫_{y2=0}^{3} ∫_{y1=10}^{20} c y1 y2 dy1 dy2 = 1,

since f_{Y1,Y2}(y1, y2) must integrate to 1 over R_{Y1,Y2} = {(y1, y2): 10 < y1 < 20, 0 < y2 < 3}; i.e.,

c (∫_{10}^{20} y1 dy1)(∫_{0}^{3} y2 dy2) = c × (300/2) × (9/2) = 675c = 1.

Thus, c = 1/675.

(b) Let B = {(y1, y2): y1 > 15, y2 < 1}. The value P[(Y1, Y2) ∈ B] = P(Y1 > 15, Y2 < 1) represents the volume under f_{Y1,Y2}(y1, y2) over the set B; i.e.,

P(Y1 > 15, Y2 < 1) = ∫_{y2=0}^{1} ∫_{y1=15}^{20} (y1 y2/675) dy1 dy2 = (1/675)(175/2)(1/2) ≈ 0.065.

(c) Let D = {(y1, y2): y2 > y1/5}. The quantity P[(Y1, Y2) ∈ D] = P(Y2 > Y1/5) represents the volume under f_{Y1,Y2}(y1, y2) over the set D; i.e.,
P(Y2 > Y1/5) = ∫_{y2=2}^{3} ∫_{y1=10}^{5y2} (y1 y2/675) dy1 dy2 = (1/1350) ∫_{2}^{3} (25y2³ − 100y2) dy2 = (1/1350) [25y2⁴/4 − 50y2²]_{2}^{3} ≈ 0.116.

NOTE: The key thing to remember is that, in parts (b) and (c), the probability is simply the volume under the density f_{Y1,Y2}(y1, y2) over a particular set. It is helpful to draw a picture to get the limits of integration correct.

4.4 Marginal distributions

RECALL: The joint pmf of (Y1, Y2) in Example 4.1 is depicted below in Table 4.3. You see that, by summing out over the values of y2 in Table 4.3, we obtain the row sums

P(Y1 = 0) = 84/220, P(Y1 = 1) = 108/220, P(Y1 = 2) = 27/220, P(Y1 = 3) = 1/220.

This represents the marginal distribution of Y1. Similarly, by summing out over the values of y1, we obtain the column sums

P(Y2 = 0) = 56/220, P(Y2 = 1) = 112/220, P(Y2 = 2) = 48/220, P(Y2 = 3) = 4/220.

This represents the marginal distribution of Y2.

Table 4.3: Joint pmf p_{Y1,Y2}(y1, y2) displayed in tabular form, with marginal sums.

p(y1, y2)     y2 = 0    y2 = 1    y2 = 2    y2 = 3    Row sum
y1 = 0        10/220    40/220    30/220     4/220     84/220
y1 = 1        30/220    60/220    18/220              108/220
y1 = 2        15/220    12/220                         27/220
y1 = 3         1/220                                    1/220
Column sum    56/220   112/220    48/220     4/220          1

TERMINOLOGY: Let (Y1, Y2) be a discrete random vector with pmf p_{Y1,Y2}(y1, y2). Then the marginal pmf of Y1 is

p_{Y1}(y1) = Σ_{all y2} p_{Y1,Y2}(y1, y2),

and the marginal pmf of Y2 is

p_{Y2}(y2) = Σ_{all y1} p_{Y1,Y2}(y1, y2).

MAIN POINT: In the two-dimensional discrete case, marginal pmfs are obtained by "summing out" over the other variable.

TERMINOLOGY: Let (Y1, Y2) be a continuous random vector with pdf f_{Y1,Y2}(y1, y2). Then the marginal pdf of Y1 is

f_{Y1}(y1) = ∫_{−∞}^{∞} f_{Y1,Y2}(y1, y2) dy2,

and the marginal pdf of Y2 is

f_{Y2}(y2) = ∫_{−∞}^{∞} f_{Y1,Y2}(y1, y2) dy1.

MAIN POINT: In the two-dimensional continuous case, marginal pdfs are obtained by "integrating out" over the other variable.

Example 4.3. In a simple genetics model, the proportion, say Y1, of a population with Trait 1 is always less than the proportion, say Y2, of a population with Trait 2, and the random vector (Y1, Y2) has joint pdf

f_{Y1,Y2}(y1, y2) = 6y1, for 0 < y1 < y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

(a) Find the marginal distributions f_{Y1}(y1) and f_{Y2}(y2). (b) Find the probability that the proportion of individuals with Trait 2
exceeds 1/2. (c) Find the probability that the proportion of individuals with Trait 2 is at least twice that of the proportion of individuals with Trait 1.

SOLUTIONS: (a) To find the marginal distribution of Y1, i.e., f_{Y1}(y1), we integrate out over y2. For values of 0 < y1 < 1, we have

f_{Y1}(y1) = ∫_{y1}^{1} 6y1 dy2 = 6y1(1 − y1).

Thus, the marginal distribution of Y1 is given by

f_{Y1}(y1) = 6y1(1 − y1), for 0 < y1 < 1; f_{Y1}(y1) = 0, otherwise.

Of course, we recognize this as a beta distribution with α = 2 and β = 2; that is, marginally, Y1 ~ beta(2, 2). To find the marginal distribution of Y2, i.e., f_{Y2}(y2), we integrate out over y1. For values of 0 < y2 < 1, we have

f_{Y2}(y2) = ∫_{0}^{y2} 6y1 dy1 = 3y1² |_{0}^{y2} = 3y2².

Thus, the marginal distribution of Y2 is given by

f_{Y2}(y2) = 3y2², for 0 < y2 < 1; f_{Y2}(y2) = 0, otherwise.

Of course, we recognize this as a beta distribution with α = 3 and β = 1; that is, marginally, Y2 ~ beta(3, 1).

(b) Here, we want to find P(B), where the set B = {(y1, y2): 0 < y1 < y2, y2 > 1/2}. This probability can be computed two different ways: (i) using the joint distribution f_{Y1,Y2}(y1, y2) and computing

P[(Y1, Y2) ∈ B] = ∫_{y2=1/2}^{1} ∫_{y1=0}^{y2} 6y1 dy1 dy2,

or (ii) using the marginal distribution f_{Y2}(y2) and computing

P(Y2 > 1/2) = ∫_{1/2}^{1} 3y2² dy2.

Either way, you will get the same answer, 7/8. Notice that, in (i), you are computing the volume under f_{Y1,Y2}(y1, y2) over the set B. In (ii), you are finding the area under f_{Y2}(y2) over the set {y2: y2 > 1/2}.

(c) Here, we want to compute P(Y2 ≥ 2Y1); i.e., we want to compute P(D), where the set D = {(y1, y2): y2 ≥ 2y1}. This equals

P(Y2 ≥ 2Y1) = ∫_{y2=0}^{1} ∫_{y1=0}^{y2/2} 6y1 dy1 dy2 = 1/4.

This is the volume under f_{Y1,Y2}(y1, y2) over the set D. □

4.5 Conditional distributions

RECALL: For events A and B in a non-empty sample space S, we defined

P(A|B) = P(A ∩ B)/P(B), for P(B) > 0.

Now, suppose that (Y1, Y2) is a discrete random vector. If we let B = {Y2 = y2} and A = {Y1 = y1}, we obtain

P(A|B) = P(Y1 = y1 | Y2 = y2) = p_{Y1,Y2}(y1, y2)/p_{Y2}(y2).

TERMINOLOGY: Suppose that (Y1, Y2) is a discrete random vector with joint pmf p_{Y1,Y2}(y1, y2). We define the conditional probability mass function (pmf) of Y1, given Y2 = y2, as

p_{Y1|Y2}(y1|y2) = p_{Y1,Y2}(y1, y2)/p_{Y2}(y2),

whenever p_{Y2}(y2)
> 0. Similarly, the conditional probability mass function of Y2, given Y1 = y1, is

p_{Y2|Y1}(y2|y1) = p_{Y1,Y2}(y1, y2)/p_{Y1}(y1),

whenever p_{Y1}(y1) > 0.

Example 4.4. In Example 4.1, we computed the joint pmf for (Y1, Y2). The table below depicts this joint pmf, as well as the marginal pmfs.

Table 4.4: Joint pmf p_{Y1,Y2}(y1, y2) displayed in tabular form, with marginal sums.

p(y1, y2)     y2 = 0    y2 = 1    y2 = 2    y2 = 3    Row sum
y1 = 0        10/220    40/220    30/220     4/220     84/220
y1 = 1        30/220    60/220    18/220              108/220
y1 = 2        15/220    12/220                         27/220
y1 = 3         1/220                                    1/220
Column sum    56/220   112/220    48/220     4/220          1

QUESTION: What is the conditional pmf of Y1, given Y2 = 1?

SOLUTION: Straightforward calculations show that

p_{Y1|Y2}(0|1) = p_{Y1,Y2}(0, 1)/p_{Y2}(1) = (40/220)/(112/220) = 40/112,
p_{Y1|Y2}(1|1) = p_{Y1,Y2}(1, 1)/p_{Y2}(1) = (60/220)/(112/220) = 60/112,
p_{Y1|Y2}(2|1) = p_{Y1,Y2}(2, 1)/p_{Y2}(1) = (12/220)/(112/220) = 12/112.

Thus, the conditional pmf of Y1, given Y2 = 1, is given by

y1:                     0         1         2
p_{Y1|Y2}(y1|y2 = 1):  40/112    60/112    12/112

This conditional pmf tells us how Y1 is distributed if we are given that Y2 = 1.

EXERCISE: Find the conditional pmf of Y2, given Y1 = 0. □

THE CONTINUOUS CASE: When (Y1, Y2) is a continuous random vector, we have to be careful how we define conditional distributions, since the conditioning event {Y2 = y2} has probability zero, so the ratio

P(Y1 = y1 | Y2 = y2) = P(Y1 = y1, Y2 = y2)/P(Y2 = y2)

has a zero denominator. As it turns out, the expression f_{Y1,Y2}(y1, y2)/f_{Y2}(y2) is the correct formula for the continuous case; however, we have to motivate its construction in a slightly different way.

ALTERNATE MOTIVATION: Suppose that (Y1, Y2) is a continuous random vector. For dy1 and dy2 small,

f_{Y1,Y2}(y1, y2) dy1 dy2 / [f_{Y2}(y2) dy2] ≈ P(y1 ≤ Y1 ≤ y1 + dy1, y2 ≤ Y2 ≤ y2 + dy2)/P(y2 ≤ Y2 ≤ y2 + dy2)
= P(y1 ≤ Y1 ≤ y1 + dy1 | y2 ≤ Y2 ≤ y2 + dy2) ≈ f_{Y1|Y2}(y1|y2) dy1.

Thus, we can think of f_{Y1|Y2}(y1|y2) in this way; i.e., for small values of dy1 and dy2, f_{Y1|Y2}(y1|y2) dy1 represents the conditional probability that Y1 is between y1 and y1 + dy1, given that Y2 is between y2 and y2 + dy2.

TERMINOLOGY: Suppose that (Y1, Y2) is a continuous random vector with joint pdf f_{Y1,Y2}(y1, y2). We define the conditional probability density function (pdf) of Y1, given Y2 = y2, as
f_{Y1|Y2}(y1|y2) = f_{Y1,Y2}(y1, y2)/f_{Y2}(y2).

Similarly, the conditional probability density function of Y2, given Y1 = y1, is

f_{Y2|Y1}(y2|y1) = f_{Y1,Y2}(y1, y2)/f_{Y1}(y1).

Example 4.5. Consider the bivariate pdf in Example 4.3:

f_{Y1,Y2}(y1, y2) = 6y1, for 0 < y1 < y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

Recall that this probabilistic model summarized the random vector (Y1, Y2), where Y1, the proportion of a population with Trait 1, is always less than Y2, the proportion of a population with Trait 2. Derive the conditional distributions f_{Y1|Y2}(y1|y2) and f_{Y2|Y1}(y2|y1).

SOLUTION: In Example 4.3, we derived the marginal pdfs to be

f_{Y1}(y1) = 6y1(1 − y1), for 0 < y1 < 1 (and 0 otherwise), and f_{Y2}(y2) = 3y2², for 0 < y2 < 1 (and 0 otherwise).

First, we derive f_{Y1|Y2}(y1|y2), so fix Y2 = y2. Remember, once we condition on Y2 = y2 (i.e., once we fix Y2 = y2), we then regard y2 as simply some constant. This is an important point to understand. Then, for values of 0 < y1 < y2, it follows that

f_{Y1|Y2}(y1|y2) = f_{Y1,Y2}(y1, y2)/f_{Y2}(y2) = 6y1/(3y2²) = 2y1/y2²,

and, thus, this is the value of f_{Y1|Y2}(y1|y2) when 0 < y1 < y2. Of course, for values of y1 outside (0, y2), the conditional density f_{Y1|Y2}(y1|y2) = 0. Summarizing, the conditional pdf of Y1, given Y2 = y2, is given by

f_{Y1|Y2}(y1|y2) = 2y1/y2², for 0 < y1 < y2; f_{Y1|Y2}(y1|y2) = 0, otherwise.

Now, to derive the conditional pdf of Y2, given Y1, we fix Y1 = y1; then, for all values of y1 < y2 < 1, we have

f_{Y2|Y1}(y2|y1) = f_{Y1,Y2}(y1, y2)/f_{Y1}(y1) = 6y1/[6y1(1 − y1)] = 1/(1 − y1).

This is the value of f_{Y2|Y1}(y2|y1) when y1 < y2 < 1. When y2 is outside (y1, 1), the conditional pdf is f_{Y2|Y1}(y2|y1) = 0. Remember, once we condition on Y1 = y1, we regard y1 simply as some constant. Thus, the conditional pdf of Y2, given Y1 = y1, is given by

f_{Y2|Y1}(y2|y1) = 1/(1 − y1), for y1 < y2 < 1; f_{Y2|Y1}(y2|y1) = 0, otherwise.

That is, conditional on Y1 = y1, Y2 ~ U(y1, 1). □

RESULT: The use of conditional densities allows us to define conditional probabilities of events associated with one random variable when we know the value of another random variable. If Y1 and Y2 are jointly discrete, then, for any set B ⊂ R,

P(Y1 ∈ B | Y2 = y2) = Σ_B p_{Y1|Y2}(y1|y2).

If Y1 and Y2 are jointly continuous, then, for any set B ⊂ R,

P(Y1 ∈ B | Y2 = y2) = ∫_B f_{Y1|Y2}(y1|y2) dy1.
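The continuous version of this result is easy to check numerically. Below is a minimal sketch (plain Python; the helper names `cond_pdf_y2_given_y1` and `cond_prob` are ours, invented for this illustration, not from the notes) using Example 4.5, where, conditional on Y1 = y1, Y2 ~ U(y1, 1) with f_{Y2|Y1}(y2|y1) = 1/(1 − y1); integrating this conditional density over B = (0.8, 1), with y1 = 0.5, should return (1 − 0.8)/(1 − 0.5) = 0.4:

```python
# Numerical check of P(Y2 in B | Y1 = y1) = integral over B of f(y2 | y1) dy2,
# using the conditional density from Example 4.5: Y2 | Y1 = y1 ~ U(y1, 1).

def cond_pdf_y2_given_y1(y2, y1):
    """Conditional pdf f(y2 | y1) = 1/(1 - y1) on y1 < y2 < 1 (Example 4.5)."""
    return 1.0 / (1.0 - y1) if y1 < y2 < 1.0 else 0.0

def cond_prob(a, b, y1, n=100_000):
    """Approximate P(a < Y2 < b | Y1 = y1) with a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(cond_pdf_y2_given_y1(a + (i + 0.5) * h, y1) for i in range(n))

# P(Y2 > 0.8 | Y1 = 0.5) = (1 - 0.8)/(1 - 0.5) = 0.4
print(round(cond_prob(0.8, 1.0, 0.5), 4))  # -> 0.4
```

Any other conditional density (e.g., f_{Y1|Y2}(y1|y2) = 2y1/y2² from the same example) can be dropped into `cond_prob` the same way; only the integrand changes.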
Example 4.6. A small health food store stocks two different brands of grain. Let Y1 denote the amount of brand 1 in stock, and let Y2 denote the amount of brand 2 in stock (both Y1 and Y2 are measured in 100s of lbs). The joint distribution of Y1 and Y2 is given by

f_{Y1,Y2}(y1, y2) = 24y1y2, for y1 > 0, y2 > 0, 0 < y1 + y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

(a) Find the conditional pdf f_{Y1|Y2}(y1|y2). (b) Compute P(Y1 > 0.5 | Y2 = 0.3). (c) Find P(Y1 > 0.5).

SOLUTIONS: (a) To find the conditional pdf f_{Y1|Y2}(y1|y2), we first need to find the marginal pdf of Y2. The marginal pdf of Y2, for 0 < y2 < 1, is

f_{Y2}(y2) = ∫_{y1=0}^{1−y2} 24y1y2 dy1 = 24y2 (y1²/2) |_{0}^{1−y2} = 12y2(1 − y2)²,

and 0, otherwise. Of course, we recognize this as a beta(2, 3) pdf; i.e., Y2 ~ beta(2, 3). The conditional pdf of Y1, given Y2 = y2, is

f_{Y1|Y2}(y1|y2) = f_{Y1,Y2}(y1, y2)/f_{Y2}(y2) = 24y1y2/[12y2(1 − y2)²] = 2y1/(1 − y2)²,

for 0 < y1 < 1 − y2, and 0, otherwise. Summarizing,

f_{Y1|Y2}(y1|y2) = 2y1/(1 − y2)², for 0 < y1 < 1 − y2; f_{Y1|Y2}(y1|y2) = 0, otherwise.

(b) To compute P(Y1 > 0.5 | Y2 = 0.3), we work with the conditional pdf f_{Y1|Y2}(y1|y2), which, for y2 = 0.3, is given by

f_{Y1|Y2}(y1|y2 = 0.3) = 2y1/(0.7)², for 0 < y1 < 0.7; 0, otherwise.

Thus,

P(Y1 > 0.5 | Y2 = 0.3) = ∫_{0.5}^{0.7} [2y1/(0.7)²] dy1 = (0.7² − 0.5²)/0.7² ≈ 0.489.

(c) To compute P(Y1 > 0.5), we can either use the marginal pdf f_{Y1}(y1) or the joint pdf f_{Y1,Y2}(y1, y2). Marginally, it turns out that Y1 ~ beta(2, 3) as well (verify!). Thus,

P(Y1 > 0.5) = ∫_{0.5}^{1} 12y1(1 − y1)² dy1 ≈ 0.313.

REMARK: Notice how P(Y1 > 0.5 | Y2 = 0.3) ≠ P(Y1 > 0.5); that is, knowledge of the value of Y2 has affected the way that we assign probability to events involving Y1. Of course, one might expect this because of the support in the joint pdf f_{Y1,Y2}(y1, y2). □

4.6 Independent random variables

TERMINOLOGY: Suppose that (Y1, Y2) is a random vector (discrete or continuous) with joint cdf F_{Y1,Y2}(y1, y2), and denote the marginal cdfs of Y1 and Y2 by F_{Y1}(y1) and F_{Y2}(y2), respectively. We say that the random variables Y1 and Y2 are independent if and only if

F_{Y1,Y2}(y1, y2) = F_{Y1}(y1) F_{Y2}(y2),

for all values of y1 and y2. Otherwise, we say that Y1 and Y2 are dependent.

RESULT: Suppose that (Y1, Y2) is a random vector
(discrete or continuous) with joint pdf (pmf) f_{Y1,Y2}(y1, y2), and denote the marginal pdfs (pmfs) of Y1 and Y2 by f_{Y1}(y1) and f_{Y2}(y2), respectively. Then, Y1 and Y2 are independent if and only if

f_{Y1,Y2}(y1, y2) = f_{Y1}(y1) f_{Y2}(y2),

for all values of y1 and y2. Otherwise, Y1 and Y2 are dependent.

Example 4.7. Suppose that the pmf for the discrete random vector (Y1, Y2) is given by

p_{Y1,Y2}(y1, y2) = (y1 + 2y2)/18, for y1 = 1, 2 and y2 = 1, 2; p_{Y1,Y2}(y1, y2) = 0, otherwise.

The marginal distribution of Y1, for values of y1 = 1, 2, is given by

p_{Y1}(y1) = Σ_{y2=1}^{2} p_{Y1,Y2}(y1, y2) = Σ_{y2=1}^{2} (y1 + 2y2)/18 = (2y1 + 6)/18,

and p_{Y1}(y1) = 0, otherwise. Similarly, the marginal distribution of Y2, for values of y2 = 1, 2, is given by

p_{Y2}(y2) = Σ_{y1=1}^{2} p_{Y1,Y2}(y1, y2) = Σ_{y1=1}^{2} (y1 + 2y2)/18 = (3 + 4y2)/18,

and p_{Y2}(y2) = 0, otherwise. Note that, for example,

p_{Y1,Y2}(1, 1) = 3/18 ≠ p_{Y1}(1) p_{Y2}(1) = (8/18)(7/18) = 14/81;

thus, the random variables Y1 and Y2 are dependent. □

Example 4.8. Let Y1 and Y2 denote the proportions of time (out of one workday) during which employees I and II, respectively, perform their assigned tasks. Suppose that the random vector (Y1, Y2) has joint pdf

f_{Y1,Y2}(y1, y2) = y1 + y2, for 0 < y1 < 1, 0 < y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

It is straightforward to show (verify!) that

f_{Y1}(y1) = y1 + 1/2, for 0 < y1 < 1 (and 0 otherwise), and f_{Y2}(y2) = y2 + 1/2, for 0 < y2 < 1 (and 0 otherwise).

Thus, since

f_{Y1,Y2}(y1, y2) = y1 + y2 ≠ (y1 + 1/2)(y2 + 1/2) = f_{Y1}(y1) f_{Y2}(y2),

for 0 < y1 < 1 and 0 < y2 < 1, Y1 and Y2 are dependent. □

Example 4.9. Suppose that Y1 and Y2 represent the death times (in hours) for rats treated with a certain toxin. Marginally, each death time follows an exponential distribution with mean θ, and Y1 and Y2 are independent. (a) Write out the joint pdf of (Y1, Y2). (b) Compute P(Y1 ≤ 1, Y2 ≤ 1).

SOLUTIONS: (a) Because Y1 and Y2 are independent, the joint pdf of (Y1, Y2), for y1 > 0 and y2 > 0, is given by

f_{Y1,Y2}(y1, y2) = f_{Y1}(y1) f_{Y2}(y2) = (1/θ)e^{−y1/θ} × (1/θ)e^{−y2/θ} = (1/θ²)e^{−(y1+y2)/θ},

and f_{Y1,Y2}(y1, y2) = 0, otherwise. (b) Because Y1 and Y2 are independent,

P(Y1 ≤ 1, Y2 ≤ 1) = F_{Y1,Y2}(1, 1) = F_{Y1}(1) F_{Y2}(1) = (1 − e^{−1/θ})(1 − e^{−1/θ}) = (1 − e^{−1/θ})². □

A CONVENIENT RESULT: Let (Y1, Y2) be a random vector (discrete or continuous) with pdf (pmf) f_{Y1,Y2}(y1, y2). If the support set R_{Y1,Y2}
does not constrain y1 by y2 (or y2 by y1), and, additionally, we can factor the joint pdf (pmf) f_{Y1,Y2}(y1, y2) into two nonnegative expressions,

f_{Y1,Y2}(y1, y2) = g(y1) h(y2),

then Y1 and Y2 are independent. Note that g(y1) and h(y2) are simply functions; they need not be pdfs (pmfs), although they sometimes are. The only requirement is that g(y1) is a function of y1 only, h(y2) is a function of y2 only, and that both are nonnegative. If the support involves a constraint, the random variables are automatically dependent.

Example 4.10. In Example 4.6, Y1 denoted the amount of brand 1 grain in stock, and Y2 denoted the amount of brand 2 grain in stock. Recall that the joint pdf of (Y1, Y2) was given by

f_{Y1,Y2}(y1, y2) = 24y1y2, for y1 > 0, y2 > 0, 0 < y1 + y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

Here, the support is R_{Y1,Y2} = {(y1, y2): y1 > 0, y2 > 0, 0 < y1 + y2 < 1}. Since knowledge of y1 affects the possible values of y2 (and vice versa), the support involves a constraint, and Y1 and Y2 are dependent. □

Example 4.11. Suppose that the random vector (X, Y) has joint pdf

f_{X,Y}(x, y) = [Γ(α + β)/(Γ(α)Γ(β))] λe^{−λx} y^{α−1}(1 − y)^{β−1}, for x > 0, 0 < y < 1; f_{X,Y}(x, y) = 0, otherwise,

for λ > 0, α > 0, and β > 0. Since the support R_{X,Y} = {(x, y): x > 0, 0 < y < 1} does not involve a constraint, it follows immediately that X and Y are independent, since we can write f_{X,Y}(x, y) = g(x)h(y), where

g(x) = [Γ(α + β)/(Γ(α)Γ(β))] λe^{−λx} and h(y) = y^{α−1}(1 − y)^{β−1}.

Note that we are not saying that g(x) and h(y) are marginal distributions of X and Y, respectively; in fact, they are not the marginal distributions. □

EXTENSION: We generalize the notion of independence to n-variate random vectors. We use the conventional notation Y = (Y1, Y2, ..., Yn) and y = (y1, y2, ..., yn). Also, we will denote the joint cdf of Y by F_Y(y) and the joint pdf (pmf) of Y by f_Y(y).

TERMINOLOGY: Suppose that the random vector Y = (Y1, Y2, ..., Yn) has joint cdf F_Y(y), and suppose that the random variable Y_i has cdf F_{Y_i}(y_i), for i = 1, 2, ..., n. Then, Y1, Y2, ..., Yn are independent random variables if and only if

F_Y(y) = Π_{i=1}^{n} F_{Y_i}(y_i);

that is, the joint cdf can be factored into the product of the marginal cdfs. Alternatively, Y1, Y2, ..., Yn are independent random variables if and only if

f_Y(y) = Π_{i=1}^{n} f_{Y_i}(y_i);

that is, the
joint pdf (pmf) can be factored into the product of the marginals.

Example 4.12. In a small clinical trial, n = 20 patients are treated with a new drug. Suppose that the response from each patient is a measurement Y_i ~ N(μ, σ²). Denoting the 20 responses by Y = (Y1, Y2, ..., Y20), then, assuming independence, the joint distribution of the 20 responses is, for y ∈ R²⁰,

f_Y(y) = Π_{i=1}^{20} (1/(σ√(2π))) exp[−(1/2)((y_i − μ)/σ)²] = [1/(σ√(2π))]²⁰ exp[−(1/2) Σ_{i=1}^{20} ((y_i − μ)/σ)²].

What is the probability that every patient's response is less than μ + 2σ?

SOLUTION: The probability that Y1 is less than μ + 2σ is given by

P(Y1 < μ + 2σ) = P(Z < 2) = Φ(2) ≈ 0.9772,

where Z ~ N(0, 1) and Φ(·) denotes the standard normal cdf. Because the patients' responses are independent random variables,

P(Y1 < μ + 2σ, Y2 < μ + 2σ, ..., Y20 < μ + 2σ) = Π_{i=1}^{20} P(Y_i < μ + 2σ) = [Φ(2)]²⁰ ≈ 0.630. □

4.7 Expectations of functions of random variables

RESULT: Suppose that Y = (Y1, Y2, ..., Yn) has joint pdf f_Y(y) (or joint pmf p_Y(y)), and suppose that g(Y) = g(Y1, Y2, ..., Yn) is any real-valued function of Y1, Y2, ..., Yn; i.e., g: Rⁿ → R. Then,

• if Y is discrete,

E[g(Y)] = Σ_{all y1} Σ_{all y2} ··· Σ_{all yn} g(y) p_Y(y),

• and if Y is continuous,

E[g(Y)] = ∫···∫_{Rⁿ} g(y) f_Y(y) dy.

If these quantities are not finite, then we say that E[g(Y)] does not exist.

Example 4.13. In Example 4.6, Y1 denotes the amount of grain 1 in stock, and Y2 denotes the amount of grain 2 in stock. The joint distribution of Y1 and Y2 was given by

f_{Y1,Y2}(y1, y2) = 24y1y2, for y1 > 0, y2 > 0, 0 < y1 + y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

What is the expected total amount of grain, Y1 + Y2, in stock?

SOLUTION: Let the function g: R² → R be defined by g(y1, y2) = y1 + y2. We would like to compute E[g(Y1, Y2)] = E(Y1 + Y2). From the last result, we know that

E(Y1 + Y2) = ∫_{y1=0}^{1} ∫_{y2=0}^{1−y1} (y1 + y2) 24y1y2 dy2 dy1
           = ∫_{y1=0}^{1} [24y1²(y2²/2) + 24y1(y2³/3)]_{y2=0}^{1−y1} dy1
           = ∫_{0}^{1} [12y1²(1 − y1)² + 8y1(1 − y1)³] dy1 = 4/5.

The expected amount of grain in stock is 80 lbs. Recall that, marginally, Y1 ~ beta(2, 3) and Y2 ~ beta(2, 3), so that E(Y1) = E(Y2) = 2/5 and E(Y1 + Y2) = 2/5 + 2/5 = 4/5. □

Example 4.14. A process for producing an industrial chemical yields a product containing two types of impurities, Type I and Type II. From a
specified sample from this process, let Y1 denote the proportion of impurities in the sample (of both types), and let Y2 denote the proportion of Type I impurities among all impurities found. Suppose that the joint pdf of the random vector (Y1, Y2) is given by

f_{Y1,Y2}(y1, y2) = 2(1 − y1), for 0 < y1 < 1, 0 < y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

Find the expected value of the proportion of Type I impurities in the sample.

SOLUTION: Because Y1 is the proportion of impurities in the sample and Y2 is the proportion of Type I impurities among the sample impurities, it follows that Y1Y2 is the proportion of Type I impurities in the sample taken. Let the function g: R² → R be defined by g(y1, y2) = y1y2. We would like to compute E[g(Y1, Y2)]. This is given by

E(Y1Y2) = ∫_{0}^{1} ∫_{0}^{1} y1y2 × 2(1 − y1) dy1 dy2 = 1/6. □

PROPERTIES OF EXPECTATIONS: Let Y = (Y1, Y2, ..., Yn) be a discrete or continuous random vector with pdf (pmf) f_Y(y) and support R ⊆ Rⁿ, suppose that g, g1, g2, ..., gk are real-valued functions from Rⁿ → R, and let c be any real constant. Then,

(a) E(c) = c
(b) E[c g(Y)] = c E[g(Y)]
(c) E[Σ_{j=1}^{k} g_j(Y)] = Σ_{j=1}^{k} E[g_j(Y)].

RESULT: Suppose that Y1 and Y2 are independent random variables, and consider the functions g(Y1) and h(Y2), where g(Y1) is a function of Y1 only, and h(Y2) is a function of Y2 only. Then,

E[g(Y1)h(Y2)] = E[g(Y1)] E[h(Y2)],

provided that all expectations exist.

Proof. Without loss, we will assume that (Y1, Y2) is a continuous random vector (the
02 u 02 3 202M M D EXERCISE Compute ETU7 ETZ7 and EUZ PAGE 115 CHAPTER 4 STATMATH 5117 J TEBBS 48 Covariance and correlation 481 Covariance TERMINOLOGY Suppose that Y1 and Y2 are random variables with means Myl and 1132 respectively The covariance between Y1 and Y2 is given by COVY17 Y2 EKYJL MY1Y2 MY2l39 The covariance gives us information about how Y1 and Y2 are linearly related THE OOVARIANC39E COMPUTING FORMULA It is easy to show that COVY1Y2 E EY1 MOO2 MYM EY1Y2 7 mm This latter expression is sometimes easier to work with and is called the covariance computing formula Example 416 Gasoline is stocked in a tank once at the beginning of each week and then sold to customers Let Y1 denote the proportion of the capacity of the tank that is available after it is stocked Let Y2 denote the proportion of the capacity of the bulk tank that is sold during the week Suppose that the random vector YhYZ has joint pdf 3117 0lt92 lt91 lt1 fY1Y2y17y2 07 otherwise To compute the covariance7 rst note that Y1 beta37 1 and Y2 fy2y27 where 317yg 0lty2ltl 0 ng 12 otherwise 7 Thus7 33 1 075 and 1 3 EY2 12 x 517y dy 0375 O Also7 1 11 11112 gtlt 31116111261111 030 y10 y20 PAGE 116 CHAPTER 4 STATMATH 5117 J TEBBS Thus the covariance is COVY17Y2 7 MY1MY2 030 i 0750375 001875 D NOTES ON THE OOVARIANOE o If CovY1Y2 gt 0 then Y1 and Y2 are positively linearly related o If CovY1Y2 lt 0 then Y1 and Y2 are negatively linearly related o If CovY1 Y2 0 then Y1 and Y2 are not linearly related This does not necessarily mean that Y1 and Y2 are independent RESULT lf Y1 and Y2 are independent then CovY1Y2 0 Proof Using the covariance computing formula we have COVY17Y2 7 MY1MY2 EY1EY2 Iii1W2 0 E MAIN POINT If two random variables are independent then they have zero covariance however zero covariance does not necessarily imply independence Example 417 An example of two dependent variables with zero covariance Suppose that Y1 U71 1 and let Y2 le It is straightforward to show that 0 my Ems o and E02 Em VY1 
1/3. Thus,

Cov(Y1, Y2) = E(Y1Y2) − μ_{Y1}μ_{Y2} = 0 − (0)(1/3) = 0.

However, not only are Y1 and Y2 related, they are perfectly related! But the relationship is not linear; it is quadratic. The covariance only assesses linear relationships. □

IMPORTANT RESULT: Suppose that Y1 and Y2 are random variables. Then,

V(Y1 + Y2) = V(Y1) + V(Y2) + 2Cov(Y1, Y2)
V(Y1 − Y2) = V(Y1) + V(Y2) − 2Cov(Y1, Y2).

Proof. Let Z = Y1 + Y2. Using the definition of variance, we have

V(Z) = E[(Z − μ_Z)²]
     = E{[Y1 + Y2 − E(Y1 + Y2)]²}
     = E{[Y1 + Y2 − (μ_{Y1} + μ_{Y2})]²}
     = E{[(Y1 − μ_{Y1}) + (Y2 − μ_{Y2})]²}
     = E[(Y1 − μ_{Y1})²] + E[(Y2 − μ_{Y2})²] + 2E[(Y1 − μ_{Y1})(Y2 − μ_{Y2})]   (cross product)
     = V(Y1) + V(Y2) + 2Cov(Y1, Y2).

That V(Y1 − Y2) = V(Y1) + V(Y2) − 2Cov(Y1, Y2) is shown similarly. □

Example 4.18. A small health food store stocks two different brands of grain. Let Y1 denote the amount of brand 1 in stock, and let Y2 denote the amount of brand 2 in stock (both Y1 and Y2 are measured in 100s of lbs). In Example 4.6, we saw that the joint distribution of Y1 and Y2 was given by

f_{Y1,Y2}(y1, y2) = 24y1y2, for y1 > 0, y2 > 0, 0 < y1 + y2 < 1; f_{Y1,Y2}(y1, y2) = 0, otherwise.

What is the variance for the total amount of grain in stock? That is, what is V(Y1 + Y2)?

SOLUTION: Using the last result, we know that

V(Y1 + Y2) = V(Y1) + V(Y2) + 2Cov(Y1, Y2).

Marginally, Y1 and Y2 both have beta(2, 3) distributions (see Example 4.6). Thus,

E(Y1) = E(Y2) = 2/(2 + 3) = 2/5

and

V(Y1) = V(Y2) = (2)(3)/[(2 + 3)²(2 + 3 + 1)] = 6/150 = 1/25.

Recall that Cov(Y1, Y2) = E(Y1Y2) − E(Y1)E(Y2), so we need to first compute

E(Y1Y2) = ∫_{y1=0}^{1} ∫_{y2=0}^{1−y1} y1y2 × 24y1y2 dy2 dy1 = 2/15.
variables and that we want to predict Y as a linear function of X That is we want to consider functions of the form Y 60 61X for constants 60 and 61 In this situation the error in prediction77 is given by Y i 50 51X This error can be positive or negative so in developing a goodness measure77 of prediction error we want one that maintains the magnitude of error but ignores the sign Thus PAGE 119 CHAPTER 4 STATMATH 5117 J TEBBS consider the mean squared error of prediction given by QWo i E Eily 50 51Xl2 A two variable calculus argument shows that the mean squared error of prediction Qwo l is minimized when CovX Y l VX and CovX Y Wl Em However note that the value of 61 algebraically is equal to mMW7 CovX Y VX CovX Y J 073 UXUY UY PXY 7 7 UX CovX Y PXY UXU39Y 61 UX where The quantity pr is called the correlation coe icient between X and Y SUMMARY The best linear predictor of Y given X is Y 60 61X where a 51 PXY 0X 60 EY 761E NOTES ON THE CORRELATION COEFFICIENT 1 71 S pr S 1 this can be proven using the Cauchy Schwartz lnequality frorn calculus 2 If pxy 1 then Y 60 61X where 61 gt 0 That is X and Y are perfectly positively linearly related ie the bivariate probability distribution of X Y lies entirely on a straight line with positive slope PAGE 120 CHAPTER 4 STATMATH 511 J TEBBS 3 If pxy 1 then Y BO 61X where 61 lt 0 That is X and Y are perfectly negatively linearly related ie the bivariate probability distribution of X Y lies entirely on a straight line with negative slope 4 If pXy 0 then X and Y are not linearly related NOTE If X and Y are independent random variables then pr 0 However again the implication does not go the other way that is if pr 0 this does not necessarily mean that X and Y are independent NOTE In assessing the strength of the linear relationship between X and Y the cor relation coef cient is often preferred over the covariance since pr is measured on a bounded unitless scale On the other hand CovX Y can be any real number Example 419 In Example 416 
we considered the bivariate model 3117 0lt92 lt91 lt1 Jig 91792 0 otherwise for Y1 the proportion of the capacity of the tank after being stocked and Y2 the pro portion of the capacity of the tank that is sold What is pyly SOLUTION ln Example 416 we computed CovY1Y2 001875 so all we need is Ty and 039y2 We also found that Y1 beta3 1 and Y2 fy2y2 where 317yg 0lty2 lt1 0 ng 12 otherwise The variance of Y1 is 31 3 3 VY 7gt i0194 1 311312 80 U V80 Simple calculations using fy2y2 show that 15 and 38 so that 1 3 2 VY2 g 7 0059 gt 0y2 0059 x 0244 Thus C Y Y 001875 pYbY2 M N x 040 D 039y1039y2 N 0194 gtlt 0244 PAGE 121 CHAPTER 4 STATMATH 5117 J TEBBS 49 Expectations and variances of linear functions of random variables TERMINOLOGY Suppose that Y17Y27 Yn are random variables and that 11027 an are constants The function UZamama2Y2manYn i1 is called a linear combination of the random variables Y1 Y2 Yn EXPECTED VALUE OF A LINEAR COMBINATION EU EltZaiyigt ZaiEYi i1 i1 VARIANCE OF A LINEAR COMBINATION VU V lt Z ain Z IlaK 2 Z magCovOi7 i1 i1 iltj Z a Yi ZaiajCovYZYj i1 i7 j OOVARIANOE BETWEEN TWO LINEAR OOMBINATIONS Suppose that M H U1 aiYi01Y112Y2anYn U2 M3 ijj lel b2X2 mem x H Then7 it follows that CovU1 U2 Z Z aibjCovOi Xj 71 m 11 j1 BIVARIATE CASE Interest will often focus on situations wherein we have a linear combination of n 2 random variables In this setting7 E01Y1 1252 01EY1 MEG2 V01Y1 1252 ailY1 ailY2 2010200VY17Y2 PAGE 122 CHAPTER 4 STATMATH 5117 J TEBBS Similarly7 when n m 27 Cova1Y1 a2Y2b1X1 bzXz alblcovY1X1 albgcoVltY17 X2 agblcOVY2 X1 azbzcoVltY27 Example 420 Achievement tests are usually seen in educational or employment set tings These tests attempt to measure how much you know about a certain topic in a particular area Suppose that Y1 Y2 and Y3 represent scores for a particular different parts of an exam It is posited that Y1 N J1247 Y2 1697 Y3 20167 Y1 and Y2 are independent7 CovY1Y3 087 and CovY2Y3 767 Two different summary measures are computed to assess a 
subject7s performance U105Y172Y2Y3 and U23Y172Y27Y3 a EU1 and VU1 b Find CovU1U2 SOLUTIONS The mean of U1 is EU1 E05Y1 2Y2 Y3 05EY1 2EY2 EY3 0512 7 216 20 76 The variance of U1 is vltU1gt mm 7 21 Y3 052VY1 722VY2 VY3 20572COVY17Y2 2051COVY1Y3 272lCOVY2Y3 0254 49 16 205720 20508 272767 806 The covariance between U1 and U2 is CovU1 U2 Cov05Y1 i 2Y2 Y3 3Y1 i 2Y2 7 Y3 o53cOvi1Y1 0572COVY17Y2 0571COVY17Y3 23COVY2Y1 22COVY2Y2 21COVY2 13COVYz x7Y112COV5G75611COV5G5G PAGE 123 CHAPTER 4 STATMATH 5117 J TEBBS 410 The multinomial model RECALL When we discussed the binomial model in Chapter 2 each Bernoulli trial resulted in either a success or a failure77 that is on each trial there were only two outcomes possible eg infectednot germinatednot defectivenot etc TERMINOLOGY A multinomial experiment is simply a generalization of a binomial experiment In particular consider an experiment where o the experiment consists of 71 trials 71 is xed 0 the outcome for any trial belongs to exactly one of k 2 2 classes 0 the probability that an outcome for a single trial falls into class 239 is given by pi for 239 1 2 k where each p remains constant from trial to trial and 0 trials are independent DEFINITION In a multinomial experiment let Y denote the number of outcomes in class 239 so that Y1 Y2 Yk n and denote Y Y1Y2Yk We call Y a multinomial random vector and write Y N multnp1p2 pk Zip 1 NOTE When k 2 the multinomial random vector reduces to our well known binomial situation When k 3 Y would be called a trinomial random vector JOINT PMF If Y N multnp1p2 pk Zip 1 the pmf for Y is given by n 21 22 yk 7 7 gillyzlykp1 192 mph in i 07 17 i n 0 WW otherwise 7 Example 421 In a manufacturing experiment we observe n 10 parts each of which can be classi ed as non defective defective or reworkable De ne Y1 number of non defective parts Y2 number of defective parts Y3 number of reworkable parts PAGE 124 CHAPTER 4 STATMATH 5117 J TEBBS Assuming that each part ie7 trial is independent of other parts7 
a multinomial model applies and $\mathbf{Y} = (Y_1, Y_2, Y_3) \sim \text{mult}(10, p_1, p_2, p_3)$, $\sum_i p_i = 1$. Suppose that $p_1 = 0.90$, $p_2 = 0.03$, and $p_3 = 0.07$. What is the probability that a sample of 10 contains 8 non-defective parts, 1 defective part, and 1 reworkable part?

SOLUTION: We want to compute $p_{Y_1,Y_2,Y_3}(8, 1, 1)$. This equals

$$p_{Y_1,Y_2,Y_3}(8, 1, 1) = \frac{10!}{8! \, 1! \, 1!} (0.90)^8 (0.03)^1 (0.07)^1 \approx 0.081. \quad \square$$

Example 4.22. At a number of clinic sites throughout Nebraska, chlamydia and gonorrhea testing is performed on individuals using urine or cervical-swab specimens. More than 30,000 of these tests are done annually by the Nebraska Public Health Laboratory. Suppose that on a given day there are $n = 280$ subjects tested, and define

$p_1$ = proportion of subjects with neither chlamydia nor gonorrhea
$p_2$ = proportion of subjects with chlamydia but not gonorrhea
$p_3$ = proportion of subjects with gonorrhea but not chlamydia
$p_4$ = proportion of subjects with both chlamydia and gonorrhea

Define $\mathbf{Y} = (Y_1, Y_2, Y_3, Y_4)$, where $Y_i$ counts the number of subjects in category $i$. Assuming that subjects are independent, $\mathbf{Y} \sim \text{mult}(280, p_1, p_2, p_3, p_4)$, $\sum_i p_i = 1$. The pmf of $\mathbf{Y}$ is given by

$$p_{\mathbf{Y}}(y_1, y_2, y_3, y_4) = \frac{280!}{y_1! \, y_2! \, y_3! \, y_4!} \, p_1^{y_1} p_2^{y_2} p_3^{y_3} p_4^{y_4}, \quad y_i = 0, 1, \ldots, 280, \;\; \textstyle\sum_i y_i = 280,$$

and $p_{\mathbf{Y}}(y_1, y_2, y_3, y_4) = 0$ otherwise.

FACTS: If $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_k) \sim \text{mult}(n, p_1, p_2, \ldots, p_k)$, $\sum_i p_i = 1$, then

- The marginal distribution of $Y_i$ is $b(n, p_i)$, for $i = 1, 2, \ldots, k$.
- $E(Y_i) = np_i$, for $i = 1, 2, \ldots, k$.
- $V(Y_i) = np_i(1 - p_i)$, for $i = 1, 2, \ldots, k$.
- The joint distribution of $(Y_i, Y_j)$ is trinomial$(n, p_i, p_j, 1 - p_i - p_j)$.
- $\mathrm{Cov}(Y_i, Y_j) = -np_i p_j$, for $i \neq j$.

4.11 The bivariate normal distribution

TERMINOLOGY: The random vector $(Y_1, Y_2)$ has a bivariate normal distribution if its joint pdf is given by

$$f_{Y_1,Y_2}(y_1, y_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \, e^{-Q/2}, \quad (y_1, y_2) \in \mathbb{R}^2,$$

and $f_{Y_1,Y_2}(y_1, y_2) = 0$ otherwise, where

$$Q = \frac{1}{1 - \rho^2} \left[ \left( \frac{y_1 - \mu_1}{\sigma_1} \right)^2 - 2\rho \left( \frac{y_1 - \mu_1}{\sigma_1} \right) \left( \frac{y_2 - \mu_2}{\sigma_2} \right) + \left( \frac{y_2 - \mu_2}{\sigma_2} \right)^2 \right].$$

We write $(Y_1, Y_2) \sim N_2(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$. There are 5 parameters associated with this bivariate distribution: the marginal means $\mu_1$ and $\mu_2$, the marginal variances $\sigma_1^2$ and $\sigma_2^2$, and the correlation $\rho \equiv \rho_{Y_1,Y_2}$.

FACTS ABOUT THE BIVARIATE NORMAL DISTRIBUTION:

1. Marginally, $Y_1 \sim N(\mu_1, \sigma_1^2)$ and $Y_2 \sim N(\mu_2, \sigma_2^2)$.

2. $Y_1$ and $Y_2$ are independent $\iff \rho = 0$. This is only true for the bivariate normal distribution; remember, this
does not hold in general.

3. The conditional distribution

$$Y_1 \mid \{Y_2 = y_2\} \sim N\!\left( \mu_1 + \rho \, \frac{\sigma_1}{\sigma_2}(y_2 - \mu_2), \; \sigma_1^2(1 - \rho^2) \right).$$

4. The conditional distribution

$$Y_2 \mid \{Y_1 = y_1\} \sim N\!\left( \mu_2 + \rho \, \frac{\sigma_2}{\sigma_1}(y_1 - \mu_1), \; \sigma_2^2(1 - \rho^2) \right).$$

EXERCISE: Suppose that $(Y_1, Y_2) \sim N_2(0, 0, 1, 1, 0.5)$. What is $P(Y_2 > 0.85 \mid Y_1 = 0.2)$?

ANSWER: From the last result, note that conditional on $Y_1 = y_1 = 0.2$, $Y_2 \sim N(0.1, 0.75)$, since $\mu_2 + \rho(\sigma_2/\sigma_1)(y_1 - \mu_1) = 0.5(0.2) = 0.1$ and $\sigma_2^2(1 - \rho^2) = 1(1 - 0.5^2) = 0.75$. Thus, standardizing with the conditional standard deviation $\sqrt{0.75}$,

$$P(Y_2 > 0.85 \mid Y_1 = 0.2) = P\!\left( Z > \frac{0.85 - 0.1}{\sqrt{0.75}} \right) = P(Z > 0.87) \approx 0.193.$$

Interpret this value as an area under the standard normal density.

4.12 Conditional expectation

4.12.1 Conditional means and curves of regression

TERMINOLOGY: Suppose that $X$ and $Y$ are continuous random variables and that $g(X)$ and $h(Y)$ are functions of $X$ and $Y$, respectively. Recall that the conditional distributions are denoted by $f_{X|Y}(x|y)$ and $f_{Y|X}(y|x)$. Then,

$$E[g(X) \mid Y = y] = \int_{\mathbb{R}} g(x) f_{X|Y}(x|y) \, dx$$
$$E[h(Y) \mid X = x] = \int_{\mathbb{R}} h(y) f_{Y|X}(y|x) \, dy.$$

If $X$ and $Y$ are discrete, then sums replace integrals.

IMPORTANT: It is important to see that, in general,

- $E[g(X) \mid Y = y]$ is a function of $y$, and
- $E[h(Y) \mid X = x]$ is a function of $x$.

CONDITIONAL MEANS: In the definition above, if $g(X) = X$ and $h(Y) = Y$, we get, in the continuous case,

$$E(X \mid Y = y) = \int_{\mathbb{R}} x f_{X|Y}(x|y) \, dx$$
$$E(Y \mid X = x) = \int_{\mathbb{R}} y f_{Y|X}(y|x) \, dy.$$

$E(X \mid Y = y)$ is called the conditional mean of $X$, given $Y = y$; it is the mean of the conditional distribution $f_{X|Y}(x|y)$. On the other hand, $E(Y \mid X = x)$ is the conditional mean of $Y$, given $X = x$; it is the mean of the conditional distribution $f_{Y|X}(y|x)$.

Example 4.23. In a simple genetics model, the proportion, say $X$, of a population with trait 1 is always less than the proportion, say $Y$, of a population with trait 2. In Example 4.3, we saw that the random vector $(X, Y)$ has joint pdf

$$f_{X,Y}(x, y) = \begin{cases} 6x, & 0 < x < y < 1 \\ 0, & \text{otherwise.} \end{cases}$$

In Example 4.5, we derived the conditional distributions

$$f_{X|Y}(x|y) = \begin{cases} 2x/y^2, & 0 < x < y \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad f_{Y|X}(y|x) = \begin{cases} \dfrac{1}{1 - x}, & x < y < 1 \\ 0, & \text{otherwise.} \end{cases}$$

Thus, the conditional mean of $X$, given $Y = y$, is

$$E(X \mid Y = y) = \int_0^y x f_{X|Y}(x|y) \, dx = \int_0^y x \cdot \frac{2x}{y^2} \, dx = \frac{2}{y^2} \cdot \frac{x^3}{3} \bigg|_0^y = \frac{2y}{3}.$$

Similarly, the conditional mean of $Y$, given $X = x$, is

$$E(Y \mid X = x) = \int_x^1 y f_{Y|X}(y|x) \, dy = \int_x^1 y \cdot \frac{1}{1 - x} \, dy = \frac{1}{1 - x} \cdot \frac{y^2}{2} \bigg|_x^1 = \frac{1 - x^2}{2(1 - x)} = \frac{1 + x}{2}.$$

That $E(Y \mid X = x) = (1 + x)/2$ is not surprising, because $Y \mid \{X = x\} \sim U(x, 1)$. $\square$

TERMINOLOGY: Suppose that $(X, Y)$ is a bivariate random vector.

- The graph of $E(X \mid Y = y)$ versus $y$
is called the curve of regression of $X$ on $Y$.

- The graph of $E(Y \mid X = x)$ versus $x$ is called the curve of regression of $Y$ on $X$.

The curve of regression of $Y$ on $X$ from Example 4.23 is depicted in Figure 4.19.

4.12.2 Iterated means and variances

REMARK: In general, $E(X \mid Y = y)$ is a function of $y$, and $y$ is fixed (not random). Thus, $E(X \mid Y = y)$ is a fixed number. However, $E(X \mid Y)$ is a function of $Y$; thus, $E(X \mid Y)$ is a random variable! Furthermore, as with any random variable, it has a mean and variance associated with it!

ITERATED LAWS: Suppose that $X$ and $Y$ are random variables. Then, the laws of iterated expectation and variance, respectively, are given by

$$E(X) = E[E(X \mid Y)]$$

[Figure 4.19: The curve of regression $E(Y \mid X = x)$ versus $x$ in Example 4.23.]

and

$$V(X) = E[V(X \mid Y)] + V[E(X \mid Y)].$$

NOTE: When considering the quantity $E[E(X \mid Y)]$, the inner expectation is taken with respect to the conditional distribution $f_{X|Y}(x|y)$. However, since $E(X \mid Y)$ is a function of $Y$, the outer expectation is taken with respect to the marginal distribution $f_Y(y)$.

Proof. We will prove that $E(X) = E[E(X \mid Y)]$ for the continuous case. Note that

$$E(X) = \int_{\mathbb{R}} \int_{\mathbb{R}} x f_{X,Y}(x, y) \, dx \, dy = \int_{\mathbb{R}} \int_{\mathbb{R}} x f_{X|Y}(x|y) f_Y(y) \, dx \, dy = \int_{\mathbb{R}} \underbrace{\left[ \int_{\mathbb{R}} x f_{X|Y}(x|y) \, dx \right]}_{E(X \mid Y = y)} f_Y(y) \, dy = E[E(X \mid Y)]. \quad \square$$

Example 4.24. Suppose that in a field experiment, we observe $Y$, the number of plots, out of $n$, that respond to a treatment. However, we don't know the value of $p$, the probability of response, and furthermore, we think that it may be a function of location,
variable Y follows a betabinomial distribution This is 712046 ama h extra Variation a popular probability model for situations wherein one observes binomial type responses but where the variance is suspected to be larger than the usual binomial variance D BETA BINOMIAL PMF The probability mass function for a betabinomial random variable Y is given by 1 1 my mwwp fyipyipfppdp 0 0 0 ltZPy1 e prw Sjg ipmu e pf ldp n Na mm awn a e y y FWWWWWam 7 for y 01n and pyy 0 otherwise PAGE 130
