DISTRIBUTION THEORY STAT 610
This 18 page Class Notes was uploaded by Celestino Bergnaum on Wednesday October 21, 2015. The Class Notes belongs to STAT 610 at Texas A&M University taught by Staff in Fall. Since its upload, it has received 51 views. For similar materials see /class/225759/stat-610-texas-a-m-university in Statistics at Texas A&M University.
Date Created: 10/21/15
Chapter 3 Common Families of Distributions

3.1 Discrete distributions (CB 3.2)

Binomial distribution

An experiment which satisfies the following conditions (a)-(d) is called a binomial experiment:
(a) The experiment consists of a sequence of $n$ trials, where $n$ is fixed in advance.
(b) The trials are identical, and each trial can result in two possible outcomes, which we denote by S and F. Such trials are called Bernoulli trials.
(c) The trials are independent, so that the outcome of any particular trial does not influence that of any other trial.
(d) The probability of S is constant from trial to trial and is denoted by $p$.

Given a binomial experiment consisting of $n$ trials, the binomial random variable, say $X$, associated with this experiment is defined as the total number of S's among the $n$ trials. If the total number of trials is $n$ and the probability of success for each trial is $p$, then we say that $X$ is distributed as binomial$(n, p)$, or $B(n, p)$ for convenience; write $X \sim B(n, p)$.

Summary of facts concerning $B(n, p)$:
pmf: $\binom{n}{x} p^x (1-p)^{n-x} I_{\{0,1,\ldots,n\}}(x)$
mgf: $(pe^t + 1 - p)^n$
mean: $np$
variance: $np(1-p)$

EXAMPLES.

Suppose a balanced coin is tossed 10 times. The number of heads observed is distributed as $B(10, .5)$.

Suppose that a group of 200 cancer patients are given a certain treatment and that, according to past research, 40% of all patients respond to the treatment. Then the number of patients that respond to the treatment in this group is a binomial random variable with the distribution $B(200, .4)$.

Suppose that 10 students are selected at random from TAMU with replacement. Then the number of students selected from the College of Science is distributed as binomial with $n = 10$ and $p$ = the percentage of Science students at TAMU.

Suppose that 10 students are selected at random from TAMU without replacement. Then the number of students selected from the College of Science is not distributed as binomial. However, the binomial provides a rather good approximation to the true distribution.

EXAMPLE. Suppose in a quiz a student has to answer at least 6 of 10 true/false
questions in order to pass. What is the probability that someone who knows nothing about the material passes?

SOLUTION. Let $X$ be the number of correct answers. Since the person makes blind guesses, $p = .5$. Hence $X \sim B(10, .5)$. Writing $b(x; n, p)$ for the $B(n, p)$ pmf evaluated at $x$,

$$P(X \ge 6) = b(6; 10, .5) + b(7; 10, .5) + \cdots + b(10; 10, .5) = .2051 + .1172 + .0439 + .0098 + .0010 = .3770.$$

Geometric and negative binomial distributions

The experiments that lead to negative binomial distributions can be characterized as follows:
(a) The experiment consists of a sequence of independent and identical trials.
(b) Each trial results in either S or F.
(c) The experiment continues (i.e., trials are performed) until a total of $r$ successes are observed, where $r$ is a specified (i.e., pre-determined) positive integer.

The rv of interest is $X$ = the number of trials needed to obtain the $r$-th success. $X$ is said to have a negative binomial distribution.

The pmf of the negative binomial rv is derived as follows. First, the range of $X$ is $R_X = \{r, r+1, r+2, \ldots\}$. For each $x$ in $R_X$, if $X = x$ then among the first $x - 1$ outcomes there must be $r - 1$ S's (call this event $A$) and the last trial must result in S (call this event $B$). Hence

$$P(X = x) = P(A \cap B) = P(A)P(B) = \binom{x-1}{r-1} p^{r-1}(1-p)^{x-r} \cdot p = \binom{x-1}{r-1} p^{r}(1-p)^{x-r}, \quad x = r, r+1, r+2, \ldots$$

Alternatively, we can focus on $Y = X - r$, the number of failures before the $r$-th success. The pmf of $Y$ is

$$P(Y = y) = P(X = y + r) = \binom{y+r-1}{r-1} p^{r}(1-p)^{y} = \binom{y+r-1}{y} p^{r}(1-p)^{y}, \quad y = 0, 1, 2, \ldots$$

Let's work with this version of the negative binomial distribution. Note that

$$\binom{y+r-1}{y} = \frac{(y+r-1)(y+r-2)\cdots(r+1)r}{y(y-1)\cdots 2 \cdot 1} = (-1)^y \frac{(-r)(-r-1)\cdots(-r-y+1)}{y(y-1)\cdots 2 \cdot 1} = (-1)^y \binom{-r}{y},$$

where $\binom{-r}{y}$ is called a negative binomial coefficient. Thus

$$P(Y = y) = (-1)^y \binom{-r}{y} p^{r}(1-p)^{y}.$$

Summary of facts concerning the negative binomial:
pmf: $\binom{y+r-1}{y} p^{r}(1-p)^{y}$, $y = 0, 1, 2, \ldots$
mgf: $\left(\dfrac{p}{1-(1-p)e^t}\right)^{r}$, $t < -\log(1-p)$
mean: $\dfrac{r(1-p)}{p}$
variance: $\dfrac{r(1-p)}{p^2}$

EXAMPLE. Suppose that $P(\text{male birth}) = .5$. A couple wish to have exactly two female children in their family. They will have children until this condition is fulfilled.
(a) What is the probability that the family has four or more children?
(b) How many male children would you expect the family to have?

SOLUTION. Let $X$ = total number of male children, which is a negative binomial random variable with $p = .5$, $r = 2$, where

$$P(X = x) = \binom{x+1}{x} p^{2}(1-p)^{x}, \quad x = 0, 1, \ldots$$

(a) P(the family has four
or more children) $= P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - p^2 - 2p^2(1-p) = .5$.

(b) $E(X) = \dfrac{r(1-p)}{p} = 2$.

The special case of the negative binomial distribution where $r = 1$ is called the geometric distribution. One version of the geometric distribution is described by the pmf

$$p(x) = p(1-p)^{x-1}, \quad x = 1, 2, \ldots,$$

which is the pmf for the number of trials needed to get the first success. The geometric distribution has the following property, called the memoryless property:

$$P(X > t + u \mid X > t) = P(X > u) \quad \text{for all } u, t = 1, 2, \ldots$$

This is due to the fact that the trials are independent.

If $X_n$ is negative binomial$(p_n, r_n)$, where $r_n \to \infty$ and $r_n(1 - p_n) \to$ some constant $\lambda \in (0, \infty)$, then it can be shown (exercise) that the mgf of $X_n$ converges to the mgf of Poisson$(\lambda)$, and hence the distribution of $X_n$ converges to Poisson$(\lambda)$.

Hypergeometric distribution

The experiments that lead to hypergeometric distributions can be characterized as follows:
(a) The population to be sampled consists of $N$ elements.
(b) Each element can be classified as a success (S) or a failure (F), and there are $M$ successes in the set.
(c) A sample of $K$ elements is drawn at random and without replacement from the set.

The rv of interest is $X$ = number of S's in the sample. Clearly,

$$P(X = x) = \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}}$$

for any integer $x$ for which $0 \le x \le M$ and $0 \le K - x \le N - M$. $X$ is said to have a hypergeometric distribution.

Note that in the experiment above, if the samples are drawn with replacement then $X \sim B(n, p)$, where $n = K$ and $p = M/N$. In that connection, if $K$, the sample size, is small compared with $N$, the population size, then the hypergeometric probabilities can be approximated by binomial probabilities.

EXAMPLE. During the course of an hour, 1000 bottles of beer are filled by a particular machine. Each hour a sample of 20 bottles is randomly selected and the number of ounces of beer per bottle is checked. Suppose that during a particular hour 100 underfilled bottles are produced. Find the probability that at least 3 underfilled bottles will be among those sampled.

The exact value of this probability is given by

$$P(X \ge 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2) = 1 - \frac{\binom{100}{0}\binom{900}{20}}{\binom{1000}{20}} - \frac{\binom{100}{1}\binom{900}{19}}{\binom{1000}{20}} - \frac{\binom{100}{2}\binom{900}{18}}{\binom{1000}{20}} \approx .3228.$$

This can be approximated by a binomial
probability: let $n = 20$, $p = 100/1000 = .1$; then, with $b(x; n, p)$ denoting the $B(n, p)$ pmf at $x$,

$$1 - b(0; n, p) - b(1; n, p) - b(2; n, p) = .3231.$$

The mean can be computed as

$$E(X) = \sum_{x=0}^{K} x \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}} = \sum_{x=1}^{K} x \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}}.$$

Note that

$$x\binom{M}{x} = M\binom{M-1}{x-1}, \quad \text{and likewise} \quad \binom{N}{K} = \frac{N}{K}\binom{N-1}{K-1}.$$

Hence

$$E(X) = \sum_{x=1}^{K} \frac{M\binom{M-1}{x-1}\binom{N-M}{K-x}}{\frac{N}{K}\binom{N-1}{K-1}} = \frac{KM}{N}\sum_{y=0}^{K-1} \frac{\binom{M-1}{y}\binom{N-M}{K-1-y}}{\binom{N-1}{K-1}} = \frac{KM}{N},$$

since

$$\frac{\binom{M-1}{y}\binom{N-M}{K-1-y}}{\binom{N-1}{K-1}}, \quad y = 0, 1, \ldots, K-1,$$

is another hypergeometric pmf.

The Poisson distribution

Suppose $X$ is a random variable with the following pmf:

$$p(x) = \begin{cases} \dfrac{e^{-\lambda}\lambda^{x}}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{elsewhere,} \end{cases}$$

where $\lambda$ is some positive constant and $e$ is the base of the natural logarithm. Then we say that $X$ has a Poisson distribution with parameter $\lambda$.

Summary of facts concerning Poisson$(\lambda)$:
mgf: $e^{\lambda(e^t - 1)}$, $-\infty < t < \infty$
mean: $\lambda$
variance: $\lambda$

Unlike the distributions that we have considered so far, there is no universal characterization of the experiments from which the Poisson distribution arises. In fact, the Poisson distribution cannot be produced by an experiment that only involves a finite number of steps.

It is useful to understand the Poisson distribution from the point of view of the Poisson process. Consider the time points where random arrivals into a system occur; for example, we may be considering the arrivals of customers in a store, the arrivals of flights at an airport, the arrivals of jobs into a computer system, etc. Suppose the inter-arrival times are independent and follow the same exponential distribution; then the arrivals are said to form a Poisson process. For a Poisson process we can speak of its rate, denoted by $\alpha$ here, which is the expected number of arrivals in a unit of time. In a Poisson process with rate $\alpha$, the number of arrivals between any two time points $a < b$ is a Poisson random variable with mean $\alpha(b - a)$.

EXAMPLE. Suppose small aircraft arrive at a certain airport according to a Poisson process with rate $\alpha = 2$ per hour, so that the number of arrivals during $t$ hours is a Poisson rv with mean $2t$.
(a) What is the probability that 3 or more aircraft arrive during a 2.5-hour period?
(b) What are the expected value and standard deviation of the number of small aircraft that arrive
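during a 90-minute period?

SOLUTION. (a) Let $X$ = the number of arrivals during a 2.5-hour period. Then $X \sim$ Poisson$(5)$, and so

$$P(X \ge 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2) = 1 - e^{-5}\left(\frac{5^0}{0!} + \frac{5^1}{1!} + \frac{5^2}{2!}\right) = .875.$$

(b) The Poisson distribution concerned has parameter $\lambda = 3$. Hence the mean is 3 and the standard deviation is $\sqrt{3} = 1.732$.

An interesting application of the Poisson distribution is the approximation to the binomial distribution. We saw in Chapter 2 that if $X_n \sim B(n, p_n)$, where $np_n \to \lambda \in (0, \infty)$, then the cdf of $X_n$ converges to the Poisson$(\lambda)$ cdf. This implies that the pmf of $X_n$ converges to the pmf of Poisson$(\lambda)$, so that for any fixed $x$,

$$\binom{n}{x} p_n^{x}(1 - p_n)^{n-x} \to \frac{e^{-\lambda}\lambda^{x}}{x!} \quad \text{as } n \to \infty.$$

The way in which this is applied in practice is that if $X \sim B(n, p)$, where $n$ is large and $np$ is not too large, then

$$\binom{n}{x} p^{x}(1 - p)^{n-x} \approx \frac{e^{-np}(np)^{x}}{x!}.$$

EXAMPLE. Suppose that a PC manufacturer buys a particular type of disk drive in shipments of 1000 from a supplier. Suppose that, from past experience, the probability that any one purchased disk drive is unsatisfactory is .001. In a shipment of 1000 such drives, what is the probability that (a) none are defective, (b) three or more are defective?

SOLUTION. Let $X$ = the number of defective drives. $X$ is a binomial rv with parameters $n = 1000$ and $p = .001$. Since $np = 1$, it is appropriate to use the Poisson approximation:

(a) $P(X = 0) \approx \dfrac{e^{-1}1^0}{0!} = e^{-1} = .368$.

(b) $P(X \ge 3) \approx 1 - e^{-1}\left(\dfrac{1^0}{0!} + \dfrac{1^1}{1!} + \dfrac{1^2}{2!}\right) = .080301$.

3.2 Continuous distributions (CB 3.3)

Uniform$(a, b)$ distribution

pdf: $\dfrac{1}{b-a} I_{(a,b)}(x)$
mean: $\dfrac{a+b}{2}$
variance: $\dfrac{(b-a)^2}{12}$

Gamma$(\alpha, \beta)$ distribution

pdf: $\dfrac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta} I_{(0,\infty)}(x)$
mgf: $(1 - \beta t)^{-\alpha}$, $t < 1/\beta$
mean: $\alpha\beta$
variance: $\alpha\beta^2$

Gamma$(1, \beta)$ is called the exponential distribution with mean $\beta$.

THEOREM 3.2.1 (Memoryless property). If $X$ has the exponential distribution, then for $t_1, t_2 > 0$,

$$P(X > t_1 + t_2 \mid X > t_1) = P(X > t_2).$$

PROOF. Note that $P(X > t) = e^{-t/\beta}$, $t > 0$. Hence

$$P(X > t_1 + t_2 \mid X > t_1) = \frac{P(X > t_1 + t_2)}{P(X > t_1)} = \frac{e^{-(t_1 + t_2)/\beta}}{e^{-t_1/\beta}} = e^{-t_2/\beta} = P(X > t_2). \qquad \Box$$

Gamma$(\alpha, \beta)$ is called the Erlang distribution if $\alpha$ is a positive integer. Erlang was a Danish scientist whose work revolutionized the theory of communication. If signals arrive according to a Poisson process with rate $1/\beta$ per unit time, then the time until the $r$-th signal is distributed as gamma$(r, \beta)$. This can be seen from the following result.

THEOREM 3.2.1 (Gamma-Poisson relationship). If $X \sim$ gamma$(r, \beta)$, where $r$ is a positive integer, then for any $x > 0$,

$$P(X \le x) = P(Y \ge r),$$

where $Y \sim$ Poisson$(x/\beta)$. Note that $P(Y \ge r)$ = the probability that the $r$-th arrival in a Poisson process with rate $1/\beta$ is $\le x$.

PROOF.

$$P(X \le x) = \frac{1}{(r-1)!\,\beta^{r}} \int_0^x t^{r-1} e^{-t/\beta}\,dt.$$

By integration by parts,

$$P(X \le x) = \frac{1}{(r-1)!\,\beta^{r}}\left[-\beta x^{r-1} e^{-x/\beta} + (r-1)\beta \int_0^x t^{r-2} e^{-t/\beta}\,dt\right] = -\frac{(x/\beta)^{r-1} e^{-x/\beta}}{(r-1)!} + \frac{1}{(r-2)!\,\beta^{r-1}} \int_0^x t^{r-2} e^{-t/\beta}\,dt.$$

Repeating this process another $r - 2$ times gives $P(X \le x) = 1 - \sum_{k=0}^{r-1} e^{-x/\beta}(x/\beta)^k/k! = P(Y \ge r)$, which is the desired result. $\Box$

For any positive integer $p$, gamma$(p/2, 2)$ is called the chi-square distribution with $p$ degrees of freedom (df).

Normal distribution

Summary of facts concerning $N(\mu, \sigma^2)$:
pdf: $\dfrac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/(2\sigma^2)}$
mgf: $e^{\mu t + \sigma^2 t^2/2}$, $-\infty < t < \infty$
mean: $\mu$
variance: $\sigma^2$
transformation: $aX + b \sim N(a\mu + b, a^2\sigma^2)$

The normal is a good approximation to many distributions because of the Central Limit Theorem (Ch. 5).

Beta distribution

The beta function $B(\alpha, \beta)$ is defined as

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}, \quad \alpha > 0, \ \beta > 0.$$

The beta$(\alpha, \beta)$ pdf is

$$f(x \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha-1}(1-x)^{\beta-1} I_{(0,1)}(x).$$

The $n$-th moment is

$$E(X^n) = \frac{1}{B(\alpha, \beta)} \int_0^1 x^{n} x^{\alpha-1}(1-x)^{\beta-1}\,dx = \frac{B(\alpha + n, \beta)}{B(\alpha, \beta)}.$$

For example,

$$E(X) = \frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} = \frac{\Gamma(\alpha + 1)\Gamma(\beta)}{\Gamma(\alpha + \beta + 1)} \cdot \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} = \frac{\alpha}{\alpha + \beta}.$$

Cauchy distribution

The pdf of the Cauchy distribution is

$$f(x \mid \theta) = \frac{1}{\pi} \cdot \frac{1}{1 + (x - \theta)^2}.$$

The mean of the distribution is undefined since

$$E|X| = \int_{-\infty}^{\infty} \frac{1}{\pi} \cdot \frac{|x|}{1 + (x - \theta)^2}\,dx = \infty.$$

Lognormal distribution

If $Y \sim N(\mu, \sigma^2)$ then $X = e^{Y}$ is said to have the lognormal distribution. The pdf of $X$ is

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \cdot \frac{1}{x} e^{-(\log x - \mu)^2/(2\sigma^2)} I_{(0,\infty)}(x),$$

and $E(X) = E(e^{Y}) = M_Y(1) = e^{\mu + \sigma^2/2}$.

3.3 Exponential family of distributions (CB 3.4)

A collection of pmf's or pdf's is in the exponential family if it can be expressed in the form

$$f(x \mid \theta) = h(x)\,c(\theta) \exp\left(\sum_{i=1}^{k} w_i(\theta)\,t_i(x)\right),$$

where $h(x) \ge 0$, $c(\theta) \ge 0$ ($\theta$ can be a vector), $h, t_1, \ldots, t_k$ are functions of $x$ that do not depend on $\theta$, and $c, w_1, \ldots, w_k$ are functions of $\theta$ that do not depend on $x$.

Examples of exponential families include binomial, negative binomial, Poisson, normal, and gamma.

The binomial pmf can be written as

$$f(x \mid p) = \binom{n}{x} p^{x}(1-p)^{n-x} I_{\{0,1,\ldots,n\}}(x) = \binom{n}{x} I_{\{0,1,\ldots,n\}}(x)\,(1-p)^{n} \exp\left(x \log\frac{p}{1-p}\right) = h(x)\,c(p) \exp\big(w(p)\,t(x)\big),$$

where

$$h(x) = \binom{n}{x} I_{\{0,1,\ldots,n\}}(x), \quad c(p) = (1-p)^{n}, \quad w(p) = \log\frac{p}{1-p}, \quad t(x) = x.$$

The gamma pdf can be written as

$$f(x \mid \alpha, \beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta} I_{(0,\infty)}(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} I_{(0,\infty)}(x) \exp\left((\alpha - 1)\log x - \frac{x}{\beta}\right) = h(x)\,c(\theta) \exp\big(w_1(\theta)\,t_1(x) + w_2(\theta)\,t_2(x)\big),$$

where $\theta = (\alpha, \beta)$ and

$$h(x) = I_{(0,\infty)}(x), \quad c(\theta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}, \quad w_1(\theta) = \alpha - 1, \quad t_1(x) = \log x, \quad w_2(\theta) = -\frac{1}{\beta}, \quad t_2(x) = x.$$

Note that the functions are not unique.

The uniform distribution on $(0, \theta)$, $\theta > 0$, is not an exponential family: the pdf

$$f(x \mid \theta) = \frac{1}{\theta} I_{(0,\theta)}(x)$$

cannot be put into exponential form, since it is impossible to write $I_{(0,\theta)}(x)$ as a product of $h(x)$ and $c(\theta)$. This can obviously be generalized: a family is not exponential whenever the set on which the pmf or pdf is positive (called the support of the distribution) depends on $\theta$.

In some situations it may be desirable to reparameterize the family in terms of the natural parameters:

$$f(x \mid \eta) = h(x)\,c^{*}(\eta) \exp\left(\sum_{i=1}^{k} \eta_i\,t_i(x)\right),$$

where $\eta_i = w_i(\theta)$. In the gamma example, the natural parameters are $\eta_1 = \alpha - 1$, $\eta_2 = -1/\beta$, and

$$c^{*}(\eta) = \frac{(-\eta_2)^{\eta_1 + 1}}{\Gamma(\eta_1 + 1)}.$$

3.4 Location and scale families (CB 3.5)

Let $f$ be any pdf. Then the family of pdf's $f(x - \mu)$, $-\infty < \mu < \infty$, is called the location family with standard pdf $f$, and $\mu$ is called the location parameter for the family.

The family of pdf's $\frac{1}{\sigma} f(x/\sigma)$, $\sigma > 0$, is called the scale family with standard pdf $f$, and $\sigma$ is called the scale parameter for the family.

The family of pdf's $\frac{1}{\sigma} f\big((x - \mu)/\sigma\big)$, $-\infty < \mu < \infty$, $\sigma > 0$, is called the location-scale family with standard pdf $f$, with location parameter $\mu$ and scale parameter $\sigma$.

This is a way to generate a family of distributions from a given distribution. The family of normal distributions is a location-scale family.

3.5 Inequalities and identities (CB 3.6)

THEOREM 3.5.1 (Chebychev's inequality). Let $X$ be a rv and $g$ a nonnegative function. Then for any $u > 0$,

$$P(g(X) \ge u) \le \frac{E\,g(X)}{u}.$$

PROOF.

$$E\,g(X) = \int_{-\infty}^{\infty} g(x)\,dF(x) \ge \int_{\{x:\, g(x) \ge u\}} g(x)\,dF(x) \ge u \int_{\{x:\, g(x) \ge u\}} dF(x) = u\,P(g(X) \ge u). \qquad \Box$$

COROLLARY 3.5.2. Let $X$ be a rv with mean $\mu$ and variance $\sigma^2$. Then for any $y > 0$,

$$P(|X - \mu| \ge y) \le \frac{\sigma^2}{y^2}.$$

PROOF. Applying Theorem 3.5.1 with $g(x) = (x - \mu)^2$,

$$P(|X - \mu| \ge y) = P\big((X - \mu)^2 \ge y^2\big) \le \frac{E(X - \mu)^2}{y^2} = \frac{\sigma^2}{y^2}. \qquad \Box$$

Chebychev's inequality gives a very crude bound, as it only uses information in the first two moments. For example, by Chebychev's inequality, $P(|X - \mu| \ge 2\sigma) \le .25$, but if $X \sim N(\mu, \sigma^2)$ then $P(|X - \mu| \ge 2\sigma) = .04550026$.

EXAMPLE. The following inequality for the normal distribution is sharper. If $Z \sim N(0, 1)$ then

$$P(|Z| \ge t) \le \sqrt{\frac{2}{\pi}}\,\frac{e^{-t^2/2}}{t}, \quad t > 0.$$

PROOF.

$$P(Z \ge t) = \frac{1}{\sqrt{2\pi}} \int_t^{\infty} e^{-x^2/2}\,dx \le \frac{1}{\sqrt{2\pi}} \int_t^{\infty} \frac{x}{t} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-t^2/2}}{t},$$

and by symmetry $P(|Z| \ge t) = 2P(Z \ge t)$. $\Box$

EXAMPLE. Here is another way in which Chebychev's inequality can be used. For $t > 0$,

$$P(X > x) = P(e^{tX} > e^{tx}) \le \frac{M_X(t)}{e^{tx}},$$

provided the mgf $M_X$ exists. For example, if $X \sim N(0, 1)$ then

$$P(X > x) = P(e^{tX} > e^{tx}) \le e^{t^2/2 - tx}.$$

For $x > 0$, the choice of $t = x$ gives the best bound, $e^{-x^2/2}$.

THEOREM 3.5.2 (Stein's Lemma). Let $X \sim N(\theta, \sigma^2)$ and let $g$ be a differentiable function with $E|g'(X)| < \infty$. Then

$$E[g(X)(X - \theta)] = \sigma^2 E\,g'(X).$$

PROOF. By integration by parts,

$$E[g(X)(X - \theta)] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} g(x)(x - \theta) e^{-(x-\theta)^2/(2\sigma^2)}\,dx = \frac{1}{\sqrt{2\pi}\,\sigma}\left[-\sigma^2 g(x) e^{-(x-\theta)^2/(2\sigma^2)}\Big|_{-\infty}^{\infty} + \sigma^2 \int_{-\infty}^{\infty} g'(x) e^{-(x-\theta)^2/(2\sigma^2)}\,dx\right].$$

The assumption implies that $\sigma^2 g(x) e^{-(x-\theta)^2/(2\sigma^2)}\Big|_{-\infty}^{\infty} = 0$, so the right-hand side equals $\sigma^2 E\,g'(X)$. $\Box$

Taking $g(x) = x^{k-1}$ in Stein's Lemma gives

$$E(X^{k}) = E\big(X^{k-1}(X - \theta + \theta)\big) = E\big(X^{k-1}(X - \theta)\big) + \theta E(X^{k-1}) = (k-1)\sigma^2 E(X^{k-2}) + \theta E(X^{k-1}),$$

which can be applied iteratively to obtain higher order normal moments.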
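The moment recursion from Stein's Lemma, $E(X^{k}) = (k-1)\sigma^2 E(X^{k-2}) + \theta E(X^{k-1})$ for $X \sim N(\theta, \sigma^2)$, can be iterated mechanically starting from $E(X^0) = 1$ and $E(X^1) = \theta$. The following is a minimal sketch of that iteration in Python. The function name `normal_moments` and its argument names are illustrative choices, not part of the notes.

```python
def normal_moments(theta, sigma2, max_order):
    """Return [E(X^0), ..., E(X^max_order)] for X ~ N(theta, sigma2).

    Illustrative helper (hypothetical name): iterates the recursion
    E(X^k) = (k-1)*sigma2*E(X^(k-2)) + theta*E(X^(k-1)).
    """
    if max_order < 0:
        raise ValueError("max_order must be nonnegative")
    moments = [1.0]                      # E(X^0) = 1
    if max_order >= 1:
        moments.append(float(theta))     # E(X^1) = theta
    for k in range(2, max_order + 1):
        moments.append((k - 1) * sigma2 * moments[k - 2]
                       + theta * moments[k - 1])
    return moments


if __name__ == "__main__":
    # Standard normal: odd moments vanish, even moments are 1, 3, 15.
    print(normal_moments(0.0, 1.0, 6))   # [1.0, 0.0, 1.0, 0.0, 3.0, 0.0, 15.0]
    # N(1, 2): second moment is sigma^2 + theta^2 = 3.
    print(normal_moments(1.0, 2.0, 4))   # [1.0, 1.0, 3.0, 7.0, 25.0]
```

For the standard normal the output reproduces the familiar even moments 1, 3, 15, which is a quick sanity check on the recursion.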