INTRO TO PROBABILITY (STA 4321)
These 66 pages of class notes were uploaded by Golden Bernhard on Friday, September 18, 2015. The notes belong to STA 4321 at the University of Florida, taught by Yasar Yesilcay in Fall. Since upload, they have received 37 views. For similar materials see /class/206555/sta-4321-university-of-florida in Statistics at the University of Florida.
Date Created: 09/18/15
Chapter 6: Multivariate Probability Distributions

6.1 Bivariate and Marginal Probability Distributions

Definition 6.1. Let X and Y be two discrete random variables. The joint probability distribution (joint pmf) of X and Y is given by

  p(x, y) = P(X = x, Y = y),

defined for all real numbers x and y. The marginal distribution of X is

  p_X(x) = P(X = x) = Σ_{all y} p(x, y),

and similarly the marginal distribution of Y is

  p_Y(y) = P(Y = y) = Σ_{all x} p(x, y).

STA4321 Chap 6, Page 1 of 27

Example 1 (refer to Tables 6.1 and 6.2 in the text). The joint pmf p(x, y) of two discrete random variables X and Y is given as follows:

                 X
            1       2       Total
  Y   0     c       0.10
      1     0.10    0.20
      2     0.10    0.20
  Total

(a) Find c so that p(x, y) is the joint pmf of the random variables X and Y.
(b) Find the marginal distribution of X.
(c) Find the marginal distribution of Y.

Solutions:

(a) For p(x, y) to be the joint pmf of X and Y, two conditions must be satisfied:
  (i) 0 ≤ p(x, y) ≤ 1 for all pairs (x, y);
  (ii) the sum of p(x, y) over all pairs (x, y) must equal 1.
The first condition is satisfied if c > 0 and p(x, y) = 0 for all (x, y) other than (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2). To satisfy the second condition we must have c = 0.30, since then

  Σ_{x=1}^{2} Σ_{y=0}^{2} p(x, y) = 0.30 + 0.10 + 0.10 + 0.20 + 0.10 + 0.20 = 1.00.

Thus both conditions are satisfied when c = 0.30, and the complete table is

                 X
            1       2       Total
  Y   0     0.30    0.10    0.40
      1     0.10    0.20    0.30
      2     0.10    0.20    0.30
  Total     0.50    0.50    1.00

(b) The marginal distribution of X is found by adding the probabilities p(x, y) over all values of Y for each value of X:

  p_X(1) = 0.30 + 0.10 + 0.10 = 0.50,
  p_X(2) = 0.10 + 0.20 + 0.20 = 0.50.

That is, p_X(x) = 1/2 for x = 1 or 2, and p_X(x) = 0 otherwise. Note that these probabilities are given in the lower margin (last row) of the table; that is why they are called marginal probabilities.

(c) The marginal distribution of Y is found similarly: for each value of Y, add up the probabilities p(x, y) over all values of X:

  p_Y(0) = 0.30 + 0.10 = 0.40,
  p_Y(1) = 0.10 + 0.20 = 0.30,
  p_Y(2) = 0.10 + 0.20 = 0.30,

and p_Y(y) = 0 otherwise.
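The marginal computations in Example 1 are easy to check mechanically. A minimal Python sketch (mine, not part of the notes; the dictionary layout is an arbitrary choice):

```python
# Sketch (not from the notes): recomputing the marginals of Example 1
# from the joint pmf table, with c already set to 0.30.
joint = {
    (1, 0): 0.30, (2, 0): 0.10,
    (1, 1): 0.10, (2, 1): 0.20,
    (1, 2): 0.10, (2, 2): 0.20,
}

# Condition (ii): the joint probabilities must sum to 1.
total = sum(joint.values())

# Marginal of X: for each x, add p(x, y) over all y; similarly for Y.
p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0.0) + p
    p_Y[y] = p_Y.get(y, 0.0) + p
```

The resulting `p_X` and `p_Y` match the "Total" row and column of the completed table.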
Note: these are the probabilities given in the right margin (last column) of the table. Observe that the marginal and the joint probabilities given above satisfy the conditions of pmfs: they are all greater than or equal to zero, and they add up to one.

Definition 6.2. Let X and Y be two continuous random variables. The joint probability density function (joint pdf) of X and Y is a nonnegative function f(x, y) such that
  (a) f(x, y) ≥ 0 for all x, y;
  (b) ∫∫ f(x, y) dx dy = 1, the integral taken over the whole plane;
  (c) P(a < X < b, c < Y < d) = ∫_c^d ∫_a^b f(x, y) dx dy.

The marginal probability density function of X is

  f_X(x) = ∫_{−∞}^{∞} f(x, y) dy,

and similarly the marginal pdf of Y is

  f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx.

Example 2. Let f(x, y) = c when 0 < x < 1 and 0 < y < 2, and f(x, y) = 0 otherwise.
(a) Find c so that f(x, y) is the joint pdf of the random variables X and Y.
(b) Find the marginal distribution of X. What is the name of this distribution? What are the values of its parameters?
(c) Find the marginal distribution of Y. Do we have a name for this distribution?

Solutions:

(a) For f(x, y) to be the joint pdf of the random variables, two conditions must be satisfied:
  (i) f(x, y) ≥ 0 for all pairs (x, y);
  (ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.
The first condition is satisfied when c > 0. For the second condition, note that

  1 = ∫_0^2 ∫_0^1 c dx dy = ∫_0^2 c dy = 2c.

Hence c = 1/2, and the joint pdf of X and Y is f(x, y) = 1/2 when 0 < x < 1 and 0 < y < 2, and f(x, y) = 0 otherwise.

(b) To find the marginal distribution of X, integrate the random variable Y out of the joint distribution:

  f_X(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_0^2 (1/2) dy = 1.

That is, f_X(x) = 1 for 0 < x < 1, and f_X(x) = 0 otherwise. This is the continuous uniform distribution with parameters 0 and 1.

(c) The marginal distribution of Y is

  f_Y(y) = ∫_0^1 (1/2) dx = 1/2.

So f_Y(y) = 1/2 for 0 < y < 2, and f_Y(y) = 0 otherwise. This is the continuous uniform distribution with parameters 0 and 2.

6.2 Conditional Probability Distributions

Definition 6.3a (conditional probability distributions for discrete random variables). Let X and Y be discrete random variables with
joint probability mass function p(x, y) and marginal probability mass functions p_X(x) and p_Y(y) for X and Y respectively. Then the conditional distribution of X given Y = y is

  p_{X|Y}(x|y) = P(X = x | Y = y) = p(x, y) / p_Y(y)  for P(Y = y) > 0,

and 0 otherwise. The conditional distribution of Y given X = x is

  p_{Y|X}(y|x) = P(Y = y | X = x) = p(x, y) / p_X(x)  for P(X = x) > 0,

and 0 otherwise.

Example 1 continued. Remember that in Example 1 we had the following joint pmf of X and Y:

                 X
            1       2       Total
  Y   0     0.30    0.10    0.40
      1     0.10    0.20    0.30
      2     0.10    0.20    0.30
  Total     0.50    0.50    1.00

(d) To find the conditional distribution of X given Y = 1, we use the definition of conditional probability learned in Chapter 2. Define the two events A = {X = x} and B = {Y = 1}; then P(A | B) = P(A and B)/P(B). That is,

  p_{X|Y}(x|1) = P(X = x | Y = 1) = P(X = x, Y = 1) / P(Y = 1).

Thus

  p_{X|Y}(1|1) = P(X = 1, Y = 1)/P(Y = 1) = 0.10/0.30 = 1/3,
  p_{X|Y}(2|1) = P(X = 2, Y = 1)/P(Y = 1) = 0.20/0.30 = 2/3,

and P(X = x | Y = 1) = 0 otherwise.

(e) Similarly, we can find the conditional distribution of Y given X = 2 as

  p_{Y|X}(0|2) = P(X = 2, Y = 0)/P(X = 2) = 0.10/0.50 = 1/5,
  p_{Y|X}(1|2) = P(X = 2, Y = 1)/P(X = 2) = 0.20/0.50 = 2/5,
  p_{Y|X}(2|2) = P(X = 2, Y = 2)/P(X = 2) = 0.20/0.50 = 2/5,

and P(Y = y | X = 2) = 0 otherwise.

Definition 6.3b (conditional probability distributions for continuous random variables). Let X and Y be jointly continuous random variables with joint probability density function f(x, y) and marginal densities f_X(x) and f_Y(y) respectively. The conditional probability density function of X given Y = y is

  f_{X|Y}(x|y) = f(x, y) / f_Y(y)  for f_Y(y) > 0,

and 0 otherwise. The conditional probability density function of Y given X = x is

  f_{Y|X}(y|x) = f(x, y) / f_X(x)  for f_X(x) > 0,

and 0 otherwise.

Example 2 continued. Using the results of Example 2:
(f) Find f_{X|Y}(x|1), the conditional distribution of X given that Y = 1.
(g) Is f_{X|Y}(x|1) a pdf? Give reasons.

In Example 2 we found that the joint distribution of the random variables X and Y is f(x, y) = 1/2 when 0 < x < 1 and 0 < y < 2, and f(x, y) = 0 otherwise. The marginal distribution of X is f_X(x) = 1 when 0 < x < 1, and f_X(x) = 0 otherwise. The marginal distribution of Y is f_Y(y) = 1/2 when 0 < y < 2, and f_Y(y) = 0 otherwise.
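The facts just recalled (joint pdf 1/2 on the rectangle, marginals Uniform(0, 1) and Uniform(0, 2)) can be verified numerically. The following sketch (my own; the grid size is an arbitrary choice) approximates the integrals with midpoint Riemann sums:

```python
# Numerical check of Example 2 (sketch, not from the notes): with
# c = 1/2 the joint pdf integrates to 1, the marginal of X is 1 on
# (0, 1), and the marginal of Y is 1/2 on (0, 2).
n = 200
dx, dy = 1.0 / n, 2.0 / n

def f(x, y):
    return 0.5 if (0.0 < x < 1.0 and 0.0 < y < 2.0) else 0.0

# Condition (ii): the integral of f over the plane equals 1.
total = sum(f((i + 0.5) * dx, (j + 0.5) * dy) * dx * dy
            for i in range(n) for j in range(n))

# Marginal of X at an interior point: integrate y out; should be 1.
fX_at_03 = sum(f(0.3, (j + 0.5) * dy) * dy for j in range(n))
# Marginal of Y at an interior point: integrate x out; should be 1/2.
fY_at_15 = sum(f((i + 0.5) * dx, 1.5) * dx for i in range(n))
```

Since f is constant on the rectangle, the midpoint sums reproduce the exact values up to floating-point rounding.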
Using these and the definition of conditional distribution, we have

  f_{X|Y}(x|1) = f(x, 1) / f_Y(1) = (1/2) / (1/2) = 1 for 0 < x < 1,

and f_{X|Y}(x|1) = 0 otherwise.

(g) We see that f_{X|Y}(x|1) as given above is in fact a pdf, because it satisfies both requirements of pdfs: f_{X|Y}(x|1) ≥ 0 for all x, and ∫ f_{X|Y}(x|1) dx = 1.

6.3 Independent Random Variables

Definition 6.4a (independence of two discrete random variables). Given two discrete random variables X and Y with joint pmf p(x, y), marginal pmfs p_X(x) and p_Y(y), and conditional pmfs p_{X|Y}(x|y) and p_{Y|X}(y|x) respectively. If any one of the following is true, then all of them are true:
  (a) X and Y are independent;
  (b) p(x, y) = p_X(x) × p_Y(y) for all x, y;
  (c) p_{X|Y}(x|y) = p_X(x) for all x, whenever p_Y(y) > 0;
  (d) p_{Y|X}(y|x) = p_Y(y) for all y, whenever p_X(x) > 0.

Example 1 continued. Are X and Y independent random variables when p(x, y) is as given in the following table?

                 X
            1       2       Total
  Y   0     0.30    0.10    0.40
      1     0.10    0.20    0.30
      2     0.10    0.20    0.30
  Total     0.50    0.50    1.00

To show independence of the two variables, condition (b) above must hold for ALL (x, y) pairs. On the other hand, to show dependence it is enough to show that the equality fails for just one pair. In the above table we see that

  P(X = 1, Y = 0) = 0.30 ≠ P(X = 1) × P(Y = 0) = 0.50 × 0.40 = 0.20.

This is sufficient to state that X and Y are not independent random variables.

Definition 6.4b (independence of two continuous random variables). Given two continuous random variables X and Y with joint pdf f(x, y), marginal pdfs f_X(x) and f_Y(y), and conditional pdfs f_{X|Y}(x|y) and f_{Y|X}(y|x) respectively. If any one of the following is true, then all of them are true:
  (a) X and Y are independent;
  (b) f(x, y) = f_X(x) × f_Y(y) for all x, y;
  (c) f_{X|Y}(x|y) = f_X(x) for all x, whenever f_Y(y) > 0;
  (d) f_{Y|X}(y|x) = f_Y(y) for all y, whenever f_X(x) > 0.

Example 2 continued. Are X and Y independent when f(x, y) = 1/2 for 0 < x < 1, 0 < y < 2, and f(x, y) = 0 otherwise? We have already found that f_X(x) = 1 when 0 < x < 1 (and 0 otherwise) and f_Y(y) = 1/2 when 0 < y < 2 (and 0 otherwise). Putting these together, f(x, y) = f_X(x) × f_Y(y) for all pairs (x, y), −∞ < x < ∞ and −∞ < y < ∞. Thus X and Y are independent random variables.
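The dependence check for the discrete table of Example 1 is mechanical, since a single failing pair settles the question. A sketch (mine, not from the notes):

```python
# Sketch: X and Y are independent only if p(x, y) = p_X(x) * p_Y(y)
# for EVERY pair; one failing pair is enough to prove dependence.
joint = {
    (1, 0): 0.30, (2, 0): 0.10,
    (1, 1): 0.10, (2, 1): 0.20,
    (1, 2): 0.10, (2, 2): 0.20,
}
p_X = {x: sum(p for (a, b), p in joint.items() if a == x) for x in (1, 2)}
p_Y = {y: sum(p for (a, b), p in joint.items() if b == y) for y in (0, 1, 2)}

independent = all(abs(joint[(x, y)] - p_X[x] * p_Y[y]) < 1e-9
                  for (x, y) in joint)
# The pair (1, 0) already fails: p(1, 0) = 0.30 but
# p_X(1) * p_Y(0) = 0.50 * 0.40 = 0.20.
```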
The above definitions of independence may be extended to more than two variables. For example, X1, X2, …, Xn are independent random variables if their joint distribution is equal to the product of their marginal distributions, i.e.

  f(x1, x2, …, xn) = f1(x1) × f2(x2) × … × fn(xn).

Observe that ALL of the above definitions exactly parallel the definitions of the independence of two events seen in Chapter 2.

6.4 Expected Values of Functions of Random Variables

The definitions are very similar to the definitions of expected values seen in Chapters 3 and 4, except that we now have bivariate (or multivariate) pmfs or pdfs.

Definition 6.5. Given two discrete random variables X, Y with joint probability mass function p(x, y), or two continuous random variables X, Y with joint probability density function f(x, y), the expected value of any function g(X, Y) is

  E[g(X, Y)] = Σ_{all x} Σ_{all y} g(x, y) p(x, y)   if X and Y are discrete,
  E[g(X, Y)] = ∫∫ g(x, y) f(x, y) dx dy             if X and Y are continuous.

A special case: if X and Y are two independent random variables, then

  E[g(X) × h(Y)] = E[g(X)] × E[h(Y)].

Two important special expected values: COVARIANCE and CORRELATION.

Definition 6.6. The covariance between two random variables X and Y is given by

  Cov(X, Y) = E[(X − μX)(Y − μY)] = E(X × Y) − μX × μY.

The equality on the right-hand side is given as Theorem 6.1 in your text.

Definition 6.7. The correlation coefficient between two random variables X and Y is given by

  ρ = Cov(X, Y) / √(Var(X) × Var(Y)).

Examples 6.10–6.14 are interesting applications of the above concepts. They are easy and you should be able to understand them. Read them and make sure you can solve such problems; otherwise, come and see me.

Theorem. If two random variables X and Y are independent, then Cov(X, Y) = 0, and hence the correlation between them is zero, i.e. ρ = 0. Note that the opposite is not necessarily true.
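With the joint table of Example 1 in hand, the covariance and correlation just defined can be computed directly. A sketch (mine, using the E(XY) − μX μY form of Theorem 6.1):

```python
# Sketch: Cov(X, Y) and the correlation coefficient for the joint pmf
# of Example 1, via Cov(X, Y) = E(XY) - E(X)E(Y).
from math import sqrt

joint = {
    (1, 0): 0.30, (2, 0): 0.10,
    (1, 1): 0.10, (2, 1): 0.20,
    (1, 2): 0.10, (2, 2): 0.20,
}
EX   = sum(x * p for (x, y), p in joint.items())          # 1.5
EY   = sum(y * p for (x, y), p in joint.items())          # 0.9
EXY  = sum(x * y * p for (x, y), p in joint.items())      # 1.5
VarX = sum(x * x * p for (x, y), p in joint.items()) - EX ** 2
VarY = sum(y * y * p for (x, y), p in joint.items()) - EY ** 2

cov = EXY - EX * EY            # 1.5 - 1.35 = 0.15
rho = cov / sqrt(VarX * VarY)  # positive
```

The nonzero covariance is consistent with the earlier finding that X and Y of Example 1 are not independent (though in general a zero covariance would not prove independence).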
See Example 6.10 for an interesting application that will be used again and again in this course and in future courses. It is given in the following theorem.

Theorem. Let Y1, Y2, …, Yn be independent random variables with mean E(Yi) = μ and variance V(Yi) = σ². Define Ȳ = (1/n) Σ_{i=1}^{n} Yi. Then E(Ȳ) = μ and V(Ȳ) = σ²/n.

The following theorem is needed to prove this one; we will see an alternative, easier proof later.

Theorem 6.2 (simplified). Given two random variables X and Y with means μX and μY and standard deviations σX and σY respectively, for any constants a and b the following are true:
  (a) E(aX + b) = aE(X) + b = aμX + b;
  (b) Var(aX + b) = a² Var(X) = a² σX²;
  (c) Cov(aX, bY) = a × b × Cov(X, Y);
  (d) Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y).

Proofs:

(a) Very easy: just use the definition of expected value and carry out the required summation (or integration).

(b) Again easy: expand (aX + b)² and find its expected value; from E[(aX + b)²] subtract the square of E(aX + b) found in part (a). This gives Var(aX + b) = a² Var(X).

(c) To find Cov(aX, bY) we use the basic definition of the covariance of two random variables, say U and V, with U = aX and V = bY here:

  Cov(U, V) = E[(U − E(U)) × (V − E(V))]
  Cov(aX, bY) = E[(aX − aμX) × (bY − bμY)]
              = ab × E[(X − μX)(Y − μY)]
              = ab × Cov(X, Y).

(d) For Var(aX + bY) we start with the definition of the variance of a random variable U, with U = aX + bY here:

  Var(aX + bY) = E[(aX + bY − E(aX + bY))²]
               = E[(a(X − μX) + b(Y − μY))²]
               = a² E[(X − μX)²] + b² E[(Y − μY)²] + 2ab E[(X − μX)(Y − μY)]
               = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y).

HW: Show that the following are true:
  • Var(aX − bY) = a² Var(X) + b² Var(Y) − 2ab Cov(X, Y);
  • if X and Y are independent, then Var(aX + bY) = a² Var(X) + b² Var(Y);
  • if X and Y are independent, then Var(aX − bY) = a² Var(X) + b² Var(Y).

6.5 Multinomial Distribution

When ALL of the following conditions are satisfied, we have random variables that have the multinomial distribution:
1. The experiment consists of n identical trials.
2. Each trial results in one of k possible outcomes.
3. The probability that a single trial will result in outcome i is p_i, i = 1, 2, …, k,
and remains the same from trial to trial. (Note that p1 + p2 + … + pk = 1.)
4. The trials are independent of each other.
5. The random variables of interest are Y1, Y2, …, Yk, where Yi is the number of times outcome i is observed in the n trials. Note that Y1 + Y2 + … + Yk = n, with Yi having observed values yi that can be 0, 1, 2, …, n subject to the condition that they add up to n, i.e. Σ_{i=1}^{k} yi = n.

When all of the above conditions are satisfied, the joint probability mass function of the random variables Y1, Y2, …, Yk is

  P(Y1 = y1, Y2 = y2, …, Yk = yk) = [n! / (y1! y2! … yk!)] p1^{y1} p2^{y2} … pk^{yk},

where Σ_{i=1}^{k} yi = n and Σ_{i=1}^{k} pi = 1. Then, under these conditions,

  E(Yi) = n pi,  Var(Yi) = n pi (1 − pi),  and  Cov(Yi, Yj) = −n pi pj for i ≠ j.

6.6 More on Moment Generating Functions

We have stated, without proof, that moment generating functions (when they exist) are unique, in the sense that every mgf belongs to one and only one distribution: if two random variables have the same mgf, then they both have the same distribution. This is called the unique identification property of moment generating functions, and it is very useful in finding the distributions of functions of random variables. We will see more of that in the next chapter. Meanwhile, read pages 369 and 370 and attempt all problems at the end of this section to prepare yourself for the next chapter.

Example. Find the moment generating function of Y, the sum of k (k ≥ 2) independent random variables X1, X2, …, Xk, i.e. Y = Σ_{i=1}^{k} Xi, when the distributions and mgfs of the Xi exist. It is easy to show that the mgf of Y is equal to the product of the mgfs of the Xi's:

  M_Y(t) = E(e^{tY}) = E(e^{t(X1 + X2 + … + Xk)})
         = E(e^{tX1} × e^{tX2} × … × e^{tXk})
         = E(e^{tX1}) × E(e^{tX2}) × … × E(e^{tXk})
         = M_{X1}(t) × M_{X2}(t) × … × M_{Xk}(t)
         = Π_{i=1}^{k} M_{Xi}(t).

Note: the expected value of a product of random variables equals the product of the expected values when the random variables are independent; this is used in the third line of the above equalities.

In Chapter 4 we have seen that the exponential distribution is a special case of the gamma distribution.
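The product rule for mgfs in the example above can be checked exactly for a small discrete case. In this sketch (mine, not from the notes) X1 and X2 are iid copies of the X of Example 1, taking values 1 and 2 with probability 1/2 each:

```python
# Sketch: for independent random variables, the mgf of the sum equals
# the product of the mgfs. Here S = X1 + X2 with X1, X2 iid on {1, 2}.
from math import exp, isclose

pmf = {1: 0.5, 2: 0.5}

def mgf(dist, t):
    return sum(p * exp(t * x) for x, p in dist.items())

# pmf of S = X1 + X2 under independence (discrete convolution).
pmf_S = {}
for x1, p1 in pmf.items():
    for x2, p2 in pmf.items():
        pmf_S[x1 + x2] = pmf_S.get(x1 + x2, 0.0) + p1 * p2

# M_S(t) should equal M_X(t)**2 at every t we try.
ok = all(isclose(mgf(pmf_S, t), mgf(pmf, t) ** 2)
         for t in (-1.0, -0.5, 0.0, 0.5, 1.0))
```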
The following theorem shows another relation between these two distributions.

Theorem. Let X1, X2, …, Xk (k ≥ 2) be independent random variables, each having the exponential distribution with parameter θ, and let Y = Σ_{i=1}^{k} Xi. Then Y has the gamma distribution with parameters α = k and β = θ; that is, Y ~ G(k, θ).

Proof. We will prove this theorem using the unique identification property of moment generating functions. Since Xi ~ E(θ) for i = 1, 2, …, k, we have seen in Chapter 4 that

  M_{Xi}(t) = M_X(t) = 1/(1 − θt) = (1 − θt)^{−1}.

Now, using the result of the above example, we can write

  M_Y(t) = Π_{i=1}^{k} M_{Xi}(t) = Π_{i=1}^{k} (1 − θt)^{−1} = (1 − θt)^{−k}.

But this is the mgf of a random variable that has the gamma distribution with α = k and β = θ. Hence, by the unique identification property of mgfs, Y has that distribution, i.e. Y ~ G(k, θ).

6.7 Conditional Expectation

Definition 6.8. If X and Y are any two random variables, the conditional expectation of X given Y = y is defined to be

  E(X | Y = y) = Σ_{all x} x p_{X|Y}(x|y)   if both X and Y are discrete,
  E(X | Y = y) = ∫ x f_{X|Y}(x|y) dx        if both X and Y are continuous.

Similarly, the conditional expectation of Y given X = x is

  E(Y | X = x) = Σ_{all y} y p_{Y|X}(y|x)   if both X and Y are discrete,
  E(Y | X = x) = ∫ y f_{Y|X}(y|x) dy        if both X and Y are continuous.

Note that this definition is very similar to the definitions you have seen in Chapters 3 and 4; the only difference is that the pmf or pdf is replaced with the conditional pmf or pdf.

6.8 Compounding and its Applications

Read and skip.

Chapter 3: Conditional Probability and Independence

Definition 3.1 (conditional probability). If A and B are any two events of the same sample space, then the conditional probability of A given B (i.e. the probability of A given that B has occurred), denoted P(A | B), is

  P(A | B) = P(A ∩ B) / P(B)  when P(B) > 0.

IMPORTANT NOTE: Conditional probabilities satisfy all axioms and rules of probability.

Definition (independence). Two events A and B are said to be independent if

  P(A ∩ B) = P(A) × P(B).
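Definition 3.1 is easy to exercise on a concrete finite sample space. In this sketch (my own example, not from the notes) two fair dice are rolled, A = "the sum is 7" and B = "the first die shows 3"; P(A | B) turns out to equal P(A), so A and B are independent events:

```python
# Sketch: conditional probability on the 36 equally likely outcomes of
# two fair dice, using exact fractions.
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # the whole sample space

def P(event):
    return Fraction(sum(1 for w in S if event(w)), len(S))

def A(w):                 # sum is 7
    return w[0] + w[1] == 7

def B(w):                 # first die shows 3
    return w[0] == 3

def A_and_B(w):
    return A(w) and B(w)

P_A_given_B = P(A_and_B) / P(B)   # = P(A ∩ B) / P(B)
```

Here P(A) = 6/36 = 1/6, P(B) = 1/6, and P(A ∩ B) = 1/36 (only the outcome (3, 4)), so P(A | B) = 1/6 = P(A).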
IMPORTANT NOTE: The following four statements are equivalent, i.e. when any one of them is true, all the others are true; and conversely, when any one of them is false, all the others are false:
  • A and B are independent;
  • P(A | B) = P(A) when P(B) > 0;
  • P(B | A) = P(B) when P(A) > 0;
  • P(A ∩ B) = P(A) × P(B).

Some Rules of Probability

Rule 4 (general multiplication rule). For any two events A and B,

  P(A and B) = P(A ∩ B) = P(A) × P(B | A) = P(B) × P(A | B).

Rule 4a. If A and B are independent events, then P(A and B) = P(A ∩ B) = P(A) × P(B). This is a special case of the general multiplication rule.

3.3 Theorem of Total Probability and Bayes' Rule

Theorem 3.2 (theorem of total probability). Let B1, B2, …, Bk be a collection of mutually exclusive and totally exhaustive events, i.e. Bi ∩ Bj = ∅ when i ≠ j AND B1 ∪ B2 ∪ … ∪ Bk = S. (This is called partitioning the sample space.) Then for any event A ⊆ S,

  P(A) = Σ_{i=1}^{k} P(Bi) × P(A | Bi).

Theorem 3.3 (Bayes' rule). If B1, B2, …, Bk form a partition of the sample space S and A is any event in S, then the probability of any one of these partition events, say Bj, given that A is observed, is

  P(Bj | A) = P(Bj) × P(A | Bj) / [Σ_{i=1}^{k} P(Bi) × P(A | Bi)].

The proof is easy once you see that

  P(A) = Σ_{i=1}^{k} P(A ∩ Bi)            (by Axiom 3)
       = Σ_{i=1}^{k} P(Bi) × P(A | Bi)    (by the multiplication rule).

Read and understand all the examples; ask me if you cannot understand any of them. Skip Section 3.4 and exercises 3.41, 3.44, 3.59 and 3.60. Solve as many of the supplementary exercises as you can.

Revised on September 22, 2009.

Chapter 8: Limiting Distributions

8.1 Introduction

In Chapter 6 we have seen various techniques for finding distributions of functions of random variables. In some cases, however, these may not suffice, and we need other techniques that can give us approximations. For example, in the previous chapter we have seen that the sample mean has a normal distribution when the population has a normal distribution. But what can we say about the distribution of the sample mean if the population has some other distribution? We will see that a strong theorem, called the Central Limit Theorem, gives us the answer for such cases. In this chapter we will see techniques that yield approximate results for very large sample sizes n, i.e. when n goes to (tends to) infinity. Some of these may even yield approximations that are reasonably good even when sample sizes are small.
In the next section we will give some theorems, called limit theorems, that describe properties of random variables as n tends to infinity.

STA4321 Chap 7, Page 1 of 37

8.2 Convergence in Probability

Definition 8.1. The sequence of random variables X1, X2, …, Xn, … is said to converge in probability to a constant c if, for every positive number ε,

  lim_{n→∞} P(|Xn − c| ≥ ε) = 0, or equivalently lim_{n→∞} P(|Xn − c| < ε) = 1.

Now we are ready to use this definition to show a very important property of the sample mean and to prove some theorems.

Theorem 8.1 (Weak Law of Large Numbers). Let X1, X2, …, Xn be a random sample from a population of X's with mean E(X) = μX and finite variance Var(X) = σX² < ∞. Let X̄ = (1/n) Σ_{i=1}^{n} Xi. Then for any positive real number ε,

  lim_{n→∞} P(|X̄ − μX| ≥ ε) = 0.

Proof. We know from Tchebysheff's theorem that for any random variable Y and any k > 1,

  P(|Y − μY| ≤ k σY) ≥ 1 − 1/k².

Now, in the above inequality, replace Y with X̄. From Chapter 7 (you can now prove it),

  E(X̄) = μX  and  Var(X̄) = σX²/n,  so σ_X̄ = σX/√n.

Replacing μY with μX and σY with σX/√n, we obtain

  P(|X̄ − μX| ≤ k σX/√n) ≥ 1 − 1/k².

Since we are free to choose k in any way we want, let's choose it as k = ε√n/σX, so that k σX/√n = ε. This gives

  P(|X̄ − μX| ≤ ε) ≥ 1 − σX²/(n ε²).

Taking the limits of both sides, we obtain

  lim_{n→∞} P(|X̄ − μX| ≤ ε) ≥ lim_{n→∞} [1 − σX²/(n ε²)] = 1.

But the probability of any event cannot be greater than one, so

  lim_{n→∞} P(|X̄ − μX| ≤ ε) = 1.

Using the complement rule, we can also write

  lim_{n→∞} P(|X̄ − μX| ≥ ε) = 0.

In other words, as the sample size increases to infinity, the sample mean converges in probability to the population mean; in symbols, μX = p-lim X̄.

Corollary to Theorem 8.1 (Example 8.1). Let Y ~ B(n, p), where p is the population proportion, n is the sample size, and Y is the number of successes in the sample. Then the sample proportion is p̂ = Y/n, and

  lim_{n→∞} P(|p̂ − p| ≥ ε) = 0.

This states that as the sample size increases, the sample proportion converges in probability to the population proportion; that is, p = p-lim p̂.
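The Weak Law can be watched in action by simulation. In this sketch (mine; the sample sizes, ε, and the repetition count are arbitrary choices) the miss frequency P(|X̄ − μ| ≥ ε) is estimated for Uniform(0, 1) data, where μ = 0.5:

```python
# Sketch: the frequency with which the sample mean misses mu = 0.5 by
# at least eps should shrink toward 0 as the sample size n grows.
import random
random.seed(0)

def miss_freq(n, reps=2000, eps=0.05):
    misses = 0
    for _ in range(reps):
        xbar = sum(random.random() for _ in range(n)) / n
        if abs(xbar - 0.5) >= eps:
            misses += 1
    return misses / reps

f_small = miss_freq(10)     # moderate miss frequency
f_large = miss_freq(1000)   # essentially zero
```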
Theorem 8.1b (Strong Law of Large Numbers, or almost sure convergence). Let X1, X2, …, Xn be a random sample from a population of X's with mean E(X) = μ and finite variance Var(X) = σX² < ∞, and let X̄ = (1/n) Σ_{i=1}^{n} Xi. Then

  P(lim_{n→∞} X̄ = μ) = 1.

Theorem 8.2. Suppose Xn converges in probability to μX and Yn converges in probability to μY. Then ALL of the following are true:
1. Xn + Yn converges in probability to μX + μY.
2. Xn × Yn converges in probability to μX × μY.
3. Xn / Yn converges in probability to μX / μY, provided μY ≠ 0.
4. √(Xn) converges in probability to √(μX), provided P(Xn ≥ 0) = 1.

Example 8.2. Suppose X1, X2, …, Xn are iid random variables with the first four moments finite. Then the sample variance

  S² = [1/(n − 1)] Σ_{i=1}^{n} (Xi − X̄)²

converges in probability to Var(Xi). (Proof: skip. Note that S² and S′² differ only in their denominators, n − 1 versus n.)

8.3 Convergence in Distribution

Note that in the previous section we talked about random variables converging to some constant, but said nothing about what happens to the distribution of the random variable as n increases to infinity. Here we will talk about that.

Definition 8.2. Let Yn and Y be random variables with distribution functions F_{Yn}(y) and F_Y(y) respectively. If

  lim_{n→∞} F_{Yn}(y) = F_Y(y)

for every y at which F_Y is continuous, then Yn is said to converge in distribution to Y, and F_Y(y) is called the limiting distribution function of Yn.

It is sometimes easier to work with moment generating functions, using the unique identification property of mgfs to find the limiting distribution of a random variable, as a result of the following theorem.

Theorem 8.3. Let Yn and Y be random variables with moment generating functions M_{Yn}(t) and M_Y(t) respectively. If

  lim_{n→∞} M_{Yn}(t) = M_Y(t) for all real t,

then Yn converges in distribution to Y.
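Theorem 8.3 is what drives, for example, the Poisson approximation to the binomial proved in the next example; numerically the two pmfs are already very close for large n and small p. A sketch (my own parameter choices):

```python
# Sketch: comparing the Binomial(n, p) pmf with the Poisson(lam) pmf
# for lam = n * p, with n large and p small.
from math import comb, exp, factorial

n, p = 10_000, 0.0003
lam = n * p                  # 3.0

def binom_pmf(k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k):
    return exp(-lam) * lam ** k / factorial(k)

# Largest pointwise gap over the region carrying almost all the mass.
max_gap = max(abs(binom_pmf(k) - poisson_pmf(k)) for k in range(30))
```

The gap shrinks further as n grows with λ = np held fixed, which is exactly the limit statement proved below.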
Example 8.4 (Poisson approximation to the binomial distribution). This is an application of Theorem 8.3 that proves what we stated in Chapter 3 about the relation between the binomial and Poisson distributions: when n goes to infinity and p goes to zero in such a way that λ = np remains constant, the binomial distribution converges to the Poisson distribution.

Proof. We know that when Xn ~ B(n, p), its moment generating function is

  Mn(t) = (p e^t + 1 − p)^n, which can be rewritten as Mn(t) = [1 + p(e^t − 1)]^n.

Now let λ = np, so that p = λ/n. Substituting this in the above relation and taking the limits of both sides, we get

  lim_{n→∞} Mn(t) = lim_{n→∞} [1 + λ(e^t − 1)/n]^n = e^{λ(e^t − 1)},

where in the last line we have used the definition of e^k, namely e^k = lim_{n→∞} (1 + k/n)^n. But the right-hand side of the above equality is the mgf of a random variable that has the Poisson distribution with parameter λ. Hence, by the unique identification property of mgfs and by Theorem 8.3, a random variable Xn that has the binomial distribution with parameters n and p converges in distribution to a Poisson random variable with parameter λ = n × p.

Example 8.5. The author shows that standardizing a random variable that has the Poisson distribution yields a new random variable whose distribution approaches the standard normal for large λ. Can you follow the proof? Just write

  Y = (X − λ)/√λ, so that Y = aX + b with a = 1/√λ and b = −√λ.

Now show that

  M_Y(t) = E(e^{tY}) = E(e^{t(aX + b)}) = e^{bt} E(e^{atX}) = e^{bt} M_X(at).

Substituting the values of a and b, and M_X(t) = e^{λ(e^t − 1)}, we get

  M_Y(t) = e^{−√λ t} × e^{λ(e^{t/√λ} − 1)}.

After this point you need a bit more algebraic manipulation to show that M_Y(t) → e^{t²/2} as λ → ∞. But e^{t²/2} is the mgf of a random variable that has the standard normal distribution; hence, by the unique identification property of mgfs, Y ≈ N(0, 1). Although this example deals with the Poisson distribution, the same behavior holds for many other distributions, and is stated as the Central Limit Theorem, covered in the next section.

8.4 The Central Limit Theorem

This is the most frequently used theorem by many people in many fields.
Here is the formal statement of the theorem.

Theorem 8.4 (Central Limit Theorem). Let X1, X2, …, Xn be independent and identically distributed random variables with E(Xi) = μ and Var(Xi) = σ² < ∞. Let X̄ = (1/n) Σ_{i=1}^{n} Xi, and define

  Yn = (X̄ − μ) / (σ/√n) = Σ_{i=1}^{n} (Xi − μ) / (σ√n).

Then Yn converges in distribution to a standard normal random variable. (This is slightly more general than saying that X1, X2, …, Xn is a random sample from a population with mean μ and standard deviation σ; in many applications we do have a random sample of size n. Note that the distribution of the population is unknown; we know only its mean and standard deviation.)

Proof. Read pages 443 and 444, where the method of moment generating functions is used to sketch a proof of the above theorem. The bottom line is

  lim_{n→∞} M_{Yn}(t) = e^{t²/2}.

Corollary to Theorem 8.4. Given a random sample of size n from a population with mean μ and finite variance σ², but with unknown or non-normal distribution, the distribution of the sample mean approaches the normal distribution with mean μ and variance σ²/n for large n. This corollary is in fact a restatement of the Central Limit Theorem (CLT). "Large n" means n ≥ 30 in general, but in some cases may be much less. See www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html.

Let's compare the CLT with a corollary to Theorem 7.4 seen in Chapter 7.

Rule 3 (Corollary 2 to Theorem 7.4) states: IF a random sample of size n is selected from a population that has the normal distribution with mean μ and standard deviation σ, THEN X̄ has a normal distribution with mean μ and standard deviation σ/√n. That is, X̄ ~ N(μ, σ/√n) if X ~ N(μ, σ). In this corollary, observe the silence on the sample size: the corollary is true for ANY sample size. Compare this with the CLT (Rule 4), which says X̄ ≈ N(μ, σ/√n) approximately if a large random sample is selected from ANY population with mean μ and standard deviation σ. Notice the silence on the distribution of the population.
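The CLT is easy to see by simulation. In this sketch (mine; the exponential population, n, the cut point, and the repetition count are all arbitrary choices) the standardized sample mean of a strongly skewed population already behaves like N(0, 1):

```python
# Sketch: for Exponential(mean 1) data (mu = sigma = 1), the fraction
# of standardized sample means Y_n <= 1 should be close to
# Phi(1) ~ 0.8413 even for a moderate sample size.
import random
from math import sqrt
random.seed(1)

n, reps = 50, 4000
mu = sigma = 1.0

count = 0
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    y = (xbar - mu) / (sigma / sqrt(n))   # standardized sample mean
    if y <= 1.0:
        count += 1
frac = count / reps
```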
• What are the important differences between these two rules?
• Which one of these rules can we use where, and why? Why not the other one?
• If both could be used, which one should we prefer to use: normal theory or the CLT?

                            Normal theory             CLT
  Distribution of           Normal: X ~ N(μ, σ)       Any, with known μ and σ
  the population
  Sample size               Any n ≥ 1                 Large: n ≥ 30
  Distribution of           X̄ ~ N(μ, σ/√n)           X̄ ≈ N(μ, σ/√n)
  sample means              exactly                   approximately

Before we go further, let's revise some of the symbols and rules we have been (or will be) using.

Definition of some symbols (note the similarities and differences):

N = population size: the number of elements in the population. (Assumed to be infinite in many theoretical studies, but finite in all real-life problems.)

n = sample size: the number of population units in the sample.

μX = E(X) = mean of the population of X's. (The subscript indicating the population, X, Y, etc., will be dropped when there is no possible confusion.)

  μX = Σ_{all x} x P(X = x)          if X is a discrete rv,
  μX = ∫_{−∞}^{∞} x f_X(x) dx       if X is a continuous rv.

σX² = variance of the population of X's:

  σX² = Σ_{all x} (x − μX)² P(X = x)          if X is a discrete rv,
  σX² = ∫_{−∞}^{∞} (x − μX)² f_X(x) dx       if X is a continuous rv.

X̄ = (1/n) Σ_{i=1}^{n} Xi = mean of the random sample of size n from the population of X's.

μ_X̄ = E(X̄) = mean of the population of sample means = μ, by Rule 1:

  μ_X̄ = Σ_{all x̄} x̄ P(X̄ = x̄)        if X̄ is a discrete rv,
  μ_X̄ = ∫ x̄ f_X̄(x̄) dx̄              if X̄ is a continuous rv.

σ²_X̄ = variance of the population of sample means = σX²/n, by Rule 2:

  σ²_X̄ = Σ_{all x̄} (x̄ − μ_X̄)² P(X̄ = x̄)   if X̄ is a discrete rv,
  σ²_X̄ = ∫ (x̄ − μ_X̄)² f_X̄(x̄) dx̄          if X̄ is a continuous rv.

π = proportion of successes in the population = (number of successes in the population)/N.

p = proportion of successes in the sample = (number of successes in the sample)/n.

μ_p = mean of the population of sample proportions = E(p) = π, by Rule 6.

σ²_p = variance of the population of sample proportions = π(1 − π)/n, by Rule 9.
Table 1: Some rules that will be used. Before you use any rule or theorem, ALWAYS first check that its conditions are satisfied.

  Rule | Conditions                  | Result
  1    | Always true                 | μ_X̄ = μ_X
  2    | Always true                 | σ_X̄ = σ_X/√n
  3    | IF X ~ N(μ, σ)              | THEN X̄ ~ N(μ_X, σ_X/√n) exactly
  4    | (CLT) IF n ≥ 30             | THEN X̄ ≈ N(μ_X, σ_X/√n) approximately
  5    | IF a random variable (rv)   | THEN Z = (rv − mean of rv)/(st. dev. of rv)
       | has a normal distribution   |      ~ N(0, 1) (general case)

Table 1, continued:

  Rule | Conditions                              | Result
  6    | Always true                             | μ_p = E(p) = π
  7    | IF X ~ B(n, π)                          | THEN μ_X = E(X) = nπ
  8    | IF X ~ B(n, π)                          | THEN σ²_X = Var(X) = nπ(1 − π)
  9    | Always true                             | σ²_p = π(1 − π)/n
  10   | IF nπ ≥ 10 and n(1 − π) ≥ 10            | THEN p ≈ N(π, √(π(1 − π)/n))
  11   | IF X ~ B(n, π), nπ ≥ 10, n(1 − π) ≥ 10  | THEN X ≈ N(nπ, √(nπ(1 − π)))
       |                                         | (normal approximation to the binomial)

The examples and problems solved below are all applications of the above rules. Please note the steps involved (with reasons given wherever possible) in solving these problems. You are expected to show all these steps, with sketches, in tests and homework.

Example. Let X = fracture strength of glass. We are given μX = 14 and σX = 2. A random sample of size 100 is selected from this population, whose distribution is unknown.

1. We are asked to find P(X̄ > 14.5). We know that

  μ_X̄ = μX = 14 (by Rule 1)  and  σ_X̄ = σX/√n = 2/√100 = 0.2 (by Rule 2).

VIP (very important point): these two statements say nothing about the shape of the distribution of X̄, but we do need to know that distribution. Here Rules 3 or 4 will help us. We cannot use Rule 3 here (why not?), but since the sample size is large (n = 100 > 30) we can use the CLT. Thus X̄ ≈ N(14, 0.2), by Rules 1, 2 and 4. Hence (put a sketch here)

  P(X̄ > 14.5) = 0.5 − P(14.0 < X̄ < 14.5)   (why?)
              = 0.5 − P((14.0 − 14.0)/0.2 < Z < (14.5 − 14.0)/0.2)
              = 0.5 − P(0 < Z < 2.50)
              = 0.5 − 0.4938 = 0.0062.

Example (modified). Suppose we continue with the above problem with
some modifications.

3. What will change if we are told, in addition to everything else, that the distribution of the fracture strength of the glass is normal? Not much will change, except our reasoning for the distribution of X̄: now X̄ has an exact normal distribution with mean 14 and standard deviation 0.2, by Rules 1, 2 and 3. VIP: note that we can use Rule 3 because we know that the population from which the sample was selected has a normal distribution. Although we could also have used Rule 4 (since n > 30), we do not, because Rule 3 is better: it gives the exact distribution of X̄, whereas Rule 4 gives an approximation.

4. Suppose the sample size is 16 and the population has a normal distribution with mean 14 and standard deviation 2. Now what will change? We have a number of changes:
(a) The mean of X̄ will still be 14, by Rule 1. But the standard deviation of X̄ changes, since n has changed: σ_X̄ = 2/√16 = 1/2, by Rule 2. Also, since we know that the population has a normal distribution, X̄ will have a normal distribution by Rule 3; i.e. X̄ ~ N(14, 1/2).
(b) Thus

  P(X̄ > 14.5) = 0.5 − P(14.0 < X̄ < 14.5)
              = 0.5 − P((14.0 − 14.0)/0.5 < Z < (14.5 − 14.0)/0.5)
              = 0.5 − P(0 < Z < 1.00)
              = 0.5 − 0.3413 = 0.1587.

What is the effect of increasing or decreasing the sample size? Remember that when n = 100, in part 1, we had P(X̄ > 14.5) = 0.5 − 0.4938 = 0.0062.

5. Suppose we still have a sample of size 16, but we are not told anything about the population distribution. Now what shall we do?
(a) We still have E(X̄) = μX by Rule 1 and Var(X̄) = σX²/n by Rule 2; these rules are always true.
(b) We still need the distribution of X̄, and here we have a problem.
(c) We cannot use Rule 3, since we do not know the distribution of the population.
(d) We cannot use Rule 4, since n = 16 < 30.
(e) Well, if we must solve this problem, then we can do so only by assuming that the population has a normal distribution.
(f) Then X̄ has a normal distribution with mean 14 and standard deviation 1/2, by Rule 3. The rest of the
problem is as in part 4, giving P(X̄ > 14.5) = 0.5 − 0.3413 = 0.1587.

6. Now we are being difficult. Suppose we still have a sample of size 16 AND we know that the population distribution is NOT NORMAL. Now what shall we do?

a) Rules 1 and 2 are valid, since they are always true, but they say nothing about the distribution of X̄.
b) Furthermore, we cannot use Rule 3 since the population distribution is not normal; we cannot use Rule 4 since n = 16 < 30; and
c) we cannot assume that the population has a normal distribution, since we know that it is not.

STA4321 Chap 7 Page 22 of 37

But we must have the distribution of X̄ to solve the problem.

d) The techniques you've learned are not sufficient to answer such questions. There are alternative methods, called nonparametric statistics, that may be used in such cases.

Let's look at some problems that you CAN solve.

STA4321 Chap 7 Page 23 of 37

Example 8.7. Let X = amount of liquid in a bottle.
Given: σ_X = 1 ounce; sample size n = 25; distribution of the population is unknown; mean of the population μ_X is not specified.

By Rule 1, μ_X̄ = μ_X, and by Rule 2, σ_X̄ = σ_X/√n = 1/√25 = 0.2.

IMPORTANT: We need the distribution of X̄. However, to find the distribution of X̄ we cannot use Rule 3, since the population distribution is unknown. On the other hand, we cannot use Rule 4, since n = 25 < 30. What can we do? The only solution is to assume either that the population has a normal distribution, or that 25 is large enough to use the CLT. Thus X̄ ~ N(μ, 0.2), by Rules 1, 2 and 3, or by Rules 1, 2 and 4, depending on your assumption.

STA4321 Chap 7 Page 24 of 37

We are asked to find P(X̄ is within 0.3 ounce of the true population mean), i.e.,

P(|X̄ − μ| ≤ 0.3) = P(−0.3 ≤ X̄ − μ ≤ 0.3).

Dividing each term in the above brackets by 0.2, the standard deviation of X̄, we obtain

P(−0.3/0.2 ≤ (X̄ − μ_X̄)/σ_X̄ ≤ 0.3/0.2)
= P(−1.50 ≤ Z ≤ 1.50)
= 2 × P(0 ≤ Z ≤ 1.50)   (Why?)
= 2 × 0.4332 = 0.8664.

Example. Let X = achievement test scores.
Given: μ = 60, σ²_X = 64 (σ_X = 8), n = 100, x̄ = 58. Population distribution is unknown.
Question: Is there evidence that this high school class is substandard?
To answer this question we will find the probability of observing a
sample mean that is as low as 58 when the population from which the sample is selected has mean 60 and standard deviation 8; i.e., find P(X̄ ≤ 58) given μ = 60, σ = 8.

STA4321 Chap 7 Page 25 of 37

If P(X̄ ≤ 58) is large, then we say that this class is not below the standard of the rest of the population: we had such a low sample mean simply by chance. On the other hand, if P(X̄ ≤ 58) is very small, then observing such a small (or smaller) sample mean cannot be due to chance; the population from which the sample is selected must have a mean below 60. We will assume that the population variance is still 64.

Using the principle of the general law "innocent until proven guilty," we will assume that the population from which the sample is selected is not different from the rest of the seniors in other schools; i.e., we will assume that the population mean μ_X is still 60 and the population standard deviation σ_X = 8. Then E(X̄) = 60 and Var(X̄) = 64/100 = 0.64, by Rules 1 and 2. Although we cannot use Rule 3 (because the population distribution is unknown), we can use Rule 4 (since n = 100 > 30) and state that X̄ has an approximate normal distribution with mean 60 and standard deviation 0.8.

STA4321 Chap 7 Page 26 of 37

Hence
P(X̄ ≤ 58) = P(X̄ ≥ 62)   (by symmetry)
= 0.5 − P(60 ≤ X̄ ≤ 62)   (see sketch)
= 0.5 − P(0 ≤ Z ≤ 2.50)   (by standardization)
= 0.5 − 0.4938 = 0.0062.

Since the probability of observing "what we have observed or more extreme" is very small, it is very unlikely to observe such a sample from a population that has mean 60. Hence we conclude that there is strong evidence in the sample data to indicate that this class is substandard.

Welcome to the field of inferential statistics! You will see more of such problems in the next course. However, in this paragraph observe a few important points. First, the words in quotation marks define what is known as the p-value of the test. Also note that the conclusion does not say that the population IS substandard, just that there is some strong evidence in the sample data to indicate so. In order to make definite statements we must observe the
complete population, i.e., have a census of the population rather than a sample.

STA4321 Chap 7 Page 27 of 37

Distribution of the sample proportion p

Suppose we have n independent repetitions of a Bernoulli experiment with π = P(Success).* Let X_i be the outcome of the i-th repetition of this experiment, with X_i = 1 if the i-th repetition is a success and 0 otherwise. Then Y = X_1 + X_2 + … + X_n is the number of successes in n repetitions of this experiment. We know from Chapter 3 that Y has the binomial distribution: Y ~ B(n, π).

Let's define "Success" as the presence of a characteristic of interest in the elements of a population of size N, where the i-th population unit either has that characteristic (denoted by X_i = 1) or does not have it (denoted by X_i = 0).

* Your text uses p for the population proportion (or the probability of success) and p̂ for the sample proportion (or sample estimate of p). Since it is difficult to type p̂, I will use π to denote the population proportion and p to denote the sample proportion.

STA4321 Chap 7 Page 28 of 37

Then P(X_i = 1) = π, where π is the proportion of population units that have the characteristic, i.e.,

π = (number of population units that have the characteristic)/N = (X_1 + X_2 + … + X_N)/N.

This means that the population proportion is in fact a special case of the population mean, special in the sense that the random variable can have a value of either 1 or 0. On the other hand, p, the sample proportion, is simply the proportion of sample units (the population units that are selected into the random sample) that have the characteristic of interest. Thus

p = (number of sample units that have the characteristic)/n = (X_1 + X_2 + … + X_n)/n = Y/n = X̄.

In other words, the sample proportion is a special case of the sample mean of a random variable that can have only two possible values, 0 and 1.

STA4321 Chap 7 Page 29 of 37

The random variable Y is the number of successes in the sample and has the binomial distribution Y ~ B(n, π). Hence, using the rules and theorems we've seen until now, we can write the following rules.

Rule 6: E(p) = E(Y/n) = π.   (Always true,
with Y, p and π defined as above.)

Rule 7: IF Y ~ B(n, π) THEN E(Y) = n × π.

Rule 8: IF Y ~ B(n, π) THEN Var(Y) = n × π × (1 − π).

Rule 9: IF Y ~ B(n, π) AND p = Y/n THEN σ_p = √(π(1 − π)/n).

STA4321 Chap 7 Page 30 of 37

Rule 10: IF Y ~ B(n, π) AND p = Y/n AND n × π ≥ 10 AND n × (1 − π) ≥ 10 THEN p ≈ N(π, √(π(1 − π)/n)).

Rule 11: IF Y ~ B(n, π) AND n × π ≥ 10 AND n × (1 − π) ≥ 10 THEN Y ≈ N(nπ, √(nπ(1 − π))).

The last rule, Rule 11, is called the normal approximation to the binomial distribution; Rule 10 is the result of this approximation.

Criteria for approximation: The general rule for using the normal approximation to the binomial distribution is "use the normal approximation when n is large and π is neither too small nor too large." Since "large," "too small" and "too large" are vague terms, the following rules of thumb are suggested:

STA4321 Chap 7 Page 31 of 37

1. When π ± 2√(π(1 − π)/n) is inside the interval (0, 1).
2. When both n × π ≥ 15 AND n × (1 − π) ≥ 15.
3. When both n × π ≥ 10 AND n × (1 − π) ≥ 10.
4. When both n × π ≥ 5 AND n × (1 − π) ≥ 5.

Remember that they are all rules of thumb to improve the approximation.

Using continuity correction: In some applications we use what is called the continuity correction to get a better approximation. This is because Y is a discrete random variable and we are using a continuous distribution to approximate the distribution of Y. The following list may help.

STA4321 Chap 7 Page 32 of 37

Given Y ~ B(n, π) such that n × π ≥ 10 AND n × (1 − π) ≥ 10, and an integer y:

When we're asked to find P(Y ≤ y): we use the normal approximation and find P(Y < y + 1/2).
When we're asked to find P(Y ≥ y): we use the normal approximation and find P(Y > y − 1/2).
When we're asked to find P(Y > y): we use the normal approximation and find P(Y > y + 1/2).
When we're asked to find P(Y < y): we use the normal approximation and find P(Y < y − 1/2).

Be careful: when Y is not integer-valued (i.e., Y is a continuous random variable), we do not need to make any continuity correction.

STA4321 Chap 7 Page 33 of 37

Example 8.9. Given: n = 100; π = 0.2 = proportion of nonconformance among wafers in the population; Y = number of nonconforming wafers in the sample; the lot is accepted if Y ≤ 12. Then Y ~ B(100, 0.2). (Why?)
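The acceptance probability in Example 8.9 can be checked numerically. The sketch below (Python, standard library only; the values n = 100, π = 0.2 and the 12.5 cutoff come from the example, while the variable names are just illustrative) compares the exact binomial sum with the continuity-corrected normal approximation given by Rule 11:

```python
from math import comb, sqrt
from statistics import NormalDist

n, pi = 100, 0.20                              # Example 8.9: sample size and population proportion
mu, sigma = n * pi, sqrt(n * pi * (1 - pi))    # mean 20 (Rule 7) and sd 4 (Rule 8)

# Exact binomial probability P(Y <= 12).
exact = sum(comb(n, y) * pi**y * (1 - pi)**(n - y) for y in range(13))

# Normal approximation with continuity correction: P(Y <= 12) ~ P(Y < 12.5).
approx = NormalDist(mu, sigma).cdf(12.5)

print(round(exact, 4), round(approx, 4))
```

The two numbers agree to within about half a percentage point, which illustrates why the approximation (with continuity correction) is considered adequate here.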
Required: To find P(Acceptance) when π = 0.20; that is, find P(Y ≤ 12) when n = 100 and π = 0.20.

Using the binomial distribution, the probability is

P(Y ≤ 12) = Σ_{y=0}^{12} C(100, y) (0.20)^y (0.80)^(100−y).

However, this is long and tedious to calculate. Instead, we may find the probability approximately by using the normal approximation to the binomial distribution.*

* Please see the note on page 22 for the definitions of π (p in your text) and p (p̂ in your text).

STA4321 Chap 7 Page 34 of 37

As always, we MUST check if the conditions for using this approximation are satisfied. One way to do that is to see if π ± 2√(π(1 − π)/n) is inside the interval from zero to one. Substituting π = 0.2 and n = 100 yields

0.20 ± 2√(0.20 × 0.80/100) = 0.20 ± 0.08, i.e., (0.12, 0.28).

Since (0.12, 0.28) is inside the interval (0, 1), we can use the normal approximation to the binomial distribution. Then

μ_Y = E(Y) = n × π = 20   (by Rule 7), and
σ_Y = √(nπ(1 − π)) = √16 = 4   (by Rule 8), and
Y ≈ N(20, 4) approximately   (by Rule 11).

Alternatively, we see that n × π = 100 × 0.20 = 20 > 10 AND n × (1 − π) = 80 > 10; hence we can use the normal approximation to the binomial distribution, since the conditions of Rule 11 are satisfied.

STA4321 Chap 7 Page 35 of 37

P(Y ≤ 12) ≈ P(Y < 12.5)   (using continuity correction)
= P(Z < (12.5 − 20)/4)   (by standardization)
= P(Z < −1.88)
= 0.5 − P(0 ≤ Z ≤ 1.88)
= 0.5 − 0.4699 = 0.0301.

Example (normal approximation to the binomial). Let X = number of voters in the sample who favor candidate A. We are given n = 100 and π = 0.50, and asked to find P(p ≥ 0.55). Now

μ_p = π = 0.50   (by Rule 6), and
σ_p = √(π(1 − π)/n) = √(0.50 × (1 − 0.50)/100) = 0.05   (by Rule 9).

Since n × π = 50 > 10 and n × (1 − π) = 50 > 10, we can use Rule 10 and state that p ≈ N(0.50, 0.05) approximately.

STA4321 Chap 7 Page 36 of 37
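These notes break off just before evaluating P(p ≥ 0.55). Under Rule 10, the remaining step is P(p ≥ 0.55) = P(Z ≥ (0.55 − 0.50)/0.05) = P(Z ≥ 1.00) = 0.5 − 0.3413 = 0.1587. A minimal Python sketch (standard library only; variable names are illustrative) confirms this and compares it with the exact binomial tail:

```python
from math import comb, sqrt
from statistics import NormalDist

n, pi = 100, 0.50                    # voters example: n = 100, population proportion 0.50
mu_p = pi                            # E(p) = pi by Rule 6
sigma_p = sqrt(pi * (1 - pi) / n)    # sd of p = 0.05 by Rule 9

# Rule 10: p ~ N(0.50, 0.05) approximately, so
# P(p >= 0.55) = P(Z >= (0.55 - 0.50)/0.05) = P(Z >= 1.00).
approx = 1 - NormalDist(mu_p, sigma_p).cdf(0.55)

# Exact check: p >= 0.55 means Y = n*p >= 55 successes out of 100.
exact = sum(comb(n, y) * pi**y * (1 - pi)**(n - y) for y in range(55, n + 1))

print(round(approx, 4))   # 0.1587, matching the Z-table value 0.5 - 0.3413
print(round(exact, 4))
```

Note that the exact tail is somewhat larger (roughly 0.18) because no continuity correction was applied in the approximation; applying the P(Y > y − 1/2) rule from page 32 would bring the two values much closer.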