Textbook Solutions for Introduction to Probability
Question
Alice and Bob arrange to meet for lunch on a certain day at noon. However, neither is known for punctuality. They both arrive independently at uniformly distributed times between noon and 1 pm on that day. Each is willing to wait up to 15 minutes for the other to show up. What is the probability they will meet for lunch that day?
Solution
Step 1 of 2
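Define the random variables $X$ and $Y$, which represent the arrival times of Alice and Bob, respectively, in minutes after noon.
We are given that $X$ and $Y$ are independent, each with the $\text{Unif}(0, 60)$ distribution.
We need to find $P(|X - Y| \le 15)$.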
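As a numerical sanity check on this setup, the sketch below simulates the two arrival times and compares the estimate with a geometric calculation. It assumes the arrival times are measured in minutes after noon and are independent Unif(0, 60); the function names, trial count, and seed are illustrative choices, not part of the textbook solution. The two of them fail to meet exactly when the point of arrival times lands in one of two corner triangles of the 60 by 60 square, each with legs of length 45, which gives 1 - (45/60)^2 = 7/16.

```python
import random


def estimate_meeting_probability(trials=200_000, wait=15.0, window=60.0, seed=0):
    """Monte Carlo estimate of P(|X - Y| <= wait) for X, Y i.i.d. Unif(0, window)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    meetings = 0
    for _ in range(trials):
        alice = rng.uniform(0.0, window)  # Alice's arrival, minutes after noon
        bob = rng.uniform(0.0, window)    # Bob's arrival, minutes after noon
        if abs(alice - bob) <= wait:
            meetings += 1
    return meetings / trials


def exact_meeting_probability(wait=15.0, window=60.0):
    """Area argument: 1 minus the two corner triangles, relative to the square."""
    return 1.0 - ((window - wait) / window) ** 2


if __name__ == "__main__":
    print("simulated:", estimate_meeting_probability())
    print("exact:    ", exact_meeting_probability())
```

The simulated value should sit close to the exact one; any sizeable gap would suggest the event or the distribution was set up incorrectly.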
Chapter 7 textbook questions
Chapter 7: Problem 1 Introduction to Probability 1
Chapter 7: Problem 2 Introduction to Probability 1
Alice, Bob, and Carl arrange to meet for lunch on a certain day. They arrive independently at uniformly distributed times between 1 pm and 1:30 pm on that day. (a) What is the probability that Carl arrives first? For the rest of this problem, assume that Carl arrives first at 1:10 pm, and condition on this fact. (b) What is the probability that Carl will have to wait more than 10 minutes for one of the others to show up? (So consider Carl's waiting time until at least one of the others has arrived.) (c) What is the probability that Carl will have to wait more than 10 minutes for both of the others to show up? (So consider Carl's waiting time until both of the others have arrived.) (d) What is the probability that the person who arrives second will have to wait more than 5 minutes for the third person to show up?
Chapter 7: Problem 3 Introduction to Probability 1
One of two doctors, Dr. Hibbert and Dr. Nick, is called upon to perform a series of n surgeries. Let H be the indicator r.v. for Dr. Hibbert performing the surgeries, and suppose that E(H) = p. Given that Dr. Hibbert is performing the surgeries, each surgery is successful with probability a, independently. Given that Dr. Nick is performing the surgeries, each surgery is successful with probability b, independently. Let X be the number of successful surgeries. (a) Find the joint PMF of H and X. (b) Find the marginal PMF of X. (c) Find the conditional PMF of H given X = k.
Chapter 7: Problem 4 Introduction to Probability 1
A fair coin is flipped twice. Let X be the number of Heads in the two tosses, and Y be the indicator r.v. for the tosses landing the same way. (a) Find the joint PMF of X and Y. (b) Find the marginal PMFs of X and Y. (c) Are X and Y independent? (d) Find the conditional PMFs of Y given X = x and of X given Y = y.
Chapter 7: Problem 5 Introduction to Probability 1
A fair die is rolled, and then a coin with probability p of Heads is flipped as many times as the die roll says, e.g., if the result of the die roll is a 3, then the coin is flipped 3 times. Let X be the result of the die roll and Y be the number of times the coin lands Heads. (a) Find the joint PMF of X and Y. Are they independent? (b) Find the marginal PMFs of X and Y. (c) Find the conditional PMFs of Y given X = x and of X given Y = y.
Chapter 7: Problem 6 Introduction to Probability 1
A committee of size k is chosen from a group of n women and m men. All possible committees of size k are equally likely. Let X and Y be the numbers of women and men on the committee, respectively. (a) Find the joint PMF of X and Y. Be sure to specify the support. (b) Find the marginal PMF of X in two different ways: by doing a computation using the joint PMF, and using a story. (c) Find the conditional PMF of Y given that X = x.
Chapter 7: Problem 7 Introduction to Probability 1
A stick of length L (a positive constant) is broken at a uniformly random point X. Given that X = x, another breakpoint Y is chosen uniformly on the interval [0, x]. (a) Find the joint PDF of X and Y. Be sure to specify the support. (b) We already know that the marginal distribution of X is Unif(0, L). Check that marginalizing out Y from the joint PDF agrees that this is the marginal distribution of X. (c) We already know that the conditional distribution of Y given X = x is Unif(0, x). Check that using the definition of conditional PDFs (in terms of joint and marginal PDFs) agrees that this is the conditional distribution of Y given X = x. (d) Find the marginal PDF of Y. (e) Find the conditional PDF of X given Y = y.
Chapter 7: Problem 8 Introduction to Probability 1
(a) Five cards are randomly chosen from a standard deck, one at a time with replacement. Let X, Y, Z be the numbers of chosen queens, kings, and other cards. Find the joint PMF of X, Y, Z. (b) Find the joint PMF of X and Y. Hint: In summing the joint PMF of X, Y, Z over the possible values of Z, note that most terms are 0 because of the constraint that the number of chosen cards is five. (c) Now assume instead that the sampling is without replacement (all 5-card hands are equally likely). Find the joint PMF of X, Y, Z. Hint: Use the naive definition of probability.
Chapter 7: Problem 9 Introduction to Probability 1
Let X and Y be i.i.d. Geom(p), and N = X + Y. (a) Find the joint PMF of X, Y, N. (b) Find the joint PMF of X and N. (c) Find the conditional PMF of X given N = n, and give a simple description in words of what the result says.
Chapter 7: Problem 10 Introduction to Probability 1
Let X and Y be i.i.d. $\text{Expo}(\lambda)$, and T = X + Y. (a) Find the conditional CDF of T given X = x. Be sure to specify where it is zero. (b) Find the conditional PDF $f_{T|X}(t|x)$, and verify that it is a valid PDF. (c) Find the conditional PDF $f_{X|T}(x|t)$, and verify that it is a valid PDF. Hint: This can be done using Bayes' rule without having to know the marginal PDF of T, by recognizing what the conditional PDF is up to a normalizing constant; then the normalizing constant must be whatever is needed to make the conditional PDF valid. (d) In Example 8.2.4, we will show that the marginal PDF of T is $f_T(t) = \lambda^2 t e^{-\lambda t}$, for t > 0. Give a short alternative proof of this fact, based on the previous parts and Bayes' rule.
Chapter 7: Problem 11 Introduction to Probability 1
Let X, Y, Z be r.v.s such that $X \sim \mathcal{N}(0, 1)$ and conditional on X = x, Y and Z are i.i.d. $\mathcal{N}(x, 1)$. (a) Find the joint PDF of X, Y, Z. (b) By definition, Y and Z are conditionally independent given X. Discuss intuitively whether or not Y and Z are also unconditionally independent. (c) Find the joint PDF of Y and Z. You can leave your answer as an integral, though the integral can be done with some algebra (such as completing the square) and facts about the Normal distribution.
Chapter 7: Problem 12 Introduction to Probability 1
Let $X \sim \text{Expo}(\lambda)$, and let c be a positive constant. (a) If you remember the memoryless property, you already know that the conditional distribution of X given X > c is the same as the distribution of c + X (think of waiting c minutes for a success and then having a fresh $\text{Expo}(\lambda)$ additional waiting time). Derive this in another way, by finding the conditional CDF of X given X > c and the conditional PDF of X given X > c. (b) Find the conditional CDF of X given X
Chapter 7: Problem 13 Introduction to Probability 1
Let X and Y be i.i.d. $\text{Expo}(\lambda)$. Find the conditional distribution of X given X < Y in two different ways: (a) by using calculus to find the conditional PDF. (b) without using calculus, by arguing that the conditional distribution of X given X < Y is the same distribution as the unconditional distribution of min(X, Y), and then applying an earlier result about the minimum of independent Exponentials.
Chapter 7: Problem 14 Introduction to Probability 1
(a) A stick is broken into three pieces by picking two points independently and uniformly along the stick, and breaking the stick at those two points. What is the probability that the three pieces can be assembled into a triangle? Hint: A triangle can be formed from 3 line segments of lengths a, b, c if and only if $a, b, c \in (0, 1/2)$. The probability can be interpreted geometrically as proportional to an area in the plane, avoiding all calculus, but make sure for that approach that the distribution of the random point in the plane is Uniform over some region. (b) Three legs are positioned uniformly and independently on the perimeter of a round table. What is the probability that the table will stand?
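The hint's criterion in Problem 14(a) is easy to try out numerically. The sketch below assumes a stick of length 1, so that the three piece lengths sum to 1 and the criterion reads "every piece is shorter than 1/2"; the function names, trial count, and seed are illustrative choices. It is a simulation check rather than a replacement for the area argument, and its estimate should land near 1/4.

```python
import random


def can_form_triangle(u, v):
    """Break a unit stick at points u and v and apply the hint's criterion."""
    left, right = min(u, v), max(u, v)
    pieces = (left, right - left, 1.0 - right)
    # With total length 1, a triangle exists iff every piece is in (0, 1/2).
    return all(0.0 < piece < 0.5 for piece in pieces)


def estimate_triangle_probability(trials=200_000, seed=0):
    """Monte Carlo estimate with two independent Unif(0, 1) break points."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(can_form_triangle(rng.random(), rng.random()) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print("estimated probability:", estimate_triangle_probability())
```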
Chapter 7: Problem 15 Introduction to Probability 1
Let X and Y be continuous r.v.s, with joint CDF F(x, y). Show that the probability that (X, Y) falls into the rectangle $[a_1, a_2] \times [b_1, b_2]$ is $F(a_2, b_2) - F(a_1, b_2) + F(a_1, b_1) - F(a_2, b_1)$.
Chapter 7: Problem 16 Introduction to Probability 1
Let X and Y have joint PDF $f_{X,Y}(x, y) = x + y$, for 0 < x < 1 and 0 < y < 1. (a) Check that this is a valid joint PDF. (b) Are X and Y independent? (c) Find the marginal PDFs of X and Y. (d) Find the conditional PDF of Y given X = x.
Chapter 7: Problem 17 Introduction to Probability 1
Let X and Y have joint PDF $f_{X,Y}(x, y) = cxy$, for 0 < x < y < 1. (a) Find c to make this a valid joint PDF. (b) Are X and Y independent? (c) Find the marginal PDFs of X and Y. (d) Find the conditional PDF of Y given X = x.
Chapter 7: Problem 18 Introduction to Probability 1
Let (X, Y) be a uniformly random point in the triangle in the plane with vertices (0, 0), (0, 1), (1, 0). Find the joint PDF of X and Y, the marginal PDF of X, and the conditional PDF of X given Y.
Chapter 7: Problem 19 Introduction to Probability 1
A random point (X, Y, Z) is chosen uniformly in the ball $B = \{(x, y, z) : x^2 + y^2 + z^2 \le 1\}$. (a) Find the joint PDF of X, Y, Z. (b) Find the joint PDF of X, Y. (c) Find an expression for the marginal PDF of X, as an integral.
Chapter 7: Problem 20 Introduction to Probability 1
Let $U_1, U_2, U_3$ be i.i.d. Unif(0, 1), and let $L = \min(U_1, U_2, U_3)$, $M = \max(U_1, U_2, U_3)$. (a) Find the marginal CDF and marginal PDF of M, and the joint CDF and joint PDF of L, M. Hint: For the latter, start by considering $P(L \ge l, M \le m)$. (b) Find the conditional PDF of M given L
Chapter 7: Problem 21 Introduction to Probability 1
Find the probability that the quadratic polynomial $Ax^2 + Bx + 1$, where the coefficients A and B are determined by drawing i.i.d. Unif(0, 1) random variables, has at least one real root. Hint: By the quadratic formula, the polynomial $ax^2 + bx + c$ has a real root if and only if $b^2 - 4ac \ge 0$.
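The discriminant condition in the hint to Problem 21 translates directly into a simulation. The sketch below draws A and B as independent Unif(0, 1) values and counts how often $B^2 - 4A \ge 0$; the function names, trial count, and seed are illustrative assumptions. Since the event is $A \le B^2/4$ and $B^2/4$ never exceeds 1, the exact value is $E(B^2/4) = 1/12$, which the estimate should approximate.

```python
import random


def has_real_root(a, b, c=1.0):
    """True when a*x^2 + b*x + c has a real root (discriminant >= 0)."""
    return b * b - 4.0 * a * c >= 0.0


def estimate_real_root_probability(trials=200_000, seed=0):
    """Monte Carlo estimate with A, B i.i.d. Unif(0, 1) and constant term 1."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(has_real_root(rng.random(), rng.random()) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print("estimated:", estimate_real_root_probability())
    print("exact 1/12:", 1 / 12)
```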
Chapter 7: Problem 22 Introduction to Probability 1
Let X and Y each have support (0, 1) marginally, and suppose that the joint PDF $f_{X,Y}$ of X and Y is positive for 0 < x < y < 1 and 0 otherwise. (a) What is the support of the conditional PDF of Y given X = x? (b) Show that X and Y can't be independent.
Chapter 7: Problem 23 Introduction to Probability 1
The unit ball in $\mathbb{R}^n$ is $\{(x_1, \dots, x_n) : x_1^2 + \dots + x_n^2 \le 1\}$, the ball of radius 1 centered at 0. As mentioned in Section A.7 of the math appendix, the volume of the unit ball in n dimensions is $v_n = \frac{\pi^{n/2}}{\Gamma(n/2 + 1)}$, where $\Gamma$ is the gamma function, a very famous function which is defined by $\Gamma(a) = \int_0^\infty x^a e^{-x} \frac{dx}{x}$ for all a > 0, and which will play an important role in the next chapter. A few useful facts about the gamma function (which you can assume) are that $\Gamma(a + 1) = a\Gamma(a)$ for any a > 0, and that $\Gamma(1) = 1$ and $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. Using these facts, it follows that $\Gamma(n) = (n - 1)!$ for n a positive integer, and we can also find $\Gamma(n + \frac{1}{2})$ when n is a nonnegative integer. For practice, please verify that $v_2 = \pi$ (the area of the unit disk in 2 dimensions) and $v_3 = \frac{4}{3}\pi$ (the volume of the unit ball in 3 dimensions). Let $U_1, U_2, \dots, U_n \sim \text{Unif}(-1, 1)$ be i.i.d. (a) Find the probability that $(U_1, U_2, \dots, U_n)$ is in the unit ball in $\mathbb{R}^n$. (b) Evaluate the result from (a) numerically for n = 1, 2, ..., 10, and plot the results (using a computer unless you are extremely good at making hand-drawn graphs). The facts above about the gamma function are sufficient so that you can do this without doing any integrals, but you can also use the command gamma in R to compute the gamma function. (c) Let c be a constant with 0 c. What is the distribution of $X_n$? (d) For $c = 1/\sqrt{2}$, use the result of Part (c) to give a simple, short derivation of what happens to the probability from (a) a
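For the numerical part of Problem 23, the volume formula $v_n = \pi^{n/2} / \Gamma(n/2 + 1)$ can be evaluated directly. The problem mentions the `gamma` command in R; the sketch below swaps in Python's standard-library `math.gamma` instead, and the function names are illustrative. It relies on the observation that a point uniform on the cube $[-1, 1]^n$ (volume $2^n$) falls in the unit ball with probability $v_n / 2^n$, and it prints the values for n = 1 through 10 without plotting them.

```python
import math


def unit_ball_volume(n):
    """Volume of the unit ball in n dimensions: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def probability_in_unit_ball(n):
    """P(a uniform point on the cube [-1, 1]^n lies in the unit ball)."""
    return unit_ball_volume(n) / 2 ** n


if __name__ == "__main__":
    # Practice checks suggested in the problem: v_2 = pi and v_3 = 4*pi/3.
    print("v_2 =", unit_ball_volume(2), " v_3 =", unit_ball_volume(3))
    for n in range(1, 11):
        print(n, probability_in_unit_ball(n))
```

The printed probabilities shrink quickly as n grows, which is the behavior the later parts of the problem ask about.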
Chapter 7: Problem 24 Introduction to Probability 1
Two students, A and B, are working independently on homework (not necessarily for the same class). Student A takes $Y_1 \sim \text{Expo}(\lambda_1)$ hours to finish his or her homework, while B takes $Y_2 \sim \text{Expo}(\lambda_2)$ hours. (a) Find the CDF and PDF of $Y_1/Y_2$, the ratio of their problem-solving times. (b) Find the probability that A finishes his or her homework before B does.
Chapter 7: Problem 25 Introduction to Probability 1
Two companies, Company 1 and Company 2, have just been founded. Stock market crashes occur according to a Poisson process with rate $\lambda_0$. Such a crash would put both companies out of business. For $j \in \{1, 2\}$, there may be an adverse event of type j, which puts Company j out of business (if it is not already out of business) but does not affect the other company; such events occur according to a Poisson process with rate $\lambda_j$. If there has not been a stock market crash or an adverse event of type j, then Company j remains in business. The three Poisson processes are independent of each other. Let $X_1$ and $X_2$ be how long Company 1 and Company 2 stay in business, respectively. (a) Find the marginal distributions of $X_1$ and $X_2$. (b) Find $P(X_1 > x_1, X_2 > x_2)$, and use this to find the joint CDF of $X_1$ and $X_2$.
Chapter 7: Problem 26 Introduction to Probability 1
The bus company from Blissville decides to start service in Blotchville, sensing a promising business opportunity. Meanwhile, Fred has moved back to Blotchville. Now when Fred arrives at the bus stop, either of two independent bus lines may come by (both of which take him home). The Blissville company's bus arrival times are exactly 10 minutes apart, whereas the time from one Blotchville company bus to the next is $\text{Expo}(\frac{1}{10})$. Fred arrives at a uniformly random time on a certain day. (a) What is the probability that the Blotchville company bus arrives first? Hint: One good way is to use the continuous law of total probability. (b) What is the CDF of Fred's waiting time for a bus?
Chapter 7: Problem 27 Introduction to Probability 1
A longevity study is being conducted on n married hobbit couples. Let p be the probability that an individual hobbit lives at least until his or her eleventy-first birthday, and assume that the lifespans of different hobbits are independent. Let $N_0, N_1, N_2$ be the number of couples in which neither hobbit reaches age eleventy-one, one hobbit does but not the other, and both hobbits reach eleventy-one, respectively. (a) Find the joint PMF of $N_0, N_1, N_2$. (b) Using (a) and the definition of conditional probability, find the conditional PMF of $N_2$ given this information, up to a normalizing constant (that is, you do not need to find the normalizing constant in this part, but just to give a simplified expression that is proportional to the conditional PMF). For simplicity, you can and should ignore multiplicative constants in this part; this includes multiplicative factors that are functions of h, since h is now being treated as a known constant. (c) Now obtain the conditional PMF of $N_2$ using a direct counting argument, now including any needed normalizing constants so that you are providing a valid conditional PMF. (d) Discuss intuitively whether or not p should appear in the answer to (c). (e) What is the conditional expectation of $N_2$, given the above information (simplify fully)? This can be done without doing any messy sums, and without having done (b) or (c).
Chapter 7: Problem 28 Introduction to Probability 1
There are n stores in a shopping center, labeled from 1 to n. Let $X_i$ be the number of customers who visit store i in a particular month, and suppose that $X_1, X_2, \dots, X_n$ are i.i.d. with PMF $p(x) = P(X_i = x)$. Let $I \sim \text{DUnif}(1, 2, \dots, n)$ be the label of a randomly chosen store, so $X_I$ is the number of customers at a randomly chosen store. (a) For $i \ne j$, find $P(X_i = X_j)$ in terms of a sum involving the PMF p(x). (b) Find the joint PMF of I and $X_I$. Are they independent? (c) Does $X_I$, the number of customers for a random store, have the same marginal distribution as $X_1$, the number of customers for store 1? (d) Let $J \sim \text{DUnif}(1, 2, \dots, n)$ also be the label of a randomly chosen store, with I and J independent. Find $P(X_I = X_J)$ in terms of a sum involving the PMF p(x). How does $P(X_I = X_J)$ compare to $P(X_i = X_j)$ for fixed i, j with $i \ne j$?
Chapter 7: Problem 29 Introduction to Probability 1
Let X and Y be i.i.d. Geom(p), L = min(X, Y), and M = max(X, Y). (a) Find the joint PMF of L and M. Are they independent? (b) Find the marginal distribution of L in two ways: using the joint PMF, and using a story. (c) Find EM. Hint: A quick way is to use (b) and the fact that L + M = X + Y. (d) Find the joint PMF of L and $M - L$. Are they independent?
Chapter 7: Problem 30 Introduction to Probability 1
Let X, Y have the joint CDF $F(x, y) = 1 - e^{-x} - e^{-y} + e^{-(x + y + \theta xy)}$, for x > 0, y > 0 (and F(x, y) = 0 otherwise), where the parameter $\theta$ is a constant in [0, 1]. (a) Find the joint PDF of X, Y. For which values of $\theta$ (if any) are they independent? (b) Explain why we require $\theta$ to be in [0, 1]. (c) Find the marginal PDFs of X and Y by working directly from the joint PDF from (a). When integrating, do not use integration by parts or computer assistance; rather, pattern match to facts we know about moments of famous distributions. (d) Find the marginal CDFs of X and Y by working directly from the joint CDF.
Chapter 7: Problem 31 Introduction to Probability 1
Let X and Y be i.i.d. Unif(0, 1). Find the standard deviation of the distance between X and Y.
Chapter 7: Problem 32 Introduction to Probability 1
Let X, Y be i.i.d. $\text{Expo}(\lambda)$. Find $E|X - Y|$ in two different ways: (a) using 2D LOTUS and (b) using the memoryless property without any calculus.
Chapter 7: Problem 33 Introduction to Probability 1
Alice walks into a post office with 2 clerks. Both clerks are in the midst of serving customers, but Alice is next in line. The clerk on the left takes an $\text{Expo}(\lambda_1)$ time to serve a customer, and the clerk on the right takes an $\text{Expo}(\lambda_2)$ time to serve a customer. Let T be the amount of time Alice has to wait until it is her turn. (a) Write down expressions for the mean and variance of T, in terms of double integrals (which you do not need to evaluate). (b) Find the distribution, mean, and variance of T, without using calculus.
Chapter 7: Problem 34 Introduction to Probability 1
Let (X, Y) be a uniformly random point in the triangle in the plane with vertices (0, 0), (0, 1), (1, 0). Find Cov(X, Y). (Exercise 18 is about joint, marginal, and conditional PDFs in this setting.)
Chapter 7: Problem 35 Introduction to Probability 1
A random point is chosen uniformly in the unit disk $\{(x, y) : x^2 + y^2 \le 1\}$. Let R be its distance from the origin. (a) Find E(R) using 2D LOTUS. Hint: To do the integral, convert to polar coordinates (see the math appendix). (b) Find the CDFs of $R^2$ and of R without using calculus, using the fact that for a Uniform distribution on a region, probability within that region is proportional to area. Then get the PDFs of $R^2$ and of R, and find E(R) in two more ways: using the definition of expectation, and using a 1D LOTUS by thinking of R as a function of $R^2$.
Chapter 7: Problem 36 Introduction to Probability 1
Let X and Y be discrete r.v.s. (a) Use 2D LOTUS (without assuming linearity) to show that E(X + Y) = E(X) + E(Y). (b) Now suppose that X and Y are independent. Use 2D LOTUS to show that E(XY) = E(X)E(Y).
Chapter 7: Problem 37 Introduction to Probability 1
Let X and Y be i.i.d. continuous random variables with PDF f, mean $\mu$, and variance $\sigma^2$. We know that the expected squared distance of X from its mean is $\sigma^2$, and likewise for Y; this problem is about the expected squared distance of X from Y. (a) Use 2D LOTUS to express $E(X - Y)^2$ as a double integral. (b) By expanding $(x - y)^2 = x^2 - 2xy + y^2$ and evaluating the double integral from (a), show that $E(X - Y)^2 = 2\sigma^2$. (c) Give an alternative proof of the result from (b), based on the trick of adding and subtracting $\mu$: $(X - Y)^2 = (X - \mu + \mu - Y)^2 = (X - \mu)^2 - 2(X - \mu)(Y - \mu) + (Y - \mu)^2$.
Chapter 7: Problem 38 Introduction to Probability 1
Let X and Y be r.v.s. Is it correct to say max(X, Y) + min(X, Y) = X + Y? Is it correct to say Cov(max(X, Y), min(X, Y)) = Cov(X, Y) since either the max is X and the min is Y or vice versa, and covariance is symmetric? Explain.
Chapter 7: Problem 39 Introduction to Probability 1
Two fair six-sided dice are rolled (one green and one orange), with outcomes X and Y respectively for the green and the orange. (a) Compute the covariance of $X + Y$ and $X - Y$. (b) Are $X + Y$ and $X - Y$ independent?
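Because two dice have only 36 equally likely outcomes, both parts of Problem 39 can be checked by exhaustive enumeration with exact fractions. The sketch below is one such check, not the textbook's derivation; the helper names (`expectation`, `covariance`, `joint_factorizes`) are made up for illustration. It computes Cov(X + Y, X - Y) from the definition and tests whether the joint PMF of (X + Y, X - Y) equals the product of the marginal PMFs at every pair of values.

```python
from fractions import Fraction
from itertools import product

# All (green, orange) outcomes of two fair six-sided dice, equally likely.
outcomes = list(product(range(1, 7), repeat=2))
prob = Fraction(1, len(outcomes))


def expectation(g):
    """E[g(X, Y)] by summing over the 36 outcomes."""
    return sum(prob * g(x, y) for x, y in outcomes)


def covariance(g, h):
    """Cov(g(X, Y), h(X, Y)) = E[gh] - E[g]E[h]."""
    return expectation(lambda x, y: g(x, y) * h(x, y)) - expectation(g) * expectation(h)


def joint_factorizes():
    """True iff P(S = s, D = d) = P(S = s) P(D = d) for all s, d."""
    for s in range(2, 13):
        for d in range(-5, 6):
            p_joint = sum(prob for x, y in outcomes if x + y == s and x - y == d)
            p_sum = sum(prob for x, y in outcomes if x + y == s)
            p_diff = sum(prob for x, y in outcomes if x - y == d)
            if p_joint != p_sum * p_diff:
                return False
    return True


cov_sum_diff = covariance(lambda x, y: x + y, lambda x, y: x - y)

if __name__ == "__main__":
    print("Cov(X + Y, X - Y) =", cov_sum_diff)
    print("joint PMF factorizes:", joint_factorizes())
```

Comparing the two printed results shows whether zero covariance and independence coincide in this example.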
Chapter 7: Problem 40 Introduction to Probability 1
Let X and Y be i.i.d. Unif(0, 1). (a) Compute the covariance of $X + Y$ and $X - Y$. (b) Are $X + Y$ and $X - Y$ independent?
Chapter 7: Problem 41 Introduction to Probability 1
Let X and Y be standardized r.v.s (i.e., marginally they each have mean 0 and variance 1) with correlation $\rho \in (-1, 1)$. Find a, b, c, d (in terms of $\rho$) such that Z = aX + bY and W = cX + dY are uncorrelated but still standardized.
Chapter 7: Problem 42 Introduction to Probability 1
Let X be the number of distinct birthdays in a group of 110 people (i.e., the number of days in a year such that at least one person in the group has that birthday). Under the usual assumptions (no February 29, all the other 365 days of the year are equally likely, and the day when one person is born is independent of the days when the other people are born), find the mean and variance of X.
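One possible route for Problem 42, sketched here as an assumption rather than as the textbook's solution, is to write X as 365 minus the number of days on which nobody has a birthday, and to express that count as a sum of indicator r.v.s. A given day is empty with probability $(364/365)^{110}$, and two given days are both empty with probability $(363/365)^{110}$, which is enough to get the mean and the variance. The function name and parameters below are illustrative.

```python
def distinct_birthday_moments(people=110, days=365):
    """Mean and variance of the number of distinct birthdays, via indicators.

    Assumes birthdays are independent and uniform over `days` days.
    """
    p_empty = ((days - 1) / days) ** people      # a given day has no birthday
    p_two_empty = ((days - 2) / days) ** people  # two given days both have none
    mean = days * (1 - p_empty)
    # Var(X) equals the variance of the number of empty days:
    # sum of indicator variances plus the covariances over ordered pairs.
    variance = (days * p_empty * (1 - p_empty)
                + days * (days - 1) * (p_two_empty - p_empty ** 2))
    return mean, variance


if __name__ == "__main__":
    mean, variance = distinct_birthday_moments()
    print("mean:", mean)
    print("variance:", variance)
```

A tiny case can be verified by hand: with 2 people and 3 days, X is 1 with probability 1/3 and 2 with probability 2/3, so the mean is 5/3 and the variance is 2/9, matching the formula.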
Chapter 7: Problem 43 Introduction to Probability 1
(a) Let X and Y be Bernoulli r.v.s, possibly with different parameters. Show that if X and Y are uncorrelated, then they are independent. (b) Give an example of three Bernoulli r.v.s such that each pair of them is uncorrelated, yet the three r.v.s are dependent.
Chapter 7: Problem 44 Introduction to Probability 1
Find the variance of the number of toys needed until you have a complete set in Example 4.3.11 (the coupon collector problem).
Chapter 7: Problem 45 Introduction to Probability 1
A random triangle is formed in some way, such that the angles are identically distributed. What is the correlation between two of the angles (assuming that the variance of the angles is nonzero)?
Chapter 7: Problem 46 Introduction to Probability 1
Each of $n \ge 2$ people puts his or her name on a slip of paper (no two have the same name). The slips of paper are shuffled in a hat, and then each person draws one (uniformly at random at each stage, without replacement). Find the standard deviation of the number of people who draw their own names.
Chapter 7: Problem 47 Introduction to Probability 1
Athletes compete one at a time at the high jump. Let $X_j$ be how high the jth jumper jumped, with $X_1, X_2, \dots$ i.i.d. with a continuous distribution. We say that the jth jumper sets a record if $X_j$ is greater than all of $X_{j-1}, \dots, X_1$. Find the variance of the number of records among the first n jumpers (as a sum). What happens to the variance as $n \to \infty$?
Chapter 7: Problem 48 Introduction to Probability 1
A chicken lays a $\text{Pois}(\lambda)$ number N of eggs. Each egg hatches a chick with probability p, independently. Let X be the number which hatch, so $X \mid N = n \sim \text{Bin}(n, p)$. Find the correlation between N (the number of eggs) and X (the number of eggs which hatch). Simplify; your final answer should work out to a simple function of p (the $\lambda$ should cancel out).
Chapter 7: Problem 49 Introduction to Probability 1
Let $X_1, \dots, X_n$ be random variables such that $\text{Corr}(X_i, X_j) = \rho$ for all $i \ne j$. Show that $\rho \ge -\frac{1}{n-1}$. This is a bound on how negatively correlated a collection of r.v.s can all be with each other. Hint: Assume $\text{Var}(X_i) = 1$ for all i; this can be done without loss of generality, since rescaling two r.v.s does not affect the correlation between them. Then use the fact that Var(X1
Chapter 7: Problem 50 Introduction to Probability 1
Let X and Y be independent r.v.s. Show that $\text{Var}(XY) = \text{Var}(X)\text{Var}(Y) + (EX)^2 \text{Var}(Y) + (EY)^2 \text{Var}(X)$. Hint: It is often useful when working with a second moment $E(T^2)$ to write it as $\text{Var}(T) + (ET)^2$.
Chapter 7: Problem 51 Introduction to Probability 1
Stat 110 shirts come in 3 sizes: small, medium, and large. There are n shirts of each size (where $n \ge 2$). There are 3n students. For each size, n of the students have that size as the best fit. This seems ideal. But suppose that instead of giving each student the right size shirt, each student is given a shirt completely randomly (all allocations of the shirts to the students, with one shirt per student, are equally likely). Let X be the number of students who get their right size shirt. (a) Find E(X). (b) Give each student an ID number from 1 to 3n, such that the right size shirt is small for students 1 through n, medium for students n + 1 through 2n, and large for students 2n + 1 through 3n. Let $A_j$ be the event that student j gets their right size shirt. Find $P(A_1, A_2)$ and $P(A_1, A_{n+1})$. (c) Find Var(X).
Chapter 7: Problem 52 Introduction to Probability 1
A drunken man wanders around randomly in a large space. At each step, he moves one unit of distance North, South, East, or West, with equal probabilities. Choose coordinates such that his initial position is (0, 0) and if he is at (x, y) at some time, then one step later he is at (x, y + 1), (x, y - 1), (x + 1, y), or (x - 1, y). Let $(X_n, Y_n)$ and $R_n$ be his position and distance from the origin after n steps, respectively. General hint: Note that $X_n$ is a sum of r.v.s with possible values -1, 0, 1, and likewise for $Y_n$, but be careful throughout the problem about independence. (a) Determine whether or not $X_n$ is independent of $Y_n$. (b) Find $\text{Cov}(X_n, Y_n)$. (c) Find $E(R_n^2)$.
Chapter 7: Problem 53 Introduction to Probability 1
A scientist makes two measurements, considered to be independent standard Normal r.v.s. Find the correlation between the larger and smaller of the values. Hint: Note that max(x, y) + min(x, y) = x + y and $\max(x, y) - \min(x, y) = |x - y|$.
Chapter 7: Problem 54 Introduction to Probability 1
Let $U \sim \text{Unif}(-1, 1)$ and $V = 2|U| - 1$. (a) Find the distribution of V (give the PDF and, if it is a named distribution we have studied, its name and parameters). Hint: Find the support of V, and then find the CDF of V by reducing $P(V \le v)$ to probability calculations about U. (b) Show that U and V are uncorrelated, but not independent. This is also another example illustrating the fact that knowing the marginal distributions of two r.v.s does not determine the joint distribution.
Chapter 7: Problem 55 Introduction to Probability 1
Consider the following method for creating a bivariate Poisson (a joint distribution for two r.v.s such that both marginals are Poissons). Let X = V + W, Y = V + Z where V, W, Z are i.i.d. $\text{Pois}(\lambda)$ (the idea is to have something borrowed and something new but not something old or something blue). (a) Find Cov(X, Y). (b) Are X and Y independent? Are they conditionally independent given V? (c) Find the joint PMF of X, Y (as a sum).
Chapter 7: Problem 56 Introduction to Probability 1
You are playing an exciting game of Battleship. Your opponent secretly positions ships on a 10 by 10 grid and you try to guess where the ships are. Each of your guesses is a hit if there is a ship there and a miss otherwise. The game has just started and your opponent has 3 ships: a battleship (length 4), a submarine (length 3), and a destroyer (length 2). (Usually there are 5 ships to start, but to simplify the calculations we are considering 3 here.) You are playing a variation in which you unleash a salvo, making 5 simultaneous guesses. Assume that your 5 guesses are a simple random sample drawn from the 100 grid positions. Find the mean and variance of the number of distinct ships you will hit in your salvo. (Give exact answers in terms of binomial coefficients or factorials, and also numerical values computed using a computer.) Hint: First work in terms of the number of ships missed, expressing this as a sum of indicator r.v.s. Then use the fundamental bridge and naive definition of probability, which can be applied since all sets of 5 grid positions are equally likely.
Chapter 7: Problem 57 Introduction to Probability 1
This problem explores a visual interpretation of covariance. Data are collected for $n \ge 2$ individuals, where for each individual two variables are measured (e.g., height and weight). Assume independence across individuals (e.g., person 1's variables give no information about the other people), but not within individuals (e.g., a person's height and weight may be correlated). Let $(x_1, y_1), \dots, (x_n, y_n)$ be the n data points. The data are considered here as fixed, known numbers; they are the observed values after performing an experiment. Imagine plotting all the points $(x_i, y_i)$ in the plane, and drawing the rectangle determined by each pair of points. For example, the points (1, 3) and (4, 6) determine the rectangle with vertices (1, 3), (1, 6), (4, 6), (4, 3). The signed area contributed by $(x_i, y_i)$ and $(x_j, y_j)$ is the area of the rectangle they determine if the slope of the line between them is positive, and is the negative of the area of the rectangle they determine if the slope of the line between them is negative. (Define the signed area to be 0 if $x_i = x_j$ or $y_i = y_j$, since then the rectangle is degenerate.) So the signed area is positive if a higher x value goes with a higher y value for the pair of points, and negative otherwise. Assume that the $x_i$ are all distinct and the $y_i$ are all distinct. (a) The sample covariance of the data is defined to be $r = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$, where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ are the sample means. (There are differing conventions about whether to divide by $n - 1$ or n in the definition of sample covariance, but that need not concern us for this problem.) Let (X, Y) be one of the $(x_i, y_i)$ pairs, chosen uniformly at random. Determine precisely how Cov(X, Y) is related to the sample covariance. (b) Let (X, Y) be as in (a), and $(\tilde{X}, \tilde{Y})$ be an independent draw from the same distribution. That is, (X, Y) and $(\tilde{X}, \tilde{Y})$ are randomly chosen from the n points, independently (so it is possible for the same point to be chosen twice). Express the total signed area of the rectangles as a constant times $E((X - \tilde{X})(Y - \tilde{Y}))$. Then show that the sample covariance of the data is a constant times the total signed area of the rectangles. Hint: Consider $E((X - \tilde{X})(Y - \tilde{Y}))$ in two ways: as the average signed area of the random rectangle formed by (X, Y) and $(\tilde{X}, \tilde{Y})$, and using properties of expectation to relate it to Cov(X, Y). For the former, consider the $n^2$ possibilities for which point (X, Y) is and which point $(\tilde{X}, \tilde{Y})$ is; note that n such choices result in degenerate rectangles. (c) Based on the interpretation from (b), give intuitive explanations of why for any r.v.s $W_1, W_2, W_3$ and constants $a_1, a_2$, covariance has the following properties: (i) $\text{Cov}(W_1, W_2) = \text{Cov}(W_2, W_1)$; (ii) $\text{Cov}(a_1 W_1, a_2 W_2) = a_1 a_2 \text{Cov}(W_1, W_2)$; (iii) $\text{Cov}(W_1 + a_1, W_2 + a_2) = \text{Cov}(W_1, W_2)$; (iv) $\text{Cov}(W_1, W_2 + W_3) = \text{Cov}(W_1, W_2) + \text{Cov}(W_1, W_3)$.
Chapter 7: Problem 58 Introduction to Probability 1
A statistician is trying to estimate an unknown parameter $\theta$ based on some data. She has available two independent estimators $\hat{\theta}_1$ and $\hat{\theta}_2$ (an estimator is a function of the data, used to estimate a parameter). For example, $\hat{\theta}_1$ could be the sample mean of a subset of the data and $\hat{\theta}_2$ could be the sample mean of another subset of the data, disjoint from the subset used to calculate $\hat{\theta}_1$. Assume that both of these estimators are unbiased, i.e., $E(\hat{\theta}_j) = \theta$. Rather than having a bunch of separate estimators, the statistician wants one combined estimator. It may not make sense to give equal weights to $\hat{\theta}_1$ and $\hat{\theta}_2$ since one could be much more reliable than the other, so she decides to consider combined estimators of the form $\hat{\theta} = w_1 \hat{\theta}_1 + w_2 \hat{\theta}_2$, a weighted combination of the two estimators. The weights $w_1$ and $w_2$ are nonnegative and satisfy $w_1 + w_2 = 1$. (a) Check that $\hat{\theta}$ is also unbiased, i.e., $E(\hat{\theta}) = \theta$. (b) Determine the optimal weights $w_1, w_2$, in terms of minimizing the mean squared error $E(\hat{\theta} - \theta)^2$. Express your answer in terms of the variances of $\hat{\theta}_1$ and $\hat{\theta}_2$. The optimal weights are known as Fisher weights. Hint: As discussed in Exercise 55 from Chapter 5, mean squared error is variance plus squared bias, so in this case the mean squared error of $\hat{\theta}$ is $\mathrm{Var}(\hat{\theta})$. Note that there is no need for multivariable calculus here, since $w_2 = 1 - w_1$. (c) Give a simple description of what the estimator found in (b) amounts to if the data are i.i.d. random variables $X_1, \dots, X_n, Y_1, \dots, Y_m$, $\hat{\theta}_1$ is the sample mean of $X_1, \dots, X_n$, and $\hat{\theta}_2$ is the sample mean of $Y_1, \dots, Y_m$.
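As a numerical illustration of part (b) of Problem 58, the sketch below scans a grid of weights and picks the one minimizing the variance of the combined estimator, which for independent estimators is $w_1^2 v_1 + (1 - w_1)^2 v_2$. The variances `v1 = 1.0` and `v2 = 4.0`, the grid resolution, and all function names are assumptions chosen for illustration; they are not part of the problem. The result is compared with inverse-variance weighting, the standard closed form for this kind of minimization.

```python
def combined_variance(w1, v1, v2):
    """Variance of w1*est1 + (1 - w1)*est2 when est1 and est2 are independent."""
    return w1**2 * v1 + (1 - w1)**2 * v2


def best_weight_on_grid(v1, v2, steps=1000):
    """Return the grid point in [0, 1] that minimizes the combined variance."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda w: combined_variance(w, v1, v2))


# Hypothetical variances of the two estimators (illustrative values only).
v1, v2 = 1.0, 4.0

w_grid = best_weight_on_grid(v1, v2)
# Inverse-variance weight on the first estimator.
w_inverse_variance = (1 / v1) / (1 / v1 + 1 / v2)

print(w_grid, w_inverse_variance)
```

The grid search needs only one variable because $w_2 = 1 - w_1$, which mirrors the hint in the problem.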
Chapter 7: Problem 59 Introduction to Probability 1
A $\mathrm{Pois}(\lambda)$ number of people vote in a certain election. Each voter votes for candidate A with probability $p$ and for candidate B with probability $q = 1 - p$, independently of all the other voters. Let $V$ be the difference in votes, defined as the number of votes for A minus the number for B. (a) Find $E(V)$. (b) Find $\mathrm{Var}(V)$.
Chapter 7: Problem 60 Introduction to Probability 1
A traveler gets lost $N \sim \mathrm{Pois}(\lambda)$ times on a long journey. When lost, the traveler asks someone for directions with probability $p$. Let $X$ be the number of times that the traveler is lost and asks for directions, and $Y$ be the number of times that the traveler is lost and does not ask for directions. (a) Find the joint PMF of $N, X, Y$. Are they independent? (b) Find the joint PMF of $N, X$. Are they independent? (c) Find the joint PMF of $X, Y$. Are they independent?
Chapter 7: Problem 61 Introduction to Probability 1
The number of people who visit the Leftorium store in a day is $\mathrm{Pois}(100)$. Suppose that 10% of customers are sinister (left-handed), and 90% are dexterous (right-handed). Half of the sinister customers make purchases, but only a third of the dexterous customers make purchases. The characteristics and behavior of people are independent, with probabilities as described in the previous two sentences. On a certain day, there are 42 people who arrive at the store but leave without making a purchase. Given this information, what is the conditional PMF of the number of customers on that day who make a purchase?
Chapter 7: Problem 62 Introduction to Probability 1
A chicken lays $n$ eggs. Each egg independently does or doesn't hatch, with probability $p$ of hatching. For each egg that hatches, the chick does or doesn't survive (independently of the other eggs), with probability $s$ of survival. Let $N \sim \mathrm{Bin}(n, p)$ be the number of eggs which hatch, $X$ be the number of chicks which survive, and $Y$ be the number of chicks which hatch but don't survive (so $X + Y = N$). Find the marginal PMF of $X$, and the joint PMF of $X$ and $Y$. Are $X$ and $Y$ independent?
Chapter 7: Problem 63 Introduction to Probability 1
There will be $X \sim \mathrm{Pois}(\lambda)$ courses offered at a certain school next year. (a) Find the expected number of choices of 4 courses (in terms of $\lambda$, fully simplified), assuming that simultaneous enrollment is allowed if there are time conflicts. (b) Now suppose that simultaneous enrollment is not allowed. Suppose that most faculty only want to teach on Tuesdays and Thursdays, and most students only want to take courses that start at 10 am or later, and as a result there are only four possible time slots: 10 am, 11:30 am, 1 pm, 2:30 pm (each course meets Tuesday-Thursday for an hour and a half, starting at one of these times). Rather than trying to avoid major conflicts, the school schedules the courses completely randomly: after the list of courses for next year is determined, they randomly get assigned to time slots, independently and with probability 1/4 for each time slot. Let $X_{\mathrm{am}}$ and $X_{\mathrm{pm}}$ be the number of morning and afternoon courses for next year, respectively (where "morning" means starting before noon). Find the joint PMF of $X_{\mathrm{am}}$ and $X_{\mathrm{pm}}$, i.e., find $P(X_{\mathrm{am}} = a, X_{\mathrm{pm}} = b)$ for all $a, b$. (c) Continuing as in (b), let $X_1, X_2, X_3, X_4$ be the number of 10 am, 11:30 am, 1 pm, 2:30 pm courses for next year, respectively. What is the joint distribution of $X_1, X_2, X_3, X_4$? (The result is completely analogous to that of $X_{\mathrm{am}}, X_{\mathrm{pm}}$; you can derive it by thinking conditionally, but for this part you are also allowed to just use the fact that the result is analogous to that of (b).) Use this to find the expected number of choices of 4 nonconflicting courses (in terms of $\lambda$, fully simplified). What is the ratio of the expected value from (a) to this expected value?
Chapter 7: Problem 64 Introduction to Probability 1
Let $(X_1, \dots, X_k)$ be Multinomial with parameters $n$ and $(p_1, \dots, p_k)$. Use indicator r.v.s to show that $\mathrm{Cov}(X_i, X_j) = -n p_i p_j$ for $i \neq j$.
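The identity in Problem 64 can be sanity-checked by brute force for a small case. The sketch below enumerates every sequence of category labels, weights each by its probability, and computes the covariance of two counts exactly. The parameters `n = 4` and `probs = [0.2, 0.3, 0.5]` and the function name are illustrative assumptions, not values from the problem; this is a numerical check, not the indicator-based proof the problem asks for.

```python
from itertools import product


def multinomial_cov(n, probs, i, j):
    """Exact Cov(X_i, X_j) found by enumerating all len(probs)**n label sequences."""
    k = len(probs)
    e_i = e_j = e_ij = 0.0
    for seq in product(range(k), repeat=n):
        # Probability of this particular sequence of independent trials.
        weight = 1.0
        for category in seq:
            weight *= probs[category]
        x_i, x_j = seq.count(i), seq.count(j)
        e_i += weight * x_i
        e_j += weight * x_j
        e_ij += weight * x_i * x_j
    return e_ij - e_i * e_j


# Hypothetical small example (values chosen only for illustration).
n, probs = 4, [0.2, 0.3, 0.5]

cov_exact = multinomial_cov(n, probs, 0, 1)
cov_formula = -n * probs[0] * probs[1]

print(cov_exact, cov_formula)
```

Enumeration grows as $k^n$, so this approach is only practical for small $n$; the indicator argument is what establishes the result in general.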
Chapter 7: Problem 65 Introduction to Probability 1
Consider the birthdays of 100 people. Assume people's birthdays are independent, and the 365 days of the year (exclude the possibility of February 29) are equally likely. Find the covariance and correlation between how many of the people were born on January 1 and how many were born on January 2.
Chapter 7: Problem 66 Introduction to Probability 1
A certain course has $a$ freshmen, $b$ sophomores, $c$ juniors, and $d$ seniors. Let $X$ be the number of freshmen and sophomores (total), $Y$ be the number of juniors, and $Z$ be the number of seniors in a random sample of size $n$, where for Part (a) the sampling is with replacement and for Part (b) the sampling is without replacement (for both parts, at each stage the allowed choices have equal probabilities). (a) Find the joint PMF of $X, Y, Z$, for sampling with replacement. (b) Find the joint PMF of $X, Y, Z$, for sampling without replacement.
Chapter 7: Problem 67 Introduction to Probability 1
A group of $n \ge 2$ people decide to play an exciting game of Rock-Paper-Scissors. As you may recall, Rock smashes Scissors, Scissors cuts Paper, and Paper covers Rock (despite Bart Simpson saying "Good old rock, nothing beats that!"). Usually this game is played with 2 players, but it can be extended to more players as follows. If exactly 2 of the 3 choices appear when everyone reveals their choice, say $a, b \in \{\text{Rock}, \text{Paper}, \text{Scissors}\}$ where $a$ beats $b$, the game is decisive: the players who chose $a$ win, and the players who chose $b$ lose. Otherwise, the game is indecisive and the players play again. For example, with 5 players, if one player picks Rock, two pick Scissors, and two pick Paper, the round is indecisive and they play again. But if 3 pick Rock and 2 pick Scissors, then the Rock players win and the Scissors players lose the game. Assume that the $n$ players independently and randomly choose between Rock, Scissors, and Paper, with equal probabilities. Let $X, Y, Z$ be the number of players who pick Rock, Scissors, Paper, respectively in one game. (a) Find the joint PMF of $X, Y, Z$. (b) Find the probability that the game is decisive. Simplify your answer. (c) What is the probability that the game is decisive for $n = 5$? What is the limiting probability that a game is decisive as $n \to \infty$? Explain briefly why your answer makes sense.
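For small $n$, the decisive-game probability in Problem 67 can be checked by listing every equally likely outcome and counting those in which exactly two of the three choices appear. The sketch below does this with exact fractions; the function name and the single-letter labels `"R"`, `"P"`, `"S"` are illustrative assumptions. It is a brute-force check rather than the simplified closed form that part (b) asks for.

```python
from fractions import Fraction
from itertools import product


def decisive_probability(n):
    """Exact probability that exactly two of the three choices appear among n players."""
    outcomes = product("RPS", repeat=n)  # all 3**n equally likely outcomes
    decisive = sum(1 for outcome in outcomes if len(set(outcome)) == 2)
    return Fraction(decisive, 3**n)


# Part (c) of the problem uses n = 5.
p5 = decisive_probability(5)
print(p5)
```

Calling `decisive_probability` for increasing `n` also gives a feel for the limiting behavior in part (c), although the $3^n$ enumeration becomes slow quickly.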
Chapter 7: Problem 68 Introduction to Probability 1
Emails arrive in an inbox according to a Poisson process with rate $\lambda$ (so the number of emails in a time interval of length $t$ is distributed as $\mathrm{Pois}(\lambda t)$, and the numbers of emails arriving in disjoint time intervals are independent). Let $X, Y, Z$ be the numbers of emails that arrive from 9 am to noon, noon to 6 pm, and 6 pm to midnight (respectively) on a certain day. (a) Find the joint PMF of $X, Y, Z$. (b) Find the conditional joint PMF of $X, Y, Z$ given that $X + Y + Z = 36$. (c) Find the conditional PMF of $X + Y$ given that $X + Y + Z = 36$, and find $E(X + Y \mid X + Y + Z = 36)$ and $\mathrm{Var}(X + Y \mid X + Y + Z = 36)$ (conditional expectation and conditional variance given an event are defined in the same way as expectation and variance, using the conditional distribution given the event in place of the unconditional distribution).
Chapter 7: Problem 69 Introduction to Probability 1
Let $X$ be the number of statistics majors in a certain college in the Class of 2030, viewed as an r.v. Each statistics major chooses between two tracks: a general track in statistical principles and methods, and a track in quantitative finance. Suppose that each statistics major chooses randomly which of these two tracks to follow, independently, with probability $p$ of choosing the general track. Let $Y$ be the number of statistics majors who choose the general track, and $Z$ be the number of statistics majors who choose the quantitative finance track. (a) Suppose that $X \sim \mathrm{Pois}(\lambda)$. (This isn't the exact distribution in reality since a Poisson is unbounded, but it may be a very good approximation.) Find the correlation between $X$ and $Y$. (b) Let $n$ be the size of the Class of 2030, where $n$ is a known constant. For this part and the next, instead of assuming that $X$ is Poisson, assume that each of the $n$ students chooses to be a statistics major with probability $r$, independently. Find the joint distribution of $Y$, $Z$, and the number of non-statistics majors, and their marginal distributions. (c) Continuing as in (b), find the correlation between $X$ and $Y$.
Chapter 7: Problem 70 Introduction to Probability 1
In humans (and many other organisms), genes come in pairs. Consider a gene of interest, which comes in two types (alleles): type $a$ and type $A$. The genotype of a person for that gene is the types of the two genes in the pair: $AA$, $Aa$, or $aa$ ($aA$ is equivalent to $Aa$). According to the Hardy-Weinberg law, for a population in equilibrium the frequencies of $AA$, $Aa$, $aa$ will be $p^2$, $2p(1 - p)$, $(1 - p)^2$ respectively, for some $p$ with $0 < p < 1$. Suppose that the Hardy-Weinberg law holds, and that $n$ people are drawn randomly from the population, independently. Let $X_1, X_2, X_3$ be the number of people in the sample with genotypes $AA$, $Aa$, $aa$, respectively. (a) What is the joint PMF of $X_1, X_2, X_3$? (b) What is the distribution of the number of people in the sample who have an $A$? (c) What is the distribution of how many of the $2n$ genes among the people are $A$'s? (d) Now suppose that $p$ is unknown, and must be estimated using the observed data $X_1, X_2, X_3$. The maximum likelihood estimator (MLE) of $p$ is the value of $p$ for which the observed data are as likely as possible. Find the MLE of $p$. (e) Now suppose that $p$ is unknown, and that our observations can't distinguish between $AA$ and $Aa$. So for each person in the sample, we just know whether or not that person is an $aa$ (in genetics terms, $AA$ and $Aa$ have the same phenotype, and we only get to observe the phenotypes, not the genotypes). Find the MLE of $p$.
Chapter 7: Problem 71 Introduction to Probability 1
Let $(X, Y)$ be Bivariate Normal, with $X$ and $Y$ marginally $\mathcal{N}(0, 1)$ and with correlation $\rho$ between $X$ and $Y$. (a) Show that $(X + Y, X - Y)$ is also Bivariate Normal. (b) Find the joint PDF of $X + Y$ and $X - Y$ (without using calculus), assuming $-1 < \rho < 1$.
Chapter 7: Problem 72 Introduction to Probability 1
Let the joint PDF of $X$ and $Y$ be $f_{X,Y}(x, y) = c \exp\left(-\frac{x^2}{2} - \frac{y^2}{2}\right)$ for all $x$ and $y$, where $c$ is a constant. (a) Find $c$ to make this a valid joint PDF. (b) What are the marginal distributions of $X$ and $Y$? Are $X$ and $Y$ independent? (c) Is $(X, Y)$ Bivariate Normal?
Chapter 7: Problem 73 Introduction to Probability 1
Let the joint PDF of $X$ and $Y$ be $f_{X,Y}(x, y) = c \exp\left(-\frac{x^2}{2} - \frac{y^2}{2}\right)$ for $xy > 0$, where $c$ is a constant (the joint PDF is 0 for $xy \le 0$). (a) Find $c$ to make this a valid joint PDF. (b) What are the marginal distributions of $X$ and $Y$? Are $X$ and $Y$ independent? (c) Is $(X, Y)$ Bivariate Normal?
Chapter 7: Problem 74 Introduction to Probability 1
Let $X, Y, Z$ be i.i.d. $\mathcal{N}(0, 1)$. Find the joint MGF of (X + 2Y, 3X + 4Z,
Chapter 7: Problem 75 Introduction to Probability 1
Let $X$ and $Y$ be i.i.d. $\mathcal{N}(0, 1)$, and let $S$ be a random sign ($1$ or $-1$, with equal probabilities) independent of $(X, Y)$. (a) Determine whether or not $(X, Y, X + Y)$ is Multivariate Normal. (b) Determine whether or not $(X, Y, SX + SY)$ is Multivariate Normal. (c) Determine whether or not $(SX, SY)$ is Multivariate Normal.
Chapter 7: Problem 76 Introduction to Probability 1
Let $(X, Y)$ be Bivariate Normal with $X \sim \mathcal{N}(0, \sigma_1^2)$ and $Y \sim \mathcal{N}(0, \sigma_2^2)$ marginally and with $\mathrm{Corr}(X, Y) = \rho$. Find a constant $c$ such that $Y - cX$ is independent of $X$. Hint: First find $c$ (in terms of $\rho, \sigma_1, \sigma_2$) such that $Y - cX$ and $X$ are uncorrelated.
Chapter 7: Problem 77 Introduction to Probability 1
A mother and a father have 6 children. The 8 heights in the family (in inches) are $\mathcal{N}(\mu, \sigma^2)$ r.v.s (with the same distribution, but not necessarily independent). (a) Assume for this part that the heights are all independent. On average, how many of the children are taller than both parents? (b) Let $X_1$ be the height of the mother, $X_2$ be the height of the father, and $Y_1, \dots, Y_6$ be the heights of the children. Suppose that $(X_1, X_2, Y_1, \dots, Y_6)$ is Multivariate Normal, with $\mathcal{N}(\mu, \sigma^2)$ marginals and $\mathrm{Corr}(X_1, Y_j) = \rho$ for $1 \le j \le 6$, with $\rho < 1$. On average, how many of the children are more than 1 inch taller than their mother?
Chapter 7: Problem 78 Introduction to Probability 1
Cars pass by a certain point on a road according to a Poisson process with rate $\lambda$ cars/minute. Let $N_t \sim \mathrm{Pois}(\lambda t)$ be the number of cars that pass by that point in the time interval $[0, t]$, with $t$ measured in minutes. (a) A certain device is able to count cars as they pass by, but it does not record the arrival times. At time 0, the counter on the device is reset to 0. At time 3 minutes, the device is observed and it is found that exactly 1 car had passed by. Given this information, find the conditional CDF of when that car arrived. Also describe in words what the result says. (b) In the late afternoon, you are counting blue cars. Each car that passes by is blue with probability $b$, independently of all other cars. Find the joint PMF and marginal PMFs of the number of blue cars and number of non-blue cars that pass by the point in 10 minutes.
Chapter 7: Problem 79 Introduction to Probability 1
In a U.S. election, there will be $V \sim \mathrm{Pois}(\lambda)$ registered voters. Suppose each registered voter is a registered Democrat with probability $p$ and a registered Republican with probability $1 - p$, independent of other voters. Also, each registered voter shows up to the polls with probability $s$ and stays home with probability $1 - s$, independent of other voters and independent of their own party affiliation. In this problem, we are interested in $X$, the number of registered Democrats who actually vote. (a) What is the distribution of $X$, before we know anything about the number of registered voters? (b) Suppose we learn that $V = v$; that is, $v$ people registered to vote. What is the conditional distribution of $X$ given this information? (c) Suppose we learn there were $d$ registered Democrats and $r$ registered Republicans (where $d + r = v$). What is the conditional distribution of $X$ given this information? (d) Finally, we learn in addition to all of the above information that $n$ people showed up at the polls on election day. What is the conditional distribution of $X$ given this information?
Chapter 7: Problem 80 Introduction to Probability 1
A certain college has $m$ freshmen, $m$ sophomores, $m$ juniors, and $m$ seniors. A certain class there is a simple random sample of size $n$ students, i.e., all sets of $n$ of the $4m$ students are equally likely. Let $X_1, \dots, X_4$ be the numbers of freshmen, ..., seniors in the class. (a) Find the joint PMF of $X_1, X_2, X_3, X_4$. (b) Give both an intuitive explanation and a mathematical justification for whether or not the distribution from (a) is Multinomial. (c) Find $\mathrm{Cov}(X_1, X_3)$, fully simplified. Hint: Take the variance of both sides of $X_1 + X_2 + X_3 + X_4 = n$.
Chapter 7: Problem 81 Introduction to Probability 1
Let $X \sim \mathrm{Expo}(\lambda)$ and let $Y$ be a random variable, discrete or continuous, whose MGF $M$ is finite everywhere. Show that $P(Y < X) = M(c)$ for a certain value of $c$, which you should specify.
Chapter 7: Problem 82 Introduction to Probability 1
To test for a certain disease, the level of a certain substance in the blood is measured. Let $T$ be this measurement, considered as a continuous r.v. The patient tests positive (i.e., is declared to have the disease) if $T > t_0$ and tests negative if $T \le t_0$, where $t_0$ is a threshold decided upon in advance. Let $D$ be the indicator of having the disease. As discussed in Example 2.3.9, the sensitivity of the test is the probability of testing positive given that the patient has the disease, and the specificity of the test is the probability of testing negative given that the patient does not have the disease. (a) The ROC (receiver operator characteristic) curve of the test is the plot of sensitivity vs. 1 minus specificity, where sensitivity (the vertical axis) and 1 minus specificity (the horizontal axis) are viewed as functions of the threshold $t_0$. ROC curves are widely used in medicine and engineering as a way to study the performance of procedures for classifying individuals into two groups (in this case, the two groups are "diseased people" and "non-diseased people"). Given that $D = 1$, $T$ has CDF $G$ and PDF $g$; given that $D = 0$, $T$ has CDF $H$ and PDF $h$. Here $g$ and $h$ are positive on an interval $[a, b]$ and 0 outside this interval. Show that the area under the ROC curve is the probability that a randomly selected diseased person has a higher $T$ value than a randomly selected non-diseased person. (b) Explain why the result of (a) makes sense in two extreme cases: when $g = h$, and when there is a threshold $t_0$ such that $P(T > t_0 \mid D = 1)$ and $P(T \le t_0 \mid D = 0)$ are very close to 1.
Chapter 7: Problem 83 Introduction to Probability 1
Let $J$ be Discrete Uniform on $\{1, 2, \dots, n\}$. (a) Find $E(J)$ and $\mathrm{Var}(J)$, fully simplified, using results from Section A.8 of the math appendix. (b) Discuss intuitively whether the results in (a) should be approximately the same as the mean and variance (respectively) of a Uniform distribution on a certain interval. (c) Let $X_1, \dots, X_n$ be i.i.d. $\mathcal{N}(0, 1)$ r.v.s, and let $R_1, \dots, R_n$ be their ranks (the smallest $X_i$ has rank 1, the next has rank 2, ..., and the largest has rank $n$). Explain why $R_n = 1 + \sum_{j=1}^{n-1} I_j$, where $I_j = I(X_n > X_j)$. Then use this to find $E(R_n)$ and $\mathrm{Var}(R_n)$ directly using symmetry, linearity, the fundamental bridge, and properties of covariance. (d) Explain how the results of (a) and (c) relate. Then prove the identities $\sum_{j=1}^{n} j = \frac{n(n + 1)}{2}$ and $\sum_{j=1}^{n} j^2 = \frac{n(n + 1)(2n + 1)}{6}$, using probability (rather than induction).
Chapter 7: Problem 84 Introduction to Probability 1
A network consists of $n$ nodes, each pair of which may or may not have an edge joining them. For example, a social network can be modeled as a group of $n$ nodes (representing people), where an edge between $i$ and $j$ means they know each other. Assume the network is undirected and does not have edges from a node to itself (for a social network, this says that if $i$ knows $j$, then $j$ knows $i$ and that, contrary to Socrates' advice, a person does not know himself or herself). A clique of size $k$ is a set of $k$ nodes where every node has an edge to every other node (i.e., within the clique, everyone knows everyone). An anticlique of size $k$ is a set of $k$ nodes where there are no edges between them (i.e., within the anticlique, no one knows anyone else). For example, the picture below shows a network with nodes labeled $1, 2, \dots, 7$, where $\{1, 2, 3, 4\}$ is a clique of size 4, and $\{3, 5, 7\}$ is an anticlique of size 3.
[Figure: network with nodes labeled 1 through 7]
(a) Form a random network with $n$ nodes by independently flipping fair coins to decide for each pair $\{x, y\}$ whether there is an edge joining them. Find the expected number of cliques of size $k$ (in terms of $n$ and $k$). (b) A triangle is a clique of size 3. For a random network as in (a), find the variance of the number of triangles (in terms of $n$). Hint: Find the covariances of the indicator random variables for each possible clique. There are $\binom{n}{3}$ such indicator r.v.s, some pairs of which are dependent. *(c) Suppose that $\binom{n}{k} < 2^{\binom{k}{2} - 1}$. Show that there is a network with $n$ nodes containing no cliques of size $k$ or anticliques of size $k$. Hint: Explain why it is enough to show that for a random network with $n$ nodes, the probability of the desired property is positive; then consider the complement.
Chapter 7: Problem 85 Introduction to Probability 1
Shakespeare wrote a total of 884647 words in his known works. Of course, many words are used more than once, and the number of distinct words in Shakespeare's known writings is 31534 (according to one computation). This puts a lower bound on the size of Shakespeare's vocabulary, but it is likely that Shakespeare knew words which he did not use in these known writings. More specifically, suppose that a new poem of Shakespeare were uncovered, and consider the following (seemingly impossible) problem: give a good prediction of the number of words in the new poem that do not appear anywhere in Shakespeare's previously known works. Ronald Thisted and Bradley Efron studied this problem in the papers [9] and [10], developing theory and methods and then applying the methods to try to determine whether Shakespeare was the author of a poem discovered by a Shakespearean scholar in 1985. A simplified version of their method is developed in the problem below. The method was originally invented by Alan Turing (the founder of computer science) and I.J. Good as part of the effort to break the German Enigma code during World War II. Let $N$ be the number of distinct words that Shakespeare knew, and assume these words are numbered from 1 to $N$. Suppose for simplicity that Shakespeare wrote only two plays, A and B. The plays are reasonably long and they are of the same length. Let $X_j$ be the number of times that word $j$ appears in play A, and $Y_j$ be the number of times it appears in play B, for $1 \le j \le N$. (a) Explain why it is reasonable to model $X_j$ as being Poisson, and $Y_j$ as being Poisson with the same parameter as $X_j$. (b) Let the numbers of occurrences of the word "eyeball" (which was coined by Shakespeare) in the two plays be independent $\mathrm{Pois}(\lambda)$ r.v.s. Show that the probability that "eyeball" is used in play B but not in play A is $e^{-\lambda}\left(\lambda - \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} - \frac{\lambda^4}{4!} + \cdots\right)$.
(c) Now assume that $\lambda$ from (b) is unknown and is itself taken to be a random variable to reflect this uncertainty. So let $\lambda$ have a PDF $f_0$. Let $X$ be the number of times the word "eyeball" appears in play A and $Y$ be the corresponding value for play B. Assume that the conditional distribution of $X, Y$ given $\lambda$ is that they are independent $\mathrm{Pois}(\lambda)$ r.v.s. Show that the probability that "eyeball" is used in play B but not in play A is the alternating series $P(X = 1) - P(X = 2) + P(X = 3) - P(X = 4) + \cdots$. Hint: Condition on $\lambda$ and use (b). (d) Assume that every word's numbers of occurrences in A and B are distributed as in (c), where $\lambda$ may be different for different words but $f_0$ is fixed. Let $W_j$ be the number of words that appear exactly $j$ times in play A. Show that the expected number of distinct words appearing in play B but not in play A is $E(W_1) - E(W_2) + E(W_3) - E(W_4) + \cdots$. (This shows that $W_1 - W_2 + W_3 - W_4 + \cdots$ is an unbiased predictor of the number of distinct words appearing in play B but not in play A: on average it is correct. Moreover, it can be computed just from having seen play A, without needing to know $f_0$ or any of the $\lambda_j$. This method can be extended in various ways to give predictions for unobserved plays.)
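The series in part (b) of Problem 85 can be checked numerically. For independent $\mathrm{Pois}(\lambda)$ counts, the probability that a word appears in play B but not in play A is $P(X = 0)\,P(Y \ge 1) = e^{-\lambda}(1 - e^{-\lambda})$, and the sketch below compares this with a truncated version of the alternating series. The value `lam = 1.7`, the truncation at 60 terms, and the function names are assumptions chosen for illustration.

```python
import math


def prob_in_b_not_a(lam):
    """P(X = 0, Y >= 1) for independent X, Y ~ Pois(lam)."""
    return math.exp(-lam) * (1 - math.exp(-lam))


def alternating_series(lam, terms=60):
    """Truncated e^{-lam} * (lam - lam^2/2! + lam^3/3! - lam^4/4! + ...)."""
    total = sum(
        (-1) ** (k + 1) * lam**k / math.factorial(k)
        for k in range(1, terms + 1)
    )
    return math.exp(-lam) * total


# Hypothetical rate, chosen only for illustration.
lam = 1.7

direct = prob_in_b_not_a(lam)
series = alternating_series(lam)

print(direct, series)
```

The two values agree to floating-point precision because the bracketed series is the Taylor expansion of $1 - e^{-\lambda}$; each term $e^{-\lambda}\lambda^k / k!$ is also $P(X = k)$, which is the link to the alternating series in part (c).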