Adv. Topics ISYE 8843
Class notes for ISYE 8843, Georgia Institute of Technology (Fall term).
1 The Likelihood Principle

The likelihood principle concerns the foundations of statistical inference, and it is often invoked in arguments about correct statistical reasoning. Let $f(x|\theta)$ be a conditional distribution for $X$ given the unknown parameter $\theta$. For the observed data, $X = x$, the function $\ell(\theta) = f(x|\theta)$, considered as a function of $\theta$, is called the likelihood function. The name "likelihood" implies that, given $x$, the value $\theta_1$ is more likely to be the true parameter than $\theta_2$ if $\ell(\theta_1) > \ell(\theta_2)$.

Likelihood Principle. In the inference about $\theta$, after $x$ is observed, all relevant experimental information is contained in the likelihood function for the observed $x$. Furthermore, two likelihood functions contain the same information about $\theta$ if they are proportional to each other.

Remark: Maximum likelihood estimation satisfies the likelihood principle.

[Figure 1: Leonard Jimmie Savage. Born November 20, 1917, Detroit, Michigan; died November 1, 1971, New Haven, Connecticut.]

The following example, quoted by Lindley and Phillips (1976), is an argument of Leonard Savage discussed at the Purdue Symposium (1962). It shows that the inference can critically depend on the likelihood principle.

Example 1 (Testing fairness). Suppose we are interested in testing $\theta$, the unknown probability of heads for a possibly biased coin. Suppose $H_0: \theta = 1/2$ vs. $H_1: \theta > 1/2$. An experiment is conducted, and 9 heads and 3 tails are observed. This information is not sufficient to fully specify the model $f(x|\theta)$. A "rashomonian" analysis follows:

• Scenario 1: The number of flips, $n = 12$, is predetermined. Then the number of heads $X$ is binomial, $Bin(12, \theta)$, with probability mass function

$$ P_\theta(X = x) = f(x|\theta) = \binom{12}{x} \theta^x (1-\theta)^{12-x}, $$

and the likelihood is $\ell(\theta) = \binom{12}{9}\theta^9(1-\theta)^3 = 220\,\theta^9(1-\theta)^3$. For a frequentist, the p-value of the test is

$$ P(X \ge 9 \,|\, \theta = 1/2) = \sum_{x=9}^{12} \binom{12}{x} \left(\frac{1}{2}\right)^{12} = \frac{220 + 66 + 12 + 1}{4096} \approx 0.073, $$

and, if you recall classical testing, $H_0$ is not rejected at level $\alpha = 0.05$.

• Scenario 2: The number of tails ("successes") $r = 3$ is predetermined, i.e., the flipping is continued until 3 tails are observed. Then $X$, the number of heads ("failures") until 3 tails appear, is Negative Binomial¹, $NB(3, 1-\theta)$.
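Both stopping rules are easy to evaluate numerically. Below is a minimal sketch using only the Python standard library; the Scenario 2 tail probability anticipates the calculation that follows.

```python
from math import comb

# Data: 9 heads, 3 tails; test H0: theta = 1/2 vs H1: theta > 1/2.

# Scenario 1: n = 12 flips fixed in advance, X ~ Binomial(12, theta).
# p-value = P(X >= 9 | theta = 1/2).
p_binom = sum(comb(12, x) for x in range(9, 13)) / 2**12

# Scenario 2: flip until r = 3 tails; X = number of heads ~ NB(3, 1 - theta).
# p-value = P(X >= 9 | theta = 1/2) = 1 - P(X <= 8).
p_negbin = 1 - sum(comb(x + 2, x) * 0.5**(x + 3) for x in range(0, 9))

print(round(p_binom, 4))   # 0.073
print(round(p_negbin, 4))  # 0.0327

# The two likelihoods are 220 * theta^9 (1-theta)^3 and 55 * theta^9 (1-theta)^3:
# proportional, so by the Likelihood Principle both experiments carry the
# same information about theta, despite the different p-values.
print(comb(12, 9), comb(11, 9))  # 220 55
```

The constant ratio 220/55 = 4 is exactly the kind of proportionality the likelihood principle declares irrelevant.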
Its probability mass function is

$$ f(x|\theta) = \binom{3+x-1}{x} (1-\theta)^3 \theta^x, \quad x = 0, 1, 2, \ldots, $$

so the likelihood is $\ell(\theta) = \binom{11}{9}(1-\theta)^3\theta^9 = 55\,\theta^9(1-\theta)^3$. For a frequentist, large values of $X$ are critical, and the p-value of the test is

$$ P(X \ge 9 \,|\, \theta = 1/2) = \sum_{x=9}^{\infty} \binom{x+2}{x} \left(\frac{1}{2}\right)^{x+3} \approx 0.0327. $$

The hypothesis $H_0$ is rejected, and this change in decision is not caused by the observations.

According to the likelihood principle, all relevant information is in the likelihood $\ell(\theta) \propto \theta^9(1-\theta)^3$, and Bayesians could not agree more. Edwards, Lindman, and Savage (1963, p. 193) note:

"The likelihood principle emphasized in Bayesian statistics implies, among other things, that the rules governing when data collection stops are irrelevant to data interpretation. It is entirely appropriate to collect data until a point has been proven or disproven, or until the data collector runs out of time, money, or patience."

2 Sufficiency

The sufficiency principle is noncontroversial, and frequentists and Bayesians are in agreement: if the inference involving the family of distributions and parameter of interest allows for a sufficient statistic, then the sufficient statistic should be used. This agreement is non-philosophical; it is rather a consequence of mathematics (measure-theoretic considerations).

¹Let $p$ be the probability of success in a trial. The number of failures in a sequence of trials until the $r$-th success is observed is Negative Binomial, $NB(r, p)$, with probability mass function $P(X = x) = \binom{r+x-1}{x} p^r (1-p)^x$, $x = 0, 1, 2, \ldots$ For $r = 1$, the Negative Binomial distribution becomes the Geometric distribution, $NB(1, p) \equiv \mathcal{G}(p)$.

Suppose that the distribution of a random variable $X$ depends on the unknown parameter $\theta$. A statistic $T(X)$ is sufficient if the conditional distribution of $X$ given $T(X) = t$ is free of $\theta$. The Fisher–Neyman factorization lemma states that the likelihood can be represented as

$$ \ell(\theta) = f(x|\theta) = g(T(x), \theta)\, h(x). $$

Example. Let $X_1, \ldots, X_n$ be a sample from the uniform $\mathcal{U}(0, \theta)$ distribution, with density $f(x|\theta) = \frac{1}{\theta}\,\mathbf{1}(0 \le x \le \theta)$. Then

$$ \ell(\theta) = \prod_{i=1}^{n} f(X_i|\theta) = \frac{1}{\theta^n}\, \mathbf{1}\Big(0 \le \min_i X_i\Big)\, \mathbf{1}\Big(\max_i X_i \le \theta\Big). $$

The statistic $T = \max_i X_i$ is sufficient, with $h(x) = \mathbf{1}(0 \le \min_i x_i)$ and $g(T, \theta) = \frac{1}{\theta^n}\,\mathbf{1}(T \le \theta)$. If the likelihood principle is adopted, all inference about $\theta$ should depend on the sufficient statistic.
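The factorization in the uniform example can be checked numerically. A small sketch (the sample, seed, and $\theta$ grid are illustrative choices, not from the notes):

```python
import random

# Numerical check of the Fisher-Neyman factorization for a U(0, theta) sample:
# l(theta) = prod (1/theta) 1(0 <= x_i <= theta) should equal g(T, theta) h(x)
# with T = max(x_i), g(T, theta) = theta^{-n} 1(T <= theta), h(x) = 1(min >= 0).

random.seed(1)
x = [random.uniform(0, 2.0) for _ in range(5)]  # sample with true theta = 2.0
n, T = len(x), max(x)

def likelihood(theta):
    return theta**-n if all(0 <= xi <= theta for xi in x) else 0.0

def g_times_h(theta):
    g = theta**-n if T <= theta else 0.0
    h = 1.0 if min(x) >= 0 else 0.0
    return g * h

for theta in [0.5, 1.0, T, 2.0, 3.0]:
    assert abs(likelihood(theta) - g_times_h(theta)) < 1e-12
print("factorization holds on the grid")
```

For $\theta < T$ both sides vanish; for $\theta \ge T$ both equal $\theta^{-n}$, so only $T = \max_i X_i$ matters.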
Indeed, $\ell(\theta) \propto g(T(x), \theta)$.

Sufficiency Principle. Let two different observations $x$ and $y$ have the same value, $T(x) = T(y)$, of a statistic sufficient for the family $f(\cdot|\theta)$. Then the inferences about $\theta$ based on $x$ and $y$ should be the same.

3 Conditionality Perspective

The conditional perspective concerns reporting data-specific measures of accuracy. In contrast to the frequentist approach, the performance of statistical procedures is judged by looking at the observed data. The difference in approach is illustrated in the following example.

Example 2. Consider estimating $\theta$ in the model

$$ P_\theta(X = \theta - 1) = P_\theta(X = \theta + 1) = \frac{1}{2}, \quad \theta \in \mathbb{R}, $$

on the basis of two observations, $X_1$ and $X_2$. The procedure suggested is

$$ \delta(X_1, X_2) = \begin{cases} \dfrac{X_1 + X_2}{2}, & X_1 \ne X_2, \\[4pt] X_1 - 1, & X_1 = X_2. \end{cases} $$

To a frequentist, this procedure has confidence of 75% for all $\theta$, i.e., $P_\theta(\delta = \theta) = 0.75$. The conditionalist would report confidence of 100% if the observed data in hand are different (easy to check!), or 50% if the observations coincide. Does it make sense to report the pre-experimental accuracy, which is known to be misleading after observing the data?

Conditionality Principle. If an experiment concerning the inference about $\theta$ is chosen from a collection of possible experiments, independently of $\theta$, then any experiment not chosen is irrelevant to the inference.

Example (from Berger (1985), a variant of Cox's (1958) example). Suppose that a substance to be analyzed is to be sent to either one of two labs, one in California or one in New York. The two labs seem equally equipped and qualified, and a coin is flipped to decide which one will be chosen. The coin comes up tails, denoting that the California lab is to be chosen. After the results are returned and a report is to be written, should the report take into account the fact that the coin did not land heads up and that the New York laboratory could have been chosen? Common sense and the conditional viewpoint say NO, but the frequentist approach calls for averaging over all possible data, even the possible New York data.

The conditionality principle makes clear the implication of the likelihood principle that any inference should depend only on the outcome observed.
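Returning to Example 2, the frequentist and conditional confidence reports can be reproduced by simulation; the true $\theta$ and the number of replicates below are arbitrary illustrative choices.

```python
import random

# Monte Carlo check of Example 2: X1, X2 each equal theta - 1 or theta + 1
# with probability 1/2; the estimator is the midpoint if X1 != X2, else X1 - 1.
random.seed(2)
theta = 7.0  # arbitrary true value; the coverage does not depend on it
N = 100_000
hits = hits_diff = hits_same = n_diff = 0
for _ in range(N):
    x1 = theta + random.choice([-1.0, 1.0])
    x2 = theta + random.choice([-1.0, 1.0])
    est = (x1 + x2) / 2 if x1 != x2 else x1 - 1.0
    ok = (est == theta)
    hits += ok
    if x1 != x2:
        n_diff += 1
        hits_diff += ok
    else:
        hits_same += ok

print(round(hits / N, 2))                  # ~0.75 overall (pre-experimental)
print(hits_diff == n_diff)                 # estimator is exact when X1 != X2
print(round(hits_same / (N - n_diff), 2))  # ~0.5 when the observations coincide
```

The unconditional 75% is just the average of the two conditional accuracies, 100% and 50%, each occurring half the time.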
It should not depend on any other outcome we might have observed; this sharply contrasts with the Neyman–Pearson or, more generally, the frequentist approach to inference. In particular, questions of unbiasedness, minimum variance and risk, consistency, the whole apparatus of confidence intervals, significance levels and power of tests, etc., violate the conditionality principle.

Example 1 (continued, Testing fairness). Here is yet another scenario, one that will not impress a conditionalist.

• Scenario 3: Another coin (not the coin for which $\theta$ is to be tested), with known probability of heads equal to $\xi$, was flipped. If heads came up, an experiment as in Scenario 1 was performed; if tails came up, Scenario 2 was used. The number of heads in the experiment was 9, and the number of tails observed was 3. Can you design $\xi$ so that the p-value of the test matches exactly $\xi$?

However, even the conditionalist agrees that the following scenario yields different evidence about $\theta$ than Scenarios 1–3: the selection of the experiment depends on the parameter $\theta$, which is in violation of the conditionality principle.

• Scenario 4: The coin for which $\theta$ is to be tested was pre-flipped to determine what kind of experiment is to be performed. If the coin was heads up, an experiment as in Scenario 1 was performed; if it was tails up, Scenario 2 was used. The number of heads in the subsequent experiment was 9, and the number of tails observed was 3; the initial flip specifying the experiment was not counted.

The following result establishes the equivalence between the likelihood principle and the conjunction of the sufficiency and conditionality principles.

Birnbaum (1962): Sufficiency Principle + Conditionality Principle $\equiv$ Likelihood Principle.

Berger (1985), Berger and Wolpert (1988), and Robert (2001) have additional discussion and provide more examples. A proof of Birnbaum's theorem can be found in Robert (2001), pages 18–19.

4 Some Sins of Being non-Bayesian

Thou shalt not integrate with respect to the sample space.
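This commandment can be made concrete with a short computation anticipating the Lindley paradox example discussed next: with $\bar{y} \mid \theta \sim N(\theta, 1/n)$, prior mass $1/2$ at $\theta = 0$ and $1/2$ spread uniformly over $[-M/2, M/2]$, the same data that a z-test rejects at the 5% level leave posterior odds of about 11 to 1 in favor of $H_0$. A sketch (stdlib only; the numbers are those of the example):

```python
from math import sqrt, exp, pi, erf

# Model: ybar | theta ~ N(theta, 1/n); prior P(theta = 0) = 1/2, and the
# remaining 1/2 is uniform on [-M/2, M/2]. Data as in the Lindley example.
n, ybar, M = 40_000, 0.01, 1.0

# Classical side: |z| = 2 > 1.96, so H0 is rejected at the 5% level.
z = ybar * sqrt(n)

# Density of ybar at theta (normal with variance 1/n) ...
def density(theta):
    return sqrt(n / (2 * pi)) * exp(-n * (ybar - theta) ** 2 / 2)

m0 = density(0.0)  # marginal density of the data under H0

# ... and the marginal under H1: (1/M) * integral of density over [-M/2, M/2],
# computed exactly via the normal CDF. It is essentially 1/M = 1 here, since
# the N(theta, 1/n) likelihood is concentrated well inside the interval.
s = 1 / sqrt(n)
def Phi(t):
    return 0.5 * (1 + erf(t / sqrt(2)))
m1 = (Phi((M / 2 - ybar) / s) - Phi((-M / 2 - ybar) / s)) / M

print(round(z, 1))        # 2.0
print(round(m0 / m1, 1))  # 10.8 -- posterior odds for H0, roughly 11 to 1
```

Averaging over the sample space rejects $H_0$; conditioning on the data in hand strongly supports it.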
A perfectly valid hypothesis can be rejected because the test failed to account for unlikely data that had not been observed.

The Lindley Paradox². Suppose $\bar{y} \mid \theta \sim N(\theta, 1/n)$. We wish to test $H_0: \theta = 0$ vs. the two-sided alternative. Suppose a Bayesian puts the prior $P(\theta = 0) = P(\theta \ne 0) = 1/2$, and in the case of the alternative the $1/2$ is uniformly spread over the interval $[-M/2, M/2]$. Suppose $n = 40000$ and $\bar{y} = 0.01$ are observed, so $z = \sqrt{n}\,\bar{y} = 2.0$. A classical statistician rejects $H_0$ at level $\alpha = 0.05$. We will show that the posterior odds in favor of $H_0$ are about 11 to 1 if $M = 1$, so a Bayesian statistician strongly favors $H_0$.

Ten in a row. Consider testing $\theta$, the unknown probability of getting a correct answer. $H_0: \theta = 1/2$ corresponds to guessing; the alternative, $H_1: \theta > 1/2$, to a claim of genuine ability, or ESP.

• A lady who adds milk to her tea claims to be able to tell whether the tea or the milk was poured into the cup first. In all 10 trials conducted to test this, she is correct in determining what was poured first.

• A music expert claims to be able to distinguish a page of a Haydn score from a page of a Mozart score. In all 10 trials conducted to test this, he makes the correct determination each time.

• A drunken friend claims to be able to predict the result of a fair coin flip. In all 10 trials conducted to test this, he is correct each time.

In all 3 situations the one-sided p-value is $2^{-10}$, and the hypothesis $H_0: \theta = 1/2$ is rejected. We will return to this bullet later, with a Bayesian alternative.

Probability of heads. A coin is biased, and the probability of heads $\theta$ is of interest. The coin is flipped 4 times, and 4 tails are obtained. The frequentist estimate is $\hat{\theta} = 0$. A more sensible Bayes solution will be proposed later.

More sins to follow...

5 Exercises

1. Let $X_1, \ldots, X_n$ be a Bernoulli $\mathcal{B}er(p)$ sample. Show that $T(X) = \sum_i X_i$ is sufficient by demonstrating that $X \mid T = t$ is discrete uniform on the space of all $n$-tuples $x = (x_1, \ldots, x_n)$, $x_i = 0$ or $1$, such that $\sum_i x_i = t$, with probabilities $\binom{n}{t}^{-1}$, and thus independent of $p$.

2. Let $X_1, \ldots, X_n$ be a sample from the exponential distribution $\mathcal{E}(\theta)$. Show that $\sum_i X_i$ is sufficient.
Do this by demonstrating that the conditional distribution of $X_1, \ldots, X_n$ given $\sum_i X_i = t$ is free of $\theta$. Use the fact that $\sum_i X_i$ has a Gamma distribution.

3. The Two-Envelope Paradox. Here is a paradox coming from a subtle use of conditional reasoning in a place where unconditional reasoning is needed. I am presented with two envelopes, A and B, and told that one contains twice as much money as the other. I am given envelope A and offered the options of keeping envelope A or switching to B. What should I do? I reason: (i) for any $x$, if I knew that A contained $x$, then the odds are even that B contains either $2x$ or $x/2$, so the expected amount in B would be $5x/4$. So, (ii) for any $x$, I should switch.

²Lindley, D. V. (1957). A Statistical Paradox. Biometrika, 44, 187–192.
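The flaw in the two-envelope reasoning of Exercise 3 can be seen by simulation. To simulate at all, one must commit to some distribution for the amounts, and the "even odds for every observed $x$" step then fails. A sketch under the illustrative assumption that the smaller amount is uniform on $\{1, \ldots, 100\}$:

```python
import random

# Two-envelope 'always switch' reasoning checked by simulation.
# Illustrative assumption: the smaller amount is uniform on 1..100, and
# each envelope is equally likely to be the one handed to me.
random.seed(3)
N = 200_000
keep = switch = 0.0
for _ in range(N):
    small = random.randint(1, 100)
    a, b = (small, 2 * small) if random.random() < 0.5 else (2 * small, small)
    keep += a
    switch += b

# Unconditionally, switching gains nothing: both averages are near
# 1.5 * E[small] = 75.75.
print(round(keep / N, 1), round(switch / N, 1))  # both approximately 75.75

# The 5x/4 step errs by treating '2x or x/2 with even odds' as valid for
# every observed amount x; no proper prior on the amounts can support that.
```

Conditioning on $A = x$ is harmless only if done with the correct conditional odds, which depend on the prior for the amounts; assuming even odds uniformly in $x$ is the subtle misuse of conditional reasoning the exercise refers to.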