This 6 page Study Guide was uploaded by Orval Funk on Monday September 28, 2015. The Study Guide belongs to STAT431 at University of Pennsylvania taught by Staff in Fall. Since its upload, it has received 46 views. For similar materials see /class/215438/stat431-university-of-pennsylvania in Statistics at University of Pennsylvania.
Date Created: 09/28/15
Statistics 431: Statistical Inference
Facts and Formulas

Probability foundations

The normal distribution and its samples

The probability density function of a $N(\mu, \sigma^2)$ rv is
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2}\,\frac{(x-\mu)^2}{\sigma^2}\right\}.$$
The population mean is $\mu$; the population SD is $\sigma$.

For a sample $X_1, \dots, X_n$ of size $n$ from a normal population:

The sample estimate of $\mu$ is the sample mean
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
The sample estimate of $\sigma^2$ is the sample variance
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right).$$
The distribution of the normal sample mean: $\bar{X} \sim N(\mu, \sigma^2/n)$.
The SD of $\bar{X}$, $\sigma/\sqrt{n}$, is also called the standard error (SE) of $\bar{X}$. We estimate it as $S/\sqrt{n}$.
$\bar{X}$ and $S^2$ are independent as rvs.
The distribution of the sample variance: $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$.

The sample histogram and the normal quantile plot are two graphical tools to judge whether a sample comes from an approximately normal population. Prefer the quantile plot.

The binomial distribution and its samples

Let $X$ count the number of successes in $n$ independent Bernoulli trials, each with probability $p$ of success. Then $X$ has the binomial distribution, $X \sim \mathrm{Bin}(n, p)$.

The probability mass function of a binomial rv is
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, \dots, n.$$
The population mean is $np$; the population SD is $\sqrt{np(1-p)}$.

For a binomial rv $X$ based on a Bernoulli sample $Z_1, \dots, Z_n$ from a population, the estimate of $p$ is $\hat{p} = X/n$.
The SD of $\hat{p}$ is $\sqrt{p(1-p)/n}$; it is also called the SE of $\hat{p}$. We estimate it as $\sqrt{\hat{p}(1-\hat{p})/n}$.
For large $n$, the distribution of $\hat{p}$ is approximately $N(p, \; p(1-p)/n)$.

Chapter 7 Confidence intervals

One sample mean

A $100\gamma\% = 100(1-\alpha)\%$ confidence interval (CI) for an unknown population mean $\mu$ has the general form
$$\bar{X} \pm C^* \frac{\tilde\sigma}{\sqrt{n}} = \left[\bar{X} - C^*\frac{\tilde\sigma}{\sqrt{n}}, \;\; \bar{X} + C^*\frac{\tilde\sigma}{\sqrt{n}}\right],$$
where $C^*$ is an appropriate upper quantile and $\tilde\sigma$ is an appropriate population SD or estimate thereof. The meaning of the confidence statement is that $P(\mu \in \text{Interval}) = \gamma$, at least approximately.

The important situations are:
$\sigma$ known and either population normal or $n$ large: $\tilde\sigma = \sigma$ and $C^* = z_{\alpha/2}$, the $\alpha/2$ upper quantile of the standard normal.
$\sigma$ unknown and $n$ large: $\tilde\sigma = S$ and $C^* = z_{\alpha/2}$.
$\sigma$ unknown and population normal: $\tilde\sigma = S$ and $C^* = t_{\alpha/2,\,n-1}$, the $\alpha/2$ upper quantile of the $t$ distribution with $n-1$ degrees of freedom (df).

The sample size needed to get an interval of width $w$ is approximately
$$n = \left(\frac{2 z_{\alpha/2}\,\sigma}{w}\right)^2,$$
rounded up to the nearest integer. When $\sigma$ is unknown, use an estimate from previous experience or from the corresponding value of $S$ in a pilot experiment.

One population proportion

When $n$ is large (say $n > 20$) and $\hat{p}$ is not too near 0 or 1, you can use the classical large-sample CI formula
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.$$
Otherwise use Wilson's formula, written here with $z = z_{\alpha/2}$:
$$\frac{\hat{p} + \dfrac{z^2}{2n} \pm z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z^2}{4n^2}}}{1 + \dfrac{z^2}{n}}.$$
When in doubt, use Wilson's formula; it is almost equivalent to the large-sample formula on large samples but is better on small samples.

The sample size needed to get an interval of width $w$ is
$$n = \left(\frac{2 z_{\alpha/2}}{w}\right)^2 p^*(1-p^*),$$
rounded up to the nearest integer. Here $p^*$ is a prior guess for $p$. A worst-case (possibly too large) $n$ is obtained by letting $p^* = 1/2$.

One-sided confidence interval (aka confidence bound)

A $100(1-\alpha)\%$ upper confidence bound results from replacing $\pm$ in the above by $+$ and $z_{\alpha/2}$ by $z_\alpha$, or $t_{\alpha/2,\,n-1}$ by $t_{\alpha,\,n-1}$. The other end of the interval is $-\infty$.
A $100(1-\alpha)\%$ lower confidence bound results from replacing $\pm$ in the above by $-$ and $z_{\alpha/2}$ by $z_\alpha$, or $t_{\alpha/2,\,n-1}$ by $t_{\alpha,\,n-1}$. The other end of the interval is $+\infty$.

Prediction intervals

Let $Y$ be a single future observation from a normal population distribution. A $100(1-\alpha)\%$ prediction interval for $Y$ takes the form
$$\bar{X} \pm C^*\,\tilde\sigma\sqrt{1 + \frac{1}{n}},$$
with $C^*$ and $\tilde\sigma$ as above. The interval has the property that $P(Y \in \text{Interval}) = 1 - \alpha$.

Chapter 8 Hypothesis tests

General theory

Tests involve a null hypothesis $H_0$ and an alternative hypothesis $H_A$. A typical case tests $H_0: \mu = \mu_0$ vs $H_A: \mu \ne \mu_0$ for an unknown mean parameter $\mu$. This is a two-sided test. A one-sided test looks like $H_0: \mu \le \mu_0$ vs $H_A: \mu > \mu_0$.

To conduct the test we need a test statistic $T$ and a rejection region like $|T| > c$ or $T > c$. Here $c$ is the critical value.

Suppose we are testing $H_0: \mu \le \mu_0$ vs $H_A: \mu > \mu_0$. Then $P_{\mu \in H_0}(\text{Reject } H_0) = P(\text{Type I error})$ for this $\mu$, and $P_{\mu \in H_A}(\text{Do not reject } H_0) = P(\text{Type II error}) = \beta$ for this $\mu$. The significance level $\alpha$ is the probability of a Type I error at the boundary value $\mu_0$. The power of the test is $P_{\mu \in H_A}(\text{Reject } H_0)$.

If we observe the test statistic value $T = t$, the p-value is the smallest $\alpha$ at which we can reject $H_0$ using $t$. If you know the p-value of a test, you know the outcome for every level $\alpha$:
p-value $< \alpha \Rightarrow$ reject; p-value $\ge \alpha \Rightarrow$ do not reject.

Duality: for the usual two-sided tests, a level $\alpha$ test does not reject $H_0: \mu = \mu_0$ exactly when a $100(1-\alpha)\%$ CI for $\mu$ contains $\mu_0$. There is a similar relationship between one-sided tests and upper/lower confidence bounds.

Particular tests: one population mean

A test of $H_0: \mu = \mu_0$ vs $H_A: \mu \ne \mu_0$ rejects when $|T| > C^*$, where
$$T = \frac{\bar{X} - \mu_0}{\tilde\sigma/\sqrt{n}}.$$
Here $\tilde\sigma$ and $C^*$ are as in the above discussion of two-sided confidence intervals. The p-value corresponding to $T = t$ can be found by looking up $P(T > |t|)$ for the normal distribution when $n$ is large, or the $t_{n-1}$ distribution when $n$ is small. NOTE: because of the absolute value signs, you must multiply the tabled value by 2.

The sample size $n$ at which a two-sided level $\alpha$ test has power $1-\beta$ under the alternative $\mu = \mu_1$ is approximately
$$n = \frac{\sigma^2 (z_{\alpha/2} + z_\beta)^2}{(\mu_0 - \mu_1)^2}.$$
In the one-sided case put $z_\alpha$ in place of $z_{\alpha/2}$. The resulting $n$ is only valid if it is large, since the formula uses large-sample normality.

A test of $H_0: \mu \ge \mu_0$ vs $H_A: \mu < \mu_0$ would reject when $T < -C^*$, where $C^*$ is from the lower confidence bound case discussed above. p-values can be found analogously (do not multiply the tabled value by 2).

Particular tests: one population proportion

To test $H_0: p = p_0$ vs $H_A: p \ne p_0$ use
$$T = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}},$$
with the critical value determined in the usual manner as a standard normal upper quantile. p-values are also determined from $T = t$ in an analogous manner. Here $n$ should not be too small; $np_0(1-p_0) > 5$ should suffice. For smaller $n$ there is a procedure we have not covered, based on the binomial distribution.

The $n$ for which a two-sided test of $p_0$ has power $1-\beta$ under the alternative $p = p_1$ is approximately
$$n = \left[\frac{z_{\alpha/2}\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p_1(1-p_1)}}{p_1 - p_0}\right]^2,$$
rounded up to the nearest integer. For a one-sided test replace $z_{\alpha/2}$ with $z_\alpha$.

Chapter 9 Inferences based on two samples

Inferences about the difference of two population means

A two-sided hypothesis test has the form $H_0: \mu_1 - \mu_2 = \Delta_0$ vs $H_A: \mu_1 - \mu_2 \ne \Delta_0$. Often $\Delta_0 = 0$.

If the sample from population A is independent of the sample from population B, reject when $|T| > C^*$, where
$$T = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\tilde\sigma_1^2/n_1 + \tilde\sigma_2^2/n_2}}.$$
Here $\tilde\sigma_1$, $\tilde\sigma_2$ and $C^*$ are like the values in the one-sample procedures. However, if $n_1$ or $n_2$ is small and $\sigma$ is unknown, you need to assume normal population distributions and treat $T$ as having a $t$ distribution with $\nu$ df. Here
$$\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}},$$
rounded to the nearest integer. Note that $\min(n_1 - 1, n_2 - 1) \le \nu \le n_1 + n_2 - 2$.

A $100(1-\alpha)\%$ CI for $\mu_1 - \mu_2$ takes the form $\bar{X} - \bar{Y} \pm C^* \cdot \mathrm{SE}$, where SE is the denominator of the test statistic $T$. p-values can be found in the usual way from the value $T = t$ and the corresponding table.

Formulas for $\beta$ can be derived from the structure of the test. Sample size calculations are complicated except in special cases. We do not give general formulas here, or for two proportions below.

If the additional assumption $\sigma_1 = \sigma_2$ is tenable, use
$$T = \frac{\bar{X} - \bar{Y} - \Delta_0}{S_{\text{pooled}}\sqrt{1/n_1 + 1/n_2}},$$
where
$$S^2_{\text{pooled}} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}.$$

When each observation from sample A is paired with an observation from sample B, inferences are based on the differences $D_i = X_i - Y_i$. The two-sample test of $H_0: \mu_1 - \mu_2 = \Delta_0$ then reduces to a one-sample test of $H_0: \mu_D = \Delta_0$ with the $D_i$'s as the sample.

Inferences about the difference of two population proportions

When testing $H_0: p_1 - p_2 = \Delta_0 \ne 0$, the situation is identical to the two-means case except that $\hat{p}_1 - \hat{p}_2$ replaces $\bar{X} - \bar{Y}$ and $\hat{p}_i(1-\hat{p}_i)$ replaces $\tilde\sigma_i^2$, $i = 1, 2$.

When testing $H_0: p_1 - p_2 = 0$ use
$$T = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(1/n_1 + 1/n_2\right)}},$$
where $\hat{p}$ is the combined sample estimate of $p$ defined by
$$\hat{p} = \frac{X + Y}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}.$$
Here again $n_1$ and $n_2$ should not be small. NOTE: To build a confidence interval for $p_1 - p_2$ you should not use the pooled variance estimate based on
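
As an illustration of the Chapter 7 formulas for one population proportion, the following is a minimal Python sketch of the classical large-sample interval, Wilson's interval, and the sample size for a target width. The function names (`upper_quantile`, `classical_ci`, `wilson_ci`, `n_for_width`) are hypothetical and not part of the original sheet; the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
from math import ceil, sqrt
from statistics import NormalDist


def upper_quantile(a):
    """z_a: the upper-a quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - a)


def classical_ci(x, n, alpha=0.05):
    """Large-sample CI: p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    z = upper_quantile(alpha / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_ci(x, n, alpha=0.05):
    """Wilson's interval for a proportion, with z = z_{alpha/2}."""
    p_hat = x / n
    z = upper_quantile(alpha / 2)
    centre = p_hat + z**2 / (2 * n)
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (centre - half) / denom, (centre + half) / denom


def n_for_width(w, p_star=0.5, alpha=0.05):
    """Sample size for a CI of total width w; p_star = 1/2 is the worst case."""
    z = upper_quantile(alpha / 2)
    return ceil((2 * z / w) ** 2 * p_star * (1 - p_star))
```

With zero successes in ten trials, `classical_ci(0, 10)` collapses to a single point at 0 while `wilson_ci(0, 10)` still returns an interval of positive width, which is one way to see why the sheet recommends Wilson's formula on small samples.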
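
The Chapter 8 one-sample test for a mean can be sketched in the same way. This example assumes the large-sample (normal) case only, so it uses standard normal quantiles rather than $t$ quantiles; the names `z_test_two_sided` and `n_for_power` are hypothetical helpers introduced for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def z_test_two_sided(xbar, mu0, sd, n):
    """Return (t, p-value) for H0: mu = mu0 vs HA: mu != mu0 (normal case)."""
    t = (xbar - mu0) / (sd / sqrt(n))
    # Two-sided: double the upper-tail probability beyond |t|.
    p_value = 2 * (1 - Z.cdf(abs(t)))
    return t, p_value


def n_for_power(mu0, mu1, sd, alpha=0.05, beta=0.20, two_sided=True):
    """Approximate n giving power 1 - beta at the alternative mu = mu1."""
    z_a = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_b = Z.inv_cdf(1 - beta)
    return ceil(((z_a + z_b) * sd / (mu0 - mu1)) ** 2)
```

The sheet does not say to round this sample size up; `ceil` is used here by analogy with the other sample-size formulas. As the sheet notes, the resulting $n$ is only trustworthy when it is large, because the formula relies on large-sample normality.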
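
For the Chapter 9 two-sample comparison of means with independent samples, the unpooled statistic with its approximate df and the pooled-variance statistic can be sketched as follows. The function names `unpooled_t` and `pooled_t` are hypothetical, and the critical value or p-value still has to be looked up in a $t$ table, since the Python standard library has no $t$ distribution.

```python
from math import sqrt
from statistics import mean, variance


def unpooled_t(x, y, delta0=0.0):
    """Return (T, SE, nu) using separate sample variances S1^2 and S2^2."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2  # S1^2/n1 and S2^2/n2
    se = sqrt(v1 + v2)
    t = (mean(x) - mean(y) - delta0) / se
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, se, round(nu)  # df rounded to the nearest integer


def pooled_t(x, y, delta0=0.0):
    """Return (T, S_pooled^2) under the assumption sigma1 = sigma2."""
    n1, n2 = len(x), len(y)
    s2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y) - delta0) / (sqrt(s2) * sqrt(1 / n1 + 1 / n2))
    return t, s2
```

When $n_1 = n_2$ the two statistics coincide and only the df differ, so the choice between them matters most for unequal sample sizes.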