# Statistical Methods for Bioscience I STAT 571


These 10 pages of class notes were uploaded by Mrs. Triston Collier on Thursday, September 17, 2015. The notes belong to STAT 571 at the University of Wisconsin - Madison, taught by Staff in Fall.

Date Created: 09/17/15

## STATISTICS 571, Discussion 6 (TA: Perla Reyes)

Email: reyes@stat.wisc.edu. Office: 248 MSC. Office hours: M 2:30-3:30, R 3:30-4:30.

### Review

1. Confidence interval for $\mu$

   - (a) Suppose $X \sim N(\mu, \sigma^2)$ and $X_1, X_2, \ldots, X_n$ is a random sample from this distribution.

     i. If $\sigma$ is known, the $(1-\alpha)$ CI for $\mu$ is

     $$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},$$

     where $z_{\alpha/2}$ is such that $P(Z \ge z_{\alpha/2}) = \alpha/2$.

     ii. If $\sigma$ is unknown, the $(1-\alpha)$ CI for $\mu$ is

     $$\bar{x} - t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}},$$

     where $t_{n-1,\alpha/2}$ is such that $P(T_{n-1} \ge t_{n-1,\alpha/2}) = \alpha/2$.

   - (b) Suppose the distribution of $X$ is unknown but the sample size $n$ is large, with $E(X) = \mu$ and $Var(X) = \sigma^2$.

     i. If $\sigma$ is known, the $(1-\alpha)$ CI for $\mu$ is

     $$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$

     ii. If $\sigma$ is unknown, the $(1-\alpha)$ CI for $\mu$ is

     $$\bar{x} - z_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{s}{\sqrt{n}}.$$

     In both cases $z_{\alpha/2}$ is such that $P(Z \ge z_{\alpha/2}) = \alpha/2$.

2. Confidence interval for $p$

   If $X$ is distributed as $B(n, p)$ with $n\hat{p} > 5$ and $n(1-\hat{p}) > 5$, then the $(1-\alpha)$ CI for $p$ is

   $$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.$$

   Note that in hypothesis testing we use the hypothesized value $p_0$ to calculate the p-value, but we use $\hat{p}$ in computing the CI for $p$.

3. Relation between the CI and two-sided hypothesis testing (not for the binomial distribution)

   - (a) If the $(1-\alpha)$ CI for $\mu$ contains the hypothesized value $\mu_0$, then we do not reject the null hypothesis $H_0$ at level $\alpha$.
   - (b) If the $(1-\alpha)$ CI for $\mu$ does not contain the hypothesized value $\mu_0$, then we reject the null hypothesis $H_0$ at level $\alpha$.

### Practice Problems

1. A horticulturist is interested in evaluating the time to blossom of a new type of flower. From previous experience, the time to blossom of a plant is normally distributed. The horticulturist randomly selects 21 plants from his greenhouse and measures their time to blossom, with the following results: $\bar{x} = 39$ days and $s = 5.1$ days.

   - (a) Give a 90% CI for the mean time to blossom.
   - (b) Describe the effects on the CI if $\sigma$ is known and $\sigma = 5.1$.

2. A forester measures 100 needles randomly selected from a pine tree and finds $\bar{x} = 3.1$ cm and $s = 0.7$ cm.

   - (a) What is a point estimate for $\mu$, i.e., the population mean needle length of the tree?
   - (b) Construct a 95% CI for $\mu$. Does the interval cover the true mean?
   - (c) Test the claim that $\mu = 3$ against the two-sided alternative using the result in (b).

3. A machine in a food processing factory must be repaired if it produces more than 10% defectives among the large lot of items it produces in a day. A random sample of 100 items from the day's production contains 15 defectives.

   - (a) Using $\alpha = 0.01$, would you conclude that the machine needs repair?
   - (b) Find a 99% CI for the proportion of defectives in the population.
   - (c) State the relationship, if any, between the results in (a) and (b).

4. Suppose we are sampling from a $N(\mu, 64)$ distribution. How large must $n$ be so that a 95% CI for $\mu$ has length equal to 0.1?

## STATISTICS 571, Discussion No. 1 (TA: Perla E. Reyes)

Office: B248 MSC (Basement). Phone: 263-7310. Email: reyes@stat.wisc.edu. Office hours: M 2:30-3:30 and R 3:30-4:30.

### I. Review

1. Stem and Leaf Plots

   - (a) Advantage: it can be constructed quickly, and we can extract all the data values from the plot.
   - (b) Disadvantage: not useful for large data sets; the choice of stem values may affect the distribution pattern of the data.

2. Histograms

   - (a) Advantage: useful for large data sets.
   - (b) Disadvantage: the choice of class boundaries can affect the appearance of the histogram.

3. Dot Plots

   - (a) Advantage: it can be constructed quickly.
   - (b) Disadvantage: when the number of data points is small, it is difficult to identify any pattern of variation.

4. Boxplots (constructed from the minimum, maximum, median, 1st quartile, and 3rd quartile: the five-number summary)

   - (a) Advantage: they are particularly effective for graphically portraying comparisons among sets of data, and they have a high visual impact.
   - (b) Disadvantage: they are more difficult to construct.

5. Measures of Location

   - (a) Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. Sensitive to outlying values.
   - (b) Sample median (for ordered data): when the sample size is odd, the median is the value of the middle observation; when the sample size is even, the median is the average of the middle two. Robust to outlying values.
   - (c) Finding the $p$th sample quantile (also called the $100p$th percentile):

     i. Put the data in order, from smallest to largest. Compute $np$, where $n$ is the sample size.

     ii. If $np$ is an integer, then the quantile is the average of the $(np)$th and the $(np+1)$th numbers in the list. If $np$ is not an integer, then round up, and use the observation which occurs at that place in the list.

     The 1st quartile is the 0.25 quantile; the 3rd quartile is the 0.75 quantile.

6. Measures of Spread

   - range = maximum - minimum
   - interquartile range (IQR) = 3rd quartile - 1st quartile
   - variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
   - standard deviation: $s = \sqrt{s^2}$
   - coefficient of variation: $cv = s / \bar{x}$

### II. Practice Problems

1. Consider the following two sets of data:

   x: 4 5 78 64 15

   y: 33 79 187 6 559

   - (a) How do you evaluate the following?

     i. $\sum x_i$

     ii. 2 95

     iii. 2 x02

     iv. 2719139

     v. 211 iiyi

     vi. 2151 96 EiLi

     vii. $\sum a x_i$ with $a = 2$

     viii. $a \sum x_i$ with $a = 2$

     ix. $\sum a$ with $a = 4$

   - (b) Which sample, x or y, do you think has the larger mean?
   - (c) Which sample, x or y, do you think has the larger variance?
   - (d) Verify numerically that, except for rounding error, the $n = 5$ values satisfy the following:

     i. gm i 92 0 5

     ii. 5 7 5 7 5 397 92 11 Ei1i 32 221 i 71002 221 22 z

2. A company bottles milk in several sizes of container. A random sample of 17 containers is obtained from the small container size. The volume of milk in ounces is measured for each container. The volumes are:

   599 584 595 609 593 588 592 604 600 589 595 597 590 591 603 589 598

   - (a) Make a stem and leaf display.
   - (b) Find the mean, standard deviation, median, 1st quartile, 3rd quartile, range, IQR, and 20th percentile of the data.
   - (c) Construct a box plot for these data.

## Stat 571, Discussion 4, Fall 2003 (TA: Guang Cheng)

Office: 4261 CSSC. Phone: 263-7310. E-mail: cheng@stat.wisc.edu.

### Review

1. The mean and variance of $X \sim \text{Binomial}(n, p)$ are $E(X) = np$ and $Var(X) = np(1-p)$.

2. If $X_1, X_2, \ldots, X_n$ constitute a random sample from a distribution with mean $\mu$ and variance $\sigma^2$, then

   $$E(\bar{X}) = \mu, \qquad Var(\bar{X}) = \frac{\sigma^2}{n},$$

   $$E\Big(\sum_{i=1}^{n} X_i\Big) = n\mu, \qquad Var\Big(\sum_{i=1}^{n} X_i\Big) = n\sigma^2.$$

3. Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Then the random variables $\bar{X}$ and $\sum_{i=1}^{n} X_i$ are distributed as

   $$\bar{X} \sim N\Big(\mu, \frac{\sigma^2}{n}\Big) \quad \text{and} \quad \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2).$$

4. Central Limit Theorem: for a random sample $X_1, X_2, \ldots, X_n$ from an arbitrary distribution with mean $\mu$ and variance $\sigma^2$, when $n$ is large, $\bar{X}$ is approximately distributed as $\bar{X} \sim N(\mu, \sigma^2/n)$.

5. Let $X \sim B(n, p)$. The normal approximation for $X$ is $X \sim N(np, np(1-p))$. The proportion $Y = X/n$ can be approximated by the normal distribution $N(p, p(1-p)/n)$. The requirement for both approximations is that $np \ge 5$ and $n(1-p) \ge 5$.

6. Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Then the random variable

   $$V^2 = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1},$$

   where $S^2$ is the sample variance, defined as $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$.

### Practice Problem

1. Suppose the random variable $X$ is distributed as $N(100, 144)$.

   - (a) Consider a random sample with $n = 16$. What is the distribution of $\bar{X} = \frac{1}{16}\sum_{i=1}^{16} X_i$?
   - (b) Find $\Pr(\bar{X} < 103)$.
   - (c) What is the distribution of $\sum_{i=1}^{16} X_i$?
   - (d) How large does $n$ have to be in order that $Var(\bar{X}) = 1$?

2. Suppose that 20% of the trees in a forest are infected with a certain type of parasite. Let $X$ denote the number of trees having the parasite in a random sample of 300 trees.

   - (a) Compute $\Pr(49 \le X \le 71)$.
   - (b) Define $Y$ to be the proportion of trees having the parasite, $Y = X/300$. Find $\Pr(Y < 0.25)$.

3. Suppose we have observations from a $N(\mu, \sigma^2)$ distribution. Compute the following:

   724 20 11 7 9 250 7 29 025 99 6 8 85 95

4. There are two coins, a green one and a red one. The green coin is fair (probability of heads is 0.5), whereas the red coin has probability of heads equal to 0.8.

   - (a) Suppose the green coin is flipped independently twice and the red coin is flipped once. Assume that the flips of the green and red coins are independent of each other. Find the probability that the total number of heads is exactly 2.
   - (b) Let $X$ be the total number of heads from the 3 flips. Find the probability distribution of $X$.
   - (c) Find the expected value of $X$ and the variance of $X$.

### Solution for practice problem

1. - (a) $\bar{X} \sim N(100, 144/16)$, that is, $\bar{X} \sim N(100, 9)$.
   - (b) $\Pr(\bar{X} < 103) = \Pr\big(Z < \frac{103 - 100}{3}\big) = \Pr(Z < 1) = 0.8413$.
   - (c) $\sum_{i=1}^{16} X_i \sim N(16 \cdot 100,\ 16 \cdot 144) = N(1600, 2304)$.
   - (d) $Var(\bar{X}) = \sigma^2/n = 144/n$, so the sample size $n$ has to be 144 in order that $Var(\bar{X}) = 1$.

2. - (a) $X \sim \text{Bin}(300, 0.20)$. Approximately, $X \sim N(300 \cdot 0.2,\ 300 \cdot 0.2 \cdot 0.8) = N(60, 48)$. Therefore

     $$\Pr(49 \le X \le 71) = \Pr\Big(\frac{49 - 60}{\sqrt{48}} \le Z \le \frac{71 - 60}{\sqrt{48}}\Big) = \Pr(-1.59 \le Z \le 1.59) = 1 - 2(0.0559) = 0.8882.$$

   - (b) $Y = X/300 \sim N\big(0.20,\ \frac{0.2 \cdot 0.8}{300}\big) = N(0.20, 0.00053)$, so

     $$\Pr(Y < 0.25) = \Pr\Big(Z < \frac{0.25 - 0.20}{\sqrt{0.00053}}\Big) = \Pr(Z < 2.17) = 1 - 0.0150 = 0.9850.$$

3. a rst row ms 3 7 PrV2 g PrV2 g 35 0025lt ms 3 7lt005 ms 2 9 PrV2 2 PrV2 2 45 090lt ms lt 7lt095 b second row ms 3 a PrV2 3 w 0025 1531a383 ms 2 b PrV2 3 w 099 W1356b339 C third row ms 3 85 PrV2 g 183 095 Water 72423

4. Define $X_1$ as the number of heads obtained from the green coin and $X_2$ as the number of heads from the red coin. Then $X_1$ and $X_2$ are independent, and we have the probability distributions of $X_1$ and $X_2$.

   - (a) $P(X = 2) = P(X_1 = 1, X_2 = 1) + P(X_1 = 2, X_2 = 0) = 0.5 \times 0.8 + 0.25 \times 0.2 = 0.45$.
   - (b) The distribution of $X$ is:

     | $x$ | 0 | 1 | 2 | 3 |
     |---|---|---|---|---|
     | $P(X = x)$ | 0.05 | 0.3 | 0.45 | 0.2 |

   - (c) $E(X) = 1.8$, which is the same as $E(X_1) + E(X_2) = 1 + 0.8$. $Var(X) = 0.66$, which equals $Var(X_1) + Var(X_2) = 0.5 + 0.16$, since the single red flip has variance $0.8 \times 0.2 = 0.16$.

## STATISTICS 571, Discussion 4 (TA: Perla Reyes)

### Review

1. The mean and variance of $X \sim \text{Binomial}(n, p)$ are $E(X) = np$ and $Var(X) = np(1-p)$.

2. If $X_1, X_2, \ldots, X_n$ constitute a random sample from an arbitrary distribution with mean $\mu$ and variance $\sigma^2$, then for the random variables $\bar{X} = \sum_{i=1}^{n} X_i / n$ and $\sum_{i=1}^{n} X_i$,

   $$E(\bar{X}) = \mu, \qquad Var(\bar{X}) = \frac{\sigma^2}{n},$$

   $$E\Big(\sum_{i=1}^{n} X_i\Big) = n\mu, \qquad Var\Big(\sum_{i=1}^{n} X_i\Big) = n\sigma^2.$$

3. Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Then the random variables $\bar{X}$ and $\sum_{i=1}^{n} X_i$ are distributed as

   $$\bar{X} \sim N\Big(\mu, \frac{\sigma^2}{n}\Big) \quad \text{and} \quad \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2).$$

4. Central Limit Theorem: for a random sample $X_1, X_2, \ldots, X_n$ from an arbitrary distribution with mean $\mu$ and variance $\sigma^2$, when $n$ is large, $\bar{X}$ is approximately distributed as $\bar{X} \sim N(\mu, \sigma^2/n)$.

5. Normal Approximation

   - (a) Let $X \sim B(n, p)$ with $np \ge 5$ and $n(1-p) \ge 5$. Then the normal approximation for $X$ is $X \sim N(np, np(1-p))$.
   - (b) Let $Y = X/n$ with $np \ge 5$ and $n(1-p) \ge 5$. Then the proportion $Y$ can be approximated by a normal distribution, $Y \sim N(p, p(1-p)/n)$.

6. Hint VUXI amp21 7

### Practice Problem

1. The dry weight of organic matter in a particular tissue from soybean plants is known to be normally distributed with mean 282 mg and variance 8 mg$^2$.

   - (a) What is the probability that the dry weight of organic matter in the particular tissue from a randomly selected soybean plant is less than 260 mg?
   - (b) What is the value of dry weight such that 85% of the tissues have a dry weight lower than that value?
   - (c) Find symmetric limits around the mean such that 90% of tissues will have dry weight between those limits.

2. Suppose the random variable $X$ is distributed as $N(100, 144)$.

   - (a) Consider a random sample with $n = 16$. What is the distribution of $\bar{X} = \frac{1}{16}\sum_{i=1}^{16} X_i$?
   - (b) Find $\Pr(\bar{X} < 103)$.
   - (c) What is the distribution of $\sum_{i=1}^{16} X_i$?
   - (d) How large does $n$ have to be in order that $Var(\bar{X}) = 1$?

3. Suppose that 20% of the trees in a forest are infected with a certain type of parasite. Let $X$ denote the number of trees having the parasite in a random sample of 300 trees.

   - (a) Compute $\Pr(49 \le X \le 71)$.
   - (b) Define $Y$ to be the proportion of trees having the parasite, $Y = X/300$. Find $\Pr(Y < 0.25)$.
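
The z-based interval formulas from the Discussion 6 review and the normal approximation to the binomial from Discussion 4 can be checked numerically. The following is a minimal sketch, not part of the original handouts: the function names (`z_interval`, `proportion_interval`, `sample_size_for_length`, `binomial_normal_prob`) are hypothetical, it relies only on Python's standard library (`statistics.NormalDist`), and it does not cover the t-based interval because t quantiles are not available there. The inputs are taken from the forester, machine, sample-size, and tree practice problems.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and sd 1


def z_quantile(conf):
    """Return z_{alpha/2} for a two-sided interval with confidence level conf."""
    alpha = 1 - conf
    return Z.inv_cdf(1 - alpha / 2)


def z_interval(xbar, sd, n, conf):
    """CI for mu: xbar +/- z_{alpha/2} * sd / sqrt(n) (known sigma or large n)."""
    half = z_quantile(conf) * sd / sqrt(n)
    return xbar - half, xbar + half


def proportion_interval(x, n, conf):
    """CI for p using p_hat = x / n in the standard error."""
    p_hat = x / n
    half = z_quantile(conf) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def sample_size_for_length(sigma, length, conf):
    """Smallest n with interval length 2 * z * sigma / sqrt(n) <= length."""
    return ceil((2 * z_quantile(conf) * sigma / length) ** 2)


def binomial_normal_prob(lo, hi, n, p):
    """P(lo <= X <= hi) for X ~ B(n, p) via N(np, np(1-p)), no continuity correction."""
    approx = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return approx.cdf(hi) - approx.cdf(lo)


# Forester problem: n = 100 needles, xbar = 3.1 cm, s = 0.7 cm, 95% CI.
needle_ci = z_interval(3.1, 0.7, 100, 0.95)

# Machine problem: 15 defectives out of 100 items, 99% CI for p.
defect_ci = proportion_interval(15, 100, 0.99)

# Sample-size problem: sigma^2 = 64, 95% CI of length 0.1.
n_needed = sample_size_for_length(8.0, 0.1, 0.95)

# Tree problem: X ~ B(300, 0.20), P(49 <= X <= 71).
tree_prob = binomial_normal_prob(49, 71, 300, 0.20)

print(needle_ci, defect_ci, n_needed, tree_prob)
```

With these inputs the needle interval is roughly (2.96, 3.24), which contains 3, so the two-sided test of $\mu = 3$ would not reject at the 5% level. The tree probability comes out slightly below the 0.8882 obtained from the normal table, because the table lookup rounds the z value to 1.59 while the code uses the unrounded value.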

