Review Sheet for PUBHLTH 640

Date Created: 02/06/15

PubHlth 640, Unit 1: Review of PubHlth 540 Introductory Biostatistics
Intermediate Biostatistics
Practice Problems, Week 1
SOLUTIONS

1. Recall that variables can be of different types. We learned in introductory biostatistics that appropriate methods of summarization depend on the variable type, and we noticed that sometimes a method of summarization is not appropriate. For example, it is not appropriate to construct a cumulative frequency graph summary of nominal data. Using whatever sources you have for introductory biostatistics, complete the following table.

Random variable: Discrete, Nominal
  Descriptive methods: bar chart; pie chart
  Numerical summaries: frequency table; relative frequency table

Random variable: Discrete, Ordinal
  Descriptive methods: bar chart; pie chart
  Numerical summaries: frequency table; relative frequency table; cumulative frequency table; cumulative relative frequency table

Random variable: Continuous, Interval
  Descriptive methods: dot diagram; scatter plot (2 variables); stem-and-leaf; histogram; box plot; quantile-quantile plot
  Numerical summaries: means, variances, percentiles

Random variable: Continuous, Ratio
  Descriptive methods: dot diagram; scatter plot (2 variables); stem-and-leaf; histogram; box plot; quantile-quantile plot
  Numerical summaries: means, variances, percentiles

2. Try your hand at the following probability exercises.

a. Divide a line segment into three parts such that one portion is half the length of the original line and the other two portions are each one quarter the length of the original line. Choose a point at random. What is the probability that this point is in the 1/2 length portion?

Answer: .5

Solution: As all points along the line segment are equally likely, length is proportional to probability. Thus, 1/2 length corresponds to 1/2 of the total probability.

b. If there is a 1/7 chance that any person selected at random was born on a Monday, what is the probability that, of any seven people selected at random, exactly one was born on a Monday?

Answer: .40

Solution: This is a binomial probability calculation.
  X = realized count of Monday birthdays in the sample of 7
  N = number of trials = 7
  Event of interest is a birthday on a Monday
  $\pi$ = Probability[event] = 1/7
  Calculate Pr[X = 1] = .3965
  http://faculty.vassar.edu/lowry/binomialX.html

c. What are the odds of getting exactly one pair in five card stud poker using a 52 card deck?

Answer: Odds are .42 to .58

Solution: This is a combinatorial calculation that assumes all five card hands are equally likely.

  Probability[exactly one pair] = (# hands that are exactly one pair) / (# total hands possible)

  # total hands possible = $\binom{52}{5} = \frac{52!}{5!\,47!} = \frac{(52)(51)(50)(49)(48)}{(5)(4)(3)(2)(1)} = 2{,}598{,}960$

To solve for the # hands that are exactly one pair, the idea is to think in steps.
  (1) # choices of a rank (ace or 2 or 3 or ... or queen or king) = 13
  (2) For the selected rank, # choices of suits to be in the pair = $\binom{4}{2} = \frac{4!}{2!\,2!} = 6$
  (3) Now think about the remaining 3 cards. They have to be of distinct ranks (else the hand will no longer be exactly one pair). # choices of distinct ranks for the other 3 cards from the leftover 12 ranks = $\binom{12}{3} = \frac{(12)(11)(10)}{(3)(2)(1)} = 220$
  (4) Finally, recognize that for each rank you also get to choose its suit. # choices for suit of 1st of remaining 3 cards = 4
  (5) Similarly, # choices for suit of 2nd of remaining 3 cards = 4
  (6) Last but not least, # choices for suit of 3rd of remaining 3 cards = 4

Putting these all together gives us the count of the hands that are exactly one pair:
  Count = (13)(6)(220)(4)(4)(4) = 1,098,240

So now we can solve:
  Probability[exactly one pair] = 1,098,240 / 2,598,960 = 0.42257
  Thus, odds[exactly one pair] = 0.42257 / (1 - 0.42257) = 0.42257 / 0.57743

d. Suppose a quiz contains 20 true/false questions. You know the correct answer to the first 10 questions. You have no idea of the correct answer to questions 11 through 20 and decide to answer each using the coin toss method. Calculate the probability of obtaining a total quiz score of at least 85%.

Answer: .17

Solution: This is also a binomial probability calculation.
  X = count of correct answers among questions 11-20
  N = number of trials = 10
  A grade of 85% or better corresponds to (.85)(20) = 17 correct or more. Thus, among questions 11-20, you must be correct 7 or more times.
  $\pi$ = Probability[correct answer] = .50
  Calculate Pr[X $\ge$ 7] = .171875
  http://faculty.vassar.edu/lowry/binomialX.html

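The calculations in Problem 2, parts b through d, can also be checked without an online calculator. The following is a minimal Python sketch, not part of the original solutions; the function and variable names (`binomial_pmf`, `binomial_upper_tail`, and so on) are hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k, n, p):
    """Pr[X >= k] for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


# Part b: exactly one Monday birthday among 7 people, with pi = 1/7.
p_one_monday = binomial_pmf(1, 7, 1 / 7)

# Part c: count the five card hands that are exactly one pair.
total_hands = comb(52, 5)          # all five card hands
one_pair_hands = (
    13                             # rank of the pair
    * comb(4, 2)                   # suits of the two paired cards
    * comb(12, 3)                  # distinct ranks of the other 3 cards
    * 4**3                         # one suit for each of those 3 cards
)
p_one_pair = one_pair_hands / total_hands
odds_one_pair = p_one_pair / (1 - p_one_pair)

# Part d: at least 7 correct among 10 coin-toss answers.
p_at_least_7 = binomial_upper_tail(7, 10, 0.5)

print(p_one_monday, p_one_pair, odds_one_pair, p_at_least_7)
```

Working with exact integer counts from `math.comb` before dividing keeps the poker calculation free of rounding until the final step.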
3. Refresh your memory of the elements of a confidence interval. There are three: (1) point estimate, (2) standard error of the point estimate, and (3) confidence coefficient. Complete the following summary table.

Normal Distribution: Confidence Interval for $\mu_1 - \mu_2$ (Two Independent Groups)
CI = [point estimate] $\pm$ (conf. coeff.) $\times$ SE[point estimate]

Case 1: $\sigma_1^2$ and $\sigma_2^2$ are both known
  Estimate: $\bar{X}_{\text{Group 1}} - \bar{X}_{\text{Group 2}}$
  SE to use: $SE[\bar{X}_{\text{Group 1}} - \bar{X}_{\text{Group 2}}] = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
  Confidence coefficient: use percentiles from the Normal distribution
  Degrees of freedom: not applicable

Case 2: $\sigma_1^2$ and $\sigma_2^2$ are both NOT known but are assumed EQUAL
  Estimate: $\bar{X}_{\text{Group 1}} - \bar{X}_{\text{Group 2}}$
  SE to use: $\widehat{SE}[\bar{X}_{\text{Group 1}} - \bar{X}_{\text{Group 2}}] = \sqrt{\frac{S_{pool}^2}{n_1} + \frac{S_{pool}^2}{n_2}}$, where you already have obtained $S_{pool}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$
  Confidence coefficient: use percentiles from the Student's t distribution
  Degrees of freedom: $(n_1 - 1) + (n_2 - 1)$

Case 3: $\sigma_1^2$ and $\sigma_2^2$ are both NOT known and NOT EQUAL
  Estimate: $\bar{X}_{\text{Group 1}} - \bar{X}_{\text{Group 2}}$
  SE to use: $\widehat{SE}[\bar{X}_{\text{Group 1}} - \bar{X}_{\text{Group 2}}] = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
  Confidence coefficient: use percentiles from the Student's t distribution
  Degrees of freedom: $f = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$

4. See if you can recall some of the important concepts that are discussed in introductory biostatistics.

a. What is a sampling distribution?

Answer: A sampling distribution is a probability distribution for a random variable that is itself a statistic. Thus, sampling distributions refer to the probability distributions of such things as the sample mean, the sample variance, etc. These probability distributions are the result of the idea of replicate sampling infinitely many times.

b. What does the central limit theorem tell us? Why is it so useful to us? What is a z-score? What is a t-score?

Answer: The central limit theorem tells us that, as the choice of sample size increases, the sampling distribution of the average approaches normality. This is very useful to us because it allows us, even for moderate sample size, to regard the average as a realization of a Normal distribution random variable.

  z-score = (random variable - E[random variable]) / SE[random variable]
  t-score = (random variable - E[random variable]) / Estimated SE[random variable]

c. In a sentence or two, explain the meaning of a 95% confidence interval for a population mean that has lower limit 356 and upper limit 528.

Answer: This interval allows the investigator to say that he/she is 95% confident that the unknown true population mean is between 356 and 528. Of course, the true mean is either in this interval or not; we don't actually know. Recall that it is NOT CORRECT to say that the probability is 95% that the population mean is between 356 and 528.

d. Define p-value. Interpret p < .05 and p < .01. Given identical study conditions, which gives stronger evidence against the null hypothesis?

Answer: A p-value is a chance statement. A p-value is the probability that the test statistic of interest attains a value as extreme or more extreme, relative to the null hypothesis, under the null hypothesis probability model. All other things being equal, a p < .01 gives stronger evidence against the null hypothesis.

e. Suppose a two sided hypothesis test of treatment benefit in a randomized controlled trial of placebo versus active treatment yields a p-value of 0.045. What are the possible explanations for this result?

Answer: There are several possible explanations, and ultimately we do not know which one is the correct explanation. Among them are:
- The active treatment is truly different than the placebo.
- The active treatment is equivalent to the placebo, but an event of low probability has occurred.
- The active treatment is equivalent to the placebo, but one or more biases have biased the data in the direction of statistical significance (albeit marginal).
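
Returning to the summary table in Problem 3, the three standard-error cases can be sketched in Python as follows. This is an illustrative sketch only: the function names and the sample values (means 10.0 and 8.0, variances 4.0 and 9.0, sample sizes 16 and 25) are hypothetical and do not come from the course materials. The degrees-of-freedom formula for the unequal-variance case is the one commonly called the Satterthwaite approximation. Only the Normal percentile is computed here; the Student's t percentile for the other two cases would still need to be looked up for the returned degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist


def se_known(sigma1_sq, n1, sigma2_sq, n2):
    """SE of the difference in means when both variances are known."""
    return sqrt(sigma1_sq / n1 + sigma2_sq / n2)


def se_pooled(s1_sq, n1, s2_sq, n2):
    """Estimated SE and degrees of freedom, variances unknown but assumed equal."""
    df = (n1 - 1) + (n2 - 1)
    s_pool_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    return sqrt(s_pool_sq / n1 + s_pool_sq / n2), df


def se_unequal(s1_sq, n1, s2_sq, n2):
    """Estimated SE and approximate degrees of freedom, variances unknown and unequal."""
    a, b = s1_sq / n1, s2_sq / n2
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return sqrt(a + b), df


def confidence_interval(point_estimate, conf_coeff, se):
    """CI = point estimate +/- (confidence coefficient)(SE)."""
    return point_estimate - conf_coeff * se, point_estimate + conf_coeff * se


# Hypothetical illustration values (not from the course materials).
xbar1, xbar2 = 10.0, 8.0
var1, n1 = 4.0, 16
var2, n2 = 9.0, 25

# Known variances: 95% CI uses the 97.5th percentile of the Normal.
z = NormalDist().inv_cdf(0.975)
ci_known = confidence_interval(xbar1 - xbar2, z, se_known(var1, n1, var2, n2))

# Unknown variances: SE and df only; the t percentile is looked up separately.
se_p, df_p = se_pooled(var1, n1, var2, n2)
se_u, df_u = se_unequal(var1, n1, var2, n2)

print(ci_known, (se_p, df_p), (se_u, df_u))
```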

