## Introductory Applied Statistics for the Life Sciences

# Introductory Applied Statistics for the Life Sciences STAT 371

Hao Zheng

These 6 pages of class notes were uploaded by Mrs. Triston Collier on Thursday, September 17, 2015. The notes belong to STAT 371 at University of Wisconsin - Madison, taught by Hao Zheng in Fall.

Date Created: 09/17/15
## STATISTICS 371: Review, Feb 11, 2009

**Elementary Outcome and Event**

- An elementary outcome is an individual possible result of an experiment.
- An event is a collection of elementary outcomes.

Suppose $E_1$ and $E_2$ are two events.

- "$E_1$ and $E_2$": the set of elementary outcomes in both $E_1$ and $E_2$.
- "$E_1$ or $E_2$": the set of elementary outcomes either in $E_1$, in $E_2$, or in both.
- "$E_1$ doesn't happen": the set of elementary outcomes not in $E_1$, denoted by $E_1^c$.
- $E_1$ and $E_2$ are mutually exclusive (disjoint) if they do not share any common elementary outcome.

A probability is a numerical quantity that expresses the likelihood of an event. The probability of an event $E$ is written as $\Pr(E)$.

**Basic Rules for Probability Assignments**

- For any event $E$, $0 \le \Pr(E) \le 1$.
- If $S$ = {all elementary outcomes}, then $\Pr(S) = 1$.
- For any event $E$, $\Pr(E^c) = 1 - \Pr(E)$.
- If two events $E_1$ and $E_2$ are disjoint, then $\Pr(E_1 \text{ or } E_2) = \Pr(E_1) + \Pr(E_2)$.
- For any two events $E_1$ and $E_2$, $\Pr(E_1 \text{ or } E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \text{ and } E_2)$.

The conditional probability of event $E_2$ given $E_1$ is

$$\Pr(E_2 \mid E_1) = \frac{\Pr(E_2 \text{ and } E_1)}{\Pr(E_1)}.$$

Two events $E_1$ and $E_2$ are independent if $\Pr(E_1 \text{ and } E_2) = \Pr(E_1) \times \Pr(E_2)$.

**Properties of a Binomial Experiment**

- Each trial has two possible outcomes, which are arbitrarily labeled "success" (S) or "failure" (F) (Bernoulli trial).
- The probability of success for each individual trial is denoted by $p$ and is assumed to be constant from one trial to the next. The probability of failure for each trial is denoted by $q$, and $q = 1 - p$.
- Trials are independent. The outcome of one trial doesn't impact other trials.

**Notations for Binomial Distribution**

- $n$: a fixed number of Bernoulli (S or F) trials
- $p$: the probability of success for each trial
- $X$: the total number of successes for all $n$ trials

Variable $X$ is called a binomial random variable. Its distribution is called the binomial distribution, denoted by $X \sim B(n, p)$.

The binomial random variable with $n$ trials and success probability $p$ can be described using

$$\Pr(X = x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}.$$

**Practice Problems**

1. 100 students reported the numbers of brothers and
sisters they have. The results are summarized in the following table.

   [Table: number of students by number of brothers and number of sisters]

   Consider the experiment of selecting 1 person at random from a class of 100 students. Compute the following probabilities:

   - (a) Pr(at least 1 sister)
   - (b) Pr(exactly 1 sister or more than 1 brother)
   - (c) Pr(exactly 1 brother, given that the student has at least 2 sisters)
   - (d) Pr(exactly 1 sister, given that the student has exactly 1 brother)

3. According to the Mendelian theory of inherited characteristics, a cross-fertilization of related species of red- and white-flowered plants produces a generation whose offspring contain 25% red-flowered plants. Suppose that a horticulturist wishes to cross 5 pairs of the cross-fertilized species. Of the resulting 5 offspring, what is the probability that

   - (a) there will be no red-flowered plants?
   - (b) there will be at least 4 red-flowered plants?

## STATISTICS 371: Review

**Variable**

- A variable is a symbol that stands for a value that may vary (Wikipedia).
- A variable is a characteristic of a person or a thing that can be assigned a number or a category (textbook).
- (a) Categorical variable
  - (i) Ordinal
  - (ii) Nominal
- (b) Quantitative variable
  - (i) Discrete
  - (ii) Continuous

**Population**

- (a) In statistics, a statistical population is a set of entities concerning which statistical inferences are to be drawn, or simply a set of entities that you may be interested in.
- (b) A population characteristic is called a parameter.

**Sample**

- (a) A sample is a collection of people or things from a larger group on which we measure variables.
- (b) Sample size is the number of observations in this sample, usually denoted by $n$.
- (c) A sample characteristic is called a statistic.

**Statistical diagrams**

Histogram, bar chart, dotplot, stem-leaf plot, boxplot, etc.

**Measure of center**

- (a) Mean, median, mode
- (b) Mean vs. median

**Measure of dispersion**

- (a) Range
- (b) Variation: variance and standard deviation
- (c) Coefficient of variation
- (d) Interquartile range (IQR)

**Boxplot**

Five-number summary: Min, 1st Quartile (Q1), Median, 3rd Quartile (Q3), Max.

**Practice Problems**

1. What is the mode of the
following data set? 91071171378777299

2. Below are data on enrollment in five Psychology courses: Perception 45, Learning 50, Statistics 165, Abnormal 115, Industrial 25.

   - (a) Construct an appropriate graph of these data.
   - (b) Identify what measure of central tendency would be appropriate.
   - (c) What is the value of that measure of central tendency?

3. A number of students majoring in life sciences take one statistics course. The following numbers are the scores for a few of them in the first midterm exam: 739188765777784783790787

   - (a) Compare the mean and median.
   - (b) Calculate the variance.
   - (c) Because of the TA's mistake, these students had an extra 5 points taken off. Please correct the data and perform (a) and (b) again.

4. The boxplot shows the same data that are shown in one of the two histograms. Which one goes with the boxplot?

   [Figure: a boxplot and two histograms]

## STATISTICS 371: Review, Feb 25, 2009

**1. Confidence interval for population mean $\mu$**

- The sample mean $\bar{X}$ is calculated to estimate the population mean $\mu$.
- The standard error of the mean measures how far $\bar{X}$ is likely to be from $\mu$.
- Instead of estimating $\mu$ by a single value $\bar{X}$, an interval likely to include $\mu$ could be given. These intervals are confidence intervals for $\mu$.
- How likely the interval is to contain $\mu$ is determined by the confidence level $1 - \alpha$.
- Increasing the desired confidence level will widen the confidence interval.
- If the scores from the population are assumed normally distributed, say $X \sim N(\mu, \sigma^2)$, then $\dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ has a standard normal distribution, where $\sigma_{\bar{X}} = \sigma / \sqrt{n}$, so

$$\Pr\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le 1.96\right) = 0.95.$$

More generally,

$$\Pr\left(z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le z_{1-\alpha/2}\right) = 1 - \alpha.$$

Since $z_{\alpha/2} = -z_{1-\alpha/2}$, the above formula is equivalent to

$$\Pr\left(-z_{1-\alpha/2} \le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le z_{1-\alpha/2}\right) = 1 - \alpha.$$

Then,

$$\Pr\left(\bar{X} - z_{1-\alpha/2}\,\sigma_{\bar{X}} \le \mu \le \bar{X} + z_{1-\alpha/2}\,\sigma_{\bar{X}}\right) = 1 - \alpha.$$

**2. Hypothesis testing**

- State the problem and the nature of the data.
- Specify and state the hypothesis to be tested.
- Draw a random sample.
- Determine the sampling distribution of the statistic being calculated in the study.
- Specify the level of significance.
- Determine the regions of retention and rejection.
- Calculate the value of
the statistic of interest.
- Test the hypothesis.

**Practice Problem**

A horticulture research group is interested in evaluating the florescence of an improved type of flower. Based on previous experience, the duration of blossom should be normally distributed with standard deviation $\sigma = 9$ days. They select 100 plants and measure the duration of blossom. The average time is 37 days.

1. Give a 90% confidence interval of the mean florescence of this flower.
2. If the mean florescence of the flower before it was improved is 35 days, does the duration of blossom change?
   - (a) In testing this claim, what are the null and alternative hypotheses?
   - (b) What sample statistic will be used when testing the hypothesis?
   - (c) What transformation on the sample statistic will be needed in order to calculate the probabilities under the sampling distribution of the mean?
   - (d) At the level $\alpha = 0.05$, perform the hypothesis testing and state the result.
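
The confidence-interval formula and the hypothesis-testing steps from the Feb 25 review can be sketched in Python. This is only an illustrative sketch, not part of the original handout: the function names `z_confidence_interval` and `z_test_two_sided` are hypothetical, the population standard deviation is assumed known, and the test is assumed to be two-sided (reading "does the duration of blossom change" as a two-sided alternative). The numbers plugged in at the bottom are the ones given in the practice problem.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, level):
    """Two-sided confidence interval for the mean, sigma assumed known."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1 - alpha/2}
    se = sigma / sqrt(n)                         # sigma_xbar = sigma / sqrt(n)
    return xbar - z * se, xbar + z * se


def z_test_two_sided(xbar, mu0, sigma, n):
    """z statistic and two-sided p-value for H0: mu = mu0 (assumed alternative: mu != mu0)."""
    se = sigma / sqrt(n)
    z = (xbar - mu0) / se                        # standardize the sample mean
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Values taken from the practice problem: sigma = 9, n = 100, sample mean = 37.
low, high = z_confidence_interval(xbar=37.0, sigma=9.0, n=100, level=0.90)
z, p = z_test_two_sided(xbar=37.0, mu0=35.0, sigma=9.0, n=100)

print(f"90% CI for the mean: ({low:.2f}, {high:.2f})")
print(f"z = {z:.3f}, two-sided p-value = {p:.4f}")
print("reject H0 at alpha = 0.05" if p < 0.05 else "retain H0 at alpha = 0.05")
```

Comparing the p-value with $\alpha$ is equivalent to checking whether $z$ falls in the rejection region $|z| > z_{1-\alpha/2}$ described in the steps above.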

