# ADV BIOSTATISTICS EEMB 276

UCSB

These 12 pages of class notes were uploaded by Judson Fisher on Thursday, October 22, 2015. The notes belong to EEMB 276 at the University of California, Santa Barbara, taught by W. Rice in Fall. Since their upload, they have received 49 views. For similar materials see /class/226888/eemb-276-university-of-california-santa-barbara in Ecology, Evolution, and Marine Biology at the University of California, Santa Barbara.

Date Created: 10/22/15

## Biostatistics EEMB 276/176

- Lecture in PSYCH 1806
- Professor: Bill Rice
- Teaching Assistant: Alison Pischedda
- Lecture: Mon/Wed 10:00-11:50
- Laboratory: Thur 4:00-6:00 PM, PSYCH 1806 (short laboratory this week)

Today we need to schedule a time for:

1. Lecture problems discussion session
2. Lab discussion session

Grading:

- Midterm 1: 40% of grade, about week 6, take-home over a weekend
- Midterm 2: 60% of grade, cumulative, take-home over the weekend before finals

## Philosophy of the course

- Calculus: an old branch of mathematics. Texts stress unification of concepts, with little historical context.
- Statistical inference: a new branch of mathematics. Texts stress the historical context of concepts, with too little unification of concepts.

Purpose of this class: make statistics more like calculus.

[Diagram: overview of approaches to obtaining repeated-sampling P-values. The layout is lost; the recoverable labels are: "untransformed", "rank transformed", "binary"; "most aspects of underlying distribution unspecified"; "approximate repeated-sampling P-values"; "resample data with replacement: almost no aspects of underlying distribution specified"; "resample data without replacement: no aspects of underlying distribution specified, but data must be intrinsically interchangeable"; "special transformations to make the data conform to a prescribed distribution: not always successful"; "Laboratory".]

Goals:

- Be able to decide on the best testing approach (more than one test is usually OK, but generally there is only one best test).
- Be able to use a computerized statistical package to carry out tedious calculations.
- Be able to calibrate your intuition to the robustness of a statistical procedure, e.g., if the data are not normally distributed, is it OK to use a t-test?

## Concept of a probability distribution

A probability distribution is a complete description of a random variable.

Some basic definitions:

- Random variable: any variate (something capable of changing or varying) whose value cannot be predicted with certainty.
- Discrete random variable: a random variate that may take on only a specified set of discontinuous values, such as 1, 2, 3, 4. Discrete random variables are usually, but not always, counts, such as the number of individuals in a family.
- Continuous random variable: a random variate that may take on an infinite number of uninterrupted possible values, usually between the two end points of a range. Continuous random variables are frequently, but not always, measurements, such as height and weight.

[Figure: a deterministic variable contrasted with a random variable.]

We start by deriving the major discrete probability distributions used in statistics: Bernoulli, Binomial, Poisson, and Hypergeometric.

Consider an urn (jar) filled with a total of $T$ data elements (marbles). Some marbles are marked with a "1" and the others with a "0":

- proportion $p$ marked "1" (success): $pT$ elements
- proportion $1-p$ marked "0" (failure): $(1-p)T$ elements

### CASE A: Randomly draw one sample from the urn (1 independent Bernoulli trial)

Examine and record. Let $X$ = outcome (1 = success or 0 = failure) of the random draw. $X$ is a discrete random variable.

The probability distribution of the discrete variable $X$, in words: a full description of the relative frequency of occurrence of the possible values of the random variable $X$. When the random variable is continuous, the comparable description is called a pdf (Probability Density Function).

In the urn example:

$$\text{Prob}(X=0) = 1-p \ \text{(failure)}, \qquad \text{Prob}(X=1) = p \ \text{(success)}, \qquad \text{Prob}(X=\text{anything else}) = 0$$

[Graph: probability function of $X$ for a Bernoulli trial.]

A probability distribution can be summarized by two parameters that describe where it is centered (central tendency) and the degree to which it is spread out (dispersion).

Mean, $\mu$: a measure of central tendency, the balancing point of the distribution. Mathematically,

$$\mu_X = \sum_x x \,\text{Prob}(X=x) = 0\,(1-p) + 1\,(p) = p$$

Variance, symbolized $\sigma^2$: a measure of dispersion (spread) of the probability distribution about its mean. Mathematically,

$$\sigma_X^2 = \sum_x (x-\mu_X)^2 \,\text{Prob}(X=x) = \text{average squared deviation of } X \text{ about the mean}$$

[Figure: high variance (most data points are distant from the mean) contrasted with low variance (most data points are very close to the mean).]

For our Bernoulli probability distribution:

$$\sigma_X^2 = (0-\mu)^2(1-p) + (1-\mu)^2 p = (0-p)^2(1-p) + (1-p)^2 p = p(1-p)\,[\,p + (1-p)\,] = p(1-p) = p - p^2$$

The random variable $X$ is called a Bernoulli variate. It is designated Bernoulli($p$), where $p$ is the probability of a success. Example: a single toss of a fair coin (let a head = a success = 1) is a Bernoulli(1/2) variate.

Bernoulli traits are common in biology:

- sex of an offspring
- survival to sexual maturity
- success of a single raptor strike

More commonly, however, biology concerns collections of Bernoulli trials:

- In a sample of 100 individuals, how many are female?
- In a sample of 50 individuals, how many are diseased?
- In a sample of 20 elephant seals, how many mate successfully?

### CASE B: Randomly draw N samples from the urn with replacement (N independent Bernoulli trials)

Examine and return; repeat $N$ times. Let $S$ be the number of successes in $N$ draws from the urn ($N$ Bernoulli trials). What is the probability distribution of $S$?

[Graph: Binomial distribution, Prob($S$) against $S = 0, 1, 2, 3, 4, \ldots, N$.]

We will derive the probability distribution by first considering an example. Suppose that $N = 3$ and $p = 2/3$. What is the probability that $S = 2$, i.e., that we obtain 2 successes in 3 independent trials?

There are 3 random draws (draw 1, draw 2, draw 3), and we need 2 successes (S) and 1 failure (F). One way to obtain this is S, S, F:

$$\text{Prob of this event} = \tfrac{2}{3}\cdot\tfrac{2}{3}\cdot\tfrac{1}{3} = \left(\tfrac{2}{3}\right)^2\left(\tfrac{1}{3}\right)$$

because the probability of a set of independent events is the product of the probabilities of each event. But there are other ways to get 2 S and 1 F, e.g., F, S, S:

$$\text{Prob of this event} = \tfrac{1}{3}\cdot\tfrac{2}{3}\cdot\tfrac{2}{3} = \left(\tfrac{2}{3}\right)^2\left(\tfrac{1}{3}\right) \quad \text{(same as before)}$$

We can generalize these results. The probability of a specific sample with $S$ successes and $F = N - S$ failures is

$$p^S (1-p)^{N-S}$$

At this point we have solved for the probability of getting $S = 2$ in one particular way. Now we need to solve for how many ways we can get $S = 2$. In our example we can determine this by simply enumerating all possibilities: SSF, SFS, FSS. There are 3 ways to get 2 successes and 1 failure, so

$$\text{Prob}(S=2) = (\text{number of ways to get } S=2) \times (\text{probability of } S=2 \text{ in a particular way}) = 3\left(\tfrac{2}{3}\right)^2\left(\tfrac{1}{3}\right) = 0.444$$

When $N$ is larger, simple enumeration is not pragmatic, so we need a general solution. For example, if $N = 20$ and $S = 5$ there are 15,504 arrangements to enumerate.

In general, start with a set of $N$ elements. How many ways can all $N$ elements be arranged in an array? There are $N$ ways to fill position 1, $N-1$ ways to fill position 2, and so on down to 1 way to fill position $N$:

$$\text{number of ways to arrange } N \text{ elements in an array} = N(N-1)(N-2)(N-3)\cdots 1 = N!$$

Next, how many ways can we arrange subsets of size $K$ from $N$ elements? Filling positions $1, 2, 3, \ldots, K$:

$$\text{number of permutations of size } K = N(N-1)(N-2)\cdots(N-K+1) = \frac{N(N-1)\cdots(N-K+1)\,(N-K)(N-K-1)\cdots 1}{(N-K)(N-K-1)\cdots 1} = \frac{N!}{(N-K)!}$$

Next, how many subsets of size $K$ are there from $N$ elements, irrespective of order (combinations)? For every combination of size $K$ there are $K!$ permutations (e.g., if $K = 3$, there are $3! = 6$ permutations per combination), so

$$\text{number of combinations of size } K = \frac{\text{number of permutations of size } K}{K!} = \frac{N!}{(N-K)!\,K!} = \binom{N}{K}$$

Returning to our binomial probability problem: how many ways can we arrange $S$ successes in $N$ trials? This is the number of ways to draw $S$ positions from $N$ possible positions:

$$\binom{N}{S} = \frac{N!}{S!\,(N-S)!} = \frac{N!}{S!\,F!}$$

Putting the pieces together:

- Probability of a specific sample with $S$ successes and $F$ failures: $p^S(1-p)^{N-S}$
- Number of ways to obtain $S$ successes and $F$ failures: $\dfrac{N!}{S!\,F!}$, with $F = N - S$

$$\text{Prob}(S \mid N, p) = \begin{cases} \dfrac{N!}{S!\,F!}\, p^S (1-p)^F & S = 0, 1, 2, 3, \ldots, N \\[1ex] 0 & \text{otherwise} \end{cases}$$

This is the binomial probability function. Now we can return to the original problem: what is the probability distribution of $S$ when $N = 3$ and $p = 2/3$?

| $S$ | $F$ | Prob($S$) |
|---|---|---|
| 0 | 3 | $\frac{3!}{0!\,3!}\left(\frac{2}{3}\right)^0\left(\frac{1}{3}\right)^3 = 0.037$ |
| 1 | 2 | $\frac{3!}{1!\,2!}\left(\frac{2}{3}\right)^1\left(\frac{1}{3}\right)^2 = 0.222$ |
| 2 | 1 | $\frac{3!}{2!\,1!}\left(\frac{2}{3}\right)^2\left(\frac{1}{3}\right)^1 = 0.444$ |
| 3 | 0 | $\frac{3!}{3!\,0!}\left(\frac{2}{3}\right)^3\left(\frac{1}{3}\right)^0 = 0.296$ |

[Graph: the full probability distribution of Binomial($N=3$, $p=2/3$), Prob($S$) against $S = 0, 1, 2, 3$.]

What are the mean $\mu$ and variance $\sigma^2$ when $S$ is a binomial variate? We start with a result from probability theory: the mean or variance of the sum of independent random variables is the sum of the component means or variances, i.e., means and variances of sums of independent random variables are additive.

$S$ = number of successes = sum of $N$ independent Bernoulli trials $= X_1 + X_2 + \cdots + X_N$, where each Bernoulli trial has mean $\mu_X = p$ and variance $\sigma_X^2 = p(1-p)$. So

$$\mu_S = \sum_{i=1}^{N} \text{mean}(X_i) = p + p + \cdots + p = Np$$

$$\sigma_S^2 = \sum_{i=1}^{N} \text{variance}(X_i) = p(1-p) + p(1-p) + \cdots + p(1-p) = Np(1-p)$$

Summary, binomial probability function:

$$\text{Prob}(S) = \frac{N!}{S!\,F!}\, p^S(1-p)^F, \qquad \mu_S = Np, \qquad \sigma_S^2 = Np(1-p)$$

### CASE C: The Binomial distribution converges on the Poisson distribution when N is large and p is small

Examine and return; repeat $N$ times, with a very large sample size ($N \to \infty$) and a very small probability of success ($p \to 0$). Under these conditions the binomial distribution becomes the Poisson distribution.

There are many situations in biology where the phenomenon of interest is the sum of a large number of independent Bernoulli trials:

1. Number of lethal mutations per genome
   - thousands of mutable genes
   - low mutation rate per gene
   - so the number of new lethal mutations is a Poisson random variable (mutations must be independent across genes)
2. Number of animals or plants in a large sample area
   - thousands of potential positions where a specific type of plant or animal could be found
   - only a small proportion of positions actually contain the specific type of plant or animal
   - so the number of animals or plants per sample is a Poisson random variable (occupancy must be independent across positions)

To solve for the Poisson distribution we will look at how the binomial distribution changes as $N \to \infty$ and $p \to 0$:

$$\lim_{N\to\infty,\; p\to 0} \frac{N!}{S!\,(N-S)!}\, p^S (1-p)^{N-S}$$

First, when $p$ is small, $1 - p \approx e^{-p}$. For example, if $p = 0.01$ then $1 - p = 0.99$ and $e^{-p} = 0.990049\ldots$ Therefore

$$(1-p)^{N-S} = (1-p)^{N}(1-p)^{-S} \approx e^{-Np} \cdot 1 \quad \text{when } S \ll N$$

Second,

$$\frac{N!}{(N-S)!} = N(N-1)(N-2)\cdots(N-S+1) \approx N^S \quad \text{when } N \gg S$$

which is expected when $p$ approaches 0. Summarizing so far:

$$\lim_{N\to\infty,\; p\to 0} \text{Binomial} = \frac{N^S p^S e^{-Np}}{S!}$$

Collecting terms:

$$\frac{(Np)^S e^{-Np}}{S!}$$

Substituting $\mu = Np$ gives the Poisson probability distribution:

$$\text{Prob}(S) = \frac{\mu^S e^{-\mu}}{S!}$$

Note that there is only one parameter ($\mu$) instead of two ($N$ and $p$), so only $\mu$ need be known, not both $N$ and $p$ individually.

Mean and variance of a Poisson variate = mean and variance of a binomial variate when $p$ is small and $N$ is big:

$$\mu_S = Np \quad \text{(same as binomial, because large } N \text{ and small } p \text{ are offsetting)}$$

$$\sigma_S^2 = Np(1-p) \approx Np = \mu_S \quad \text{(because } 1-p \to 1 \text{ when } p \text{ is small)}$$

So the mean and variance of a Poisson variate are the same.

One of the major advantages of the Poisson distribution is that the whole distribution can be estimated based on the size of the zero class.

EXAMPLE: Suppose that 1000 frog eggs are exposed to a high dose of UV light to test for a harmful effect (at least 1 mutation: dead; no mutations: alive).

Results:

- Experimental group: 10 eggs survive, 990 eggs die
- Control group: no eggs die

Assume that death was due to dominant lethal mutations. What is the distribution of the number of lethal mutations per egg? We only have dead and alive eggs, but remarkably we can estimate the entire distribution of mutations per egg.

Why should we expect lethal mutations to follow a Poisson distribution? Lethal mutations:

1. can occur independently at any of thousands of genes in the genome
2. have a very small probability of occurring (a "success") at any individual gene

So $N$ is large and $p$ is small, and a Poisson distribution is expected.

To solve for the distribution of mutations per egg we start by reasoning that surviving eggs carry NO dominant lethal mutations, so this is the case where $S = 0$. We can use the size of the $S = 0$ class to estimate the entire Poisson distribution:

$$\text{Prob}(S=0) = \frac{\mu^0 e^{-\mu}}{0!} = e^{-\mu}$$

In our example our estimate of Prob($S=0$) is $10/1000 = 0.01$. Setting $0.01 = e^{-\mu}$, we can solve for $\mu$ by taking logs of both sides:

$$\ln(0.01) = \ln\left(e^{-\mu}\right) = -\mu \quad\Rightarrow\quad \mu_{est} = 4.6$$

This is our estimate of both $\mu_S$ and $\sigma_S^2$. What is the estimated probability distribution of $S$? Let $\mu = \mu_{est} = 4.6$ in $\text{Prob}(S) = \mu^S e^{-\mu}/S!$:

| $S$ | Prob($S$) |
|---|---|
| 0 | $\frac{4.6^0 e^{-4.6}}{0!} = 0.010$ |
| 1 | $\frac{4.6^1 e^{-4.6}}{1!} = 0.046$ |
| 2 | $\frac{4.6^2 e^{-4.6}}{2!} = 0.106$ |
| 3 | $\frac{4.6^3 e^{-4.6}}{3!} = 0.163$ |
| etc. | |

[Graph: the full probability distribution of Poisson($\mu = 4.6$), Prob($S$) against $S$.]

Note that the distribution continues out to infinity, but values $S > 12$ have almost no chance of occurring. So even though most mutations are hidden by multiple mutations in the same dead individuals, we can use the size of the zero class to estimate the whole Poisson probability distribution.

Summary, Poisson probability function:

$$\text{Prob}(S) = \frac{\mu^S e^{-\mu}}{S!}, \qquad \mu_S = Np, \qquad \sigma_S^2 = Np$$

### CASE D: Randomly draw N elements without replacement (Hypergeometric distribution)

Examine and do not return. Again let $S = \sum_{i=1}^{N} X_i$. We cannot express $S$ as a sum of independent Bernoulli trials because the data (observations) are not independent: the value of $p$ changes after each element is drawn and not replaced.

We can solve for the probability distribution of $S$ in a simple way:

$$\text{Prob}(S) = \frac{\text{number of possible samples of size } N \text{ with } S \text{ successes}}{\text{total possible number of samples of size } N}$$

The urn holds $Tp$ successes and $T(1-p)$ failures in total. The numerator is the number of ways of drawing $S$ successes from a pool of $Tp$ elements, times the number of ways of drawing $F = N - S$ failures from a pool of $T(1-p)$ elements. The denominator is the number of possible ways to draw a sample of size $N$ from a pool of $T$ elements. By the same combinations logic as before:

$$\text{Prob}(S) = \frac{\dbinom{Tp}{S}\dbinom{T(1-p)}{N-S}}{\dbinom{T}{N}}$$

Mean and variance of a hypergeometric random variable:

$$\mu_S = Np$$

which is the same as the binomial, because lack of replacement does not change the average outcome of a sample.

$$\sigma_S^2 = Np(1-p)\,\frac{T-N}{T-1}$$

Here $Np(1-p)$ is the binomial variance and $(T-N)/(T-1)$ is the Correction for a Finite Population (CFP), the proportion of the binomial variance remaining. As $T \to \infty$ the correction approaches 1 and the hypergeometric variance approaches the binomial variance.

[Figure: empirical example comparing binomial and hypergeometric variance; horizontal axis runs from 1 to 10.]
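The binomial probability function derived earlier in these notes can be checked against brute-force enumeration. The sketch below is not part of the lecture; the function names are hypothetical, and exact fractions are used so the comparison involves no rounding. It reuses the worked example N = 3, p = 2/3.

```python
from fractions import Fraction
from itertools import product
from math import comb


def binomial_pmf(s, n, p):
    """Prob(S = s) = C(n, s) * p**s * (1 - p)**(n - s); zero outside 0..n."""
    if s < 0 or s > n:
        return 0 * p
    return comb(n, s) * p**s * (1 - p) ** (n - s)


def binomial_by_enumeration(n, p):
    """Add up the probability of every 0/1 sequence, grouped by its number of 1s."""
    probs = [0 * p] * (n + 1)
    for seq in product((0, 1), repeat=n):
        s = sum(seq)
        probs[s] += p**s * (1 - p) ** (n - s)
    return probs


N, P = 3, Fraction(2, 3)  # the worked example from the notes
pmf = [binomial_pmf(s, N, P) for s in range(N + 1)]

# Mean and variance computed directly from their definitions.
mean = sum(s * pr for s, pr in enumerate(pmf))
variance = sum((s - mean) ** 2 * pr for s, pr in enumerate(pmf))

for s, pr in enumerate(pmf):
    print(f"Prob(S={s}) = {pr} = {float(pr):.3f}")
print("enumeration agrees with formula:", binomial_by_enumeration(N, P) == pmf)
print("mean:", mean, "variance:", variance)
print("ways to place 5 successes in 20 trials:", comb(20, 5))
```

Enumeration visits 2^N sequences, which is why it stops being practical as N grows, whereas the closed form needs only one binomial coefficient per value of S.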
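The claim that the binomial converges on the Poisson as N grows and p shrinks with Np held fixed can be illustrated numerically. This is a minimal sketch, assuming an arbitrary illustrative mean of 2 (the value `MU` and the helper `max_gap` are hypothetical, not from the lecture).

```python
from math import comb, exp, factorial


def binomial_pmf(s, n, p):
    """Binomial probability function: C(n, s) * p**s * (1 - p)**(n - s)."""
    return comb(n, s) * p**s * (1 - p) ** (n - s)


def poisson_pmf(s, mu):
    """Poisson probability function: mu**s * exp(-mu) / s!."""
    return mu**s * exp(-mu) / factorial(s)


MU = 2.0  # hypothetical fixed mean, so p = MU / N shrinks as N grows


def max_gap(n, mu=MU, s_max=10):
    """Largest absolute difference between the two pmfs over S = 0..s_max."""
    p = mu / n
    return max(abs(binomial_pmf(s, n, p) - poisson_pmf(s, mu))
               for s in range(s_max + 1))


sample_sizes = (10, 100, 1000, 10000)
gaps = [max_gap(n) for n in sample_sizes]
for n, gap in zip(sample_sizes, gaps):
    print(f"N = {n:>5}, p = {MU / n:.4f}, largest pmf difference = {gap:.6f}")

# The approximation 1 - p ~ exp(-p) used in the derivation, at p = 0.01.
print("1 - p =", 1 - 0.01, " exp(-p) =", exp(-0.01))
```

The difference shrinks as N increases, which is the behavior the limit argument predicts.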
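The zero-class estimate from the frog egg example can be reproduced in a few lines. The sketch below assumes the counts given in the notes (10 survivors out of 1000 eggs); the variable names are hypothetical. It rounds the estimate to one decimal, as the notes do, before building the table.

```python
from math import exp, factorial, log


def poisson_pmf(s, mu):
    """Poisson probability function: mu**s * exp(-mu) / s!."""
    return mu**s * exp(-mu) / factorial(s)


survivors, total = 10, 1000        # counts from the frog egg example
p_zero = survivors / total         # estimate of Prob(S = 0)
mu_hat = -log(p_zero)              # solve p_zero = exp(-mu) for mu
mu_rounded = round(mu_hat, 1)      # the notes work with one decimal place

# Estimated distribution of lethal mutations per egg for S = 0..12.
table = [poisson_pmf(s, mu_rounded) for s in range(13)]
tail = 1 - sum(table)              # estimated Prob(S > 12)

print(f"mu_hat = {mu_hat:.4f} (rounded to {mu_rounded})")
for s, pr in enumerate(table[:4]):
    print(f"Prob(S={s}) = {pr:.3f}")
print(f"Prob(S > 12) = {tail:.5f}")
```

Using the rounded value 4.6 rather than the unrounded estimate changes Prob(S=0) only in the fourth decimal place, so the zero class is still recovered to the precision shown.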
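The hypergeometric results (probability function, mean Np, and variance with the finite population correction) can be verified exactly for a small urn. This is a sketch under stated assumptions: the urn size of 30 marbles with 20 successes and the sample size of 6 are hypothetical values chosen for illustration, and the function name is not from the lecture.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(s, n, successes, total):
    """Prob(S = s) when drawing n elements without replacement from an urn
    holding `successes` ones and `total - successes` zeros."""
    failures = total - successes
    if s < 0 or s > n:
        return Fraction(0)
    # math.comb returns 0 when asked to choose more items than are available.
    return Fraction(comb(successes, s) * comb(failures, n - s), comb(total, n))


T, K, N = 30, 20, 6                # hypothetical urn: T marbles, K marked "1"
p = Fraction(K, T)

pmf = [hypergeom_pmf(s, N, K, T) for s in range(N + 1)]
mean = sum(s * pr for s, pr in enumerate(pmf))
variance = sum((s - mean) ** 2 * pr for s, pr in enumerate(pmf))

binomial_variance = N * p * (1 - p)
correction = Fraction(T - N, T - 1)  # correction for a finite population

print("total probability:", sum(pmf))
print("mean:", mean, " N*p:", N * p)
print("variance:", variance, " binomial variance * CFP:", binomial_variance * correction)

# With a much larger urn at the same p, the result approaches the binomial value.
large = hypergeom_pmf(2, 3, 2000, 3000)
print("Prob(S=2), N=3, T=3000:", float(large), " binomial:", float(Fraction(4, 9)))
```

Exact fractions make the equalities hold without tolerance, and the last comparison shows the drift toward the binomial value of 0.444 as the urn grows.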
