STTEACHLEARN SCIEN BSC 5936
This 71 page Class Notes was uploaded by Kari Harber Jr. on Thursday September 17, 2015. The Class Notes belongs to BSC 5936 at Florida State University taught by Staff in Fall. Since its upload, it has received 13 views. For similar materials see /class/205433/bsc-5936-florida-state-university in Biological Sciences at Florida State University.
Date Created: 09/17/15
Extensions of the coalescent
Peter Beerli, Genome Sciences, University of Washington, Seattle, WA
Peter Beerli 2000

Outline
2. Extensions and examples
- Different approaches to estimate parameters of interest
- Population growth
- Gene flow
- Recombination
- Population divergence
- Selection
- Combination of some of the population genetics [...]

An alternative approach: Griffiths-Tavaré algorithm
- Infinite sites model
- Use MCMC to sample a path through the possible histories
- Sample many different possible histories

Other alternatives
- Wilson and Balding 1998, Beaumont 1999: microsatellite specific
- Nielsen 2001: mutations as missing data
Both methods treat mutations as additional events on the genealogies.

Variants and extensions of the coalescent
- Population size changes over time
- Gene flow among multiple populations
- Recombination rate
- Selection
- Divergence

General approach to these extensions

$$P(G \mid \text{parameters}) = \prod_{\text{all time intervals}} f(\text{waiting time}) \times \text{Prob}(\text{event happens})$$

$$\text{Prob}(\text{event happens}) = \begin{cases} \text{Prob}(\text{coalescence}) & \text{if event is a coalescence} \\ \text{Prob}(\alpha) & \text{if event is of type } \alpha \\ \text{Prob}(\beta) & \text{if event is of type } \beta \end{cases}$$

Variable population size
- In a small population lineages coalesce quickly
- In a large population lineages coalesce slowly
This leaves a signature in the data. We can exploit this and estimate the population growth rate $g$ jointly with the population size $\Theta$.

Exponential population size expansion or shrinkage
[Figure: population size over time under exponential expansion or shrinkage]
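The effect of population size on coalescence speed can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it assumes the model $N(t) = N_0 e^{-gt}$ with $t$ in generations before the present, a per-generation coalescence rate of $1/(2N(t))$ for a pair of lineages, and a continuous-time approximation. The function names (`pop_size`, `pair_coalescence_time`, `mean_time`) and the parameter values are hypothetical and do not come from the slides, which may use a different (for example mutation-scaled) parameterization.

```python
import math
import random


def pop_size(n0, g, t):
    """Population size t generations in the past, assuming N(t) = N0 * exp(-g * t)."""
    return n0 * math.exp(-g * t)


def pair_coalescence_time(n0, g, rng):
    """Draw the coalescence time (in generations) of two lineages.

    The coalescence rate at time t is 1 / (2 N(t)). Integrating it gives the
    cumulative hazard (exp(g t) - 1) / (2 N0 g), which is inverted here.
    """
    e = rng.expovariate(1.0)
    if g == 0.0:
        return 2.0 * n0 * e          # constant size: exponential with mean 2 N0
    x = 1.0 + 2.0 * n0 * g * e
    if x <= 0.0:
        return math.inf              # g < 0: the pair may never coalesce
    return math.log(x) / g


def mean_time(n0, g, reps=20000, seed=1):
    """Average pairwise coalescence time over many simulated pairs."""
    rng = random.Random(seed)
    return sum(pair_coalescence_time(n0, g, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    print("constant size      :", mean_time(1000, 0.0))
    print("growing, g = 0.001 :", mean_time(1000, 0.001))
```

With a positive growth rate the population was smaller in the past, so the simulated pairs coalesce sooner than under constant size. This is the kind of signature in the data that a joint estimate of growth rate and population size exploits.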
Grow a frog
[Figure: likelihood curve for the growth rate $g$]

Mutation rate   Population size 10000 generations ago   Population size at present
10^-8           8,300,000                               8,360,000
10^-7           780,000                                 836,000
10^-6           40,500                                  83,600

Gene flow

$$p(G \mid \Theta, M) = \prod_{\text{all time intervals}} f(\text{waiting time}) \times \begin{cases} \text{Prob}(\text{coalescence}) & \text{if event is a coalescence} \\ \text{Prob}(\text{migration}) & \text{if event is a migration from } j \text{ to } i \end{cases}$$

Gene flow: What researchers used and still use
Sewall Wright showed that

$$F_{ST} = \frac{1}{1 + 4Nm}$$

and that it assumes
- migration into all subpopulations is the same
- population size of each island is the same

Simulated data and Wright's formula
[Figure: two populations with sizes $\Theta_1$ and $\Theta_2$ exchanging migrants]
[Table: true values compared with values estimated using Wright's formula]

Maximum Likelihood method to estimate gene flow parameters
Beerli and Felsenstein 1999
100 two-locus datasets with 25 sampled individuals for each of 2 populations and 500 base pairs (bp) per locus.

            Population 1           Population 2
            Θ         4Nm_1        Θ         4Nm_2
Truth       0.05      10.00        0.0050    1.00
Mean        0.0476    8.35         0.0048    1.21
Std. dev.   0.0052    1.09         0.0005    0.15

Some examples of possible migration models
[Figure: three migration models labelled Full-sized, Mid-sized, and Economy]

Comparison of approximate confidence intervals

Population pair   Model        M = m/μ: 2.5%   MLE   97.5%
Africa-Asia       Full         300             490   2190
                  [...]        30              590   2590
Asia-Africa       Full         0               0     650
                  Restricted   20              360   1600

Comparison between migrate and genetree
Beerli and Felsenstein 2001
[Figure: estimates of $4Nm$ plotted against $\Theta$]

Recombination rate estimation

Haplotypes
Either haplotypes must be resolved or the program must integrate over all possible haplotype assignments.

Recombination rate estimation
Kuhner et al. 2000
Human lipoprotein lipase (LPL) data: 9734 bp of intron and exon data derived from three populations: African Americans from Mississippi, Finns from North Karelia, Finland, and non-Hispanic Whites from Minnesota.
Population      Haplotypes   $\Theta_W$   $\Theta_K$   $r_H$    $r_K$
Jackson         48           0.0018       0.0072       1.4430   0.1531
North Karelia   48           0.0013       0.0027       0.3710   0.3910
Rochester       46           0.00014      0.0031       0.3350   0.2273
Combined        142          0.0016       0.0073       0.6930   0.1521

Haplotypes are the number of haplotypes in each section of the data set. $\Theta$ is from Watterson's estimator ($\Theta_W$) and RECOMBINE ($\Theta_K$); $r$ is from Hudson's estimator ($r_H$) and RECOMBINE ($r_K$).

Estimation of divergence time
Wakeley and Nielsen 2001
[Figure: genealogy of two diverging populations, with the axis running from Present to Past and the divergence time marked]

Figure 7. The joint integrated likelihood surface for T and M estimated from the data by Orti et al. (1994). Darker values indicate higher likelihood.

Selection coefficient estimation
Krone and Neuhauser 1999, Felsenstein unpubl.

Joint estimation of recombination rate and gene flow

Joint estimation of recombination rate and migration
[Figure: two panels of joint estimates on logarithmic axes]

Any questions?
Pointers to software through http://evolution.gs.washington.edu/lamarc/popgensoftware.html

Population Genetics using Trees
Peter Beerli, Genome Sciences, University of Washington, Seattle, WA

Outline
1. Introduction to the basic coalescent
- Population models
- The coalescent
- Likelihood estimation of parameters of interest
- Why do we need Markov chain Monte Carlo?
2. Extensions and examples

Population genetics can help us to find answers
- The PCR revolution allows us to generate lots of data from many individuals and many loci
- We are still interested in questions like:
  Where are we or other species coming from?
  How big are populations?
  Are populations species?
  What is the recombination rate in species x?

Population genetics in the age of genomics
- Why do we need theoretical population genetics when we can have the complete sequence of our favorite organism?

Basics: Wright-Fisher population model
[Figure: individuals in successive generations of a Wright-Fisher population]
All individuals release many gametes, and new individuals for the next generation are formed randomly from these.

Wright-Fisher population model
- Population size N is constant through time
- Each individual gets replaced every generation
- Next generation is drawn randomly from a large gamete pool
- Only genetic drift is manipulating the allele frequencies

The Coalescent
Sewall Wright showed that the probability that 2 gene copies come from the same gene copy in the preceding generation is

$$\text{Prob}(\text{two genes share a parent}) = \frac{1}{2N}$$

The first gene copy has a parent with probability 1; the second has the same parent with probability $1/(2N)$.

The Coalescent
[Figure: two lineages traced from the Present back to the Past]
In every generation there is a chance of $1/(2N)$ to coalesce. Following the sampled lineages through generations backwards in time, we realize that the time to coalescence follows a geometric distribution: the expectation of the time of coalescence $u$ of two tips is $2N$.

The coalescent
J.F.C. Kingman generalized this for $k$ gene copies:

$$\text{Prob}(k \text{ copies are reduced to } k-1 \text{ copies}) = \frac{k(k-1)}{4N}$$

Kingman's n-coalescent
[Figure: genealogy of a sample traced from the Present back to the Past]
The expectation for the time interval $u_k$ is

$$E(u_k) = \frac{4N}{k(k-1)}$$

$$p(G \mid N) = \prod_k \exp\left(-u_k \frac{k(k-1)}{4N}\right) \frac{1}{2N}$$

Naively, we could estimate:

1. Time of the most recent common ancestor
For a population size we can calculate the time of the most recent common ancestor (MRCA):
1. Get a TRUE genealogy (topology and branch lengths) from an infallible oracle
2. Get the population size from the same oracle
3. Calculate the time of the MRCA by summing over all intervals

1. Time of the most recent common ancestor: Shortcut
1. Get the population size from another oracle
2. Use the expectation for your data type to get an estimate of the time of the MRCA
The expectation for the time of the MRCA is
- $4N$ for diploid organisms
- $2N$ for haploid organisms
- $N$ for maternally transmitted mtDNA and the paternally transmitted Y chromosome (assumption: sex ratio is 1:1)
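The interval expectations above can be summed to check the shortcut for the time of the MRCA. The following is a minimal sketch, assuming a diploid population, exponentially distributed waiting times with rate $k(k-1)/(4N)$, and hypothetical function names (`expected_intervals`, `expected_tmrca`, `simulate_tmrca`) that are not part of the slides.

```python
import random


def expected_intervals(n_tips, N):
    """Expected length of each interval u_k, E(u_k) = 4N / (k (k - 1))."""
    return {k: 4.0 * N / (k * (k - 1)) for k in range(n_tips, 1, -1)}


def expected_tmrca(n_tips, N):
    """Expected time of the MRCA: the sum of all expected intervals.

    The sum telescopes to 4N (1 - 1/n_tips), which approaches 4N for large samples.
    """
    return sum(expected_intervals(n_tips, N).values())


def simulate_tmrca(n_tips, N, rng):
    """Draw one time of the MRCA by summing exponential waiting times."""
    t = 0.0
    for k in range(n_tips, 1, -1):
        rate = k * (k - 1) / (4.0 * N)   # rate at which k lineages drop to k - 1
        t += rng.expovariate(rate)
    return t


if __name__ == "__main__":
    N = 10000
    rng = random.Random(42)
    draws = [simulate_tmrca(10, N, rng) for _ in range(5)]
    print("expected, 2 tips  :", expected_tmrca(2, N))
    print("expected, 10 tips :", expected_tmrca(10, N))
    print("five simulated draws:", [round(d) for d in draws])
```

For two tips the sum gives $2N$, matching the pairwise result, and the individual simulated draws scatter widely around the expectation, which is why a single genealogy is a weak basis for estimation.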
2. Calculate the size of the population
1. We get THE genealogy from our oracle
2. We know that we can calculate $p(\text{Genealogy} \mid N)$

We remember the probability calculation:

$$p(G \mid N) = p(u_1 \mid N, k) \times p(u_2 \mid N, k-1) \times \cdots$$

$$p(\text{Genealogy} \mid N) = \prod_j \exp\left(-u_j \frac{k_j(k_j-1)}{4N}\right) \frac{1}{2N}$$

[Figure: $\text{Prob}(G \mid N)$ plotted against population size]
[Figure: two genealogies, with $N = 2270$ and $N = 12286$]

Problems with these very naive approaches
We assume we know the TRUE genealogy (topology and branch lengths).

Variability of the coalescent
[Figure: 10 coalescent trees generated with the same population size, $N = 10000$]

Variability of mutations
[Figure: mutations placed on a genealogy]

How many samples do we need?

Summary of the basic coalescent
- Mathematically tractable way to calculate probabilities of genealogies in a population
- The coalescent is a very noisy distribution of times on a tree
- Variability because of mutation increases the uncertainty of these times
- The population size is correlated with the depth of the tree
- Estimates of population size or the time of the MRCA from a single tree are very error-prone

Parameter estimation using maximum likelihood
- Mutation model: nucleotide mutation model
- Population genetics model: the coalescent

$$\text{Prob}(N, \mu \mid \text{data}) = \frac{\text{Prob}(\text{data} \mid N, \mu)\,\text{Prob}(N, \mu)}{\text{Prob}(\text{data})}$$

$$L(N, \mu) = \text{Prob}(\text{data} \mid N, \mu) = c\,\text{Prob}(N, \mu \mid \text{data})$$

Parameter estimation using maximum likelihood

$$L(N, \mu) = \text{Prob}(\text{data} \mid N, \mu) = \sum_G \text{Prob}(G \mid N)\,\text{Prob}(\text{data} \mid G, \mu)$$

We cannot observe the mutation events. Instead of estimating $N$ and $\mu$ we estimate the product $\Theta = 4N\mu$ and scale $G$
with $\mu$:

$$L(\Theta) = \sum_G p(G \mid \Theta)\,\text{Prob}(\text{data} \mid G)$$

Problem: We need to integrate over all genealogies:
- all different labelled histories
- all different branch lengths

Number of tips   Number of labelled histories
10               2,571,912,000
15               6,958,057,668,962,400,000
20               about 5.6 × 10^29

How do we change a genealogy?
[Figure: a genealogy before and after a local rearrangement]

Markov chain Monte Carlo
- Create a new tree
- Evaluate

$$r = \frac{p(G_2 \mid \Theta)\,P(D \mid G_2)\,P(G_1 \mid G_2)}{p(G_1 \mid \Theta)\,P(D \mid G_1)\,P(G_2 \mid G_1)}$$

  which luckily reduces most often to

$$r = \frac{P(D \mid G_2)}{P(D \mid G_1)}$$

- Make another change to the tree
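The acceptance step above can be tried out on a deliberately tiny problem. The sketch below is not the sampler used in any of the programs mentioned in these slides. It assumes a "genealogy" of only two sequences, described by a single mutation-scaled coalescence time $t$ with prior rate $2/\Theta$, and data consisting of $S$ differences over $L$ sites with a Poisson likelihood of mean $2tL$. New times are proposed directly from the coalescent prior, so the prior and proposal terms of $r$ cancel and only the ratio $P(D \mid G_2)/P(D \mid G_1)$ remains. All names (`log_lik`, `mcmc_times`) and parameter values are hypothetical.

```python
import math
import random


def log_lik(S, L, t):
    """log P(D | G): Poisson probability of S differences given coalescence time t."""
    lam = 2.0 * t * L                      # two branches of length t, L sites
    if lam <= 0.0:
        return 0.0 if S == 0 else -math.inf
    return S * math.log(lam) - lam - math.lgamma(S + 1)


def mcmc_times(S, L, theta, n_steps, rng):
    """Metropolis-Hastings chain over the coalescence time of two sequences.

    Proposals are drawn from the prior p(t | theta), an exponential with
    rate 2 / theta, so the acceptance ratio is the likelihood ratio alone.
    """
    t = rng.expovariate(2.0 / theta)
    samples = []
    for _ in range(n_steps):
        t_new = rng.expovariate(2.0 / theta)            # create a new "tree"
        log_r = log_lik(S, L, t_new) - log_lik(S, L, t)  # evaluate r on the log scale
        if log_r >= 0.0 or rng.random() < math.exp(log_r):
            t = t_new                                    # accept the change
        samples.append(t)                                # otherwise keep the old tree
    return samples


if __name__ == "__main__":
    rng = random.Random(7)
    chain = mcmc_times(S=10, L=500, theta=0.01, n_steps=20000, rng=rng)
    kept = chain[2000:]                                  # discard burn-in
    print("posterior mean of t:", sum(kept) / len(kept))
```

For this toy model the posterior of $t$ is a gamma distribution with mean $(S+1)/(2/\Theta + 2L)$, so the chain average can be checked against a known answer. Real genealogy samplers change topology and branch lengths locally instead of redrawing the whole tree.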