# Seminar In Ecology ECL 290

UCD



This 46 page Class Notes was uploaded by Shaina Mohr PhD on Tuesday September 8, 2015. The Class Notes belongs to ECL 290 at University of California - Davis taught by Staff in Fall. Since its upload, it has received 84 views. For similar materials see /class/187702/ecl-290-university-of-california-davis in Ecology at University of California - Davis.


Date Created: 09/08/15

## Genetic Distance: The math-free version

### Distance: a familiar word

- Geographic distance: essentially a way of describing how different two physical locations are.
- There are many different ways to measure distance:
  - As the crow flies
  - Mileage in a car
  - Driving time

### Genetic distance

- Same idea, except now we're talking about how different some nucleic acids are.
- Could be individuals, populations, species.
- Similarly, there are tons of ways to express these differences, each with its own merits.

### Dissimilarity is the key

- "Different" is a binary statistic: either 1 or 0, true or false, yes or no.
- Either/or statistics are not very informative.
- Dissimilarity (or similarity) is vastly more informative; it comes in shades of grey:
  - Close to zero: not so dissimilar
  - Close to one: not much in common

### Inputs

- Any genetic data will do:
  - Allele frequencies, genotype frequencies, allele sizes
  - Allozymes, SNPs, RAPDs, AFLPs, mtDNA
  - And everyone's favorite: microsatellites
- The more loci, the merrier.
- The more alleles per locus, the merrier.
- The more individuals, the merrier.

### User beware

- It is possible to have too much power.
- Statistically significant does not equal BIOLOGICALLY significant.
- Especially with msats: loci galore, alleles galore, and cheap enough for individuals galore (assuming you don't have to develop your own, of course).

### Assumptions

- Markers must be neutral. Selection can lead to overestimation (differential selection) or underestimation (stabilizing selection). Test for this by looking for deviations from HWE.
- Markers must be unlinked.
- Avoid null alleles; they lead to overestimation of homozygotes.

### Models of evolution

Each with its own distances, of course.

- IAM (infinite alleles model)
  - Any allele can mutate to any other allele.
  - Doesn't allow homoplasy; leads to overestimation.
- SMM/TPM (stepwise mutation model / two-phase model)
  - SMM orders alleles, and mutations can only proceed one step forward or back.
  - Can correct for homoplasy.
  - TPM is just SMM with rare multistep events.

### Measures of distance

IAM:

- DSA: proportion of shared alleles
- DS: Nei's D (allele frequencies)
- DCH: chord distance (linear)
- θ: angular distance

SMM/TPM:

- SB: sum of squares of difference in allele sizes
- (δμ)²: delta mu squared

### Some random comments gleaned from putting this together

- DSA: good for assigning individuals to populations.
- DS: assumes instantaneous fragmentation.
- DCH: good for dendrograms, but you need lotsa loci (>30) and distantly related populations.
- θ: highly sensitive to population size.
- SB: SMM distances are best for distantly related populations and higher mutation rates.
- (δμ)²: in principle, independent of pop size.

### The bottom line

- You want linearity and low variance.
- Choice of a distance depends on the model of evolution. IAM tends to work for less diverged pops or lower mutation rates; SMM is often best for msats, which have high mutation rates.
- All of these distance metrics are sensitive to demography.
- But the good news: all of them are improved with more loci, more alleles, and more individuals, so at least there's that.

### So what do we do with these again?

- Assigning individuals to populations
- Estimating divergence
- Estimating gene flow
- Reconstructing phylogenies
- Or Mantel's tests

## Mantel's Test

Guess who came up with this one? In 1967.

### So what is it?

- Basically, it's a correlation between matrices consisting of some measurements, estimations, or predictions of dissimilarity (or similarity, of course).
- Parametric analyses (the conventional kinds of statistics) are confounded by autocorrelation among variables, and this is the workaround.

### Matrices can come from anywhere

- Experimental estimates of distance
- Predicted distances generated from a theoretical model
- Geographic variables
- Environmental predictor variables
- You get the picture: the key is that it can be adapted for variables of different logical types, e.g. categorical, rank, or interval-scale data.

### What does it mean?

- The operative question: do samples that are similar for metric 1 tend to be similar for metric 2?
- Don't worry, I'll try to make this more clear in a
minute.

- Significance must be assessed using permutation.
- Importantly, all nonlinear relationships are lost in these tests.

### Mantel's test on geographic distances

- Species similarity is the dependent distance matrix, for example.
- Geographic distance (spatial dissimilarity) is the predictor matrix.
- The question: are samples that are found close together similar in their species composition?

### More simple Mantel's

- Species similarity and rainfall: do similar precipitation levels lead to similar species composition?
- Observed and predicted species similarity: is the predicted composition similar to the observed composition?
- And so on.

### Partial Mantel's test

- Things like geographic location and climatic variables are obviously not unrelated, which will confound analysis.
- Ideally we'd like to know how much is explained by the predictors and whether the residuals themselves are similar to another matrix.

### Genetic distance, isolation-by-distance, and historical divergence

- Telles and Diniz-Filho 2005
- Isolation-by-distance (IBD) explained 52% of the variation in genetic distance.
- Historical divergence of Eugenia dysenterica in Brazil, using a binary matrix of eastern and western groups, explains 72% of the variation.
- To resolve these contributions they used a partial Mantel's test.

### The results

[Figure 2 from Telles and Diniz-Filho 2005; labels: IBD, long-term divergence]

- The long-term divergence matrix is itself correlated with geography, therefore large overlap between the two is expected.
- a is the variation explained by historical divergence alone and c is from IBD alone, and they clearly show that a simple Mantel's test of IBD alone, though plenty significant, does not tell the whole story.

[Figure: schematic of assignment approaches. Labels: populations, individuals, allele frequencies, genotypes, direct methods, Bayesian methods. Outcomes: individuals classified in source pop; individuals misclassified; individuals excluded or classified in more than one pop.]

## Assignment Testing: The Basics

Manel et al. 2005, Assignment methods (TREE paper, class website)

### Definitions

- Assignment method: ascribing population membership of individuals or groups.
- Assignment test: hypothesis test that a multilocus genotype arises from a particular population.
- Admixture: composite gene pool from more than one population.

### More definitions

- Classification: assigning individuals to predefined categories.
- Clustering: decomposing a mixture into component parts (gene pools or pops); groupings unknown.
- Mixture analysis: estimates the proportion of individuals from different source populations.
- Parentage analysis: determining parents of an individual or group.

### Traditional assignment tests

1. Array of source populations sampled.
2. Expected genotypic probabilities (likelihood) computed from potential source populations w/ Monte Carlo simulations.
3. Individuals tested against a confidence interval (exclusion threshold): in or out for each population.

### Genetic mixture analysis

1. Likelihood for individuals belonging to source populations calculated as in assignment tests.
2. Maximum likelihood used to estimate posterior source probabilities for each individual.
3. The proportion of each population's contribution to admixture is estimated.

### Parentage analysis

1. Potential parents genotyped.
2. All but one pair excluded based on progeny genotype.
3. Fractional parentage analyses can be used, as in mixture analysis.
4. Can estimate mating structure.

### AM statistical methods

1. Frequentist
   - Predefined or simulated frequency distribution
2. Likelihood
   - Maximum likelihood: point estimates of model parameters that maximize the likelihood function
   - Bayesian: posterior distributions for model parameters

See Manel et al. 2005, Table 1.

### GENECLASS2

- Frequentist
- Monte Carlo resampling algorithm
- 1st-generation migrant detection
- Calculates exclusion probabilities (STRUCTURE does not)
- Highly conservative

## Genepop

### How to estimate Fis, Fst, and Fit

1. Go to Option 6: Fst & other correlations.
2. Choose Allele identity (F-statistics).

Submit file:
SturgeoniGenepopimput. Results file: GenepopiFstin cC.

Discussion question: Why do we want to use both? What is the better one?

### How can we test for population differentiation?

a. Exact tests

1. Go to Option 3: Population differentiation.
2. Choose Genotypic differentiation for all populations when they are not in HWE.

Submit file: SturgeoniGenepopimput. Results file: Genepopiexacttesthenotypic.

Note: We should apply the Bonferroni correction on each P-value obtained for each locus.

Discussion question: Should we take out population CAN, which is not in HWE, and do the genic differentiation? Is the genic more powerful than the genotypic method?

## FSTAT: software for calculating F-statistics, exact tests, and more

1. Open the FSTAT program. As the program opens, it may or may not ask you to provide random numbers for initialization of statistical tests. I think it only might do this the first time you open the program on a particular computer.
   - 1a. Before you can open your file in FSTAT, you need to convert it to the proper input format. CONVERT cannot directly create an FSTAT file for you, but it can convert your data to GENEPOP format, and FSTAT will convert from GENEPOP to FSTAT format.
2. To convert your file from GENEPOP to FSTAT format, go to the Utilities pull-down menu and select File Conversion (Utilities → File Conversion → Genepop → FSTAT).
3. Select the GENEPOP file you want. FSTAT might ask you under what name you want to save the FSTAT-converted file. It may just create a converted file with the same name and a .dat extension. If it asks for a name, enter an informative name and hit Save.
4. Go to the File pull-down menu and select Open. Select your newly converted FSTAT file. Notice that after you open the data file, you can now manipulate the analysis options on the program interface.
5. Select the following options. Under the menu Global Statistics, choose Weir and Cockerham F-statistics. This will allow you to estimate Fis, Fit, and Fst. To perform an exact test, look under the menu Testing - Population Differentiation and choose Test NOT assuming HW within samples. Set the number of permutations (1000 is suggested if you have fewer than 10 loci). Also choose a Nominal Level for Multiple Tests: 5/100, which is selecting 0.05 as your alpha level. The program will use this as your baseline alpha when it makes a Bonferroni correction on your data.
6. Hit the Run button. Your output file will be located in the same directory as your input file and will have the same name with an .OUT extension. This output file will give you an estimate of Fis, Fit, and Fst for each locus and over all loci, plus the 95% confidence intervals (CI) for F (Fit), θ (Fst), and f (Fis), which are estimated by bootstrapping over loci. Results for the exact tests are also included. If the 95% CI around your Fis, Fit, and Fst estimates includes zero, then your estimate is not significantly different from zero. If the CI does not include zero, you can say the estimate is significant.

Discussion question: What are the differences between CI and exact tests? Are these ones more powerful?

### Some references

- Cockerham CC and Weir BS. 1993. Estimation of gene flow from F-statistics. Evolution 47:855-863.
- Excoffier L. 2001. Analysis of population subdivision. In: Handbook of Statistical Genetics, Balding, Bishop & Cannings (Eds.). Wiley & Sons Ltd.
- Goudet J. 1999. An improved procedure for testing key innovations. American Naturalist 153:549-555.
- Goudet J, Raymond M, De Meeûs T, and Rousset F. 1996. Testing differentiation in diploid populations. Genetics 144:1933-1940.
- Nei M. 1973. Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70:3321-3323.
- Raymond M and Rousset F. 1995. An exact test for population differentiation. Evolution 49:1280-1283.
- Slatkin M. 1995. A measure of population subdivision based on microsatellite allele frequencies. Genetics 139:457-462.

## So you want to estimate pairwise Fsts in GDA

1. Open the data file (File → Open) in GDA.
2. Start a log file (File → Log) and save it under an informative file name. Now all your results will be saved in an output file.
3. Go to the Distance pull-down
menu and click on Options (Distance → Options). A dialog box will appear that will give you several choices of what type of analysis you want to perform. The top of the box will ask you what type of genetic distance metric you want to calculate. This is an easy way to estimate pairwise Fst in GDA, but the P-values will not be calculated. Select coancestry identity from the pull-down menu for above the diagonal. Select any other genetic distance measure you might be interested in for below the diagonal (more on genetic distance later).

4. Hit the Estimate button. Note that the distance matrix will be strangely truncated in the program window due to space constraints. You can open the output file in Excel and play around with it to get it into the proper orientation (remember, there should be a diagonal down the middle).
5. Close the log file to save your results.

## Getting 95% CI for pairwise Fsts in GDA

1. Open your data file in GDA. Your input file should be in NEXUS format (CONVERT can do this for you). Hint: Double-check the number of populations, loci, and individuals that appear in GDA to make sure there are no errors in the data file and that you have opened the correct one.
2. Open a log file (File → Log) and save it under an informative file name. Now all your results will be saved in an output file.
3. GDA will provide you with a description of the analyses you will be running. Go to the Misc pull-down menu and select Preferences. Click the little box that says "verbose" next to it (top right-hand corner of the dialog box).
4. Time to get sneaky. There are two ways to estimate pairwise Fsts in GDA. I am going to mention an alternative method when I discuss genetic distances. The first way, which I will show you now, will allow you to evaluate the significance of your estimate. First open the Misc pull-down menu and select Include/Exclude Populations (Misc → Include/Exclude Populations).
5. The Include/Exclude Population menu will allow you to select which populations you want to include in your analysis. The default condition is that all populations are included. First we are going to estimate pairwise Fst among our first two populations. Highlight all populations with the exception of the first two (listed alphabetically) and hit the Exclude button. Now only the first two populations will be included in subsequent analyses. Select Okay to close the box. Hint: GDA will tell you now that only two populations will be included in analyses.
6. Open the Fstats pull-down menu and select Bootstrap across loci (Fstats → Bootstrap across loci). This will estimate Fis, Fit, and Fst and bootstrap across loci to create a 95% confidence interval for them. Because only two populations are included in analyses, Theta-P (aka Weir and Cockerham's Fst) is the pairwise Fst estimate for the first two populations. Hint: If your confidence interval does not include zero, the pairwise estimate is significant (aka significantly different from zero).
7. Repeat this process for all additional combinations of populations. Hint: The only problem with this method is that it will not allow you to perform a Bonferroni correction for multiple comparisons. Other programs (FSTAT, SPAGeDi) will provide you with an exact P-value that can be used when conducting Bonferroni corrections.

## Pairwise Fsts, P-values, and Bonferroni corrections in FSTAT

1. Open the FSTAT program. As the program opens, it may or may not ask you to provide random numbers for initialization of statistical tests. I think it only might do this the first time you open the program on a particular computer.
   - 1a. Before you can open your file in FSTAT, you need to convert it to the proper input format. CONVERT cannot directly create an FSTAT file for you, but it can convert your data to GENEPOP format, and FSTAT will convert from GENEPOP to FSTAT format.
2. To convert your file from GENEPOP to FSTAT format, go to the Utilities pull-down menu and select File Conversion (Utilities → File Conversion → Genepop → FSTAT).
3. Select the GENEPOP file you want. FSTAT might ask you under what name you want to save the FSTAT-converted file. It may just
create a converted file with the same name and a .dat extension. If it asks for a name, enter an informative name and hit Save.

4. Go to the File pull-down menu and select Open. Select your newly converted FSTAT file. Notice that after you open the data file, you can now manipulate the analysis options on the program interface.
5. Select the following options: Fst per pair of samples (under Global statistics on the "F-statistics, Testing and Disequilibrium" tab) and Pairwise Tests of Differentiation (found at the bottom left of the program interface). Also choose a Nominal Level for Multiple Tests. This sets the initial alpha level that FSTAT will modify when it conducts a Bonferroni correction on your P-values (most people select 5/100, but this would depend on the questions being asked). To minimize the amount of data in your output, I would deselect all other options on the program interface, unless you are interested in those at this point as well.
6. Hit the Run button. FSTAT is unusual in that it generates multiple results files, which can be confusing. Open these in either Notepad or Excel.
   - The output file that you actually named when you conducted the initial file conversion will give you a bunch of summary statistics: F-statistics at each locus, along with variance components, and F-statistics bootstrapped across loci for 95% CI.
   - The file labeled FSTAT gives you the seeds it uses for its randomization tests. This may not be interesting.
   - One output file will include your pairwise Fst matrix (.FST extension).
   - Another file, with "pp" in the name and .pvl as an extension, will give your P-values and the Bonferroni-corrected alpha; its significance matrix will indicate which positions on the pairwise Fst matrix are significant.
   - You may also get a file that will appear to your computer to be in Winamp format. I thought this might be a little cadence of fanfare to announce your results, but if you open it in Notepad it turns out to be a matrix of values I can't identify. All values in the matrix generated from the sturgeon data were above 1, so it can't be Fst or P-values.

## Quick tutorial: Pairwise Fsts in SPAGeDi

Another way: get some real P-values.

1. Download SPAGeDi from its website.
2. Put your file into the proper format. Formatting the input file from scratch will take a long time and is extremely frustrating, but the program will convert from GENEPOP or FSTAT file formats (CONVERT can make a GENEPOP file for you). Like in GDA, to get P-values for pairwise estimates you can only analyze two populations at a time. All input files must consist of population pairs only. Hint: Make sure the input file and the SPAGeDi program are stored in the same folder. The program will not run otherwise.
3. Open SPAGeDi. You will see a DOS window. This may inspire a wave of panic, but don't worry, the program isn't that difficult to manipulate.
4. SPAGeDi will prompt you to enter the name of your input file. If you are opening a GENEPOP or FSTAT formatted file, hit space and then Enter. If you are opening a file formatted by hand, type in the name and be sure to add the .txt extension. Hit Enter.
   - 4a. Select whether you have a GENEPOP or FSTAT input format. Enter 1 or 2 and hit Enter.
   - 4b. Type the name of your input file. Be sure to include the file extension if there is one.
   - 4c. SPAGeDi will ask you to rename the file for SPAGeDi input. You might want to change the name to avoid losing the file in its original GENEPOP or FSTAT format. Don't forget the .txt file extension. Hit Enter when the program asks you if you want to proceed.
5. Now the program will ask you to name the results file. Again, be sure to add the .txt extension so the output will be saved properly. Hit Enter.
6. The program will give you a quick summary of the data. Double-check this to make sure your file is being read correctly by the program. I have had instances where mistakes have occurred at this point. Hit Enter if there are no discrepancies.
7. SPAGeDi will ask you at what level you want to conduct these
analyses. Select Population Level by entering 2. Hit Enter.

8. SPAGeDi will now list all of the statistics available under the Population Level option. Note that you can calculate Fsts several ways and you can conduct several analyses at once. We only want to estimate pairwise Fsts here, so select 1. Hit Enter.
9. The next menu is for Computational Options. This allows you to incorporate spatial information if you have it (option 1). We want to determine the significance of our pairwise Fst estimates, which can be done with permutation tests. Select 3 and hit Enter. Hint: You may get a warning message here informing you that you have no spatial information in your data file. At this point you have the option to add spatial info from another file. If you don't have any of these data, just hit Enter.
10. Now you will see a list of permutation options. If you select 1, your output will include only the P-value for your estimates and not the thousands of permutations necessary to get that P-value. The second two options allow you to control how the permutations are conducted. I have always stuck with default values. Select 1 to proceed with default settings. Hit Enter.
11. SPAGeDi will ask you to enter the number of permutations you would like to conduct. I have always selected the maximum number, 20000. It should only take 3-5 minutes to run through 20000 permutations.
12. Now you can select how you would like your output to be formatted. Select 3 for pairwise genetic statistics.
13. This next menu is also about formats, in particular for pairwise stats. Option 1 will give you an output with values in columns, as opposed to matrices. Because you are only dealing with a pair of populations, columns are sufficient.
14. Your output file will be saved to the same folder as your input file. I recommend opening the output in Excel. At the top you will see allele frequency information for each locus. Next you will see a list of F-statistics (Fis, Fit, Fst) for each locus and an estimate for ALL LOCI. The estimate for ALL LOCI is your pairwise Fst for your population pair. Below this is a list of P-values for the F-stats at each locus and ALL LOCI. The P-value for ALL LOCI is the P-value for your pairwise estimate. Finally, you get the above results in an alternate format (row).

## Calculating Genetic Distance in GDA

1. Open the data file (File → Open) in GDA.
2. Start a log file (File → Log) and save it under an informative file name. Now all your results will be saved in an output file.
3. Go to the Distance pull-down menu and click on Options (Distance → Options). A box will appear that will give you several choices of what type of analysis you want to perform. The top of the box will ask you what type of genetic distance metric you want to calculate. You can calculate two genetic distance statistics simultaneously, with the results of one shown above the diagonal in the distance matrix and one below the diagonal. Choose what measure you want above and below the diagonal from the pull-down menu. Be sure to click the box labeled Show distance matrix. This is an alternative way to estimate pairwise Fst in GDA, but the P-values will not be calculated.
4. The second portion of the Options dialog box will ask you what kind of tree you want to be generated from your distance matrix: WPGMA, UPGMA, or Neighbor-Joining. Select the option you would like from the pull-down menu. The tree will be generated from the values located below the diagonal, so make sure that the estimates from which you want to infer a tree are below the diagonal on the distance matrix. Be sure to select the Show Phenogram box if you want to see your tree in the program interface. Hint: I would advise not selecting the Use Line Drawing Characters box. This will construct the branches of your tree with letters and numbers instead of lines. It looks awful and can be especially confusing if the branch is constructed with the same letters found in the population name/abbreviation.
5. Hit Estimate. Note that the distance matrix will be strangely truncated in the program window due to space constraints. You can
open the output file in Excel and play around with it to get it into the proper orientation (remember, there should be a diagonal down the middle).

6. Stop the results log by unchecking the Log option under the File pull-down menu (File → Log).
7. If you would like to see your tree in a more interpretable format, open it in the TreeView program. First open the TreeView program. You need to have a printer driver installed on your computer in order to do this. Then open the Dist pull-down menu and select Invoke TreeView (Dist → Invoke TreeView). Hint: TreeView needs to already be open for this function to work.

## Comparison: Cervus and ML-Relate

Two different takes on one data set.

### Data set

- 6 parents, mass spawned
- 87 offspring
- All individuals genotyped
- Analyses based upon 5 microsatellite loci (no linkage)
- Nulls were identified as a new state (999 bp)
- Giving both programs as much information as possible

### Cervus

- Assigned parentage to all offspring except one contaminant.
- Assignment corresponded to manual assignment.

### Comparing data

- Cervus: we know the true relationships of all individuals.
- ML-Relate: no way to input which ones are parents and which are offspring. Ran it two ways:
  - With parents and offspring, to look at PO assignment
  - With only offspring, to see if sibships were assigned differently
- Used pairwise comparisons of 16 individuals.

### Survey says: with parents included in the ML-Relate file

PO = parent-offspring, FS = full sibs, HS = half sibs, U = unrelated.

| Classification | PO | FS | HS | U | Overall |
|---|---|---|---|---|---|
| Correct (count) | 16 | 11 | 20 | 31 | |
| Correct (%) | 100 | 39 | 34 | 100 | 58 |
| Incorrect (count) | 0 | 17 | 39 | 0 | |
| Incorrect (%) | | 61 | 66 | | 42 |

- Said that 10 full sibs were PO and 3 were unrelated.
- Said that 5 half sibs were PO.

### Without parents in the ML-Relate file

| Classification | FS | HS | U | Overall |
|---|---|---|---|---|
| Correct (count) | 14 | 22 | 41 | |
| Correct (%) | 42 | 39 | 100 | 59 |
| Incorrect (count) | 19 | 34 | 0 | |
| Incorrect (%) | 58 | 61 | | 41 |

- 7 half-sibs called PO.

### What that tells us

- These are based on the relationship ML-Relate said is most likely. In many instances (60%) it could not exclude other relationships, but sometimes it did exclude the correct one.
- ML-Relate did assign parents well, but it also assigned the PO relationship to half-sibs.
- Use all programs with caution.
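The Mantel's test slides earlier in these notes describe the test as a correlation between two distance matrices, with significance assessed by permutation. A minimal sketch of that idea in plain Python follows. The function names (`upper_triangle`, `pearson`, `permute`, `mantel`) and the toy matrices are made up for illustration, and the one-sided P-value convention (counting the observed arrangement as one permutation) is an assumption, not something the notes specify.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permute(matrix, order):
    """Reorder rows and columns together, i.e. relabel the samples."""
    return [[matrix[i][j] for j in order] for i in order]


def mantel(a, b, permutations=999, seed=0):
    """Return (r, p): observed matrix correlation and one-sided permutation P-value."""
    rng = random.Random(seed)
    flat_b = upper_triangle(b)
    r_obs = pearson(upper_triangle(a), flat_b)
    hits = 1  # assumption: the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(len(a)))
        rng.shuffle(order)
        if pearson(upper_triangle(permute(a, order)), flat_b) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical data: five sites along a transect, with genetic
    # distance exactly proportional to geographic distance.
    positions = [0, 1, 2, 4, 7]
    geographic = [[abs(p - q) for q in positions] for p in positions]
    genetic = [[0.05 * d for d in row] for row in geographic]
    r, p = mantel(genetic, geographic)
    print(f"Mantel r = {r:.3f}, permutation P = {p:.3f}")
```

Whole samples are shuffled (rows and columns together) rather than individual matrix cells, because the pairwise distances within a matrix are not independent of one another. That non-independence is the autocorrelation problem the slides mention.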
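The FSTAT handouts above mention choosing a nominal level (5/100) that the program then adjusts with a Bonferroni correction for pairwise tests. A small sketch of that arithmetic is below. The `bonferroni` helper and the P-values are hypothetical, and the sketch only shows the standard correction (alpha divided by the number of tests), not FSTAT's internal code.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the adjusted alpha and a significance flag per comparison.

    p_values maps a comparison label (here a pair of population names)
    to its P-value. The adjusted threshold is alpha / number of tests.
    """
    adjusted_alpha = alpha / len(p_values)
    flags = {pair: p <= adjusted_alpha for pair, p in p_values.items()}
    return adjusted_alpha, flags


if __name__ == "__main__":
    # Hypothetical pairwise P-values for three populations A, B, and C.
    pairwise_p = {("A", "B"): 0.001, ("A", "C"): 0.02, ("B", "C"): 0.30}
    adjusted, significant = bonferroni(pairwise_p, alpha=0.05)
    print(f"adjusted alpha = {adjusted:.4f}")
    for pair, flag in significant.items():
        print(pair, "significant" if flag else "not significant")
```

Note that the A-C comparison in this made-up example would pass at the nominal 0.05 level but fails once three tests are accounted for. This is why the handouts point out that exact P-values are needed before a correction can be applied.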
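Both the FSTAT and GDA handouts above rely on bootstrapping across loci to get a 95% CI for Fst, then checking whether that interval includes zero. The sketch below illustrates the resampling step only. It assumes each locus has already been reduced to a (numerator, denominator) pair of variance components and that the multilocus estimate is the ratio of their sums. The per-locus numbers, the function names, and the simple percentile interval are illustrative assumptions, not the programs' actual output or algorithm.

```python
import random


def multilocus_estimate(components):
    """Ratio of summed numerators to summed denominators across loci."""
    numerator = sum(num for num, _ in components)
    denominator = sum(den for _, den in components)
    return numerator / denominator


def bootstrap_ci(components, replicates=1000, level=0.95, seed=0):
    """Percentile confidence interval from resampling loci with replacement."""
    rng = random.Random(seed)
    k = len(components)
    estimates = sorted(
        multilocus_estimate([rng.choice(components) for _ in range(k)])
        for _ in range(replicates)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * replicates)]
    upper = estimates[min(replicates - 1, int((1.0 - tail) * replicates))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical per-locus (numerator, denominator) variance components.
    loci = [(0.012, 0.40), (0.030, 0.55), (0.008, 0.35),
            (0.021, 0.48), (0.015, 0.42), (0.026, 0.50)]
    estimate = multilocus_estimate(loci)
    low, high = bootstrap_ci(loci)
    verdict = "includes" if low <= 0.0 <= high else "does not include"
    print(f"Fst = {estimate:.4f}, 95% CI = ({low:.4f}, {high:.4f}); CI {verdict} zero")
```

With only a handful of loci the resampled estimates take few distinct values, so the interval is coarse. That fits the earlier advice in these notes that more loci improve every one of these statistics.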
