# Bioinformatics Methods BINF 630


These 12 pages of class notes were uploaded by Nathanael Schowalter on Monday, September 28, 2015. They belong to BINF 630 at George Mason University, taught by Iosif Vaisman in Fall.


Date Created: 09/28/15

## Lecture 7: Phylogenetic Analysis

Additional reference: Roderic D. M. Page and Edward C. Holmes, *Molecular Evolution: A Phylogenetic Approach*.

### Uses of Phylogenetic Analysis

- Evolutionary trees
- Multiple sequence alignment

### Evolutionary Problems

1. The fossil record suggests that modern man diverged from apes about 5-6 million years ago. Modern *Homo sapiens* emerged between 100,000 and 60,000 years ago.
2. DNA and sequence alignment work by Pääbo supports this.
3. Work based on mitochondrial DNA by Wilson et al. suggests that modern man emerged only 200,000 years ago, with the divergence into different races 50,000 years ago. Mitochondrial DNA:
   1. is circular
   2. is maternally inherited
   3. has a 10X faster mutation rate than nuclear DNA

### Algorithms

| Tree-building method | Data: distances | Data: nucleotide sites |
|---|---|---|
| Clustering algorithm | UPGMA, Neighbor Joining | |
| Optimality criterion | Minimum Evolution | Maximum Parsimony, Maximum Likelihood |

From Page and Holmes, *Molecular Evolution: A Phylogenetic Approach*.

### Preliminaries

A taxon (plural: taxa), or operational taxonomic unit, is an entity whose distance from other entities can be measured, e.g., a species, an amino acid sequence, or a language.

Comparisons are made on measurements or assumptions concerning rates of evolutionary change. This is complicated by back mutations, parallel mutations, and variations in mutation rate. We will only consider substitutions.

### Amino Acid Sequences

1. Substitution rates vary. For example, the amino acid substitution rate per site per year is $5.3 \times 10^{-9}$ for guinea pig but only $0.33 \times 10^{-9}$ for other organisms.
2. The evolutionary time is the average time to produce one substitution per 100 amino acids. Example: there are 2 differences in a sequence of 100 amino acids when comparing calf and pea histone H4. Since plants and animals diverged 1 billion years ago, $T_u = 0.5$ billion years (1011).
3. Probability of substitution: there are several ways to calculate it. The best way is to use the PAM matrices.

### Nucleotide Sequences

1. Nucleotide sequences differ from amino acid sequences because of redundancy in the genetic code, i.e., several codons can code for a particular amino acid. Most substitutions in the 3rd position are synonymous. For example, UC is the RNA coding for serine; the corresponding DNA would be AG.
2. Since evolution should depend on function, and function is conferred by the amino acid sequence, it has been suggested that the molecular clock should be based on the substitution rate in the third position of the codon. In fact, in the fibrinopeptides this is as high as the amino acid substitution rate.
3. In the definition of PAM matrices, one assumes a discrete Markov chain, with the PAM matrix being the transition matrix of that chain.

### Markov Chains

Assume that we have a process with discrete observable states $X_1, X_2, \ldots$. When we monitor it over time, we get a sequence of the states occupied, $q_1, q_2, \ldots$, where each $q_i$ is any of $X_1, X_2, \ldots$. This sequence is a Markov chain. Note that while there can be an infinite number of states, the Markov chain has a countable number of elements.

Another property of a Markov process is that history does not matter: the state assumed at time $t+1$ depends on the state assumed at time $t$, not on any earlier state. This is called the Markov property.

Let $X = \{X_n,\ n \ge 0\}$ be a discrete-time random process with state space $S$ whose elements are $s_1, s_2, \ldots$. $X$ is a Markov chain if, for any $n \ge 0$, the probability that $X_{n+1}$ takes on any value $s_k \in S$ is conditional on the value of $X_n$ but does not depend on the values of $X_{n-1}, \ldots, X_0$.

The one-time-step transition probabilities are

$$p_{jk}(n) = \Pr(X_n = s_k \mid X_{n-1} = s_j), \qquad j, k = 1, 2, \ldots; \quad n = 1, 2, \ldots$$

Since $X_0$ is a random variable, called the initial condition,

$$p_j(0) = \Pr(X_0 = s_j), \qquad j = 1, 2, \ldots$$

- Transition matrix: put the $p_{jk}$ into a matrix $P$. A sequence of amino acids can be thought of as a Markov chain.
- Stationary Markov process: the probabilities $p_{jk}(n)$ do not depend on $n$; that is, they are constant. A related notion: an initial distribution $\pi$ is said to be stationary if $\pi P^t = \pi$.
- Irreducible: every state can be reached from every other state.

### Application of Markov Processes to Evolutionary Models

1. The PAM matrix has its substitution probabilities determined from closely related amino acid sequences. It assumes that the substitutions have occurred through one application of the transition matrix (i.e., no multiple substitutions at a given site), and it assumes that evolutionary distance results from repeated application of the same PAM matrix.
2. A better evolutionary model is needed (text, pp. 140-144). This requires the use of a continuous Markov process rather than a discrete Markov chain. It still has the Markov property.

A time-homogeneous Markov process for the stochastic function $X(t)$ consists of a set of states $Q = \{1, 2, \ldots, n\}$, a set of initial state distributions $\pi = (\pi_1, \ldots, \pi_n)$, and transition probability functions

$$P(t) = \begin{pmatrix} P_{11}(t) & \cdots & P_{1n}(t) \\ \vdots & & \vdots \\ P_{n1}(t) & \cdots & P_{nn}(t) \end{pmatrix}$$

We can apply this to nucleotide sequences. Let $Q = \{1, 2, 3, 4\}$ correspond to $\{A, C, G, T\}$. Then

$$P(t) = \begin{pmatrix} P_{11}(t) & \cdots & P_{14}(t) \\ \vdots & & \vdots \\ P_{41}(t) & \cdots & P_{44}(t) \end{pmatrix} = \begin{pmatrix} P_{A \mid A}(t) & P_{C \mid A}(t) & P_{G \mid A}(t) & P_{T \mid A}(t) \\ P_{A \mid C}(t) & P_{C \mid C}(t) & P_{G \mid C}(t) & P_{T \mid C}(t) \\ P_{A \mid G}(t) & P_{C \mid G}(t) & P_{G \mid G}(t) & P_{T \mid G}(t) \\ P_{A \mid T}(t) & P_{C \mid T}(t) & P_{G \mid T}(t) & P_{T \mid T}(t) \end{pmatrix}$$

### Jukes-Cantor Model

[Figure: Rates of nucleic acid change. The four nucleotides A, G, C, T, with every substitution, whether a transition or a transversion, occurring at the same rate $\alpha$.]

The Jukes-Cantor model assumes that $u_1 = u_2 = u_3 = u_4 = \alpha$, yielding the rate matrix (rows and columns ordered A, C, G, T)

$$\begin{pmatrix} -3\alpha & \alpha & \alpha & \alpha \\ \alpha & -3\alpha & \alpha & \alpha \\ \alpha & \alpha & -3\alpha & \alpha \\ \alpha & \alpha & \alpha & -3\alpha \end{pmatrix}$$

Then $P_1 = P_2 = P_3 = P_4 = \alpha$. This model is used in maximum likelihood calculations.

### HKY Model

[Figure: HKY model. Purines (A, G) and pyrimidines (C, T); transitions occur at rate $\alpha$ and transversions at rate $\beta$.]

Transitions are more frequent than transversions: $\alpha > \beta$.

### Definitions

- Taxa: entities whose distance from other entities can be measured.
- A directed graph $G = (V, E)$ consists of a set $V$ of nodes (or vertices) and a set $E \subseteq V \times V$ of directed edges. Then $(i, j) \in E$ means that there is a directed edge from $i$ to $j$.
- A graph is undirected if the edge relation is symmetric, that is, $(i, j) \in E$ iff $(j, i) \in E$.
- A directed graph is connected if there is a directed path between any two nodes.
- A directed graph is acyclic if it does not contain a cycle. For example, the graph contains a cycle if $(i, j)$, $(j, k)$, and $(k, i)$ all belong to $E$.
- A tree is an undirected, connected, acyclic graph.
- A rooted tree has a starting node called the root.
- The parent of a node is the node immediately before it on the path from the root. A child of a node is a node that immediately follows it.
- An ancestor of a node is any node that comes before it on the path from the root.
- A leaf, or external node, is a node that has no children. Non-leaf nodes are called internal nodes.
- The depth of a tree is one less than the maximal number of nodes on a path from the root to a leaf.
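
The rooted-tree definitions above (leaf, ancestor, depth) can be made concrete with a small sketch. The tree, the node names, and the helper functions `leaves`, `depth`, and `ancestors` are all hypothetical and chosen only for illustration; the tree is stored as a mapping from each parent to its children.

```python
from collections import deque

# Hypothetical rooted tree, stored as parent -> list of children.
# Leaves (external nodes) simply have no entry in the mapping.
CHILDREN = {
    "root": ["n1", "C"],
    "n1": ["A", "B"],
}


def leaves(children, root):
    """Return the nodes that have no children, in breadth-first order."""
    found, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        kids = children.get(node, [])
        if not kids:
            found.append(node)
        queue.extend(kids)
    return found


def depth(children, root):
    """Maximal number of nodes on a root-to-leaf path, minus one."""
    best, queue = 0, deque([(root, 1)])
    while queue:
        node, count = queue.popleft()  # count = nodes on the path so far
        best = max(best, count)
        queue.extend((kid, count + 1) for kid in children.get(node, []))
    return best - 1


def ancestors(children, root, target):
    """Nodes that come before `target` on the path from the root, root first."""
    parent = {kid: node for node, kids in children.items() for kid in kids}
    path = []
    while target != root:
        target = parent[target]
        path.append(target)
    return path[::-1]
```

In this example tree, `A`, `B`, and `C` are leaves, `root` and `n1` are internal nodes, and the longest root-to-leaf path (`root`, `n1`, `A`) has three nodes, so the depth is two.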
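
Returning to the Jukes-Cantor model from earlier in the lecture: the following is a minimal sketch of its transition probabilities as a continuous-time Markov process. It assumes that each of the 12 possible substitutions occurs at the same rate `alpha`, in which case the standard closed-form solution is $P_{\text{same}}(t) = \tfrac14 + \tfrac34 e^{-4\alpha t}$ and $P_{\text{diff}}(t) = \tfrac14 - \tfrac14 e^{-4\alpha t}$. The function names `jc_matrix`, `matmul`, and `vecmat` are illustrative, not taken from the notes.

```python
import math

BASES = "ACGT"  # state order assumed for the rows and columns


def jc_matrix(alpha, t):
    """Jukes-Cantor transition probabilities P(t) as a 4x4 nested list.

    Assumes every substitution occurs at rate `alpha` (an assumption of
    this sketch, matching the equal-rates Jukes-Cantor model).
    """
    decay = math.exp(-4.0 * alpha * t)
    same = 0.25 + 0.75 * decay  # probability the base is unchanged after t
    diff = 0.25 - 0.25 * decay  # probability of one specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def vecmat(v, m):
    """Row vector times a 4x4 matrix."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
```

Under these assumptions, three properties from the notes can be checked numerically: each row of `jc_matrix(alpha, t)` sums to 1, the uniform distribution `[0.25] * 4` is stationary (`vecmat` leaves it unchanged), and the Markov property gives `matmul(jc_matrix(a, s), jc_matrix(a, t))` equal to `jc_matrix(a, s + t)` up to rounding.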
