# COMPUTATIONAL MOLECULAR BIOENGINEERING/BIOPHYSICS (BIOS 589)

Rice University

This 14-page set of class notes was uploaded by Jaden Stiedemann on Monday, October 19, 2015. The notes belong to BIOS 589 at Rice University, taught by Staff in Fall. Since its upload, it has received 15 views.


## Notes on Random Walk

BIOE/BIOS 589, Cheng Zhang, Nov 18, 2008

This article gives some details of the lecture I gave on the random walk. Since the class is supposed to be introductory, arguments are made plausible but not rigorous. I commonly use the approximation $\log(1+x) \approx x - x^2/2$, or equivalently $1 + x \approx \exp(x - x^2/2)$, and Stirling's approximation for the factorial, $n! = 1 \times 2 \times \cdots \times n \approx \sqrt{2\pi n}\,\exp(n \log n - n)$. The notation $x \sim y$ means "$x$ is proportional to $y$"; it is used when the coefficient of proportionality is unimportant.

**Contents**

1. Introduction
2. One-Dimensional Random Walk: 2.1 Probability Distribution; 2.2 Proof; 2.3 Continuous Limit and Gaussian Distribution; 2.4 Averages From the Distribution; 2.5 Variance; 2.6 Central Limit Theorem; 2.7 A Smart Way of Deriving the Distribution; 2.8 Self-Avoiding Walk (SAW)
3. Diffusion Equation: 3.1 Population Interpretation of Probability Distribution; 3.2 Evolution of a Distribution
4. Langevin Equation: 4.1 Free Diffusion; 4.2 Diffusion in a Potential; 4.3 Detailed Balance; 4.4 Time of Escaping a Potential Well

Appendix A: From Binomial to Gaussian. Appendix B: Deriving $R \sim N^{3/5}$ for SAW.

## 1 Introduction

The random walk is the simplest example of how a random process (a process that involves some probabilities) happens. For a random variable $X$ we study its probability instead of its absolute value: instead of saying "$X$ is 6.0", we say, e.g., "the probability of $X$ being 5.9 is 75%, and the probability of $X$ being 6.3 is 25%," etc. The random walk has a large number of applications in statistical physics; e.g., the diffusion process can be thought of as a random walk.

## 2 One-Dimensional Random Walk

### 2.1 Probability Distribution

**Problem: motion of a monkey.** Suppose on a very long street there are many equally spaced trees, labeled $\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$. We initially put a monkey at the origin, i.e., the tree labeled by $i = 0$. In each step, if the current tree index is $i$, the monkey has probability $1/2$ of jumping to the tree on the right (the $(i+1)$-th tree) and probability $1/2$ of jumping to the tree on the left (the $(i-1)$-th tree). Now, what is the probability of observing the monkey at the $i$-th tree after $N$ steps? (By the way, any motion that bears some resemblance to our monkey's is formally known as a random walk.)

This problem can be solved as follows.

1. After the first step, the probabilities of being at $i = \pm 1$ are both $1/2$.
2. After the second step, the probabilities of being at $i = -2, 0, 2$ are $1/4$, $2/4$ and $1/4$, respectively.
3. After the third step, the probabilities of being at $i = -3, -1, 1, 3$ are $1/8$, $3/8$, $3/8$, $1/8$, respectively.
4. After the fourth step, the probabilities of being at $i = -4, -2, 0, 2, 4$ are $1/16$, $4/16$, $6/16$, $4/16$, $1/16$, respectively.
5. We immediately see that the probabilities correspond to Newton's binomial coefficients, i.e., the probability of being at tree $i$ after the $N$-th step is given by

$$P_N(i) = \binom{N}{(N+i)/2} \frac{1}{2^N}. \tag{1}$$

The probability of observing the monkey at different trees can be thought of as a function of the tree index $i$; we call it the probability distribution of the tree index $i$. In particular, the distribution given by Equation (1) is a *binomial distribution*, an example of which is plotted in Figure 1.

Figure 1: A binomial distribution. *(plot not reproduced)*
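The counting behind Equation (1) is easy to check by direct simulation. The following Python sketch (my own illustration, not part of the original notes; the function names are invented here) runs many 4-step monkey walks and compares the observed frequencies against the binomial formula:

```python
import math
import random

def walk_position(n_steps, rng):
    """Final tree index after n_steps, each +-1 with probability 1/2."""
    return sum(rng.choice((-1, 1)) for _ in range(n_steps))

def binomial_prob(n_steps, i):
    """P_N(i) = C(N, (N+i)/2) / 2^N; zero when i has the wrong parity."""
    if (n_steps + i) % 2 != 0 or abs(i) > n_steps:
        return 0.0
    return math.comb(n_steps, (n_steps + i) // 2) / 2 ** n_steps

rng = random.Random(0)
N, trials = 4, 200_000
counts = {}
for _ in range(trials):
    i = walk_position(N, rng)
    counts[i] = counts.get(i, 0) + 1

for i in range(-N, N + 1, 2):
    print(f"i={i:+d}  observed {counts.get(i, 0) / trials:.4f}"
          f"  exact {binomial_prob(N, i):.4f}")
```

For $N = 4$ the exact values are $1/16, 4/16, 6/16, 4/16, 1/16$, matching item 4 in the list above.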
### 2.2 Proof

Let's prove Equation (1), just to be a little rigorous. We first denote a sequence of random walks by a sequence of $+1$'s and $-1$'s, where $+1$ means going right and $-1$ means going left. So if the monkey goes right, right, left, right, the sequence is $+1, +1, -1, +1$; the sequence conveniently records the distance the monkey moves in each step. In each step we have probability $1/2$ of getting a $+1$ or a $-1$, so the probability of observing any particular sequence of $N$ steps is always $1/2 \times 1/2 \times \cdots \times 1/2 = 1/2^N$. The position is just the sum of the $\pm 1$ sequence. Thus if $K$ steps are $+1$'s and the remaining $N - K$ steps are $-1$'s, the final position is $i = K \times 1 + (N - K) \times (-1) = 2K - N$, which gives $K = (N+i)/2$. We also know that there are $\binom{N}{K}$ different ways of choosing which of the $N$ steps are the $K$ steps with $+1$'s: e.g., if $N = 4$ and $K = 2$, you can choose steps 1 and 2, or 1 and 3, or 1 and 4, or 2 and 3, or 2 and 4, or 3 and 4 as the two $+1$ steps (and the remaining two steps as $-1$ steps), so in total there are 6 ways. All these different ways bring us to the same position $i$. Multiplying their number by the probability of each individual sequence, which equals $1/2^N$, we reach Equation (1).

### 2.3 Continuous Limit and Gaussian Distribution

Equation (1) only works for integers; we now want to study its continuous limit. With some mathematical manipulation (Appendix A), we can show that the distribution approaches

$$P_N(i) \approx \sqrt{\frac{2}{\pi N}} \exp\left(-\frac{i^2}{2N}\right). \tag{2}$$

In the continuous limit, the step size (the distance between two nearby trees) should be a small number $a$ instead of 1, and the actual position is given by $X = a \times i$. Since the distribution function is now a continuous function of $X$, it makes more sense to study the probability of being in a small interval from $X$ to $X + \Delta X$, rather than the probability of being exactly at $X$. The probability of being in an interval is generally proportional to the width of the interval $\Delta X$, so it is convenient to define a probability density as the ratio of the probability to $\Delta X$. In the present case, the minimal unit of $\Delta X$ is $2a$: e.g., for the distribution at $N = 3$, the monkey can only be at $-3a, -1a, +1a, +3a$, so the spacing between two successive reachable positions is $2a$ instead of $a$. Therefore

$$p_N(X) = \frac{P_N(i)}{\Delta X} = \frac{1}{\sqrt{2\pi N a^2}} \exp\left(-\frac{X^2}{2Na^2}\right). \tag{3}$$

This distribution is a *Gaussian distribution*, or *normal distribution*. It is a continuous function and is simpler than the binomial distribution. It is plotted in Figure 2.

Figure 2: Gaussian (normal) distribution (solid line). *(plot not reproduced)*

### 2.4 Averages From the Distribution

We can calculate the average of a quantity by multiplying it by the distribution and integrating over the whole range:

$$\langle X \rangle = \int X\, p_N(X)\, dX = 0, \tag{4}$$

$$\langle X^2 \rangle = \int X^2\, p_N(X)\, dX = N a^2, \tag{5}$$

where the averages are denoted by $\langle \cdots \rangle$. The importance of a distribution is that it contains all such information inside it. The first equation tells us that the average of $X$, i.e., the average displacement from the initial tree, is zero. This is quite expected: the random walk is symmetric to the right and to the left, so $\langle X \rangle$ can be neither positive nor negative. The second equation tells us the average distance from the origin after $N$ steps, measured by $\sqrt{\langle X^2 \rangle} = \sqrt{N}\, a$. This is an important conclusion: the average distance does not grow linearly with $N$; rather, it grows more slowly, as $\sqrt{N}$.

### 2.5 Variance

We want to emphasize an important aspect of the Gaussian distribution: it is determined by only one parameter. This fact can be seen by substituting $\lambda$ for $N a^2$; we then rewrite Equation (3) as

$$p_N(X) = \frac{1}{\sqrt{2\pi\lambda}} \exp\left(-\frac{X^2}{2\lambda}\right).$$

Obviously $\lambda$ is the parameter that determines the distribution. The physical meaning of $\lambda$ is the average value of $X^2$, according to Equation (5), or the square of the average width of the Gaussian distribution. It is formally known as the *variance*. Once we know the variance, we know the entire Gaussian distribution.

### 2.6 Central Limit Theorem

So if we know the variance, we can determine a Gaussian distribution. But how do we know that the distribution we want is a Gaussian distribution in the first place? Here is a very powerful theorem that saves the day:

**Central limit theorem.** If you add many identical and independent random variables together, the sum obeys the Gaussian distribution. (We assume the distribution is centered at the origin; if it is not, we can manually shift the center of the distribution to the origin.)

In short, the Gaussian distribution is the most common distribution. To be clear, let us apply the theorem to our random walk. The "random variables" are the $\pm 1$'s we used above to describe each step of the random walk. The steps are identical because each step is of size $a$; the steps are independent of each other, since the current step does not depend on the previous one. In addition, the final position $X$ equals the sum of the random $\pm 1$ variables. So all conditions are satisfied: $X$ obeys a Gaussian distribution. We have already shown a special case of the central limit theorem in the above example: the exact distribution is the binomial distribution, Equation (1); as we increase $N$, the distribution approaches the Gaussian distribution, Equation (2).
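Equations (4) and (5) can likewise be verified numerically. A minimal Python check (my own, not from the lecture): estimate $\langle X \rangle$ and $\langle X^2 \rangle$ from a sample of walks with step size $a$ and compare with the predictions $0$ and $Na^2$.

```python
import random

rng = random.Random(1)
N, a, trials = 100, 0.5, 20_000

# Each walk is a sum of N independent +-a steps.
positions = [sum(rng.choice((-a, a)) for _ in range(N)) for _ in range(trials)]

mean_x = sum(positions) / trials
mean_x2 = sum(x * x for x in positions) / trials
print(f"<X>   = {mean_x:+.4f}   (theory: 0)")
print(f"<X^2> = {mean_x2:.3f}   (theory: N a^2 = {N * a * a:.3f})")
```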
### 2.7 A Smart Way of Deriving the Distribution

Now, if we know the central limit theorem, we can determine the distribution more easily. We first write

$$X = x_1 + x_2 + \cdots + x_N,$$

where $x_1, x_2, \ldots, x_N$ are the previous $\pm 1$ variables, but we now upgrade them to $\pm a$ in order to more readily pass to the continuous case. Then we calculate

$$\langle X^2 \rangle = \langle x_1^2 \rangle + \langle x_2^2 \rangle + \cdots + \langle x_N^2 \rangle + 2\left(\langle x_1 x_2 \rangle + \langle x_1 x_3 \rangle + \cdots + \langle x_{N-1} x_N \rangle\right).$$

The nice thing is that the cross terms in the second group all vanish after the averaging. To see this, consider $\langle x_1 x_2 \rangle$: since $x_1$ and $x_2$ can independently be $\pm a$,

$$\langle x_1 x_2 \rangle = \frac{a \times a + a \times (-a) + (-a) \times a + (-a) \times (-a)}{4} = 0.$$

But the terms in the first group remain; for example,

$$\langle x_1^2 \rangle = \frac{a^2 + (-a)^2}{2} = a^2.$$

The same argument applies to every $\langle x_k^2 \rangle$, so we reach $\langle X^2 \rangle = N a^2$. By applying the central limit theorem, we conclude that Equation (3) is the distribution of $X$.

### 2.8 Self-Avoiding Walk (SAW)

So far we have discussed a "free" random walk, but we often have some restrictions on the random walk. A simple example is the self-avoiding walk (SAW). As its name implies, it does not allow the monkey to jump back to a tree that has been visited before. In 1D the monkey can then only go straight to the right or to the left; the motion is no longer random at all. However, in a higher dimension (2D, 3D) there are still a large number of ways to perform such a walk (imagine a monkey in the forest, or a space-walker monkey). One can show that the distance the monkey travels in 3D is given by (see Appendix B for a derivation)

$$R \sim N^{3/5}. \tag{6}$$

A realization of a SAW is a random polymer. A polymer is a chain of residues, where two nearby residues are connected by a chemical bond of fixed length; famous examples are DNA and proteins. Now if we put a monkey on the first residue, then let the monkey jump to the second residue, then to the third, and so on, we find that the monkey is just doing a self-avoiding walk: the restriction that the monkey not return to a visited place conveniently avoids collisions between residues. Equation (6) simply gives the typical size of a polymer of length $N$.
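The $N^{3/5}$ scaling in Equation (6) can be probed numerically. The sketch below is my own, and deliberately naive: it generates self-avoiding walks on a cubic lattice by rejection, which becomes hopelessly inefficient for long chains (pivot-type Monte Carlo moves are used in serious work). It estimates $\langle R^2 \rangle$ at two chain lengths and extracts an effective exponent from $\langle R^2 \rangle \sim N^{2\nu}$:

```python
import math
import random

def saw_mean_r2(n_steps, attempts, rng):
    """Mean squared end-to-end distance of cubic-lattice SAWs of n_steps,
    generated by naive rejection: discard any walk that self-intersects."""
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    total, accepted = 0.0, 0
    for _ in range(attempts):
        x = y = z = 0
        visited = {(0, 0, 0)}
        for _ in range(n_steps):
            dx, dy, dz = rng.choice(moves)
            x, y, z = x + dx, y + dy, z + dz
            if (x, y, z) in visited:
                break                      # self-intersection: reject this walk
            visited.add((x, y, z))
        else:                              # completed without intersecting
            total += x * x + y * y + z * z
            accepted += 1
    return total / accepted

rng = random.Random(2)
r2_8 = saw_mean_r2(8, 30_000, rng)
r2_16 = saw_mean_r2(16, 90_000, rng)
nu = math.log(r2_16 / r2_8) / (2 * math.log(2))   # <R^2> ~ N^(2 nu)
print(f"<R^2>(N=8) = {r2_8:.2f}, <R^2>(N=16) = {r2_16:.2f}, nu ~ {nu:.2f}")
```

A free walk would give $\nu = 1/2$ exactly ($\langle R^2 \rangle = N a^2$); the self-avoiding constraint swells the chain, and even at these short lengths the effective exponent comes out near the Flory value $3/5$.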
## 3 Diffusion Equation

### 3.1 Population Interpretation of Probability Distribution

So far we have interpreted the probability distribution as the odds of observing a single monkey at $X$ (or in an interval $(X, X + \Delta X)$ in the continuous case) after $N$ steps. What if we initially put a million monkeys at the origin and let them all do the same kind of random walk? What is the number of monkeys (the population) on the different trees after $N$ steps? The answer, as you may have guessed, is naturally the probability distribution times the number of monkeys. For example, after the first step we expect about half a million of the monkeys to show up at tree $+1$ and the rest at tree $-1$; after the second step, about a quarter of a million monkeys at tree $+2$, another quarter at tree $-2$, and the remaining two quarters at the origin. The fractions coincide with the probabilities.

The population viewpoint of a probability distribution can be summarized as: the probability distribution at $X$ equals the fraction of the total population at $X$. The two viewpoints are equivalent, but sometimes the population viewpoint is more useful, because we don't need to keep track of each individual monkey; we only need to know how the population distribution as a whole evolves with time. Note that although the motion of each individual monkey is random, the evolution of the monkey distribution is usually deterministic. Thus the population viewpoint is more useful for hard problems.

### 3.2 Evolution of a Distribution

According to the population approach, we need to determine the equation that governs the time evolution of a probability distribution, then solve that equation. This is what we shall do for the random walk problem. We first slightly generalize the problem: let $r_R$ and $r_L$ be the probabilities of going right and going left, respectively, with $r_R + r_L \le 1$. The problem reduces to the previous one if both $r_R$ and $r_L$ are $1/2$. The number of monkeys on tree $i$ at step $N+1$ is entirely determined by the numbers of monkeys on trees $i-1$, $i$ and $i+1$ at the previous step $N$:

$$P_{N+1}(i) = (1 - r_R - r_L)\, P_N(i) + r_R\, P_N(i-1) + r_L\, P_N(i+1). \tag{7}$$
The first term on the right-hand side accounts for the monkeys that do not leave tree $i$ (a fraction $r_R + r_L$ of them leaves); the last two terms are for monkeys coming to tree $i$. For example, if $P_N(i)$ is the number of monkeys on tree $i$, then $r_R P_N(i)$ is the average number of monkeys that jump from tree $i$ to tree $i+1$. Equation (7) becomes exact as the number of monkeys approaches infinity.

Now we want to regroup the terms and take the continuum limit.

1. $P_{N+1}(i) - P_N(i)$ is the change of the probability on tree $i$ in one step; we interpret it as $\partial P/\partial t$, with the step number $N$ playing the role of the time $t$.

2. Similarly, $-r_R P_N(i) + r_R P_N(i-1)$ is a difference of probabilities at different places, so it can be written as a derivative with respect to the position $X$:

$$-r_R P_N(i) + r_R P_N(i-1) \approx -a \left.\frac{\partial (r_R P)}{\partial X}\right|_{X_{i-1/2}}, \qquad
-r_L P_N(i) + r_L P_N(i+1) \approx a \left.\frac{\partial (r_L P)}{\partial X}\right|_{X_{i+1/2}}.$$

The factor $a$ is a change of units to go from the discrete $i$ to the continuous $X$. Thus Equation (7) can be rewritten as

$$\frac{\partial P}{\partial t} = -a \left.\frac{\partial (r_R P)}{\partial X}\right|_{X_{i-1/2}} + a \left.\frac{\partial (r_L P)}{\partial X}\right|_{X_{i+1/2}}.$$

We now notice that the average velocity at $i$ equals $V = r_R \times a + r_L \times (-a)$: a monkey on tree $i$ has probability $r_R$ of increasing $X$ by $a$ and probability $r_L$ of decreasing $X$ by $a$. This leads to the following change of variables:

$$a\, r_L = (a - V)/2, \tag{8a}$$

$$a\, r_R = (a + V)/2, \tag{8b}$$

where we have now specialized to $r_L + r_R = 1$ (the monkey never stays put). In terms of the new variables, we have

$$\frac{\partial P}{\partial t} = -\left.\frac{\partial}{\partial X}\left[\frac{(a + V)P}{2}\right]\right|_{X_{i-1/2}} + \left.\frac{\partial}{\partial X}\left[\frac{(a - V)P}{2}\right]\right|_{X_{i+1/2}}
\approx -\frac{\partial (VP)}{\partial X} + \frac{a}{2}\left(\left.\frac{\partial P}{\partial X}\right|_{X_{i+1/2}} - \left.\frac{\partial P}{\partial X}\right|_{X_{i-1/2}}\right).$$

We can again replace the remaining difference by a derivative; since it is a derivative of a derivative, it is a second derivative (and the two midpoints are separated by $a$). Introducing the constant $D = a^2/2$, we reach the *diffusion equation*

$$\frac{\partial P}{\partial t} = -\frac{\partial (V P)}{\partial X} + D\, \frac{\partial^2 P}{\partial X^2}. \tag{9}$$

If the average velocity $V$ is a function of $X$, $V(X)$, the solution is generally complicated. But a simple and important special case is $V = 0$, a free diffusion:

$$\frac{\partial P}{\partial t} = D\, \frac{\partial^2 P}{\partial X^2}. \tag{10}$$

The solution of the above equation is

$$P(X, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left(-\frac{X^2}{4 D t}\right).$$

You may need to look up a mathematics book to actually solve the equation; here is a shortcut to convince yourself: plug the solution into Equation (10), take the derivatives on both sides, and compare. This is the same old result of Equation (3), under $a^2/2 \to D$ and $N \to t$. The variance of the Gaussian distribution is

$$\langle X^2 \rangle = 2 D t, \tag{11}$$

which is completely equivalent to Equation (5). Thus the diffusion equation is just the population version of a random walk.
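Equation (10) and the variance law (11) can be checked with a few lines of numerics. Below is a plain explicit finite-difference sketch (my own illustration; the grid and time step are arbitrary choices, with $D\,\Delta t/\Delta X^2$ kept below $1/2$ for stability) that evolves a point-like initial distribution and measures its variance:

```python
D, dx, dt = 1.0, 0.5, 0.05      # D*dt/dx^2 = 0.2 < 1/2, so the scheme is stable
nx, steps = 201, 400            # grid covers X in [-50, 50]; total time t = 20

p = [0.0] * nx
p[nx // 2] = 1.0 / dx           # all monkeys start at X = 0 (delta function)

for _ in range(steps):
    # explicit update with the discrete Laplacian; end points held at zero
    new_p = p[:]
    for j in range(1, nx - 1):
        new_p[j] = p[j] + D * dt * (p[j - 1] - 2 * p[j] + p[j + 1]) / dx ** 2
    p = new_p

xs = [(j - nx // 2) * dx for j in range(nx)]
mass = sum(p) * dx
var = sum(x * x * pj * dx for x, pj in zip(xs, p))
print(f"mass = {mass:.4f} (should stay 1), variance = {var:.2f}, "
      f"2Dt = {2 * D * steps * dt:.2f}")
```

The measured variance grows as $2Dt$, exactly as Equation (11) predicts.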
## 4 Langevin Equation

We now want to "derive" the random walk from Newton's equations of motion, in order to map physical quantities to random walk variables (e.g., the step size $a$). In other words, if we can calculate the step size from a few physical properties, the motion of a realistic particle is solved.

In a noisy environment, a particle (we now switch from a monkey to a particle) usually experiences a deterministic force $F$, a frictional force $f$, and a random noisy force $R$. The random force $R$ exists because the particle is continuously kicked by surrounding particles and its velocity is randomly changed (usually the surrounding water molecules do the kicking job). The frictional force $f$ serves as a damping force: when the velocity gets too high, the environment absorbs part of the kinetic energy away from the particle. Usually $f = -\gamma V$, where $V$ is the velocity of the particle and $\gamma$ is the friction coefficient. The deterministic part of the force can be written as the negative derivative of the potential energy, $F = -\partial E(X)/\partial X$. Now the equation of motion (Newton's equation) can be written as

$$m\,\alpha = -\frac{\partial E(X)}{\partial X} - \gamma V + R, \tag{12}$$

where $\alpha = dV/dt$ is the acceleration and $m$ is the mass of the particle.

### 4.1 Free Diffusion

Let us first consider the case of free diffusion, $E(X) = 0$. We multiply both sides of Equation (12) by $X$ and take averages:

$$\langle m X \alpha \rangle = -\gamma \langle X V \rangle + \langle R X \rangle,$$

where $\langle \cdots \rangle$ means an average. On the left-hand side, $X\alpha = X\, dV/dt = d(XV)/dt - (dX/dt)\, V = d(XV)/dt - V^2$. On the right-hand side, $XV = X\, dX/dt = \frac{1}{2}\, d(X^2)/dt$, and $\langle R X \rangle = 0$, because the current position is uncorrelated with the random force. At long times $\langle XV \rangle$ approaches a constant, so the $d\langle XV \rangle/dt$ term drops out, and this gives us

$$-\langle m V^2 \rangle = -\frac{\gamma}{2}\, \frac{d \langle X^2 \rangle}{dt}.$$

From statistical mechanics we know that $\langle m V^2 \rangle = k_B T$, where $k_B$ is the Boltzmann constant and $T$ is the temperature. Thus we have

$$\langle X^2 \rangle = 2\, \frac{k_B T}{\gamma}\, t.$$

Comparing this result with Equation (11), we immediately identify

$$D = \frac{k_B T}{\gamma}. \tag{13}$$

This is an important relation discovered by Einstein. Despite its simplicity, the relation reveals a deep connection between the macroscopic diffusion constant $D$ and the microscopic friction coefficient $\gamma$.
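The Einstein relation can be watched emerging from a direct integration of Equation (12). The sketch below is my own; in particular, the noise strength $\sqrt{2\gamma k_B T\,\Delta t}$ per step is the standard fluctuation–dissipation choice, which the notes leave implicit. All parameters are arbitrary illustration values.

```python
import math
import random

rng = random.Random(3)
m, gamma, kT = 1.0, 1.0, 1.0
dt, steps, trials = 0.01, 4_000, 300       # total time t = 40

noise = math.sqrt(2 * gamma * kT * dt)     # fluctuation-dissipation noise strength
total_x2 = 0.0
for _ in range(trials):
    x, v = 0.0, 0.0
    for _ in range(steps):
        # Euler step of m dv = (-gamma v) dt + R dt, then dx = v dt
        v += (-gamma * v * dt + noise * rng.gauss(0.0, 1.0)) / m
        x += v * dt
    total_x2 += x * x

t = steps * dt
msd = total_x2 / trials
print(f"<X^2> = {msd:.1f}, prediction 2(kT/gamma)t = {2 * kT / gamma * t:.1f}")
```

The measured value sits slightly below $2Dt$ because the motion is ballistic, not diffusive, at times shorter than $m/\gamma$.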
### 4.2 Diffusion in a Potential

We now want to study diffusion in a potential. In such a case the diffusion is not so random, and the space available to the particle is limited. For simplicity, we examine a special case where the mass of the particle $m$ is very small. Then the term $m\alpha$ is negligible, and we have

$$V = \frac{-\partial E(X)/\partial X + R}{\gamma}.$$

If we further average out the random force $R$, the deterministic component left is

$$V = -\frac{1}{\gamma}\, \frac{\partial E(X)}{\partial X}. \tag{14}$$

Plugging this into Equation (9), we have

$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial X}\left(\frac{P}{\gamma}\, \frac{\partial E(X)}{\partial X}\right) + D\, \frac{\partial^2 P}{\partial X^2}.$$

We wish to study the stationary case, $\partial P/\partial t = 0$, i.e., what is the final distribution? The right-hand side should then be zero. We can further remove one $\partial/\partial X$ from both terms of the right-hand side; what is left is

$$0 = \frac{P}{\gamma}\, \frac{\partial E(X)}{\partial X} + D\, \frac{\partial P}{\partial X}.$$

The solution of this simplified differential equation is $P \sim \exp[-E(X)/(D\gamma)]$. Using the Einstein relation, Equation (13), we have

$$P(X) \sim \exp\left(-\frac{E(X)}{k_B T}\right). \tag{15}$$

This is the famous *Boltzmann distribution*. (Note that in the free-diffusion case, $E(X) = 0$, the final distribution is flat: $P(X)$ is constant everywhere.) This means that the motion under the Langevin equation is equivalent to motion at a constant temperature $T$. Another way to put it is that the noisy force $R$ and the friction $f$ simply play the role of keeping the particle at a constant temperature; such a device is referred to as a *thermostat* in molecular simulations.

### 4.3 Detailed Balance

There is also a profound result concerning the kinetic aspects of the diffusion. From Equations (8a) and (8b), we have

$$\frac{r_L}{r_R} = \frac{a - V}{a + V} = \frac{1 - V/a}{1 + V/a} \approx \exp\left(-\frac{2V}{a}\right).$$

Now use Equation (14), approximating the derivative by a finite difference over one step, $\partial E/\partial X \approx [E(X+a) - E(X)]/a$:

$$-\frac{2V}{a} = \frac{2}{\gamma a^2}\left[E(X+a) - E(X)\right].$$

Replacing $a^2/2$ by the diffusion constant $D$ and using the Einstein relation $D\gamma = k_B T$, we have

$$\frac{r(X+a \to X)}{r(X \to X+a)} = \frac{\exp[-E(X)/k_B T]}{\exp[-E(X+a)/k_B T]}.$$

The above formula only applies to a neighboring pair of positions, but it can be easily generalized to any pair of states. Here is an example:

$$\frac{r(X+2a \to X)}{r(X \to X+2a)} = \frac{r(X+2a \to X+a)}{r(X+a \to X+2a)} \cdot \frac{r(X+a \to X)}{r(X \to X+a)}
= \frac{\exp[-E(X+a)/k_B T]}{\exp[-E(X+2a)/k_B T]} \cdot \frac{\exp[-E(X)/k_B T]}{\exp[-E(X+a)/k_B T]}
= \frac{\exp[-E(X)/k_B T]}{\exp[-E(X+2a)/k_B T]}.$$

So generally we have

$$\frac{r(Y \to X)}{r(X \to Y)} = \frac{\exp[-E(X)/k_B T]}{\exp[-E(Y)/k_B T]}. \tag{16}$$

This relation is called *detailed balance*.
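The Boltzmann distribution (15) can be tested by integrating the overdamped Langevin equation in a concrete potential. A sketch (my own choice of potential and parameters, with the fluctuation–dissipation noise strength assumed as before): a harmonic well $E(X) = \frac{1}{2} k X^2$, for which Equation (15) predicts $\langle X^2 \rangle = k_B T / k$.

```python
import math
import random

rng = random.Random(4)
k, gamma, kT = 1.0, 1.0, 1.0
dt, steps, burn_in = 0.01, 200_000, 10_000

noise = math.sqrt(2 * kT * dt / gamma)     # fluctuation-dissipation noise strength
x, sum_x2, n = 0.0, 0.0, 0
for step in range(steps):
    # overdamped update: dX = -(1/gamma) dE/dX dt + noise, with dE/dX = k X
    x += -(k * x / gamma) * dt + noise * rng.gauss(0.0, 1.0)
    if step >= burn_in:                    # discard equilibration steps
        sum_x2 += x * x
        n += 1

mean_x2 = sum_x2 / n
print(f"<X^2> = {mean_x2:.3f}, Boltzmann prediction kT/k = {kT / k:.3f}")
```

The long-time average matches $k_B T / k$, i.e., the trajectory samples the Boltzmann distribution: the noise and the friction together act as a thermostat.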
Detailed balance says that the ratio of the forward and backward transition rates between two states is given by the ratio of their Boltzmann factors.

### 4.4 Time of Escaping a Potential Well

As an important application, consider the following case: we have a potential well (or basin) surrounded by barriers; state $B$ is at the bottom of the well, where $E_B = 0$, and state $T$ is on top of the surrounding barriers, $E_T = \Delta E$. Question: typically, how long does it take for a particle at $B$ to escape the well?

Since the well is surrounded by potential barriers, the particle must jump to the state $T$ to escape the well. According to detailed balance, Equation (16), we have

$$\frac{r(B \to T)}{r(T \to B)} = \frac{\exp(-E_T/k_B T)}{\exp(-E_B/k_B T)} = \exp(-\Delta E/k_B T).$$

Since the process $T \to B$ is a downhill process, its rate should be roughly a constant $r_0$; thus the rate of escaping the well is

$$r(B \to T) = r_0 \exp(-\Delta E/k_B T). \tag{17}$$

At a low temperature, $k_B T \ll \Delta E$, the exponential on the right-hand side can be terribly small. This explains why at a low temperature the structure of a protein is stable, while at a high temperature it readily unfolds.
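Equation (17) can be illustrated with a toy lattice model (entirely my own construction): a particle hops on sites $0 \ldots L$ with a linearly rising potential, using Metropolis acceptance for uphill moves — one simple rule that satisfies the detailed balance, Equation (16). The mean number of steps to first reach the top should grow roughly like $\exp(\Delta E / k_B T)$.

```python
import math
import random

def mean_escape_steps(barrier, kT, n_sites, trials, rng):
    """Mean first-passage steps from site 0 to site n_sites, with the
    potential rising linearly from 0 to `barrier` over the n_sites hops."""
    h = barrier / n_sites                   # energy cost of one uphill hop
    p_accept = math.exp(-h / kT)            # Metropolis acceptance, uphill
    total = 0
    for _ in range(trials):
        x, steps_taken = 0, 0
        while x < n_sites:
            steps_taken += 1
            if rng.random() < 0.5:          # propose a hop to the right (uphill)
                if rng.random() < p_accept:
                    x += 1
            elif x > 0:                     # hop to the left (downhill): accepted
                x -= 1
        total += steps_taken
    return total / trials

rng = random.Random(5)
t2 = mean_escape_steps(2.0, 1.0, 4, 1_500, rng)
t4 = mean_escape_steps(4.0, 1.0, 4, 1_500, rng)
print(f"barrier 2 kT: {t2:.0f} steps;  barrier 4 kT: {t4:.0f} steps")
```

In the pure Arrhenius picture, doubling the barrier from $2 k_B T$ to $4 k_B T$ multiplies the escape time by $e^2 \approx 7$; the measured ratio is somewhat smaller here because the prefactor $r_0$ of this tiny lattice also changes with the barrier.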
## Appendix A: From Binomial to Gaussian

Here I give some mathematical details of how to go from Equation (1) to Equation (2). We proceed under the approximation that $N$ is very large compared with $i$. We first use Stirling's approximation,

$$n! \approx \sqrt{2\pi n}\, \exp(n \log n - n), \tag{18}$$

to expand the factorials inside the combination $\binom{N}{(N+i)/2} = \frac{N!}{[(N+i)/2]!\,[(N-i)/2]!}$; here $n$ is replaced by $N$, $(N+i)/2$ and $(N-i)/2$ in turn. What we have is

$$P_N(i) \approx \sqrt{\frac{2}{\pi N}}\, \exp\left[N \log N - \frac{N+i}{2} \log \frac{N+i}{2} - \frac{N-i}{2} \log \frac{N-i}{2} - N \log 2\right], \tag{19}$$

where the factor before the exponential becomes $\sqrt{2/(\pi N)}$ under the condition $i \ll N$. Next we separate the logarithm into a major contribution and a small term,

$$\log \frac{N + i}{2} = \log \frac{N}{2} + \log\left(1 + \frac{i}{N}\right).$$

The second term is small because we assumed $N$ is large compared with $i$, so it can be expanded as a Taylor series,

$$\log\left(1 + \frac{i}{N}\right) \approx \frac{i}{N} - \frac{i^2}{2N^2}.$$

Accordingly,

$$\frac{N+i}{2} \log \frac{N+i}{2} \approx \frac{N+i}{2} \log \frac{N}{2} + \frac{i}{2} + \frac{i^2}{4N},$$

where in the last step we dropped terms of order $i^3/N^2$. We can do the same for $\log[(N-i)/2]$:

$$\frac{N-i}{2} \log \frac{N-i}{2} \approx \frac{N-i}{2} \log \frac{N}{2} - \frac{i}{2} + \frac{i^2}{4N}.$$

Using both in Equation (19), the exponent becomes

$$N \log N - N \log 2 - N \log \frac{N}{2} - \frac{i^2}{2N} = -\frac{i^2}{2N},$$

so that

$$P_N(i) \approx \sqrt{\frac{2}{\pi N}} \exp\left(-\frac{i^2}{2N}\right).$$

This is Equation (2).

## Appendix B: Deriving $R \sim N^{3/5}$ for SAW

Due to the self-avoiding nature, the Gaussian distribution, Equation (3), for a "free" random walk no longer applies to a SAW. Instead, we shall treat Equation (3) as a starting point and add a correction due to the self-avoiding effect.

In the free random walk case, the probability distribution for the last residue is

$$p_0(X, Y, Z)\, dX\, dY\, dZ \sim \exp\left[-\frac{X^2 + Y^2 + Z^2}{2Na^2}\right] dX\, dY\, dZ.$$

We now rewrite the expression as a function of $R$, where $R = \sqrt{X^2 + Y^2 + Z^2}$ is the distance from the origin:

$$p_0(R)\, R^2\, dR \sim R^2 \exp\left(-\frac{R^2}{2Na^2}\right) dR.$$

Here $p_0(R)\, R^2\, dR$ is the probability of observing the last residue in the interval $(R, R + dR)$; the subscript 0 denotes the distribution of a free random walk. The average value of $R$ gives the typical size of a polymer; in particular, we ask how it grows with the number of residues $N$. For a free random walk we already know that $R \sim N^{1/2}$; for a self-avoiding walk this relation is modified.

We now add a potential to take the self-avoiding effect into account. Since the total volume of the polymer is proportional to $R^3$, the average volume occupied by each residue is $v_0 = R^3/N$. Each residue repels other residues away from itself, and the repulsive energy per residue is inversely proportional to $v_0$. Since there are $N$ residues in total, the total repulsive energy is roughly

$$E_{\rm rep} = U_0\, \frac{N}{v_0} = U_0\, \frac{N^2}{R^3},$$

where $U_0$ is some coefficient. We are now in a position to modify the distribution. According to statistical mechanics, we put the interaction energy in the exponential and multiply the factor onto the free distribution: $P(R) = p_0(R) \exp(-E_{\rm rep}/k_B T)$. Here $k_B T$ is the Boltzmann constant times the temperature; we can simply absorb the $k_B T$ into the coefficient $U_0$. This gives

$$P(R) \sim R^2 \exp\left(-\frac{R^2}{2Na^2} - \frac{U_0 N^2}{R^3}\right). \tag{20}$$

Now we want to find the most probable $R$ as one changes $N$.
We can do so by maximizing $P(R)$ with respect to $R$. Since the exponential function dominates the whole distribution, we may limit ourselves to maximizing the exponent

$$f(R) = -\frac{R^2}{2Na^2} - \frac{U_0 N^2}{R^3}.$$

Differentiating with respect to $R$ and finding where $f'(R) = 0$ gives

$$-\frac{R}{Na^2} + \frac{3 U_0 N^2}{R^4} = 0,$$

or

$$R \sim N^{3/5}. \tag{21}$$

This was first derived by Nobel laureate P. J. Flory (1949). The most intriguing feature of this equation is the $3/5$ exponent.

## Exercises

1. $N$ people attend a party ($N < 60$). Show that the probability that no two people share a birthday is approximately $\exp[-N(N-1)/730]$. Hint: use $1 + x \approx \exp(x)$ for small $x$.

2. (Feynman's dish problem.) You frequent a restaurant; say you go there $M$ times. The restaurant has $N$ (a large number, $N \gg M$) dishes, and their satisfactions to you are $1, 2, \ldots, N$, respectively; unfortunately, you don't know the ranking. So in the first $K$ visits you randomly pick a dish, and in the remaining $M - K$ visits you stick to the best dish known to you. What is the $K$ that maximizes your satisfaction?

## References

1. *The Feynman Lectures on Physics*, Vol. 1, Chaps. 6, 41, 42, 43.
2. Robert Zwanzig, *Nonequilibrium Statistical Mechanics*, Chaps. 1, 4.
3. Richard Zallen, *The Physics of Amorphous Solids*, Sec. 3.8 (self-avoiding walk).
