CELL PHYSIOLOGY BIO 4441
This 118 page Class Notes was uploaded by Keely Moen on Wednesday September 23, 2015. The Class Notes belongs to BIO 4441 at Texas State University taught by J. Koke in Fall. Since its upload, it has received 33 views. For similar materials see /class/212809/bio-4441-texas-state-university in Biology at Texas State University.
Date Created: 09/23/15
Chapter 3 Proteins

THE SHAPE AND STRUCTURE OF PROTEINS

DEFINITIONS
3-1 Quaternary structure
3-2 α helix
3-3 Primary structure
3-4 Binding site
3-5 Tertiary structure
3-6 Polypeptide backbone
3-7 β sheet
3-8 Protein domain
3-9 Secondary structure

TRUE/FALSE
3-10 True. In both states (stretched like a string and properly folded) a protein has a highly ordered arrangement of its atoms. A folded protein is stable near an entropy minimum because the entropic cost is more than balanced by the contributions of weak bonds. A stretched-out protein, however, is not stable at this entropy minimum and will assume a more disordered state; that is, it will maximize its entropy.

3-11 True. In a β sheet the amino acid side chains in each strand are alternately positioned above and below the sheet. This relationship can be seen in Figure 3-38 (see Answer 3-19), which shows that the carbonyl oxygens alternate from one side of the strand to the other. Thus each strand in a β sheet can be viewed as a helix in which each successive amino acid is rotated 180°.

3-12 True. Chemical groups on such protruding loops can often surround a molecule, allowing the protein to bind to it with many weak bonds.

THOUGHT PROBLEMS
3-13 Free amino acids have an amino group and a carboxylate group, both of which are charged at neutral pH. In proteins these groups are involved in peptide bonds, which are uncharged. Thus the hydrophobicity/hydrophilicity of a free amino acid is not the same as that of its side chain in a protein. To measure the hydrophobicity/hydrophilicity of the side chains, it is common to assess the properties of side-chain analogs. Thus for alanine one would use methane; for threonine, ethanol; for aspartic acid, acetic acid; and so on. To assess hydrophilicity, one can measure the solubility of the side-chain analogs in water. In general, this is done by determining how
the side-chain analog partitions between a vapor of the analog and water. Hydrophobicity can be measured in an analogous way by assessing how a side-chain analog partitions between water and a nonpolar solvent such as cyclohexane. One might imagine that the rank order for hydrophilicity would be the reverse of that for hydrophobicity. From the rank-order lists in Figure 3-34, that is mostly true, but there are differences, most notably tyrosine (Y) and tryptophan (W).

Reference: Creighton TE (1993) Proteins, 2nd ed., pp. 153-155. New York: WH Freeman.

3-14 Hydrogen bonds, electrostatic attractions, van der Waals attractions, and the hydrophobic force.

3-15
A. Heating egg-white proteins denatures them, allowing them to interact with one another in ways that were not possible at the lower temperature of the hen's oviduct. This process forms a tangled meshwork of polypeptide chains. In addition to these interactions, interchain disulfide bonds also form, so that hard-boiled egg white becomes one giant macromolecule.
B. Dissolving hard-boiled egg white requires a strong detergent to overcome the noncovalent interchain interactions and mercaptoethanol to break the covalent disulfide bonds. Together, but not separately, the two reagents eliminate the bonds that hold the tangled protein chains in place. Try it for yourself.

3-16 In an α helix the carbonyl oxygen of the first amino acid hydrogen-bonds to the amide nitrogen of amino acid 5 (Figure 3-35). Thus there can be no α helix shorter than five amino acids. The single hydrogen bond that would be formed with five amino acids gives too little stability to the structure for any helicity to be detected. Only with six amino acids (two hydrogen bonds) do you begin to detect some. Helicity becomes increasingly
apparent as more amino acids (hence, more hydrogen bonds) are added.

3-17 The ends of α helices, like polar amino acids, are almost always found at the surface of a protein, where they can interact with polar water molecules. In addition to their partial charge, the backbones of the four amino acids at either end of the helix carry hydrogen-bonding groups that are unsatisfied by hydrogen-bonding within the helix (Figure 3-36). These groups also add to the polarity of the termini of α helices.

Figure 3-34 Measurements of hydrophilicity and hydrophobicity of side-chain analogs (Answer 3-13). Partition coefficients are listed for each amino acid side-chain analog. A partition coefficient is the equilibrium constant for the solubility of a solute in two different phases and is expressed as kcal/mole. The rank order for hydrophobicity is listed in descending order, while that for hydrophilicity is listed in ascending order. If there were a perfect negative correspondence between hydrophobicity and hydrophilicity, the amino acids would appear in the same order in the two lists. The data are plotted with the amino acids arranged in order of decreasing hydrophobicity (dark bars) from left to right. Hydrophilicity is indicated by the overlaid gray bars.

Figure 3-35 Schematic of an α helix showing the pattern of hydrogen bonds between carbonyl oxygens and amide nitrogens within the helix (Answer 3-16).

Figure 3-36 Representation of an α helix showing dipole and unsatisfied hydrogen-bonding groups at its ends (Answer 3-17). Non-hydrogen-bonded Os and Ns are labeled.

Figure 3-37 Arrangement of amino acids of the three peptide sequences around helix wheels (Answer 3-18).

3-18 The first two peptide sequences, but
not the third, would give amphiphilic helices, as shown in Figure 3-37.

3-19 As illustrated in Figure 3-38, the first three strands of the sheet are antiparallel to their neighbors, whereas the fourth strand is parallel to the third.

Because the side chains of the amino acids alternately project above and below the sheet, a sequence that could form a strand in an amphiphilic β sheet should have alternating hydrophobic and hydrophilic amino acids. Only choice D satisfies this condition.

None of these folds would give a knot when stretched out. This is a general principle: proteins fold without forming knots. One might imagine that it would be difficult to thread the end of a protein through an interior loop to form a knot. A folding pathway to such a knotted form might not be achievable through random motions in a reasonable time.

3-22 Antiparallel strands are commonly formed by a polypeptide chain that folds back on itself. Thus only a few amino acids are required to allow the polypeptide chain to make the turn. By contrast, parallel strands must be connected by a polypeptide chain that is at least as long as the strands in the sheet. For a long peptide, a common solution for satisfying backbone hydrogen-bonding requirements is an α helix (Figure 3-39).

Figure 3-38 A segment of β sheet showing the polarity (N to C) of the individual strands (Answer 3-19).

Proteins obviously can't search all possible conformations on their way to finding the correct one. Thus there must be defined pathways to simplify the search. It is now thought that weak interactions rapidly cause the protein to collapse into a molten globule, in which bonding interactions are transient and chains maintain fluidity. Within the molten globule, very weak secondary structures form and disappear, as do tertiary interactions. The formation of small elements of correct secondary structure, stabilized by appropriate tertiary interactions, then appears to nucleate formation of the final structure. This general folding pathway represents a fight between the maximization of entropy, which tends to keep the protein as random as possible, and the minimization of enthalpy through formation of weak bonds. Increasing numbers of weak interactions pull the structure through a succession of increasingly well-defined states to the final conformation. This conceptualization of folding has been likened to a funnel and is commonly referred to as the folding funnel, with multiple routes of progress down the funnel accompanied by an increase in native-like structures.

Reference: Fersht A (1999) Structure and Mechanism in Protein Science, pp. 575-600. New York: WH Freeman.

3-24 Many different strings of amino acids can give rise to identical protein folds. The many amino acid differences between the homeodomain proteins from yeast and Drosophila are among the many possible ones that do not alter folding and function. This question could have been framed in another way; namely, how many amino acid changes are required to convert, say, an α helix into a β sheet? The answer is surprisingly few. These two answers underscore the difficulty in predicting protein structures from amino acid sequences.

The statement is correct. The length of evolutionary separation does not depend on how long yeast and Drosophila have been around, but rather on when their last common ancestor existed. Evolutionary separation is calculated as twice the time from when the last common ancestor existed; that is, counting backwards from one kind of organism to the common ancestor and then forward to the other kind of organism.

3-26 The protein in Figure 3-5 is composed of two domains. The protein can be cleaved in the exposed peptide segment that links the two domains (Figure 3-40). Fragments that correspond to individual domains are likely to fold properly.

It is common experience that isolated domains are easier to crystallize than the entire protein. The ability to form a crystal depends on the surface characteristics of the protein
because it must be able to interact with itself in a repeating pattern to form a crystal. Homologous proteins from different species, which fold the same (like the homeodomain proteins in Problem 3-24), differ subtly in their surface characteristics. As a result, the protein from one species may crystallize readily, while that from another species may not crystallize at all. A single amino acid change sometimes makes all the difference.

Perhaps this is so. Nevertheless, it seems likely that new and useful protein folds have been invented during evolution by the chance fusion of genes. A distribution of protein folds within the tree of life would be informative. Protein folds that are distributed in all divisions of the tree (archaea, eubacteria, and eucaryotes) were very likely present in their last common ancestor. More recently invented protein folds would likely be confined to a single division or a few branches. Even if all protein folds were present in the last common ancestor, it seems unlikely that there would be a one-to-one correspondence between folds and genes. Surely the evolution that led up to the last common ancestor would already have exploited some of the benefits of gene duplication and refinement of function that lead to families of related genes.

Figure 3-39 Connections between antiparallel and parallel strands of a β sheet (Answer 3-22). Panels: antiparallel strands (from lysozyme); parallel strands (from alcohol dehydrogenase).

Figure 3-40 Catabolite activator protein (Answer 3-26). The arrow shows the site of cleavage in the exposed peptide segment linking the two domains.

The limited number of protein folds raises a more fundamental question about the total number of protein folds that are possible. It may be that evolutionary processes have already exploited most of the stable folds that are possible. Alternatively, it may be that the number of possible stable folds greatly exceeds those currently used on Earth. The simple cataloguing of natural protein
folds will not address this more basic question.

Generally speaking, an identity of at least 30% is needed to be certain that a match has been found. Matches of 20% to 30% are problematical and difficult to distinguish from background noise. Searching for distant relatives with the whole sequence usually drops the overall identity below 30% because the less conserved portions of the sequence dominate the comparison. Thus searching with shorter, conserved portions of the sequence gives the best chance for finding distant relatives.

The close juxtaposition of the N- and C-termini of this kelch domain identifies it as a plug-in type domain. In-line type domains have their N- and C-termini on opposite sides of the domain.

3-30 As shown in Figure 3-41, the three protein monomers have distinctly different assembly properties because of the three-dimensional arrangement of their complementary binding surfaces. Monomer A would assemble into a sheet; monomer B would assemble into a long chain; monomer C would assemble into a ring composed of four subunits.

Choice B is the only arrangement of DNA-binding sites that matches the arrangement of subunits in the head-to-head Cro dimer. The DNA that corresponds to such an arrangement is known as a palindrome:

ATCG·CGAT
TAGC·GCTA

Rotation of this sequence 180° about the central dot gives an identical sequence, just as does rotation of the arrows. This demonstrates that the Cro dimer and its recognition sequence have the same symmetry, as expected.

Head-to-tail dimers have unsatisfied binding sites at each end, which would lead to the formation of chains (see Figure 3-41B).

3-33 Proteins 1, 3, 4, and 5 can form head-to-head dimers, as illustrated for protein 1 in Figure 3-42A. All binding surfaces that allow proteins to interact are complementary. The binding surfaces that allow two copies of a protein to form a head-to-head dimer must be self-complementary because they bind to themselves. To be self-complementary, one half of the binding site must be complementary to the other half.
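The two symmetry arguments in these answers (a palindromic DNA site for the head-to-head Cro dimer, and a self-complementary binding surface for a head-to-head protein dimer) reduce to the same check: fold the site about its center and ask whether every group faces its complement. What follows is a minimal Python sketch of that idea, assuming a binding site can be flattened into a one-dimensional row of groups. The function name, the lookup tables, and the toy "+"/"-" surfaces are hypothetical illustrations, not taken from the figures; the DNA example uses the 8-base top strand ATCGCGAT.

```python
# Simplified 1-D illustration of self-complementarity (an assumption:
# real binding surfaces are two-dimensional and chemically richer).
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}   # Watson-Crick pairing
CHARGE_PAIRS = {"+": "-", "-": "+"}                    # toy binding groups (hypothetical)


def is_self_complementary(site, pairs):
    """Return True if folding `site` about its center makes every group
    face its complement, i.e. position i pairs with position n-1-i."""
    n = len(site)
    return all(pairs[site[i]] == site[n - 1 - i] for i in range(n))


# Palindromic DNA site: identical after a 180-degree rotation.
print(is_self_complementary("ATCGCGAT", DNA_PAIRS))
# Toy surface whose two halves match when folded.
print(is_self_complementary("++--", CHARGE_PAIRS))
# Toy surface with no matching fold.
print(is_self_complementary("++-+", CHARGE_PAIRS))
```

In this toy model a row with an odd number of groups always fails, because the central group would have to be complementary to itself.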
Self-complementarity means that the two halves of the binding site can be folded on top of one another, with properly matched binding, across a line drawn through the center of the binding site, as illustrated for the protein 1 binding surface in Figure 3-42B. There is no line across which proteins 2 and 6 can be folded to make their binding partners match. Inclusion of protrusions and invaginations would not have altered this general principle: complementary binding surfaces can be folded so that a protrusion on one side inserts into an invagination on the other side.

3-34 The coil 1A segment of nuclear lamin C matches the heptad repeat at 9 of 11 positions (Figure 3-43), which is very good. The match need not be perfect to allow formation of a coiled-coil. The matches to the heptad repeat in the other two marked segments, coil 1B and coil 2 (see Figure 3-9), are not as good, but they are still acceptable for the formation of a coiled-coil.

Reference: McKeon FD, Kirschner MW & Caput D (1986) Homologies in both primary and secondary structure between nuclear envelope and intermediate filament proteins. Nature 319, 463-468.

Figure 3-41 Assembly of protein monomers (Answer 3-30).

Figure 3-42 Self-complementarity in proteins (Answer 3-33). (A) Head-to-head dimer formation. (B) Self-complementary binding surface.

Figure 3-43 Heptad repeat motif in the coil 1A region of nuclear lamin C (Answer 3-34). Sequence shown: DLQELNDRLAVYIDRVRSLETENAGLRLRITESEEW. Hydrophobic amino acids are marked with an asterisk. When a hydrophobic amino acid occurs at an A or D position in the heptad repeat, it is scored as a match. The start of the heptad repeat was positioned to maximize matches.

Figure 3-44 Vernier assembly of 10-nm and 14-nm subunits (Answer 3-35). The assembled fiber is 70 nm long.

3-35 The final fiber will be 70 nm in length and will contain 7 of the 10-nm subunits and 5 of the 14-nm subunits (Figure 3-44). Assemblies that are any shorter will have unoccupied binding sites at one end or the other. It makes no difference how the original
pair come together. So long as the unoccupied binding sites of the initial pair are filled in by other subunits, the fiber will always grow to the same length.

CALCULATIONS
3-36 At equilibrium there would be 1 unfolded protein for every $10^7$ folded proteins. This ratio comes from substituting values for $\Delta G^\circ$ (9.9 kcal/mole), $R$, and $T$ into the equation and solving for $\log K$:

$$\log K = -\frac{\Delta G^\circ}{2.3\,RT} = -\frac{9.9}{2.3 \times 1.98 \times 10^{-3} \times 310} = -7$$

Since $K = [U]/[F]$, $\log K = \log([U]/[F]) = -7$. Taking the antilog of both sides, $[U]/[F] = 10^{-7}$, or $[U] = 10^{-7}[F]$.

The stability of lysozyme at 37°C is 10 kcal/mole. From your measurements:

$$G^\circ_{unfolded} = 128 - 119 = 9\ \text{kcal/mole}$$
$$G^\circ_{folded} = 75 - 76 = -1\ \text{kcal/mole}$$

Thus $\Delta G^\circ = G^\circ_{unfolded} - G^\circ_{folded} = 9 - (-1) = 10$ kcal/mole.

This problem illustrates the essence of the protein folding problem: the overall stability of a protein derives from small differences between large numbers. For lysozyme, for example, not only must the hundreds of interactions in the folded state be evaluated, but so also must the hundreds of interactions in the unfolded state. In each state, the sum of the enthalpic contributions is large, as is the sum of the entropic contributions. As shown by the calculation above, these large numbers are subtracted from one another to give a small number. Thus the enthalpic and entropic values of the individual interactions must be known with exquisite precision in order to predict the stability of the folded protein. Yet individual weak bonds can vary considerably in strength. For example, individual hydrogen bonds in proteins vary from a few tenths of a kcal/mole to over 1 kcal/mole. These considerations make prediction of protein structure from these sorts of calculations virtually impossible.

Reference: Creighton TE (1993) Proteins, 2nd ed., pp. 297-300. New York: WH Freeman.

Since there are 20 possible amino acids at each position in a 300 amino acid long protein, there are $20^{300}$ (which is $10^{390}$) possible proteins. The mass of one copy of each possible protein would be

$$\text{mass} = \frac{110\ \text{d}}{\text{aa}} \times \frac{300\ \text{aa}}{\text{protein}} \times 10^{390}\ \text{proteins} \times \frac{\text{g}}{6 \times 10^{23}\ \text{d}} = 5.5 \times 10^{370}\ \text{g}$$

Thus the mass of protein would exceed the mass of the observable universe ($10^{80}$ g) by a factor of about $10^{290}$!

The fraction of correctly synthesized proteins will be 0.91 for a 1000 amino acid protein, 0.37 for a 10,000 amino acid protein, and 0.00005 for a 100,000 amino acid protein. The calculation is shown below for a 1000 amino acid protein, where $P_C$ is the fraction of correct proteins, $f_C$ is the fraction of amino acids incorporated correctly, and $n$ is the number of amino acids:

$$P_C = (f_C)^n = (0.9999)^{1000} = 0.91$$

A. As calculated in the previous problem, synthesis of one 10,000 amino acid protein would be expected to occur correctly 37% of the time. By contrast, synthesis of each 200 amino acid subunit would be expected to occur correctly 98% of the time:

$$P_C = (0.9999)^{200} = 0.98$$

Assembly of the mixture of subunits into correct ribosomes follows the same equation:

$$P_C = (0.98)^{50} = 0.37$$

Thus making a ribosome from subunits gives the same final fraction of correct ribosomes (37%) as making them from one long protein.

B. The assumption in part A, that correct and incorrect subunits are assembled into ribosomes with equal likelihood, is not true. Any mistake that interferes with the correct folding of a subunit, or that interferes with the ability of the subunit to bind to other subunits, would eliminate the subunit from assembly into a ribosome. As a result, the fraction of correctly assembled ribosomes would be higher than calculated in part A. Thus the value of subunit synthesis lies not in more accurate synthesis, but rather in permitting quality-control mechanisms to reject incorrect subunits efficiently.

DATA HANDLING
3-41 If unfolding of the protein simply reflected the titration of a buried histidine, it should require 2 pH units to go from 9% to 91% completion. The actual unfolding curve takes only 0.3 pH units to span this range. This sharp transition indicates a highly cooperative process: when the protein starts to unfold, it completes the process rapidly. For example, it might be that several buried histidines can ionize when the chain starts to unfold, so that when one goes they all go together. Note also that as soon as a buried
histidine (pK of 4 in this example) becomes accessible to solvent, its pK will shift toward its normal value of 6, significantly steepening its titration curve.

Reference: Creighton TE (1993) Proteins, 2nd ed., pp. 288-289. New York: WH Freeman.

As expected, hydrophobic amino acid side chains are most frequently buried and hydrophilic side chains are least commonly buried. Perhaps the biggest surprise in this list is the high proportion of cysteine (C) side chains that are buried. Cysteine is generally grouped with polar amino acids because of its SH group, but its hydrophobic/hydrophilic properties (see Answer 3-13) indicate that it is at best weakly polar. Tyrosine (Y) also deserves comment. Tyrosine is usually grouped with polar amino acids because of its hydroxyl moiety; however, its measured hydrophobic/hydrophilic properties are ambiguous (see Answer 3-13), indicating that it is only weakly polar. By the criterion of buriedness, it clearly behaves like other polar amino acids.

3-43 These data are consistent with the hypothesis that the springlike behavior of titin is due to the sequential unfolding of Ig domains. First, the fragment contained seven Ig domains, and there are seven peaks in the force-versus-extension curve. In addition, the peaks themselves are what you might expect for sequential unfolding. Second, in the presence of a protein denaturant, conditions under which the domains will already be unfolded, the peaks disappear and the extension per unit force increases. Third, when the domains are cross-linked, and therefore unable to unfold, the peaks disappear and extension per unit force decreases.

The spacing between peaks, about 25 nm, is almost exactly what you would calculate for the sequential unfolding of Ig domains. The folded domain occupies 4 nm, but when unfolded, its 89 amino acids would stretch to about 30 nm (89 × 0.34 nm), a change of 26 nm.

The existence of separate, discrete peaks means that each domain unfolds when a characteristic force is applied, implying that each
domain has a defined stability. The collection of domains unfolds in order from least stable to most stable. Thus it takes a little more force each time to unfold the next domain.

The sudden collapse of the force at each unfolding event reflects an important principle of protein unfolding; namely, its cooperativity. Proteins tend to unfold in an all-or-none fashion (see Problem 3-41). A small number of hydrogen bonds are crucial for holding the folded domain together (Figure 3-45). The breaking of these bonds triggers cooperative unfolding.

Reference: Rief M, Gautel M, Oesterhelt F, Fernandez JM & Gaub HE (1997) Reversible unfolding of individual titin immunoglobulin domains by AFM. Science 276, 1109-1112.

None are detected in this experiment. Treating first with radiolabeled NEM shows that many cytosolic proteins have cysteines that are not linked by disulfide bonds. Treating first with unlabeled NEM to block these sites, followed by DTT to break disulfide bonds, should expose any SH groups that were linked by disulfide bonds. These newly exposed SH groups should be labeled by subsequent treatment with radiolabeled NEM. The absence of labeling indicates that no cysteines were involved in disulfide bonds.

BSA and insulin are labeled extensively only after their disulfide bonds have been broken by treatment with DTT. In the absence of DTT treatment, BSA is weakly labeled. Since BSA has an odd number of cysteines, at least one cannot be involved in disulfide bonds. Structural analysis confirms that one of its 37 cysteines is not involved in a disulfide bond.

Because the ER is the site where disulfide bond formation is catalyzed in preparation for export of proteins, it is expected that lysates from cells that have internal membranes would have many proteins with disulfide bonds.

Figure 3-45 Hydrogen bonds that lock the domain into its folded conformation (Answer 3-43). The indicated hydrogen bonds (gray lines), when broken, trigger unfolding of the domain. If you compare this topological diagram with the three-dimensional structure in Figure 3-12A, you can pick out the two short β strands that are involved in forming these hydrogen bonds.

PROTEIN FUNCTION

DEFINITIONS
3-45 Scaffold protein
3-46 Feedback inhibition
3-47 Antibody
3-48 Active site
3-49 Enzyme
3-50 Protein phosphatase
3-51 Linkage
3-52 Protein kinase
3-53 Transition state
3-54 SCF ubiquitin ligase
3-55 Allosteric protein
3-56 Proteomics
3-57 Coenzyme

TRUE/FALSE
3-58 False. The pK values of specific side-chain groups depend critically on the environment. On the surface of a protein, in the absence of surrounding charged groups, the pK of a carboxyl group is usually close to that of the free amino acid. In the neighborhood of negatively charged groups, the pK of a carboxyl group is usually higher; that is, the proton dissociates less readily, since the increase in local density of negative charge is not favored. The opposite is true in a positively charged environment. In hydrophobic surroundings the dissociation of a proton can be substantially suppressed, since the presence of a naked charge in such an environment is highly disfavored. It is this ability to alter the reactivities of individual groups that allows proteins to fine-tune their biological functions.

3-59 False. Assuming that the three-dimensional structure of at least one family member is known, it would be possible to use evolutionary tracing (fitting the primary sequence to the structure) to determine where the conserved amino acids cluster on the surfaces of the proteins. Clusters of conserved amino acids are likely to correspond to important regions such as those involved in binding to specific ligands or other proteins. Knowing where such binding sites reside on the surface does not identify the protein's function. You would not know whether the protein was an enzyme or a structural protein, or what it bound to. Some other approach, usually biochemical, would be required to elucidate the function.

3-60 False. The turnover number is constant, since it is Vmax
divided by enzyme concentration. For example, a 2-fold increase in enzyme concentration would give a 2-fold higher Vmax, but it would give the same turnover number: $2V_{max}/2[E] = k_3$.

3-61 True. The term cooperativity embodies the idea that changes in the conformation of one subunit are communicated to the other identical subunits in any given molecule, so that all of these subunits are in the same conformation.

3-62 True. Each cycle of phosphorylation-dephosphorylation hydrolyzes one molecule of ATP; however, it is not wasteful in the sense of having no benefit. Constant cycling allows the regulated protein to switch quickly from one state to another in response to stimuli that require rapid adjustments of cellular metabolism or function. This is the essence of effective regulation.

3-63 False. Although many of the conformational changes induced by ligand binding are relatively small, in some instances these local changes are propagated through a molecule to give rise to changes of more than a nanometer. The conformational change triggered by hydrolysis of GTP by EF-Tu, for example, allows two domains of the protein to separate by 4 nm.

Figure 3-46 The two binding sites for valine and threonine on valyl-tRNA synthetase (Answer 3-65).

THOUGHT PROBLEMS
3-64 Antifreeze proteins function by binding to tiny ice crystals and arresting their growth, thereby preventing the fish from freezing. Ice crystals that form in the presence of antifreeze proteins are abnormal in that their surfaces are curved instead of straight. The various forms of the antifreeze proteins in these fishes are all composed of repeats of a simple glycotripeptide, Thr-Ala/Pro-Ala, with a disaccharide attached to each threonine. The genes for these antifreeze proteins were apparently derived by repeated duplication of a small segment of a protease gene.

References: Cheng C-HC & Chen L (1999) Evolution of an antifreeze protein. Nature 401, 443-444. Jia Z & Davies PL (2002) Antifreeze proteins: an unusual receptor-ligand
interaction. Trends Biochem. Sci. 27, 101-106.

3-65 To bind to valine, the valyl-tRNA synthetase uses a binding pocket of the proper shape that is lined with hydrophobic residues. Such a binding site permits valine to bind well but does not fully exclude threonine, which has the same shape and a single polar hydroxyl group (Figure 3-46). The second binding site is much more specific for threonine because it contains an appropriately positioned hydrogen-bond acceptor that makes a specific hydrogen bond with threonine but not with valine. Even though valine can fit into the site, it cannot bind tightly and is thus a very poor substrate for the hydrolysis reaction.

3-66 Of all the possible pairs, only proteins 2 and 6 can interact in a way that satisfies all the binding groups on their binding surfaces (Figure 3-47). This sort of complementary arrangement of binding moieties is characteristic of surface-surface interactions between proteins.

Figure 3-47 Binding of protein 2 with protein 6 (Answer 3-66).

3-67 The problem is that the off rate for the antibody-enzyme complex is too slow. In order for the peptide to displace the enzyme from the column, the enzyme must first dissociate from the antibody. The antibody binding sites would then be quickly bound by the peptide, whose high concentration would prevent the enzyme from reattaching to the antibody (any newly exposed antibody binding site would be bound by peptide). In principle, you could soak the column with peptide for several days (for several dissociation half-times; see Problem 3-90), but this usually has adverse consequences for the quality and activity of the enzyme preparation. In general, high-affinity antibodies have slow off rates and are unsuitable for affinity chromatography. Special procedures have been devised for preparing or identifying antibodies that work in such experiments. Usually lower-affinity antibodies are used, or chromatography is carried out under special conditions that reduce the
affinity of the antibody.

Reference: Thompson NE & Burgess RR (1996) Immunoaffinity purification of RNA polymerase II and transcription factors using polyol-responsive monoclonal antibodies. Methods Enzymol. 274, 513-526.

The reaction rate for the altered enzyme would be substantially slower than for the normal enzyme. The reaction rate is related to the activation energy, which is the difference in energy between the trough labeled ES in Figure 3-16 and the transition state: the larger the activation energy, the slower the rate. If the altered enzyme bound the substrate with higher affinity (a lower ES trough), then the activation energy would increase and the reaction would slow down.

Because an enzyme has a fixed number of active sites, the rate of the reaction cannot be further increased once the substrate concentration is sufficient to bind to all the sites. It is the saturation of binding sites that leads to an enzyme's saturation behavior. The other statements are all true, but none is relevant to the question of saturation.

Since $k_1$ corresponds to the on rate and $k_{-1}$ corresponds to the off rate, $K_d = [E][S]/[ES] = k_{off}/k_{on} = k_{-1}/k_1$.

Km is approximately equal to Kd when kcat is much less than $k_{-1}$; that is to say, when the ES complex dissociates much more rapidly than substrate is converted to product. This is true for many enzymes, but not all.

Because kcat is in the numerator of the expression for Km, $K_m = (k_{-1} + k_{cat})/k_1$, Km will always be somewhat larger than Kd. Since lower values of Kd indicate higher binding affinity, Km will always underestimate the binding affinity. When kcat is much less than $k_{-1}$, the underestimate will be slight and Km will essentially equal Kd.

All explanations have at their heart the idea that the quantity of active enzyme per total protein (the specific activity of the enzyme) is 10-fold less in bacteria. Such a situation could arise for a number of reasons. 90% of the enzyme may fold incorrectly in bacteria. An essential cofactor of the enzyme, which is normally tightly bound, may be limiting in bacteria so
that only 10% of the enzyme molecules acquire it. These explanations, which propose that there are 10% normally active enzymes among otherwise dead molecules, account naturally for the observation that the Km is identical (Km is independent of the concentration of active enzyme) while Vmax is lower (Vmax is dependent on the concentration of active enzyme).

One common suggestion is that the enzyme in bacteria folds so that each molecule has 10% of the normal activity. This possibility can be ruled out because the lower activity of each molecule would show up as a change in Km as well as Vmax.

3-72 An enzyme composed entirely of mirror-image amino acids would be expected to fold stably into a mirror-image conformation; that is, it would look like the normal enzyme when viewed in a mirror.

A mirror-image enzyme would be expected to recognize the mirror image of its normal substrate. Thus, "D" hexokinase would be expected to add a phosphate to L-glucose and to ignore D-glucose. This experiment has actually been done for HIV protease. The mirror-image protease recognizes and cleaves a mirror-image substrate.

Reference: Milton RC, Milton SC and Kent SB (1992) Total chemical synthesis of a D-enzyme: the enantiomers of HIV-1 protease show reciprocal chiral substrate specificity. Science 256, 1445-1448.

Phosphoglycolate is a transition-state analog for the triosephosphate isomerase reaction. It has the two characteristics that define a transition-state analog: it resembles the reaction intermediate, and it binds more tightly (here, about 15 times more tightly) than the substrates.

References: Kyte J (1995) Mechanism in Protein Chemistry, pp. 207-208. New York: Garland Publishing. Pauling L (1948) Chemical achievement and hope for the future. Am. Sci. 36, 5-8.

Amino acid side chains in proteins often have quite different pK values from those in solution. Glu 35 is uncharged because its local environment is nonpolar, which makes ionization less favorable (raises its pK). The local environment of Asp 52 is more polar,
permitting ionization near its solution pK. As the pH drops below 5, Asp 52 picks up a proton and becomes nonionized, interfering with the mechanism. As the pH rises above 5, Glu 35 begins to release its proton, also interfering with the mechanism.

Water from rusty pipes provides iron, which is essential for all forms of life. Egg white contains a special protein, ovotransferrin, which binds iron very tightly (analogous to the binding of biotin by avidin). Washing eggs in rusty water allows iron to enter in sufficient quantities to exceed the binding capacity of ovotransferrin, thereby making free iron available to the microorganisms.

This simple question required decades of research to provide a complete and satisfying answer. At the simplest level, hemoglobin binds oxygen efficiently in the lungs because the concentration (partial pressure) of oxygen is highest there. In the tissues, the concentration of oxygen is lower because it is constantly being consumed in metabolism. Thus, hemoglobin will tend to release (bind less) oxygen in the tissues. This natural tendency (an effect on the binding equilibrium) is enhanced by allosteric interactions among the four subunits of the hemoglobin molecule. As a consequence, much more oxygen is released in the tissues than would be predicted by a simple binding equilibrium.

Although the rate of diffusion cannot be altered by changes to the enzymes, the average distance over which a molecule must diffuse can be manipulated. Linking the two enzymes together decreases the average distance for diffusion of the first product to the second enzyme. A decrease in the distance reduces the time for diffusion and thus increases the overall rate of the reactions catalyzed by the pair of enzymes.

When $[S] \gg K_m$, the enzyme will be virtually saturated with substrate at all times and capable of operating at maximum rate, independent of small fluctuations in substrate concentration. For many enzymes that use ATP and a second substrate, as protein kinases do, the Km for ATP is
usually very low (a few µM for most protein kinases) relative to the concentration of ATP in the cell (1-2 mM). This situation allows the kinases to operate effectively regardless of the typical fluctuations in ATP concentration. Under these conditions, the rate of phosphorylation depends solely on the concentration of the other substrate.

When $[S] \approx K_m$, the rate of the enzyme-catalyzed reaction will vary in proportion to the changes in substrate concentration. This is the typical situation for most enzymes involved in metabolic pathways, which allows them to keep up with the changing flow through the pathway.

When $[S] \ll K_m$, the enzyme will be mostly unoccupied by substrate and will be operating much below its maximum rate. This is a strategy that might be used, for example, if multiple different enzymes draw on a common pool of substrate. An enzyme with a high Km would use only a small fraction of the pool unless the concentration increased dramatically. Just such a strategy is employed in animals for routing glucose for metabolism. Most cells in the body use hexokinase, which has a low Km, to add phosphate to glucose to initiate its metabolism. By contrast, liver cells use glucokinase, which has a high Km, to carry out this reaction. Between meals, the circulating glucose is routed mainly to nonliver cells, which use the low-Km enzyme hexokinase. After meals, when the circulating concentration of glucose is much higher, the liver captures a much larger fraction because glucokinase activity increases much more with higher substrate concentration than does hexokinase activity. The liver uses much of that glucose to make glycogen, which serves as a glucose reserve for use between meals.

Cells cannot influence rates of diffusion, which are limited by physical parameters beyond a cell's control. As discussed in Answer 3-77, cells can decrease the time it takes for substrates to reach an enzyme by increasing the concentration of
enzyme or by linking related enzymes in multienzyme complexes.

One reasonable proposal would be for excess AMP to feedback inhibit the enzyme for converting E to F, and excess GMP to feedback inhibit the step from E to H. Intermediate E, which would then accumulate, would feedback inhibit the step from R5P to A. Some branched pathways are regulated in just this way. Purine nucleotide synthesis is regulated somewhat differently, however (Figure 3-48). AMP and GMP regulate the steps from E to F and from E to H, as above, but they also regulate the step from R5P to A. Regulation by AMP and GMP at this step might seem problematical, since it suggests that a rise in AMP, for example, could shut off the entire pathway even in the absence of GMP. The cell uses a very clever trick to avoid this problem. Individually, excess AMP or GMP can inhibit the enzyme to about 50% of its normal activity; together they can completely inhibit it.

In resting muscle, ATP usage is at a minimum; hence the group of ATP-like signal metabolites accumulates. Specific members of this group inhibit glycogen phosphorylase and stimulate glycogen synthase, ensuring that glycogen reserves are maintained or increased. In exercising muscle, ATP usage is high and AMP-like signal metabolites increase. Specific AMP-like signal metabolites stimulate glycogen phosphorylase and inhibit glycogen synthase, ensuring a breakdown of glycogen to provide glucose units for ATP production.

The substrate phosphate and the activator AMP both bind to the rarer conformation of glycogen phosphorylase, thereby shifting the conformational equilibrium in favor of the more active species. This makes good biological sense because phosphate and AMP concentrations both rise when the cell increases its rate of ATP hydrolysis, and activation of glycogen phosphorylase is one way to provide metabolic substrates for the synthesis of additional ATP. In both cases, the overall activity of the enzyme increases because the fraction of enzymes in the high-activity conformation
is increased.

The first MWC postulate, which states that the subunits are arranged symmetrically, rules out all arrangements except those shown in the leftmost and rightmost columns of the diagram. If ligand binds much more tightly to circles, then the allowed arrangements are those shown in Figure 3-49. If the ligand binds equally to both subunit conformations, then all the arrangements in the leftmost and rightmost columns are allowed, consistent with MWC postulate 1.

Figure 3-48 Pattern of inhibition in the metabolic pathway for purine nucleotide synthesis (Answer 3-80). The pathway runs R5P → A → B → C → D → E, with branches E → F → G → AMP and E → H → I → GMP.

Figure 3-49 Arrangements of subunit conformations that are consistent with the MWC postulates (Answer 3-83). Shaded area indicates those arrangements that are excluded by the MWC postulates for a ligand with affinity for one conformation of subunit (circle).

Detailed studies on a few cooperative enzymes, such as aspartate transcarbamoylase, have found no evidence for intermediate nonsymmetrical conformations.

The rate of the metabolic reaction depends on the population of enzyme molecules, not on an individual enzyme molecule. While an individual molecule is either on or off, depending on whether it is phosphorylated, the activity of the population of enzyme molecules depends on the proportion of these molecules that are phosphorylated. As the proportion of phosphorylated molecules increases from 0 to 100%, the activity of the population of enzymes (the rate of the metabolic reaction) will decrease smoothly from 100% to 0. The phosphorylation state of a population of enzyme molecules is controlled by the balance between the opposing activities of protein kinases, which attach phosphates, and protein phosphatases, which remove them.

A mutation that decreases the rate of GTP hydrolysis by Ras would prolong its activated state, leading to excessive stimulation of cell proliferation. Indeed, many cancers contain just such a mutant
form of Ras. The mutant Ras proteins in all other choices would lead to a decreased ability to transmit the downstream signal, thus decreasing cell proliferation. For example, a mutation that increased the affinity of Ras for GDP (choice B) would prolong the inactive state of Ras, thereby interfering with the growth signal and decreasing cell proliferation.

A nonfunctional GAP (choice A) or a permanently active GEF (choice D) would allow Ras to remain in the active state (with GTP bound) longer than normal and thus might cause excessive cell proliferation.

In the absence of ATP, a motor protein would stop moving. The conformational shifts that are required for movement are triggered by ATP binding and hydrolysis. In the absence of ATP, the motor protein would be stuck in its lowest-energy conformation.

If the free-energy change for the hydrolysis of ATP by the motor protein were zero (conditions under which ATP is as easily made as hydrolyzed), the motor protein would wander back and forth. With zero free-energy change, there would be no barrier between conformations.

CALCULATIONS

3-88
A. The equilibrium constant K equals $10^{6}\ \mathrm{M}^{-1}$.

$K = \dfrac{[AB]}{[A][B]} = \dfrac{10^{-6}\ \mathrm{M}}{(10^{-6}\ \mathrm{M})(10^{-6}\ \mathrm{M})} = 10^{6}\ \mathrm{M}^{-1}$

B. The same calculation as above, when each component is present at $10^{-9}$ M, gives an equilibrium constant of $10^{9}\ \mathrm{M}^{-1}$.

C. This example illustrates that interacting cellular proteins present at low concentrations need to bind to one another with high affinities if a high proportion of the molecules are to be bound together. A three-order-of-magnitude difference in the equilibrium constant corresponds to a free-energy difference of about 4.2 kcal/mole. Thus, effective binding at the lower concentration would require the equivalent of 4-5 extra hydrogen bonds. The free-energy difference between the two equilibrium constants can be calculated. For an equilibrium constant of $10^{6}\ \mathrm{M}^{-1}$,

$\Delta G = -2.3\,RT \log K$

Substituting,

$\Delta G = -1.41\ \text{kcal/mole} \times 6 = -8.46\ \text{kcal/mole}$

For an equilibrium constant of $10^{9}\ \mathrm{M}^{-1}$,

$\Delta G = -1.41\ \text{kcal/mole} \times 9 = -12.69\ \text{kcal/mole}$

Thus the
higher equilibrium constant corresponds to a free-energy difference that is 4.2 kcal/mole more negative. To supply this amount of binding energy with hydrogen bonds (about 1 kcal/mole each) would require about 4-5 extra hydrogen bonds.

The antibody binds to the second protein with an equilibrium constant K of $5 \times 10^{7}\ \mathrm{M}^{-1}$.

A useful shortcut to problems of this sort recognizes that ΔG is related to log K by the factor $-2.3\,RT$, which equals -1.4 kcal/mole at 37°C. Thus, a factor of ten increase in the equilibrium constant (an increase in log K of 1) corresponds to a decrease in ΔG of 1.4 kcal/mole. A 100-fold increase in K corresponds to a decrease in ΔG of 2.8 kcal/mole, and so on. For each factor of ten increase in K, ΔG decreases by 1.4 kcal/mole; for each factor of ten decrease in K, ΔG increases by 1.4 kcal/mole. This relationship allows a quick estimate of changes in equilibrium constant from free-energy changes and vice versa. In this problem you are told that ΔG increased by 2.8 kcal/mole (a weaker binding gives a less negative ΔG). According to the relationship developed above, this increase in ΔG requires that K decrease by a factor of 100 (a decrease by 2 in log K); thus the equilibrium constant for binding to the second protein is $5 \times 10^{7}\ \mathrm{M}^{-1}$.

The solution to the problem can be calculated by first determining the free-energy change represented by the binding to the first protein:

$\Delta G = -2.3\,RT \log K$

Substituting for K,

$\Delta G = -13.68\ \text{kcal/mole}$

The free-energy change associated with binding to the second protein is obtained by adding 2.8 kcal/mole to the free-energy change for binding to the first protein, giving a value of -10.88 kcal/mole. Thus the equilibrium constant for binding to the second protein is

$\log K = \dfrac{-10.88\ \text{kcal/mole}}{-1.41\ \text{kcal/mole}} = 7.7$

$K = 5 \times 10^{7}\ \mathrm{M}^{-1}$

The equilibrium constants for the two reactions are the same, $10^{8}\ \mathrm{M}^{-1}$.

$K = \dfrac{[\mathrm{Ab{-}Pr}]}{[\mathrm{Ab}][\mathrm{Pr}]} = \dfrac{k_{\mathrm{on}}}{k_{\mathrm{off}}}$

For the first antibody-protein reaction,

$K = \dfrac{k_{\mathrm{on}}}{k_{\mathrm{off}}} = \dfrac{10^{5}\ \mathrm{M}^{-1}\,\mathrm{sec}^{-1}}{10^{-3}\ \mathrm{sec}^{-1}} = 10^{8}\ \mathrm{M}^{-1}$

For the second reaction,

$K = \dfrac{k_{\mathrm{on}}}{k_{\mathrm{off}}} = \dfrac{10^{3}\ \mathrm{M}^{-1}\,\mathrm{sec}^{-1}}{10^{-5}\ \mathrm{sec}^{-1}} = 10^{8}\ \mathrm{M}^{-1}$

Since the first
reaction has both a faster association rate and a faster dissociation rate, it will come to equilibrium more quickly than the second reaction.

The time it takes for half the complex to dissociate can be calculated from the relationship given in the problem:

$2.3 \log \dfrac{[\mathrm{Ab{-}Pr}]_t}{[\mathrm{Ab{-}Pr}]_0} = -k_{\mathrm{off}}\,t$

Substituting 0.5 for $[\mathrm{Ab{-}Pr}]_t/[\mathrm{Ab{-}Pr}]_0$ and rearranging the equation,

$t = \dfrac{-2.3 \log 0.5}{k_{\mathrm{off}}}$

For the first complex, with $k_{\mathrm{off}} = 10^{-3}\ \mathrm{sec}^{-1}$, t = 692 seconds, or 11.5 minutes. For the second complex, with $k_{\mathrm{off}} = 10^{-5}\ \mathrm{sec}^{-1}$, the calculation gives $t = 6.9 \times 10^{4}$ seconds, or about 19 hours. Thus, the first complex, which falls apart relatively quickly, would be much more difficult to work with than the second complex, which falls apart more slowly. Inappropriate reliance on the equilibrium constant, instead of the off-rate constant, can lead an investigator astray in this sort of experiment.

At equilibrium, the rates of the forward and reverse reactions are equal. This is the definition of equilibrium. The overall reaction rate at equilibrium will be 0.

The equilibrium constant equals $10^{3}$. At equilibrium, the forward and reverse reactions are equal. Thus,

$k_f[A] = k_r[B]$

$\dfrac{k_f}{k_r} = \dfrac{[B]}{[A]} = K$

Thus, $K = 10^{-4}\ \mathrm{sec}^{-1} / 10^{-7}\ \mathrm{sec}^{-1} = 10^{3}$.

Enzyme catalysis does not alter the equilibrium for a reaction; it only speeds the attainment of equilibrium. Thus, the equilibrium constant is $10^{3}$. If the equilibrium is unchanged and $k_f$ is increased by a factor of $10^{9}$, then $k_r$ must also be increased by a factor of $10^{9}$.

At [S] = zero, the ratio [S]/([S] + Km) equals 0/Km and the rate is therefore zero. At [S] = Km, the ratio [S]/([S] + Km) equals 1/2 and the rate is 1/2 Vmax. At infinite [S], the ratio [S]/([S] + Km) equals 1 and the rate is equal to Vmax.

If Km increases, then the concentration of substrate necessary to give half-maximal rate also increases. At a concentration of substrate equal to the Km of the unphosphorylated enzyme, the phosphorylated enzyme would have a slower rate; thus phosphorylation inhibits this enzyme.

3-94 D. The substrate concentration must be increased by a factor of 16 to increase
the rate from 20% to 80% Vmax. Substituting a rate of 20% Vmax into the Michaelis-Menten equation gives

$0.2\,V_{max} = V_{max}\,\dfrac{[S]}{[S] + K_m}$

Cancelling Vmax and multiplying both sides by ([S] + Km) gives

$0.2\,[S] + 0.2\,K_m = [S]$

$0.8\,[S] = 0.2\,K_m$

$[S] = 0.25\,K_m$ at 20% Vmax

An analogous calculation shows that $[S] = 4\,K_m$ at 80% Vmax. Thus, [S] must increase by a factor of 16 (4 Km/0.25 Km) for the rate to go from 20% to 80% Vmax.

The turnover number for carbonic anhydrase is $6.1 \times 10^{7}$/min, or $1.0 \times 10^{6}$/sec. For this calculation it is necessary to express the amount of CO2 hydrated and the amount of the enzyme on the same molecular basis, either as molecules or moles. For CO2, converting the mass hydrated per minute per mL to molecules (44 g/mole; $6.02 \times 10^{23}$ molecules/mole) gives $1.23 \times 10^{22}$ molecules/min mL. For carbonic anhydrase, converting the enzyme concentration to molecules (30,000 µg/µmole) gives $2.0 \times 10^{14}$ molecules/mL. The turnover number ($k_3 = V_{max}/[E]$) is then

$\dfrac{1.23 \times 10^{22}\ \text{molecules/min mL}}{2.0 \times 10^{14}\ \text{molecules/mL}} = 6.1 \times 10^{7}/\text{min}$

The numerical value of the product of the Kd values for the substrates is $3.0 \times 10^{-7}$ ($27 \times 10^{-6} \times 11 \times 10^{-3}$), which is about 11-fold greater than the Kd for PALA ($2.7 \times 10^{-8}$), suggesting that PALA might be a transition-state analog. PALA, however, is composed not of aspartate plus carbamoyl phosphate, but of succinate plus a close analog of carbamoyl phosphate. If one substitutes the Kd for succinate (0.9 mM) into the calculation, the product of the Kd values is $2.4 \times 10^{-8}$ ($27 \times 10^{-6} \times 0.9 \times 10^{-3}$), which is very close to the Kd for PALA. Thus, PALA is likely to be a bisubstrate analog rather than a transition-state analog.

You may have wondered whether it is valid to compare Kd values in this way. Recall that $\Delta G = -2.3\,RT \log K$. Using this equation, we could have converted Kd values to ΔG values and compared the sum of ΔG values for aspartate and carbamoyl phosphate with the ΔG for PALA. Since ΔG is proportional to log K, this is equivalent to comparing the products of the Kd values for aspartate and carbamoyl phosphate with the Kd for PALA. Do the calculation for ΔG values and convince yourself that this is true.

Reference: Fersht A (1999) Structure and Mechanism in Protein Science, pp. 360-361. New York: W.H. Freeman.

An understanding of this phenomenon comes from a consideration of Kd, which for the binding of PALA to aspartate transcarbamoylase (ATC) is

$K_d = \dfrac{[\mathrm{PALA}][\mathrm{ATC}]}{[\mathrm{PALA{-}ATC}]} = 2.7 \times 10^{-8}\ \mathrm{M}$

Since the concentration of enzyme is negligible relative to the concentration of added PALA, we can use $2.7 \times 10^{-6}$ M as the equilibrium concentration of PALA. Substituting this value into the Kd expression gives

$\dfrac{[\mathrm{ATC}]}{[\mathrm{PALA{-}ATC}]} = 0.01$

Thus, in the presence of PALA only 1% of the enzyme is free. This is true for normal cells and resistant cells. For resistant cells, which have 100 times as much enzyme, 1% free corresponds to the amount that is present in uninhibited normal cells. Thus, they can grow perfectly well in the presence of PALA.

Mutational resistance to an inhibitor requires a subtle change in the enzyme that allows it to decrease its binding affinity for the inhibitor while not significantly altering its binding affinity for substrates. This may not be a feasible response to PALA because it so closely resembles the two substrates. A change that decreases PALA binding will likely decrease substrate binding as well.

Reference: Wahl GM, Padgett RA and Stark GR (1979) Gene amplification causes overproduction of the first three enzymes of UMP synthesis in N-(phosphonacetyl)-L-aspartate (PALA)-resistant hamster cells. J. Biol. Chem. 254, 8679-8689.

3-98 The relative concentrations of the normal and mutant Src proteins are inversely proportional to the volumes in which they are distributed. The mutant Src is distributed throughout the volume of the cell, which is

$V_{cell} = \tfrac{4}{3}\pi r^{3} = \tfrac{4}{3}\pi\,(10 \times 10^{-6}\ \mathrm{m})^{3} = 4.1888 \times 10^{-15}\ \mathrm{m}^{3}$

Normal Src is confined to the 4-nm layer beneath the membrane, which has a volume equal to the volume of the cell minus the volume of a sphere with a radius 4 nm less than that of the cell:

$V_{layer} = V_{cell} - \tfrac{4}{3}\pi\,(r - 4\ \mathrm{nm})^{3}$

$V_{layer} = V_{cell} - \tfrac{4}{3}\pi\,\left[(10 \times 10^{-6}\ \mathrm{m}) - (4 \times 10^{-9}\ \mathrm{m})\right]^{3} = 4.1888 \times 10^{-15}\ \mathrm{m}^{3} - 4.1838 \times 10^{-15}\ \mathrm{m}^{3}$

$V_{layer} = 0.0050 \times 10^{-15}\ \mathrm{m}^{3}$

Thus, the volume of the cell is 838 times greater than the volume of a 4-nm-thick
layer beneath the membrane ($4.1888 \times 10^{-15}\ \mathrm{m}^{3} / 0.0050 \times 10^{-15}\ \mathrm{m}^{3}$).

Even allowing for the interior regions of the cell from which it would be excluded (nucleus and organelles), the mutant Src would still be a couple of orders of magnitude less concentrated in the neighborhood of the membrane than the normal Src. Its lower concentration in the region of its target X at the membrane is the reason why mutant Src does not cause cell proliferation. This notion can be quantified by a consideration of the binding equilibrium for Src and X:

$\mathrm{Src} + \mathrm{X} \rightleftharpoons \mathrm{Src{-}X}$

$K = \dfrac{[\mathrm{Src{-}X}]}{[\mathrm{Src}][\mathrm{X}]}$

The lower concentration of the mutant Src in the region of the membrane will shift the equilibrium toward the free components, reducing the amount of complex. If the concentration is on the order of 100-fold lower, the amount of complex will be reduced up to 100-fold. Such a large decrease in complex formation could readily account for the lack of effect of the mutant Src on cell proliferation.

DATA HANDLING

3-99 Your results support the idea that the PI 3-kinase interacts with the activated PDGF receptor through its SH2 domains. The interaction is blocked specifically by the phosphorylated pentapeptides 708 and 719. In their nonphosphorylated forms, these same pentapeptides do not block the association.

The common features of the seven peptides that can bind to PI 3-kinase are a phosphotyrosine and a methionine (M) located three positions away in the C-terminal direction. Although not shown explicitly here, there seems to be no requirement for specific amino acids on the N-terminal side of the phosphotyrosine.

Recognition of a couple of amino acids in a short sequence is characteristic of a surface-string interaction. Indeed, recognition of sequences by SH2 domains is often cited as a prime example of such an interaction.

3-100 The immunoblot shows that antibodies BPA1 and BPA2 react specifically with a 220-kd protein, which is likely to be Brca1. By contrast, although antibody C20 reacts with the same protein,
it seems to react even more strongly with a second protein of about 180 kd. Thus, a likely explanation for the contradictory cell-localization experiments is that C20 antibodies were identifying the location of the 180-kd protein, whereas BPA1 and BPA2 were showing the location of Brca1. Brca1 is now thought to function in the nucleus. Additional experiments have identified the epidermal growth factor (EGF) receptor as the protein with which C20 cross-reacts. The EGF receptor has a couple of regions of similarity to the peptide that was used to generate the C20 antibody. Cross-reactivity of antibodies is not an uncommon problem. For this reason, cell-localization studies are usually performed with antibodies raised against more than one region of a protein. Agreement with different antibodies decreases the likelihood that cross-reactivity is a problem.

References: Jensen RA, Thompson ME, Jetton TL, Szabo CI, van der Meer R, Helou B, Tronick SR, Page DL, King MC and Holt JT (1996) BRCA1 is secreted and exhibits properties of a granin. Nat. Genet. 12, 303-308. Thomas JE, Smith M, Rubinfeld B, Gutowski M, Beckmann RP and Polakis P (1996) Subcellular localization and analysis of apparent 180-kDa and 220-kDa proteins of the breast cancer susceptibility gene, BRCA1. J. Biol. Chem. 271, 28630-28635.

3-101 The slopes (-1/Kd) of the lines in Figure 3-26 can be estimated by taking the difference between two points on the y axis divided by the difference between the corresponding points on the x axis. Thus, the slope of line A is

$-\dfrac{1}{K_d} = \dfrac{0.08 - 0.35}{5 \times 10^{-7}\ \mathrm{M} - 1 \times 10^{-7}\ \mathrm{M}} = -6.75 \times 10^{5}\ \mathrm{M}^{-1}$

$K_d = 1.5 \times 10^{-6}\ \mathrm{M}$

For line B,

$-\dfrac{1}{K_d} = \dfrac{0.20 - 0.90}{5 \times 10^{-7}\ \mathrm{M} - 1 \times 10^{-7}\ \mathrm{M}}$

$K_d = 5.7 \times 10^{-7}\ \mathrm{M}$

The precise values are dependent on your estimate of the corresponding values on the x axis.

The lower the Kd, the tighter the binding; thus, the tighter IPTG-binding mutant of the Lac repressor corresponds to line B ($K_d = 5.7 \times 10^{-7}$ M) and the wild-type Lac repressor corresponds to line A ($K_d = 1.5 \times 10^{-6}$ M). That a lower value corresponds to tighter binding is apparent from the definition of Kd in
the problem. Tighter binding will give more complex (Pr-L) and fewer free components (Pr + L); thus the ratio of concentrations (Kd) will be smaller.

References: Gilbert W and Müller-Hill B (1966) Isolation of the Lac repressor. Proc. Natl Acad. Sci. USA 56, 1891-1898. Kyte J (1995) Mechanism in Protein Chemistry, pp. 175-177. New York: Garland Publishing.

3-102 When the concentrations of free and bound ligand are equal, their ratio becomes 1 and the concentration of free protein is equal to Kd.

$K_d = \dfrac{[\mathrm{Pr}][\mathrm{L}]}{[\mathrm{Pr{-}L}]}$

When [L] equals [Pr-L], $K_d = [\mathrm{Pr}]$.

Visual inspection of the data in Figure 3-27 shows that the concentrations of free and bound tmRNA are approximately equal when the concentration of SmpB is 18.8 nM. Thus, Kd is around 20 nM.

Not all of the added SmpB protein is free, since some is obviously bound to tmRNA. But because tmRNA was included at a concentration of 0.1 nM, the bound fraction at an SmpB concentration of 18.8 nM is only 0.05 nM. Thus, the correction for bound SmpB is minuscule (less than 1%) and can be neglected.

It is critical in this kind of experiment that the tmRNA concentration be kept well below Kd. If the concentration of tmRNA had been 100 nM, for example, the shift to 50% bound would have occurred at around 50 nM SmpB, and most of the protein would have been in the bound, not the free, form. If tmRNA were included at 100 µM, the shift to 50% bound would have occurred at around 50 µM SmpB. Thus, if tmRNA were included at a concentration above Kd, the point at which 50% was shifted to the bound form would bear no relationship to Kd.

Reference: Karzai AW, Susskind MM and Sauer RT (1999) SmpB, a unique RNA-binding protein essential for the peptide-tagging activity of SsrA (tmRNA). EMBO J. 18, 3793-3799.

3-103 The calculated values of fraction bound versus protein concentration are shown in Table 3-6. Also shown are rule-of-thumb values, which are easier to remember (see the answer to Question 2-54). These relationships are useful not only for thinking about Kd, but also for enzyme kinetics, which
we cover in other problems. The rate of a reaction, expressed as a fraction of the maximum rate, is

$\dfrac{\text{rate}}{V_{max}} = \dfrac{[S]}{[S] + K_m}$

which has the same form as the equation for fraction bound. Thus, when the concentration of substrate [S] is 10-fold above the Michaelis constant Km, the rate is 90% of the maximum, Vmax. When [S] is 100-fold below Km, the rate is 1% of Vmax.

The relationship also works for the fractional dissociation of an acidic group, HA, as a function of pH. When the pH is 2 units above pK, 99% of the acidic group is ionized. When the pH is 1 unit less than pK, 10% is ionized.

Table 3-6 Calculated values for fraction bound versus protein concentration (Answer 3-103).

Protein concentration    Fraction bound, calculated (%)    Fraction bound, rule of thumb (%)
10^4 Kd                  99.99                             99.99
10^3 Kd                  99.9                              99.9
10^2 Kd                  99                                99
10^1 Kd                  91                                90
Kd                       50                                50
10^-1 Kd                 9.1                               10
10^-2 Kd                 0.99                              1
10^-3 Kd                 0.099                             0.1
10^-4 Kd                 0.0099                            0.01

3-104 Since the concentration of Lac repressor is $10^{5}$ times the Kd for binding, you would expect 99.999% of the sites in a bacterial population to be occupied by the Lac repressor.

When inducer is present, the concentration of Lac repressor will be only 100 times more than the Kd, but you would still expect 99% of the sites to be occupied.

If 99% of the binding sites were occupied by the repressor even in the presence of the inducer, you would expect that the genes would still be very effectively turned off. This sort of straightforward calculation and its nonbiological answer (after all, the genes are known to be turned on by the inducer) tells you that some critical information is missing.

Low-affinity, nonspecific binding of the Lac repressor is the missing information suggested by the calculation in part C. Since there are $4 \times 10^{6}$ nonspecific binding sites in the genome (a number equal to the size of the genome), there is a competition for repressor between the multitude of low-affinity sites and the single high-affinity site. This competition reduces the effective concentration of the repressor. As can be calculated, the competition reduces repressor occupancy at the specific site to about 96% in the absence of lactose and to
about 3% in the presence of lactose. These numbers account nicely for the genes being turned off in the absence of lactose and turned on in its presence.

3-105 It is important that only a small quantity of product is made because otherwise the rate of the reaction would decrease as the substrate was depleted and product accumulated. Thus, the measured rates would be lower than they should be and the kinetic parameters would be incorrect.

The Michaelis-Menten plot shown in Figure 3-50 is a rectangular hyperbola, as expected if this enzyme obeys Michaelis-Menten kinetics. To determine values for Km and Vmax from this plot by visual inspection, you must estimate the rate at infinite substrate concentration. From the curve of the line in the figure, you might reasonably estimate Vmax as anywhere from 1.8 to 2.0 µmol/min. (As developed in the answer to Problem 3-103, a useful rule of thumb is that at a concentration of substrate 10-fold above Km, the rate is about 90% of Vmax.) If you chose 2.0 µmol/min, then 0.5 Vmax (1.0 µmol/min) corresponds to a substrate concentration of 1.0 µM, which is the value of Km.

The visual uncertainty in this plot led early researchers to transform the equation into a straight-line form so a line could be fitted to the data and the kinetic parameters could be more accurately determined. As indicated in the Lineweaver-Burk plot in Figure 3-50, the y-intercept is 0.5 (1/Vmax) and the x-intercept is -1.0 (-1/Km). Thus, Vmax equals 2.0 µmol/min and Km equals 1.0 µM. Although this form of a straight-line plot is commonly discussed in textbooks, it is rarely used in practice because the data points that are most reliable are tightly grouped at one end of the line. Consequently, the slope of the line is unduly influenced by the low, and usually less accurately determined, rates at low substrate concentration. Other straight-line transformations of the Michaelis-Menten equation, such as the Eadie-Hofstee plot (which is analogous in form to the Scatchard plot shown in Problem 3-101), are generally preferred. In this
era of computers, however, the data can be fitted perfectly well to the nonlinear Michaelis-Menten equation, although it is still common to present such data in a linear form.

3-106 Since NAM cannot occupy site C, that site must normally be occupied by NAG, and since the cell-wall polysaccharide is an alternating polymer of NAM and NAG, the NAM monomers must occupy sites B, D, and F. Because cleavage occurs after NAM monomers, the site of cleavage must be between sites B and C or between sites D and E. Since tri-NAG occupies sites A-C but is not cleaved, whereas longer NAG polymers are, the catalytic groups for cleavage must lie between sites D and E.

3-107 Binding of aspartate normally shifts the conformation of ATCase from the low-activity to the high-activity state. At low aspartate concentrations, not all of the ATCase will have been shifted to the high-activity conformation. The peculiar activating effect of malate occurs because its binding helps complete the shift of ATCase to the high-activity conformation. In the presence of a low concentration of malate, the number of active sites in the high-activity conformation increases; thus enzyme activity increases.

This peculiar activating effect of malate is not observed at high aspartate concentrations because ATCase is already entirely shifted to its high-activity conformation. Under these conditions, each molecule of malate that binds to an active site will reduce the total number of sites accessible to aspartate and thus reduce the overall activity of ATCase.

Figure 3-50 Michaelis-Menten and Lineweaver-Burk plots of the data in Table 3-3 (Answer 3-105). The x and y intercepts are indicated on the Lineweaver-Burk plot.

References: Cantor CR and Schimmel PR (1980) Biophysical Chemistry, pp. 944-945. New York: W.H. Freeman. Gerhart JC and Pardee AB (1963) The effect of the feedback inhibitor, CTP, on subunit
interactions in aspartate transcarbamylase. Cold Spring Harbor Symp. Quant. Biol. 28, 491-496.

3-108 These changes are exactly what you would expect for an allosteric enzyme such as ATCase. Because binding at one active site (one of six per ATCase molecule) is sufficient to shift the conformation of a molecule of ATCase, the change in global conformation (change in sedimentation) is expected to lead the change in occupancy of binding sites (change in spectral measurement).

References: Cantor CR and Schimmel PR (1980) Biophysical Chemistry, pp. 954-956. New York: W.H. Freeman. Kirschner MW and Schachman HK (1973) Local and gross conformational changes in aspartate transcarbamylase. Biochemistry 12, 2997-3004.

3-109 Both cyclin A and phosphorylation of Cdk2 are required to activate Cdk2 for efficient phosphorylation of histone H1 (see Figure 3-32, lane 5). Absence of cyclin A (lane 1) or absence of phosphorylation of Cdk2 (lane 3) results in much reduced levels of histone H1 phosphorylation.

Cyclin A, which binds tightly to both forms of Cdk2 (Kd = 0.05 µM), dramatically improves the binding of both forms to histone H1. In the absence of cyclin A, P-Cdk2, for example, binds histone H1 with a Kd of 100 µM, whereas in the presence of cyclin A it binds histone H1 with a Kd of 0.7 µM, an increase in the tightness of binding of more than a factor of 100. In addition, as shown in Figure 3-32, phosphorylation of Cdk2 activates its protein kinase activity, allowing it to phosphorylate histone H1 when cyclin A is present to increase its ability to bind to histone H1.

Given that the intracellular concentrations of ATP and ADP are more than 10-fold higher than the measured dissociation constants, the changes in affinity for ATP and ADP are unlikely to be critical for the function of Cdk2. The binding sites for ATP will be nearly saturated regardless of the phosphorylation state of Cdk2. ADP, which binds at the same site as ATP, is unlikely to interfere significantly with ATP binding because ADP has a higher Kd and its cellular concentration is generally lower
than that of ATP.

Reference: Brown NR, Noble MEM, Lawrie AM, Morris MC, Tunnah P, Divita G, Johnson LN & Endicott JA (1999) Effects of phosphorylation of threonine 160 on cyclin-dependent kinase 2 structure and activity. J. Biol. Chem. 274, 8746–8756.

3-110 A. Mutant 2 is Asp181→Ala (D181A). It is the best candidate because it has a Km close to that of the wild-type enzyme but a very low turnover number (kcat). With these kinetic parameters, it might be expected to bind normally to its target substrates but not remove phosphate from tyrosine. The next most likely candidate would be Arg221→Lys, which has a slightly lower Km than wild-type PTP1B and turns over slowly, although about 20 times faster than D181A. Further studies identified the band at 180 kd as the epidermal growth factor (EGF) receptor.

B. C215S showed no activity, as expected, since the SH group of cysteine is required for catalysis. Because C215S was not active, it was not possible to determine its Km, which might have been similar to that of the wild-type enzyme. Measurements of Kd were not made. Thus C215S was a reasonable candidate to test. Lack of success with C215S suggests that it binds phosphotyrosine-containing proteins very poorly.

Reference: Flint AJ, Tiganis T, Barford D & Tonks NK (1997) Development of substrate-trapping mutants to identify physiological substrates of protein tyrosine phosphatases. Proc. Natl Acad. Sci. USA 94, 1680–1685.

Chapter 5 DNA Replication, Repair, and Recombination

THE MAINTENANCE OF DNA SEQUENCES

DEFINITIONS
5-1 Mutation
5-2 Germ cell
5-3 Somatic cell

TRUE/FALSE
5-4 True. If the DNA in somatic cells were not sufficiently stable (that is, if it accumulated mutations too rapidly), the organism would die (of cancer, for example), and if organisms died before they could reproduce, the species would be at risk. If the DNA in reproductive cells were not sufficiently stable, many mutations would accumulate and be passed on to future generations, increasing the risk that the species would die out.

THOUGHT PROBLEMS
5-5 For
either hypothesis, you might expect to see about 10 surviving colonies per plate. If the bacteriophages induced resistance, the surviving colonies would appear in random positions on each of the replica plates. If the mutations preexisted, the resistant colonies would appear at the same locations on each of the three replica plates. In actual experiments of this kind, the surviving colonies appear at the same locations, indicating that the mutations preexist in the population.

Reference: Hartwell LH, Hood L, Goldberg ML, Reynolds AE, Silver LM & Veres RC (2000) Genetics: From Genes to Genomes, pp. 217–218. New York: McGraw-Hill.

5-6 Each time the genome is copied in preparation for cell division, there is a chance that mistakes (mutations) will be introduced. If a few mutations are made during each replication, then no two daughter cells will be the same. The rate of mutation for humans is estimated to be 1 nucleotide change per 10^9 nucleotides each time the DNA is replicated. Since there are 6.4 × 10^9 nucleotides in each diploid cell, an average of 6.4 random mutations will be introduced into the genome each time it is copied. Thus the two daughter cells from a cell division will usually differ from one another and from the parent cell that gave rise to them. Occasionally, genomes will be copied perfectly, giving rise to identical daughter cells. The proportion of identical cells depends on the exact mutation rate. If the true mutation rate in humans were 10-fold less (1 change per 10^10 nucleotides), most of the cells in the body would be identical.

In This Chapter
THE MAINTENANCE OF DNA SEQUENCES
DNA REPLICATION MECHANISMS
THE INITIATION AND COMPLETION OF DNA REPLICATION IN CHROMOSOMES
DNA REPAIR
HOMOLOGOUS RECOMBINATION
TRANSPOSITION AND CONSERVATIVE SITE-SPECIFIC RECOMBINATION

5-7 Mutations that alter the amino acid sequence in such a way that the protein no longer functions will tend to be lost through natural selection, that is,
through the preferential death of organisms that carry the mutations. This preferential loss leads to an underestimate of mutation rate because some mutations are not counted. Fibrinopeptides are much less sensitive to such effects because their function does not depend on their amino acid sequence. Thus they can tolerate almost any amino acid change, and as a result, estimates of mutation rates are more accurate. Even so, corrections still must be made for mutations that occur at the site of a previous mutation, changing it to the normal sequence or to a new mutant sequence: in the first case, two mutations would be counted as none; in the second, two mutations would be counted as one. These corrections become increasingly important with greater evolutionary separation.

Natural selection alone is not sufficient to eliminate recessive lethal genes from the population. Consider the following line of reasoning. Homozygous defective individuals can arise only as the offspring of a mating between two heterozygous individuals. By the rules of Mendelian genetics, offspring of such a mating will be in the ratio of 1 homozygous normal : 2 heterozygous : 1 homozygous defective. Thus, statistically, heterozygous individuals should always be more numerous than the homozygous defective individuals. And although natural selection effectively eliminates the defective genes in homozygous individuals through death, it can't touch the defective genes in heterozygous individuals because they do not affect the phenotype. Natural selection will keep the frequency of the defective gene low in the population, and indeed act to reduce it, but in the absence of any other effect there will always be a reservoir of defective genes in the heterozygous individuals.

At low frequencies of the defective gene, another important factor, chance, comes into play. Chance variation can increase or decrease the frequency of heterozygous individuals and thereby the frequency of the defective gene. By chance, the offspring of a mating
between heterozygotes could all be homozygous normal, which would eliminate the defective gene from that lineage. Increases in the frequency of a deleterious gene are opposed by natural selection; however, decreases are unopposed and can, by chance, lead to elimination of the defective gene from the population.

CALCULATIONS
5-9 In the initial population there are 10^6 copies of your 1000-bp gene, or a total of 10^9 bp to be replicated when the population doubles. At a mutation rate of 1 mutation per 10^9 bp per generation, you might expect 1 copy of your 1000-bp gene to carry a mutation: one mutant cell in the population of 2 × 10^6 cells, which is a frequency of 5 × 10^-7 (1 mutant cell/2 × 10^6 total cells).

After the first doubling, there will be 2 × 10^6 copies of your 1000-bp gene, for a total of 2 × 10^9 bp to be replicated at the next population doubling. At the same rate of mutation, you would now expect 2 mutant copies of the gene to be generated, which is a frequency of new mutants of 5 × 10^-7 (2 mutant cells/4 × 10^6 total cells). The frequency of total mutations in the population is greater because the mutant cell that was generated in the first doubling will divide to produce two mutant cells in the second generation, for an overall frequency of mutants equal to 10^-6 (4/4 × 10^6).

After the second doubling, there will be 4 × 10^6 copies of your 1000-bp gene, for a total of 4 × 10^9 bp to be replicated. You would expect 4 mutant copies of the gene to be generated, which is a frequency of new mutants of 5 × 10^-7 (4/8 × 10^6) in the third generation. The four mutant cells present after the second doubling would also double to generate 8 mutant cells; thus the overall frequency of mutants would be 1.5 × 10^-6 (12/8 × 10^6).

This exercise illustrates a key difference between rates of mutation and frequencies of mutation. Rates are constant, whereas frequencies increase with increasing growth of the cell population.

DATA HANDLING
5-10 The variation in frequency of mutants in different cultures exists because
of variations in the time at which the mutations arose. For example, cultures with one mutant acquired the mutation in the last generation; cultures with two mutants likely acquired a mutation in the next-to-last generation, and the mutant cell divided once; cultures with four mutants likely acquired the mutation in the third-to-last generation, and the mutant cell divided twice. Cultures with large numbers of mutant cells acquired a mutation early in growth, and those cells divided many times.

To understand this variability, it is best to think of the mutation rate (1 mutation per 10^9 bp per generation) as a probability: a 10^-9 chance of making a mutation each time a nucleotide pair is copied. Thus, sometimes a mutation will occur before 10^9 nucleotides have been copied and sometimes after.

Analysis of the variation in frequencies among cultures grown in this way, which is known as fluctuation analysis, is a common method for determining rates of mutation. Luria and Delbrück originally devised the method to show that mutations preexist in populations of bacteria; that is, they do not arise as a result of the selective methods used to reveal their presence.

Reference: Luria SE & Delbrück M (1943) Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28, 491–511.

DNA REPLICATION MECHANISMS

DEFINITIONS
5-11 RNA primer
5-12 DNA ligase
5-13 Strand-directed mismatch repair
5-14 DNA helicase
5-15 Leading strand
5-16 Sliding clamp
5-17 DNA topoisomerase
5-18 Replication fork
5-19 Lagging strand

TRUE/FALSE
5-20 False. The sequence of nucleotides in a newly synthesized strand is very different from that of the parental strand used as the template for its synthesis: the new strand is complementary to the 3'-to-5' sequence of the parental template strand.
5-21 True. At each replication fork, the leading strand is synthesized continuously and the lagging strand is synthesized as Okazaki fragments. Since half the DNA at each replication fork is stitched together from Okazaki fragments, half the
genome must be made this way.
5-22 True. If the replication fork moves forward at 500 nucleotide pairs per second, the DNA ahead of it must rotate at 48 revolutions per second (500 nucleotides per second/10.5 nucleotides per helical turn), or 2880 revolutions per minute. The havoc this would wreak on the chromosome is prevented by a DNA topoisomerase that introduces transient nicks just in front of the replication fork. The action confines the rotation to a short single-stranded segment of DNA.
5-23 False. The methylation-dependent repair system relies on methyl groups in the parent strand, and their absence from the progeny strand, in order to distinguish the two strands.
5-24 True. When topoisomerase I cleaves DNA, it stores the energy of the phosphodiester backbone in a phosphotyrosine bond to the enzyme, which it then uses to remake the phosphodiester bond in DNA.

THOUGHT PROBLEMS
5-25 The complementary strand is 5'-TGATTGTGGACAAAAATCC-3'. Recall that the two strands of a DNA double helix are antiparallel; that is, they run in opposite directions. By convention, sequences of single strands are written in the 5'-to-3' direction.
5-26 Because the two strands of the DNA double helix are antiparallel, the indicated phosphate is at the 5' end of the fragment to which it is attached.
5-27 A. Dideoxycytidine triphosphate (ddCTP) is identical to dCTP except that it lacks the 3' hydroxyl group on the sugar ring. ddCTP is recognized by DNA polymerase as dCTP and becomes incorporated into DNA. Because it lacks the crucial 3' hydroxyl group, its addition to a growing DNA strand creates a dead end to which no further nucleotides can be added. Thus, when ddCTP is added in large excess, each new strand will be synthesized until the first G is encountered in its template strand. ddCTP will then be incorporated in place of C, and the extension of this strand will be terminated.
B. If ddCTP is added at 10% the concentration of dCTP, there is a 1 in 10 chance of its being
incorporated whenever a G is encountered in the template strand. Thus a population of DNA fragments will be synthesized, and from their lengths the location of the G nucleotides in the template strand can be deduced. The use of such terminator nucleotides forms the basis of several methods for DNA sequencing.
C. Dideoxycytidine monophosphate (ddCMP) lacks the 5' triphosphate group as well as the 3' hydroxyl group of the sugar ring. The absence of the triphosphate means that ddCMP cannot provide the free energy that drives the polymerization of nucleotides into DNA. It is not a substrate for DNA polymerase and will not be incorporated into the replicating DNA. The molecule, at either concentration, is therefore not expected to affect DNA replication.
5-28 If the proofreading exonuclease activity of DNA polymerase were lost, you would expect the fidelity of DNA synthesis to be compromised. The proofreading exonuclease accounts for about a factor of 100 in the overall fidelity of DNA synthesis in E. coli, and its loss might be expected to lower overall fidelity by this amount.

Loss of proofreading activity would also be expected to affect the rate of DNA synthesis. When a nucleotide is misincorporated, a normal DNA polymerase can quickly remove it with its proofreading activity and then continue on. By contrast, a misincorporated nucleotide might have a greater effect on a DNA polymerase lacking proofreading activity, since DNA polymerase requires a base-paired primer. Thus a deficient DNA polymerase might be expected to pause or stall at each misincorporated nucleotide. Because the normal frequency of misincorporation is so low (about 1 in 10^5), it might be difficult to demonstrate such a rate change in practice.
5-29 A. The four processes that lead to high-fidelity DNA replication in E. coli are:
1. Correct pairing of the nucleotide with its complementary base in the template strand.
2. A conformational change by the polymerase, which introduces a delay that allows an incorrectly bound nucleotide additional time to
dissociate before it is added to the growing chain.
3. Removal of an incorrect nucleotide at the 3' terminus of the growing chain by the 3'-to-5' proofreading exonuclease activity of the polymerase.
4. Removal of a mispaired nucleotide deeper in the growing chain by the mismatch repair system.
B. It should come as no surprise that the strand synthesized in the 5'-to-3' direction (the standard direction) is made with high fidelity. It is surprising that the strand made in the 3'-to-5' direction is also synthesized with high fidelity. For this direction of DNA synthesis, proofreading is problematic. Synthesis using nucleoside 5' triphosphates would place the activating triphosphate at the terminus of the growing chain. Removal of an incorrect terminal nucleotide by a 5'-to-3' proofreading exonuclease would leave a terminus with a 5' phosphate or a 5' hydroxyl, neither of which would carry the activating triphosphate necessary to drive addition of the next nucleotide. In the absence of a proofreading activity, you might expect fidelity to be diminished. Mutations that compromise the proofreading activity of bacterial DNA polymerases lead to a higher rate of mutation, and strains harboring such mutations are known as mutator strains.

There are several ways that fidelity for a 3'-to-5' DNA polymerase might be improved. It is unlikely that pairing of the nucleotides with their complementary bases on the template strand could be improved, since the specificity of that interaction is limited by physicochemical properties of the nucleotides. It might be possible to improve the efficiency of the kinetic proofreading step by the DNA polymerase, perhaps by building in a longer delay, by building in multiple delays, or by having the polymerase actively check the correctness of the incipient base pair before making the phosphodiester bond. It may be possible as well to improve the efficiency of mismatch repair so that the higher frequency of mismatches that escape the polymerase can be dealt
with more effectively.

It is also possible, in principle, for a 3'-to-5' DNA polymerase to have an effective proofreading activity. This would require at least one new activity in order to regenerate a triphosphate at the terminus of the growing chain after removal of a mismatched nucleotide. A triphosphate could be regenerated by adding a diphosphate (from ATP, for example) to the 5' phosphate at the terminus, or a triphosphate to a terminal 5' hydroxyl. Such an activity might allow for a relatively normal proofreading activity by the 3'-to-5' DNA polymerase.

While the process may seem wasteful, it provides an elegant solution to the difficulty of proofreading during primer formation. To start a new primer on a piece of single-stranded DNA, one nucleotide needs to be put in place and then linked to a second, and then to a third, and so on. Even if these first nucleotides were perfectly matched to the template strand, such short oligonucleotides bind with very low affinity, and it would consequently be difficult to distinguish the correct from incorrect bases by proofreading. The task of the primase is to just get anything down that binds reasonably well and not worry about accuracy. Later, these sequences are removed and replaced by DNA polymerase, which uses the accurately synthesized DNA of the adjacent Okazaki fragment as its primer. DNA polymerase has the advantage, which primase does not have, of putting the new nucleotides onto the end of an already existing strand. The newly added nucleotide is held firmly in place, and the accuracy of its base-pairing to the next nucleotide on the template strand can be checked. Therefore, as DNA polymerase fills the gap, it can proofread the new DNA strand that it makes. What appears at first glance as energetically wasteful is really just a necessary price to be paid for accuracy.

Sequences in single-stranded DNA that can form hairpin helices are self-complementary, which means that they can base-pair and that the resulting duplex will have strands running in
opposite directions. An example of such a sequence is shown in Figure 5-48.

Figure 5-48 An example of a sequence of single-stranded DNA that could form a hairpin helix (Answer 5-31).

A. In general, proteins that are required for movement of the replication fork will display a quick-stop phenotype because the fork will be unable to progress in the absence of their function. Thus temperature-sensitive mutants of DNA topoisomerase I (inability to relieve winding tension ahead of the replication fork), SSB protein (inability to stabilize the single-stranded DNA at the fork), DNA helicase (inability to melt the DNA ahead of the replication fork), and DNA primase will display the quick-stop phenotype. Of these, only the phenotype of DNA primase is difficult to predict. DNA primase directly affects synthesis of the lagging strand, but it is not required for synthesis of the leading strand. Its quick-stop phenotype may result from either of two indirect effects: (1) exposure of sufficient single-stranded DNA to use up all the SSB protein or (2) interference with the DNA helicase, which is linked to DNA primase as part of the primosome.

Proteins that are not involved in the movement of the replication fork will display a slow-stop phenotype. Thus a temperature-sensitive initiator protein would show the slow-stop pattern of replication, since DNA molecules that had passed the initiation step before the temperature was increased would continue to replicate the chromosome until an initiation step was required in the next cycle. Similarly, a temperature-sensitive DNA ligase would show a slow-stop phenotype, since the progress of the replication fork would not be stopped. Replication would cease only during the next cycle, when the nicks were uncovered on the template strand.

The mixed extracts should be fully competent for DNA replication at 42°C; that is, the mixture should exhibit a nonmutant phenotype. The defective DNA
helicase extract would provide normal DNA ligase, and the defective DNA ligase extract would provide normal DNA helicase. Thus the entire complement of normal proteins would be present in the mixed extract. This mutual correction by extracts with different deficiencies is called complementation. Because of the extreme complexity of DNA replication and the large number of proteins involved, cell-free extracts are not capable of [...] DNA [...]. In practice, the behaviors of extracts from slow-stop mutants and from nonmutant cells are often difficult to distinguish.

Mismatch repair normally corrects a mistake in the new strand, using information in the old, parental strand. If the old strand were repaired using the new strand that contains a replication error as the template, then the error would become a permanent mutation in the genome. The old, correct information would be erased in the process. Therefore, if repair enzymes did not distinguish between the two strands, there would be only a 50% chance that any given replication error would be corrected. Overall, such indiscriminate repair would introduce the same number of mutations as would be introduced if mismatch repair did not exist.

In the absence of repair, a mismatch would persist until the next replication. When the replication fork passed the mismatch and the strands were separated, properly paired nucleotides would be inserted opposite each of the nucleotides involved in the mismatch. A normal, nonmutant duplex would be made from the strand containing the original information; a mutant duplex would be made from the strand that carried the misincorporated nucleotide. Thus the original misincorporation event would lead to 50% mutants and 50% nonmutants in the progeny. This outcome is equivalent to that of indiscriminate repair: averaged over all misincorporation events, indiscriminate repair would also yield 50% mutants and 50% nonmutants among the progeny.

Clearly, DNA polymerases must be able to extend a mismatched primer
occasionally; otherwise no mismatches would be present in the newly synthesized DNA. Most mismatches are removed by the 3'-to-5' proofreading exonuclease associated with the DNA polymerase. When the exonuclease does not remove the mismatch, the polymerase can extend the growing chain. In reality, DNA polymerization and the proofreading exonuclease are in competition with each other. In the case of bacteriophage T7 DNA polymerase, numbers are available that illustrate this competition. Normally, T7 DNA polymerase synthesizes DNA at 300 nucleotides per second, while the exonuclease removes terminal nucleotides at 0.2 nucleotides per second, suggesting that 1 in 1500 (0.2/300) correctly added nucleotides are removed by the exonuclease. When an incorrect nucleotide has been incorporated, the rate of removal increases 10-fold to 2.3 nucleotides per second, and the rate of polymerization decreases 10^5-fold to about 0.01 nucleotide per second. Comparison of these rates for a mismatched primer suggests that about 1 in 200 (0.01/2.3) mismatched primers will be extended by T7 DNA polymerase.

Reference: Johnson KA (1993) Conformational coupling in DNA polymerase fidelity. Annu. Rev. Biochem. 62, 685–713.

When DNA polymerase encounters a nick in either template (for the leading or lagging strands), the replication fork collapses, generating a double-strand break, as shown in Figure 5-49A.

When DNA polymerase encounters a thymine dimer on the template for the leading or lagging strand, the DNA polymerase stops. The polymerase on the opposite, nonblocked strand, however, continues for a while before it stops (Figure 5-49B).

The enzyme topoisomerase II is responsible for unlinking SV40 daughter duplexes. Topoisomerase II introduces a transient double-strand break into one circle and then guides the second duplex through the first before it reseals the break.

CALCULATIONS
5-37 A. A 1:12 weight ratio of nucleotides to SSB protein corresponds to an 8.8:1 ratio of
nucleotides to SSB molecules:

nucleotides/SSB molecule = (1 d nucleotide/12 d SSB) × (35,000 d SSB/SSB molecule) × (1 nucleotide/330 d nucleotide) = 8.8 nucleotides/SSB molecule

B. Since there are 10.4 nucleotides per 3.4 nm of single-stranded DNA, the 8.8 nucleotides would stretch about 3 nm. If the single-stranded DNA were fully extended, it would stretch about twice as far. The 12-nm length of an SSB molecule suggests that at saturation SSB proteins are in contact with one another and probably overlap considerably.
C. The absence of significant binding at a low SSB concentration, but nearly quantitative binding at 14-fold higher concentration, suggests that SSB protein binds cooperatively to DNA. In essence, cooperative binding means that the binding of one monomer makes it easier for additional monomers to bind. If the monomers overlap with one another when bound, as suggested by the calculation in part B, cooperativity likely arises because each monomer has two binding sites: one for DNA and one for other monomers. Binding of the first monomer to DNA will be weak because it can bind only through its DNA-binding site. By binding adjacent to bound monomers, subsequent monomers can make use of both their binding sites. Mathematically, this type of interaction leads to a steep dependence of binding on concentration.

Reference: Alberts BM & Frey L (1970) T4 bacteriophage gene 32: a structural protein in the replication and recombination of DNA. Nature 227, 1313–1318.

Figure 5-49 Consequences of replication across a damaged site on the templates for the leading and lagging strands (Answer 5-35). (A) Nicks. (B) Thymine dimers.

5-38 Since both strands of the E. coli genome must be copied, a total of 9.2 × 10^6 nucleotides must be polymerized. Polymerization of nucleoside triphosphates into DNA consumes two high-energy phosphoanhydride bonds for each nucleotide added: the nucleoside triphosphate is hydrolyzed to add the nucleoside
monophosphate to the growing strand, and the released pyrophosphate is hydrolyzed to phosphate. Therefore 1.8 × 10^7 high-energy phosphate bonds are hydrolyzed during each round of replication. Since each glucose can provide 30 high-energy phosphate bonds, 6 × 10^5 (1.8 × 10^7/30) molecules are required to provide sufficient energy for one round of replication. At 180 d/molecule, 6 × 10^5 glucoses would have a mass of 1.8 × 10^-16 g [(6 × 10^5 molecules) × (180 d/molecule) × (1 g/6 × 10^23 d)]. This amount of glucose is roughly 0.02% (1.8 × 10^-16 g glucose/1 × 10^-12 g E. coli) of the mass of one E. coli cell.

DATA HANDLING
5-39 A. A replication bubble with two replication forks moving in opposite directions is shown in Figure 5-50. Arrowheads show the 3' ends of the growing chains.

Figure 5-50 A labeled diagram of a replication bubble (Answer 5-39).

Figure 5-51 Structures that were and were not observed in replicating bacteriophage lambda molecules (Answer 5-39). (A) Observed structures. (B) Structures not seen. The frequencies of the different observed structures (percent of total forms observed) are indicated at the left.

B. The four structures that were most commonly observed, along with those that were never seen, are indicated in Figure 5-51. Based on the mechanism of bidirectional DNA replication, you might well have picked structure 5 as the most common because it has two single-stranded regions, corresponding to the expected sites of lagging-strand synthesis. Structure 8 is also a reasonable expectation; it appears that lagging-strand synthesis at one end has not yet filled in the single-strand gap. It is somewhat unexpected that structures like 1 and 3, which have no single-stranded DNA at one or both forks, were observed so commonly. These structures could result from asynchronous growth at the fork, such that the leading strand pauses until the lagging strand catches up, from
some artifact of the preparation of DNA for viewing, or from single-stranded regions that were too short to be distinguished reliably from double-stranded DNA.

Reference: Inman RB & Schnos M (1971) Structure of branch points in replicating DNA: presence of single-stranded connections in lambda DNA branch points. J. Mol. Biol. 56, 319–325.

DNA isolated from your starting cells has a heavy density, as you would expect. After one generation in normal, light medium, the DNA has uniformly shifted to medium density: synthesis of the new DNA from light nucleotides results in hybrid DNA molecules that contain one heavy maternal strand and one light newly synthesized strand (Figure 5-52). After another round of replication in light medium, two forms of DNA appear in about equal proportions: one form is again a hybrid of a light and a heavy strand and has medium density, while the other form is composed of two light strands and has a low density (Figure 5-52). During the subsequent rounds of replication, more light DNA is formed and the proportion of medium-density DNA diminishes. Your results are therefore in complete agreement with the hypothesis that you set out to test.

The different labels used for the T and C nucleotides make it easy to measure their respective losses from the polymeric substrate. The radioactive disintegrations from the energetic 32P and from the rather weak 3H can be distinguished using a liquid scintillation counter.

Because DNA polymerase I possesses an exonuclease activity (that is, it removes nucleotides from the ends of strands), the T nucleotides cannot be released until all the C nucleotides have been removed; hence the lag.

When dTTP is added to the reaction, polymerization will begin just as soon as a proper A-T nucleotide pair is uncovered at the end of the primer. Polymerization does not occur efficiently from a mismatched A-C pair. Since the rate of polymerization exceeds the rate of exonuclease digestion by two or three orders of magnitude, the labeled Ts will be buried quickly
and will be unavailable to the exonuclease.

Figure 5-52 Change in density of DNA from cells grown initially in heavy medium and then for various numbers of generations in light medium (Answer 5-40). L, M, and H refer to light, medium, and heavy density, respectively.

The results will not be affected by the presence of dCTP. Since the template is poly(dA), C nucleotides cannot be incorporated. If they are incorporated by mistake, the mismatched C will not serve as a primer for polymerization.

Reference: Brutlag D & Kornberg A (1972) Enzymatic synthesis of deoxyribonucleic acid. 36. A proofreading function for the 3' to 5' exonuclease activity in deoxyribonucleic acid polymerases. J. Biol. Chem. 247, 241–248.

These results indicate that RNA priming of DNA synthesis begins at specific sites on M13 DNA. Because several specific single-stranded DNA segments were generated, there must be a corresponding number of specific sites at which RNA priming begins. If RNA priming were totally random, then every length of newly synthesized DNA would be generated, which would show up as a smear upon electrophoresis.

A comparison of the template sequences and the product sequences shows that the product DNA chains start five nucleotides in from the left end of the template sequences, as shown in Figure 5-53. Remember that the strands in a double helix run in opposite directions. Since the RNA primers are five nucleotides long, they start an additional five nucleotides to the right. At this point in each template there is a conserved sequence, 5'-GPyT-3' (Py stands for pyrimidine). The RNA primers start with a purine nucleotide opposite the unspecified pyrimidine in the template. A more extensive comparison of sequences suggests that these three nucleotides are sufficient for priming by the components used in these experiments, which were from bacteriophage T4.

Reference: Cha TA & Alberts BM (1986) Studies of the DNA helicase–RNA primase unit from bacteriophage T4. A
trinucleotide sequence on the DNA template starts RNA primer synthesis. J. Biol. Chem. 261, 7001–7010.

ATP hydrolysis is required for unwinding because energy is needed to melt DNA. Strand separation is energetically unfavorable because stacking interactions between the planar base pairs are largely lost upon strand separation; in addition, the hydrogen bonds that link the bases present a kinetic barrier to strand separation.

Since DnaB melts off only the 3' fragment of substrate 3 (lanes 9 and 10, Figure 5-11), it must bind to the long single strand and move along it in the 5'-to-3' direction. When it reaches the double-stranded region formed by the 3' fragment, it unwinds the fragment. The 5'-to-3' movement of DnaB suggests that it unwinds the parental duplex at the replication fork by moving along the template for the lagging strand.

If DnaB moves in the 5'-to-3' direction, why does it not melt the 5' fragment off substrate 3 by binding to the short 5' tail? Pat yourself on the back if you wondered about this. In real experiments, a small amount of the 5' fragment is melted off. This is a rarer event because of the difference in target size: DnaB is much more likely to bind to the long single strand.

If SSB is added first, it inhibits DnaB-mediated unwinding because it coats the single-stranded DNA, preventing DnaB from binding. By contrast, if SSB is added after DnaB has bound, it stimulates unwinding by preventing the unwound DNA from reannealing.

Reference: LeBowitz JH & McMacken R (1986) The Escherichia coli DnaB replication protein is a DNA helicase. J. Biol. Chem. 261, 4738–4748.

There is a similar increase in reversions of the LacZ allele in the R orientation in mismatch-repair-deficient (1.9-fold) and proofreading-deficient (2.5-fold) strains of E. coli. For the Rif gene, however, there was no difference in mismatch-repair-deficient (1.1-fold) and proofreading-deficient (1.0-fold) strains, as expected. Since reversion results from misincorporation of G opposite T, which occurs on the leading strand in orientation R, see Figure
5-12, leading-strand DNA synthesis appears to be less accurate than lagging-strand synthesis.

B. The reason for the apparent difference in fidelity of DNA synthesis on the leading and lagging strands is not clear. Four different alleles of LacZ were tested in this study; all showed the same 2- to 5-fold lower fidelity of synthesis on the leading strand. The difference is unlikely to be due to the intrinsic properties of the DNA polymerase, since the same polymerase is used to make both strands. If you thought about transcription of the LacZ gene, good for you. But transcription occurs at a very low level under the conditions used here, and the direction of transcription did not correlate with mutation frequency for the four alleles that were studied.

The authors suggested the following explanation. Because the polymerase on the lagging strand must dissociate and rebind each time it comes to the end of an Okazaki fragment, it might dissociate with greater ease than the polymerase on the leading strand. If the polymerase on the lagging strand dissociated from mismatches more readily as well, the mismatched primer would be exposed more often on the lagging strand than on the leading strand. They argue that such an exposed mismatch might be subject to repair by other 3'-to-5' exonucleases in the cell. In effect, this would mean that the lagging strand has two ways to repair a mismatched primer, while the leading strand has only one. Such an explanation might account for the difference in fidelity, but it remains to be proven.

Reference: Fijalkowska IJ, Jonczyk P, Tkaczyk MM, Bialoskorska M & Schaaper RM (1998) Unequal fidelity of leading strand and lagging strand DNA replication on the Escherichia coli chromosome. Proc. Natl. Acad. Sci. USA 95, 10020-10025.

Figure 5-53 Start sites and required sequences for RNA priming (Answer 5-42).

THE
INITIATION AND COMPLETION OF DNA REPLICATION IN CHROMOSOMES

DEFINITIONS
5-45 S phase
5-46 Origin recognition complex (ORC)
5-47 Replication origin
5-48 Telomerase

TRUE/FALSE
5-49 True. See, for example, Figure 5-50.

5-50 True. Consider a single template strand, with its 5' end on the left and its 3' end on the right. No matter where the origin is, synthesis to the left on this strand will be continuous (leading strand) and synthesis to the right will be discontinuous (lagging strand). Thus, when replication forks from adjacent origins collide, a rightward-moving lagging strand will always meet a leftward-moving leading strand.

5-51 True. Experiments using the thymidine analog bromodeoxyuridine (BrdU) to label newly synthesized DNA in synchronized cell populations show that different regions of each chromosome are replicated in a reproducible order during S phase.

5-52 False. If one origin is deleted, the adjacent DNA, which would normally be replicated from that origin, will be replicated instead from a neighboring origin. Thus, replication of the DNA adjacent to the deleted origin may occur somewhat later than normal, but it will occur.

THOUGHT PROBLEMS
5-53 As always, you come through with flying colors. Although you were initially bewildered by the variety of structures, you quickly realized that H forms were just like the bubbles, except that cleavage occurred within the bubble instead of outside it. Next, you realized that by reordering the molecules according to the increasing size of the bubble, and flipping some structures end-for-end, you could present a convincing visual case for bidirectional replication away from a unique origin of replication (Figure 5-54). The case for bidirectional replication is clear, since unidirectional replication would give a set of bubbles with one end in common. Replication from a unique origin is likely but not certain, because you cannot rule out the possibility that there are two origins on either side of, and equidistant from, the restriction site used to linearize the DNA. Repeating the experiment using a different restriction nuclease will resolve this issue and define the exact position of the origins on the viral DNA. Your advisor is pleased.

Figure 5-54 Bidirectional replication from a unique origin (Answer 5-53).

5-54 F. Each newly synthesized strand in a daughter duplex was synthesized by a mixture of continuous and discontinuous DNA synthesis from multiple origins. Consider a single replication origin. The fork moving in one direction synthesizes a daughter strand continuously as part of leading-strand synthesis. The fork moving in the opposite direction synthesizes a portion of the same daughter strand discontinuously as part of lagging-strand synthesis.

5-55 E. The two daughter chromosomes will be shorter at opposite ends. This outcome is illustrated in Figure 5-55, which shows replication from a single origin to the ends of the chromosome. Multiple origins make no difference to the outcome. The leading strand can continue all the way to the end of the chromosome, but the lagging strand cannot. The very last RNA primer cannot be replaced by DNA because there is no upstream primer for DNA polymerase to extend.

Figure 5-55 Consequences of the end-replication problem for daughter duplexes (Answer 5-55).

CALCULATIONS
5-56 A. The approximate positions of the origins of replication and their associated replication forks are labeled in Figure 5-56 on a schematic diagram of the electron micrograph.

The distance between replication forks 4 and 5 is about 0.3 µm (300 nm), which corresponds to about 880 nucleotides (300 nm / 0.34 nm per nucleotide). If replication forks 4 and 5 were each traveling at 50 nucleotides/second, they would collide in about 9 seconds [880 / (50 × 2)]. Replication forks 7 and 8 are moving in opposite directions and would therefore never collide.

5-57 D. If there were no time constraints on replication, one origin would be required for each chromosome; thus, a minimum of 46 origins (equal to the number of chromosomes in a human cell) would be needed. In 8 hours (28,800 seconds) the two replication forks from one origin would synthesize 2.88 × 10^6 nucleotides (2 forks × 50 nucleotides/second × 2.88 × 10^4 seconds). To replicate the entire genome would require about 2200 equally spaced origins (6.4 × 10^9 nucleotides / 2.88 × 10^6 nucleotides per origin = 2222 origins). It is estimated that the human genome has about 10,000 origins of replication, more than enough to finish replication within the time allotted in the cell cycle.

DATA HANDLING
5-58 A. The regions of the tracks that are dense with silver grains correspond to those segments of DNA that were replicated when the concentration of the label was high. The less dense regions mark segments of DNA that were replicated when the concentration of label was low. The difference in the arrangements of the dark and light sections of the tracks derives from the difference in the labeling schemes in the two experiments. In the first experiment (see Figure 5-15A), 3H-thymidine was added [...] so that replication was initiated at origins in the presence of label, giving a continuous dark section on both sides of the origin. When the concentration of label was lowered, replication proceeded in both directions away from the origin, leaving light sections at both ends of the dark sections. In the second experiment (see Figure 5-15B), replication began at origins in the absence of 3H-thymidine, so that the origin was unlabeled. Addition of a high concentration of label followed [...]; they are linked by the unlabeled, and therefore invisible, segment that contains the replication origin.

The approximate rate of fork movement can be estimated from the labeling times and the lengths of the labeled sections. In the first experiment, segments roughly 100 µm in length were labeled during the 45-minute labeling
period. Because two replication forks were involved in synthesizing each labeled segment, each replication fork synthesized about 50 µm of DNA in 45 minutes. Therefore, the rate of fork movement is about 1.1 µm/min (50 µm / 45 min). In the second experiment, segments roughly 50 µm in length were labeled; however, each was synthesized by only one replication fork. Thus, the rate of fork movement was also about 1.1 µm/min.

These data are not sufficient to estimate the time required to replicate the entire genome. The missing information is the number of active origins of replication and their distribution. Assuming that all origins are activated at the same time and all forks move at the same rate, the minimum time required to replicate the genome, regardless of its size, is fixed by the distance between the two origins that are farthest apart.

Reference: Huberman JA & Riggs AD (1968) On the mechanism of DNA replication in mammalian chromosomes. J. Mol. Biol. 32, 327-341.

Figure 5-56 Replication bubbles in a Drosophila chromosome (Answer 5-56). The schematic diagram of the electron micrograph shows the origins of replication and the replication forks.

In addition to its site-specific DNA binding, T-antigen possesses an ATP-dependent DNA helicase activity, which is required to unwind DNA. The presence of T-antigen at the forks is also consistent with its activity as a helicase. The ability of T-antigen to bind SV40 origins specifically and unwind them is a natural first step in DNA replication. These activities expose single-stranded regions so that primases and DNA polymerases can gain access to the DNA.

To map the unwound regions, you can digest the DNA with a restriction nuclease that cuts at a defined location relative to the origin. Since the two ends of a linear molecule cannot be distinguished in the electron microscope, at least two different restriction digestions are required to map the unwound regions unambiguously. If unwinding starts outside the origin, the bubble will not normally include the origin
(Figure 5-57A). If unwinding starts at the origin, the bubble will always include the origin. If such unwinding occurs in just one direction, one end of the bubble will always coincide with the origin (Figure 5-57B). If unwinding proceeds in both directions at the same rate, the center of the bubble will always coincide with the origin (Figure 5-57C). The actual experimental results indicate that unwinding occurs in both directions at approximately the same rate.

Figure 5-57 The use of restriction digestion to map the site of unwinding by T-antigen relative to the SV40 origin of replication (Answer 5-59). (A) Unwinding not started at origin. (B) Unidirectional unwinding. (C) Bidirectional unwinding. The ends of the linear molecule were generated by digestion with a restriction nuclease.

Topoisomerase I is required for unwinding closed circular DNA in order to relieve overwinding strain in the duplex part of the molecule. In a covalently closed circle, the removal of duplex winding in one segment causes the rest of the duplex to become overwound, which is energetically unfavorable. The transient nicking-closing activity of topoisomerase I allows this winding tension to be relieved.

If T-antigen initiated replication multiple times from an SV40 origin integrated in a chromosome, a complex multistranded structure would be produced. Three rounds of initiation at a chromosomal origin are illustrated in Figure 5-58. Apparently, a chromosomal SV40 origin in the presence of T-antigen replicates in a manner similar to that shown in Figure 5-58. The process is termed onionskin replication because the resulting structure has a layered appearance reminiscent of an onion.

Figure 5-58 Multiple initiation events at a chromosomal SV40 origin of replication (Answer 5-59). The first, second, and third rounds of initiation are shown.

Reference: Dodson M, Dean FB, Bullock P, Echols H & Hurwitz J (1987) Unwinding of duplex DNA from the SV40 origin of replication by T-antigen. Science 238, 964-967.

The three density peaks represent, from light to heavy, unreplicated DNA, once-replicated DNA, and twice-replicated DNA (Figure 5-59). The injected DNA is labeled with 3H but is otherwise normal DNA, which is light. Each newly synthesized strand incorporates 32P label and BrdU, which increases its density. Thus, after one round of replication the DNA will be intermediate in density, containing one light, 3H-labeled strand and one heavy, 32P-labeled strand. After the second round of replication the hybrid DNA will give rise to one hybrid-density duplex and to one duplex that contains two 32P-labeled heavy strands. The fully heavy duplex will appear at the densest position in the gradient. Since the fully light DNA is unreplicated, it will contain no 32P label. Since the fully heavy DNA contains two new strands, it will contain no 3H label.

The formation of discrete peaks in this experiment makes an important point: most of the observed labeling is due to replication and not to repair synthesis. If the incorporation of label were due to repair synthesis, which is patchy, the label would be smeared through the gradient rather than concentrated in discrete peaks.

The injected DNA mimics the expected behavior of chromosomal DNA in one very important way: it undergoes a single round of replication in one cell cycle. This behavior is apparent from the lack of fully heavy DNA after one cell cycle. In another way, the injected DNA behaves very differently from chromosomal DNA: a large fraction of the injected DNA does not replicate, even after two cell cycles. The lack of replication is apparent from the persistence of a fully light peak of DNA. It is not clear why some of the injected DNA does not replicate. Perhaps some of the eggs have been rendered incompetent for replication by the experimental protocol.

Since cycloheximide is an inhibitor of protein synthesis, it should have no direct effect on the synthesis of DNA.
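The density bookkeeping used in this answer (light, hybrid, and heavy duplexes) can be checked with a short sketch. It is only an illustration under simplifying assumptions: it starts from a single fully light duplex, every duplex replicates in every round, and every new strand is heavy. The function names and the "L"/"H" strand codes are hypothetical and are not taken from the experiment.

```python
from collections import Counter


def replicate(duplexes):
    """One round of semiconservative replication in heavy (BrdU) medium.

    Each duplex is a pair of strands, "L" (light, 3H) or "H" (heavy, 32P).
    Every parental strand is kept and paired with a new heavy strand.
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "H"))
        daughters.append((strand_b, "H"))
    return daughters


def density_classes(rounds):
    """Count light, hybrid, and heavy duplexes after a number of rounds."""
    duplexes = [("L", "L")]  # assumed starting point: one injected light duplex
    for _ in range(rounds):
        duplexes = replicate(duplexes)
    names = {0: "light", 1: "hybrid", 2: "heavy"}
    return Counter(names[duplex.count("H")] for duplex in duplexes)


for n in range(3):
    print(n, dict(density_classes(n)))
```

Under these assumptions one round gives only hybrid duplexes and two rounds give equal numbers of hybrid and heavy duplexes, which is why fully heavy DNA signals a second round of replication. The persistent light peak seen in the real experiment corresponds to injected DNA that never replicates, which this sketch deliberately leaves out.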
Indeed, in the presence of cycloheximide one round of replication is completed normally, although no more occur thereafter. Cycloheximide apparently blocks further progress through the cell cycle because a key cell-cycle event depends on protein synthesis. The important point is that the DNA will not replicate again unless the cells progress through the cell cycle.

Reference: Harland RM & Laskey RA (1980) Regulated replication of DNA microinjected into eggs of Xenopus laevis. Cell 21, 761-771.

Hybridization at the 4.5-kb position is due to plasmid molecules that were not replicating at the time DNA was isolated. The intensity of this spot indicates that the majority of plasmid molecules were not replicating. The low frequency of replicating molecules, even during S phase, was one of the contributing factors in the difficulty of proving that an ARS was an origin of replication.

The results in Figure 5-19 indicate that ARS1 behaves as an origin of replication. The gel pattern with BglII-digested DNA looks like the pattern due to replicating molecules with two branches (see Figure 5-18C). The gel pattern with PvuI-digested DNA looks much like the pattern for replication intermediates with symmetrically located replication bubbles (see Figure 5-18B). The very short tail on the spot at 9 kb in Figure 5-19B indicates that the replication bubbles are slightly asymmetrically situated. These gel patterns are exactly what would be expected if replication began at ARS1. As shown in Figure 5-60, cleavage with BglII, which cuts at ARS1, would generate molecules with two branches. Cleavage with PvuI, which cuts almost halfway around the circle from ARS1, generates molecules with nearly symmetric replication bubbles.

The discontinuity in the arc of hybridization of PvuI-cut plasmids (see Figure 5-19B) results from the difference in migration of bubble forms and branched forms. Molecules that have just begun replicating will be converted to bubble forms by PvuI cleavage, whereas molecules replicated past the
PvuI site will be converted to branched forms. Thus, a replicating molecule that is cleaved either has a bubble or it is branched; there is no intermediate. Since the two forms migrate differently, there is a gap in the electrophoretic pattern.

Reference: Brewer BJ & Fangman WL (1987) The localization of replication origins on ARS plasmids in S. cerevisiae. Cell 51, 463-471.

Figure 5-59 Schematic representation of strand labeling (Answer 5-60). Light density: two 3H strands. Hybrid density (after the first round of replication): one 3H strand and one 32P strand. Heavy density (after the second round of replication): two 32P strands.

Figure 5-60 Conversion of replicating plasmid molecules into linear forms with two branches by BglII or with replication bubbles by PvuI (Answer 5-61).

The DNA from an amplified cluster is an onionskin structure, as illustrated in Figure 5-61. When examined by electron microscopy, DNA from late-stage follicle cells shows multiple nested replication forks, just as expected for this mechanism of amplification.

If every origin were activated, each round of replication would double the number of chorion genes. Therefore, it would take six rounds of replication to achieve a 60-fold amplification (2^6 = 64).

Given the overreplication of the chorion gene cluster, the 510-nucleotide amplification-control element is probably an origin of replication. It cannot, however, be a standard origin; it must also contain a sequence that allows it to escape the block to rereplication in follicle cells at specific stages. It is known that ORC binds throughout the nucleus until a specific stage of development, at which it is cleared from all origins except those that are to be amplified. The clearing of ORC from most origins and its continued binding at amplification sites are dependent on the activities of other proteins, but the details of these processes are not yet defined.

References: Orr-Weaver TL & Spradling AD (1986)
Drosophila chorion gene amplification requires an upstream region regulating s18 transcription. Mol. Cell. Biol. 6, 4624-4633.

Royzman I, Austin RJ, Bosco G, Bell SP & Orr-Weaver TL (1999) ORC localization in Drosophila follicle cells and the effects of mutations in dE2F and dDP. Genes Dev. 13, 827-840.

ORC binding protects two neighboring locations on the origin DNA, as indicated by blank regions where bands that were visible in the absence of ORC are missing in its presence (Figure 5-62, regions marked with P). Note that a couple of marked bands are more intense when ORC is bound, indicating that they are more accessible to DNase I in the complex than in native DNA.

ATP is required for binding by both the wild-type and the mutant ORCs. For the wild-type ORC, about 100 nM ATP gives full binding (Figure 5-62, lane 3); for the mutant ORC, a whopping 10 mM (10^5 more than for wild-type ORC) is required (lane 14).

Because exactly the same results were obtained with ATP and the nonhydrolyzable analog ATPγS, ATP hydrolysis cannot be required for ORC binding. The authors of this study suggest that ATP hydrolysis is important for a subsequent step in the complex process that enables an origin for replication.

The Walker motif in Orc1 is important to the function of ORC, as shown by the dramatically different binding results when the motif is mutated. The binding of mutant ORC at very high ATP concentrations suggests that the mutation lowers the affinity of Orc1 for ATP but probably does not compromise other functions of the protein.

Reference: Klemm RD, Austin RJ & Bell SP (1997) Coordinate binding of ATP and origin DNA regulates the ATPase activity of the origin recognition complex. Cell 88, 493-502.

A peak represents a high frequency of nascent strands, just what would be expected in the neighborhood of an origin of replication. Thus, the two peaks in Figure 5-22B probably represent two origins of replication. The symmetrical decline in nascent strands on each side of each origin is consistent with
bidirectional replication. Unidirectional replication would give peaks that are sharply defined on one side, with a gradual decline on the other. The different heights of the peaks suggest that the two origins are not equally active; that is, that the higher peak represents a more active origin, one that initiates more frequently.

B. The data in Figure 5-22B are not consistent with replication beginning anywhere within the initiation zone. Rather, they suggest that an initiation zone may represent a cluster of distinct origins.

Reference: Kobayashi T, Rein T & DePamphilis ML (1998) Identification of primary initiation sites for DNA replication in the hamster dihydrofolate reductase gene initiation zone. Mol. Cell. Biol. 18, 3266-3277.

Figure 5-61 Onionskin structure of an amplified cluster of chorion genes (Answer 5-62). Three initiation events are illustrated; six would be required to amplify the chorion gene cluster 60-fold.

5-65 A. Most of the SV40 DNA is not replicated, as is clear from comparison of the CAF1-treated samples (Figure 5-23). CAF1 treatment moves most of the replicated, labeled DNA to the supercoiled position but does not significantly alter the distribution of bulk DNA, as indicated on the stained gel. The small increase in stained DNA at the supercoiled position in the CAF1-treated sample shows how little of the total DNA has been replicated.

[...] absence of any significant effect on the bulk stained DNA (Figure 5-23).

Because CAF1 specifically targets replicated DNA for assembly into nucleosomes, the replicated DNA must bear some mark of replication. Further experiments by the authors of this study identified the mark as the sliding clamp, PCNA, which tethers DNA polymerase to the duplex. The interaction between the clamp and CAF1 is clearly
useful, allowing nucleosome assembly to occur immediately in the wake of the DNA polymerase.

Reference: Shibahara K & Stillman B (1999) Replication-dependent marking of DNA by PCNA facilitates CAF-1-coupled inheritance of chromatin. Cell 96, 575-585.

Figure 5-62 ORC binding sites on origin DNA as revealed by DNase I footprinting (Answer 5-63). P indicates protected areas; marked bands indicate enhanced cleavage.

A. The region of intense hybridization to telomeres in the unaffected spores (1 and 3) extends from less than 200 nucleotides to just over 300 nucleotides, averaging about 250 nucleotides. Since the cleavage site is 35 nucleotides from the beginning of the telomere repeats, the average length of telomere repeat in fission yeast is just over 200 nucleotides.

The descendants of spores 2 and 4 show telomere shortening with time, whereas the descendants of spores 1 and 3 remain the same size. Thus, spores 2 and 4 appear to lack telomerase, and it looks as though your identification of the fission yeast telomerase gene was correct. It is worth noting that a number of genes in yeast cause a similar telomere-shortening phenotype, but only one of them encodes the catalytic subunit of telomerase. Such genes are known as Est genes, for "ever shorter telomeres."

Although it is somewhat difficult to estimate precisely, it looks as though telomeres lose about 60 nucleotides every 3 days. At four generations per day (24 hours per day / 6 hours per generation), the yeast go through about 12 generations in 3 days. Thus, they lose about 5 nucleotides per generation (60 nucleotides / 12 generations).

This problem doesn't give you a firm basis for a prediction, but in fact the majority of fission yeast that lose their telomeres stop dividing but continue to grow in size, forming abnormally large cells.

Reference: Nakamura TM, Morin GB, Chapman KB, Weinrich SL, Andrews WH, Lingner J, Harley CB & Cech TR (1997) Telomerase catalytic subunit homologs from fission yeast and human. Science 277, 955-959.

DNA REPAIR

DEFINITIONS
5-67
Nonhomologous end-joining
5-68 DNA repair
5-69 Homologous recombination

TRUE/FALSE
5-70 False. Repair of damage to a single strand, by base excision repair or nucleotide excision repair, for example, depends on just the two copies of genetic information contained in the two strands of the DNA double helix. By contrast, precise repair of damage to both strands of a duplex, a double-strand break, for example, requires information from a second duplex, either a sister chromatid or a homolog.

5-71 True. Both spontaneous depurination and removal of deaminated C by uracil DNA glycosylase leave a sugar that is missing its base, which is the substrate recognized by AP endonuclease.

5-72 True. The initial steps, including recognition of damage and DNA incision, are specific for repair, whereas the later steps tend to be catalyzed by enzymes such as helicases, DNA polymerases, and ligases, whose activities are common features of DNA metabolism.

THOUGHT PROBLEMS
5-73 The statement is incorrect. DNA defects introduced by deamination and depurination reactions occur spontaneously. They do not arise from replication errors and are therefore equally likely to occur on either strand. If DNA repair enzymes recognized such defects only on newly synthesized DNA strands, half of the defects would go uncorrected. Also, there is no fundamental reason to link such repair events to replication. The bases produced by deamination and depurination are distinct from the normal bases and can be recognized in any sequence context. By contrast, misincorporation during replication adds normal bases that are mispaired. The only way to identify them correctly is to search the newly synthesized strand.

5-74 At many sites in vertebrate cells, the sequence 5'-CG-3' is selectively methylated on the cytosine ring. Spontaneous deamination of methyl-C gives T. A special DNA glycosylase recognizes a mismatched base pair involving T in the sequence TG and removes the T. This DNA repair
mechanism is not 100% effective, as methylated C nucleotides are common sites for mutation in vertebrate DNA. Over time, the enhanced mutation of CG dinucleotides has led to their preferential loss, accounting for their underrepresentation in the human genome.

5-75 Repair of a double-strand break by homologous recombination requires an intact homologous chromosome as a template for repair. In a haploid cell in G1, each chromosome is present in only one copy. Thus, when a break occurs in G1, there is no intact homologous template to use for repair. In haploid cells in G2, there are two copies of each chromosome (sister chromatids), so that a broken copy can be repaired from the intact sister chromatid.

5-76 The variable in these experiments is light: the brighter the light, the less the observed killing. Thus, visible light can reverse the effects of UV irradiation. Direct reversal of UV damage is common in microorganisms and is called enzymatic photoreactivation. The enzyme from E. coli has two chromophores that cooperate in capturing photons from sunlight and using their energy to unlink pyrimidine dimers.

The account here is not much different from the original discovery of photoreactivation by Albert Kelner in the 1940s. While investigating the effects of postirradiation temperature on UV survival, Kelner was plagued by another variable. In his own words: "Careful consideration was made of variable factors which might have accounted for such tremendous variation. We were using a glass-fronted water bath placed on a table near a window in which were suspended transparent bottles containing the irradiated spores. The fact that some of the bottles were more directly exposed to light than others suggested that light might be a factor. Experiments showed that exposure of UV-irradiated suspensions to light resulted in an increase in survival rate or a recovery of 100,000- to 400,000-fold. Controls kept in the dark showed no recovery at all."

Reference: Friedberg EC, Walker GC & Siede W (1995) DNA Repair
and Mutagenesis, pp. 92-103. New York: WH Freeman.

CALCULATIONS
5-77 The average distance between the centers of the Ku dimers is 65 nm, which is equal to the width of about 8 Ku dimers. The average distance from the edge of one dimer to the next is equal to the width of about 7 Ku dimers. This means that usually there will be a Ku dimer within half that distance (within 4 Ku diameters) of any potential double-strand break. Having Ku dimers in such close proximity suggests that double-strand breaks will be rapidly recognized.

The volume of the nucleus is 1.13 × 10^11 nm^3 [(4/3)π × (3000 nm)^3]. The nuclear volume per Ku dimer is 2.8 × 10^5 nm^3 [1.13 × 10^11 nm^3 / (4 × 10^5)], which is equal to a cube 65 nm on a side [(2.8 × 10^5 nm^3)^0.33]. If you imagine a Ku dimer in the middle of each cube, the average separation of their centers will be 65 nm, or the equivalent of 8 times the width of a Ku dimer.

Reference: Lieber MR, Ma Y, Pannicke U & Schwarz K (2003) Mechanism and regulation of human nonhomologous DNA end-joining. Nat. Rev. Mol. Cell Biol. 4, 712-720.

If the inaccurately repaired breaks were randomly distributed around the genome, then 2% of them would be expected to alter crucial coding or regulatory information. Thus, the functions of about 40 genes (0.02 × 2000) would be compromised in each cell, although the specific genes would vary from cell to cell. Because not all genes are expressed in every cell, gene mutations in some cells would be without consequence. In addition, because the human genome is diploid, the effect of mutations in expressed genes would be mitigated by the remaining allele. For most loci, one functional allele (50% of normal protein) is adequate for normal cell function; however, for some loci 50% is not adequate. Thus, the mutations would be expected to compromise the functions of some cells.

Reference: Lieber MR, Ma Y, Pannicke U & Schwarz K (2003) Mechanism and regulation of human nonhomologous DNA end-joining. Nat. Rev. Mol. Cell Biol. 4, 712-720.

DATA HANDLING
5-79 A. The extreme UV sensitivity of UvrARecA double mutants
relative to cells with mutations in two Uvr genes suggests that there are two separate pathways for dealing with UV damage. The Uvr gene products are involved in one pathway, whereas RecA is involved in a different pathway. As a rule of thumb, if a combination of mutant genes produces a phenotype that is no more defective than those of the individual mutant genes, the gene products are likely to act in the same pathway.

A lethal hit in the UvrARecA strain corresponds to about one pyrimidine dimer. The number of pyrimidine dimers per lethal hit can be calculated as follows. Since E. coli is 50% GC, all four bases are equally represented in the genome. If they were arranged randomly (which they are not, but this assumption is a reasonable approximation), then of the 16 possible dinucleotide pairs in DNA, one-quarter would be pyrimidine pairs. Therefore, the E. coli genome (4.6 × 10^6 base pairs) contains 1.2 × 10^6 possible UV targets. Given that a dose of 400 J/m^2 converts 1% of the pyrimidine (pyr) pairs into pyrimidine dimers, the number of pyrimidine dimers per lethal hit in E. coli is

pyr dimers / lethal hit = (1.2 × 10^6 pyr pairs / E. coli) × (0.04 J/m^2 / lethal hit) × (1 pyr dimer / 100 pyr pairs) × (1 / 400 J/m^2) = 1.2 pyr dimers / lethal hit

Excision repair of UV damage, which is initiated by the UvrABC endonuclease, is very accurate. It is the only pathway that functions in the RecA-deficient strain, thereby accounting for the low frequency of mutations. RecA participates in an error-prone pathway (one component of the SOS response) that yields the high frequency of mutations in the UvrA-deficient strain. Since the wild-type strain yields only 1% of the frequency of mutations that the UvrA-deficient strain does, the Uvr pathway of repair must predominate in wild-type E. coli.

Incorporation of adenine nucleotides opposite pyrimidine dimers should yield only one-third of the number of mutations generated by random incorporation: incorporation of a random nucleotide at each ambiguous site
would be correct only one out of four times (25% correct). By contrast, all-A incorporation would be correct 75% of the time, because Ts account for 75% of the pyrimidines in pyrimidine dimers. All the As incorporated opposite CC (10%) would be incorrect, and half the As incorporated opposite TC and CT (15%) would be incorrect, for a total of 25% incorrect compared with 75% for random incorporation. Thus, A incorporation is a good strategy for dealing with UV-induced DNA damage.

The fairly even distribution of frameshift mutations indicates that UV damage is distributed throughout the gene. A frameshift mutation anywhere in the LacI gene would disrupt the function of the gene it was fused to and therefore would be scored in the gene fusion assay. If UV damage were evenly distributed, as it is expected to be, the nonrandom distribution of missense mutations in the LacI gene would indicate that the ends of the LacI protein are more critical to its function than the middle. Most mutations in the ends yield a nonfunctional protein, which is the basis for their detection as mutants. The middle of the gene, being less critical for function, can accommodate some alterations and still produce a functional protein. These silent mutations would not be detected in an assay that depends on loss of function.

The common deletion of one nucleotide in response to UV damage is thought to occur as a mistake during DNA synthesis opposite a pyrimidine dimer. Presumably, in response to the abnormal spacing of bases caused by the pyrimidine dimer, DNA polymerase inserts a single nucleotide rather than two. The frameshift hot spots are hot because they contain runs of Ts and therefore multiple possibilities for dimer formation.

Reference: Miller JH (1985) Mutagenic specificity of ultraviolet light. J. Mol. Biol. 182, 45-65.

The patterns of radioactivity in Figure 5-27B, lanes 6 and 10, show that the XPV enzyme can elongate the labeled primer; thus, it is a DNA polymerase. This enzyme is known as DNA polymerase η. If the XPV enzyme simply helped a normal polymerase to overcome a chemical block, for example, it would
not be able to elongate the primer by itself.

Both the XPV enzyme (DNA polymerase η) and DNA polymerase α can elongate the primer on an undamaged template. As shown by the bands just below the full-length band (see Figure 5-27B, lanes 6 and 10), DNA polymerase η did not always make a full-length product, whereas DNA polymerase α did.

On the template containing the cyclobutane dimer, DNA polymerase η can still synthesize a product that is essentially the same as on an undamaged template (lanes 6 and 10); however, DNA polymerase α stops synthesis when it reaches the site of the damage (lane 9). On the other damaged template, DNA polymerase η adds one nucleotide and stops (lane 14). Once again, DNA polymerase α stops synthesis when it reaches the site of the damage (lane 13).

As you might imagine, if the specificity of DNA polymerase η is relaxed sufficiently that it can insert nucleotides opposite a cyclobutane dimer, it might also be sloppy when copying normal DNA. In fact, DNA polymerase η makes mistakes on normal DNA at a frequency of about 1 in 32. Thus, it is an error-prone DNA polymerase. It lacks a proofreading activity and does not require an accurately base-paired 3' end to initiate synthesis. This is not surprising, given that it has evolved to recognize a completely different kind of structure in its template. Under normal circumstances, specialized polymerases such as DNA polymerase η are allowed to operate only at sites of damage; thus, their error-proneness on normal DNA is not an issue.

In normal individuals, NER does not fix all the UV damage before the replication fork arrives. The residual damage blocks the replicative DNA polymerase and triggers its replacement by DNA polymerase η. This polymerase adds As across from cyclobutane dimers, which are mostly T-T dimers, thereby minimizing mutation. Patients with XPV are sensitive to sunlight and prone to cancer because they are missing DNA polymerase η. When UV damage appears at a replication fork in XPV patients, another polymerase, presumably one less suited to UV damage, carries out the
bypass synthesis. As a result, higher frequencies of incorrect nucleotides are introduced, which ultimately leads to mutations and cancer.

Reference: Masutani C, Araki M, Yamada A, Kusumoto R, Nogimori T, Maekawa T, Iwai S & Hanaoka F (1999) Xeroderma pigmentosum variant (XP-V) correcting protein from HeLa cells has a thymine dimer bypass DNA polymerase activity. EMBO J. 18, 3491-3501.

The adaptive response to low levels of MNNG must require the synthesis of new proteins, since the response is blocked by chloramphenicol (see Figure 5-29). If activation of a preexisting protein were all that was required, chloramphenicol would not be expected to block adaptation.

The adaptive response might be short-lived for several reasons. Presumably, once the signal for adaptation (ultimately MNNG) is removed, the induced synthesis of new proteins would halt. The resistance to MNNG mutagenesis and killing would then depend on the stability of the induced proteins. If they were relatively unstable, the resistant state would decay rapidly as the proteins became inactive. Even if the proteins were stable, the resistant state of the population of bacteria would decay fairly rapidly because growth of the bacteria dilutes the protein.

Reference: Teo I, Sedgwick B, Kilpatrick MW, McCarthy TV & Lindahl T (1986) The intracellular signal for induction of resistance to alkylating agents in E. coli. Cell 45, 315-324.

As shown in Figure 5-30, untreated bacteria and bacteria adapted to low levels of MNNG differ only in the presence or absence of O6-methylguanine. The absence of O6-methylguanine in adapted bacteria correlates with the low level of mutation, suggesting that it is the mutagenic lesion. O6-methylguanine is thought to be mutagenic because it can mispair with T during replication.

The kinetics of removal of the methyl group from O6-methylguanine are peculiar because the amount removed does not increase with time, as one might expect for a typical enzyme. In addition, the amount that is demethylated is directly proportional
to the amount of purified protein added to the reaction. One possible explanation for such behavior is that the enzyme is very unstable; however, the identical end points at 5°C and 37°C argue against this explanation, since enzyme stability usually varies with temperature.

A calculation of the number of mutagenic bases that are demethylated per enzyme molecule indicates that each enzyme removes only one methyl group. This calculation shows that the protein is used stoichiometrically instead of catalytically, which explains the peculiar kinetics. For example, 2.5 ng of protein removes half of the initial number of O6-methylguanines, or 0.13 pmol (0.5 × 0.26 pmol), from the DNA. Thus, the number of enzyme molecules is

$$
2.5\ \text{ng} \times \frac{\text{nmol}}{19{,}000\ \text{ng}} \times \frac{6.0 \times 10^{14}\ \text{molecules}}{\text{nmol}} = 7.9 \times 10^{10}\ \text{enzyme molecules}
$$

and the number of methyl groups is

$$
0.13\ \text{pmol} \times \frac{6.0 \times 10^{11}\ \text{methyl groups}}{\text{pmol}} = 7.8 \times 10^{10}\ \text{methyl groups}
$$

It turns out that methyl groups are transferred to one particular cysteine in the protein. Once methylated, the protein is dead and ultimately is degraded. Because the protein inactivates itself during the reaction, it is not an enzyme in the usual sense. An enzyme is a catalyst, which by definition is not consumed during the reaction.

Reference: Lindahl T, Demple B & Robins P (1982) Suicide inactivation of the E. coli O6-methylguanine-DNA methyltransferase. EMBO J. 1, 1359-1363.

High-level expression of MGMT confers resistance to killing by the alkylating agent MNNG, but not to killing by γ-irradiation, as shown in Figure 5-32. Because MGMT is specific for removal of O6-methylguanine and its overexpression confers resistance, O6-methylguanine must be responsible for cell killing.

Reference: Meikrantz W, Bergom MA, Memisoglu A & Samson L (1998) O6-Alkylguanine DNA lesions trigger apoptosis. Carcinogenesis 19, 369-372.

The calculated P-value for chi-square analysis of these two distributions is less than 0.001. Thus, the observed distribution is
significantly different from the distribution expected by chance. There is less than a 1/1000 possibility that the observed distribution is the same as the distribution expected by chance. This analysis says that microhomologies are relevant to the mechanism of NHEJ, but it does not specify how. It is thought that the microhomologies, in conjunction with NHEJ proteins, may help to align the two duplexes so that the other manipulations required to link the duplexes together can occur.

Reference: Roth DB, Porter TN & Wilson JH (1985) Mechanisms of nonhomologous recombination in mammalian cells. Mol. Cell. Biol. 5, 2599-2607.

As evident in the micrograph in Figure 5-34A, Rad52 binds to the ends of DNA. Also, it apparently binds more effectively to ends with single-strand tails than it does to blunt-ended molecules, since DNA molecules with Rad52 bound to the ends are commonly observed only when the DNA has single-stranded tails.

By binding to the ends of the linear DNA, Rad52 prevents access by the exonuclease, which requires a free end. Bound Rad52 does not interfere with digestion by the endonuclease, which can cleave in the interior of the DNA.

The preferential binding of Rad52 to ends with single-strand tails, as opposed to blunt ends, suggests that broken ends are processed first by an exonuclease to create single strands to which Rad52 can bind. The binding of Rad52 then protects against further exonuclease action. At this point, it is thought that Rad52 loads Rad51, a RecA-like recombinase, onto the single strands, leading to the subsequent step of strand invasion on a homologous duplex, as shown in Figure 5-63.

[Figure 5-63. The initial steps in Rad52-promoted repair of DNA breaks by homologous recombination (Answer 5-87). Steps shown: break; strip ends (5' to 3'); bind Rad52; recruit Rad51; invade homologous duplex.]

Reference: Van Dyck E, Stasiak AZ, Stasiak A & West SC (1999) Binding of double-strand breaks in DNA by human Rad52 protein. Nature 398, 728-731.

HOMOLOGOUS RECOMBINATION

DEFINITIONS

5-88 Allele
5-89 Gene conversion
5-90 Hybridization
5-91 Holliday junction

TRUE/FALSE

5-92 True. Homologous recombination requires fairly large stretches of nearly identical DNA in order to initiate and complete a recombination event. This requirement for long regions of near identity ensures that duplexes recombine only at corresponding points along the chromosome and not, for example, between closely related repeated DNA elements that litter the genomes of higher eucaryotes.

5-93 True. Conversion is a change in frequency of markers (nucleotide differences) during recombination. In the starting duplexes, each marker is present equally: once in each duplex, or twice in each duplex if the individual strands are counted. In the products of a recombination event associated with conversion, the frequency of markers is altered so that they are no longer equal. Instead of 1:1 or 2:2, as in the input duplexes, the frequency becomes 2:0 or 4:0 or 3:1 in the output duplexes. The two common mechanisms for generating this change, mismatch repair and DNA synthesis, both involve some amount of DNA synthesis.

THOUGHT PROBLEMS

5-94 The recombination substrates and products are shown in Figure 5-64. The first rule for deducing the recombination products is to align the homologous segments, that is, to draw the arrows one above the other so that they are pointing in the same direction. Alignment requires a twisting of substrates 3, 4, and 6. Alignment is necessary in order to form a Holliday junction, as would be more apparent if real sequences were used instead of arrows. Substrates 3 and 4 illustrate a useful rule. Recombination between direct repeats in a chromosome, as in substrate 3, deletes one copy of the repeat and the intervening DNA. Recombination between inverted repeats in a chromosome, as in substrate 4, simply inverts the DNA between the repeats.

5-95 A. Your friend is correct. Because the crossover point in any individual molecule is equidistant from a
defined sequence (the unique site for the restriction nuclease), the sequences involved in the crossover must be homologous. Since they occur at random distances from the restriction site, there are no preferred sites for recombination.

B. If you repeated the experiments in a RecA-deficient strain of E. coli, no figure 8s or X forms would be found.

C. If the figure 8s were intermediates in a site-specific recombination between the monomers, the X forms would all have had exactly the same crossover point and would all look identical.

If the figure 8s were intermediates in a random, nonhomologous recombination between the monomers, the X forms would have had four arms of different lengths.

Reference: Potter H & Dressler D (1976) On the mechanism of genetic recombination: electron microscopic observation of recombination intermediates. Proc. Natl Acad. Sci. USA 73, 3000-3004.

The first labeled restriction fragment to appear after starting the reaction is fragment 6, and label appears in the other fragments progressively, with fragment 1 the last to become double stranded. This order of appearance matches the order of the fragments in the 5'-to-3' direction on the single-stranded circle shown in Figure 5-38B, starting from the top. Since pairing between DNA strands is antiparallel, invasion must start at the 3' end of the minus strand of the linear DNA, and branch migration must proceed in the 3'-to-5' direction along the minus strand (Figure 5-65).

It takes about 20 minutes for the last fragment to become well labeled. This indicates that the rate of movement of the branch point is about 350 nucleotides/minute (7000 nucleotides/20 minutes), or about 6 nucleotides/second. Compared with the rate of replication, which is about 500 nucleotides/second, the rate of branch migration catalyzed by RecA is very slow.

The presence of a patch of about 500 nucleotides of nonhomologous DNA would inhibit branch migration
severely. Nevertheless, RecA can catalyze branch migration through such a nonhomology at a low frequency and produce a double-stranded circle with the nonhomologous DNA looped out as a single strand (Figure 5-65).

Reference: Cox M & Lehman IR (1981) The polarity of the recA protein-mediated branch migration. Proc. Natl Acad. Sci. USA 78, 6023-6027.

[Figure 5-64. Alignments and crossovers in various recombination substrates (Answer 5-94).]

[Figure 5-65. Effect of a nonhomologous patch on branch migration catalyzed by RecA. Labels: displaced strand; region of nonhomology blocks the advancing branch; sometimes the invading homologous strand jumps the nonhomology, giving structures with the insert looped out.]

This statement is incorrect. Crossing and noncrossing pairs of strands can be interconverted by rotational movements that do not require strand breakage.

The double Holliday junction that would result from strand invasion is shown in Figure 5-66. Two versions are shown, both equally correct. The upper one looks simpler because the invading duplex has been rotated so that the marked 5' end is on the bottom. This arrangement minimizes the number of lines that must cross, which is why most recombination diagrams are shown in this way. The lower representation is perfectly correct, but it looks more complicated. DNA synthesis uses the 3' end of the invading duplex as a primer and fills the single-strand gap by 5'-to-3' synthesis, as indicated.

A large percentage of the human genome is made up of repetitive elements, such as Alu sequences, which are scattered among the chromosomes. If recombination were to occur between two such sequences that were on different chromosomes, for example, a translocation would be generated. Unrestricted recombination between such repeated elements would quickly rearrange the genome beyond recognition. Different rearrangements in different individuals would lead to large numbers of nonviable progeny, putting the species at risk. This
calamity is avoided through the action of the mismatch-repair system. Repeated sequences around the genome differ by a few percent of their sequence. When recombination intermediates form between them, many mismatches are present in the heteroduplex regions. When the mismatch-repair system detects too high a frequency of mismatches, it aborts the recombination process in some way. This surveillance mechanism ensures that recombining sequences are nearly identical, as expected for sequences at the same locus on homologous chromosomes.

DATA HANDLING

5-100 RecBCD must cut the DNA at or near the Chi site, since a 400-nucleotide fragment was generated from a substrate labeled at the left end (see Figure 5-41, lane 2) and a 100-nucleotide fragment was generated from a substrate labeled at the right end (lane 5). These lengths mark the position of the Chi site in the original DNA fragment (see Figure 5-40).

Since shorter labeled fragments were generated only when the top strand (5' L and 3' R) was labeled, RecBCD must cut only the top strand. This result means that RecBCD recognizes the orientation of a Chi site: if the Chi site were flipped in this experiment, the bottom strand would have been cleaved.

The ability of RecBCD to separate DNA strands is shown by the unboiled control, which was indistinguishable from the boiled sample (see Figure 5-41, compare lanes 5 and 6). If RecBCD simply bound to the Chi site and introduced a nick, the 100-nucleotide single strand would have remained attached to the original fragment.

Reference: Ponticelli AS, Schultz DW, Taylor AF & Smith GR (1985) Chi-dependent DNA strand cleavage by RecBCD enzyme. Cell 41, 145-151.

5-101 RuvC can cleave only 4 of the 256 (4^4) possible 4-nucleotide sequences, which is 1/64 of all possible sequences.

No. One of the four 4-nucleotide sequences would be expected, on average, every 64 nucleotides. Thus, only a small amount of branch migration would be required to juxtapose a Holliday junction with an appropriate cleavage sequence. In cells, RuvC
operates in conjunction with RuvAB, which is a helicase that drives branch migration of Holliday junctions.

Evidently, the two subunits of RuvC coordinate their cleavages. Only when both have encountered an appropriate cleavage sequence does either site get cleaved. This conclusion is apparent in the results with the hybrid junction in Figure 5-42B. When one duplex carries a resolution sequence but the other does not, the sequence is not cleaved. This indicates that the two subunits do not operate independently of one another. Additional experiments in the reference below showed that in Holliday junctions with two cleavable but nonidentical sequences, both sequences were cleaved.

[Figure 5-66. Double Holliday junction (Answer 5-98). New DNA synthesis is indicated by wavy lines.]

The duplexes generated by cleavage of the indicated strands of the Holliday junction in Figure 5-42A would have the same sequences shown in the figure, except that segment a would be connected to segment d and segment c would be connected to segment b. Thus, a crossover would be generated. In the absence of any proteins but RuvC, the two product duplexes would each carry a nick at the site of RuvC cleavage.

Reference: Shah R, Cosstick R & West SC (1997) The RuvC protein dimer resolves Holliday junctions by a dual incision mechanism that involves base-specific contacts. EMBO J. 16, 1464-1472.

TRANSPOSITION AND CONSERVATIVE SITE-SPECIFIC RECOMBINATION

DEFINITIONS

5-102 DNA-only transposon
5-103 Reverse transcriptase
5-104 Retrovirus
5-105 Bacteriophage
5-106 Conservative site-specific recombination

TRUE/FALSE

5-107 False. Transposable elements integrate nearly randomly, and genes often are destroyed or altered by the integration event. While it is true that some of these events are lethal to the cell and to the transposable element, most events are not. Spreading throughout the genome, even at the cost of a few cells and transposons, ensures that the transposable element will
survive with the species.

THOUGHT PROBLEMS

5-108 Cre-mediated recombination between oppositely oriented LoxP sites inverts the sequences between the sites, whereas recombination between LoxP sites in the same orientation deletes the sequences (Figure 5-67). This result should remind you of the similar outcome obtained for homologous recombination between direct repeats and inverted repeats in Problem 5-94 (see Figure 5-64, substrates 3 and 4). As in that problem, the easiest way to work out the products is to align the LoxP sites and then follow the crossover between them.

[Figure 5-67. Products of Cre-mediated recombination between oppositely oriented and directly repeated LoxP sites (Answer 5-108).]

5-109 The key to understanding these results is to realize that the products of the chromosomal recombination event in Figure 5-44A are the substrates shown in Figure 5-44B, and vice versa. Thus, both reactions can occur starting from either substrate, and the final outcome is a balance between the two reactions. If chromosomal recombination is more efficient than integration, then the balance will always be in favor of the unintegrated molecule. Thus, when the Cre recombinase catalyzes integration, most of the time it will convert the product back to the starting materials. Because the FLP recombinase is less efficient at promoting recombination, a higher fraction of the integrants will survive (a lower fraction are converted back to the starting materials).

DATA HANDLING

5-110 A. All colonies must have arisen by transposition of Tn10 into the bacterial genome, because survival depends on the presence of the tetracycline-resistance gene carried by Tn10. The presence of mixed colonies with blue and white sectors is the key observation. Since the frequency of sectored colonies is high but transposition is rare, sectored colonies must arise commonly in individual transposition events. A replicative
mechanism can transfer only one strand of the parent heteroduplex and thus can generate only white or blue colonies, depending on which strand is transferred. A cut-and-paste mechanism transfers both strands of the heteroduplex, which, upon replication and segregation into daughter bacteria, will produce a sectored colony. Once the bacteria are spread onto a Petri dish, all the descendants of the original infected cell are confined to the immediate vicinity and thus grow together to form the colony. If two different daughters are produced at the first division, their descendants will grow together to produce a single colony with sectors containing the two different kinds of bacteria.

The pure blue and pure white colonies arise from transposition events that involve the homoduplexes. The proportions of blue, white, and sectored colonies are as expected from the equal mixture of heteroduplexes, which give rise to the sectored colonies, and homoduplexes, which give rise to the pure colonies.

B. Each heteroduplex contains a mismatched region of DNA corresponding to the position of the mutation in the lacZ gene. If these heteroduplexes were introduced into bacteria that could repair such mismatches, then the frequency of sectored colonies would be expected to decrease markedly. In essence, each repair event would convert a heteroduplex into a homoduplex. If the mismatch repair were unbiased, the frequencies of blue colonies and white colonies would each increase equally. These experiments were carried out in bacteria defective in mismatch repair precisely to avoid that distortion of the data.

Reference: Bender J & Kleckner N (1986) Genetic evidence that Tn10 transposes by a nonreplicative mechanism. Cell 45, 801-815.

5-111 A. Transposition of the Ty element depends on reverse transcription of an RNA intermediate. Normally, reverse transcriptase is expressed at a very low level. Your modified plasmid, however, places the gene under control of the galactose control elements. In the presence of glucose
(absence of galactose), the galactose control elements turn the gene off, and as a result the expression of reverse transcriptase is very low. In the presence of galactose, the reverse transcriptase gene is expressed at very high levels. Thus, the frequency of transposition increases substantially.

B. The frequency of Ty-induced His+ colonies is low because a very specific kind of transposition event is required to activate the defective histidine gene: the Ty element must transpose to a site near the 5' end of the gene. Thus, even though nearly all cells show evidence for transposition, insertion near the defective histidine gene is still relatively rare.

C. The data in Figure 5-47 indicate that nearly every cell harboring the Ty-bearing plasmid suffers one or more transposition events when grown on galactose. Each Ty transposition has the potential for altering the function or expression of genes near the site of integration. If the element integrates into the coding portion of a gene, it can eliminate the encoded function; if it integrates in the noncoding region near a gene, it may alter the gene's expression. In organisms such as yeasts, which have been finely tuned to their environmental niche by evolutionary pressure, it is unlikely that random insertion of a Ty element will improve growth characteristics. Thus, it is not unreasonable that a high rate of transposition should cause cells to grow poorly.

These data do not prove that the cells grow more slowly because of the high rate of transposition, even though that explanation is very likely to be correct. As the authors point out, the high level of expression of reverse transcriptase might interfere directly with RNA metabolism. For example, mRNA molecules could be inactivated by reverse transcription. Alternatively, the reverse transcripts of the cellular mRNAs could be mutagenic to the nuclear genes. Errors introduced during reverse transcription into DNA could be incorporated into the
nuclear genes by recombination.

Reference: Boeke JD, Garfinkel D, Styles CA & Fink GR (1985) Ty elements transpose through an RNA intermediate. Cell 40, 491-500.

Chapter 2 Cell Chemistry and Biosynthesis

THE CHEMICAL COMPONENTS OF A CELL

DEFINITIONS

2-1 Avogadro's number
2-2 Hydrophobic force
2-3 Molecule
2-4 Atomic weight
2-5 Hydrogen bond
2-6 Acid
2-7 van der Waals attraction

TRUE/FALSE

2-8 True. With each half-life, half the remaining radioactivity decays. After 10 half-lives, (1/2)^10, which is about 1/1000 (exactly 1/1024), of the original radioactivity will remain. It is useful to remember that 2^10 is about 1000.

2-9 False. The pH of the solution will be very nearly neutral, essentially pH 7, because the few H+ ions contributed by HCl will be outnumbered by the H+ ions from dissociation of water. No matter how much a strong acid is diluted, it can never give rise to a basic solution. In fact, calculations that take into account both sources of H+ ions, and also the effects on the dissociation of water, give a pH of 6.98 for a 10^-8 M solution of HCl.

2-10 False. Strong acids bind protons weakly and give them up readily in a water environment.

2-11 False. Many of the functions that macromolecules perform rely on their ability to associate and dissociate readily, which would not be possible if they were linked by covalent bonds. By linking their macromolecules noncovalently, cells can, for example, quickly remodel their interior when they move or divide, and easily transport components from one organelle to another. It should be noted that some macromolecules are linked by covalent bonds. This occurs primarily in situations where extreme structural stability is required, such as in the cell walls of many bacteria, fungi, and plants, and in the extracellular matrix that provides the structural support for most animal cells.

THOUGHT PROBLEMS

2-12 Organic chemistry in laboratories, even the very best, is rarely carried out in a water environment because of low solubility of some components and because water is reactive and
usually competes with the intended reaction. The most dramatic difference, however, is the complexity. It is critical in laboratory organic chemistry to use pure components to ensure a high yield of the intended product. By contrast, living cells carry out thousands of different reactions simultaneously, with good yield and virtually no interference between reactions. The key, of course, is that cells use enzyme catalysts, which bind substrate molecules in an active site, where they are isolated from the rest of the environment. There, the reactivity of individual atoms can be manipulated to encourage the correct reaction. It is the ability of enzymes to provide such special environments (miniature reaction chambers) that allows the cell to carry out an enormous number of reactions simultaneously without cross talk between them.

The atomic weights of elements represent the average for the element as isolated from nature. Elements in nature include a mixture of isotopes. For most elements, one isotope represents the vast majority; those elements have atomic weights that are nearly integers. Chlorine, however, has two abundant isotopes (75% 35Cl and 25% 37Cl), which average to an atomic weight of 35.5.

The atomic number of carbon, which equals the number of protons, is six. The atomic weight, which equals the number of protons plus neutrons, is 12.

The number of electrons, which equals the number of protons, is six.

The first shell can accommodate two electrons and the second shell eight. Carbon therefore needs four additional electrons (or would have to give up four electrons) to obtain a full outermost shell. Carbon is most stable when it shares four additional electrons with other atoms, including other carbon atoms, by forming four covalent bonds.

Carbon-14 has two additional neutrons in its nucleus. Because electrons
determine the chemical properties of an atom, carbon-14 is chemically identical to carbon-12.

The identity of the element specifies the number of protons; thus 14C and 12C, for example, both have 6 protons. The difference in atomic weight is due to differences in numbers of neutrons: 14C has 2 more neutrons than 12C, 3H has 2 more neutrons than 1H, 35S has 3 more neutrons than 32S, and 32P has 1 more neutron than 31P.

The change in identity of the element from P to S indicates that the number of protons has increased by 1, from 15 for P to 16 for S. Because the atomic weight has remained the same, the number of neutrons must have decreased by 1, from 17 in 32P to 16 in 32S. The conversion of a neutron into a proton is accompanied by emission of an electron.

The equations for decay are indicated below, with the number of protons shown as subscripts to make the relationships clearer. This process is illustrated for decay of 14C in Figure 2-38.

$$
{}^{14}\mathrm{C}_{6} \rightarrow {}^{14}\mathrm{N}_{7} + e^{-}
$$

$$
{}^{3}\mathrm{H}_{1} \rightarrow {}^{3}\mathrm{He}_{2} + e^{-}
$$

$$
{}^{35}\mathrm{S}_{16} \rightarrow {}^{35}\mathrm{Cl}_{17} + e^{-}
$$

[Figure 2-38. Radioactive decay of 14C to 14N (Answer 2-15). Labels: antineutrino; β particle.]

Among the product atoms, only the isotope of helium (3He) is not the most common; the most common isotope is 4He.

In each case, the product atom will initially be missing an electron and thus will be positively charged. It will have one extra proton but the same number of electrons it started with in its electron shell (remember, the emitted electron came from the nucleus and is long gone). A positive charge on these product atoms would be highly unstable, and the atoms will steal an electron from some other molecule in their environment, returning the atom to electrical neutrality. But the theft of an electron from another molecule will initiate a free-radical cascade, as that molecule in turn scrambles to replace its missing electron. Such free-radical cascades are one source of the biological damage caused by radioactivity.

You would expect the DNA backbone to break. The chemistry of an
atom is determined by its number of protons, which establishes the number and reactivity of electrons in the outer electron shell. A phosphorus atom is comfortable, chemically speaking, bonded to the four oxygen atoms in the arrangement that makes up the phosphodiester bond in DNA, whereas a sulfur atom, which is what the phosphorus atom would become, is not.

The time at which an individual radioactive atom will decay is impossible to predict. For a large population of atoms, however, it is possible to calculate very accurately the fraction that will decay over a defined period of time.

There is a good reason for the inverse relationship between half-life and maximum specific activity: the shorter the half-life, the greater the number of atoms that will decay per unit time, hence the greater the number of dpm or curies. If the radioactive atoms were present on an equimolar basis, then those with shorter half-lives would give more dpm; that is, they would have more Ci/mmol (a higher specific activity).

No. It is a coincidence that the ratio of C, H, and O in living organisms is the same as that for sugars. Much of the H and O (70%) is due to water, and the rest is from a mixture of sugars, amino acids, nucleotides, and lipids: the whole variety of small and large molecules that make up living organisms.

From the innermost to the outermost, the first three electron shells can carry 2, 8, and 8 electrons.

H can gain one electron to fill the first shell, or it can lose one electron to leave a completely empty first shell. C can gain or lose four electrons to generate a filled outer shell. N and P will gain three electrons to fill their outer shells. O and S will gain two electrons to fill their outer shells. In general, it is energetically most favorable for an atom to lose or gain the fewest number of electrons required to generate a filled outer shell.

The valences for these atoms equal the number of electrons that must be gained or lost to complete the outer shell. Thus, the valences are the same as the
numbers in part B.

B, less than 1 kcal/mole; E, 1-3 kcal/mole; A, 12 kcal/mole; C, 84 kcal/mole; and D, 675 kcal/mole.

Permanent dipoles are critical in biology because they allow molecules to interact through electrical forces. Any large molecule with many polar groups will have a pattern of partial positive and negative charges on its surface. When such a molecule encounters a second molecule with a complementary set of charges, the two molecules will be attracted to one another by interactions between their permanent dipoles. Such interactions resemble ionic bonds but are weaker.

Hydrogen bonds form between specific groups: one is always a hydrogen linked in a polar bond to a nitrogen or an oxygen, and the other is usually a nitrogen or an oxygen atom. Van der Waals attractions are weaker and occur between any two atoms that are in close enough proximity. Both hydrogen bonds and van der Waals attractions are short-range interactions that come into play only when two molecules are already close. Hydrogen bonds are directional, whereas van der Waals attractions are not. Both types of bonds can be thought of as ways to fine-tune an interaction, that is, helping to position two molecules correctly with respect to each other once they have been brought together by diffusion.

Van der Waals attractions would occur in all three examples. A hydrogen bond would occur only in example 3, that is, between a nitrogen atom and a hydrogen bound to an oxygen atom.

Because of its larger size, the outermost electrons in a sulfur atom are not as strongly attracted to the nucleus as they are in an oxygen atom. Consequently, the hydrogen-sulfur bond is much less polar than the hydrogen-oxygen bond. Because of the reduced polarity, the sulfur in H2S is not strongly attracted to hydrogen atoms in adjacent H2S molecules, and hydrogen bonds do not form. It is the lack of hydrogen bonds in H2S that allows it to be a gas, and the presence of strong hydrogen bonds in water that makes
it a liquid.

Although the symbol "p" in common usage denotes "the negative logarithm of," what it stands for is unclear. In the original 1909 paper in which the concept of pH was developed, the author, Danish chemist Søren P.L. Sørensen, was not explicit. In textbooks, where it is commented on at all, it is most commonly reputed to stand for the French or German words for power or potential. Close examination of the original paper reveals that the "p" in pH is likely a consequence of the author's arbitrary choice to call two solutions by the letters "p" and "q." The q solution had the known H⁺ concentration of 1; the p solution had the unknown H⁺ concentration. If the solutions had been switched, do you think "qH" would ever have caught on?

Reference: Nørby JG (2000) The origin and the meaning of the little p in pH. Trends Biochem. Sci. 25, 36–37.

A solution of sodium chloride will be neutral. Neither the sodium ion nor the chloride ion binds H⁺ or OH⁻, and thus neither influences the dissociation of water.

A solution of potassium acetate, the salt of a weak acid, will be basic because the acetate ion will "steal" sufficient numbers of protons from water to satisfy the equilibrium

$$\mathrm{CH_3COO^- + H_2O \rightleftharpoons CH_3COOH + OH^-}$$

The increase in hydroxyl ions will cause the number of protons to decrease, satisfying the equilibrium for water ionization ([OH⁻][H⁺] = 10⁻¹⁴) and making the solution basic.

A solution of ammonium chloride, the salt of a weak base, will be acidic because the ammonium ion will dissociate sufficiently to satisfy the equilibrium

$$\mathrm{NH_4^+ + H_2O \rightleftharpoons NH_3 + H_3O^+}$$

The increase in hydronium ions lowers the pH and makes the solution more acidic.

The dissociation expression for a carboxylate group is –COOH ⇌ H⁺ + –COO⁻. The dissociation expression for an amine group is –NH₃⁺ ⇌ H⁺ + –NH₂. The acidic carboxyl group gives up its proton much more readily than the protonated amine group (that's why the amine group is basic: it tends to pick up a proton from water). Thus, as shown in Figure 2-39, the pK for the
carboxyl group corresponds to the point at which 0.5 equivalents of OH⁻ have been added, which is pH 2.3. The pK for the amine group corresponds to the point at which 1.5 equivalents of OH⁻ have been added, which is pH 9.6.

The predominant ionic species of glycine are shown in Figure 2-39. At point 2 (the pK for the carboxyl group), two species, ⁺H₃NCH₂COOH and ⁺H₃NCH₂COO⁻, are present in equal concentrations. Similarly, at point 4 (the pK for the amine group), two species are present at equal concentrations.

The isoelectric point occurs when 1.0 equivalents of OH⁻ have been added (Figure 2-39). At that point (point 3 on the curve), the predominant ionic species is ⁺H₃NCH₂COO⁻, which carries no net charge. The isoelectric point for glycine occurs at pH 5.97, which is exactly halfway between the pK values for the carboxyl group (2.34) and the amine group (9.60). At this pH all the other minor ionic species of glycine are present in exactly balancing amounts, so that there is no net charge on the solute.

[Figure 2-39 Titration of a solution of glycine (Answer 2-27). Horizontal axis: NaOH added (equivalents); the isoelectric point is marked at pH 5.97.]

THE CHEMICAL COMPONENTS OF A CELL

2-28 The structures of these three forms of glycine are shown in Figure 2-40.

2-29 The titration of histidine is shown in Figure 2-4A and that for glutamate is shown in Figure 2-4B. The requirement for three equivalents of OH⁻ in both cases indicates that three ionizable groups are involved. Estimating the pK values from the points on the curves at which 0.5, 1.5, and 2.5 equivalents of OH⁻ were added allows a match to be made with the amino acids listed in Table 2-2.

The rank order for pK values is expected to be 4 < 1 < 2 < 3. It is convenient to discuss the rank order starting with the aspartate side chain carboxyl group on the surface of a protein with no other ionizable groups nearby (1). The side chain would be expected to have a pK around 4.5, somewhat higher than observed in the free amino acid because of the absence of the influence of the positively charged amino
group. If the side chain were buried in a hydrophobic pocket on the protein (2), its pK would be higher because the presence of a charge in a hydrophobic environment, without the easy bonding to water, would be disfavored. If there were another negative charge in the same hydrophobic environment (3), the pK of the aspartate side chain would be elevated even further (even more difficult to give up a proton and become charged) because of electrostatic repulsion. If there were a positively charged group in the same environment (4), then the favorable electrostatic attraction would make it very easy for the proton to come off, lowering the pK even below that of the side chain on the surface (1).

[Figure 2-40 Three forms of glycine (Answer 2-28): free glycine, glycine hydrochloride, and glycine sodium salt.]

You should advise the runner to breathe rapidly just before the race. Since a sprint will cause a lowering of blood and cell pH, the object of the prerace routine would be to raise the pH, with the idea that the runner could then sprint longer before feeling fatigue. Holding your breath or breathing rapidly both temporarily affect the amount of dissolved CO₂ in the bloodstream. Holding your breath will increase the amount of CO₂ and push the equilibrium to the right, leading to an increase in [H⁺] and a lower pH. By contrast, breathing rapidly will reduce the concentration of CO₂ and pull the equilibrium to the left, leading to a decrease in [H⁺] and a higher pH.

The majority of aspirin is absorbed into the bloodstream through the lining of the stomach. At the low pH in the stomach, which is below the pK of aspirin, most of the aspirin will be uncharged and will therefore diffuse through the plasma membranes of the cells that line the stomach.
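The claim that a weak acid is mostly uncharged when the pH is below its pK can be made concrete with the Henderson-Hasselbalch relation, pH = pK + log([A⁻]/[HA]). The sketch below is illustrative only: the pK of 3.5 for aspirin and the pH values of 1.5 (stomach) and 7.4 (blood) are assumed round numbers that are not given in the answer, and fraction_uncharged is a hypothetical helper name.

```python
def fraction_uncharged(pH, pK):
    """Fraction of a weak acid present as the uncharged form HA.

    From pH = pK + log10([A-]/[HA]), the ratio [A-]/[HA] is 10**(pH - pK),
    so the HA fraction is 1 / (1 + ratio).
    """
    ratio = 10 ** (pH - pK)  # [A-]/[HA]
    return 1.0 / (1.0 + ratio)


ASPIRIN_PK = 3.5  # assumed value, not taken from the answer text

# Assumed, illustrative pH values for the two compartments.
for label, pH in (("stomach", 1.5), ("blood", 7.4)):
    percent = 100 * fraction_uncharged(pH, ASPIRIN_PK)
    print(f"{label}: pH {pH} -> {percent:.2f}% uncharged")
```

Under these assumed numbers nearly all of the aspirin is uncharged at stomach pH and almost none is at blood pH, which is the qualitative point the answer makes.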
[Figure 2-41 The functional groups in 1,3-bisphosphoglycerate, pyruvate, and cysteine (Answer 2-35).]

2-33 The statement is correct. The hydrogen–oxygen bond in water molecules is polar; thus, the oxygen atom carries a partial negative charge and the hydrogen atoms carry partial positive charges. The partial negative charges on the oxygen atoms are attracted to the positive charges on the sodium ions but are repelled by the negative charges on the chloride ions.

Although individually weak, several such noncovalent interactions can, in aggregate, provide sufficient stability to hold a pair of molecules together. The situation is analogous to objects held together with Velcro™: a small bit holds them together weakly, whereas a large bit holds them together tightly. Such fastenings are easy to peel apart because the links can be broken a few at a time, rather than all at once.

The functional groups on the three molecules are indicated and named in Figure 2-41.

The drawings are accurate. The smaller hydrogen atoms are linked to oxygen atoms and the larger ones are linked to carbon atoms. The difference in size reflects the polarity of the respective bonds: the H–C bond is nonpolar, whereas the H–O bond is polar. As a result, oxygen draws the shared electrons away from the hydrogen more strongly, resulting in a smaller radius of the electron cloud around the hydrogen atom.

Both amylose and cellulose are polymers of glucose. Amylose is a polymer of α-D-glucose linked together by α(1→4) glycosidic bonds. Cellulose is a polymer of β-D-glucose linked together by β(1→4) glycosidic bonds. It is more difficult to discern this for cellulose, as every other monomer is flipped through 180° about an axis that passes through carbons 1 and 4. The structures of α-D-glucose and β-D-glucose, and the linked dimers in amylose and cellulose, respectively, are shown in Figure 2-42.

The structures of the sedative (R)-thalidomide and the teratogenic (S)-thalidomide differ at a single chiral center (Figure 2-43). After the teratogen had
been identified, it was assumed that if thalidomide had been synthesized as the pure "correct" optical isomer, it would have caused no problems. Recent experiments, however, have shown that thalidomide is rapidly racemized (converted to a mixture of optical isomers) in animals. Thus, a protocol designed to synthesize the correct isomer would not have made a difference in the end.

[Figure 2-41 labels: 1,3-bisphosphoglycerate (carboxylic–phosphoric acid anhydride, hydroxyl, phosphoryl); pyruvate (carboxylate, carbonyl); cysteine (sulfhydryl, carboxylate, amino).]

[Figure 2-42 The structures of α-D-glucose and β-D-glucose and their linkage into dimers (Answer 2-37). Panels: amylose, α(1→4) glycosidic linkages; cellulose, β(1→4) glycosidic linkages. Arrows point to ring oxygens to show alignment of monomers that are joined into dimers. The curved arrow indicates a 180° rotation of the sugar.]

[Figure 2-43 The structures of the teratogenic (S) and sedative (R) forms of thalidomide (Answer 2-38). (A) (R)-Thalidomide, sedative. (B) (S)-Thalidomide, teratogen.]

A molecule is amphiphilic if its hydrophobic and hydrophilic portions are segregated into two distinct regions of the molecule. As indicated in Figure 2-44, fatty acids and phospholipids are amphiphilic because each of these molecules has one well-defined portion that is hydrophilic and another that is hydrophobic. By contrast, triacylglycerols are relatively hydrophobic throughout. Because fatty acids and phospholipids are amphiphilic, collections of these molecules can form distinctive kinds of structures, including lipid monolayers at an air interface, lipid bilayers (the essence of membrane structure), and micelles (aggregates with a hydrophilic surface and a hydrophobic center). Triacylglycerols, which are predominantly hydrophobic, separate from a water solution, forming a lipid droplet (the storage form of fat in adipose cells).
[Figure 2-44 A fatty acid, a triacylglycerol, and a phospholipid (Answer 2-39). Panels: (A) fatty acid, (B) triacylglycerol, (C) phospholipid. The hydrophilic and hydrophobic portions of the molecules are indicated by light and dark shading, respectively.]

[Figure 2-45 labels: N-terminus to C-terminus, the side chains belong to glycine (G), tryptophan (W), lysine (K), glutamate (E), and methionine (M).]

2-40 The components of the polypeptide are shown in Figure 2-45.

2-41 The synthesis of a macromolecule with a unique structure requires that a single stereoisomer is used in each position. Changing one amino acid from its L- to its D-form would result in a different protein. Thus, if a random mixture of the D- and L-forms were used to build a protein, its amino acid sequence would not specify a single structure, but rather many different structures ($2^N$ different structures would be formed, where N is the number of amino acids in the protein).

Why L-amino acids were selected in evolution as the exclusive building blocks of proteins is a mystery; we could easily imagine a cell in which certain (or even all) amino acids were used in the D-forms to build proteins, as long as these particular stereoisomers were used exclusively.

A major advantage of condensation reactions is that they are readily reversible by hydrolysis, and water is readily available in the cell. This allows cells to break down their macromolecules (or macromolecules of other organisms ingested as food) and to recover the subunits intact so that they
can be recycled to build new macromolecules.

The components of the oligonucleotides are shown in Figure 2-46. DNA is identifiable by the use of deoxyribose in the sugar–phosphate backbone and thymine (T) as one of the pyrimidine bases. By contrast, RNA uses ribose in the sugar–phosphate backbone and uracil (U) in place of thymine (T).

[Figure 2-45 The components of the polypeptide (Answer 2-40). The shaded atoms indicate peptide bonds.]

[Figure 2-46 Components of oligonucleotides (Answer 2-43). (A) RNA oligonucleotide, with ribose and the bases uracil, adenine, and cytosine. (B) DNA oligonucleotide, with deoxyribose and the bases guanine, adenine, and thymine. Shaded components, from top to bottom, are bases, nucleosides, and nucleotides. The 5' and 3' ends are marked.]

CALCULATIONS

2-44 A. Although we do not know the molecular weight of cellulose, because the molecules contain variable numbers of C₆H₁₂O₆ subunits, we know it is 40% carbon [(6 × 12)/((6 × 12) + (12 × 1) + (6 × 16))]. Thus, 2 g of carbon atoms, which corresponds to 10²³ atoms of carbon, are contained in the cellulose that makes up a page.

$$\text{C atoms} = 2\ \mathrm{g} \times \frac{6 \times 10^{23}\ \mathrm{d}}{\mathrm{g}} \times \frac{\text{C atom}}{12\ \mathrm{d}} = 10^{23}\ \text{C atoms}$$

The product of the number of carbon atoms in each dimension equals 10²³.

$$X \times Y \times Z = 10^{23}$$

The number of carbon atoms in each dimension will be in the same ratio as the lengths. Thus, if Z is the number of carbon atoms in the thickness of the page and X is the number in the width, the ratio of X/Z is (21 × 10⁻² m)/(0.07 × 10⁻³ m) and

$$X = \frac{21 \times 10^{-2}}{0.07 \times 10^{-3}}\, Z$$

Similarly, the ratio of Y/Z is (27.5 × 10⁻² m)/(0.07 × 10⁻³ m) and

$$Y = \frac{27.5 \times 10^{-2}}{0.07 \times 10^{-3}}\, Z$$

Substituting for X and Y,

$$\frac{21 \times 10^{-2}\, Z}{0.07 \times 10^{-3}} \times \frac{27.5 \times 10^{-2}\, Z}{0.07 \times 10^{-3}} \times Z = 10^{23}$$

$$Z^3 = \frac{10^{23} \times (0.07 \times 10^{-3})^2}{21 \times 10^{-2} \times 27.5 \times 10^{-2}} = 8.48 \times 10^{15}$$

$$Z = 2.04 \times 10^{5}$$

The suggested shortcut makes the calculation a little more straightforward. The
volume of the page is 4 × 10⁻⁶ m³ (21 × 10⁻² m × 27.5 × 10⁻² m × 0.07 × 10⁻³ m), which equals a cube with a side of 1.6 × 10⁻² m. The presence of 10²³ carbon atoms in this volume corresponds to 4.6 × 10⁷ carbon atoms per side [(10²³)^0.33]. This corresponds to about 200,000 carbon atoms to span the thickness of the page [(4.6 × 10⁷ atoms/1.6 × 10⁻² m) × 0.07 × 10⁻³ m].

With a diameter of 0.4 nm each, it would take 175,000 carbon atoms to span the thickness of the page [(0.07 × 10⁶ nm)/0.4 nm], if they were laid end to end at their van der Waals contact distance.

At first glance it might seem strange that it takes more carbon atoms in cellulose, where they account for only 40% of the mass, than it takes as free, pure carbon atoms to span the thickness of the page. The key is that the atoms in cellulose are covalently bound to one another and therefore are much closer together than their van der Waals radii. The nuclei of covalently linked carbon atoms are separated by 0.15 nm, whereas those in van der Waals contact are separated by 0.4 nm.

Glucose (C₆H₁₂O₆) has a molecular weight of 180 [(6 × 12) + (12 × 1) + (6 × 16)] and therefore a mass of 180 g/mole. A concentration of 90 mg/dL corresponds to 5 × 10⁻³ M, or 5 mM.

$$[\text{glucose}] = \frac{90\ \mathrm{mg}}{\mathrm{dL}} \times \frac{10\ \mathrm{dL}}{\mathrm{L}} \times \frac{\mathrm{g}}{10^{3}\ \mathrm{mg}} \times \frac{\mathrm{mole}}{180\ \mathrm{g}} = 5 \times 10^{-3}\ \mathrm{moles/L}$$

which is 5 × 10⁻³ M, or 5 mM.

The values of λ for each of the isotopes are shown in Table 2-7. At t½,

$$2.303 \log 0.5 = -\lambda t_{1/2}$$

Rearranging,

$$\lambda = \frac{-2.303 \log 0.5}{t_{1/2}} = \frac{-2.303 \times (-0.301)}{t_{1/2}} = \frac{0.693}{t_{1/2}}$$

For ¹⁴C,

$$\lambda = \frac{0.693}{5730\ \text{years}} = 1.21 \times 10^{-4}/\text{year}$$

Since there are 5.26 × 10⁵ min/year, λ = 2.30 × 10⁻¹⁰/min.

Table 2-7
Isotope   Half-life     λ                    λ (per minute)
¹⁴C       5730 years    1.21 × 10⁻⁴/year     2.30 × 10⁻¹⁰/min
³H        12.3 years    5.63 × 10⁻²/year     1.07 × 10⁻⁷/min
³⁵S       87.4 days     7.93 × 10⁻³/day      5.51 × 10⁻⁶/min
³²P       14.3 days     4.85 × 10⁻²/day      3.37 × 10⁻⁵/min

¹⁴C atoms correspond to 1.3 × 10⁻¹² of the total carbon in living organisms. A disintegration rate of 15.3 dpm corresponds to 6.65 × 10¹⁰ atoms of ¹⁴C.

$$\text{dpm} = \lambda N$$

$$N = \frac{\text{dpm}}{\lambda} = \frac{15.3}{2.30 \times 10^{-10}/\text{min}} = 6.65 \times 10^{10}$$

In one gram of carbon there are a total of 5.0 × 10²² atoms [1 g × (6 × 10²³ d/g) × (C atom/12 d)].
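The decay-constant arithmetic above can be checked with a few lines of Python. This is a minimal sketch rather than part of the original answer: the function and variable names are illustrative assumptions, while the inputs (a half-life of 5730 years, 5.26 × 10⁵ min/year, 15.3 dpm per gram of carbon, and 6 × 10²³ d/g) are the values used in the text.

```python
import math

MIN_PER_YEAR = 5.26e5    # minutes per year, as used in the text
DALTONS_PER_GRAM = 6e23  # rounded Avogadro number, as used in the text


def decay_constant(half_life):
    """Return lambda = ln(2) / half-life, in reciprocal units of half_life."""
    return math.log(2) / half_life


lam_per_year = decay_constant(5730.0)      # 14C, per year
lam_per_min = lam_per_year / MIN_PER_YEAR  # 14C, per minute

# dpm = lambda * N, so N = dpm / lambda for 15.3 dpm per gram of carbon.
n_14c = 15.3 / lam_per_min

# Total carbon atoms in one gram of carbon (12 d per atom).
atoms_per_gram = 1.0 * DALTONS_PER_GRAM / 12.0

print(f"lambda = {lam_per_year:.3g}/year = {lam_per_min:.3g}/min")
print(f"14C atoms per gram of carbon = {n_14c:.3g}")
print(f"total C atoms per gram = {atoms_per_gram:.3g}")
print(f"fraction that is 14C = {n_14c / atoms_per_gram:.2g}")
```

Using ln 2 directly instead of the rounded 0.693 changes the results only in the fourth significant figure.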
Thus, ¹⁴C atoms are 1.3 × 10⁻¹² [(6.65 × 10¹⁰)/(5.0 × 10²²)] of the total carbon in living organisms.

A 70-kg human contains about 9 × 10⁻⁸ Ci, or 0.09 µCi, of ¹⁴C. A 70-kg human contains 13,000 g of carbon (70 kg × 0.185 × 1000 g/kg). At 15.3 dpm per gram of carbon, this corresponds to 1.98 × 10⁵ dpm (13,000 g × 15.3 dpm/g), which is 8.9 × 10⁻⁸ Ci [1.98 × 10⁵ dpm × (Ci/2.22 × 10¹² dpm)]. A little over twice this amount of radioactivity (0.2 µCi) is present in humans due to the natural abundance of ⁴⁰K, which has a half-life of 1.3 × 10⁹ years (a decay constant of 1.01 × 10⁻¹⁵ min⁻¹) and constitutes 0.012% of the potassium on Earth. A human is about 0.35% potassium by weight.

The sample is 13,500 years old. Its age can be calculated from the equation for first-order decay (see Problem 2-46).

$$2.3 \log \frac{N}{N_0} = -\lambda t$$

$$t = -\frac{2.3}{\lambda} \log \frac{N}{N_0}$$

Since the dpm per gram of carbon is proportional to the numbers of carbon atoms present initially (N₀) and at the current time (N), dpm values can be substituted for them.

$$t = -\frac{2.3}{1.21 \times 10^{-4}/\text{year}} \times \log \frac{3.0}{15.3} = -1.9 \times 10^{4}\ \text{years} \times (-0.71)$$

$$t = 1.35 \times 10^{4}\ \text{years, or 13,500 years}$$

2-48 The specific activity of glycine would be 0.12 Ci/mmol if all carbons were ¹⁴C. The number of radioactive atoms in 1 mmol of glycine is 12 × 10²⁰ (6 × 10²⁰ molecules/mmol × 2 ¹⁴C/molecule). With a decay constant (λ) of 2.30 × 10⁻¹⁰/min, this number of ¹⁴C atoms corresponds to 2.76 × 10¹¹ dpm (12 × 10²⁰ × 2.30 × 10⁻¹⁰), which is 0.124 Ci [2.76 × 10¹¹ dpm/(2.22 × 10¹² dpm/Ci)].

At a specific activity of 200 µCi/mmol, about 1 in 300 molecules of glycine would carry a ¹⁴C atom. 200 µCi/mmol is 1/600 of the maximum specific activity [(200 × 10⁻⁶ Ci/mmol)/(0.12 Ci/mmol)], and since there are 2 carbons per glycine, about 1 in 300 molecules will have a ¹⁴C atom.

Ethanol in 5% beer is 0.86 M. Pure ethanol is 17.2 M [(789 g/L) × (mole/46 g)], and thus 5% beer would be 0.86 M ethanol (17.2 M × 0.05).

At a legal limit of 80 mg/100 mL, ethanol will be 17.4 mM in the blood [(80 mg/0.1 L) × (mmol/46 mg)].

At the legal limit (17.4 mM), ethanol in 5% beer (0.86 M) has been diluted 49.4-fold (860 mM/17.4 mM). This
dilution represents 809 mL in 40 L of body water (40 L/49.4). At 355 mL per beer, this equals 2.3 beers (809 mL/355 mL).

It would take nearly 4 hours. At twice the legal limit, the person would contain 64 g of ethanol [(0.16 g/0.1 L) × 40 L]. The person would metabolize 8.4 g/hr [(0.12 g/hr kg) × 70 kg]. Thus, to metabolize 32 g of ethanol (the amount in excess of the legal limit) would require 3.8 hours [32 g × (hr/8.4 g)].

Hydronium (H₃O⁺) ions result from water dissociating into protons and hydroxyl ions, each proton binding to a water molecule to form a hydronium ion (2 H₂O → H₂O + H⁺ + OH⁻ → H₃O⁺ + OH⁻). At neutral pH the concentrations of H₃O⁺ ions and OH⁻ ions are equal. We know that at neutrality the pH is 7.0, and therefore the H⁺ (H₃O⁺) concentration is 10⁻⁷ M.

The molecular weight of water is 18, and thus water has a mass of 18 g/mole. The mass of 1 liter of water is 1000 g. Thus, pure water is 55.6 M [(1000 g/L) × (mole/18 g)]. (Because the mass of water is actually 18.015 g/mole, pure water is 55.5 M.) The ratio of H₃O⁺ ions (10⁻⁷ M) to H₂O molecules (55.5 M) is 1.8 × 10⁻⁹. Thus, at neutral pH only about 2 water molecules in a billion are dissociated.

A solution is said to be neutral when the concentrations of H⁺ and OH⁻ are exactly equal. This occurs when the concentration of each ion is 10⁻⁷ M, so that their product is 10⁻¹⁴ M².

In a 1 mM solution of NaOH, the concentration of OH⁻ is 10⁻³ M. Thus, the concentration of H⁺ is 10⁻¹¹ M, which is pH 11.

$$[\mathrm{H^+}] = \frac{K_w}{[\mathrm{OH^-}]} = \frac{10^{-14}\ \mathrm{M^2}}{10^{-3}\ \mathrm{M}} = 10^{-11}\ \mathrm{M}$$

A pH of 5.0 corresponds to an H⁺ concentration of 10⁻⁵ M. Thus, the OH⁻ concentration is 10⁻⁹ M (10⁻¹⁴ M²/10⁻⁵ M).

It would take 50 mL of KOH to neutralize the protons in the HCl solution and in the solution of acetic acid. Both solutions contain 50 mmol of titratable H⁺ (0.5 L × 0.1 mole/L = 0.05 mole, or 50 mmol). The titratable H⁺ will be exactly neutralized when 50 mL of 1 M KOH (0.05 L × 1 mole/L = 0.05 mole, or 50 mmol) has been added, as indicated in Figure 2-47.

At the equivalence point, titration of the HCl solution by KOH will give a pH 7 solution of slightly less than 0.1 M KCl (0.091 M, because of the 10%
increase in volume due to added KOH) (Figure 2-47A). At the equivalence point the solution is the same as would be generated by dissolving the appropriate amount of KCl in water, which gives a neutral pH (pH 7), as discussed in Problem 2-26.

At the equivalence point, titration of acetic acid by KOH will give a 0.091 M solution of potassium acetate (K⁺ CH₃COO⁻). As discussed in Problem 2-26, a solution of potassium acetate will be basic. The pH changes rapidly in the region of the equivalence point, making it difficult to estimate the pH from the curve in that region, but it is somewhere around pH 9 (Figure 2-47B). Knowing the pK for acetic acid and the concentration allows an exact calculation of the pH at the equivalence point, which is 8.86. As discussed below, the titration curve does give a good estimate of the pK for acetic acid, which would allow the calculation of pH at the equivalence point.

[Figure 2-47 Titration curves for 500 mL solutions of 0.1 M HCl and acetic acid (Answer 2-52). Panel (A): titration of HCl. Axes: pH versus 1 M KOH added (mL). Annotations: half-titrated; equivalence point; pH changes rapidly.]

The pK is equal to the pH when the concentrations of CH₃COO⁻ and CH₃COOH are equal, which occurs when half the ionizable H⁺ has been titrated; that is, when 25 mL of 1 M KOH has been added. From the graph in Figure 2-47B, the pH of the solution at that point can be estimated to be between 4.6 and 4.9. The actual pK of acetic acid is 4.76.

The values for log [A⁻]/[HA], [A⁻]/[HA], and the percentage of the acid that has dissociated are shown in Table 2-8. Included in the table are a set of "rule-of-thumb" values that may be easier to remember and are handy to have mentally available for estimating answers.

Table 2-8 Dissociation of a weak acid at pH values above and below the pK (Answer 2-53).
pH        log [A⁻]/[HA]   [A⁻]/[HA]   % dissociated   Rule of thumb (%)
pK + 4    4               10⁴         99.99
pK + 3    3               10³         99.9
pK + 2    2               10²         99
pK + 1    1               10¹         91
pK        0               10⁰         50
pK − 1    −1              10⁻¹        9.1             10
pK − 2    −2              10⁻²        0.99            1
pK − 3    −3              10⁻³        0.099           0.1
pK − 4    −4              10⁻⁴        0.0099          0.01

A plot of pH versus percentage dissociation of the weak acid HA is shown in Figure 2-48. All weak acids, regardless of pK, yield
titration curves that are identical to this one. The curves for different weak acids are shifted along the pH scale, depending on their pK values.

[Figure 2-48 Percentage dissociation of a weak acid as a function of pH (Answer 2-53).]

This titration curve is fundamentally similar to protein–ligand binding curves and to enzyme activity curves. As pointed out in Problem 3-103, all three phenomena (titration of weak acids, protein–ligand binding, and enzyme activity) generate identical curves. Most importantly, the rule-of-thumb values pertain to each, allowing rapid estimates in all three situations.

Estimating the percentages of the four forms of phosphate at pH 7 is straightforward from the rule-of-thumb values derived in Problem 2-53 (see Table 2-8). Since the cytosol is about 5 pH units above the pK for dissociation of H₃PO₄, it will be about 99.999% ionized; thus, H₃PO₄ will account for only about 0.001% of the total. Since the cytosol is just slightly above the pK for dissociation of H₂PO₄⁻, H₂PO₄⁻ would be slightly less than 50% and HPO₄²⁻ would be slightly greater than 50% of the total. Since the cytosol is more than 5 pH units below the pK for dissociation of HPO₄²⁻, it will be less than 0.001% ionized; thus, PO₄³⁻ will account for less than 0.001% of the total.

The ratio of [HPO₄²⁻] to [H₂PO₄⁻] ([A⁻]/[HA]) in the cytosol at pH 7 can be calculated using the Henderson–Hasselbalch equation.

$$\mathrm{pH} = \mathrm{p}K + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]}$$

Substituting,

$$7.0 = 6.9 + \log \frac{[\mathrm{HPO_4^{2-}}]}{[\mathrm{H_2PO_4^-}]}$$

$$0.1 = \log \frac{[\mathrm{HPO_4^{2-}}]}{[\mathrm{H_2PO_4^-}]}$$

$$\frac{[\mathrm{HPO_4^{2-}}]}{[\mathrm{H_2PO_4^-}]} = 1.26$$

Because these two forms of phosphate sum to 100% (the other two forms are negligible), the ratio can be used to calculate the percentage of each form that is present: 44% for H₂PO₄⁻ and 56% for HPO₄²⁻. Thus, if the cytosolic concentration of phosphate is 1 mM, then the concentration of H₂PO₄⁻ is 0.44 mM and that of HPO₄²⁻ is 0.56 mM.

You could start with any of the individual solutions of phosphate and make a 1 mM solution by adding 10 mL to a liter of water. You would want to start with a little less than a liter
of water so that you could adjust the pH by addition of HCl or KOH as needed. When the pH was adjusted to 6.9, you would then bring the solution to a final volume of 1 liter by addition of water. If you started with H₃PO₄, you would need to add 1.5 mL of 1 M KOH (1.5 equivalents of OH⁻) to bring the solution to pH 6.9. If you started with KH₂PO₄, you would need to add 0.5 mL of KOH (0.5 equivalents of OH⁻) to bring the solution to pH 6.9. If you started with K₂HPO₄, you would need to add 0.5 mL of 1 M HCl (0.5 equivalents of H⁺) to bring the pH to 6.9. If you started with K₃PO₄, you would need to add 1.5 mL of HCl (1.5 equivalents of H⁺) to bring the pH to 6.9.

In all these cases you are moving phosphate along its titration curve to reach 6.9, the pK for H₂PO₄⁻ ⇌ H⁺ + HPO₄²⁻, at which point [H₂PO₄⁻] equals [HPO₄²⁻]. You could reach that same point by mixing equal amounts of the species directly. Thus, you would get a 1 mM solution at pH 6.9 by mixing 5 mL aliquots of the KH₂PO₄ and K₂HPO₄ solutions in 1 liter of water. (In practice, you would still measure the pH to make sure it was 6.9, in case one of your original solutions was not exactly as advertised.) Similarly, you could also achieve the same end by mixing 5 mL aliquots of the H₃PO₄ and K₃PO₄ solutions.

The solutions will not all be the same. Although they will all be 1 mM phosphate, they can differ in the amount of K⁺ and Cl⁻ ions. Solutions that are brought up to pH 6.9 by addition of KOH, and those that are obtained by direct mixing of different phosphate solutions, will be 1.5 mM K⁺. A solution of K₂HPO₄ that is adjusted downward to pH 6.9 with HCl will be 2 mM K⁺ and 0.5 mM Cl⁻. A solution of K₃PO₄ that is adjusted to pH 6.9 with HCl will be 3 mM K⁺ and 1.5 mM Cl⁻. Depending on the uses for which the buffer is intended, these differences could be crucial. It is especially important to note that these calculations assume that the pH adjustments with KOH or HCl were precise. If one is sloppy and overshoots the pH and then undershoots it several times
before hitting it exactly, the concentrations of K⁺ and Cl⁻ can be arbitrarily high, even though the phosphate is exactly 1 mM and the pH is exactly 6.9. This can lead to very puzzling results (and has).

Since the pK values of the two buffering systems are not very different, the key consideration is overall concentration of the buffers. The concentration of globin chains in red blood cells is 6.7 mM.

$$[\text{globin}] = \frac{100\ \mathrm{mg}}{\mathrm{mL}} \times \frac{\mathrm{mole}}{15{,}000\ \mathrm{g}} \times \frac{\mathrm{g}}{1000\ \mathrm{mg}} \times \frac{1000\ \mathrm{mL}}{\mathrm{L}} = 6.7 \times 10^{-3}\ \mathrm{mole/L} = 6.7\ \mathrm{mM}$$

If all ten histidines can interact with the cytosol (that is, are not tied up in ionic bonds, for example), then the total concentration of histidines in globin is 67 mM (10 × 6.7 mM). Thus, the potential buffering capacity of the histidines in globin (67 mM) is much greater than that of phosphate (at 1 mM).

The ratio of [HCO₃⁻] to [CO₂(dis)] at pH 7.4 is 20.

$$\mathrm{pH} = \mathrm{p}K + \log \frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2(dis)}]}$$

$$7.4 = 6.1 + \log \frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2(dis)}]}$$

$$\log \frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2(dis)}]} = 1.3, \quad \text{and} \quad \frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2(dis)}]} = 20$$

Since the total carbonate is 25 mM, [HCO₃⁻] is 23.8 mM [(20/21) × 25 mM] and [CO₂(dis)] is 1.2 mM [(1/21) × 25 mM].

Addition of 5 mM of H⁺ would drive 5 mM of HCO₃⁻ to CO₂(dis), thereby maintaining the equilibrium for hydration and dissociation of CO₂. Thus, addition of 5 mM H⁺ would reduce HCO₃⁻ to 18.8 mM and increase CO₂(dis) to 6.2 mM. At these concentrations the pH would be 6.6.

$$\mathrm{pH} = \mathrm{p}K + \log \frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2(dis)}]} = 6.1 + \log \frac{18.8}{6.2} = 6.1 + 0.48 = 6.6$$

In a closed system, bicarbonate/CO₂ would provide a very weak buffering system, with a very small buffering capacity.

In an open system, addition of 5 mM H⁺ would cause the same changes as above, except that the excess CO₂ would be removed by exhalation, maintaining its concentration at 1.2 mM. Under these conditions the pH would be 7.3.

$$\mathrm{pH} = \mathrm{p}K + \log \frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2(dis)}]} = 6.1 + \log \frac{18.8}{1.2} = 6.1 + 1.19 = 7.3$$

Thus, in an open system the pH decreases by only about 0.1 pH unit. The beauty of this buffering system is that HCO₃⁻ is constantly being added back to the system through metabolism, which generates CO₂ that is then hydrated to HCO₃⁻. Moreover, the two components of the system are independently regulated: CO₂ exhalation by the lungs can be
controlled by the rate of breathing, and HCO₃⁻ can be excreted or retained by the kidneys.

The concentration of protein is about 200 mg/mL (0.18 × 1.1 g/mL = 198 mg/mL). Note that if you are given the density of the cell, you don't need to know its volume to calculate the concentration of protein.

DATA HANDLING

2-59 The effects on pK values are due to electrostatic interactions between the carboxyl and amino groups. In alanine, a large electrostatic attraction between the –NH₃⁺ and the –COO⁻ is present at pH 7. This favorable interaction makes it more difficult to remove a proton from –NH₃⁺ (raising its pK) and more difficult to add a proton to –COO⁻ (lowering its pK). The electrostatic attraction decreases as the amino and carboxyl groups are moved farther and farther away from one another in the oligomers of alanine, virtually disappearing by Ala₄, as reflected in the changes in pK values.

Reference: Cantor CR & Schimmel PR (1980) Biophysical Chemistry, Part 1: The Conformation of Biological Macromolecules, pp 42–46. San Francisco: WH Freeman and Company.

2-60 Assuming that the change in enzyme activity is due to the change in protonation state of histidine, the enzyme must require histidine in the protonated, charged state. The enzyme is active only below the pK of histidine (which is typically around 6.5 to 7.0 in proteins), where the histidine is expected to be protonated.

CATALYSIS AND THE USE OF ENERGY BY CELLS

DEFINITIONS

2-61 Activation energy
2-62 Standard free-energy change, ΔG°
2-63 Oxidation
2-64 Substrate
2-65 Diffusion
2-66 Enzyme
2-67 Coupled reaction
2-68 Equilibrium
2-69 Free energy (G)

TRUE/FALSE

2-70 True. The difference between plants and animals is in how they obtain their food molecules. Plants make their own using the energy of sunlight, plus CO₂ and H₂O, whereas animals must forage for their food.

2-71 True. Oxidation–reduction reactions refer to those in which electrons are removed from one atom and transferred to another. Since the number of electrons is
conserved (no loss or gain) in a chemical reaction, oxidation (removal of electrons) must be accompanied by reduction (addition of electrons).

2-72 False. The equilibrium constant for the reaction A ⇌ B remains unchanged; it's a constant. Linking reactions together can convert an unfavorable reaction into a favorable one, but it does so not by altering the equilibrium constant, but rather by changing the concentration ratio of products to reactants.

THOUGHT PROBLEMS

2-73 Catabolic pathways break down larger molecules (often derived from food) into smaller molecules and abstract energy in a useful form in the process. Anabolic pathways, or biosynthetic pathways, construct larger molecules from smaller ones. The small molecules generated by catabolic pathways are used as starting points and intermediates in anabolic pathways, and the energy from catabolic pathways, harnessed in the form of activated carriers, is used to drive the energetically unfavorable process of biosynthesis.

The second law of thermodynamics applies to closed systems, which could be a chamber in a scientist's laboratory, for example, or the entire universe. Closed systems do not exchange matter or energy with their surroundings. Living organisms such as cells and human beings are not closed systems: they continually exchange matter and energy with their surroundings. It is perfectly permissible for a portion of a closed system (a human being in the universe) to increase its order, provided that the rest of the system (the rest of the universe) becomes disordered to a greater extent. This is what living organisms do: they take in food and use the energy to increase their order. But to do so they release waste products that are less complex (less ordered) than the food they took in, and much of the energy in the food is released in its most disordered form, as heat. Whatever order is created within a cell or an organism is more than paid for by the disorder introduced into its environment.

The overall reaction
of photosynthesis is unlikely to be carried out by a single enzyme because of the large number of covalent bonds that must be made to convert CO₂ to glucose. Typically, enzymes alter one or a pair of covalent bonds in a reaction. The conversion of CO₂ to glucose requires a large number of proteins to carry out the individual steps in the overall pathway.

Because sugars are more complicated molecules than CO₂ and H₂O, the reaction generates a more ordered state inside the cell. As demanded by the second law of thermodynamics, heat is generated at many steps along the pathway of these reactions, as summarized in the equation for photosynthesis.

If the reaction is rewritten as its two half reactions, it is then clear that Na is oxidized and Cl is reduced.

$$\mathrm{2\,Na \rightarrow 2\,Na^+ + 2\,e^-}$$

$$\mathrm{2\,e^- + Cl_2 \rightarrow 2\,Cl^-}$$

$$\text{NET:}\quad \mathrm{2\,Na + Cl_2 \rightarrow 2\,Na^+ + 2\,Cl^-}$$

Electrons are removed from sodium; therefore, it is oxidized. Electrons are added to chlorine; therefore, it is reduced.

For polymerization to be favored at high temperature and depolymerization to be favored at low temperature, ΔH and ΔS must both be positive (recall that ΔG = ΔH − TΔS). At high temperature, where polymerization is favored, the TΔS term becomes large enough to overcome the positive ΔH term, yielding a negative (favorable) ΔG for polymerization. At low temperature, where depolymerization is favored, the TΔS term becomes small enough that it is outweighed by the positive ΔH term, giving rise to a positive (unfavorable) ΔG for polymerization.

It seems counterintuitive that polymerization of free tubulin subunits into highly ordered microtubules should occur with an overall increase in entropy (decrease in order). But it is counterintuitive only if one considers the subunits in isolation. Remember that thermodynamics refers to the whole system, which includes the water molecules. The increase in entropy is due largely to the effects of polymerization on water molecules. The surfaces of the tubulin subunits that bind together to form microtubules are fairly hydrophobic, and constrain (order) the water molecules in their immediate
vicinity. Upon polymerization, these constrained water molecules are freed up to interact with other water molecules. Their new-found disorder much exceeds the increased order of the protein subunits, and thus the net increase in entropy (disorder) favors polymerization.

Reaction rates could be limited by any combination of the following factors: (1) the frequency of collision with the active site of the enzyme, (2) the proportion of molecules that are energetic enough to undergo reaction, or (3) the rate of release of the products from the active site.

The 10⁷-fold rate enhancement corresponds to the ratio of the areas under the curve to the right of the thresholds in Figure 2-22. The number of molecules with sufficient energy to undergo a catalyzed reaction (the area to the right of threshold A) divided by the number of molecules with sufficient energy to undergo an uncatalyzed reaction (the area to the right of threshold B) is equal to 10⁷.

2-79 The statement is correct. A reaction with a negative ΔG°, for example, would not proceed spontaneously under conditions where there is already an excess of products over those that would be present at equilibrium. Conversely, a reaction with a positive ΔG° would proceed spontaneously under conditions where there is an excess of substrates compared to those present at equilibrium.

At the same concentrations, the ΔG values for the forward and reverse reactions will be the same magnitude but differ in sign. Thus, the ΔG for C + D → A + B is 4.5 kcal/mole.

Absolutely none. These values provide information about how energetically favorable a reaction is under standard conditions (ΔG°) and under actual conditions (ΔG). They provide no information about the rate at which a favorable reaction will occur. Rates depend on other factors in the cell, most commonly on the existence of enzymes and their properties.

2-82 The cell links these two reactions by providing an enzyme that catalyzes the net reaction directly. Thus, in the cell
phosphorylation of glucose does not occur as the sum of two reactions but as a single reaction. The enzyme hexokinase, for example, binds glucose and ATP and catalyzes the transfer of a phosphate directly from ATP to glucose.

A positive ΔG for D → E and a negative ΔG for E → F is an unstable situation. A positive ΔG for D → E means that E will be converted to D. This will reduce the concentration of E and increase that of D. Meanwhile, more D will be added from the upstream reaction and E will be removed by the downstream reaction. These effects, which increase D and decrease E, continue until the concentration ratio [E]/[D] is sufficient to drive the reaction in the forward (D → E) direction, at which point the ΔG becomes negative.

The free energy (ΔG = −11 to −13 kcal/mole) derived from ATP hydrolysis depends on both ΔG° (−7.3 kcal/mole) and the concentrations of the substrates and products.
ΔG = ΔG° + 1.41 kcal/mole × log([ADP][Pi]/[ATP])
Given that ΔG° is −7.3 kcal/mole, the ratio [ADP][Pi]/[ATP] in cells must range from a little more than 10⁻³ (ΔG = −11.5 kcal/mole) to a little less than 10⁻⁴ (ΔG = −12.9 kcal/mole), as it does under different cellular conditions.

2-85 Enzyme A is beneficial. It allows the interconversion of two energy carrier molecules, both of which are required in the triphosphate form for many metabolic reactions. Any ADP that is formed is quickly converted to ATP by oxidative phosphorylation, and thus the cell maintains a high ATP/ADP ratio. Because of enzyme A, called nucleotide phosphokinase, some of the ATP is used to keep the GTP/GDP ratio similarly high.
Enzyme B would be highly detrimental to the cell. Cells use NAD⁺ as an electron acceptor in catabolic reactions and must maintain a high NAD⁺/NADH ratio to support the breakdown of glucose and fats to make ATP. By contrast, NADPH is used as an electron donor in biosynthetic reactions; cells thus maintain a high NADPH/NADP⁺ ratio to drive the synthesis of various biomolecules. Since enzyme B would bring both ratios to 1, it would reduce the rates of both
catabolic and anabolic reactions.

The two lists match up as follows: A with 1, B with 5, C with 6, D with 2, E with 3, and F with 4.

Reactions B, D, and E all require coupling to other energetically favorable reactions. In each case, molecules are made that have higher-energy bonds. By contrast, in reactions A and C, simpler molecules (A) or lower-energy bonds (C) are made.

CALCULATIONS

2-88 The oxidation states for each of the carbons in the various two-carbon molecules are shown in Figure 2-49. The molecules are ordered from most reduced (ethane) to most oxidized (acetate and acetamide) in Figure 2-49.
In all cases, the differences in oxidation state correspond to a pair of electrons. Virtually all redox chemistry between organic molecules in cells involves transfers of pairs of electrons, usually as a hydride ion (a proton with two electrons). Transfers of pairs of electrons are favored because molecular orbitals are most stable with even numbers of electrons. If a molecule acquires or gives up a single electron, it will have an unpaired electron, an extremely unstable and reactive state known as a free radical. Free-radical chemistry is rarely used in biological reactions, although it is critically important, for example, in electron transport during respiration (see Chapter 14) and in the reduction of ribonucleotides to deoxyribonucleotides.
A carbon-carbon double bond (H₂C=CH₂), an alcohol (H₃C-CH₂OH), an amine (H₃C-CH₂NH₂), a thiol (H₃C-CH₂SH), and a phosphate ester (H₃C-CH₂PO₄²⁻) are all at the same oxidation state. Thus, any reaction that converts one into the other is not a redox reaction. Similarly, carboxyls (H₃C-COO⁻) and amides (H₃C-CONH₂) are at the same oxidation state. Carbonyls (H₃C-CHO) are intermediate, and carbon-carbon single bonds are the most reduced form. If you remember the relationships among these molecules, you will be able to handle about 95% of biological redox reactions at a glance. For the rest, you can apply the rules in this problem.

The first and third reactions in the series (succinate → fumarate and malate
→ oxaloacetate) are redox reactions. In the first, the two central carbons of succinate are oxidized by the introduction of a double bond; they have each lost one electron, and their oxidation states have increased from −2 to −1. In the third reaction, the carbon attached to the hydroxyl group has been oxidized by conversion to a carbonyl group; it has lost a pair of electrons, and its oxidation state has increased from 0 to +2. The second reaction is not a redox reaction, since the oxidation states of the two central carbons have changed in compensating ways: the upper one from −1 to 0 and the lower one from −1 to −2.
Since the molecules in the pathway have been oxidized, the unseen electron carriers must have been reduced; that is, they must have accepted a pair of electrons from pathway intermediates. It is instructive to examine the reactions a little closer. In addition to losing a pair of electrons, both succinate and malate also lose a pair of hydrogens. One of the hydrogens and both electrons travel together as a hydride ion (H⁻) to the electron carrier; the other hydrogen is released as a proton. The electron carrier in the oxidation of succinate is FAD, which picks up both the hydride ion and the proton to become FADH₂. The electron carrier in the oxidation of malate is NAD⁺, which picks up the hydride ion to become NADH but leaves the proton in solution.

Figure 2-49 A series of two-carbon molecules arranged in order of increasing oxidation state (Answer 2-88).
OXIDATION STATES
MOLECULE                            C1    C2    SUM
1 ethane          H₃C-CH₃           −3    −3    −6   (most reduced)
2 ethene          H₂C=CH₂           −2    −2    −4
3 ethanol         H₃C-CH₂OH         −3    −1    −4
4 phosphoethanol  H₃C-CH₂-PO₄²⁻     −3    −1    −4
5 ethylamine      H₃C-CH₂NH₂        −3    −1    −4
6 thioethane      H₃C-CH₂SH         −3    −1    −4
7 acetaldehyde    H₃C-CHO           −3    +1    −2
8 acetate         H₃C-COO⁻          −3    +3     0
9 acetamide       H₃C-CONH₂         −3    +3     0   (most oxidized)

The enzyme catalyzes events at the rate of 1 event/3.2 × 10⁻⁵ sec, or 1 event/32 μsec.
(10¹⁴ events/100 yr) × (yr/365 days) × (day/24 hr) × (hr/60 min) × (min/60 sec) = 3.2 × 10⁴ events/sec

The
instantaneous velocities are H₂O = 3.8 × 10⁴ cm/sec, glucose = 1.2 × 10⁴ cm/sec, and myoglobin = 1.3 × 10³ cm/sec. The calculation for a water molecule, which has a mass of 3 × 10⁻²³ g [18 g/mole × mole/(6 × 10²³ molecules)], is shown below.
v = (kT/m)^½
v = [(1.38 × 10⁻¹⁶ g cm²/K sec²) × 310 K / (3 × 10⁻²³ g)]^½
v = 3.78 × 10⁴ cm/sec
When these numbers are converted to km/hr, the results are fairly astounding. Water moves at 1360 km/hr, glucose at 428 km/hr, and myoglobin at 47 km/hr. Thus, even the largest (slowest) of these molecules is moving faster than the swiftest human sprinter. And water molecules are traveling at Mach 1.1! Unlike a human sprinter or a jet airplane, these molecules make forward progress only slowly because they are constantly colliding with other molecules in solution.
Reference: Berg HC (1993) Random Walks in Biology, Expanded Edition, pp. 5-6. Princeton, NJ: Princeton University Press.

It would take glucose an average of 0.13 second and myoglobin an average of 1.3 seconds to diffuse 20 μm. The calculation for glucose is
t = x²/6D
t = (20 μm × cm/10⁴ μm)² × sec/(6 × 5 × 10⁻⁶ cm²)
t = 0.13 sec
Reference: Berg HC (1993) Random Walks in Biology, Expanded Edition, pp. 5-6. Princeton, NJ: Princeton University Press.

2-93 At equilibrium, ΔG is zero for any reaction; there is no tendency for the reaction to proceed in one direction over the other. Substituting K for [F6P]/[G6P] gives
0 = ΔG° + 2.3 RT log K
ΔG° = −2.3 RT log K, or −1.41 kcal/mole × log K
At equilibrium, ΔG is zero and ΔG° is 0.42 kcal/mole.
0 = ΔG° + 2.3 RT log([F6P]/[G6P])
Substituting 1.41 kcal/mole for 2.3 RT,
ΔG° = −1.41 kcal/mole × log 0.5 = 0.42 kcal/mole
Since ΔG° relates to the equilibrium, it is unchanged; that is, ΔG° = 0.42 kcal/mole. At ΔG = −0.6 kcal/mole, the ratio of F6P to G6P is 0.19.
ΔG = ΔG° + 2.3 RT log([F6P]/[G6P])
−0.6 kcal/mole = 0.42 kcal/mole + 1.41 kcal/mole × log([F6P]/[G6P])
log([F6P]/[G6P]) = (−1.02 kcal/mole)/(1.41 kcal/mole) = −0.72
[F6P]/[G6P] = 0.19

The equilibrium constant K = 4.6 × 10⁻³ M⁻¹.
ΔG° = −2.3 RT log K
3.3 kcal/mole = −1.41 kcal/mole × log K
log K = −(3.3 kcal/mole)/(1.41 kcal/mole) = −2.34
K = 4.57 × 10⁻³ (4.57 × 10⁻³ M⁻¹)
You will
note that this expression gives the numerical value of K because it enters the equation as a log value, where units are meaningless. Because of this, and because the units derived from the equilibrium expression can vary so widely (for example, M², M, no units, M⁻¹, M⁻², etc., depending on the reaction), some sources treat the equilibrium constant as a unitless number by convention. Throughout this book we have added back the appropriate units so that they can be used as a guide in other calculations, such as the one below.

The equilibrium concentration of G6P would be 1.1 × 10⁻⁷ M, or 0.11 μM, if GLC and Pi were at 5 × 10⁻³ M (5 mM). Remember that in the expression for equilibrium, all concentrations are in M.
K = [G6P]/([GLC][Pi])
[G6P] = 4.57 × 10⁻³ M⁻¹ × (5 × 10⁻³ M) × (5 × 10⁻³ M)
[G6P] = 1.14 × 10⁻⁷ M = 0.11 μM
This would be a low concentration for an intermediate in glucose metabolism, most of which are in the range 10 to 100 μM or so. However, given that one intermediate in the glycolytic pathway, 1,3-bisphosphoglycerate, is present at about 1 μM, perhaps this concentration for G6P is not out of the question. Nevertheless, cells have devised a different way to carry out this reaction.

The value of ΔG° for the net reaction is obtained by adding together the ΔG° values for the individual reactions; thus, ΔG° for the net reaction is −4.0 kcal/mole (3.3 kcal/mole − 7.3 kcal/mole).

The equilibrium constant K is 6.9 × 10².
ΔG° = −2.3 RT log K
−4.0 kcal/mole = −1.41 kcal/mole × log K
log K = (4.0 kcal/mole)/(1.41 kcal/mole) = 2.84
K = 6.87 × 10²

The equilibrium [G6P] would be 10.3 M, which is an improbably high concentration for a cell. It far exceeds the amount of available phosphate and would be more viscous than maple syrup.
K = [G6P][ADP]/([GLC][ATP])
[G6P] = 6.87 × 10² × (5 × 10⁻³ M) × (3 × 10⁻³ M)/(1 × 10⁻³ M)
[G6P] = 10.3 M

E. The cellular [G6P] of 200 μM is about 50,000-fold lower than the equilibrium concentration (10.3 M) calculated in part D. Obviously, in cells the phosphorylation of glucose to glucose 6-phosphate is not at equilibrium. In cells, ΔG for the reaction is −6.6
kcal/mole.
ΔG = ΔG° + 2.3 RT log([G6P][ADP]/([GLC][ATP]))
ΔG = −4.0 kcal/mole + 1.41 kcal/mole × (−1.88)
ΔG = −4.0 kcal/mole − 2.64 kcal/mole = −6.64 kcal/mole

2-95 The whole population of ATP molecules in the body would turn over (cycle) 1800 times per day, or a little more than once a minute. Conversion of 3 moles of glucose to CO₂ would generate 90 moles of ATP (3 moles glucose × 30 moles ATP/mole glucose). The whole body contains 5 × 10⁻² mole ATP (2 × 10⁻³ mole/L × 25 L). Since the concentration of ATP doesn't change, each ATP must cycle 1800 times per day [(90 moles ATP/day)/(5 × 10⁻² mole ATP)].

DATA HANDLING

2-96 Addition of the final subunit into the ring involves two sets of bonds, one to each of its neighbors (Figure 2-50). If the bonds were equally strong, then you might expect a squaring of the equilibrium constant (a doubling of ΔG°). In reality, the situation is somewhat more complex and can give rise to equilibrium constants several orders of magnitude higher than the square. This rather simple treatment, however, serves to illustrate the stability that can be gained by closure. Icosahedral viruses, for example, are very stable as a consequence of closure in three dimensions.
Reference: Howard (2001) Mechanics of Motor Proteins and the Cytoskeleton, pp. 151-163. Sunderland, MA: Sinauer Associates, Inc.

2-97 The theoretical underpinnings of the assertion that all ΔG values must be negative are so strong that the error must lie with the experiment. One potential source of error is that the concentrations of the intermediates have not been measured precisely enough. That is the most likely explanation, given the formidable experimental challenges to such precise, instantaneous measurements of concentration. The other possibility is that the ΔG° values (equilibrium values) are slightly off relative to their true values under physiological conditions.
References: Berg JM, Tymoczko JL & Stryer L (2002) Biochemistry, Fifth Edition, pp. 436-437. New York: WH Freeman and Co.
Minakami S, Suzuki C, Saito T & Yoshikawa H (1965) Studies on
erythrocyte glycolysis. I. Determination of the glycolytic intermediates in human erythrocytes. J. Biochem. (Tokyo) 58, 543-550.

Figure 2-50 Closure of the pentameric ring with two bonds (Answer 2-96).

HOW CELLS OBTAIN ENERGY FROM FOOD

DEFINITIONS

2-98 Citric acid cycle
2-99 Fat
2-100 Glycogen
2-101 Oxidative phosphorylation
2-102 Electron-transport chain
2-103 Glycolysis

TRUE/FALSE

2-104 False. Glycolysis is the only metabolic pathway that can generate ATP in the absence of oxygen. There are many circumstances in which cells are temporarily exposed to anoxic conditions, during which time they survive by glycolysis. For example, in an all-out sprint the circulation cannot deliver adequate oxygen to leg muscles, which continue to power muscle contraction by passing large amounts of glucose (from glycogen) down the glycolytic pathway. Similarly, there are several human cell types that do not carry out oxidative metabolism; for example, red blood cells, which have no mitochondria, make ATP via glycolysis. Thus, glycolysis is critically important, but it's sort of like insurance: it's not so important until you need it, and then it's hard to do without.

2-105 True. Oxygen is not a substrate or a product for any reaction in the citric acid cycle. Thus, the reactions can occur in the absence of oxygen. In cells, however, the reactions cannot proceed for very long in the absence of oxygen because NADH and FADH₂ cannot be converted back to NAD⁺ and FAD by oxidative phosphorylation, which does depend on oxygen. In the absence of NAD⁺ and FAD, four separate reactions of the cycle (see Figure 2-35) will cease to operate.

THOUGHT PROBLEMS

2-106 The two lists match up as follows: A with 2 and 3, B with 4, and C with 1.

2-107 A. One way to balance the equation for a pathway is to write down each reaction and sum them all up, as done below for the first two reactions of glycolysis.
(1) Glucose + ATP → G6P + ADP + H⁺
(2) G6P → F6P
SUM: Glucose + ATP + G6P → G6P + F6P + ADP + H⁺
Note that in the sum the
intermediate, G6P, appears on both sides and thus cancels out. Because the pathway intermediates always drop out of such a balanced equation, there is a less tedious way to balance the equation: all molecules at the blunt ends of the arrows are reactants and all molecules at the pointed ends are products. Ignore the intermediates, but pay careful attention to stoichiometry, since it is usually just the flow that is indicated. Using this method, the equation for the first stage of glycolysis is
Glucose + 2 ATP → 2 G3P + 2 ADP + 2 H⁺
B. The equation for the second stage of glycolysis is
G3P + Pi + NAD⁺ + 2 ADP → pyruvate + NADH + 2 ATP + H₂O
The overall equation is just the sum of these two equations (after doubling the numbers in the second equation to get the stoichiometry right).
Glucose + 2 ATP → 2 G3P + 2 ADP + 2 H⁺
2 G3P + 2 Pi + 2 NAD⁺ + 4 ADP → 2 pyruvate + 2 NADH + 4 ATP + 2 H₂O
SUM: Glucose + 2 ADP + 2 Pi + 2 NAD⁺ → 2 pyruvate + 2 ATP + 2 NADH + 2 H₂O + 2 H⁺

2-108 The extreme conservation of glycolysis is one form of evidence that all present cells are derived from a common ancestor. In this view, the elegant reactions of glycolysis would have evolved only once, and then they would have been inherited as cells evolved. The later invention of oxidative phosphorylation allowed 15 times more energy to be captured than is possible by glycolysis alone. This remarkable efficiency is close to the theoretical limit and hence virtually eliminates the opportunity for further improvements. The generation of alternative pathways would result in no obvious growth advantage that could have been selected in evolution.

2-109 Under anaerobic conditions, cells are unable to make use of pyruvate (the end product of the glycolytic pathway) and NADH. The electrons carried in NADH are normally delivered to the electron-transport chain for oxidative phosphorylation, but in the absence of oxygen the carried electrons are a waste product, just like pyruvate. Thus, in the absence of oxygen, pyruvate and NADH accumulate. Fermentation
combines these waste products into a single molecule, either lactate or ethanol, which is shipped out of the cell.
The flow of material through the glycolytic pathway could not continue in the absence of oxygen in cells that cannot carry out fermentation. Because NAD⁺ plus NADH is present in cells in limited quantities, anaerobic glycolysis in the absence of fermentation would quickly convert the pool largely to NADH. The change in the ratio NAD⁺/NADH would stop glycolysis at the step in which glyceraldehyde 3-phosphate (G3P) is converted to 1,3-bisphosphoglycerate (1,3BPG), a step with only a small negative ΔG normally (see Table 2-4). The purpose of fermentation is to regenerate NAD⁺ by transferring the pair of carried electrons in NADH to pyruvate and excreting the product. Thus, fermentation allows glycolysis to continue.

2-110 In the absence of oxygen, the energy needs of the cell must be met by fermentation to lactate, which requires a high rate of flow through glycolysis to generate sufficient ATP. When oxygen is added, the cell can generate ATP by oxidative phosphorylation, which generates ATP much more efficiently than glycolysis. Thus, less glucose is needed to supply ATP at the same rate.

2-111 In the presence of arsenate, 1-arseno-3-phosphoglycerate is formed instead of 1,3-bisphosphoglycerate (Figure 2-51). Because it is sensitive to hydrolysis in water, the arsenate high-energy bond is destroyed before the molecule that contains it can diffuse to the next enzyme. The product of the hydrolysis, 3-phosphoglycerate, is the same product normally formed, but because it is formed nonenzymatically, the reaction is not coupled to ATP formation. Arsenate wastes metabolic energy by uncoupling many phosphotransfer reactions by the same mechanism, and that is why it is so poisonous.

2-112 The reverse of the forward reaction is simply not a possibility under physiological conditions. Recall from Problems 2-83 and 2-97 that flow of material through a pathway requires that the ΔG values for every step must be negative. Thus, for a flow from
liver glycogen to serum glucose, the step from glucose 6-phosphate to glucose must have a negative ΔG. To simply reverse the forward reaction, that is,
G6P + ADP → GLC + ATP    ΔG° = 4.0 kcal/mole
would require that the concentration ratio [GLC][ATP]/([G6P][ADP]) be less than 10^−2.84 (0.0015), the value that just brings the reaction to equilibrium (ΔG = 0).
ΔG = ΔG° + 1.41 kcal/mole × log([GLC][ATP]/([G6P][ADP]))
0 = 4.0 kcal/mole + 1.41 kcal/mole × log([GLC][ATP]/([G6P][ADP]))
log([GLC][ATP]/([G6P][ADP])) = −(4.0 kcal/mole)/(1.41 kcal/mole) = −2.84
[GLC][ATP]/([G6P][ADP]) = 0.0015
Inside a functioning cell, such as a liver cell exporting glucose, the concentration of ATP always exceeds that of ADP, but just for illustrative purposes let's assume the ratio is 1. Under these conditions, the ratio [GLC]/[G6P] must be 0.00145; that is, the concentration of G6P must be nearly 700 times higher than that of GLC. Since the circulating concentration of glucose is maintained at between 4 and 5 mM, this corresponds to about 3 M G6P, an impossible concentration given that the total concentration of cellular phosphate is less than about 25 mM.

Figure 2-51 Hydrolysis of 1-arseno-3-phosphoglycerate (Problem 2-111). [Structures: 1-arseno-3-phosphoglycerate + H₂O → (spontaneous) 3-phosphoglycerate + arsenate + H⁺]

2-113 The statement is incorrect. The oxygen atoms that are part of CO₂ do not come from the oxygen atoms that are consumed as part of the oxidation of glucose (or of any other food molecule). The electrons that are abstracted from glucose at various stages in its oxidation are finally transferred to oxygen to produce water during oxidative phosphorylation. Thus, the oxygen used during oxidation of food in animals ends up as oxygen atoms in H₂O. One can show this directly by incubating living cells in an atmosphere that contains molecular oxygen enriched for the isotope ¹⁸O instead of the naturally occurring isotope ¹⁶O. In such an experiment, one finds that all the CO₂ released from cells contains only
¹⁶O. Therefore, the oxygen atoms in the released CO₂ molecules do not come directly from the atmosphere but rather from the organic molecules themselves and from H₂O.

2-114 The carbon atoms in a sugar molecule are already partially oxidized, in contrast to all but the very first carbon in a fatty acid. Thus, more electrons, and more energy, can be abstracted per carbon from a fatty acid than from a sugar. One consequence of this can be seen just by looking at what fraction of a fatty acid versus a sugar enters the citric acid cycle as acetyl CoA. Two carbon atoms are lost from glucose in its conversion to acetyl CoA; thus, only two-thirds of its carbons enter the citric acid cycle. By contrast, all of the carbons of fatty acids enter the citric acid cycle as acetyl CoA.

2-115 You make net amounts of protein only after a meal, when a mixture of essential amino acids is circulating. Depending on the composition of the meal, amino acids circulate for about 1 to 4 hours after eating, during which time they're consumed by protein synthesis, oxidation, and conversion to fatty acids and glycogen. Synthesis of increased amounts of protein is confined to this window because you must have the essential amino acids, which come from the diet, to carry out synthesis of net amounts of protein. Once the influx of amino acids from the diet abates, you can only recycle the amino acids you already have, which means that you cannot increase the amount of proteins.

2-116 The citric acid cycle continues because intermediates are replenished as necessary by reactions leading to the citric acid cycle (instead of away from it). One of the most important reactions of this kind is the conversion of pyruvate to oxaloacetate by the enzyme pyruvate carboxylase (Figure 2-52).
pyruvate + CO₂ + ATP + H₂O → oxaloacetate + ADP + Pi + 2 H⁺

2-117 Darwin exhaled the carbon atom, which therefore must be the carbon atom of a CO₂ molecule. After spending some time in the atmosphere, the CO₂ molecule likely entered a plant cell, where it became fixed by photosynthesis and
converted into part of a sugar molecule. While it is certain that these early steps must have happened this way, there are many different paths from there to the carbon atom in your hemoglobin. The sugar could have been broken down by the plant cell into pyruvate or acetyl CoA, for example, which then could have entered biosynthetic reactions to build an amino acid. The amino acid might have been incorporated into a plant protein, maybe an enzyme or a protein that builds the cell wall. You might have eaten the delicious leaves of the plant in your salad and digested the protein in your gut to produce amino acids again. After circulating in your bloodstream, the amino acid might have been taken up by a developing red blood cell to make its own protein, such as the hemoglobin in question.
The food-chain scenario can be much more complicated, of course. The plant, for example, might have been eaten by an animal, which in turn was consumed by you on your lunch break. Moreover, because Darwin died more than 100 years ago, the carbon atom could have traveled from animals to plants and back many times. In each round, it would have started again as fully oxidized CO₂ gas and entered the living world following its reduction during photosynthesis.

[Figure 2-52 diagram labels: CO₂ and PYR enter via pyruvate carboxylase; cycle intermediates AcCoA, OAA, CIT, ICIT, αKG, ScCoA, SUC, FUM, MAL; biosynthetic products shown include aspartate, asparagine, pyrimidines, glucose, fatty acids, cholesterol, glutamate, glutamine, proline, arginine, purines, heme, and chlorophyll.]

CALCULATIONS

2-118 At this rate of ATP regeneration, the cell will consume oxygen at 6.7 × 10⁻¹⁵ L/min.
(0.9 × 10⁹ ATP/min) × (1 O₂/5 ATP) × (22.4 L/6 × 10²³ O₂) = 6.72 × 10⁻¹⁵ L/min
The volume of the cell is 10⁻¹² L [1000 μm³ × (cm/10⁴ μm)³ × (mL/cm³) × (L/1000 mL)]. Dividing the cell volume by the rate of consumption of O₂ [(10⁻¹² L)/(6.7 × 10⁻¹⁵ L/min)] indicates that the cell will consume its own volume of oxygen in 149 minutes, or about 2.5 hours. Since air contains only 20% oxygen, a cell would consume its own volume of air in
about 30 minutes.

2-119 The human body operates at about 70 watts, about the same as a light bulb.
watts/body = (10⁹ ATP/(60 sec × cell)) × (5 × 10¹³ cells/body) × (mole/6 × 10²³ ATP) × (12 kcal/mole) × (4.18 × 10³ J/kcal)
= 69.7 J/sec per body = 69.7 watts/body

2-120 You would need to expend 496 kcal in climbing from Zermatt to the top of the Matterhorn, a vertical distance of 2818 m. Substituting into the equation for work,
work = 75 kg × (9.8 m/sec²) × 2818 m × (J/(kg m²/sec²)) × (kcal/4.18 × 10³ J) = 495.5 kcal
This is equal to about 1.5 Snickers™ (496 kcal/325 kcal), so you would be well advised to plan a stop at Hörnli Hut to eat another one.
In reality, the human body does not convert chemical energy into external work at 100% efficiency, as assumed in this answer, but rather at an efficiency of around 25%. Moreover, you will be walking laterally as well as uphill. Thus, you would need more than 6 Snickers™ to make it all the way.
Reference: Frayn KN (1996) Metabolic Regulation: A Human Perspective, p. 179. London: Portland Press.

Figure 2-52 An important reaction for replenishing citric acid cycle intermediates that are removed for biosynthesis (Answer 2-116). PYR, pyruvate; AcCoA, acetyl coenzyme A; CIT, citrate; ICIT, isocitrate; αKG, α-ketoglutarate; ScCoA, succinyl coenzyme A; SUC, succinate; FUM, fumarate; MAL, malate; and OAA, oxaloacetate.

2-121 Since there is only one reactant and one product, the ratio of equilibrium concentrations would not change.
K = [2PG]/[3PG]
Thus, the concentration of 2PG is fixed by the equilibrium constant. Whatever amount of 3PG is added, a constant fraction of it is converted to 2PG, so that the equilibrium ratio of concentrations would remain the same.
The value of K is 0.20 (17 mM/83 mM). ΔG° for the reaction is
ΔG° = −1.41 kcal/mole × log([2PG]/[3PG])
ΔG° = −1.41 kcal/mole × log(17 mM/83 mM)
ΔG° = 0.97 kcal/mole
For the reaction PEP → 2PG, the equilibrium constant is
K = [2PG]/[PEP]
and the standard free-energy change is
ΔG° = −1.41 kcal/mole × log([2PG]/[PEP])
ΔG° = 0.55 kcal/mole
For the reverse reaction, ΔG° has the same magnitude but the
opposite sign; thus, ΔG° = −0.55 kcal/mole.
ΔG° for the overall reaction is the sum of the ΔG° values for the linked reactions; however, the ΔG° values must refer to the proper direction; that is, the directions of the linked reactions and the overall reaction must be the same. For the conversion of 3PG to PEP,
ΔG°(3PG→PEP) = ΔG°(3PG→2PG) + ΔG°(2PG→PEP)
= 0.97 kcal/mole − 0.55 kcal/mole
ΔG°(3PG→PEP) = 0.42 kcal/mole
The ΔG° for conversion of 3PG to pyruvate (PYR) and phosphate is the sum of the ΔG° values for the individual steps in the reaction.
ΔG°(3PG→PYR) = ΔG°(3PG→PEP) + ΔG°(PEP→PYR)
= 0.42 kcal/mole − 14.8 kcal/mole
ΔG°(3PG→PYR) = −14.4 kcal/mole

The ΔG° for conversion of 3PG to pyruvate and phosphate is independent of the pathway for the conversion. Thus, the ΔG° is −14.4 kcal/mole.
The ΔG° for conversion of glycerate to pyruvate is obtained by subtracting the ΔG° for 3PG to glycerate (GLY) from the overall ΔG°.
ΔG°(GLY→PYR) = ΔG°(3PG→PYR) − ΔG°(3PG→GLY)
= −14.4 kcal/mole − (−3.3 kcal/mole)
ΔG°(GLY→PYR) = −11.1 kcal/mole
The analysis above indicates that a very large standard free-energy change occurs between glycerate and pyruvate. Removal of water (ΔG° = −0.5 kcal/mole) does not account for very much of this free-energy change. Thus, it appears that the conversion of enolpyruvate to pyruvate is accompanied by a large standard free-energy change of around −10.6 kcal/mole. This reasoning suggests that the majority of the standard free-energy change associated with the conversion of PEP to pyruvate (−10.6 kcal/mole out of −14.8 kcal/mole) comes from the conversion of enolpyruvate to pyruvate and not from the hydrolysis of the phosphate bond.
In fact, the standard free-energy change for PEP to pyruvate (−14.8 kcal/mole) is close to the sum of the enolpyruvate to pyruvate step (about −11 kcal/mole) and the standard free-energy change for hydrolysis of a simple phosphate ester bond (about −3.0 kcal/mole). Thus, the phosphate bond in PEP is a high-energy bond because its hydrolysis is linked to the very favorable conversion of enolpyruvate to pyruvate.
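The bookkeeping used in this answer (turning an equilibrium ratio into a ΔG°, flipping the sign for a reversed reaction, and adding the ΔG° values of linked steps) can be sketched in a few lines of Python. This is only an illustration: the function and variable names are invented here, the constant 1.41 kcal/mole is the value of 2.3 RT used throughout these answers, and the input numbers are the ones quoted in the text above.

```python
import math

# 2.3 RT in kcal/mole, the value used throughout these answers (assumed constant).
RT_2_3 = 1.41

def standard_free_energy(k_eq):
    """Return dG0 in kcal/mole from an equilibrium constant: dG0 = -2.3 RT log10(K)."""
    return -RT_2_3 * math.log10(k_eq)

def reversed_reaction(dg0):
    """A reversed reaction has the same magnitude of dG0 but the opposite sign."""
    return -dg0

# Values taken from the text (kcal/mole); the names are illustrative only.
dg_3pg_to_2pg = standard_free_energy(17 / 83)   # text reports 0.97
dg_pep_to_2pg = 0.55
dg_2pg_to_pep = reversed_reaction(dg_pep_to_2pg)
dg_pep_to_pyr = -14.8
dg_3pg_to_gly = -3.3

# Linked reactions written in the same direction: their dG0 values simply add.
dg_3pg_to_pep = dg_3pg_to_2pg + dg_2pg_to_pep    # text reports 0.42
dg_3pg_to_pyr = dg_3pg_to_pep + dg_pep_to_pyr    # text reports -14.4

# dG0 is path independent, so glycerate -> pyruvate follows by subtraction.
dg_gly_to_pyr = dg_3pg_to_pyr - dg_3pg_to_gly    # text reports -11.1

for label, value in [
    ("3PG -> 2PG", dg_3pg_to_2pg),
    ("3PG -> PEP", dg_3pg_to_pep),
    ("3PG -> PYR", dg_3pg_to_pyr),
    ("GLY -> PYR", dg_gly_to_pyr),
]:
    print(f"{label}: {value:.2f} kcal/mole")
```

Small differences in the last digit relative to the text come from rounding intermediate values; the point of the sketch is that every step is either a logarithm, a sign change, or a sum.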
The ΔG° for the linked conversion of PEP to pyruvate and of ADP to ATP is −7.5 kcal/mole. The ΔG° for the linked reaction can be obtained by adding together the ΔG° values for the individual reactions.
PEP → PYR + Pi    ΔG° = −14.8 kcal/mole
ADP + Pi → ATP    ΔG° = 7.3 kcal/mole
NET: PEP + ADP → PYR + ATP    ΔG° = −7.5 kcal/mole
Reference: Lipmann F (1941) Metabolic generation and utilization of phosphate bond energy. Adv. Enzymol. 1, 99-162.

2-123 The ΔG in resting muscle is −0.2 kcal/mole.
ΔG = ΔG° + 1.41 kcal/mole × log([C][ATP]/([CP][ADP]))
= −3.3 kcal/mole + 1.41 kcal/mole × log[(13 × 10⁻³)(4 × 10⁻³)/((25 × 10⁻³)(0.013 × 10⁻³))]
= −3.3 kcal/mole + 1.41 kcal/mole × log 160
= −3.3 kcal/mole + 3.1 kcal/mole
ΔG = −0.2 kcal/mole
This very small negative value should not surprise you; it says that this energy-buffering system is nearly at equilibrium, which you should expect in the absence of heavy ATP usage.
If the concentration of ATP decreases to 3 mM and that of ADP increases to 1 mM, then the ΔG for the reaction will be −3.0 kcal/mole.
ΔG = ΔG° + 1.41 kcal/mole × log([C][ATP]/([CP][ADP]))
= −3.3 kcal/mole + 1.41 kcal/mole × log[(13 × 10⁻³)(3 × 10⁻³)/((25 × 10⁻³)(1 × 10⁻³))]
= −3.3 kcal/mole + 1.41 kcal/mole × log 1.56
= −3.3 kcal/mole + 0.27 kcal/mole
ΔG = −3.0 kcal/mole
Thus, as soon as exercise begins, the reaction will become highly favored and creatine phosphate will drive the conversion of ADP to ATP. In reality, the enzyme that catalyzes this reaction is efficient enough to keep the reaction nearly at equilibrium (ΔG ≈ 0), so that ATP levels remain high as creatine phosphate levels fall, fulfilling its role as an energy buffer.
If ATP (4 mM) could sustain a sprint for 1 second, then creatine phosphate (25 mM) could sustain a sprint for an additional 6 seconds by regenerating an equal amount of ATP. This is not long enough to allow a sprinter to finish 200 meters; thus, there must be another source of energy. The additional energy comes from the breakdown of muscle glycogen, which is processed through anaerobic glycolysis, producing lactate and ATP. Typical stores of muscle glycogen are sufficient for about 80 seconds of sprinting.

2-124 The mitochondrion obtains a flow through this reaction by
maintaining high concentrations of substrates and low concentrations of products. The concentration ratio of products to substrates, [OAA][NADH]/([MAL][NAD⁺]), must be sufficiently small to overcome a positive ΔG° value, as calculated in part B. The minimum ratio of MAL to OAA is the ratio that will make ΔG zero, which is 1.1 × 10⁴.
ΔG = ΔG° + 1.41 kcal/mole × log([OAA][NADH]/([MAL][NAD⁺]))
If ΔG = 0,
0 = 7.1 kcal/mole + 1.41 kcal/mole × log 0.1 + 1.41 kcal/mole × log([OAA]/[MAL])
log([OAA]/[MAL]) = −(7.1 kcal/mole + 1.41 kcal/mole × log 0.1)/(1.41 kcal/mole)
= −(5.69 kcal/mole)/(1.41 kcal/mole) = −4.04
[OAA]/[MAL] = 9.2 × 10⁻⁵, and thus [MAL]/[OAA] = 1.1 × 10⁴

DATA HANDLING

2-125 Knoop's result was surprising at the time. One might have imagined that the obvious way to metabolize fatty acids to CO₂ would be to remove the carboxylate group from the end as CO₂ and then oxidize the newly exposed carbon atom until it too could be removed as CO₂. However, removal of single carbon units is not consistent with Knoop's results, since it predicts that odd- and even-number fatty acids would generate the same final product. Similar inconsistencies crop up with removal of fragments containing more than two carbon atoms. For example, removal of three-carbon fragments would work for the eight-carbon and seven-carbon fatty acids shown in Figure 2-34, but would not work for six-carbon and five-carbon fatty acids. Removal of two-carbon fragments from the carboxylic acid end is the only scheme that accounts for the consistent difference in the metabolism of odd- and even-number fatty acids. One might ask why the last two-carbon fragment is not removed from phenylacetic acid. It turns out that the benzene ring interferes with the fragmentation process, which involves modification of the third carbon from the carboxylic acid end. Since that carbon is part of the benzene ring in phenylacetic acid, it is protected from modification, and further metabolism of phenylacetic acid is blocked. Knoop's results also specify the direction of degradation. If the nonacidic end of the chain were attacked first,
either the benzene ring would make the fatty acids resistant to metabolism, or the same benzene compound would always be excreted, independent of the length of the fatty acid fed to the dogs.

Reference: Knoop F (1905) Der Abbau aromatischer Fettsäuren im Tierkörper. Beitr Chem Physiol 6:150-162.

2-126 The complete oxidation of citrate to CO2 and H2O occurs according to the balanced chemical reaction

C6H8O7 + 4.5 O2 → 6 CO2 + 4 H2O

Thus, each molecule of citrate would require 4.5 molecules of oxygen for its complete oxidation. The results in Table 2-5 were surprising to Krebs and others at the time because much more oxygen is consumed (40 mmol) than could be accounted for by oxidation of citrate itself. Only 13.5 mmol of oxygen would be required to oxidize 3 mmol of citrate completely (3 mmol × 4.5). This calculation shows that citrate is acting catalytically in the oxidation of carbohydrates, which in these experiments were endogenous in the minced pigeon breasts. Although others were aware of the catalytic nature of other intermediates, Krebs was the first person to complete the circle of chemical reactions that constitute the citric acid cycle.

Krebs's experimental rationale is clearly laid out in the paper: "Since citric acid reacts catalytically in the tissue it is probable that it is removed by a primary reaction but regenerated by a subsequent reaction. In the balance sheet no citrate disappears and no intermediate products accumulate. The first object of the study of intermediates is therefore to find conditions under which citrate disappears in the balance sheet."

The consumption of oxygen is low in the presence of the metabolic poisons because citrate is prevented from acting catalytically. The balanced equations for the conversion of citrate to α-ketoglutarate and succinate show that the amount of oxygen consumed is approximately what is expected. For citrate conversion to α-ketoglutarate, half a molecule of oxygen is consumed.

C6H8O7 + 0.5 O2 → C5H6O5 + CO2 + H2O

For citrate conversion to succinate, one molecule of oxygen is consumed.

C6H8O7 + O2 → C4H6O4 + 2 CO2 + H2O

Thus, the observed stoichiometry of oxygen consumption matches the expectations.

The absence of oxygen is crucial for demonstrating an accumulation of citrate from an intermediate in the cycle. In the presence of oxygen, citrate acts catalytically (is consumed and then regenerated), so that it does not accumulate no matter what intermediate is added. In the absence of oxygen, however, the conversion of citrate to α-ketoglutarate is blocked, since that conversion requires oxygen (as described below). Under these conditions citrate will accumulate if an appropriate intermediate is present. Of all the intermediates, only conversion of oxaloacetate to citrate does not require oxygen. The immediate precursor of oxaloacetate is malate. Since the conversion of malate to citrate requires oxygen, all other intermediates also must require oxygen to be converted to citrate.

The requirement for oxygen is indirect and is mediated through the cofactors NAD+ and FAD: they accept electrons from the substrates and transfer them to the electron-transport chain and ultimately to oxygen. In the absence of oxygen, all of the NAD+ will quickly be converted to NADH and all of the FAD will quickly be converted to FADH2. In the absence of NAD+ and FAD, the reactions of the cycle cannot proceed.

E. coli and yeast do indeed use the citric acid cycle. Krebs got this point wrong because he did not realize, nor did anyone for a long time, that citrate cannot get into these cells. Therefore, when he added citrate to intact E. coli and yeast, he found no stimulation of oxygen consumption. Passage of citrate across a membrane requires a transport system, which is present in mitochondria but is absent from yeast and E. coli plasma membranes.

References: Krebs HA & Johnson WA (1937) The role of citric acid in intermediate metabolism in animal tissues.
Enzymologia 4:148-156.
Stare FJ & Baumann CA (1936) The effect of fumarate on respiration. Proc R Soc Lond B 121:338-357.
Szent-Györgyi AV (1924) Über den Mechanismus der Succin- und Paraphenylendiaminoxydation. Ein Beitrag zur Zellatmung. Biochem Z 150:141-149.
See also Albert Szent-Györgyi's Nobel Lecture (1937) at www.nobel.se/medicine/laureates/1937/szent-gyorgyi-lecture.pdf

2-127 The cross-feeding experiments indicate that the three steps controlled by the products of the TrpB, TrpD, and TrpE genes are arranged in the order

X →(TrpE) Y →(TrpD) Z →(TrpB) tryptophan

where X, Y, and Z are undefined intermediates in the pathway.

The ability of the TrpE strain to be cross-fed by the other two strains indicates that the TrpD and TrpB strains accumulate intermediates that are farther along the pathway than the step controlled by the TrpE gene. The ability of the TrpD strain to be cross-fed by the TrpB strain, but not the TrpE strain, places it in the middle. The inability of the TrpB strain to be cross-fed by either of the other strains is consistent with its controlling the step closest to tryptophan.

The patterns of growth on minimal medium supplemented with known intermediates in the tryptophan biosynthetic pathway are consistent with the order deduced from cross-feeding.

chorismate →(TrpE) anthranilate →(TrpD) indole →(TrpB) tryptophan

In reality, of course, the intermediates for the pathway were unknown, or not fully known, at the time the cross-feeding experiments were done. The intermediates were worked out by a combination of educated guesses at the likely intermediates, which could then be tested on mutant strains, and of analysis of the compounds that accumulated in the mutants.

Reference: Yanofsky C (2001) Advancing our knowledge in biochemistry, genetics, and microbiology through studies on tryptophan metabolism. Annu Rev Biochem 70:1-37.

2-128 The ProA strain is a deletion; it doesn't grow at any temperature. The ProB strain is temperature-sensitive; it doesn't grow at 42°C but grows at the other temperatures. The ProC strain is
cold-sensitive; it doesn't grow at 22°C but grows at the other temperatures.

At 22°C the ProC strain cross-feeds the ProA strain, indicating that the intermediate that accumulates in the ProC strain comes after the block in the ProA strain. Similarly, at 42°C the ProA strain cross-feeds the ProB strain, indicating that the intermediate that accumulates in the ProA strain comes after the block in the ProB strain. These two results (C after A, and A after B) suggest the order shown below.

intermediate 1 →(ProB) intermediate 2 →(ProA) intermediate 3 →(ProC) proline

The identification of three genes by the cross-feeding experiments shown here indicates that there are very likely to be at least three steps in the pathway. But there could be many more steps than identified genes. For example, mutations in the genes controlling some steps in the pathway may not be present in your collection of mutant strains. Conversely, there could be many more genes than steps. For example, multiple genes could encode different subunits of an enzyme that controls a single step. It is important to confirm the pathway by isolation of the enzymes and by detailed biochemical studies. Such studies have elucidated the complete pathway for proline biosynthesis (Figure 2-53).

Figure 2-53 The proline biosynthetic pathway (Answer 2-128). glutamate →(ProB; ATP → ADP) γ-glutamyl phosphate →(ProA; NADPH → NADP+) glutamate γ-semialdehyde →(spontaneous) Δ¹-pyrroline-5-carboxylic acid →(ProC; NADPH → NADP+) proline.

The lack of cross-feeding of the ProA strain by the ProC strain at 30°C or 42°C, or between the wild-type bacteria and the mutant strains under any conditions, indicates that neither intermediates nor end products accumulate under normal growth conditions. The proline that is produced is rapidly used for protein synthesis and is prevented from being synthesized in
excess of needs by careful regulation of the pathway, in this case through a feedback inhibition of proline acting directly on the ProB enzyme.

Chapter 4 DNA, Chromosomes, and Genomes

THE STRUCTURE AND FUNCTION OF DNA

DEFINITIONS
4-1 Genome
4-2 Double helix
4-3 Gene
4-4 Antiparallel
4-5 Base pair

TRUE/FALSE
4-6 False. The human genome consists only of linear molecules, but human cells also contain thousands of mitochondrial DNA molecules, which are circular. They constitute about 1% of the DNA in a human cell.

THOUGHT PROBLEMS
4-7 The complementary strand for this DNA is 5'-GTGCACCAT-3'. By convention, all DNA strands are written 5' to 3', so that the complement of 5'-ATGGTGCAC-3' is 5'-GTGCACCAT-3'. Keeping polarities in mind is a key to following DNA structure, DNA replication, DNA transcription, DNA repair, and recombination: virtually all aspects of DNA metabolism.

4-8 The phosphate groups in the backbone of DNA are strong acids (pK 1.0) and completely ionized, giving DNA its customary negative charge. These charges are normally neutralized in cells by positively charged inorganic and organic cations. The customs agent probably should be cautious, not because of the acidic nature of DNA, but because of its information content.

4-9
A. The positions of the major and minor grooves are shown for all four base pairs in Figure 4-40. If a circle is drawn through the points of attachment to the sugar-phosphate backbones, the larger arc of the circle corresponds to the major groove and the smaller arc corresponds to the minor groove.
B. The structures of a TA base pair and an AT base pair are shown in Figure 4-40A and B. The two structures are identical except that they have been flipped 180° about a north-south axis. The same relationship holds for the GC and CG base pairs (Figure 4-40C and D).
C. As shown in Figure 4-40, the same edges of the bases, and hence the same chemical moieties, always project into the same groove. Thus, for example, the methyl group of thymine is always found in the major groove, as is the amino group of cytosine.
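The 5'-to-3' writing convention in Answer 4-7 above can be sketched in a few lines of Python. This is only an illustration and not part of the original answer key: the function name `reverse_complement` and the `COMPLEMENT` table are hypothetical names introduced here, and the sketch assumes the input contains only the four DNA bases A, T, G, and C.

```python
# Watson-Crick pairing rules for the four DNA bases (A pairs with T, G with C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    The input is assumed to be written 5' to 3'. Because the two strands
    are antiparallel, the complement is read from the other end, so the
    sequence is reversed before each base is swapped for its partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


# The strand from Answer 4-7.
print(reverse_complement("ATGGTGCAC"))  # GTGCACCAT
```

Reversing before complementing is what keeps both strands written in the same 5'-to-3' direction; complementing alone would give the partner strand read 3' to 5'.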
In This Chapter
THE STRUCTURE AND FUNCTION OF DNA
CHROMOSOMAL DNA AND ITS PACKAGING IN THE CHROMATIN FIBER
THE REGULATION OF CHROMATIN STRUCTURE
THE GLOBAL STRUCTURE OF CHROMOSOMES
HOW GENOMES EVOLVE

Figure 4-40 All four base pairs with the major and minor grooves indicated. (A) AT base pair. (B) TA base pair. (C) GC base pair. (D) CG base pair.

D. Because the two base pairs in both orientations project a characteristic set of chemical groups at specific places in the grooves of DNA, it is possible in principle to read the sequence of DNA with a sufficiently powerful microscope. The helical structure of DNA, however, presents what is likely to be an insurmountable practical barrier: it would be very difficult to read the sequence where the strands cross.

4-10 The base pairs are TA (see Figure 4-2A) and CG (see Figure 4-2B). The components of the base pairs, along with their stick representations, are shown in Figure 4-41. Thymine and cytosine are pyrimidines; adenine and guanine are purines.

4-11 In all samples of double-stranded DNA, the mole percents of A and T are equal [...] as odd in the days before the structure of DNA was known. Now it is clear that, while all cellular DNA is double stranded, certain viruses contain single-stranded DNA. The genomic DNA of the M13 virus, for example, is single stranded. In single-stranded DNA, A is not paired with T, nor G with C, and so the A = T and C = G rules do not apply.

Figure 4-41 Space-filling and stick representations of two base pairs (Answer 4-10). (A) TA base pair (thymine, adenine). (B) CG base pair (cytosine, guanine). [...] gray, nitrogen atoms
are intermediate, and oxygen atoms are dark gray.

4-12 The segment of DNA in Figure 4-3 reads, from top to bottom, 5'-ACT-3'. The carbons in the ribose sugar are numbered clockwise around the ring, starting with C1', the carbon to which the base is attached, and ending with C5', the carbon that lies outside the ribose ring.

4-13 Helix A is right handed. Helix C is left handed. Helix B has one right-handed strand and one left-handed strand. There are several ways to tell the handedness of a helix. For a vertically oriented helix, like the ones in Figure 4-4, if the strands in front point up to the right, the helix is right handed; if they point up to the left, the helix is left handed. Once you are comfortable identifying the handedness of a helix, you will be amused to note that nearly 50% of the DNA helices in advertisements are left handed, as are a surprisingly high number of the ones in books. Amazingly, a version of helix B was used in advertisements for a prominent international conference celebrating the 30-year anniversary of the discovery of the DNA helix.

CALCULATIONS
4-14 Because C always pairs with G in duplex DNA, their mole percents must be equal. Thus, the mole percent of G, like C, is 20%. The mole percents of A and T account for the remaining 60%. Since A and T always pair, their mole percents are equal to half this value: 30%.

4-15
A. The DNA in a human cell is about 2.2 meters in length.
length = 6.4 × 10⁹ bp × (0.34 nm/bp) × (1 m/10⁹ nm)
length = 2.18 m
B. The DNA occupies about 9% of the volume of the nucleus. The volume of the nucleus is
V = 4/3 × 3.14 × (3 × 10³ nm)³
V = 1.13 × 10¹¹ nm³
The volume of DNA is
V = 3.14 × (1.2 nm)² × (2.2 × 10⁹ nm)
V = 9.95 × 10⁹ nm³
The ratio of DNA volume to nuclear volume is about 0.09 [(9.95 × 10⁹ nm³)/(1.13 × 10¹¹ nm³)]; thus, the DNA occupies about 9% of the nuclear volume.

4-16 The 1.0 g sample contains about 7.0 mg of DNA (about 0.7%).
mass of DNA = 10⁹ cells × (6.4 × 10⁹ bp/cell) × (660 d/bp) × (1 g/6 × 10²³ d)
= 7.0 × 10⁻³ g, or 7.0 mg
Since there are 2.2 m of DNA in a human cell, see Problem 4
-15, and 10⁹ cells in the sample, there are 2.2 × 10⁹ m (2.2 × 10⁶ km) of DNA in the sample, enough to reach from the Earth to the Moon more than five times.

DATA HANDLING
4-17 This experiment implicates DNA as the genetic material. Clearly, by giving rise to progeny, bacteriophage T4 must possess some form of genetic material. Since T4 contains only protein and DNA, there are only two choices for the genetic material. By separating the bacteria from the bacteriophage after infection and showing that the bacteria contained only the ³²P label (DNA), Hershey and Chase were able to demonstrate a clean separation of DNA from protein. The ability of these infected cells, in the absence of T4 proteins, to generate progeny virus shows that DNA must be the bacteriophage's genetic material.

There are caveats associated with these experiments, as the authors clearly recognized. For example, an absolute separation of bacteriophage from bacteria could not be accomplished: there was 1% ³⁵S associated with the infected cells. Also, not all proteins can be labeled with ³⁵S-methionine; some don't have methionine in their amino acid sequence. Thus, these experiments don't absolutely rule out protein as the genetic material. Nevertheless, the weight of the argument falls on the side of DNA. These experiments, coupled with other experiments such as the demonstration by Avery, MacLeod, and McCarty that DNA was the transforming principle of Streptococcus pneumoniae, convinced scientists of the role of DNA as the genetic material.

Reference: Hershey AD & Chase M (1952) Independent functions of viral protein and nucleic acid in growth of bacteriophage. J Gen Physiol 36:39-56.

CHROMOSOMAL DNA AND ITS PACKAGING IN THE CHROMATIN FIBER

DEFINITIONS
4-18 Karyotype
4-19 Centromere
4-20 Histone
4-21 Chromosome
4-22 Cell cycle
4-23 Chromatin
4-24 Homologous chromosome (homolog)
4-25 Nucleosome

TRUE/FALSE
4-26 True. The human karyotype comprises 22 autosomes and the two sex chromosomes, X and Y.
Females have 22 autosomes and two X chromosomes, for a total of 23 different chromosomes. Males also have 22 autosomes, but have an X and a Y chromosome, for a total of 24 different chromosomes.

4-27 True. Overall, only a couple of percent of the human genome is present in mRNA, and only about one-third of the genome is transcribed into RNA. Even allowing for regulatory regions and other critical sequences, it still appears that more than half the genome has no apparent function and may be unimportant junk.

4-28 True. Humans and mice diverged from a common ancestor long enough ago for roughly two out of three nucleotides to have been changed by random mutation. The regions that have been conserved are those with important functions, where mutations with deleterious effects were eliminated by natural selection. Other regions have not been conserved because natural selection cannot operate to eliminate changes in nonfunctional DNA.

4-29 False. In living cells, nucleosomes are packed upon one another to generate regular arrays in which the DNA is more highly condensed, usually in the form of a 30-nm fiber. The beads-on-a-string form of chromatin is usually observed only after the 30-nm fiber has been experimentally treated to unpack it.

4-30 True. All the core histones are rich in lysine and arginine, which have basic, positively charged side chains that can neutralize the negatively charged DNA backbone.

4-31 False. By using the energy of ATP hydrolysis, chromatin remodeling complexes can catalyze the movement of nucleosomes along DNA, or even between segments of DNA.

THOUGHT PROBLEMS
4-32 The DNA molecules in chromosomes are long and exceedingly thin, and therefore very fragile. The techniques in use in the 1950s, which were gentle enough for the isolation of proteins, were much too harsh for DNA. For example, the shearing force exerted by pipetting DNA (sucking it through a small aperture) was sufficient to break it into the observed small pieces. It was
a major technical achievement to demonstrate that chromosomes contain a single long DNA molecule.

4-33 The number of molecules of DNA in a human cell depends on the type of cell and its stage in the cell cycle. For the vast majority of somatic cells, there are 46 molecules of DNA (chromosomes) per cell prior to replication. After replication, but before completion of cell division, there are 92 molecules of DNA per cell. For a few somatic cells these numbers are very different. For example, red blood cells have no DNA molecules, having lost their nuclei and mitochondria during differentiation from reticulocytes into red cells. By contrast, skeletal muscle cells, which arise by the fusion of multiple precursor cells, have many nuclei and thus a very large number of DNA molecules per cell. Sex gametes are also different: they are haploid and thus have only 23 molecules of DNA per cell.

4-34 By comparing the normal chromosomes 9 and 22 with their abnormal counterparts, it would appear that the bottom portion of chromosome 22 has been translocated to the bottom of chromosome 9. The presence of two X chromosomes indicates that this patient is female.

4-35 Computer algorithms that search for exons are complex affairs, as you might imagine. They combine statistical information derived from known genes in searching for unidentified genes. The list of features includes:
1. Exons that encode protein will have an open reading frame, and the reading frames in adjacent exons will match up.
2. Internal exons (excluding the first and the last) will have splicing signals at each end; most of the time (98.1%) these will be AG at the 5' ends of the exons and GT at the 3' ends.
3. The multiple codons for most individual amino acids are not used with equal frequency. This so-called coding bias can be factored in to aid in the recognition of true exons.
4. Exons and introns have characteristic length distributions. The median length of exons in human genes is about 120 bp. Introns tend to be much larger: a median length of about 2 kb in genomic regions of 30-40% GC content, and a median length of about 500 bp in regions above 50% GC.
5. So-called CpG islands are often located just upstream of the 5' ends of genes. The dinucleotide 5'-CpG is greatly underrepresented in the human genome, occurring at about 20% of its expected frequency. Cs in CpG dinucleotides are a target for methylation. Deamination of methyl-C produces T, which accounts for the deficit of CpG. In CpG islands, however, CpG dinucleotides are not methylated and occur at their expected frequency. CpG islands often contain binding sites for gene regulatory proteins.
6. The initiation codon for protein synthesis (nearly always AUG) has a statistical association with adjacent nucleotides that seem to enhance its recognition by translation factors.
7. The terminal exon will have a signal (most commonly AATAAA) for cleavage and polyadenylation close to its 3' end.

The statistical nature of these features, coupled with the low frequency of coding information in the genome (2-3%) and the frequency of alternative splicing (30-50% of genes), makes it especially impressive that current algorithms can identify about 70% of individual exons and about 20% of complete genes in the human genome.

Reference: Lander ES et al. (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921.

4-36 In a random sequence of DNA, each of the 64 different codons will be generated with equal frequency. Since 3 of the 64 are STOP codons, they will be expected to occur every 21 codons (64/3 = 21.3), on average.

4-37 The intermediate chromosome and the sites of the inversions are indicated in Figure 4-42.

A gene is any DNA sequence that produces a functional RNA or encodes one or a set of closely related polypeptide chains (protein isoforms).

Replication origins are the specialized sequences that control the beginning of DNA replication, the process that allows chromosomes to be duplicated. Centromeres are the specialized sequences that permit one copy each of the duplicated chromosomes to be
pulled into each daughter cell at cell division. Telomeres are the specialized sequences at the ends of chromosomes that allow ends to be efficiently replicated and also prevent chromosome ends from being recognized as breaks in need of repair.

A. A 150-Mb chromosome would require nearly six days to replicate from a single, centrally located origin of replication. A pair of replication forks proceeding outward from an origin would synthesize 300 nucleotide pairs of DNA per second; thus, a 150-Mb chromosome would take 500,000 seconds to replicate [(150 × 10⁶ nucleotide pairs)/(300 nucleotide pairs/sec) = 500,000 sec], which equals 5.8 days. Such an extremely slow rate of chromosome duplication would severely limit the rate of cell division.

B. A chromosome without a telomere cap would suffer two kinds of problems. First, it would lose nucleotides at each round of replication and would gradually shrink. Ultimately, an essential gene would be lost, leading to cell death. Second, the uncapped end would be perceived by the cell as a broken end and repaired, usually by joining it to another such broken end. Replication of a chromosome with a broken end would generate two sister chromatids, each with such an end, that could be joined together, creating a fused chromosome with two centromeres. Such a dicentric chromosome creates a new problem at the next cell division: if the centromeres are pulled to different spindle poles, the chromosome will break, generating new uncapped ends, which can repeat the process. These so-called fusion-bridge-breakage cycles are thought to be a major source of the rearrangements that characterize cancer cells. Barbara McClintock first described such cycles in plants in the 1930s (see Problem 17-109).

Figure 4-42 Inversions and intermediate chromosome in the evolution of chromosome 3 in orangutans and humans (Answer 4-37). orangutan →(first inversion) intermediate →(second inversion) human.

C. A chromosome without a centromere could not
attach to the mitotic spindle; thus, after replication, the two new chromosomes could not be partitioned accurately between the two daughter cells. Therefore, many daughter cells would die because they failed to acquire a complete set of chromosomes.

The essence of your proposed cure is that telomerase is essential for cancer cells and that its inhibition would ultimately stop their growth. The troubling aspect of the rival company's observations is that mice lacking telomerase still get cancers, indicating that there are non-telomerase-dependent routes to tumor formation. Thus, there is no assurance that, if you inhibited the growth of cancer cells with a telomerase inhibitor, the cancer cells wouldn't find another, telomerase-independent way around the block.

Reference: Artandi SE, Chang S, Lee SL, Alson S, Gottlieb GJ, Chin L & DePinho RA (2000) Telomere dysfunction promotes nonreciprocal translocations and epithelial cancers in mice. Nature 406:641-645.

In contrast to most proteins, which accumulate amino acid changes over evolutionary time, the functions of histone proteins must involve nearly all of their amino acids, so that a change in any position is deleterious to the cell. Histone proteins are exquisitely refined for their function.

Coiling a DNA duplex in a tight spiral around a histone octamer requires that the duplex be distorted from its preferred, gently meandering path in solution. Thus, a fair fraction of the binding energy available in the interaction of a histone octamer with a DNA molecule is used to distort the duplex. For a more flexible CTG/CAG segment of duplex, less of the binding energy is invested in distorting the DNA; hence, more energy is available for binding the DNA to the histone octamer. This energetic consideration may be especially important during the formation of a nucleosome, in which case an especially flexible site may be the preferred site of nucleosome assembly.

Reference: Wang YH & Griffith J (1995) Expanded CTG triplet repeat blocks from the myotonic
dystrophy gene create the strongest known natural nucleosome positioning elements. Genomics 25:570-573.

CALCULATIONS
4-44 The total length of DNA in chromosome 1 is 9.5 × 10⁷ nm (2.8 × 10⁸ bp × 0.34 nm/bp), which is 9.5 × 10⁴ µm. In mitosis the chromosome measures 10 µm. Therefore, the DNA molecule in chromosome 1 is compacted 9500-fold (9.5 × 10⁴ µm/10 µm) at mitosis.

4-45 Extrapolating from the number of genes on chromosome 22 to the whole genome gives an estimate of about 47,000 genes (700/0.015 = 46,667), which is well in excess of the likely value of about 25,000 genes. The error in the calculation is the assumption that the density of genes on chromosome 22 is the same as that for the whole genome. The average genome-wide density of genes is about 7.8 genes per Mb (25,000 genes/3200 Mb). The gene density on chromosome 22 is higher than this average (700 genes/48 Mb = 14.6 genes/Mb).

References: Lander ES et al. (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921.
Dunham I et al. (1999) The DNA sequence of human chromosome 22. Nature 402:489-495.

4-46 About 7.6% of each gene is converted to mRNA [(5.4 exons/gene × 266 bp/exon)/(19,000 bp/gene) = 7.6%]. Genes occupy about 28% of chromosome 22 [(700 genes × 19,000 bp/gene)/(48 × 10⁶ bp) = 27.7%].

Reference: Dunham I et al. (1999) The DNA sequence of human chromosome 22. Nature 402:489-495.

4-47 The packing ratio within a nucleosome core is 4.5 [(146 bp × 0.34 nm/bp)/11 nm = 4.5]. If there are an additional 54 bp of linker DNA, then the packing ratio for beads-on-a-string DNA is 2.3 [(200 bp × 0.34 nm/bp)/(11 nm + [54 bp × 0.34 nm/bp]) = 2.3]. This first level of packing represents only 0.023% (2.3/10,000) of the total condensation that occurs at mitosis.

4-48 Histone octamers occupy about 9% of the volume of the nucleus. The volume of the nucleus is
V = 4/3 × 3.14 × (3 × 10³ nm)³
V = 1.13 × 10¹¹ nm³
The volume of the histone octamers is
V = 3.14 × (4.5 nm)² × (5 nm) × (32 × 10⁶)
V = 1.02 × 10¹⁰ nm³
The ratio of the volume of histone octamers to the nuclear volume is 0.09; thus, histone octamers occupy about
9% of the nuclear volume. Since the DNA also occupies about 9% of the nuclear volume (see Problem 4-15), together they occupy about 18% of the volume of the nucleus.

4-49 With these assumptions, the DNA is compacted 27-fold in 30-nm fibers relative to the extended DNA. The total length of duplex DNA in 50 nm of the fiber is 1360 nm (20 nucleosomes × 200 bp/nucleosome × 0.34 nm/bp = 1360 nm). 1360 nm of duplex DNA reduced to 50 nm of chromatin fiber represents a 27-fold condensation (1360 nm/50 nm = 27.2). This level of packing represents 0.27% (27/10,000) of the total condensation that occurs at mitosis, still a long way off what is needed.

DATA HANDLING
4-50 There are 15 bands on the gel in Figure 4-7, suggesting that S. cerevisiae has 15 chromosomes. From other studies it is known that two chromosomes of very nearly the same length are present in the third band from the top. Thus, S. cerevisiae has 16 chromosomes.

4-51
A. The digestion patterns with the various restriction nucleases indicate that the U2 genes are tandemly repeated in the human genome, like the two copies that are present in the bacteriophage lambda clone. Restriction nucleases such as HindIII, HincII, and KpnI, which cut once within the 6-kb repeat, each cut the cluster into 6-kb fragments. These enzymes also produce a junction fragment at one end of the cluster; however, it is too faint to show up under these hybridization conditions. Restriction nucleases such as EcoRI, BglII, and XbaI, which do not cut within the 6-kb repeat, leave the entire cluster on one fragment. Partial digestion with HindIII creates a ladder of bands that are multiples of 6 kb.
B. Two bands are visible in the bacteriophage lambda digest because KpnI cleaves the cloned DNA twice, generating a 6-kb fragment containing one U2 gene and a larger fragment from the right side of the clone (see Figure 4-8), which also carries one copy of the U2 gene. Cleavage of any tandem repeat will generate so-called junction fragments from the ends of the repeated region. Only one of the junction fragments shows up
in these experiments because the U2 gene is used as a probe. The junction fragment is not visible in the genomic digest because it is too faint: there is only one junction fragment, but there are many internal fragments. A longer exposure of the autoradiograph would have allowed the junction fragment to be seen.
C. Since 2 ng of cloned DNA gives the same intensity 6-kb band as 10 µg of genomic DNA, a simple proportionality can be set up: 2 ng of cloned DNA, which contains 1 hybridizing U2 gene per 43 kb, is equal to 10 µg of genomic DNA, which carries an unknown number (n) of U2 genes per 3.2 × 10⁶ kb. (Although there are two U2 genes on the cloned DNA, only one contributes to the hybridization of the 6-kb band.)

$$
\begin{aligned}
\frac{n\ \text{U2 genes}}{3.2 \times 10^{6}\ \text{kb}} \times 10\ \mu\text{g} &= \frac{1\ \text{U2 gene}}{43\ \text{kb}} \times 2\ \text{ng} \\
n &= \frac{3.2 \times 10^{6}\ \text{kb} \times 2\ \text{ng}}{43\ \text{kb} \times 10^{4}\ \text{ng}} \\
n &= 14.9
\end{aligned}
$$

Taking into account the one U2 gene associated with the invisible junction fragment, there are about 16 U2 genes in the human genome.
D. The basis for the different estimates of numbers of U2 genes comes down to the definition of gene. For the hybridization studies involving the autoradiograph in Figure 4-9, and hence the calculations in part C, a U2 gene was defined as a stretch of DNA that would hybridize to a U2 probe. The sequence analysis used a more stringent definition: a stretch of DNA was identified as a U2 gene only if it encoded U2 snRNA. By this criterion there are only three true U2 genes in the haploid human genome. Most of the copies identified by hybridization have suffered mutations and no longer serve as a source of U2 snRNA. Such nonfunctional copies are referred to as pseudogenes.

References: van Arsdell SW & Weiner AM (1984) Human genes for U2 small nuclear RNA are tandemly repeated. Mol Cell Biol 4:492-499.
Lander ES et al. (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921.

The RsaI A gene is missing in males that are green blind, and the RsaI B gene is missing in males that are red blind (see Figure 4-10).
Therefore, the RsaI A gene encodes the green visual pigment and the RsaI B gene encodes the red visual pigment.

The presence of very similar (98% identical) genes in close proximity raises the very likely possibility that the variability among males with normal color vision arises by homologous recombination (unequal crossing-over) between the duplicated segments. This idea is supported by the uniform increments in sizes of the NotI fragments from individuals with increasing numbers of genes (see Figure 4-11).

If there is frequent unequal crossing-over between these genes, as is suggested by the variability among normal males, then color-blind males could arise by the same process. Unequal crossing-over could eliminate a gene by deletion or alter the function of a pigment by creating a hybrid gene (Figure 4-43). A powerful argument in favor of this possibility is that the color-blind males (see Figure 4-10, individuals 8, 9, and 10) have the same sort of variation in intensity of their RsaI A fragment as normal males. The simplest way to account for this variation is by unequal crossing-over within a family of tandemly repeated genes.

The length of the duplicated segment is equal to the incremental change in NotI-fragment length with the addition of each gene. Each fragment in Figure 4-11 differs from the next higher fragment by 39 kb. For example, the NotI fragment from a male with two genes (105 kb for individual 1) differs from the NotI fragment from a male with three genes (144 kb for individual 3) by 39 kb. Thus, the duplicated segment at this locus on the X chromosome is 39 kb.

[Figure 4-43 panels: (A) Unequal crossing-over between visual pigment genes. (B) Products of recombination at arrow 1. (C) Products of recombination at arrow 2.]

References: Nathans J, Thomas D & Hogness DS (1986) Molecular genetics of human color vision: the genes encoding blue, green, and red pigments. Science 232:193-202.
Nathans J, Piantanida TP, Eddy RL, Shows TB & Hogness
DS (1986) Molecular genetics of inherited variation in human color vision. Science 232, 203–210.
Vollrath D, Nathans J & Davis RW (1988) Tandem array of human visual pigment genes at Xq28. Science 240, 1669–1671.
Lander ES et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.

The restriction analysis of the plasmid indicates that it is a linear molecule. If the plasmid were a circle, digestion with single-cut restriction nucleases would have generated only one band, at 12 kb, in each case, which was not observed. On the other hand, a linear molecule generates two bands when cut once, and they should sum to 12 kb. Since two bands that summed to 12 kb were generated with each of the three single-cut restriction nucleases, the plasmid must be a linear molecule.

Digestions with BamHI or BglII yield DNA fragments with identical 5′ extensions (5′-GATC). Thus, a mixture of these fragments can join in all possible combinations. Neither enzyme, however, can cut the hybrid sites created by joining a telomere fragment to a plasmid end, that is, a BamHI end to a BglII end (5′-AGATCC and 5′-GGATCT). By contrast, the appropriate enzyme can cut the sites created by joining two BamHI ends or two BglII ends. Therefore, in the presence of DNA ligase, BamHI, and BglII, joints that create a BamHI site or a BglII site are quickly recut, whereas the hybrid sites, which cannot be cut, accumulate. This strategy neatly selects for formation of hybrid joints, in this case the telomere fragment joined to the plasmid.

Reference: Szostak JW & Blackburn EH (1982) Cloning yeast telomeres on linear plasmid vectors. Cell 29, 245–255.

Micrococcal nuclease can only cleave DNA that is not bound to nucleosomes. Thus, its target for cleavage is the linker DNA between nucleosomes. In order for micrococcal nuclease to generate a ladder of bands, the nucleosomes must be regularly spaced along the DNA; that is, the linker DNA between nucleosomes must be a relatively constant length. Indeed, a ladder of 200 nucleotides indicates that the length of
the linker DNA added to the length of the nucleosomal DNA equals 200 nucleotides.

Figure 4-43 Unequal crossing-over between repeated segments (Answer 4-52). (A) Two potential sites of homologous crossover. Homologous sites are indicated by the linked arrowheads. (B) Recombination at sites labeled with arrow 1. Both products of recombination include a hybrid gene. One product is one repeat shorter and the other is one repeat longer than the parent molecules. (C) Recombination at sites labeled with arrow 2. The products do not include hybrid genes but have an altered number of repeats.

[Figure 4-44 labels: sites of DNase I cutting; nucleosome; DNase I; side view; end view.]

If nuclei were digested extensively with micrococcal nuclease, instead of briefly, essentially every linker would be cut and the linker DNA would be digested to short pieces. The remaining DNA, that is, the DNA that was in the nucleosomes, would migrate at the lowest position on the gel. In fact, when this sort of experiment is done, the DNA migrates at about 146 nucleotides, which represents the length of DNA that is actually wrapped around the histone core to form a nucleosome. This result indicates that linker DNA in rat liver chromatin is about 54 nucleotides long (200 − 146 = 54).

As illustrated in Figure 4-44, if DNA is bound to a surface, the most accessible phosphodiester bonds will be those that project farthest from the surface. Depending on the precise geometry of the DNase I-binding site, only the top portion of the helix will be susceptible to cleavage. The 10-nucleotide ladder arises because there are approximately 10 nucleotides per turn of the helix. As a consequence, accessible phosphodiester bonds will be spaced at roughly 10-nucleotide intervals.

Reference: Prunell A, Kornberg RD, Lutter L, Klug A, Levitt M & Crick FH (1979) Periodicity of deoxyribonuclease I digestion of chromatin. Science 204, 855–858.

If nucleosomes were randomly positioned on the 225-bp segment of DNA, then the 146-bp fragments
would be a collection of all possible 146-bp segments of the original DNA. Such a random collection would give a highly diverse set of fragments upon digestion with a restriction nuclease that cuts at a unique location. The generation of only two fragments after restriction digestion means that there is a strongly preferred location for a nucleosome on this piece of DNA, which gives rise to a unique 146-bp fragment. When this fragment is cut, it gives a 37-bp and a 109-bp fragment, which sum to 146. If the position of the restriction cut were given in the problem, you could have deduced where the nucleosome is situated on the 225-bp segment.

The presence of a nuclease-resistant fraction in chromatin, but not in naked DNA, suggests that the Martian DNA is associated with a nucleosome-like structure that protects it from micrococcal nuclease. Since extensive digestion produced a limit product of about 300 nucleotides, the nucleosome-like structure must protect about this length of DNA. The smear of digestion products indicates that the nucleosome-like structures are not regularly spaced along the DNA, as Earthly nucleosomes are. If they were regularly spaced, they would have given a ladder of bands analogous to those seen in rat liver.

These results argue strongly that the SWI/SNF complex slides nucleosomes along the DNA in an ATP-dependent manner. Two key observations support this model. First, incubation with SWI/SNF causes the nucleosome to disappear from the small fragment released by NheI (see Figure 4-15B, lane 6), but the nucleosome remains associated with the large fragment released by EcoRI. Second, cleavage with NheI before incubation prevents loss of the nucleosome from the small fragment (see lane 4), suggesting that the nucleosome can be moved only if there is contiguous DNA.

If the mechanism of action of the SWI/SNF complex were to release the nucleosome from the DNA, then the nucleosome should have been released regardless of the restriction enzyme used or the order of incubation. Retention of the nucleosome on the large EcoRI fragment and on the precleaved NheI fragment argues against this mechanism.

Figure 4-44 Schematic representation of DNA bound to the surface of a nucleosome and the associated steric hindrance of DNase I (Answer 4-54). Arrows indicate the phosphodiester bonds that are most susceptible to DNase I cleavage.

If the SWI/SNF complex transferred the nucleosome from one duplex to another, then the nucleosome should have been released from the precleaved NheI fragment and transferred to the other duplex (the rest of the DNA in the substrate). This mechanism cannot strictly be ruled out by these experiments because of concentration effects. When the NheI fragment is attached to the rest of the DNA, the local concentration of the receptor duplex (the rest of the DNA) is very high. By contrast, when it is detached, the concentration of receptor duplex is low. The authors of this study ruled out this possibility by placing a barrier to sliding (but presumably not to transfer) adjacent to the NheI site. Under these conditions, the nucleosome remained associated with the NheI fragment when the substrate was incubated with the SWI/SNF complex before cleavage with NheI, in contrast to the results in the absence of a barrier, as shown in Figure 4-15B, lane 6.

Reference: Whitehouse I, Flaus A, Cairns BR, White MF, Workman JL & Owen-Hughes T (1999) Nucleosome mobilization catalysed by the yeast SWI/SNF complex. Nature 400, 784–787.

THE REGULATION OF CHROMATIN STRUCTURE

DEFINITIONS
4-58 Histone code
4-59 Euchromatin
4-60 Epigenetic inheritance
4-61 Position effect

TRUE/FALSE
4-62 True. Deacetylation increases the positive charge on the histone tails by unmasking the positive charges on lysines. The increased charge tends to stabilize chromatin structure, perhaps by allowing the tails to interact more strongly with the DNA.
4-63 False. A modified lysine can carry one, two, or three methyl groups, or one acetyl group. Addition of an acetyl
group removes the positive charge on the nitrogen and prevents the addition of methyl groups.
4-64 True. The variant histones are inserted into nucleosomes via a histone-exchange process catalyzed by ATP-dependent chromatin remodeling complexes.

THOUGHT PROBLEMS
4-65 The structures of serine and lysine residues in histones, and their modifications, are shown in Figure 4-45. Phosphorylation of serine converts an uncharged amino acid to a negatively charged one. Methylation of lysine does not alter the charge. Acetylation of lysine removes the positive charge, leaving the modified lysine neutral. The introduction of a negative charge by phosphorylation of serine and removal of a positive charge by acetylation of lysine would both be expected to decrease the interaction of the histone tails with DNA, which is a negatively charged polymer.
4-66 The biological outcome associated with histone methylation depends on the site that is modified. Each site of methylation has a different surrounding amino acid context, which allows the binding of distinct code-reader proteins. It is the binding of different downstream effector proteins that gives rise to different biological outcomes.
4-67 A dicentric chromosome is unstable because the two kinetochores have the potential to interfere with one another. Normally, microtubules from the two poles of the spindle apparatus attach to opposite faces of a single kinetochore in order to separate the individual chromatids at mitosis. If a chromosome contains two centromeres, half of the time the microtubules from one of the poles will attach to the two kinetochores associated with one chromatid, while the microtubules from the other pole will attach to the two kinetochores associated with the other chromatid. Division can then occur satisfactorily. The other half of the time, the microtubules from each pole will attach to kinetochores that are associated with different chromatids. When that happens, each chromatid will be
pulled to opposite spindle poles with enough force to snap it in two. Thus, two centromeres are bad for a chromosome, causing breakage and rendering it unstable.

DATA HANDLING
4-68
A. DNase I preferentially digests active chromatin, but micrococcal nuclease shows no such preference. Red cells express globin, and treatment of red cell nuclei with DNase I reduced the ability of the DNA to protect globin cDNA. Thus, DNase I preferentially degraded the chromatin from which globin RNA was transcribed. By contrast, fibroblasts do not express globin, and treatment of fibroblast nuclei with DNase I did not reduce the ability of the DNA to protect globin cDNA. Thus, in fibroblasts the globin genes are no more sensitive than the bulk of the chromatin.
B. Trypsin treatment of nucleosome monomers affects a specific population of monomers, namely those that were present in active chromatin. This conclusion comes from a comparison of trypsin-treated monomers with DNase I-treated red-cell DNA. Digestion of trypsin-treated nucleosome monomers with micrococcal nuclease yielded DNA that protected globin cDNA to the same extent as DNase I-treated red-cell DNA (see Table 4-1). If a random population of nucleosomes had been affected, all the DNA sequences present in the untreated nucleosomes would still be present in the trypsin-treated nucleosomes. Since hybridization was carried out in a vast excess of DNA, both the untreated and treated monomers would have behaved identically in their capacity to protect globin cDNA and total red cell DNA.

Figure 4-45 Structures of serine, lysine, and their modifications in histone tails (Answer 4-65). Each of the amino acid residues is shown as it would exist in the peptide structure of a histone. [Figure labels: phosphoserine; methyl lysine; acetyl lysine.]

Since the DNA in individual nucleosome monomers showed the same sensitivity to DNase I as chromatin in nuclei, the property of active chromatin that distinguishes it from bulk chromatin must be present in individual
nucleosomes. This viewpoint is supported further by the observation that trypsin treatment of nucleosome monomers renders those from active chromatin sensitive to micrococcal nuclease. Individual monomers from regions of active chromatin must be physically distinct from other nucleosome monomers.

Reference: Weintraub H & Groudine M (1976) Chromosomal subunits in active genes have an altered conformation. Science 193, 848–856.

Treatment with sodium butyrate inhibits HDAC activity (Figure 4-16C) but has no effect on HAT activity (Figure 4-16B). Thus, acetylated histones accumulate in butyrate-treated Friend cells (and other cells as well) because HDAC activity is inhibited. This result implies that a balance normally exists between HAT activity and HDAC activity. When HDAC activity is artificially decreased, for example by butyrate treatment, HAT activity predominates, causing the accumulation of acetylated histones.

Reference: Candido EPM, Reeves R & Davie JR (1978) Sodium butyrate inhibits histone deacetylation in cultured cells. Cell 14, 105–113.

The control proteins either do not bind to any of the histone N-terminal peptides (Pax5) or bind to all of them (Pc1 and Suv39h1). By contrast, the HP1 proteins all bind specifically to the Lys9-dimethylated form of the H3 N-terminal peptide. The strong association of HP1 proteins with heterochromatin suggests that the Lys9-dimethylated form of H3 will be found in heterochromatin.

Reference: Lachner M, O'Carroll D, Rea S, Mechtler K & Jenuwein T (2001) Methylation of H3 lysine 9 creates a binding site for HP1 proteins. Nature 410, 116–120.

Colonies are clumps of cells that originate from a single founder cell and grow outward as the cells divide repeatedly. In the red colony of Figure 4-18, the Ade2 gene is inactivated by its position next to the telomere. At a low frequency the gene is reactivated, giving rise to white cells whose descendants are also white and so form the white sectors, even though the gene has not moved away from the telomere. This result shows
both that the inactivation of a telomere-proximal gene can be reversed and that this change is passed on to future generations. This change in Ade2 expression probably results from a loosening of the local chromatin structure and is inherited stably by an epigenetic mechanism involving members of the Sir gene family.

It is apparent by visual inspection of the chromosomes in Figure 4-19A that there is a higher density of genes with increased expression (black bars) near telomeres than elsewhere after depletion of histone H4. This is confirmed by a more rigorous analysis of the data, as shown in Figure 4-46. This analysis indicates that the fraction of telomere-proximal genes that have increased expression is more than threefold higher than the genome-wide average of 15%.

It is more difficult to be certain by visual inspection alone that deletion of the Sir3 gene preferentially increases the expression of genes near telomeres. Statistical analysis of the data, as described above, makes it clearer (Figure 4-46). The fraction of telomere-proximal genes that have increased expression is nearly tenfold higher than the genome average of 1.5%.

The loss of Sir3 would inactivate the Sir protein complex, which would dramatically inhibit deacetylation of histones in the region of the telomere. The normally deacetylated histones near telomeres allow the nucleosomes to pack together into tighter arrays, which are associated with lower levels of expression. In the absence of deacetylation, nucleosomes would be expected to pack less tightly near telomeres, and gene expression should be increased, as it is. From the analysis shown in Figure 4-46, the effect of Sir3 is most apparent within 10 kb of the chromosome ends.

Figure 4-46 Extent of telomeric gene activation after depletion of histone H4 or deletion of the Sir3 gene (Answer 4-72). For this analysis, all the chromosomes were aligned by their telomeres and the results were summed. Windows 50 genes wide were moved a gene at a time across the chromosomes, starting at the telomere. At each position, the fraction of genes in the window with increased expression was determined and plotted as kb from the midpoint of the window to the telomere. [Plot labels: histone H4 depletion; Sir3 deletion; x axis, kb from telomere (10, 20, 30, 40).]

[Figure 4-47 labels: fragments that do not hybridize to the probe after BamHI cleavage; fragments that do hybridize to the probe after BamHI cleavage; fragment lengths including 600, 760, 920, 1080, and 1240; probe; BamHI.]

Depletion of histone H4 is thought to cause a general, genome-wide reduction in the number of nucleosomes. A specific effect on gene expression of telomere-proximal genes was not expected. The effect extends out to 15 kb or so away from the telomere, which is farther than the effect of loss of Sir3. This result suggests that some genes near telomeres are normally repressed by a nucleosome-specific mechanism that may be independent of the effects of the Sir protein complex. The mechanism of this effect is not yet defined. These observations suggest that a special form of chromatin may extend beyond the region to which the Sir protein complex is bound.

Reference: Wyrick JJ, Holstege FC, Jennings EG, Causton HC, Shore D, Grunstein M, Lander ES & Young RA (1999) Chromosomal landscape of nucleosome-dependent gene expression and silencing in yeast. Nature 402, 418–421.

Micrococcal nuclease generates fragments whose lengths vary depending on the spacing of the internucleosomal cleavages that define the ends of the fragments (Figure 4-47). Since micrococcal nuclease does not cleave at precise sites within the linker DNA, there is some variability in the lengths of fragments produced by cleavage, even between the same two pairs of nucleosomes. Furthermore, similar-sized fragments can be produced by cleavage between several different pairs of internucleosomal sites. These sorts of variability obscure the fine-structure details of the ordering of adjacent nucleosomes. Digestion with BamHI sharpens
the pattern of bands because it precisely defines one end of each DNA fragment. As shown in Figure 4-47, only the fragments to the right of the BamHI-cleavage site hybridize to the radioactive probe. The resulting pattern is easy to interpret because the length of each fragment gives the distance from the nuclease-cleavage site to the BamHI site directly. In the absence of BamHI cleavage, the bands are defined by micrococcal-nuclease cleavage at both ends. Such a pattern does not allow one to deduce the exact sites of nuclease cleavage relative to the probe. The method for mapping nuclease-cut sites illustrated in this problem is called indirect end labeling because a defined end (the BamHI-cleavage site) is labeled indirectly through hybridization to a radiolabeled probe.

Figure 4-47 Diagram relating indirect end labeling to the fragment lengths observed in Figure 4-21 (Answer 4-73). Vertical lines indicate sites of micrococcal nuclease cleavage. The small gap in the micrococcal-nuclease fragments shows the position of BamHI cleavage. Numbers refer to the lengths of the fragments that hybridize to the probe; they correspond to the lengths shown in Figure 4-21.

Figure 4-48 Positions of micrococcal nuclease cleavage sites and arrangement of nucleosomes around the centromere (Answer 4-73). Nucleosomes are shown as shaded circles. [Figure labels: CEN3; cutting sites of micrococcal nuclease; BamHI.]

The sizes of the bands indicate the distances between the nuclease-cut sites and the BamHI-cleavage site (Figure 4-47). Since micrococcal nuclease cleaves between nucleosomes, the cut sites define the positions of the nucleosomes relative to the BamHI site (Figure 4-48). With the exception of the region around the centromere, the cut sites are spaced at 160-nucleotide intervals, suggesting that the nucleosomes occupy about 160 nucleotides of DNA. The cut sites on either side of the centromere are 250 nucleotides apart, suggesting that some special structure covers the centromere. It is thought
that centromere-specific proteins bind to a centromere-specific nucleosome and protect the centromere from nuclease digestion. The cleavage sites on either side of the centromere indicate that there is unprotected DNA between the centromere-specific structure and the adjacent nucleosomes on either side.

The naked DNA control is important because not all DNA sequences are equally susceptible to micrococcal-nuclease cleavage. It is essential to know the susceptibility of the specific DNA sequence under investigation. Otherwise, one can be fooled into thinking that a specific band results from the binding of a protein adjacent to the cleavage site when it actually derives from the cleavage specificity of the nuclease. Indeed, the centromere itself is a preferred site of cleavage (although that was left out of the naked DNA digestion shown in Figure 4-21); the absence of cutting at this sensitive site in chromatin is all the more evidence that the centromere is specifically protected.

The results in Figure 4-21 answer this question very elegantly. The band patterns from the three plasmids are the key results. If the nucleosomes were ordered simply because they were lined up next to the special structure at the centromere, then it should not make any difference what DNA sequence was present beyond the centromere. Results with plasmids 2 and 3 show clearly that the ordered arrangement disappears at the point where the bacterial sequences (plasmid 2) or the noncentromeric yeast sequences (plasmid 3) begin. This result argues strongly that the regular ordering of nucleosomes around the centromere is due to some feature of the sequence of the neighboring DNA itself.

Reference: Bloom KS & Carbon J (1982) Yeast centromere DNA is in a unique and highly ordered structure in chromosomes and small circular minichromosomes. Cell 29, 305–317.

THE GLOBAL STRUCTURE OF CHROMOSOMES

DEFINITIONS
4-74 Polytene chromosome
4-75 Lampbrush chromosome
4-76 Mitotic chromosome

TRUE/FALSE
4-77 False. The loops are
actively transcribed but contain only a minority of the DNA. Most of the DNA in lampbrush chromosomes is highly condensed in the chromomeres and generally transcriptionally inactive.
4-78 True. If the covering of ribonucleoproteins is first stripped away, the looped structure of individual chromatids of a mitotic chromosome can be seen by electron microscopy.

THOUGHT PROBLEMS
4-79 In amphibian oocytes, the loops in lampbrush chromosomes are actively transcribed, suggesting that loops may correspond to active genes. Based on this analysis of lampbrush chromosomes, it is thought that interphase chromosomes in most animals may adopt such a looped structure, but the structure is too fragile to be observed. If a human chromosome were to form a visibly looped, lampbrush-like structure in amphibian oocytes, it should be possible (by hybridization, for example) to map the sequences that are present in the loops and determine directly whether loops correspond with known transcription units in the human genome.

The results of these heterologous injection experiments show very clearly that loop structure is not an intrinsic property of the chromosome: the same chromosomes adopt different loop structures depending on the type of oocyte into which they are injected. Thus, loop structure seems to be determined by the proteins that are in the oocyte. If this is the case, then lampbrush chromosomes formed from mammalian chromosomes may yield little or no information about the putative natural domains that are thought to exist in mammalian cells.

Reference: Gall JG & Murphy C (1998) Assembly of lampbrush chromosomes from sperm chromatin. Mol. Biol. Cell 9, 733–747.

Although chromosomes in individual cells occupy small discrete areas, the particular location is unlikely to be important. A variety of experiments indicate that the position of a gene in the interior of the nucleus changes when it becomes highly expressed, so that it extends out of its chromosome territory, as if in an extended loop. Thus, regardless of a chromosome's average position, individual segments of it can move to functionally distinct environments in the nucleus.

DATA HANDLING
4-82 The uniform pattern of labeling observed with the smaller chromatin loops is the expected pattern. At first it might seem that, if transcription proceeds from one end of a loop to the other, the labeling pattern should also progress from one end to the other, as it does for the giant loop in Figure 4-22. However, because transcription occurs throughout a chromatin loop by multiple RNA polymerases, as illustrated in Figure 4-23, 3H-uridine will be incorporated throughout the length of the loop as each polymerase molecule adds a uridine nucleotide to the growing RNA chain (Figure 4-49).

What then is the explanation for the labeling pattern observed in the giant loops? The answer is not yet known. Because the loops do not label until after a day, it is thought that these loops may not be transcribed at all. Instead, they may be storage sites for RNA that is synthesized elsewhere, with an entry site into the loop at one end.

Reference: Callan HG (1963) The nature of lampbrush chromosomes. Int. Rev. Cytol. 15, 1–34.

Throughout the 315-kb chromosomal segment that was analyzed, the relative amounts of DNA in polytene and diploid chromosomes varied by no more than 150. This variation is rather small considering that polytene chromosomes are amplified about 1000-fold. Furthermore, there is no indication that the low values are clustered at interbands. Thus, these results argue strongly that bands are darker than interbands because they are differentially stained rather than differentially replicated. The darker staining presumably reflects a more condensed chromatin structure, a higher density of proteins, or both.

The identical restriction pattern in diploid and polytene DNA also supports this conclusion in a more subtle way. If the differential replication model were correct, there would be occasional restriction fragments that
spanned a replication fork. These branched fragments would migrate very differently from their unbranched counterparts (see Problem 5-61); no such anomalously migrating fragments were observed.

Figure 4-49 Uniform incorporation of radioactive uridine into a chromatin loop (Answer 4-82). Newly synthesized RNA is indicated by open rectangles adjacent to RNA polymerase molecules, which are represented by filled circles. Unlabeled RNA is shown as thin lines. [Figure labels: direction of polymerization; labeled RNA segment; unlabeled RNA segment; RNA polymerase; lampbrush chromosome axis.]

Reference: Spierer A & Spierer P (1984) Similar levels of polyteny in bands and interbands of Drosophila giant chromosomes. Nature 307, 176–178.

A few of the untreated plasmid molecules are relaxed because they contain one or more single-strand breaks in the DNA. DNA that is carefully handled during the isolation procedure is mostly supercoiled, as shown in Figure 4-27, lane 1. Harsh treatment during isolation will yield a DNA preparation that is almost entirely relaxed.

The bands that run at intermediate positions are a collection of topoisomers that differ only in their linking number, that is, the number of times one strand is wrapped around the other. Adjacent bands on the gel contain topoisomers that differ by a linking number of one. The rate at which DNA molecules migrate through a gel depends on how compact they are, with more compact molecules moving faster. Relaxed circular molecules are the least compact and therefore move the slowest, whereas highly supercoiled molecules are the most compact and run the fastest. Treatment with topoisomerase removes supercoils one at a time, making the molecule progressively less compact and slower in migrating.

The number of supercoils in the original plasmid can be estimated by counting the number of bands between the highly supercoiled and relaxed positions of the gel. About eight intermediate bands can be counted in the treated samples;
thus there must be at least nine supercoils in the original plasmid. This number is likely to be an underestimate because of the limited resolving power of such gels. Once a molecule reaches a certain degree of compactness, a gel cannot resolve molecules with further increases in supercoiling.

Since the supercoils are removed by E. coli topoisomerase I, the original supercoils must have been negative; E. coli topoisomerase I will not relax positive supercoils.

In order for the circular DNA molecules to retain a net supercoiling of zero, plectonemic supercoils and solenoidal supercoils must have the opposite sign. Thus, a negative solenoidal supercoil can be compensated for by a positive plectonemic supercoil, and vice versa. Arrangements B and C are the only structures with zero net supercoiling, as shown in Figure 4-50.

Figure 4-50 The two compensating arrangements of solenoidal and plectonemic supercoils (Answer 4-85). The signs of the solenoidal and plectonemic supercoils are indicated by + and −.

Arrangements B and C (see Figure 4-28) are the two that have compensating solenoidal and plectonemic supercoils. These two alternative arrangements are nicely distinguished by incubation with E. coli and calf thymus topoisomerases, as shown in Figure 4-51.

Figure 4-51 Predicted outcomes of incubating alternative arrangements B and C with E. coli and calf thymus topoisomerase (Answer 4-86). The signs of the solenoidal and plectonemic supercoils are indicated by + and −. Following the indicated incubations with the topoisomerases, histones were removed. [Figure labels: E. coli topo; calf thymus topo; relaxed; negative; positive.]

For arrangement B, incubation with E. coli topoisomerase I will not affect the positive plectonemic supercoils. Thus, when the histones are removed, the molecules will have the same supercoiling that they started with: zero. By contrast, calf thymus topoisomerase I will remove the positive plectonemic supercoils, so that when the histones are
removed, the molecules will have two negative plectonemic supercoils, derived from the two negative solenoidal supercoils around the nucleosome.

Arrangement C gives different predictions for both incubations. Arrangement C has negative plectonemic supercoils that can be removed by both the E. coli and calf thymus enzymes. Thus, when the histones are removed, the molecules will have two positive plectonemic supercoils, derived from the two positive solenoidal supercoils around the nucleosome.

Since treatment with both E. coli and calf thymus topoisomerases resulted in supercoils in the final molecules (see Figure 4-30, lanes 5 and 9), the compensating plectonemic supercoils in the condensin-treated DNA must have been negative, which both topoisomerases can relax. When those were removed by topoisomerase treatment, only the positive supercoils would have remained; hence, the plectonemic supercoils in Figure 4-30, lanes 5 and 9, are expected to be positive. Since the compensating plectonemic supercoils were negative, the solenoidal supercoils formed by condensin were positive (right-handed).

The results in Figure 4-30 are not consistent with condensin's acting as an ATP-driven topoisomerase. If condensin introduced a net negative plectonemic supercoiling, those supercoils would have been removed by both topoisomerases to leave relaxed molecules, contrary to what was observed (see Figure 4-30, lanes 5 and 9). If condensin introduced a net positive plectonemic supercoiling, the supercoils would have been removed by calf thymus topoisomerase but not by E. coli topoisomerase, contrary to what was observed (see lanes 5 and 9).

Reference: Kimura K & Hirano T (1997) ATP-dependent positive supercoiling of DNA by 13S condensin: a biochemical implication for chromosome condensation. Cell 90, 625–634.

HOW GENOMES EVOLVE

DEFINITIONS
4-88 Polymorphic
4-89 Pseudogene
4-90 Purifying selection
4-91 Single-nucleotide polymorphism (SNP)

TRUE/FALSE
4-92 True. Although this statement is not true for all human genes, it is true for many. Even more remarkably, in
some cases the human gene can substitute for the corresponding gene in yeast.
4-93 False. About 5% of the human genome is subjected to purifying selection, but only about 1.5% encodes proteins. This comparison implies that 3.5% of the human genome (more than the amount that encodes proteins) has important functions that we do not understand.
4-94 True. Duplication of chromosomal segments, which may include one or more genes, allows one of the two genes to diverge over time to acquire different but related functions. The process of gene duplication and divergence is thought to have played a major role in the evolution of biological complexity.

THOUGHT PROBLEMS
4-95 The argument is severely flawed. You cannot transform one species into another simply by introducing 1% random changes into the genome. It is exceedingly unlikely that the mutations that would accumulate every day in the absence of DNA repair would be in the very positions where human and chimp DNA sequences differ. It is much more likely that at such a high mutation frequency many essential genes would be inactivated, leading to cell death. Furthermore, your body is made up of about 10^13 cells. For you to turn into a chimpanzee, many of these cells, not just one, would need to be changed. And even then, many of these changes would have to occur during development to effect changes in your body plan.
4-96 The Hox gene clusters are packed with complex and extensive regulatory sequences that ensure the proper expression of individual Hox genes at the correct time and place during development. Insertions of transposable elements into the Hox clusters are eliminated by purifying selection, presumably because they disrupt proper regulation of the Hox genes. Comparison of the Hox cluster sequences in mouse, rat, and baboon reveals a high density of conserved noncoding segments, supporting the idea of a high density of regulatory elements.

Reference: Lander ES et al. (2001) Initial sequencing and analysis of
the human genome. Nature 409, 860-921.

CALCULATIONS

4-97 The missing values in Table 4-2 can be obtained by counting the differences between the pairwise combinations of the hemoglobin α chains in Figure 4-32. The differences are 18 for human versus frog, 17 for frog versus chicken, 12 for chicken versus whale, and 17 for whale versus fish.

The difference matrix shows that human and whale are the two most closely related species. The underlying assumption is that the fewer the differences, the less the evolutionary distance between the species.

In the cluster-analysis method, the two closest species (human and whale) are combined and the average differences are determined for the other species.

                 Frog    Chicken   Fish
Human/whale      17.5    11.5      17

This analysis shows that the chicken is the next closest species. It is then combined with human and whale and used to determine the average differences relative to the remaining species.

                        Frog    Fish
Human/whale/chicken     17.3    18

This establishes the order of the final two species and gives the overall branching structure shown in Figure 4-52.

If you simply used the number of differences relative to human to place the other species on the tree, the order of frogs and fish would have been reversed. The cluster-analysis method is superior because it makes use of all of the information in the difference-matrix table and reduces error due to random variation in the mutational history of individual species.

Figure 4-52 Branching order for the five species (Human, Whale, Chicken, Frog, Fish), based on the differences in the first 30 amino acids of their hemoglobin α chains (Answer 4-97).

Figure 4-53 Three representations of the phylogenetic tree derived from the data in this problem (Answer 4-98). These trees are termed unrooted trees because the node where this cluster of species diverged from more primitive species is not shown, nor can it be derived from the data in the problem.

4-98
A. For five species there are 10 equations, which
is more than enough to solve for the 7 line segments that make up the tree.

(1) Human-whale: $a + b = 8$
(2) Human-chicken: $a + c + d = 11$
(3) Human-frog: $a + c + e + f = 18$
(4) Human-fish: $a + c + e + g = 17$
(5) Whale-chicken: $b + c + d = 12$
(6) Whale-frog: $b + c + e + f = 17$
(7) Whale-fish: $b + c + e + g = 17$
(8) Chicken-frog: $d + e + f = 17$
(9) Chicken-fish: $d + e + g = 20$
(10) Frog-fish: $f + g = 20$

B. Solutions to the two three-at-a-time equations are shown below. In each case the three equations are summed in such a way as to eliminate all but one variable. For human-whale-chicken this involves subtracting equation 5 from the sum of equations 1 and 2.

Human-whale-chicken
(1) Human-whale: $a + b = 8$
(2) Human-chicken: $a + c + d = 11$
(5) Whale-chicken: $b + c + d = 12$
(1) + (2) - (5): $2a = 7$, so $a = 3.5$ and $b = 4.5$

Human-whale-frog
(1) Human-whale: $a + b = 8$
(3) Human-frog: $a + c + e + f = 18$
(6) Whale-frog: $b + c + e + f = 17$
(1) + (3) - (6): $2a = 9$, so $a = 4.5$ and $b = 3.5$

Note that the values for a and b are not the same. Because there are more equations than unknowns, multiple values for the unknowns are obtained. These are averaged to get the distances represented in the phylogenetic trees. Three common representations of the phylogenetic tree obtained from the data in this problem are shown in Figure 4-53.

DATA HANDLING

4-99
A. The exons in the human β-globin gene correspond to the positions of homology with the cDNA, which is a direct copy of the mRNA and thus contains no introns. The introns correspond to the regions between the exons. The positions of the introns and exons in the human β-globin gene are indicated in Figure 4-54A.

From the positions of the exons as defined in Figure 4-54A, it is clear that the first two exons of the human β-globin gene have homologous counterparts in the mouse β-globin gene (Figure 4-54B). Only the first half of the third exon of the human β-globin gene is homologous to the mouse β-globin gene. The homologous portion of the third exon contains sequences that encode protein, whereas the nonhomologous portion represents the 3′ untranslated region of the gene. Since this portion of the gene does not encode protein, nor does it contain
extensive regulatory sequences, it evolves at a rate similar to that of introns.

[Figure 4-54 panels: (A) Exons in human β-globin cDNA; (B) Homology between mouse and human genes.]

The human and mouse β-globin genes are also homologous at their 5′ ends, as indicated by the cluster of points along the same diagonal as the first exon (Figure 4-54B). These sequences correspond to the regulatory regions in front of the start sites for transcription. The regulatory function of this region has limited its evolutionary divergence, much as the coding function of exons has limited their divergence. Functional sequences, which are under selective pressure, diverge much more slowly than sequences without function.

The diagon plot shows that the first intron is nearly the same length in the human and mouse genes, but the length of the second intron is noticeably different (Figure 4-54B). If the introns were the same length, the line segments that represent homology would fall on the same diagonal. The easiest way to test for the colinearity of the line segments is to tilt the page and sight along the diagonal. It is impossible to tell from this comparison if the change in length is due to a shortening of the mouse intron, or to a lengthening of the human intron, or to some combination of those possibilities.

Reference: Konkel DA, Maizel JV & Leder P (1979) The evolution and sequence comparison of two recently diverged mouse chromosomal β-globin genes. Cell 18, 865-873.

4-100 The majority of sequences isolated from the Neanderthal bone differ by seven substitutions and a single base insertion compared with the human reference sequence. Although several clones show additional individual differences, these differences, because they are represented infrequently, are most likely to be due to misincorporation of nucleotides during the PCR reaction. They are probably caused by damage to the original template and so
can be ignored.

The variability among contemporary human lineages, as shown in Figure 4-37, is too little to encompass the sequence obtained from Neanderthal bone. Thus, it is very likely that you have indeed determined the sequence of Neanderthal DNA.

Eight of the 44 sequences show an identical match, or differ only by a single base, from the human sequence. These sequences almost certainly arose by amplification of contaminating human DNA. In the paper on which this problem is based, the authors estimated that they were amplifying from a starting population of about 50 DNA molecules in 5 μL of bone extract. A single contaminating human cell would have added 50-1000 additional contaminating sequences, so their precautions worked very well.

The main reason for using the mitochondrial DNA for these studies is its abundance compared with nuclear DNA. Cells typically contain 500-1000 copies of mitochondrial DNA molecules, compared with 2 copies of nuclear DNA molecules. With the accumulation of DNA damage over the 30,000 years or so since this individual died, abundance is critical for success. The second reason is that mitochondrial DNA is much more variable than nuclear DNA. The ability to detect multiple differences is also critical for showing that this one Neanderthal sample is different from human DNA.

Figure 4-54 Interpretation of diagon plots (Answer 4-99). (A) Positions of the exons in the human β-globin gene. (B) The relationship of the homologous mouse sequences to the exons of the human gene. Exons are shown as boxes; the white areas indicate the 5′ and 3′ noncoding sequences.

The most important way to verify these results is to isolate DNA from a second Neanderthal sample. Three years after this initial report, the same segment of mitochondrial DNA from a second, sufficiently preserved specimen was sequenced. Although these two individuals differed at several positions in the sequence, as expected, they shared 19 substitutions relative to the
reference human sample. By sequence analysis these two Neanderthals form a group that is distinct from modern humans. It is estimated that the modern human and Neanderthal lineages diverged between 360,000 and 850,000 years ago.

References:
Krings M, Stone A, Schmitz RW, Krainitzki H, Stoneking M & Pääbo S (1997) Neanderthal DNA sequences and the origin of modern humans. Cell 90, 19-30.
Ovchinnikov IV, Götherström A, Romanova GP, Kharitonov VM, Lidén K & Goodwin W (2000) Molecular analysis of Neanderthal DNA from the northern Caucasus. Nature 404, 490-493.

4-101
A. The left and right boundaries of the inserted Alu sequences and the mutational changes in the flanking chromosomal sequences (arbitrarily shown on the 5′ side) are indicated in Figure 4-55. The boundaries can be located unambiguously using two approaches. By comparing the sequences vertically, one can make a tentative assignment of the boundaries based on the point at which the sequences diverge. The second approach makes use of a common feature of the Alu insertion process, namely, the duplication of chromosomal sequences at the target of insertion. A comparison of the sequences to the left and right of each Alu shows that they are tandemly duplicated; one end of each duplication is precisely at the tentative boundary assigned on the basis of the vertical comparison.

The mutations in the flanking sequences can be located easily by comparing the repeated segments that were generated when each Alu inserted into the chromosome.

B. As indicated in Figure 4-55, there are five nucleotide changes in the 120 nucleotides of duplicated DNA that flank the Alu sequences. Using the estimate of $3 \times 10^{-3}$ substitutions per site per million years, the Alu sequences inserted into the human albumin-family genes about 14 million years ago.

$$\text{years after insertion} = \frac{10^{6}\ \text{years} \times 1\ \text{site}}{3 \times 10^{-3}\ \text{mutations}} \times \frac{5\ \text{mutations}}{120\ \text{sites}} = 14 \times 10^{6}\ \text{years}$$

These particular flanking sequences are critical for the calculation because they were generated by Alu insertion into the human albumin-family
genes. Thus, the sequences that constitute the target-site duplications mark the time of Alu insertion. Additional intron sequences would not help in the calculation because mutations outside the target-site duplication are unrelated to the time of Alu insertion. The mutations in the Alu sequences themselves are also not useful for estimating the time of insertion. Some of the observed mutations in the Alu sequences very likely were generated while they sat in the genome at another location, prior to the time at which a copy inserted into the human albumin-gene family.

Figure 4-55 Boundaries of Alu inserts and mutational alterations in the flanking target-site duplications (Answer 4-101). Boundaries are indicated by vertical lines; mutational differences are indicated by underlining in one copy of the repeats.

C. The calculation in part B indicates that these Alu sequences invaded the human albumin-gene family about 14 million years ago, that is, well after the mammalian radiation and the separation of the lineages leading to rats and humans (85 million years ago).

Reference: Ruffner DE, Sprung CN, Minghetti PP, Gibbs PEM & Dugaiczyk A (1987) Invasion of the human albumin-α-fetoprotein gene family by Alu, Kpn, and two novel repetitive DNA elements. Mol. Biol. Evol. 4, 1-9.

4-102 Infants 2 and 8 have identical patterns and therefore must be brothers. Infants 3 and 6 also have identical patterns and must be brothers. These two sets of brothers are identical twins. The other two sets of twins must be fraternal twins because no other pairs of patterns are identical. Fraternal twins, like any pair of siblings born to the
same parents, will have roughly half their genome in common. Thus, roughly half the simple-sequence polymorphisms in fraternal twins will be identical. Using this criterion, you can identify infants 1 and 7 as brothers and infants 4 and 5 as brothers.

You can match infants to their parents using the same sort of analysis of simple-sequence polymorphisms. Every band present in the analysis of an infant should have a matching band in one or the other of the parents, and on average each infant will share half of its simple-sequence polymorphisms with each parent. Thus, the degree of match between each child and each parent will be the same as that between fraternal twins.

Reference: Gelehrter TD & Collins FS (1990) Principles of Medical Genetics, p. 80. Baltimore: Williams & Wilkins.

Chapter 1 Cells and Genomes

THE UNIVERSAL FEATURES OF CELLS ON EARTH

DEFINITIONS

1-1 Plasma membrane
1-2 Enzyme
1-3 Transcription
1-4 Translation
1-5 Gene
1-6 Messenger RNA (mRNA)
1-7 Amino acid
1-8 Genome

TRUE/FALSE

1-9 True. Even in eucaryotes, where the coding regions of a gene are often interrupted by noncoding segments, the order of codons in the DNA is still the same as the order of amino acids in the protein.

1-10 False. The nucleotide subunits of RNA and DNA differ in two key ways. First, the backbone in RNA uses the sugar ribose instead of deoxyribose, which is used in DNA. Second, RNA uses the base uracil in place of the base thymine, which is used in DNA. Three of the four bases (A, C, and G) are the same in RNA and DNA.

THOUGHT PROBLEMS

1-11 Trying to define life in terms of properties is an elusive business, as suggested by this scoring exercise (Table 1-2). Cars are highly organized objects, take energy from the environment, and transform gasoline into motion, responding to stimuli from the driver as they do so. However, they cannot reproduce themselves or grow and develop, but then neither can old animals. Cacti are not particularly responsive to stimuli, but they display other life attributes. It is curious that
standard definitions of life usually do not mention that living organisms on Earth are largely made of organic molecules, that life is carbon based. The first few pages of MBoC emphasize this point and discuss the properties of living cells mainly in terms of their informational macromolecules: DNA, RNA, and protein.

Reference: Pace NR (2001) The universal nature of biochemistry. Proc. Natl Acad. Sci. USA 98, 805-808.

In This Chapter
THE UNIVERSAL FEATURES OF CELLS ON EARTH
THE DIVERSITY OF GENOMES AND THE TREE OF LIFE
GENETIC INFORMATION IN EUCARYOTES

Table 1-2 Plausible "life" scores for car, cactus, and humans (Answer 1-11).

Characteristic         Car    Cactus   Human
1. Organization        Yes    Yes      Yes
2. Homeostasis         Yes    Yes      Yes
3. Reproduction        No     Yes      Yes
4. Development         No     Yes      Yes
5. Energy              Yes    Yes      Yes
6. Responsiveness      Yes    No       Yes
7. Adaptation          No     Yes      Yes

1-12 Such modules are generally designed to look for organic molecules characteristic of life. The first Mars probe analyzed soil samples for amino acids; none were found.

1-13 It is extremely unlikely that you created a new organism in this experiment. Far more probably, a spore from the air landed in your broth, germinated, and gave rise to the cells you observed. In the middle of the nineteenth century, Louis Pasteur invented a clever apparatus to disprove the then widely accepted belief that life could arise spontaneously. He showed that sealed flasks never grew anything if properly heat sterilized first. He overcame the objections of those who pointed out the lack of oxygen, or who suggested that his heat sterilization killed the life-generating principle, by using a special flask with a slender swan's neck, which was designed to allow in oxygen but to prevent spores carried in the air from contaminating the culture (Figure 1-4). The cultures in these flasks never showed any signs of life; however, they were capable of supporting life, as could be demonstrated by washing some of the dust from the neck into the culture.

1-14 On the surface, the extraordinary mutation
resistance of the genetic code argues that it was subjected to the forces of natural selection. An underlying assumption, which seems reasonable, is that resistance to mutation is a valuable feature of a genetic code, one that would allow organisms to maintain sufficient information to specify complex phenotypes. This reasoning suggests that it would have been a lucky accident indeed (roughly a one-in-a-million chance) to stumble on a code as error proof as our own.

But all is not so simple. If resistance to mutation is an essential feature of any code that can support the complexity of organisms such as humans, then the only codes we could observe are ones that are error resistant. A less favorable frozen accident, giving rise to a more error-prone code, might limit the complexity of life to organisms that would never be able to contemplate their genetic code. This is akin to the anthropic principle of cosmology: many universes may be possible, but few are compatible with life that can ponder the nature of the universe.

Beyond these considerations, there is ample evidence that the code is not static, and thus could respond to the forces of natural selection. Deviant versions of the standard genetic code have been identified in the mitochondrial and nuclear genomes of several organisms. In each case, one or a few codons have taken on a new meaning.

Reference: Freeland SJ & Hurst LD (1998) The genetic code is one in a million. J. Mol. Evol. 47, 238-248.

1-15 There are several approaches you might try.

1. Analysis of the amino acids in the proteins would indicate whether the set of amino acids used in your organism differs from the set used in Earth organisms. But even Earthly organisms contain more amino acids than the standard set of 20, for example, hydroxyproline, phosphoserine, and phosphotyrosine, which all result from modifications after a protein has been synthesized. Absence of one or more of the common set might be a more significant result.

2. Sequencing DNA from the Europan organism would allow a direct comparison with the database of sequences that are already known for Earth organisms. Matches to the database would argue for contamination. Absences of matches would constitute a less strong argument for a novel organism; it is a typical observation that about 15% to 20% of the genes identified in complete genome sequences of microorganisms do not appear to be homologous to genes in the database. Sufficiently extensive sequence comparison should resolve the issue.

Another approach might be to analyze the organism's genetic code. We have no reason to expect that a novel organism based on DNA, RNA, and protein would have a genetic code identical to Earth's universal genetic code.

Figure 1-4 Flasks used in Pasteur's tests of spontaneous generation (Answer 1-13). Panels: original flask; swan's neck flask.

In double-stranded DNA, which forms the genomes in all cellular life, G pairs with C and A pairs with T. It is this requirement for base-pairing that necessitates that the number of Gs will equal the number of Cs, and that the numbers of As and Ts will be the same. In bulk samples of DNA this translates into equivalent mole percents of G and C, and of A and T.

The virus ΦX174 does not obey the rules because its genome is single-stranded DNA. In the absence of a requirement for systematic base pairing, there is no constraint on the relative amounts of G and C, or of A and T.

Schrödinger answered his rhetorical question as follows: "The obvious inability of present-day physics and chemistry to account for such events is no reason at all for doubting that they can be accounted for by those sciences." It is remarkable how much progress has been made since 1944, when the structure of DNA was completely unknown, its role was just beginning to come into focus, no protein had yet been sequenced, and the secret of the catalytic power of enzymes was very mysterious. Simple tests had already shown that plants and animals obeyed the laws of thermodynamics: neither cells
nor organisms can create energy from nothing. All organisms require an input of energy from the environment to grow and reproduce, even to stay alive.

Physicists have improved x-ray crystallography to the point where the structures of large proteins can be determined in weeks, and chemists can sequence whole bacterial genomes in a similar time. Organic chemists now understand enzyme-catalyzed reactions as well as any they study. The details of the metabolism of obscure bacteria living at extreme ocean depths on a diet of sulfur and carbon monoxide are well understood. The genes that control the intricate body plans of insects are mapped and sequenced.

Yet many mysteries remain, of which one of the deepest is reflected by the enduring truth of Rudolf Virchow's famous 1859 aphorism: Omnis cellula e cellula (All cells come from cells). One cannot yet mix a defined brew of DNA, RNA, and proteins together with some lipids and expect to generate a cell from its constituents. Will the next 50 years see this overturned?

1-18 During replication, parental DNA serves as a template for synthesis of new DNA. During transcription, DNA serves as a template for synthesis of RNA. During translation, RNA (mRNA) serves as the template for synthesis of protein. Two other processes, (D) RNA → DNA, called reverse transcription, and (E) RNA → RNA, called RNA replication, occur in the life cycles of RNA viruses such as HIV and poliovirus.

CALCULATIONS

1-19
A. The number (n) of generations of cell divisions required to produce $10^{13}$ cells is

$2^n = 10^{13}$

It is useful to remember that $2^{10} \approx 10^3$ [$2^n$ produces the series 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024; thus, $2^{10} = 1024 \approx 10^3$]. If $10^3$ cells result from ten generations of dividing, $10^{12}$ cells will result from 4 × 10 = 40 generations. Thus, you can estimate quickly that it will take a little over 40 generations to reach $10^{13}$ cells. You can get a more accurate answer (43.2) by plugging different values of n into your calculator. Alternatively, you can solve the equation for
n, which tests your familiarity with logarithms. Remember that $2 = 10^{\log 2}$ and $2^n = 10^{n \log 2}$. Substituting,

$10^{n \log 2} = 10^{13}$

Taking the log of both sides,

$n \log 2 = 13$
$n = 13 / \log 2 = 13 / 0.301$
$n = 43.2$

If cells divided once per day and all cells continued to divide, it would take 43.2 days to generate the number of cells in an adult human. Obviously, we don't become adults in 43 days. The simple answer is that all cells don't continue to divide once per day, and some cells are programmed to die. As cells differentiate, they generally slow their rate of division, ultimately, in the adult, dividing just often enough to replace cells that are lost or die. Of course, the real answer is much more complex, involving time for cell movements, for local [...] matrices to be laid down, for cells to differentiate, for global patterns to develop, and so on.

For calculations such as these, it is useful for purposes of estimation to remember that $4^5 \approx 10^3$ [$4^n$ produces the series 4, 16, 64, 256, 1024; thus, $4^5 = 1024 \approx 10^3$] and that $(1/4)^5 \approx 1/10^3$. Hence, 4 different nucleotides can generate 1024 different DNA sequences, each 5 nucleotides long. Similarly, an 8-nucleotide DNA sequence can provide enough diversity to tag 25,000 genes, there being $4^8$, or 65,536, possible 8-nucleotide sequences. However, one would expect that most of these sequences would be present more than once in the $3.2 \times 10^9$ nucleotides of the human genome. Indeed, for a sequence tag to be rare enough to be expected to be present only once, it would have to be at least 16 nucleotides long. A 16-nucleotide sequence would be expected to be present about 0.7 times in the haploid human genome [$(1/4)^{16} \times 3.2 \times 10^9 = 0.75$].

A probability calculation should properly be used to assess the likelihood that a tag is sufficiently long to be unique in the genome. For a sequence that is present in one gene, what is the probability that it is also present elsewhere in the genome? The probability of a match ($P_M$) in any one comparison is the chance of a match at every nucleotide (1/4). Thus, for one comparison,

$P_M = (1/4)^{16}$
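The estimates above (the number of generations needed to reach $10^{13}$ cells, the number of possible 8-nucleotide tags, and the expected number of occurrences of a 16-nucleotide tag in the human genome) can be checked numerically. The following is a minimal Python sketch, not part of the original answer; the function names are hypothetical and chosen only for illustration, and the genome size is the value quoted in the text.

```python
import math

# Assumption: haploid human genome size as quoted in the text.
GENOME_SIZE = 3.2e9


def generations_needed(target_cells: float) -> float:
    """Solve 2**n = target_cells for n, using n = log(target) / log(2)."""
    return math.log10(target_cells) / math.log10(2)


def possible_tags(tag_length: int) -> int:
    """Number of distinct DNA sequences of a given length (4 nucleotides)."""
    return 4 ** tag_length


def single_match_probability(tag_length: int) -> float:
    """Chance that one random position matches the tag at every nucleotide."""
    return 0.25 ** tag_length


def expected_occurrences(tag_length: int, genome_size: float) -> float:
    """Expected number of chance occurrences of the tag in the genome."""
    return genome_size * single_match_probability(tag_length)


print(round(generations_needed(1e13), 1))                # 43.2
print(possible_tags(8))                                  # 65536
print(round(expected_occurrences(16, GENOME_SIZE), 2))   # 0.75
```

Treating each genome position as one independent comparison is the same simplification the text makes; it ignores base-composition bias and repeats in real genomes.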
Since the probability of all events is 1, the probability of not matching ($P_N$) in one comparison is

$P_N = 1 - P_M = 1 - (1/4)^{16}$

And the probability of not matching in any number of comparisons (c) is

$P_N = [1 - (1/4)^{16}]^{c}$

For a 16-nucleotide sequence and $3.2 \times 10^9$ comparisons (imagine sliding the 16-nucleotide segment one nucleotide at a time along the sequence of the human genome), the probability of not matching elsewhere is

$P_N = 0.47$

Or, since $P_N + P_M = 1$,

$P_M = 0.53$

Thus, for a 16-nucleotide sequence there is about a 1 in 2 chance that it will be present elsewhere in the human genome. As you can calculate, a 19-nucleotide sequence, for example, reduces the probability of a match to about 1 in 100.

1-21 The surface-to-volume ratio for a sphere is $4\pi r^2 / (\tfrac{4}{3}\pi r^3) = 3/r$; thus, the ratio is inversely proportional to radius. Consequently, relative to a human cell, a bacterium has 10 times more surface per volume of cytoplasm to allow the passage of nutrients in and waste products out. The bacteria, however, grow 72 times faster than human cells, which suggests that something besides the available surface limits the rate of growth.

THE DIVERSITY OF GENOMES AND THE TREE OF LIFE

DEFINITIONS

1-22 Virus
1-23 Model organism
1-24 Archaea
1-25 Homolog
1-26 Eucaryote
1-27 Procaryote

TRUE/FALSE

1-28 True. Phototrophs provide the major pathway by which carbon (in CO2) is incorporated into the biosphere; however, it is not the sole mechanism. Most lithotrophs can also fix carbon, but the amounts are tiny in comparison to the carbon fixed by phototrophs.

1-29 False. The clusters of human hemoglobin genes arose during evolution by duplication from an ancient ancestral globin gene; thus, they are examples of paralogous genes. The human hemoglobin α gene is orthologous to the chimpanzee hemoglobin α gene, as are the human and chimpanzee hemoglobin β genes, etc. All the globin genes, including the more distantly related gene for myoglobin, are homologous to one another.

THOUGHT PROBLEMS

1-30 Whether it's sunlight or inorganic chemicals, to
feed means to obtain free energy and building materials from. In the case of photosynthesis, photons in sunlight are used to raise electrons of certain molecules to a high-energy, unstable state. When they return to their normal, ground state, the released energy is captured by mechanisms that use it to drive the synthesis of ATP. Similarly, lithotrophs at a hydrothermal vent obtain free energy by oxidizing one or more of the reduced components from the vent (for example, H2S → S + 2 H+), using some common molecule in the environment to accept the electrons (for example, 2 H+ + 1/2 O2 → H2O). Lithotrophs harvest the energy released in such oxidation-reduction (electron-transfer) reactions to drive the synthesis of ATP. For both lithotrophs and phototrophs, the key to success is the evolution of a molecular mechanism to capture the available energy and couple it to ATP synthesis.

For all organisms, be they phototrophs, organotrophs, or lithotrophs, their ability to obtain the free energy needed to support life depends on the exploitation of some nonequilibrium condition. Phototrophs depend on the continual flux of radiation from the sun; organotrophs depend on a supply of organic molecules, provided ultimately by phototrophs, that can be oxidized for energy; and lithotrophs depend on a supply of reduced inorganic molecules, provided, for example, by hydrothermal vents, that can be oxidized to produce free energy.

The hemoglobin of the giant tube worms binds O2 and H2S and transports them to the symbiotic bacteria, which use the H2S as an electron donor and the O2 as an electron acceptor to generate ATP and reducing power to meet their energy needs. The resulting growth of the bacteria benefits the worms by providing increased waste products and dead bodies to live on. Moreover, in the process the toxic H2S is rendered harmless by oxidation to elemental sulfur, thereby preventing it from poisoning the worms.

The balanced equation for oxygenic photosynthesis, derived from experiments
using water with isotopically labeled oxygen, is

6 CO2 + 12 H2O + light → C6H12O6 + 6 H2O + 6 O2

In this form of the equation, it is apparent that the O2 derives from H2O and that all the oxygen in glucose derives from CO2.

Four (Figure 1-5).
1. All could have split from the common ancestor at the same time.
2. Eubacteria/archaea could have split from eucaryotes, followed by the separation of eubacteria from archaea.
3. Eubacteria/eucaryotes could have split from archaea, followed by the separation of eubacteria from eucaryotes.
4. Archaea/eucaryotes could have split from eubacteria, followed by the separation of archaea from eucaryotes.

Although horizontal transfers across these divisions make interpretations problematic, it is thought that archaea/eucaryotes first split from eubacteria, and then archaea split from eucaryotes.

Figure 1-5 The four possible relationships for the evolution of archaea (A), eubacteria (B), and eucaryotes (E) (Answer 1-33).

It is unlikely that any gene came into existence perfectly optimized for its function. It is thought that highly conserved genes, such as ribosomal RNA genes, were optimized by more rapid evolutionary change during the evolution of the common ancestor to archaea, eubacteria, and eucaryotes. Since ribosomal RNAs (and the products of most highly conserved genes) participate in fundamental processes that were optimized early, there has been no evolutionary pressure, and little leeway, for change. By contrast, less conserved (more rapidly evolving) genes have been continually presented with opportunities to fill new functional niches. Consider, for example, the evolution of distinct globin genes that are optimized for oxygen delivery to embryos, fetuses, and adult tissues in placental mammals.

It would be impossible to identify genes in a vast stretch of Ts, As, Cs, and Gs if genes did not have some identifying characteristics. In the absence of any knowledge of gene structure in procaryotes, you might imagine that the sites
where gene transcription begins and ends might be special and thus recognizable. Similarly, you might imagine that sequences where protein synthesis begins and ends might be distinctive and thus recognizable. In reality, it is the signals for protein synthesis that have proven most valuable for identifying procaryotic genes. Genes that encode proteins, which are the vast majority, start with ATG (corresponding to the start codon AUG in the mRNA) and end with TAA, TAG, or TGA (corresponding to the three stop codons UAA, UAG, and UGA in mRNA). One searches for an ATG and then proceeds three nucleotides at a time, codon by codon, until a stop codon is reached. This procedure defines an open reading frame, or ORF. Nearly all ORFs greater than 100 codons correspond to genes. Some smaller ORFs also encode proteins and are, therefore, genes; however, many small ORFs occur by chance and do not correspond to genes. In some cases real genes can be identified among the smaller ORFs by virtue of other typical signal sequences that characterize genes in procaryotes. Nevertheless, in gene counts derived from genomic sequences an arbitrary cutoff is used so that the smallest ORFs are not included in the count.

Gene identification in eucaryotic genome sequences is much more problematical. The protein-coding regions of eucaryotic genes are often split into segments that are not finally united until the initial RNA transcript is processed to remove the noncoding RNA. Thus, the procedure used to count genes in procaryotes is not useful for eucaryotes. Computer algorithms to identify eucaryotic genes are still in their infancy and are not yet reliable.

It is not thought that formation of genes de novo from the vast amount of unused, noncoding DNA typical of eucaryotic genomes is a significant process in evolution. Mutation to generate a coding sequence, complete with regulatory elements, is too slow a process to account for the observed rates of evolutionary change.

Since it appears that genes involved in informational
processes are less subject to horizontal transfer, evolutionary trees derived from such genes should provide a more reliable estimate of evolutionary relationships. Thus, archaea most likely separated from eucaryotes after the archaea/eucaryote lineage separated from eubacteria.

Complexity is a logical explanation for the difference in rates of horizontal gene transfer, and it may even be right, although there are other possibilities. Successful transfer of an informational gene would require that the new gene product fit into a preexisting functional complex, perhaps supplanting the original, related protein. For a new protein to fit into a complex with other proteins, it would need to have binding surfaces that would allow it to interact with the right proteins in the appropriate geometry. If a new protein had one good binding surface but not others, it would most likely disrupt the complex and put the recipient at a selective disadvantage. By contrast, a gene product that carries out a metabolic reaction on its own would be able to function in any organism. So long as the metabolic reaction conferred some advantage on the recipient (or at least no disadvantage), the gene transfer could be accommodated.

Reference: Jain R, Rivera MC & Lake JA (1999) Horizontal gene transfer among genomes: the complexity hypothesis. Proc. Natl Acad. Sci. USA 96, 3801-3806.

In single-celled organisms, the genome is the germline, and any modification is passed on to the next generation. By contrast, in multicellular organisms most of the cells are somatic cells and make no contribution to the next generation; thus, modification of those cells by horizontal gene transfer would have no consequence for the next generation. The germline cells are usually sequestered into the interior of multicellular organisms, minimizing their contact with foreign cells, viruses, and DNA, thereby insulating the species from the effects of horizontal gene transfer.

It is not a simple matter to determine the
function of a gene from scratch, nor is there a universal recipe for how to do it. Nevertheless, there are a variety of standard questions that help narrow down the possibilities. Below we list some of these questions.

In what tissues is the gene expressed? If the gene is expressed in all tissues, it is likely to have a general function. If it is expressed in one or a few tissues, its function is likely to be more specialized, perhaps related to the specialized functions of the tissues. If the gene is expressed in the embryo but not the adult, it may function in development.

In what compartment of the cell is the encoded protein found? Knowing the subcellular localization of the protein (nucleus, plasma membrane, mitochondria, etc.) can also help to suggest categories of potential function. For example, a protein that is localized to the plasma membrane is likely to be a transporter, a receptor or other component of a signaling pathway, a cell adhesion molecule, etc.

What are the effects of mutations in the gene? Mutations that eliminate or modify the function of the gene product can also provide clues to function. For example, if the gene product is critical at a certain time during development, the embryo will often die at that stage or develop obvious abnormalities. Unless the abnormality is very specific, it is usually difficult to deduce the function or category of function. And often the links are very indirect, becoming apparent only after the gene's function is known.

With what other proteins does the encoded protein interact? In carrying out their function, proteins often interact with other proteins involved in the same or closely related processes. If an interacting protein can be identified, and if its function is already known (through previous research or the searching of databases), the range of possible functions can be narrowed dramatically.

Mutations in what other genes can suppress effects of mutation in the unknown gene? Looking for suppressor genes can be a very powerful approach to
investigating gene function in organisms such as bacteria and yeast, which have well-developed genetic systems, but this approach is not readily applicable to mouse or most higher eucaryotes at present. The rationale for this approach is analogous to that of looking for interacting proteins: genes that interact genetically are often involved in the same or a closely related process. Identification of such an interacting gene, and knowledge of its function, would provide an important clue to the function of the unknown gene.

Addressing each of these questions requires specialized experimental expertise and a [...] time [...] from the [...]. It is no wonder that progress is made so much more rapidly when a clue to a gene's function can be found simply by identifying a similar gene of known function in the database.

CALCULATIONS

1-40 It takes only 20 hours, less than a day, before the mutant cells become more abundant in the culture. From the equation provided in the question, the number of the original wild-type bacterial cells at time t minutes after the mutation occurred is $10^6 \times 2^{t/20}$. The number of mutant cells at time t is $1 \times 2^{t/15}$. At the time when the mutant cells overtake the wild-type cells, these two numbers are equal:

$10^6 \times 2^{t/20} = 2^{t/15}$

Converting to base 10 (see Answer 1-19):

$10^6 \times 10^{(t/20)\log 2} = 10^{(t/15)\log 2}$

Taking the log of both sides and substituting for log 2 (0.301):

$6 + (t/20)(0.301) = (t/15)(0.301)$

Solving for t:

$t = 6/(0.020 - 0.015) = 6/0.005 = 1200$ minutes, or 20 hours.

Note that it is also possible to solve this problem quickly using the useful relationship $2^{10} \approx 10^3$, by realizing that after 1 hour the mutant cells have doubled one more time than the wild-type cells. Thus, the mutant cells double relative to the wild-type cells once per hour. After 10 hours ($2^{10}$), the mutant cells would have gained a factor of a thousand ($10^3$), and after 20 hours ($2^{20}$), a factor of a million ($10^6$), at which time they would be equal in number to the wild-type cells.

Incidentally, when the two populations of cells are equal
the culture contains $2 \times 10^{24}$ cells: $(10^6 \times 2^{60}) + (1 \times 2^{80}) \approx (10^6 \times 10^{18}) + 10^{24} = 2 \times 10^{24}$, which at $10^{-12}$ g per cell would weigh $2 \times 10^{12}$ g, or two million tons. This can only have been a thought experiment.

GENETIC INFORMATION IN EUCARYOTES

TRUE/FALSE

1-41 False. Plant cells contain both mitochondria and chloroplasts.

1-42 True. Bacterial genomes seem to be pared down to the essentials: most of the DNA sequences encode proteins, a small amount of DNA is devoted to regulating gene expression, and there are very few extraneous, nonfunctional sequences. By contrast, only about 1.5% of the DNA sequences in the human genome are thought to code for proteins. Even allowing for large amounts of regulatory DNA, much of the human genome is composed of DNA with no apparent function.

1-43 False. In addition to transfers from the mitochondrial genome, there are many examples of transfers of viral genomes; for example, some 1% of the mouse genome arose from copies of a sequence that originated as the genome of the mouse mammary tumor virus. What is rare is the transfer of genes from other species.

THOUGHT PROBLEMS

1-44 Like most questions about evolutionary relationships, this one was decided by comparing sequences of genes such as those for ribosomal RNA. These comparisons showed that fungi are more similar in gene sequence to animals than to plants, and probably split from the animal lineage after plants had separated from animals. Thus, fungi are thought never to have had chloroplasts, and fungi and plants are thought to have invented cell walls independently, as is suggested by the use of cellulose in plant cell walls and chitin in fungal cell walls.

1-45 Nucleotide sequence comparisons with other species would allow you to decide whether Giardia represented an ancient lineage or a more recent one. Such sequence comparisons have been done; they show that Giardia represents an ancient lineage (or one that has evolved very rapidly) that is almost as closely related to bacteria as it is to other
eucaryotes. If Giardia were a stripped-down eucaryote, sequence comparisons would have revealed a closer kinship with the eucaryotic species from which it diverged.

Standard sorts of sequence comparisons (of ribosomal RNA genes, for example) cannot decide the more fundamental question of whether the Giardia lineage traces back to a time before mitochondria and internal membranes became permanent fixtures in eucaryotic cell organization.

Additional sequence comparisons can be used to address this fundamental question. The hypothesis that Giardia lost its mitochondria as an adaptation to its current anaerobic lifestyle in the intestinal tract implies that its ancestors once lived in aerobic environments and depended on mitochondria for energy. If that were so, then mitochondrial genes might have been transferred to the nuclear genome, and the sequence of the Giardia genome might reveal genes that originated from mitochondria. Sequencing targeted to genes that are likely mitochondrial markers suggests that Giardia at one time did indeed possess mitochondria or some related endosymbiont.

Reference: Roger AJ, Svärd SG, Tovar J, Clark CG, Smith MW, Gillin FD & Sogin ML (1998) A mitochondrial-like chaperonin 60 gene in Giardia lamblia: Evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria. Proc. Natl Acad. Sci. USA 95, 229-234.

1-46 Three general hypotheses have been proposed to account for the differences in rate of evolutionary change in different lineages. The individual hypotheses, discussed below, are not mutually exclusive and may all contribute to some extent.

The generation-time hypothesis proposes that rate differences are a consequence of different generation times. Species such as rat, with short generation times, will go through more generations and more rounds of germ-cell division, and hence more rounds of DNA replication. This hypothesis assumes that errors during DNA replication are the major source of mutations. Tests of this hypothesis in
rat versus human tend to support its validity.

The metabolic-rate hypothesis postulates a higher rate of evolution for species with a higher metabolic rate. Species with high metabolic rates use more oxygen; hence, they generate more oxygen free radicals, a major source of damage to DNA. This is especially relevant for mitochondrial genomes because mitochondria are the major cellular site for oxygen utilization and free-radical production.

The efficiency-of-repair hypothesis proposes that the efficiency of repair of DNA damage differs in different lineages. Species with highly efficient repair of DNA damage would reduce the fraction of damage events that lead to mutation. There is evidence in cultured human and rat cells that such differences in repair exist in the expected direction, but it is unclear whether such differences exist in the germlines of these organisms.

Reference: Li W-H (1997) Molecular Evolution, pp. 228-230. Sinauer Associates, Inc., Sunderland, MA.

DATA HANDLING

1-47
A. The simplest hypothesis is that gene transfer occurred at the point indicated in Figure 1-6. Genera in many of the lineages beyond this point have a nuclear cox2 gene, whereas lineages that branched off prior to this point do not.

Five genera (Lespedeza, Dumasia, Pseudeminia, Neonotonia, and Amphicarpa) apparently have functional copies of both the mitochondrial and the nuclear genes, as indicated by shaded boxes in Figure 1-6.

[Figure 1-6. Summary of cox2 gene distribution and transcript data in a phylogenetic context, showing the most likely point of gene transfer and the minimal number of points for mitochondrial (squares) and nuclear (circles) gene inactivation (Answer 1-47). Boxes indicate genera with apparently functional copies of both the mitochondrial and nuclear genes. The tree has columns GENE (mt, nuc) and RNA (mt, nuc), marks the point of gene transfer and activation, and lists the genera Pisum, Clitoria, Tephrosia, Galactia, Canavalia, Lespedeza, Eriosema, Atylosia, Erythrina, Ramirezella, Vigna, Phaseolus, Dumasia, Calopogonium, Pachyrhizus, Cologania, Pueraria, Pseudeminia, Pseudovigna, Ortholobium, Psoralea, Cullen, Glycine, Neonotonia, Teramnus, and Amphicarpa.]

C. Ten genera (Eriosema, Atylosia, Erythrina, Ramirezella, Vigna, Phaseolus, Ortholobium, Psoralea, Cullen, and Glycine) no longer have a functional mitochondrial gene. The minimum number of inactivation events that could account for the observed data is four, as shown by squares on the tree in Figure 1-6.

Six genera no longer have a functional nuclear gene. The minimum number of inactivation events that could account for this is five, as shown by circles on the tree in Figure 1-6.

These data argue strongly that transfer of genes from mitochondria to the nucleus is not a one-step process, that is, simultaneous loss of the gene from mitochondria and its appearance in the nucleus. This is an unlikely scenario a priori, since nuclear versions of mitochondrial genes must acquire a special targeting sequence that allows the encoded proteins to be delivered to mitochondria (see MBoC Chapter 12). The data in Figure 1-6 argue that the transfer process begins with the appearance of the gene in the nucleus, presumably followed at some point by its activation via acquisition of a targeting sequence. This first step is not accompanied by loss of the gene from the mitochondria. Once the nuclear gene is activated, there appears to be an intermediate stage in which both genes function. Subsequently, one or the other gene is inactivated. If the nuclear gene is inactivated, the transfer process is effectively aborted. If the mitochondrial gene is inactivated, often initially by point mutations, then the transfer can proceed. The final stage of transfer is deletion of the defective mitochondrial gene, a process favored by the economics of genome replication.

Reference: Adams KL, Song K, Roessler PG, Nugent JM, Doyle JL, Doyle JJ & Palmer JD (1999) Intracellular gene transfer in action: Dual transcription and multiple silencings of nuclear and mitochondrial cox2 genes in legumes. Proc. Natl Acad. Sci. USA 96, 13863-13868.

If the intermediary in transfer were DNA, you would expect that the nuclear copy of the gene would have Cs at the sites of RNA editing. If the intermediary were RNA, you would expect Ts at the sites of RNA editing.

When sequences of nuclear cox2 genes were examined, they were found to resemble the edited RNA transcript more closely. This observation suggests that RNA was an intermediary in the transfer process. At some point the RNA was presumably copied back into DNA by reverse transcription. Whether this is a general feature of transfer is unclear.

Reference: Nugent JM & Palmer JD (1991) RNA-mediated transfer of the gene coxII from the mitochondrion to the nucleus during flowering plant evolution. Cell 66, 473-481.

Because synonymous changes do not alter the amino acid sequence of the protein, they are not subject to selection pressures, which operate at the level of the function of the protein and how it affects the overall fitness of the organism. By contrast, nonsynonymous changes, which substitute a new amino acid in place of the original one, have the potential to alter the function of the encoded protein and change the fitness of the organism. Since most amino acid substitutions are deleterious to the function of the protein, they are selected against.

The histone H3 gene must be so exquisitely tuned to its function that virtually all amino acid substitutions are deleterious and therefore are selected against. The extreme conservation of histone H3 argues that its function is very tightly constrained, probably because of extensive interactions with other proteins and with its unchanging substrate, DNA. Histone H3 is clearly not in a privileged site in the genome, because it undergoes synonymous nucleotide changes at about the same rate as other genes.

Reference: Li W-H (1997) Molecular Evolution. Sinauer Associates, Inc., Sunderland, MA.

The data in the phylogenetic tree (see Figure 1-3) refute the hypothesis that plant hemoglobin genes arose by horizontal
transfer. Looking at the more familiar parts of the tree, we see that the vertebrates (fish to human) cluster together as a closely related set of species. Moreover, the relationships in the unrooted tree shown in Figure 1-3 are compatible with the order of branching we know from the evolutionary relationships among these species: fish split off before amphibians, reptiles before birds, and mammals last of all in a tightly knit group. Plants also form a distinct group that displays accepted evolutionary relationships, with barley, a monocot, diverging before bean, alfalfa, and lotus, which are all dicots (and legumes). The sequences of the plant hemoglobins appear to have diverged long ago in evolution, at or before the time that mollusks, insects, and nematodes arose. The relationships in the tree indicate that the hemoglobin genes arose by descent from some common ancestor.

Had the plant hemoglobin genes arisen by horizontal transfer from a parasitic nematode, then the plant sequences would have clustered with the nematode sequences in the phylogenetic tree in Figure 1-3.
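
The doubling-time arithmetic in the CALCULATIONS answer above (mutant cells overtaking wild-type cells) can be checked numerically. The following is a minimal Python sketch, not part of the original answer key. The function and constant names are illustrative assumptions, and the doubling times of 20 and 15 minutes are read off the exponents t/20 and t/15 used in that answer.

```python
import math

# Values taken from the worked answer; the names are illustrative assumptions.
WILD_TYPE_START = 10**6        # wild-type cells present when the mutant arises
WILD_TYPE_DOUBLING_MIN = 20.0  # wild-type doubling time in minutes
MUTANT_DOUBLING_MIN = 15.0     # mutant doubling time in minutes
GRAMS_PER_CELL = 1e-12         # cell mass assumed in the answer


def crossover_time_minutes():
    """Solve 10^6 * 2^(t/20) = 2^(t/15) for t by taking log10 of both sides."""
    log2 = math.log10(2)
    rate_gap = log2 / MUTANT_DOUBLING_MIN - log2 / WILD_TYPE_DOUBLING_MIN
    return math.log10(WILD_TYPE_START) / rate_gap


def cell_counts(t_minutes):
    """Return (wild_type, mutant) cell numbers t_minutes after the mutation."""
    wild_type = WILD_TYPE_START * 2 ** (t_minutes / WILD_TYPE_DOUBLING_MIN)
    mutant = 2 ** (t_minutes / MUTANT_DOUBLING_MIN)
    return wild_type, mutant


if __name__ == "__main__":
    t = crossover_time_minutes()
    wild_type, mutant = cell_counts(1200)
    total_cells = wild_type + mutant
    print(f"crossover: {t:.0f} min ({t / 60:.1f} h)")
    print(f"cells at 1200 min: {total_cells:.2e}")
    print(f"mass at 1200 min: {total_cells * GRAMS_PER_CELL:.2e} g")
```

Computed without rounding, the crossover falls a few minutes short of 1200 because the worked answer rounds 0.301/15 and 0.301/20 to 0.020 and 0.015. For the same reason the total cell count at 1200 minutes comes out slightly above the rounded figure of 2 x 10^24.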