by: Daren Beatty Jr.
This 7 page Class Notes was uploaded by Daren Beatty Jr. on Thursday October 22, 2015. The Class Notes belongs to ME 15 at University of California Santa Barbara taught by Staff in Fall. Since its upload, it has received 18 views. For similar materials see /class/227085/me-15-university-of-california-santa-barbara in Mechanical Engineering at University of California Santa Barbara.

Date Created: 10/22/15
REVIEWS

CHEMICAL DATABASE TECHNIQUES IN DRUG DISCOVERY

Mitchell A. Miller
LION bioscience [...], San Diego, California 92121, USA. DOI: 10.1038/nrd745

Chemical databases are becoming a powerful tool in drug discovery. Database searches based on possible requirements for biological activity can identify compounds that might be suitable for further analysis, or indicate novel ways to achieve the desired activity. What considerations are involved in the construction and searching of chemical databases?

Chemical databases have progressed over the past 15 years from being a mere repository of the compounds synthesized within an organization to being a powerful research tool for discovering new lead compounds. By using a query that encapsulates some idea of the requirements for biological activity, previously untested compounds that might have the same type of activity can be identified. In many cases, a compound that is found in a search will not act as the new drug itself (compounds in databases tend to be known to the world and are therefore unsuitable for patenting), but it could indicate novel ways to provide a desired activity. The chemist will then create molecules that are close to the database hit. In this way, the database serves as an idea generator.

The growth of chemical databases in drug discovery research has fuelled a growth in commercial software for chemical database management (see ONLINE TABLE 1) and in publicly and commercially available chemical databases (see online links and ONLINE TABLE 2). This review examines the chemical database as a research tool in the current drug discovery [...]. We look at the types of database that are in use, how they are generated, what data they contain, how queries are formulated and run and, finally, how the results of searches are processed.

Types of chemical database

In many [...] the most commonly used terms are the ones that lead to the most confusion. Chemical structure is one such pervasive yet vague term. To a chemist concerned with synthesizing new compounds, chemical structure would usually mean a two-dimensional sketch of atoms and bonds that represents the compound produced (FIG. 1a). Implicit in this type of diagram is a set of x,y coordinates for all of the atoms in the structure, allowing them to be seen in a way that allows another chemist to understand the identity of the compound immediately. However, chemical-structure diagrams are not amenable to computational operations such as database searching, so several types of chemical-structure representation have been developed by theoretical chemists for use in computer systems.

The predominant form is the atom-bond connection table. Formally, a connection table records the chemical structure as a graph (a set of vertices, the atoms, linked by edges, the bonds), which allows mathematical analyses to be applied to classify the structure or calculate its molecular properties. At a basic level, a connection table represents a chemical structure by listing the atoms and bonds that are present in a tabular form (FIG. 1b). Line notations for chemical structures (FIG. 1c), such as SMILES (simplified molecular input line system), are also commonly used. SMILES represent [...] the same information as that which might be found in an extended connection table, and are compact and easy to use.

Another chemist, looking at the same compound as a potential drug for a given receptor, would look at the overall shape and surface characteristics of the three-dimensional molecule; individual atoms and bonds would be less important. Generally, at least the atoms are present, along with 3D coordinates that provide a full spatial depiction of the molecule, such as that shown in FIG. 1d. More than one 3D structure can be generated for many drugs and DRUG-LIKE molecules, as [...]

220 | MARCH 2002 | VOLUME 1 | www.nature.com/reviews/drugdisc | © 2002 Nature Publishing Group

Database building

The process of constructing/populating chemical databases is complicated. Each
organization has its own rules, conventions and procedures; however, there are several [...] the way in which the database is set up. One consideration [...]

[Figure 1, connection-table panel. Recoverable labels: counts line (first number represents number of atoms [...]); bond block (atom 1, atom 2, bond order); delimiter for end of Molfile.]

[...] two groups of atoms. Such bonds can usually rotate with a low-to-moderate energy barrier that [...] molecules (BOX 1).

To the curator of a corporate compound collection, [...] the chemical structure, which generally corresponds to [...]; form, the compound plus counterions, solvent and isotopically labelled atoms, as in the curator's view; and lot, a sample of the compound, which is [...] often termed 'batch'. In some organizations the form is suppressed as a distinct entity, but the [...] attributes of counterion and solvent [...] merely reflects different ways of organizing the same set of information.

Structure input. The first step in registering a new compound is generally to recognize or assume that a new substance has been obtained as the result of an in-house synthesis or purchase, or detected in the scientific literature. In most cases, the chemist who performed the [...], although this idea can be revised in the future as more information [...] structure in a drawing program such as ISIS/Draw. [...] checks for overlapping atoms, multiple fragments, mislabelled groups and so on is typically performed, [...] are calculated and added to the display.

Most organizations have a policy that a parent [...] structures are added to the database. The definition of uniqueness varies between organizations. In some, the 'structure' [...] library programme. Other definitions of structure, such as the [...] in polymer formulation and zeolite chemistry, are beyond the scope of this review.

The different definitions of structure are important [...]
classified into two categories: those that define the [...] that store 3D coordinates for atoms, in addition to [...] molecular representation make two different structures [...] for possible violations of organizational policy, such as multiple fragments, which can arise from a drawing error or a mistaken attempt to represent salts with [...] forbidden.

It should be noted that a given chemical-structure diagram could have many valid computational representations; for example, a different connection table for GABA (γ-aminobutyric acid; FIG. 1a,b) could be constructed by numbering the atoms differently. It is important for comparison purposes that a unique [...] representation of the structure, such as a unique SMILES or a [...] name. The canonical representation is then [...] and analysed for designated features that will facilitate rapid SUBSTRUCTURE searching (see below).

[Box 1, text largely unrecoverable: [...] conformational space of [...] pharmacophore is shown below [...] a molecule [...] any molecule in this database [...] a conformation that will bring [...] aspartate 138; other symbols refer to atoms in the standard manner [...] NCI Database, one can pose a query of this type over a set of [...] compounds with [...]]

3D coordinates. A connection table along with 2D coordinates for display is generally sufficient for the identification of a substance. However, to perform any energy calculation, or to determine whether the com[...] this review, there are many [...] to explore the sets of 3D structures that [...]. When building a large database of 3D structures, one needs [...] that are fast, albeit approximate. Some of the methods that are in use today on a large scale are CONCORD and Converter. Most drug-like molecules can adopt several 3D CONFORMATIONS by rotation around one or more single, acyclic bonds (BOX 1), and 3D search systems can take this factor into account in [...] the database-building process by incorporating multiple [...] fitting candidate molecules
from the database onto the query, or both.

[Box 1, continued: [...] might possess the desired activity.]

Combinatorial libraries. Most of the techniques of chemical database searching that are discussed in this review apply to databases of discrete compounds [...] compound, which is generally synthesized and isolated as a pure sample. However, much chemical work in recent years has proceeded through the creation of large numbers of compounds by automated synthesis [...]. Furthermore, once the library has been created, the system must be able to generate the individual structures on demand. This process, which is termed 'enumeration', can be performed for the entire library or for structures that meet user-specified criteria. It must be possible to specify a discrete structure and use it to find substructure or other matches within the library without performing an explicit enumeration.

Data cartridges. Recently, the development of chemical data cartridges has enhanced the capabilities of chemical data[...] to create custom data types and perform searches on these data types within relational databases such as Oracle or IBM Informix. For chemical database users, this means the ability to store a complete chemical database within a relational database and perform the various types of chemical search that are discussed below. This opens up a wide range of possibilities when designing chemically [...]tions can make use of existing modes of communicating with relational databases. Furthermore, because a large amount of biological-assay and property data also reside in the relational database, the ensemble of chemical, biological and property data can be queried and browsed in context.

Box 2 | Rapid prescreening
[...] mapping all atoms in the query [...] those of [...] the database [...] of structures to be compared [...] six-membered rings or metal atoms. A key is created by defining the structural features of interest, assigning a bit [...] for [...] in the database. Keys are generally set when [...]. At search time, only [...] that have all the keys set by the query [...] that match need to be considered further [...] mapping.
Like keys, fingerprints are bitmaps that are derived directly from the connection table [...] operations [...] folding [...] encodes all possible ways to traverse the atoms and bonds of the structure. A query [...] the atom/bond paths of the index. Only those structures [...]

Glossary
SUBSTRUCTURE: [...] structure is compare[d] [...] representation of a connection table [...] within the second [...] is said to be the superstructure of the first. A substructure search scans a database for all [...] substructural [...]
CONFORMATIONAL SPACE: [...] molecule can adopt without breaking any bonds.
MARKUSH STRUCTURE: [...] set of chemical structures as a [...] each substitution point. They [...] compounds that are analysed [...] using combinatorial techniques [...] chemical patent or patent database.

Methodology of database searching

Chemical structures differ greatly from other entities that are commonly stored in databases, such as text or numbers, and so there are many differences between search methods for chemical databases and those for [...] searches and searches on, for example, words. An exact-match search can be thought of as looking up a complete word in a dictionary, a substructure search is analogous to a wild-carded text search, and a similarity search resembles a sounds-like search. Let's explore these search types in more detail.

Exact-match searching. The simplest kind of chemical searching is an exact-match search, in which a user looks for a given, fully specified chemical compound in a database. This type of search is generally well defined in the mind of the user: does the compound I have drawn
exist in this database? Exact-match searches are performed to find out whether a proposed new structure already exists in a database, to determine the overlap between two databases (by using all of the compounds in one database as queries in the other), or to find a reagent in an in-house inventory system or online.

An exact-match search might yield no hits even though the compound is present in the database. Depending on the flexibility of query specification and the query engine, this can happen if the structure in the database has no marked STEREOCHEMISTRY whereas the [...] form whereas the user has drawn the other; the structure in the database is an explicit salt (the counterion is drawn as a set of atoms) whereas the query structure is just the parent compound; or because of other differences in representation. It might be possible to overcome the problem in these cases by performing a similarity search (see below) with a high cutoff.

Substructure searching (2D). One of the most common structural searches is a substructure match, in which a user draws, copies or pastes a set of pieces of a chemical structure and requests that the system return a set of compounds that contain the pieces. This type of search is typically well defined in the mind of the user. That is, every experienced user has a similar expectation of what hits will result from such a search, and can generally tell [...] how each answer [...] question. There are some issues with respect to stereochemistry, aromaticity and the limitations on atomic representation in a particular software package that could lead to surprises for the user who performs a sub[...], but these can generally be resolved by closer study of the software package.

On the database side, the implementation of substructure searching in arbitrarily large databases in a reasonable time frame is far from trivial. The crucial step in determining whether a molecule in a database matches a query involves examining atoms and bonds in detail, and can
be time consuming. A necessary step is to pre-screen the database to eliminate from consideration those structures that cannot possibly match the query. Most systems compute keys or fingerprints that encode features or fragments of chemical structures (BOX 2) to allow rapid pre-screening.

Pharmacophoric searching. Pharmacophoric, or 3D substructure, searching involves a generally sparse set of atoms/bonds/groups that is combined with specific 3D constraints, such as distances and angles. This process is generally much slower than a 2D substructure search, as it requires the examination of x,y,z coordinates for atoms of the candidate structures to compute 3D constraints. Pharmacophoric searching can provide an indication of whether a set of structures can bind to a receptor or enzyme. This means that hits might be very valuable in the drug discovery and design process.

There are two general routes to generating a pharmacophoric search query. In the first case, the user has a high-quality 3D structure for a receptor or enzyme protein with a known agonist or antagonist small molecule attached. Here, one makes some assumptions about which groups of atoms on the small molecule are involved in binding [...] these groups, and encodes this information into a 3D query, generally using a graphical tool. In the second case, no protein structure is available, but the scientist has a set of molecules that are known to bind to produce a desired pharmacological effect. 3D structures of the [...] generally using an automated overlay procedure, and areas of commonality [...]

Glossary
PHARMACOPHORE: The ensemble of steric and electronic features that is necessary to ensure optimal interactions with a specific biological target structure and to trigger or to block its biological response.
STEREOCHEMISTRY: The spatial arrangements of atoms in molecules and complexes.
TAUTOMER: One of two or more structural isomers that exist in equilibrium and are readily converted from one isomeric form to another.
HYDROGEN BOND: A weak attraction (much weaker [...] bond, but much stronger than [...]) [...] oxygen, nitrogen or fluorine atom in one molecule and a hydrogen atom in a neighbouring molecule. Hydrogen-bond donors are groups with electron-hungry hydrogen atoms. Hydrogen-bond acceptors are atoms with electrons to share.
BIT STRING: A contiguous set of characters that consists entirely of 1s and 0s. A bit string can [...] encode a good deal of information in a compact way, and is easily and rapidly interpreted by computer systems.
AND: The combination of two input bits such that the result is 1 if both bits are 1, and 0 otherwise.

[Figure 2 | P[...] used in three-dimensional queries. Recoverable labels: hydrogen-bond acceptor, hydrogen-bond donor, acid, base, aromatic ring, hydrophobic group.]

3D database queries are often different from 2D queries in one important way: the atoms in the query tend to be generalized types rather than specific chemical elements. Commonly used types are HYDROGEN-BOND acceptors, hydrogen-bond donors, acids, bases, aromatic rings and hydrophobic groups (FIG. 2). There might also be extra features in 3D searches, such as excluded volumes.

3D search systems generally allow the user to input extra parameters to control how tightly the 3D constraints must match, whether to consider conformational flexibility in the candidate matches, whether to check for van der Waals contacts, and so on. In particular, the issue of conformational flexibility is important, as most drug-like structures have more than one low-energy conformation (BOX 1). Typically, databases include a single 3D structure for each compound. In general, the more extra parameters that are used, the longer the search will take, but the more accurate, and therefore valuable, the hit list will be. Features and algorithms in 3D structure searching are areas of active research to improve speed and discrimination.

During the search process, the software will generally perform a screening procedure, as in 2D substructure searching, then proceed to atom/bond mapping and constraint verification. Consideration of conformational flexibility generally takes place on the fly as 3D constraints are mapped onto the candidate molecule. [...] procedure moves on to the next candidate when one [...] found [...] due to any single database entry; highly symmetrical structures, for example, might lead to their inclusion in a hit list if search time limits expire.

Box 3 | Similarity
Once a set [...] to evaluate the degree of similarity between two molecules. By far the most common mathematical formula is the Tanimoto coefficient (EQN 1):

$$S_{A,B} = \frac{c}{a + b - c} \qquad (1)$$

where S_{A,B} = similarity of structures A and B; c = number of features in common between the [...] given property in [...] (this means the number of ON bits when the two bit strings are logically ANDed); a = number of features ON in structure A; b = number of features ON in structure B. [...] are present (1) or not [...]. For example, for [...] in two molecules A and B:

Feature number   1  2  3  4  5
Molecule A       1  1  0  1  1
Molecule B       0  1  1  1  0

the similarity assessed by the Tanimoto coefficient is given by EQN 2:

$$S_{A,B} = \frac{2}{4 + 3 - 2} = 0.4 \qquad (2)$$

Similarity searching. Whereas user expectations with respect to substructure matching are very well defined, [...] matching. The user wants compounds that resemble the compound of interest on the basis of a chemist's intuitive thinking, but that do not necessarily reflect an exact or [...] match. The hope is that the biological system will respond to the molecules in a similar way, even though they represent different substances.

Similarity searching is also implemented very differently by the various software packages. Two factors are notable: [...] property or set of properties evaluated for each molecule; and second, the coefficient, which is computed on the basis of this property to quantify how similar two compounds are [...]

Similarity properties. Most
commercial programs calculate similarity on the basis of the keys or fingerprints that are used in the first step of 2D substructure searching (BOX 2). These keys or fingerprints are generally in a format that is relatively easy for a computer to work with: BIT STRINGS that can be ANDed together easily. Similarity searching is therefore a facile operation in most software.

Compound similarity has also been computed using atom pairs, sets of four consecutive atoms (also known as topological torsions), sets of three or four disconnected atomic points, and other molecular properties. The basic aim is to find patterns within a molecule that provide a basis for assessing its degree of likeness to another molecule, at the same time transcending the obvious substructural features. The relative merits of the keys, fingerprints, atom pairs and so on depend on the purpose at hand and the biases of the user.

Similarity coefficients. Once a property has been selected to describe each molecule, some way of turning that property into a numerical value that tells us how similar two molecules are must be generated. The most common similarity metric is the Tanimoto coefficient, which is defined in BOX 3. Other coefficients include the cosine coefficient, the Euclidean distance coefficient and the Tversky coefficient. [...] user who wishes to perform a similarity search provides a structure (generally a complete compound) of interest, selects a database and also supplies a cutoff percentage, which limits retrieval to only those compounds that are more than this percentage similar to the input structure.

Box 4 | Searching a chemical database
These instructions can be used to search for analogues of baclofen, a drug that acts at GABA (γ-aminobutyric acid) receptors, in the freely accessible National Cancer Institute (NCI) Database.

[Box 4 figure: baclofen (a) and hit structures (b-d), each labelled with its NSC number and similarity; the individual labels are garbled in this copy.]

Retrieve compound by name
- Go [...] the web address http://cactus.nci.nih.gov/ncidb2 or the German [...]; make sure all the text boxes on the right side of the screen are empty. Should you have any [...] Marc Nicklaus at mn1@helix.nih.gov.
- In the top 'Query type' pulldown menu on the left side of the screen, select 'Name search'. On the right side, type 'baclofen' into the corresponding text box. Select 'Any exact name' from the [...] menu that [...], then click on the 'Start search' button.
- In the tabular results page, note that baclofen (NSC 329137) is the only hit. Select it by clicking on one of the '329137' links.
- From [...] click on the 'Transfer to Java Editor' button. Note that changes can be made to a query structure using this applet. For a first pass, make no changes. When ready, click on the 'Transfer to Query Form' button beneath [...] box.

Set up a substructure search
- If you followed [...] SMILES (simplified molecular input line system) for baclofen on your query form. If you skipped this step, you can paste the SMILES NCC(CC(O)=O)C1=CC=C(Cl)C=C1 directly into the text box. If you'd like to practise drawing structures, you can use the 'Editor' button to invoke the [...]

Perform the substructure search and view the results
- Make sure the dropdown menu on the left of the SMILES text box shows 'Substructure and/or 3D [...]'.
- Click on a 'Start Search' button. Browse the results as desired. [...] with its NSC number and similarity [...]

Variations
- Perform a similarity search on baclofen or another one of the molecules retrieved.
- Modify baclofen in [...] and perform another [...]. Deleting [...] of the hit list.
- Compare the properties of hits to those of baclofen.
- [...] of baclofen or another of [...] in the hit list.

At search time, the software compares the query structure with each structure in the database using a very
fast bitmap comparison, computes the similarity coefficient and compares this coefficient with the user's cutoff. Compounds that have equal or greater similarity are considered as hits and are made available to the user for browsing.

Molecular docking. One form of chemical database search that has gained momentum in recent years is docking, or placing a series of candidate molecules from a database into the active site of a protein to evaluate how well the compounds might bind to the receptor or enzyme. There are two basic problems to be solved in the docking process: how best to fit the small molecules (ligands) into the active site of the protein, and how to compare and rank the best poses, or fittings, of a set of molecules in order to compare them.

Originally, small molecules were docked into active sites using a single rigid conformation. Now, faster [...] of conformational flexibility. The methods of providing ligand flexibility include most of the accepted ways of evaluating conformations of 3D structures, and providing an overview of these is beyond the scope of this review. [...] are used; a study of docking and scoring methods found that different scoring functions work best in different situations, and that a combination (consensus) of scores might be the best way to rank molecules from a docking study.

The term 'virtual screening' is often applied to computational processes that select molecules that are likely to have activity against a biological target of interest. Docking is perhaps the computational technique most worthy of this moniker, as the molecules that are identified in a docking scan have been compared most directly with the requirements of the target.

Post-search processing of results

Realizing the value of a chemical database search begins after the search is complete. The user might face a list of compounds that is too large to be examined or tested using available resources. Some strategies include filtering, essentially imposing secondary search
criteria to eliminate compounds clustering 7 taking a representative subset of a larger set and human inspection of the compound structures with or without extra data Filtering The set of compounds can be pruned by eliminating those with properties that are deemed to be undesirable or not druglike One famous set of rules to determine which molecules are most likely to behave o lt500 g molquot log of octanolwater partition coef cient LOG p lt50 no more than five hydrogenebond donors no more than ten hydrogenebond acceptors A chemist may also remove from consideration all compounds without available samples if the com pound is part of an inehouse database or compounds that cost too much ifthe database represents a set of commercial catalogues Molecules with groups that are NATURE REVIEWS l DRUG LI39SV JVFPV VOLUME 1 l MARCH 2002 l 225 2002 Nature Publishing Group REVIEWS Lo G P a rimquot coe icient is the ratio of the Box 5i Database in lead discovery a case history A recent study by Pang and colleagues 7 shows the usefulness of database techniques in the discovery of new lead compounds The biological target in mi iud which might leadto anticancer activity Structures were obtainedfrom Mi illiurllluiiull S gtIclll e Available Chemicals Directory ACD i i i i a a i J 7 Thelogarithmofthispartition dfta aim 1 I h r I I A h h mug of 700 Da coe icientiscalledlogP r thosewIt nut This n r u 1 t 39 39 L 39 pro dJIIJIICEUDOC WthWU stages low resolution 30 Lquot39 rotatinna l39 x quot39 39 iu lnmlinlm l39 1 r thmugh 0e membm e translational increments The dockingyieldedaset 0f I A 39 39 was furth 1 1 L i L i i i 1 i i L Comparison ofthe NCi 0 Chemical structural database 41 7027712 2001 An excellent analysis ofpubliclyand commercially awilable chemical database Trirllasilc N ed ChemcaGiaph Theory ORG Boca Ratm regal N VoigtJ H Bientait Bvvang S ampNiciltiaus M c pen dataoase Witn seven large J Chem W Comput Sci 25 334443 1985 s J Chem W Comput Sci 39 Visual inspection was used 
to remove compounds with JFLiii i yield27 six of L further 394 Of the finalll compounds 139 o r that were submitted for testing four were found to inhibit the target enzyme in ZSaIOOnuM concentrations As a control 21 other compounds were randomly selected from ACD None of these had activityin the ZSaIOOnuM range Ti 1 h i a r i i 1 39L L A 11 man i i i L that active compounds were identifiedmerely by chance deemed to be overly reactive such as acid chlorides Conclusion which react violently with water are also excluded from further consideration Clustering In many cases even after applying filters there are still too many compounds to test or the coma pounds in the hit list resemble one another to such an extent that there is no point in testing all of them In these cases a chemist can perform a clustering ofthe database to group similar compoundsmwz A represen tative compound from each cluster is sent for biological assay Various computational methods exist for per forming clustering including Wards JarvisaPatrick and u noc 5 etrics that are used as input are often those used for similarity searches see above3555 Clustering can also be applied to understand the under lying set of chemical scaffolds that are represented in a large hit list this can be a useful prelude to analysis of structureaactivity relationships Human inspection Having a person look over the results of a database search is by far the most timeaconsuming process in the consideration of what might be a large set of re quotii in Pertinn might 39 39 matri of structures on a page or a more inadepth analysis of a small number of compounds in the context of further information This further information might include measured or calculated physical properties analytical test results molecular spectra biologicalaassay results and so on The process requires a good deal of effort but it can yield valuable results because of the insights that can be drawn by seeing a set of structures in the wider context of 
the research process 0 u a As we have seen chemical structures are entities that are substantially different from text numbers and other more common data They have their own representaa chemical database searching a guide to searching the iiliiilllilll39l iINlllllt Jtllriiuwhich illustrates some of the principles discussed above can be found in BOX4A recent case history that highlights the potential for chemical database searches in lead discovery is described in BOX 5 Man vendors now offer basic software for registraa tion searching visualization and analysis of chemical structures see ON39LIN39E TABLE 1 What will differentiate these products The usefulness of chemical database searches for lead discovery often depends on factors beyond the ability to perform the searches themselves First users must have the abilityto perform searches on chemical structures in the context of related data including chemical properties biological activity and protein structural data There must be tools to allowthe researcher to wade through the often large hit sets to select the most promising compounds for further evala uation The user must have the ability to browse the results Within the context of related research informa tion in order to draw the proper inferences from the data The specifics of these requirements vary from organization to organization but adaptability of the software is crucial as is the abilityto interface with other analytical and visualization tools Products that can fula fil these requirements will succeed whereas those that merely have good technology will fail e alby A eta Description otsevera formats used ycomp chemical sir d ucture ile u er programs develope at Molecular Design Limited J Chem MK Comput Sci 32 structures Witnin large databases of organic compounds J Chem W Comput Sci 41 14374445 2001 Weininger D SixiiLESi introducti Chem W Com Sci 23 31 1988 on and encoding rules 2447255 1992 7 Warr w A Combinatorial chemistry and molecular diversity 5 
226 | MARCH 2002 | VOLUME 1 | www.nature.com/reviews/drugdisc | © 2002 Nature Publishing Group
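The Clustering section above names Jarvis-Patrick as one way to group similar hits and send one representative per cluster for assay. A minimal sketch of that idea follows, using a common formulation of Jarvis-Patrick (two compounds join the same cluster if each is in the other's k-nearest-neighbour list and they share at least k_min neighbours). Several things here are assumptions for illustration and do not come from the article: fingerprints are stored as Python sets of "on" bit positions, the Tanimoto coefficient is used as the similarity metric, the compound names and bit sets are hypothetical toy data, and the representative is chosen as the member with the highest mean similarity to the rest of its cluster.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbours(fps, k):
    """Map each compound to the set of its k most similar other compounds."""
    result = {}
    for name in fps:
        others = [n for n in fps if n != name]
        # Decreasing similarity; ties are broken by name so runs are repeatable.
        others.sort(key=lambda n: (-tanimoto(fps[name], fps[n]), n))
        result[name] = set(others[:k])
    return result


def jarvis_patrick(fps, k=2, k_min=1):
    """Cluster compounds: merge pairs that are mutual k-nearest neighbours
    and share at least k_min entries in their neighbour lists."""
    nn = nearest_neighbours(fps, k)
    parent = {name: name for name in fps}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fps, 2):
        mutual = a in nn[b] and b in nn[a]
        shared = len(nn[a] & nn[b])
        if mutual and shared >= k_min:
            parent[find(a)] = find(b)

    clusters = {}
    for name in fps:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


def representative(cluster, fps):
    """Pick the member with the highest mean similarity to the others
    (an assumed selection rule, not one prescribed by the article)."""
    def mean_similarity(name):
        others = [n for n in cluster if n != name]
        if not others:
            return 0.0
        return sum(tanimoto(fps[name], fps[n]) for n in others) / len(others)

    return max(sorted(cluster), key=mean_similarity)


# Hypothetical hit list: two structural families with toy fingerprints.
hits = {
    "A1": {1, 2, 3, 4},
    "A2": {1, 2, 3, 5},
    "A3": {1, 2, 3, 4, 5},
    "B1": {10, 11, 12, 13},
    "B2": {10, 11, 12, 14},
    "B3": {10, 11, 12, 13, 14},
}

clusters = jarvis_patrick(hits, k=2, k_min=1)
reps = [representative(c, hits) for c in clusters]
print(clusters)
print(reps)
```

In this toy hit list the six compounds collapse to two clusters, so only two representatives would go forward to assay. In practice k and k_min need tuning for the data set, and a cheminformatics toolkit would normally supply the fingerprints.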

