by: Helmer Gutmann



These 8 pages of class notes were uploaded by Helmer Gutmann on Saturday, September 26, 2015. The notes belong to GEN 440 at Clemson University, taught by Chin-Fu Chen in the fall.


Date Created: 09/26/15
Lecture 3: Mapping Databases & Information Retrieval
1/19/2006, 123 Long Hall
Please check the course website at people.clemson.edu (~cchen, GEN 440/640).

I. Mapping Databases

From last lecture:
1. Two components of genomic mapping
   a. Representational measurement: different maps
   b. Process of determining where the biological object (gene or disease locus) lies in the genome: linking a molecular signature with a biological outcome
2. Major bioinformatics challenge: efficient mining and use of genomic data
3. Relationship between mapping and sequence: sometimes DNA sequence tracts can be thought of as ultra-high-resolution maps; the DNA sequence can be considered an annotation of the position

Today:
4. Types of sequence: (1) marker-based tags, (2) gene-based tags, (3) single-gene sequences, (4) prefinished draft sequences, and (5) completed continuous sequence tracts
5. Genomic map elements
   a. DNA markers: genomic landmarks; STS (sequence-tagged sites)
   b. Polymorphic markers (sequence variations)
      - Restriction based: RFLP; variable number of tandem repeat units (VNTRs)
      - PCR based: microsatellites, also called short tandem repeats (STRs), SSLPs (simple sequence length polymorphisms), or SSRs (simple sequence repeats)
      - PCR based: SNPs (single nucleotide polymorphisms); occur about once every 100 to 300 bases in human (probably 1 in 1,000 bases; homework)
   c. DNA clones: BACs, PACs, and YACs
      - DNA fingerprinting: restriction digestion fragment patterns are compared between clones to identify those that share subsets of fragments
      - Clones whose insert ends have been determined are referred to as sequence-tagged clones (STCs); used in physical mapping
      - Note: PACs are bacteriophage P1 based, have a negative selection against non-recombinants, and have an IPTG-inducible high-copy-number origin of replication for large DNA production
   d. Genomic annotation: biological information
6. Complexity and pitfalls: how to minimize errors
   a. Using several independent maps
   b. Using maps integrated with all available genomic information
   c. Seeking experimental evidence
7. Nomenclature issues
8. Resolution limits for different maps
9. Types of maps
   a. Cytogenetic maps: Giemsa bands; probes labeled radioactively or fluorescently; FISH and fiber-FISH; array CGH (comparative genomic hybridization); interphase hybridization; resources
      [Figure: Resolution in fluorescence in situ hybridization]
      FISH procedure: Cells are spread and immobilized on a glass slide. The cells on the slide are lysed by detergent (Triton X-100) in a closed container. Chromatin fibers attach to the glass when it is slowly removed from the container. The preparation is fixed with ethanol. High-resolution FISH at a distance of [...] base pairs becomes possible. The probes are hybridized following the standard procedure.
   b. Genetic linkage (GL) maps: starting point for many disease-gene mapping projects and backbone for physical mapping; rely on naturally occurring recombination and polymorphic markers, using observed genotypes; use statistical models for quantification (lod score, maximum likelihood); distance in GL maps is not proportional to distance in physical maps
   c. Physical maps: (1) STS-content maps; (2) clone alignment; (3) radiation hybrid maps, with intermediate resolution between GL and physical maps; (4) sequence-based maps in genome databases
   d. Comparative maps: synteny and prediction from one species to another

II. ENTREZ
1. PubMed & MEDLINE
   - MEDLINE: English publications, biomedical journals
   - PubMed has more than one million additional entries and extends to general science and chemistry journals and non-English publications
2. The idea of cross-referencing
   Figure 1. ENTREZ integrated information retrieval system. Each sphere represents one of the elements that can be accessed through Entrez, and the lines represent how each component database connects to the others. The original version of Entrez had just 3 nodes: nucleotides, proteins, and PubMed abstracts. Entrez has now grown to nearly 20 nodes.
3. LocusLink is superseded by Entrez Gene (the browser is redirected automatically to the Entrez Gene home page).
4. Medical databases: usually non-sequence-based information
   a. Example: OMIM (Online Mendelian Inheritance in Man)
   b. What is the difference between a Mendelian trait or disease and a non-Mendelian (complex) trait?
   c. Complex diseases are any genetic diseases that do not obey the single-gene dominant or single-gene recessive Mendelian law. The term "complex traits" is also used for phenotypes that may not be considered diseases.
   d. Complex diseases are non-Mendelian: they show familial aggregation but no clear segregation. Segregation is the principal difference between single-gene disorders and complex diseases: although the genes of complex diseases segregate, their phenotypes do not.
5. OMIM
   a. Electronic catalog of human genes and genetic disorders
   b. Founded by Victor McKusick; housed at NCBI
   c. Concise textual information from the published literature in human genetics and diseases

Lecture 2: DNA Databases & Mapping Databases

I. DNA Nucleotide Databases
1. Data flow of the three major DNA databases
   Figure 1. Data flow for new submissions and updates between the three databases (DDBJ, EMBL, GenBank).
2. Importance of accuracy and ease of use for nucleotide sequence databases
   a. Sequence comparison: more useful to translate coding DNA into protein
   b. Avoiding error propagation
   c. Facilitating information retrieval
3. Nucleotide sequence flatfiles
   a. Most common format: flatfile
   b. Sequence record represented as a string of nucleotides with tags and identifiers
   c. FASTA format: ">" denotes the beginning of a new sequence record; it opens the definition line ("defline"), which carries an identifier (accession ID)
   d. Upper- or lower-case letters for DNA sequence, usually 60 characters per line; Courier font is the best
   e. Similarly, a protein sequence can use FASTA format
4. Dissection of a nucleotide sequence flatfile
   a. Headers: database specific
      - First item (DDBJ/GenBank: LOCUS; EMBL: ID): has to be unique within the database
      - Second: length of the sequence
      - Third: molecule type (biological nature of the molecule)
      - Fourth: division code (e.g., INV; historical)
      - Date: the date when the record was last made public
   b. Organismal divisions (http://www.ncbi.nlm.nih.gov/HTGS/table1.html)
      BCT: bacterial sequences
      FUN: fungal sequences
      HUM: human sequences
      INV: invertebrate sequences
      MAM: other mammalian sequences
      ORG: organelle sequences
      PHG: bacteriophage sequences
      PLN: plant, fungal, and algal sequences
      PRI: primate sequences
      RNA: structural RNA sequences
      ROD: rodent sequences
      SYN: synthetic sequences
      UNA: unannotated sequences
      VRL: viral sequences
      VRT: other vertebrate sequences
   c. Functional divisions
      CON: constructed (or "contigged") records of chromosomes, genomes, and other long DNA sequences
      EST: EST sequences (expressed sequence tags)
      GSS: GSS sequences (genome survey sequences)
      HTC: unfinished high-throughput cDNA sequencing
      HTG: HTGS sequences (high-throughput genomic sequences)
      PAT: patent sequences
      STS: STS sequences (sequence-tagged sites)
      WGS: whole genome shotgun sequences
   d. EST (expressed sequence tag)
      - Partial DNA sequence (single pass) of a cDNA clone
      - Largest and fastest growing division of GenBank
      - Derived from some specific RNA source
      - Source field can be searched
   e. Second part of the header
      - Definition lines (DE in EMBL): summary of biological content
      - Accession number: cited in publications; two formats, 1+5 and 2+6 (one upper-case letter followed by five digits, or two letters followed by six digits); if a record lists more than one accession number, the first one is the primary one
      - Version: e.g., U54469.1, in the form ACCESSION.VERSION; the accession is unchanged, but the version is incremented each and every time the sequence changes
      - Source & organism (OS & OC in EMBL)
   f. Feature tables: tabled, direct representation of biological information
      - Feature keys, location, and additional qualifiers
      - The source feature is the only feature that must be present in all DDBJ/EMBL/GenBank entries
      - CDS (coding sequence): instruction on how to join two sequences together, or how to make an amino acid sequence from the indicated coordinates and the inferred genetic code
5. Third Party Annotation (TPA)
   a. Primary database entries are owned by the original submitter and the coauthors of the submission publications; only owners have the privileges to update the data content
   b. TPA: reannotations of existing entries, combinations of novel sequence and existing primary entries, and annotation of trace archive and whole genome shotgun data
6. RefSeq
   a. Many sequences are represented more than once: redundancy
   b. Curated secondary database for genomic DNA, transcripts, and proteins for selected organisms; reviewed by NCBI staff
   c. Provides one and only one reference sequence for each DNA, RNA, and protein: non-redundant
   d. RefSeq nomenclature (2+6 format)
      N: experimentally determined
         NC_: complete genome
         NG_: incomplete genomic
         NM_: mRNA
         NR_: noncoding transcripts
         NP_: proteins
         NT_: intermediate genomic contigs
      X: computational prediction; model transcripts and proteins generated through genome annotation
         XM_: model mRNA
         XR_: model RNA
         XP_: model protein
      Assembled genomic regions (contigs); chromosome records
7. EMBL Genome Reviews
   a. Curated secondary database that represents complete genome sequences in DDBJ/EMBL/GenBank
   b. Standardized annotations
   c. Synchronized with UniProt; evidence tagged
8. Protein sequence databases
   a. Information space: MS technology; assays for protein-protein interaction
   b. Derived from the translation of DNA (nucleotide sequence databases)
   c. Universal vs. specialized protein databases; computational vs. curated (enhanced)
   d. GenPept: basic; NCBI; multiple uncurated records
   e. RefSeq: curated, but limited staff
   f. UniProt: a combination of Swiss-Prot, TrEMBL, and PIR-PSD
      - UniProt Knowledgebase (UniProt): central access point for extensive curated protein information, including function, classification, and cross-references
      - UniProt Non-redundant Reference (UniRef): set of databases that combine closely related sequences into a single record to speed searches
      - UniProt Archive (UniParc): comprehensive repository reflecting the history of all protein sequences; no annotation; used internally
   g. Other protein databases

II. Mapping Databases
1. Two components of genomic mapping
   a. Representational measurement: different maps
   b. Process of determining where the biological object (gene or disease locus) lies in the genome: linking a molecular signature with a biological outcome
2. Major bioinformatics challenge: efficient mining and use of genomic data
3. Relationship between mapping and sequence: sometimes DNA sequence tracts can be thought of as ultra-high-resolution maps; the DNA sequence can be considered an annotation of the position
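
Worked example (FASTA format, Lecture 2, item 3). The notes describe FASTA records as a ">" definition line with an identifier, followed by sequence lines of about 60 characters. The following is a minimal sketch of reading and writing such records, not the course's own code. The function names parse_fasta and format_fasta and the two demo records are hypothetical and were made up for illustration.

```python
from typing import Iterator, Tuple

# Hypothetical demo records (not taken from any database).
EXAMPLE = """>demo1 hypothetical DNA record
ACGTACGTAC
gtacgt
>demo2 hypothetical protein record
MKVLA
"""


def parse_fasta(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, description, sequence) for each FASTA record."""
    identifier = None
    description = ""
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A ">" opens a new definition line; emit the previous record.
            if identifier is not None:
                yield identifier, description, "".join(chunks)
            parts = line[1:].strip().split(None, 1)
            identifier = parts[0] if parts else ""
            description = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif identifier is not None:
            # Upper and lower case are both allowed; normalize to upper.
            chunks.append(line.upper())
    if identifier is not None:
        yield identifier, description, "".join(chunks)


def format_fasta(identifier: str, description: str, sequence: str,
                 width: int = 60) -> str:
    """Write one record, wrapping the sequence at `width` characters."""
    header = f">{identifier} {description}".rstrip()
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + lines) + "\n"


records = list(parse_fasta(EXAMPLE))
```

The same reader works for protein records because FASTA does not mark the molecule type; telling DNA from protein is left to the caller.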
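
Worked example (flatfile header, Lecture 2, item 4a). The notes list the header fields in order: unique name, sequence length, molecule type, division code, and date. The sketch below splits a GenBank-style LOCUS line into those fields and looks up the division code in the table from the notes. It assumes a whitespace-separated nucleotide LOCUS line with an optional "linear" or "circular" topology field; the function name parse_locus and the demo line are hypothetical, and real flatfiles have layout variants this sketch does not cover.

```python
from typing import NamedTuple, Optional

# Division codes exactly as listed in the notes.
DIVISIONS = {
    "BCT": "bacterial", "FUN": "fungal", "HUM": "human",
    "INV": "invertebrate", "MAM": "other mammalian", "ORG": "organelle",
    "PHG": "bacteriophage", "PLN": "plant, fungal, and algal",
    "PRI": "primate", "RNA": "structural RNA", "ROD": "rodent",
    "SYN": "synthetic", "UNA": "unannotated", "VRL": "viral",
    "VRT": "other vertebrate", "CON": "constructed", "EST": "expressed sequence tag",
    "GSS": "genome survey sequence", "HTC": "high-throughput cDNA",
    "HTG": "high-throughput genomic", "PAT": "patent",
    "STS": "sequence-tagged site", "WGS": "whole genome shotgun",
}

TOPOLOGIES = {"linear", "circular"}


class Locus(NamedTuple):
    name: str
    length: int
    molecule: str
    topology: Optional[str]
    division: str
    date: str


def parse_locus(line: str) -> Locus:
    """Split a nucleotide LOCUS line into the fields named in the notes."""
    fields = line.split()
    if len(fields) < 7 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError("not a nucleotide LOCUS line: %r" % line)
    name, length = fields[1], int(fields[2])
    rest = fields[4:]
    molecule = rest[0]
    index = 1
    topology = None
    if rest[index].lower() in TOPOLOGIES:  # optional field (assumption)
        topology = rest[index].lower()
        index += 1
    if len(rest) < index + 2:
        raise ValueError("missing division or date: %r" % line)
    return Locus(name, length, molecule, topology, rest[index], rest[index + 1])


# Hypothetical header line for illustration only.
demo = parse_locus("LOCUS       DEMO0001     120 bp    mRNA    linear   INV 01-JAN-2000")
```

Looking up DIVISIONS[demo.division] then gives the plain-language meaning of the fourth header field.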
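
Worked example (accession numbers and RefSeq prefixes, Lecture 2, items 4 and 6). The notes give two accession layouts (1+5 and 2+6), the ACCESSION.VERSION form, and the RefSeq two-letter prefixes. A minimal sketch of recognizing them follows. The names parse_accession and describe_refseq are hypothetical, the RefSeq digits in the usage are placeholders, and only the layouts named in the notes are handled; newer, longer accession layouts are deliberately left out.

```python
import re
from typing import NamedTuple, Optional

# One upper-case letter + five digits, or two letters + six digits,
# with an optional ".version" suffix.
_ACCESSION = re.compile(r"^(?P<acc>[A-Z]\d{5}|[A-Z]{2}\d{6})(?:\.(?P<ver>\d+))?$")

# RefSeq prefixes as listed in the notes.
REFSEQ_PREFIXES = {
    "NC": "complete genome", "NG": "incomplete genomic", "NM": "mRNA",
    "NR": "noncoding transcript", "NP": "protein",
    "NT": "intermediate genomic contig",
    "XM": "model mRNA", "XR": "model RNA", "XP": "model protein",
}


class Accession(NamedTuple):
    accession: str
    version: Optional[int]
    layout: str  # "1+5" or "2+6"


def parse_accession(text: str) -> Optional[Accession]:
    """Return the parsed accession, or None if it matches neither layout."""
    match = _ACCESSION.match(text.strip())
    if match is None:
        return None
    accession = match.group("acc")
    layout = "1+5" if len(accession) == 6 else "2+6"
    version = match.group("ver")
    return Accession(accession, int(version) if version else None, layout)


def describe_refseq(accession: str) -> Optional[str]:
    """Describe a RefSeq accession by its two-letter prefix, else None."""
    prefix, separator, _ = accession.partition("_")
    if not separator or prefix not in REFSEQ_PREFIXES:
        return None
    # The notes group prefixes by first letter: N vs. X.
    origin = ("experimentally determined" if prefix[0] == "N"
              else "computational prediction")
    return "%s (%s)" % (REFSEQ_PREFIXES[prefix], origin)


parsed = parse_accession("U54469.1")
```

Keeping the version separate from the accession mirrors the rule in the notes: the accession stays fixed while the version number grows with each sequence change.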
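
Worked example (CDS feature, Lecture 2, item 4f). The notes describe the CDS feature as an instruction for joining sequence segments and producing an amino acid sequence from the coordinates and genetic code. The sketch below illustrates that idea under stated assumptions: it handles only plain "start..end" spans and "join(...)" on the forward strand with 1-based inclusive coordinates, it uses the standard genetic code, and the names parse_location, extract_cds, and translate, as well as the 15-base demo sequence, are hypothetical.

```python
import re
from typing import List, Tuple

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_location(location: str) -> List[Tuple[int, int]]:
    """Extract (start, end) spans from 'a..b' or 'join(a..b,c..d)'."""
    spans = [(int(s), int(e)) for s, e in re.findall(r"(\d+)\.\.(\d+)", location)]
    if not spans:
        raise ValueError("no spans found in %r" % location)
    return spans


def extract_cds(sequence: str, location: str) -> str:
    """Join the spans; coordinates are 1-based and inclusive."""
    return "".join(sequence[start - 1:end]
                   for start, end in parse_location(location)).upper()


def translate(cds: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequence: two coding segments around a 3-base gap.
GENOMIC = "ATGGCCGTATTTTAA"
cds = extract_cds(GENOMIC, "join(1..6,10..15)")
protein = translate(cds)
```

Reverse-strand features, partial coordinates, and alternative genetic codes would need extra handling that this sketch omits.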
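
Worked example (lod score, Lecture 3, item 9b). The notes name the lod score and maximum likelihood as the statistical tools behind genetic linkage maps but give no formula. As an illustration only, the sketch below uses the textbook two-point case and assumes phase-known, fully informative meioses: with k recombinants out of n meioses, LOD(theta) = log10( theta^k (1 - theta)^(n - k) / 0.5^n ), and the maximum likelihood estimate of the recombination fraction is k/n, capped at 0.5. The function names and the counts in the usage are hypothetical.

```python
import math
from typing import Tuple


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 likelihood ratio of linkage at `theta` vs. no linkage (0.5)."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses")
    k, n = recombinants, meioses
    log_linked = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


def max_lod(recombinants: int, meioses: int) -> Tuple[float, float]:
    """Return (theta_hat, LOD at theta_hat) for the simple two-point case."""
    theta_hat = min(recombinants / meioses, 0.5)
    if theta_hat == 0:
        # Limit as theta -> 0 with no recombinants: 1 / 0.5^n.
        return 0.0, meioses * math.log10(2)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


# Hypothetical counts: 2 recombinants in 20 informative meioses.
theta_hat, best = max_lod(2, 20)
```

Because the estimate is a recombination fraction rather than a base-pair count, it also illustrates why distances on a genetic linkage map are not proportional to distances on a physical map.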

