by: Michael Reilly
This 46-page set of class notes was uploaded by Michael Reilly on Friday, September 4, 2015. The notes belong to CHEM 0160A at University of California - Los Angeles, taught by C. Lee in Fall. Since upload, they have received 115 views. For similar materials, see /class/177976/chem-0160a-university-of-california-los-angeles in Chemistry at University of California - Los Angeles.


Date Created: 09/04/15
Introduction to Bioinformatics
Chem CM160A/260A, CS CM121/221
Christopher Lee, Dept. of Chemistry & Biochemistry
TEL: 57374
EMAIL: leec@chem.ucla.edu

Course Goals
[Handwritten slide; text not recoverable from the scan.]

Prerequisites
Summary
[Handwritten slides; text not recoverable from the scan. Legible fragment: meets MWF [...] 1 pm, Botany 3253.]

Course Website
[Handwritten slide; text not recoverable from the scan.]

Why Bioinformatics?
The way to answer biological questions in the genomics era.
Example: What are the origins of genome complexity?
[Figure not recoverable; legible labels include "Drosophila" and "exons".]

The First Sign of Complexity
Letter, © 2000 Nature America Inc., http://genetics.nature.com
"Analysis of expressed sequence tags indicates 35,000 human genes", Brent Ewing & Phil Green
Legible excerpt: "The number of protein-coding genes in an organism provides a [...] invertebrates Caenorhabditis elegans and Drosophila melanogaster having 19,000 and 13,600 genes, respectively."

Letter: "Gene Index analysis of the human genome estimates approximately 120,000 genes"
[Body text not recoverable from the scan.]

Puzzle 2
Given the same experimental data (ESTs) and the same access to the latest computers and analysis methods, why is it so hard to count the human genes?

Article: "Initial sequencing and analysis of the human genome", International Human Genome Sequencing Consortium
There appear to be about 30,000-40,000 protein-coding genes in the human genome, only about twice as many as in worm or fly. However, the genes are more complex, with more alternative splicing generating a larger number of protein products.

Puzzle 3
[Handwritten slide; text not recoverable from the scan.]

Science by Computer
[Handwritten slide; text not recoverable from the scan.]

Introduction: Genomics
What is genomics?
What is bioinformatics?
How does research in this new field differ from previous research, e.g. molecular biology?

Definitions
[Handwritten slide; text not recoverable from the scan.]

From Nothing to Everything
Molecular Biology: "one gene, one protein"; one gene, one lab; one gene, one PhD
Genomics: one genome, one PhD (e.g. Sorel Fitz-Gibbon, P. aerophilum, 2000)

From Nothing to Everything
What do you do when all the genes (or genomes; the word is partly illegible) are sequenced and sitting in GenBank?
[Handwritten notes not recoverable from the scan.]

Genomics Foundation: High Throughput Technology
[Handwritten notes not recoverable from the scan.]

Laser Dye-Based Sequencing (four-color)
Automated Lane Tracking
Automated Trace Analysis
Automated Base Calling

[Slide on complete genome sequences; legible fragment: "400 Complete Gen[...]".]

In genomics, every question is really an information problem.
[Handwritten notes not recoverable from the scan.]

Human Genome Sequencing
[Handwritten slide; legible fragments: "information problem", "Assembly".]

Hierarchical shotgun sequencing
[Figure; legible labels: genomic DNA, BAC library, organized mapped large clone contigs, BAC to be sequenced, shotgun clones, shotgun sequence, assembly.]
[Figure on alignments during assembly; legible labels: "alignment OK", "alignment in middle only: not OK".]
[Figure: chromosome position versus map location for Chromosome 2; legible label: "Marshfield map".]

How to Turn Data into Discoveries
When you get massive data, what do you do with it? What does it mean?
[Handwritten notes not recoverable from the scan.]

Computational Challenges
[Handwritten slide; text not recoverable from the scan.]

Solving the Information Problem
[Handwritten slide; text not recoverable from the scan.]

Completeness Changes Everything
[Handwritten slide; text not recoverable from the scan.]

Example: Ortholog Prediction
Gene Evolution through Speciation vs. Duplication
[Diagram labels: ancestral gene, speciation, gene duplication, orthologs, paralogs.]
In this diagram, all genes drawn with the same symbol are orthologous; genes with different symbols are paralogous. Genes with the same color are in the same species.

Using Multiple Genomes for Ortholog Cross-Validation
[Handwritten slide; text not recoverable from the scan.]

Bidirectional Homology Search of Genomes
[Slide body not recoverable from the scan.]

Clusters of Orthologous Genes Analysis
[Figure not recoverable from the scan.]

Slightly More Complex Example
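The body of the "Bidirectional Homology Search of Genomes" slide did not survive extraction, so the following is only an illustrative sketch of one common way to make that idea concrete: reciprocal best hits. A gene in genome A and a gene in genome B are called putative orthologs when each is the other's top-scoring hit; a paralog created by duplication typically fails the check in one direction. This is an assumption about what the slide covers, not a transcription of the lecture's method. The gene names, the scores, and the functions `best_hits` and `reciprocal_best_hits` are all hypothetical; in practice the scores would come from a homology search tool.

```python
from typing import Dict, List, Tuple

# query gene -> {hit gene -> similarity score}; higher means more similar.
# These dictionaries stand in for the output of a real homology search.
Hits = Dict[str, Dict[str, float]]


def best_hits(scores: Hits) -> Dict[str, str]:
    """Map each query gene to its single highest-scoring hit."""
    return {q: max(hits, key=hits.get) for q, hits in scores.items() if hits}


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b AND b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented example: a2 resembles b1, but b1's best hit is a1, so a2 is
# treated as a likely paralog rather than the ortholog of b1.
a_vs_b: Hits = {
    "a1": {"b1": 950.0, "b2": 400.0},
    "a2": {"b1": 700.0, "b2": 300.0},
    "a3": {"b3": 520.0},
}
b_vs_a: Hits = {
    "b1": {"a1": 940.0, "a2": 690.0},
    "b2": {"a1": 410.0, "a2": 310.0},
    "b3": {"a3": 515.0},
}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)  # [('a1', 'b1'), ('a3', 'b3')]
```

Requiring agreement in both directions is a deliberately conservative choice: it misses some true orthologs (for example, b2 gets no partner here) in exchange for fewer paralogs being mislabeled.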

