Statistical Bioinformatics STAT 5570
Utah State University
This 19 page Class Notes was uploaded by Geovanny Lakin on Wednesday October 28, 2015. The Class Notes belongs to STAT 5570 at Utah State University taught by John Stevens in Fall.
Date Created: 10/28/15
Gene expression analysis of time course experiments
Reed Gann
Statistical Bioinformatics 6570, Spring 09

References
    A. Conesa, M. Nueda, A. Ferrer, and M. Talón. maSigPro: a method to identify significant differential expression profiles in time-course microarray experiments. 2005.
    Tusher, V., Tibshirani, R., and Chu, G. (2001). "Significance analysis of microarrays applied to the ionizing radiation response." PNAS 98: 5116-5121.

Why a time course?
    Life is dynamic
    Scale is important, in time and in gene dosage
    Often a system's response to a stimulus is poorly defined
    Can be seen as a "safe" experimental design

Classic time course experiment
    Want to measure the response to some stimulus: treatment vs. control
    Also interested in the effect of time: select various time points to sample from and construct the experiment accordingly (e.g. 6, 12, 24 hours)
    Replicates: two required, three if possible

Replicates
    Not discussed previously
    Allow us to get a better estimate of variance
    Define: biological vs. technical replicates
    Rep plot: additional quality check required

Rep plot

    library(affy)
    library(gcrma)
    # path separators were lost in the source; the path is shown as best recoverable
    data <- ReadAffy(celfile.path = "Documents/idisk local/Spring 09/Statistical Bioinformatics/TIMECOURSE")
    # the preprocessing call is illegible in the source; gcrma() is assumed here
    Data <- gcrma(data)
    exprsData <- exprs(Data)
    # scatter plot of two replicate arrays; points should fall along the identity line
    plot(exprsData[, 1], exprsData[, 2], pch = 16, cex = 0.8,
         xlab = "Trt0tp30Rep1", ylab = "Trt0tp30Rep2")
    abline(0, 1)

[Figure: replicate plot of two replicate arrays (axis label: RG13 TRT4TP120REP2.CEL)]

Data preprocessing
    Preprocessing steps:
        Image summarization
        Data normalization
        Quality checks
    All similar to what we've done all semester long, to prepare the data for the two methods we'll discuss today

maSigPro: the data
    Treatment vs. control: human Caco-2 cells and B. infantis ATCC 15697 treated human Caco-2 cells
    Three time points: 30, 60, 120 min
    Two biological replicates at each time point and each treatment class
    Total: 12 arrays
    HG-U133 2.0 chip, 54,675 probesets on each array

maSigPro: the code

    library(affy)
    library(gcrma)
    data <- ReadAffy(celfile.path = "Documents/idisk local/Spring 09/Statistical Bioinformatics/TIMECOURSE")
    # preprocess data (call illegible in the source; gcrma() assumed)
    Data <- gcrma(data)
    exprsData <- exprs(Data)
    Tcell <- c(rep(0, 6), rep(1, 6))   # 0 = control, 1 = treatment
    gg <- rownames(exprsData)
    # not used by maSigPro below; one further element of this list is illegible in the source
    maSigProdata <- list(x = exprsData, y = Tcell, geneid = gg, logged2 = TRUE)
    colnames(exprsData) <- paste("Array", c(1:12), sep = "")

    # create experimental design
    # assumes columns are ordered control then treatment, two replicates per time point
    Time <- rep(c(rep(c(1:3), each = 2)), 2)
    Replicates <- rep(c(1:6), each = 2)
    Control <- c(rep(1, 6), rep(0, 6))
    Treat1 <- c(rep(0, 6), rep(1, 6))
    edesign <- cbind(Time, Replicates, Control, Treat1)
    rownames(edesign) <- paste("Array", c(1:12), sep = "")

    library(maSigPro)
    tctest <- maSigPro(exprsData, edesign, degree = 1, vars = "groups", main = "Test")
    tctest$summary   # shows significant genes by experimental groups

[Console output: progress messages of the form "fitting gene 52800 out of 54675", followed by the tctest$summary line reporting no significant genes]

    No significant genes
    This is a result of the model

maSigPro model
    "maSigPro follows a two step regression strategy to find genes with significant temporal expression changes and significant differences between experimental groups"
    Does not make any mention of repeated measures [remainder illegible]
    Major issue: if significance is found, the highest-order interactions must be characterized with a post hoc analysis

Familiar alternative
    We can also use SAM to analyze time course data
    The time course analysis feature is also available in the Excel plug-in
    Data organization is slightly different from what we've done previously
    Key feature: the model accounts for repeated measures

SAM: the code

    library(impute)
    library(samr)
    # timecourse data:
    # elements of y are of the form kTimet, where k is the class label and t is the time;
    # in addition, the suffixes Start or End indicate the first and last observation
    # in a given time course;
    # the class label can be that for a two class unpaired, one class, or
    # two class paired problem
    gg <- rownames(exprsData)
    # assumes columns are ordered as consecutive 3-point time courses, two per class;
    # the slide appears to show 90 as the last time, the design above says 120 min
    class <- paste(c(rep(1, 6), rep(2, 6)), "Time", rep(c(30, 60, 120), 4), sep = "")
    start <- c(1, 4, 7, 10)
    for (i in start) {
      class[i] <- paste(class[i], "Start", sep = "")
    }
    for (i in start + 2) {
      class[i] <- paste(class[i], "End", sep = "")
    }
    # geneid and genenames are added here so that later samr helpers can label genes
    data.timeSAM <- list(x = exprsData, y = class, geneid = gg, genenames = gg,
                         logged2 = TRUE)
    samr.obj <- samr(data.timeSAM, resp.type = "Two class unpaired timecourse",
                     testStatistic = "standard", time.summary.type = "signed.area",
                     random.seed = 1234)
    delta.table <- samr.compute.delta.table(samr.obj)
    round(delta.table[, c(1, 4, 5)], 3)

[Console output: delta table (delta, number of genes called, median FDR); values not recoverable from the source]

    We get significant genes at an FDR of 5%
    At delta = 1.68 we get about 30 significant genes
    The model is more appropriate

SAM
    Recall from our previous visit to SAM the relative difference function:

    \[ d(i) = \frac{\bar{x}_I(i) - \bar{x}_U(i)}{s(i) + s_0} \]

    s(i) is the "gene-specific scatter" and is defined as the standard deviation of repeated expression measurements:

    \[ s(i) = \sqrt{ a \left\{ \sum_m \left[ x_m(i) - \bar{x}_I(i) \right]^2 + \sum_n \left[ x_n(i) - \bar{x}_U(i) \right]^2 \right\} } \]

    where \( a = (1/n_1 + 1/n_2) / (n_1 + n_2 - 2) \), \( n_1 \) and \( n_2 \) are the numbers of measurements in states I and U, and \( s_0 \) is a small positive constant.

SAM vs. maSigPro
    The SAM model accounts for the repeated measurements that result from a time course experimental design
    maSigPro is simply a two-step regression method applied to each of the experimental factors
    Costly error: significant genes vs. no significant genes
    Post hoc analysis proves that the time-by-treatment interaction is significant (not shown)

Review
    A time course introduces new aspects of the analysis that must be considered:
        Replicates: additional quality check required
        Interactions between time and treatment
        If the time-by-treatment interaction is significant, it must be characterized
        Interactions must be accounted for in the model applied for analysis
    Recall: maSigPro 0, SAM 30 (significant genes)

Unaddressed issues
    Post hoc analysis
    Steps to address high variance in biological reps
    Visualization of these significant genes
    General issues with the code
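
Worked example: the SAM relative difference
The relative difference d(i) recalled in the SAM slides can be checked by hand on a small data set. The following is a minimal base-R sketch for a plain two-class comparison, not the time course variant and not the samr implementation. The function name sam_d, the simulated matrix x, the grouping vector grp, and the value s0 = 0.5 are all hypothetical and chosen only for illustration.

```r
# Toy illustration only: simulated values, not the Caco-2 arrays.
set.seed(1)
n1 <- 6
n2 <- 6
x <- matrix(rnorm(5 * (n1 + n2)), nrow = 5)   # 5 hypothetical genes, 12 arrays
grp <- c(rep(1, n1), rep(2, n2))              # 1 = state U, 2 = state I

# Relative difference d(i) = (mean_I - mean_U) / (s(i) + s0), one value per gene (row).
sam_d <- function(x, grp, s0 = 0) {
  xI <- x[, grp == 2, drop = FALSE]
  xU <- x[, grp == 1, drop = FALSE]
  nI <- ncol(xI)
  nU <- ncol(xU)
  mI <- rowMeans(xI)
  mU <- rowMeans(xU)
  a <- (1 / nI + 1 / nU) / (nI + nU - 2)
  # pooled sum of squared deviations from each group's own mean
  ss <- rowSums((xI - mI)^2) + rowSums((xU - mU)^2)
  s <- sqrt(a * ss)                           # gene-specific scatter s(i)
  (mI - mU) / (s + s0)
}

d0 <- sam_d(x, grp, s0 = 0)     # with s0 = 0 this is the pooled-variance t statistic
d1 <- sam_d(x, grp, s0 = 0.5)   # a positive s0 shrinks every d(i) toward zero
```

With s0 = 0 the value for each gene matches t.test(..., var.equal = TRUE) on that row, which is a convenient sanity check. A positive s0 keeps genes with very small scatter from producing extreme values of d(i).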