Special Topics EECS 690
This 4-page set of class notes was uploaded by Melissa Metz on Monday, September 7, 2015. The notes belong to EECS 690 at Kansas, taught by Staff in Fall.
EECS 690: Introduction to Bioinformatics
Microarray Data Analysis III
Anne Y. Zhang, Electrical Engineering and Computer Science
http://people.eecs.ku.edu/~yazhang

Outline: Classification
- K-nearest-neighbor classifier
- Decision tree learning
- Naive Bayes classifier

Review: Self-Organizing Map (SOM)
- The SOM works both as a projection/visualization method and as a clustering method: it maps high-dimensional input data onto a low-dimensional (usually two-dimensional) output space while preserving the topological relationships between the input data.
- It is a constrained version of K-means clustering: the K prototypes are constrained to lie on a one- or two-dimensional manifold (a rectangular or hexagonal grid).

Review: Principal Components Analysis (PCA)
- PCA projects the high-dimensional points into a low-dimensional space while preserving the essence of the data; it provides a better coordinate system for the data.
- PCA objectives:
  - To reduce dimensionality
  - To reduce redundancies in the data
  - To choose the most useful variables (features)
  - To visualize multidimensional data
  - To identify groups of objects (e.g., genes/samples)
  - To identify outliers

Microarray Data Analysis
- Supervised learning (classification, class prediction): finding distinctions between groups. Methods: K-nearest neighbor, decision tree learning, naive Bayes classifier, support vector machine.
- Unsupervised learning (clustering, class discovery): finding previously unrecognized structure. Methods: hierarchical clustering, K-means clustering, GMM, SOM, PCA.

Example: Tumor Classification
- Current methods for classifying human tumors rely mainly on a variety of morphological, clinical, and molecular variables.
- Microarrays allow the characterization of molecular variations among tumors by monitoring gene expression.
- The hope is that microarrays will lead to more reliable tumor classification, and therefore to more appropriate treatments and better outcomes.
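The PCA review above can be sketched in a few lines of NumPy. This is a minimal illustration, not from the notes: the function name `pca`, the toy "expression matrix," and the SVD-based formulation are my assumptions.

```python
import numpy as np

def pca(X, n_components=2):
    """Project rows of X onto the top principal components.

    X: (n_samples, n_features) data matrix.
    Returns (scores, components, explained_variance).
    """
    Xc = X - X.mean(axis=0)               # center each feature
    # SVD of the centered data yields the principal directions (rows of Vt).
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    comps = Vt[:n_components]             # principal axes
    scores = Xc @ comps.T                 # low-dimensional coordinates
    explained = (s ** 2) / (len(X) - 1)   # variance along each axis
    return scores, comps, explained[:n_components]

# Hypothetical expression matrix: 6 samples x 4 genes,
# with most of the variance injected into gene 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
X[:, 0] *= 10.0
scores, comps, var = pca(X, n_components=2)
print(scores.shape)   # (6, 2): each sample now described by 2 coordinates
```

Plotting the two score columns against each other is the usual way to look for groups of samples or outliers, as the objectives above suggest.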
Example: Tumor Classification (continued)
- Data: gene expression of tumor tissue from cancer patients, and gene expression of normal tissue from normal persons.
- Task: learn a model from the above data that can be used to predict the status of a person based on his or her expression profile.
- Solution: the classifiers discussed below.

Classification
- Data: a set of data records (also called examples, instances, or cases), each described by k attributes A1, A2, ..., Ak and a class. Each example is labeled with a predefined class.
- Goal: learn a classification model from the data that can be used to predict the classes of new (future, or test) cases/instances.
- Supervised learning: the classes are predefined; use a training (learning) set of labeled objects to form a classifier for classification of future observations.

General Model of Classification
- Given input X and output Y:  Y = f(X) + e,  where E[e] = 0, e is independent of X, and f is unknown.
- We want to estimate the function f based on a known data set (the training data).

Classification: A Two-Step Process
- Model construction: describing a set of classes.
  - Each sample is assumed to belong to a predefined class, as determined by the class label.
  - The set of samples used for model construction is the training set.
  - The model is represented as classification rules, decision trees, or mathematical formulae.
- Model usage: classifying future or unknown objects.
  - Estimate the accuracy of the model: the known label of each test sample is compared with the classified result from the model; the accuracy rate is the percentage of test-set samples that are correctly classified by the model.
  - The test set is independent of the training set.

Approaches
- K-nearest neighbor
- Decision tree learning
- Naive Bayes

K-Nearest Neighbor
- Majority vote within the k nearest neighbors:
    y^(x) = (1/k) * sum of yi over the xi in Nk(x), the k nearest training points to x
- kNN requires a parameter k and a distance metric.
- For k = 1, the training error is zero, but the test error could be large.
- As k increases, the training error tends to increase, but the test error tends to decrease first and then increase.
- For a reasonable k, both the training and test errors could be smaller than those of a linear decision boundary.
- [Figure: kNN decision boundaries for k = 1 (brown) and k = 3 (green).]
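The majority-vote rule above can be sketched directly. This is an illustrative implementation under my own assumptions: the function name `knn_predict`, Euclidean distance as the metric, and the toy two-class data are not from the notes.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training sample.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]           # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]         # majority class among neighbors

# Toy data: two well-separated classes in 2-D feature space.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.05, 0.1]), k=3))  # -> 0
print(knn_predict(X_train, y_train, np.array([5.0, 5.1]), k=3))   # -> 1
```

Note that the only "training" is storing the data; all the work happens at prediction time, which is why k and the distance metric are the two design choices the notes call out.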
K-Nearest Neighbor: k vs. Misclassification Error
- Small k: low bias, high variance -> overfitting.
- Large k: more stable estimates, but not flexible.

Decision Tree Learning
- Decision tree learning is one of the most widely used techniques for classification.
- The classification model is a tree, called a decision tree: a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and leaf nodes represent class labels (or class distributions).
- C4.5, by Ross Quinlan, is perhaps the best-known system; it can be downloaded from the Web.

A decision tree for "buys_computer"
- [Figure: decision tree with decision nodes testing attributes such as age, student, and credit_rating (excellent/fair), and leaf nodes giving the classes yes/no.]

From a decision tree to a set of rules
- A decision tree can be converted to a set of rules: each path from the root to a leaf is a rule.
- [Figure: example rules of the form IF age = "<=30" AND student = "no" THEN buys_computer = "no".]

Decision Tree Generation
- Decision tree generation consists of two phases:
  - Tree construction: at the start, all the training examples are at the root; then partition the examples recursively based on selected attributes.
  - Tree pruning: identify and remove branches that reflect noise or outliers.
- Use of a decision tree: to classify an unknown sample, test the attribute values of the sample against the decision tree.
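The construction phase (recursive partitioning on selected attributes) can be sketched for categorical data using information gain to select the splitting attribute, as ID3/C4.5 do; pruning is omitted. The function names and the tiny buys_computer-style records are my own illustrative assumptions, not the dataset from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Recursively partition rows on the attribute with the highest information gain."""
    if len(set(labels)) == 1 or not attrs:       # pure node, or no attributes left
        return Counter(labels).most_common(1)[0][0]
    def gain(a):                                  # entropy reduction from splitting on a
        total = entropy(labels)
        for v in set(r[a] for r in rows):
            sub = [l for r, l in zip(rows, labels) if r[a] == v]
            total -= len(sub) / len(rows) * entropy(sub)
        return total
    best = max(attrs, key=gain)                   # attribute tested at this internal node
    branches = {}
    for v in set(r[best] for r in rows):          # one branch per outcome of the test
        sub_rows = [r for r in rows if r[best] == v]
        sub_labels = [l for r, l in zip(rows, labels) if r[best] == v]
        branches[v] = build_tree(sub_rows, sub_labels,
                                 [a for a in attrs if a != best])
    return (best, branches)

def classify(node, row):
    """Walk the tree, testing one attribute per internal node, down to a leaf."""
    while isinstance(node, tuple):
        attr, branches = node
        node = branches[row[attr]]
    return node

# Hypothetical buys_computer-style records: (age, student) -> class
rows = [{"age": "<=30", "student": "no"},   {"age": "<=30", "student": "yes"},
        {"age": "31..40", "student": "no"}, {"age": ">40", "student": "yes"}]
labels = ["no", "yes", "yes", "yes"]
tree = build_tree(rows, labels, ["age", "student"])
print(classify(tree, {"age": "<=30", "student": "no"}))   # -> "no"
```

Each root-to-leaf path of the returned tree is exactly one IF-THEN rule, which is the conversion described above.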