# 628 Class Note for CHEM C1260 at Purdue

These 15 pages of class notes were uploaded by an elite notetaker on Friday, February 6, 2015. The notes belong to a course at Purdue University taught by a professor in Fall. Since their upload, they have received 22 views.

Date Created: 02/06/15

## Statistical Decisions: Populations, Features and Classes

Randy Julian, Lilly Research Laboratories

### Review of probability in analytical chemistry

Simple mixture of two compounds, $\omega_1$ and $\omega_2$:

- $n_1$ = number of molecules of compound 1
- $n_2$ = number of molecules of compound 2
- Prior probabilities: $P(\omega_1) = \dfrac{n_1}{n_1 + n_2}$ and $P(\omega_2) = \dfrac{n_2}{n_1 + n_2}$
- Blind decision rule: decide $\omega_1$ if $P(\omega_1) > P(\omega_2)$; otherwise decide $\omega_2$.
- This is not bad if $P(\omega_1)$ is very small.

### Example: drugs from active compounds

- Two types of compounds (classes): $\omega_1$ = not a drug, $\omega_2$ = is a drug.
- $P(\omega_1) \gg P(\omega_2)$.
- Easy: guess "Not a Drug" and move on.
- The error rate is low, but the discovery rate is 0.
- We need some non-zero chance of success, so we get more information to make the decision.
- Using additional information requires the new information to be statistically dependent on the state of nature.
- We need some more theory to use this new information.

### Probability and distributions

- $P(a)$: the probability mass function (probability)
- $p(a)$: the probability density function
- $P(a, b)$: the joint probability (likelihood of having both $a$ and $b$)
- $p(a, b)$: the joint probability density
- $P(a \mid b)$: the conditional probability (likelihood of getting $a$ given $b$)
- $p(a \mid b)$: the conditional probability density

Use lower case for density functions and upper case for mass functions.

### Continuous variables

- A random variable $x$ can take values on a continuum.
- The probability of $x$ having a particular value is almost always 0.
- Instead, the probability that $x$ falls within an interval $(a, b)$ is used, computed from the probability density function:

$$\Pr[x \in (a, b)] = \int_a^b p(x)\,dx, \qquad p(x) \ge 0, \qquad \int_{-\infty}^{\infty} p(x)\,dx = 1$$

### Better decisions with more info

- Use a measurement $x$ to improve the decision rule.
- The assumption is that drugs have a different value of $x$ than non-drugs.
- The measurement is assumed to be statistically dependent on the state $\omega$: they have a joint pdf $p(x, \omega)$.
- $x$ then follows a class-conditional probability density function, the likelihood of getting $x$ from something in state $\omega$: they have a conditional pdf $p(x \mid \omega)$.
- What we want is the likelihood of $\omega$ given that we measured $x$: the posterior probability $P(\omega \mid x)$.
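The prior probabilities and the blind decision rule described above can be sketched in a few lines of Python. This is only an illustration: the compound counts (9900 non-drugs, 100 drugs) and the function names `priors` and `blind_decision` are assumptions made up for the example and do not come from the notes.

```python
def priors(n1, n2):
    """Prior probabilities P(w1), P(w2) from molecule counts n1, n2."""
    total = n1 + n2
    return n1 / total, n2 / total


def blind_decision(p1, p2):
    """Blind rule: decide class 1 if P(w1) > P(w2), otherwise class 2."""
    return 1 if p1 > p2 else 2


# Hypothetical library: class 1 = not a drug, class 2 = is a drug.
n_not_drug, n_drug = 9900, 100
p_not_drug, p_drug = priors(n_not_drug, n_drug)

# The rule ignores any measurement, so every compound gets the same label.
decision = blind_decision(p_not_drug, p_drug)

# Error rate = prior of the class that is never chosen.
error_rate = p_drug if decision == 1 else p_not_drug

# Discovery rate = fraction of true drugs that get labelled as drugs.
discovery_rate = 1.0 if decision == 2 else 0.0

print(f"P(w1)={p_not_drug:.2f}  P(w2)={p_drug:.2f}  decide w{decision}")
print(f"error rate={error_rate:.2f}  discovery rate={discovery_rate:.2f}")
```

With these assumed counts the rule always answers "not a drug", which keeps the error rate at the small prior $P(\omega_2)$ but never finds a drug. That is the motivation the notes give for bringing in a measurement $x$.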
### From posterior probabilities to decisions

Bayes rule:

$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$

or

$$P(\omega_i \mid x) = \frac{p(x \mid \omega_i)\,P(\omega_i)}{p(x)}, \qquad p(x) = p(x \mid \omega_1)\,P(\omega_1) + p(x \mid \omega_2)\,P(\omega_2)$$

$p(x)$ is just a scaling factor that makes the posterior probabilities sum to 1.

### Bayes decision rule

Choose the most likely class given the features measured and the likelihood of the class:

- If $P(\omega_1 \mid x) > P(\omega_2 \mid x)$, choose $\omega_1$; otherwise choose $\omega_2$.

The scaling factor $p(x)$ can be removed:

- If $p(x \mid \omega_1)\,P(\omega_1) > p(x \mid \omega_2)\,P(\omega_2)$, choose $\omega_1$; otherwise choose $\omega_2$.

### Cost of error

What if there is a cost of making a mistake between choices?

- $\lambda(\alpha_i \mid \omega_j)$: the cost of taking action $i$ when the state of nature was actually $j$.
- Symmetrical, or zero-one, loss: $\lambda(\alpha_i \mid \omega_j) = 0$ if $i = j$ and $1$ if $i \ne j$.

|   | 1 | 2 |
|---|---|---|
| 1 | 0 | 1 |
| 2 | 1 | 0 |

Bayes risk:

$$R(\alpha_i \mid x) = \sum_j \lambda(\alpha_i \mid \omega_j)\,P(\omega_j \mid x)$$

### Cost of misclassification (asymmetric)

- Assume a low probability of a compound being a drug.
- Assume there are different costs for being wrong.

|   | 1 | 2 |
|---|---|---|
| 1 | 0 | 0.002 |
| 2 | 0.998 | 0 |

The two off-diagonal entries are the two kinds of mistake: acting like it was a drug when it wasn't, and acting like it wasn't a drug when it was.

### Molecular weight as a feature

Made-up data, with a little help from [2].

[Figure: class-conditional densities $p(x \mid \omega_i)$, "Possible drug distribution by molecular weight"; x axis: mol. wt.]

### Posterior probabilities $P(\omega_i \mid x)$

[Figure: posterior probabilities $P(\omega_i \mid x)$ versus mass $x$.]

### Risk function $R(\alpha_i \mid x)$

[Figure: risk functions $R(\alpha_i \mid x)$ versus mass $x$.]

### What if there were more overlap?

[Figure: class-conditional densities $p(x \mid \omega_i)$, "Alternative drug distribution"; x axis: mol. wt.]

### Posterior and risk

[Figure: posterior probabilities $P(\omega_i \mid x)$ and risk functions versus mass $x$ for the alternative distribution.]

### Minimum error decision rule

- Choose the action which minimizes the Bayes risk.
- A machine which chooses the minimum risk is called a classifier.
- Discriminant functions are assigned to each class: $g_i(x)$, $i = 1, 2, \ldots, c$.
- The class or category is assigned based on the largest discriminant.
- Bayes classifier: $g_i(x) = -R(\alpha_i \mid x)$.

### General classifier

[Figure 2.5: The functional structure of a general statistical pattern classifier, which includes $d$ inputs and $c$ discriminant functions $g_i(x)$. A subsequent step determines which of the discriminant values is the maximum and categorizes the input pattern accordingly. The arrows show the direction of the flow of information, though frequently the arrows are omitted when the direction of flow is self-evident. From Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification. Copyright 2001 by John Wiley & Sons, Inc.]

### Approaches to classification

Recall

$$P(\omega_i \mid x) = \frac{p(x \mid \omega_i)\,P(\omega_i)}{p(x)}$$

- Density estimation
  - Estimate the density $\hat{p}_n(x \mid \omega_i)$.
  - Compute $\hat{P}_n(\omega_i \mid x)$.
  - Build the plug-in classifier $\hat{\omega}(x) = \arg\max_{1 \le i \le c} \hat{P}_n(\omega_i \mid x)$.
- Regression
  - Directly estimate $\hat{P}_n(\omega_i \mid x)$.
- Discriminant analysis
  - Construct discriminant functions $\hat{g}_i(x)$, with $\hat{\omega}(x) = \arg\max_{1 \le i \le c} \hat{g}_i(x)$.
- Learn decision functions directly: $\hat{\omega}(x)$.

### Finding the discriminant function directly

One-dimensional example with $P(\omega_1) = .5$ and $P(\omega_2) = .5$.

[Figure 2.10. From Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification. Copyright 2001 by John Wiley & Sons, Inc.]

### Minimizing the expected error

- The objective is to minimize the loss function.
- Loss functions are constructed as $R(\alpha \mid x)$ or as an expected loss $\mathrm{E}[L(\omega, f(X))]$.
- Seek a function $f(X)$ for predicting $\omega$ given the input $X$.
- Familiar loss function, squared error loss: $L(\omega, f(X)) = (\omega - f(X))^2$.
- This looks more familiar if $\omega$ is treated like a variable: $L(Y, f(X)) = (Y - f(X))^2$.

### Regression as discrimination

It can be shown that

$$f(x) = \arg\min_c \mathrm{E}_{Y \mid X}\big([Y - c]^2 \mid X = x\big)$$

$$f(x) = \mathrm{E}(Y \mid X = x)$$

This is the conditional expectation, also known as the regression function. The best prediction of $Y$ at any point $X = x$ is the conditional mean, when best is measured by average squared error.

### Finding an f(x): linear regression

What if one assumes $f(x)$ is approximately linear in its arguments?

$$f(x) \approx x^T \beta, \quad \text{so} \quad \beta = [\mathrm{E}(X X^T)]^{-1}\,\mathrm{E}(X Y)$$

- This is not conditioned on $X$; it uses a functional relationship to pool over values of $X$.
- It assumes $f(x)$ is well approximated by a globally linear function.

### Linear regression example

[Figure: Linear Regression of 0/1 Response.]

### Finding an f(x): nearest neighbor

Nearest neighbor tries to find $f(x)$ directly from the training data:

$$\hat{f}(x) = \mathrm{Ave}\big(y_i \mid x_i \in N_k(x)\big)$$

where $N_k(x)$ is the neighborhood containing the $k$ points in $T$ closest to $x$.

Approximations:

- Expectation is approximated by averaging over sample data.
- Conditioning at a point is relaxed to conditioning on some region close to the target point.

This assumes $f(x)$ is approximated by a locally constant function.

### Nearest neighbor example 1

[Figure: 15-Nearest Neighbor Classifier.]

### Nearest neighbor example 2

[Figure: 1-Nearest Neighbor Classifier.]

### Classification error

[Figure: classification error versus k, the number of nearest neighbors (151, 83, 45, 25, 15, 9, 5, 3, 1), with a second axis for degrees of freedom N/k; curves labelled Linear, Train, Test and Bayes.]

### Bayes optimal

[Figure: Bayes Optimal Classifier.]

### References

1. Duda, R. O., Hart, P. E., and Stork, D. G. Pattern Classification. John Wiley & Sons, 2001.
2. Lipinski, C. A., Lombardo, F., Dominy, B. W., and Feeney, P. J. "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings." Advanced Drug Delivery Reviews 23, 3-25 (1997).
3. Hastie, T., Tibshirani, R., and Friedman, J. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, 2001.
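As a recap of the decision theory in these notes, here is a minimal Python sketch of the chain Bayes rule, conditional risk $R(\alpha_i \mid x)$, discriminant $g_i(x) = -R(\alpha_i \mid x)$. Everything numeric in it is an assumption for illustration only: the notes do not give the shape of the class-conditional densities, so Gaussians are swapped in, and the means, standard deviations, priors, and the cost of 200 are hypothetical. The function names are also invented for the example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Assumed class-conditional density p(x | w): a normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, priors, params):
    """Bayes rule: P(w_i | x) = p(x | w_i) P(w_i) / p(x)."""
    joint = [gaussian_pdf(x, m, s) * p for (m, s), p in zip(params, priors)]
    evidence = sum(joint)  # p(x), the scaling factor
    return [j / evidence for j in joint]


def conditional_risks(post, loss):
    """R(a_i | x) = sum_j loss[i][j] * P(w_j | x)."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, post)) for row in loss]


def decide(x, priors, params, loss):
    """Return the action (1-based) with the largest discriminant g_i = -R_i."""
    risks = conditional_risks(posteriors(x, priors, params), loss)
    g = [-r for r in risks]
    return max(range(len(g)), key=g.__getitem__) + 1


# Hypothetical setup: class 1 = not a drug, class 2 = is a drug,
# feature x = molecular weight.
params = [(300.0, 50.0), (500.0, 50.0)]  # (mean, std) per class, made up
priors = [0.99, 0.01]

zero_one = [[0.0, 1.0], [1.0, 0.0]]
# Made-up asymmetric loss: row = action, column = true state.
# Calling a real drug "not a drug" (action 1, state 2) costs 200.
asymmetric = [[0.0, 200.0], [1.0, 0.0]]

x = 400.0  # midpoint, so both likelihoods are equal and posteriors = priors
print(posteriors(x, priors, params))
print(decide(x, priors, params, zero_one))
print(decide(x, priors, params, asymmetric))
```

Under zero-one loss the minimum-risk action coincides with picking the largest posterior, so the skewed prior wins at the midpoint. Raising the assumed cost of missing a drug shifts the decision toward "is a drug" at the same measurement, which is the effect the asymmetric loss slide describes.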
