# 666 Class Note for CHEM C1260 at Purdue

These 18-page class notes were uploaded by an elite notetaker on Friday, February 6, 2015. They belong to a course at Purdue University taught in the fall term and have received 26 views since upload.


## Multilayer Neural Networks

*Randy Julian, Lilly Research Laboratories*

### Non-linear systems: multiple layers

A three-layer network for $K$ classes (schematic): inputs $X_1, \dots, X_p$, a hidden layer of derived features $Z_1, \dots, Z_M$ (the "hidden units"), and outputs $Y_1, \dots, Y_K$.

**What is going on?** The model is

$$Z_m = \sigma(\alpha_{0m} + \alpha_m^T X), \qquad m = 1, \dots, M$$

$$T_k = \beta_{0k} + \beta_k^T Z, \qquad f_k(X) = g_k(T), \qquad k = 1, \dots, K$$

where $g_k(T) = T_k$ for regression and $g_k(T) = e^{T_k} / \sum_{l=1}^{K} e^{T_l}$ (the "softmax" function) for classification.

**Activation function:** $\sigma(v) = 1 / (1 + e^{-v})$.

**Parameters ("weights"):**

- $\{\alpha_{0m}, \alpha_m : m = 1, \dots, M\}$, i.e. $M(p+1)$ weights
- $\{\beta_{0k}, \beta_k : k = 1, \dots, K\}$, i.e. $K(M+1)$ weights

**Error functions:** squared error

$$R(\theta) = \sum_{k=1}^{K} \sum_{i=1}^{N} \big(y_{ik} - f_k(x_i)\big)^2$$

or cross-entropy (deviance)

$$R(\theta) = -\sum_{i=1}^{N} \sum_{k=1}^{K} y_{ik} \log f_k(x_i),$$

with classifier $G(x) = \arg\max_k f_k(x)$.

**Comments about the model:**

- With softmax activation and the cross-entropy error function, this is exactly a linear logistic regression model in the hidden units, and all parameters are estimated by maximum likelihood.
- Large number of parameters: the problem is probably over-parameterized. We don't want a global minimum of $R(\theta)$; it is probably an over-fit solution.
- Some regularization is needed to prevent over-fitting: directly (a penalty term) or indirectly (early stopping).

Gradient descent must contend with multiple error surfaces (local minima).

### Training: back-propagation ("delta rule")

A two-pass method:

1. **Forward pass:** with the weights fixed, the outputs $f_k(x_i)$ are computed.
2. **Backward pass:** the errors $\delta_{ki}$ in the output layer are computed, then the errors in the middle layer by back-propagation:

$$s_{mi} = \sigma'(\alpha_m^T x_i) \sum_{k=1}^{K} \beta_{km} \delta_{ki}$$

Compute the gradients and update the weights:

$$\beta_{km}^{(r+1)} = \beta_{km}^{(r)} - \gamma_r \sum_{i=1}^{N} \frac{\partial R_i}{\partial \beta_{km}^{(r)}}, \qquad \alpha_{ml}^{(r+1)} = \alpha_{ml}^{(r)} - \gamma_r \sum_{i=1}^{N} \frac{\partial R_i}{\partial \alpha_{ml}^{(r)}}$$

where $r$ indexes the training iteration ("epoch") and $\gamma_r$ is the learning rate.

### Early stopping

Start from small initial weights and stop learning before the training error converges (figure: learning curve with the stopping point marked).

**Problems with early stopping:**

> The issue of when to stop is important: we know of no satisfactory rule for this algorithm. Folklore suggests that disasters have been saved because convergence is so notoriously slow that users cannot afford the computer time to over-fit the training data.

(B. D. Ripley, *Pattern Recognition and Neural Networks*, p. 155)

### Penalty terms

Minimize $R(\theta) + \lambda J(\theta)$.

**Weight decay:**

$$J(\theta) = \sum_{km} \beta_{km}^2 + \sum_{ml} \alpha_{ml}^2$$

**Weight elimination:**

$$J(\theta) = \sum_{km} \frac{\beta_{km}^2}{1 + \beta_{km}^2} + \sum_{ml} \frac{\alpha_{ml}^2}{1 + \alpha_{ml}^2}$$

which shrinks smaller weights toward zero. Cross-validation is used to find $\lambda$.
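To make the two penalties concrete, here is a minimal pure-Python sketch (not from the lecture; the weight values are arbitrary illustration numbers). Weight decay grows without bound in the large weights, while each weight-elimination term is bounded by 1, so it mainly shrinks the small weights.

```python
def weight_decay(betas, alphas):
    """Weight-decay penalty: J(theta) = sum(beta^2) + sum(alpha^2)."""
    return sum(b * b for b in betas) + sum(a * a for a in alphas)

def weight_elimination(betas, alphas):
    """Weight-elimination penalty: each squared weight is damped by
    1/(1 + w^2), so every term is bounded by 1."""
    return (sum(b * b / (1 + b * b) for b in betas)
            + sum(a * a / (1 + a * a) for a in alphas))

# Arbitrary illustration weights: one large and one small per layer.
betas = [3.0, 0.1]
alphas = [0.2, -2.0]
print(weight_decay(betas, alphas))        # dominated by the large weights
print(weight_elimination(betas, alphas))  # every term stays below 1
```

Note how a weight of 100 contributes 10,000 to weight decay but barely 1 to weight elimination; that bounded contribution is why weight elimination preferentially drives small weights toward zero.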
### Effects of weight decay

(Figure: decision boundaries for two neural networks with 10 hidden units on the same data, without and with weight decay.)

| | No weight decay | Weight decay = 0.02 |
| --- | --- | --- |
| Training error | 0.100 | 0.160 |
| Test error | 0.259 | 0.223 |
| Bayes error | 0.210 | 0.210 |

### Hidden units

The number of hidden units governs the complexity of the decision boundary (the degrees of freedom). (Figure: training and test error versus the total number of weights; the test curve sits above the training curve.)

### Selecting learning rates

(Figure: behavior of gradient descent for learning rates below the optimum, between the optimum and twice the optimum, and above twice the optimum.)

### Comparing classifiers on the example data

(Figures: decision boundaries on the TrainC and ExamC data sets for neural networks of various sizes and decay values, alongside k-nearest neighbors with k = 11; also logistic regression, i.e. a network of size 0, versus a two-layer network with decay 0.01 on TrainC.)

### Neural networks in R: `nnet`

```r
library(MASS)  # need this too
library(nnet)

nnet(x, y, weights, size, Wts, mask,
     linout = FALSE, entropy = FALSE, softmax = FALSE,
     censored = FALSE, skip = FALSE, rang = 0.7, decay = 0,
     maxit = 100, Hess = FALSE, trace = TRUE, MaxNWts = 1000,
     abstol = 1.0e-4, reltol = 1.0e-8)
```

- `size`: number of units in the hidden layer; can be zero if there are skip-layer units.
- `linout`: switch for linear output units; the default is logistic output units.
- `entropy`: switch for entropy (maximum conditional likelihood) fitting; the default is least squares.
- `softmax`: switch for softmax (log-linear model) and maximum conditional likelihood fitting. `linout`, `entropy`, `softmax`, and `censored` are mutually exclusive.
- `skip`: switch to add skip-layer connections from input to output.

### NNets in R: `ex5.1.R`

```r
TrainC <- read.table("trainC.dat")
names(TrainC) <- c("X1", "X2", "y")
p   <- as.matrix(TrainC[, -3])
tp  <- TrainC$y
tpi <- class.ind(tp)
xp  <- seq(min(TrainC$X1), max(TrainC$X1), length = 50)
np  <- length(xp)
yp  <- seq(min(TrainC$X2), max(TrainC$X2), length = 50)
pt  <- expand.grid(X1 = xp, X2 = yp)
set.seed(1)
Znn  <- nnet(p, tpi, skip = T, softmax = T, size = Size,
             decay = Decay, maxit = 1000)
Znnt <- predict(Znn, pt)
zpnn <- Znnt[, 1] - Znnt[, 2]
```
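Under the hood, the model that `nnet` fits is just the forward pass defined at the start of these notes. Below is a minimal, self-contained sketch (in pure Python rather than the course's R, with made-up weights and inputs): sigmoid hidden units, softmax outputs, and the cross-entropy error for one observation.

```python
import math

def sigmoid(v):
    # Activation: sigma(v) = 1 / (1 + e^-v)
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, alpha0, alpha, beta0, beta):
    """One forward pass: Z_m = sigma(alpha_0m + alpha_m . x),
    T_k = beta_0k + beta_k . Z, f_k(x) = softmax(T)_k."""
    z = [sigmoid(a0 + sum(a * xi for a, xi in zip(am, x)))
         for a0, am in zip(alpha0, alpha)]
    t = [b0 + sum(b * zm for b, zm in zip(bk, z))
         for b0, bk in zip(beta0, beta)]
    mx = max(t)                              # shift for numerical stability
    e = [math.exp(tk - mx) for tk in t]
    s = sum(e)
    return [ek / s for ek in e]

def cross_entropy(y, f):
    # R_i = -sum_k y_ik log f_k(x_i) for one observation
    return -sum(yk * math.log(fk) for yk, fk in zip(y, f))

# Toy dimensions: p = 2 inputs, M = 2 hidden units, K = 2 classes.
x = [1.0, -0.5]
alpha0, alpha = [0.1, -0.2], [[0.5, -0.3], [0.8, 0.1]]
beta0, beta = [0.0, 0.0], [[1.0, -1.0], [-1.0, 1.0]]
f = forward(x, alpha0, alpha, beta0, beta)
print(sum(f))                   # softmax outputs sum to 1
print(cross_entropy([1, 0], f))
```

Back-propagation then just applies the chain rule through these two layers; `nnet` does the equivalent fitting (by quasi-Newton optimization rather than plain gradient descent).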
### Neural nets in the news

**Aqueous Solubility Prediction of Drugs Based on Molecular Topology and Neural Network Modeling.** Jarmo Huuskonen, Marja Salo, and Jyrki Taskinen (Division of Pharmaceutical Chemistry, Department of Pharmacy, University of Helsinki, Finland). *J. Chem. Inf. Comput. Sci.* 1998, 38, 450–456.

> A method for predicting the aqueous solubility of drug compounds was developed based on topological indices and artificial neural network (ANN) modeling. The aqueous solubility values for 211 drugs and related compounds, representing acidic, neutral, and basic drugs of different structural classes, were collected from the literature. The data set was divided into a training set (n = 160) and a randomly chosen test set (n = 51). Structural parameters used as inputs in the 23-input artificial neural network included 14 atom-type electrotopological indices and nine other topological indices. For the test set, a predictive r² = 0.86 and s = 0.53 log units were achieved.
**Neural Network Modeling for Estimation of Partition Coefficient Based on Atom-Type Electrotopological State Indices.** Jarmo J. Huuskonen, David J. Livingstone, and Igor V. Tetko (University of Helsinki; ChemQuest, Isle of Wight; University of Portsmouth; Université de Lausanne; Institute of Bioorganic & Petroleum Chemistry, Kiev). *J. Chem. Inf. Comput. Sci.* 2000, 40, 947–955.

> A method for predicting log P values for a diverse set of 1870 organic molecules has been developed, based on atom-type electrotopological state (E-state) indices and neural network modeling. An extended set of E-state indices, which included specific indices with a more detailed description of amino, carbonyl, and hydroxy groups, was used in the current study. For the training set of 1754 molecules, the squared correlation coefficient and root-mean-squared error were r² = 0.90 and RMS(LOO) = 0.46, respectively; the results from multilinear regression analysis were r² = 0.87 and RMS(LOO) = 0.55. Structural parameters, which included molecular weight and 38 atom-type E-state indices, were used as the inputs in 39-5-1 artificial neural networks. For a test set of 35 nucleosides, 12 nucleoside bases, 19 drug compounds, and 50 general organic compounds (n = 116, not included in the training set), a predictive r² = 0.94 and RMS = 0.41 were calculated by the artificial neural networks; the results for the same set by multilinear regression were r² = 0.86 and RMS = 0.72. The improved prediction ability of artificial neural networks can be attributed to the nonlinear properties of this method, which allowed the detection of high-order relationships between E-state indices and the n-octanol/water partition coefficient. The present approach was found to be an accurate and fast method that can be used for the reliable estimation of log P values for even the most complex structures.
**Neural Network Studies. 1. Comparison of Overfitting and Overtraining.** Igor V. Tetko, David J. Livingstone, and Alexander I. Luik (Institute of Bioorganic and Petroleum Chemistry, Kiev; ChemQuest; University of Portsmouth). *J. Chem. Inf. Comput. Sci.* 1995, 35, 826–833.

> The application of feed-forward back-propagation artificial neural networks (ANN) with one hidden layer to perform the equivalent of multiple linear regression (MLR) has been examined using artificial structured data sets and real literature data. The predictive ability of the networks has been estimated using a training/test set protocol. The results have shown advantages of ANN over MLR analysis: the ANNs do not require high-order terms or indicator variables to establish complex structure-activity relationships. Overfitting does not have any influence on network prediction ability when overtraining is avoided by cross-validation. Application of ANN ensembles has allowed the avoidance of chance correlations, and satisfactory predictions of new data have been obtained for a wide range of numbers of neurons in the hidden layer.

### Validation

- Ideally, we would set aside some of the data and use it as a validation set.
- There is usually not enough data to do this.
- Finesse the problem with "K-fold" validation; bootstrap estimation can also be used.

### K-fold validation

Divide the data into K parts (K = 5, 10, etc.). For each part in turn, fit the model on the other K − 1 parts and predict the held-out part. The cross-validation estimate of prediction error is

$$\mathrm{CV} = \frac{1}{N} \sum_{i=1}^{N} L\big(y_i, \hat{f}^{-\kappa(i)}(x_i)\big)$$

where $\kappa(i)$ is the fold containing observation $i$ and $\hat{f}^{-\kappa(i)}$ is the model fit with that fold removed. Taking $K = N$ gives leave-one-out (LOO) cross-validation.

(Figure: prediction error estimated by CV; the 10-fold CV estimate is plotted against the actual error as a function of subset size p.)

(Closing figures: a schematic labeled "This is a neural network" and one labeled "... and this is a support vector machine.")
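The K-fold loop above is easy to sketch. Below is a minimal pure-Python version in which a hypothetical stand-in "model" simply predicts the mean of its training targets under squared-error loss; the data are made-up numbers, and the point is only how the folds and the CV average are assembled.

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    folds, start = [], 0
    for j in range(k):
        size = n // k + (1 if j < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_error(y, k):
    """CV = (1/N) sum_i L(y_i, f^{-kappa(i)}(x_i)), where the stand-in
    model f predicts the mean of the targets it was trained on and
    L is squared-error loss."""
    n, total = len(y), 0.0
    for fold in kfold_indices(n, k):
        held_out = set(fold)
        train = [y[i] for i in range(n) if i not in held_out]
        pred = sum(train) / len(train)          # "fit" on the other K-1 folds
        total += sum((y[i] - pred) ** 2 for i in fold)
    return total / n

y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # made-up targets
print(cv_error(y, k=3))              # 3-fold CV estimate
print(cv_error(y, k=len(y)))         # K = N gives leave-one-out (LOO)
```

Swapping the mean predictor for a real fit (e.g. a small `nnet` model refit on each set of K − 1 folds) gives the CV curve used to choose the decay parameter λ in the slides.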
