# 663 Class Note for STAT 59800 with Professor Neville at Purdue

This 8 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at Purdue University taught by a professor in Fall. Since its upload, it has received 21 views.

Date Created: 02/06/15
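Data Mining, CS 57300 / STAT 59800-024, Purdue University, January 22, 2009

## Measurement and Data

### Measurement

- Measurement maps the real world to data, so that a relationship in the real world corresponds to a relationship in the data.
- Goal: map domain entities to symbolic representations.

### Hierarchy of measurements

[Figure: hierarchy of measurement scales. The only recoverable label is "Ratio (absolute zero)".]

### Naming conventions

[Figure: a data table labeling features and their values, with example values 24, 28, 32 and N, Y, Y.]

### Distance measures

- Many data mining techniques use similarity/dissimilarity measures to characterize relationships between instances:
  - Nearest neighbor classification
  - Cluster analysis
- Proximity: general term covering both similarity and dissimilarity
- Distance: dissimilarity only

### Metric properties

A metric $d(i,j)$ is a dissimilarity measure that satisfies the following properties:

- $d(i,j) \ge 0$ for all $i, j$, and $d(i,j) = 0$ if and only if $i = j$
- $d(i,j) = d(j,i)$ for all $i, j$
- $d(i,j) \le d(i,k) + d(k,j)$ for all $i, j, k$

### Distance metrics

Here $x_k(i)$ is the value of variable $k$ for instance $i$, and there are $p$ variables.

- Manhattan distance ($L_1$):

$$d_M(i,j) = \sum_{k=1}^{p} \left| x_k(i) - x_k(j) \right|$$

- Euclidean distance ($L_2$):

$$d_E(i,j) = \left( \sum_{k=1}^{p} \big( x_k(i) - x_k(j) \big)^2 \right)^{1/2}$$

  - Most common metric
  - Assumes variables are commensurate

### Standardization

- Normalization
  - Removes the effect of scale
  - Divide each variable by its standard deviation
  - Makes all variables equally weighted

$$\mu_k = \frac{1}{n} \sum_{i=1}^{n} x_k(i), \qquad \sigma_k = \left( \frac{1}{n} \sum_{i=1}^{n} \big( x_k(i) - \mu_k \big)^2 \right)^{1/2}, \qquad x_k' = \frac{x_k}{\sigma_k}$$

- Weighted Euclidean distance
  - Can weight variables by relative importance

$$d_{WE}(i,j) = \left( \sum_{k=1}^{p} w_k \big( x_k(i) - x_k(j) \big)^2 \right)^{1/2}$$

### Correlation among variables

- In the distances above, variables contribute independently to an additive measure of distance.
- This may not be appropriate if variables are highly correlated.
- Can standardize variables in a way that accounts for covariance.

[Figure: example with variables Diameter, Height 1, Height 2, ..., Height 100.]

### Covariance

- Measures how variables $X$ and $Y$ vary together:

$$\mathrm{Cov}(X,Y) = \frac{1}{n} \sum_{i=1}^{n} \big( x(i) - \bar{x} \big)\big( y(i) - \bar{y} \big)$$

- Positive if large values of $X$ are associated with large values of $Y$
- Negative if large values of $X$ are associated with small values of $Y$
- Covariance matrix: symmetric matrix of covariances for $p$ variables
- Measures linear relationship

### Correlation coefficient

- Covariance depends on the ranges of $X$ and $Y$.
- Standardize by dividing through by the standard deviations:

$$\rho(X,Y) = \frac{\sum_{i=1}^{n} \big( x(i) - \bar{x} \big)\big( y(i) - \bar{y} \big)}{\left( \sum_{i=1}^{n} \big( x(i) - \bar{x} \big)^2 \, \sum_{i=1}^{n} \big( y(i) - \bar{y} \big)^2 \right)^{1/2}}$$

- Can also form a $p \times p$ correlation matrix.

### Mahalanobis distance

With $\mathbf{x}(i)$ the vector of all $p$ variables for instance $i$ and $\Sigma$ the $p \times p$ covariance matrix:

$$d_{MH}(i,j) = \left( \big( \mathbf{x}(i) - \mathbf{x}(j) \big)^{T} \, \Sigma^{-1} \, \big( \mathbf{x}(i) - \mathbf{x}(j) \big) \right)^{1/2}$$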
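As an illustration of the Manhattan, Euclidean, standardized, weighted, and Mahalanobis distances from this lecture, here is a minimal Python sketch. The function names and the toy data are hypothetical and not from the notes. It assumes the 1/n convention for variance and covariance, and it restricts the Mahalanobis case to two variables so the covariance matrix can be inverted by hand.

```python
import math

def manhattan(x, y):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """L2 distance: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def weighted_euclidean(x, y, w):
    """Euclidean distance with a non-negative weight w[k] per variable."""
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)))

def column_stats(rows):
    """Per-variable mean and standard deviation (1/n convention, assumed)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    stds = [math.sqrt(sum((r[k] - means[k]) ** 2 for r in rows) / n)
            for k in range(p)]
    return means, stds

def covariance_2d(rows):
    """2x2 covariance matrix (1/n convention) for two-variable data."""
    n = len(rows)
    (mx, my), _ = column_stats(rows)
    sxx = sum((r[0] - mx) ** 2 for r in rows) / n
    syy = sum((r[1] - my) ** 2 for r in rows) / n
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / n
    return [[sxx, sxy], [sxy, syy]]

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance for two variables, using the closed-form
    inverse of a 2x2 matrix: inv([[a, b], [c, d]]) = [[d, -b], [-c, a]] / det."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - y[0], x[1] - y[1]
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quad)

# Hypothetical toy data: five instances, two positively correlated
# variables measured on very different scales.
rows = [(1.0, 100.0), (2.0, 300.0), (3.0, 200.0), (4.0, 500.0), (5.0, 400.0)]
_, stds = column_stats(rows)
cov = covariance_2d(rows)

p, q = rows[0], rows[4]
d_l1 = manhattan(p, q)
d_l2 = euclidean(p, q)                       # dominated by the second variable
d_std = euclidean([v / s for v, s in zip(p, stds)],
                  [v / s for v, s in zip(q, stds)])
d_we = weighted_euclidean(p, q, [1.0 / s ** 2 for s in stds])
d_mh = mahalanobis_2d(p, q, cov)

print(d_l1, d_l2, d_std, d_we, d_mh)
```

In this sketch, dividing each variable by its standard deviation gives the same result as a weighted Euclidean distance with weights $1/\sigma_k^2$, and the Mahalanobis distance reduces to the Euclidean distance when the covariance matrix is the identity.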
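Properties of the Mahalanobis distance:

- Automatically accounts for scaling
- Corrects for correlation between attributes
- Tradeoff:
  - Covariance matrix can be hard to estimate accurately
  - Memory and time complexity is quadratic rather than linear

### Distance measures for binary data

For two binary vectors $i$ and $j$, count the positions by the pair of values they hold:

|         | $j = 1$  | $j = 0$  |
|---------|----------|----------|
| $i = 1$ | $n_{11}$ | $n_{10}$ |
| $i = 0$ | $n_{01}$ | $n_{00}$ |

- Matching coefficient: the fraction of matching bits, i.e. one minus the Hamming distance normalized by the number of bits

$$MC = \frac{n_{11} + n_{00}}{n_{11} + n_{10} + n_{01} + n_{00}}$$

- Jaccard coefficient: used if we don't care about matches on zeros

$$J = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$$

### Introduction to R

### Next Class

- Reading: Chapter 3, PDM
- Topic: Visualization and exploration of data
- Homework: HW1 assigned, available on class website. Due Jan 29, start of class.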
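Returning to the binary-data measures from this lecture, the matching and Jaccard coefficients can be sketched in a few lines of Python. The function names and example vectors are hypothetical. Returning 0.0 for the Jaccard coefficient when neither vector contains a 1 is a convention assumed here, not something the notes specify.

```python
def binary_counts(x, y):
    """Return (n11, n10, n01, n00) for two equal-length 0/1 sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(x, y):
        if a == 1 and b == 1:
            n11 += 1
        elif a == 1 and b == 0:
            n10 += 1
        elif a == 0 and b == 1:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00

def matching_coefficient(x, y):
    """Fraction of positions where the two vectors agree (on 1s or 0s)."""
    n11, n10, n01, n00 = binary_counts(x, y)
    return (n11 + n00) / (n11 + n10 + n01 + n00)

def jaccard_coefficient(x, y):
    """Like the matching coefficient, but ignoring 0-0 matches."""
    n11, n10, n01, _ = binary_counts(x, y)
    denom = n11 + n10 + n01
    # Assumed convention: two all-zero vectors get a coefficient of 0.0.
    return n11 / denom if denom else 0.0

def normalized_hamming(x, y):
    """Number of differing positions divided by the vector length."""
    _, n10, n01, _ = binary_counts(x, y)
    return (n10 + n01) / len(x)

# Hypothetical sparse vectors: mostly zeros, one shared 1.
u = [1, 0, 0, 1, 0, 0, 0, 0]
v = [1, 1, 0, 0, 0, 0, 0, 0]

print(matching_coefficient(u, v))   # 6 of 8 positions agree
print(jaccard_coefficient(u, v))    # 1 shared 1 out of 3 positions with any 1
print(normalized_hamming(u, v))     # 2 of 8 positions differ
```

On sparse vectors like these, the matching coefficient is high mainly because of shared zeros, while the Jaccard coefficient only rewards shared ones.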
