# 573 Class Note for STAT 59800 with Professor Neville at Purdue

This 19-page set of class notes was uploaded by an elite notetaker on Friday, February 6, 2015. The notes belong to a course at Purdue University taught in the Fall. Since their upload, they have received 22 views.



Data Mining, STAT 59800-024, Purdue University, March 31, 2009. Descriptive modeling: Bayes nets (many slides borrowed from Lise Getoor).

## Density estimation

The joint distribution can be used to answer any question about the relationships among subsets of the variables.

- Goal: determine a compact representation of the full joint distribution $P(\mathbf{X}) = P(X_1, X_2, \ldots, X_p)$.
- Approach: model the joint probability in terms of local conditional probability distributions (CPDs).
- If the dependencies have a sparse factorization, then the CPDs can be specified compactly and the full joint representation will be linear, $O(p)$.

## Bayesian networks

A Bayesian network specifies a joint probability distribution via two components:

- a directed acyclic graph (DAG) $G$, and
- a collection of conditional probability distributions $P(X_i \mid \mathrm{Pa}_i)$.

The joint distribution $P$ is defined by the factorization

$$P(X_1, \ldots, X_p) = \prod_{i=1}^{p} P(X_i \mid \mathrm{Pa}_i)$$

Additional requirement: $G$ is a minimal I-map of $P$.

## Factorization example

For the classic burglary/earthquake network (Burglary $B$, Earthquake $E$, Radio $R$, Alarm $A$, Call $C$):

$$P(C, A, R, E, B) = P(B)\,P(E)\,P(R \mid E)\,P(A \mid E, B)\,P(C \mid A)$$

versus the unrestricted chain-rule expansion

$$P(C, A, R, E, B) = P(B)\,P(E \mid B)\,P(R \mid E, B)\,P(A \mid R, E, B)\,P(C \mid A, R, E, B)$$

## Conditional independence

- Each random variable $X$ is independent of its non-descendants given its parents $\mathrm{Pa}(X)$.
- Each random variable $X$ is independent of the rest of the network given its Markov blanket.

## Markov assumptions

A directed acyclic graph implies a set of Markov assumptions through its graphical structure. A DAG $G$ is an I-map of a distribution $P$ if all the Markov assumptions implied by $G$ are satisfied by $P$. In the example above:

- $I(E; B)$
- $I(B; E, R)$
- $I(R; A, B, C \mid E)$
- $I(A; R \mid B, E)$
- $I(C; B, E, R \mid A)$

## Factorization theorem

**Theorem.** If $G$ is an I-map of $P$, then $P(X_1, \ldots, X_p) = \prod_i P(X_i \mid \mathrm{Pa}(X_i))$.

**Proof.** By the chain rule, $P(X_1, \ldots, X_p) = \prod_i P(X_i \mid X_1, \ldots, X_{i-1})$. Without loss of generality, $X_1, \ldots, X_p$ is an ordering consistent with $G$. By assumption, $\mathrm{Pa}(X_i) \subseteq \{X_1, \ldots, X_{i-1}\}$ and $\{X_1, \ldots, X_{i-1}\} \setminus \mathrm{Pa}(X_i) \subseteq \mathrm{NonDesc}(X_i)$. Since $G$ is an I-map, $I(X_i; \mathrm{NonDesc}(X_i) \mid \mathrm{Pa}(X_i))$ holds, hence $P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid \mathrm{Pa}(X_i))$. We conclude $P(X_1, \ldots, X_p) = \prod_i P(X_i \mid \mathrm{Pa}(X_i))$.

## Consequences

- We can write $P$ in terms of local conditional probabilities.
- If $G$ is sparse, then the representation of $P$ is compact.
- But I-maps do not imply a sparse graph structure; e.g., a complete graph represents any distribution $P$.
- Thus we want a minimal I-map: one in which removing any edge from $G$ would impose a conditional independence that does not hold in $P$.

## Minimal I-maps

(Figure: a minimal I-map alongside modified graphs that are not I-maps; images not recoverable.)

## Conditional probability distributions

- The other component of a Bayesian network is the specification of the conditional probability distributions (CPDs).
- We will focus on the simplest representation for CPDs: tabular CPDs.
- Other representations: decision trees, noisy-or gates, context-specific independence.

## Tabular CPDs

When the variables of interest are all discrete, the most common representation is a table.

- Strengths: can be easily stored and manipulated; very flexible.
- Weakness: size grows exponentially with the number of parents.

(Example table of $P(C \mid A, B)$ omitted; values not recoverable.)

## Examples

- River pollution diagnosis (figure not recoverable).
- Estimating auto insurance risk (figure not recoverable).
- Car diagnosis: initial evidence and testable variables (figure not recoverable).

## Known structure

- Given a network structure $G$ and a choice of parametric family for $P(X_i \mid \mathrm{Pa}_i)$, learn the parameters of the model.
- Goal: construct a network that is closest to the probability distribution that generated the data.
- Use maximum likelihood or MAP estimation.
- The likelihood decomposes according to the structure of the network.

## Likelihood decomposition

$$L(\Theta : D) = \prod_{m} P(x_1[m], \ldots, x_n[m] : \Theta) = \prod_{i} \prod_{m} P(x_i[m] \mid \mathrm{Pa}_i[m] : \Theta_i) = \prod_{i} L_i(\Theta_i : D)$$

The decomposition implies independent estimation problems, one per CPD.

## Unknown structure

- The set of nodes (variables) is specified.
- The inducer needs to estimate the edges and the parameters from data, e.g., records such as (Y, N, N), (Y, Y, Y), (N, N, Y), (N, Y, Y).

## Approaches to learning structure: constraint-based

- Performs tests of conditional independence.
- Searches for a network that is consistent with the observed dependencies and independencies.
- Strengths: intuitive; separates structure learning from the form of the independence tests.
- Weaknesses: sensitive to errors in individual tests.

## Approaches to learning structure: score-based

- Define a score that evaluates how well the independencies of a given structure match the observations.
- Search for a structure that maximizes the score.
- Strengths: statistically motivated; can make compromises between fitting the data and model complexity; takes the structure of the conditional probabilities into account.
- Weaknesses: computationally difficult.

## Score function

Using Bayes' rule:

$$P(G \mid D) = \frac{P(D \mid G)\,P(G)}{P(D)}$$

$P(D)$ is the same for all structures $G$, so it can be ignored when comparing structures.

## Optimization problem

- Input: training data; a scoring function (including priors, if needed); a set of possible structures (including prior knowledge about structures).
- Output: a network that maximizes the score.
- Key property: the score of a network is a sum of per-family terms, due to the decomposable nature of the model.

## Learning structure

- Theorem: finding a maximal-scoring network structure with at most $k$ parents for each variable is NP-hard for $k > 1$.
- This problem is addressed by using heuristic search.
- Model space: all networks over the given nodes (the set of possible edges).
- Search operators: add edge, delete edge, reverse edge.
- Evaluation: penalized likelihood.
- Search techniques: greedy hill-climbing, best-first search, simulated annealing.

## Exploiting decomposability

Caching: to update the score after a local change, we only need to re-score the families that were changed in the last move.

## Greedy hill-climbing

- Start with a given network: the empty network, the best tree, or a random network.
- At each iteration, evaluate all candidate single-edge changes and apply the change that leads to the best improvement in score.
- Iterate; stop when no modification improves the score.

## Possible pitfalls

Greedy hill-climbing can get stuck in:

- Local maxima: all one-edge changes reduce the score.
- Plateaus: some one-edge changes leave the score unchanged (equivalent networks are neighbors in the search space and receive the same score).

Standard heuristics can escape both: random restarts, TABU search.

## Known structure, incomplete data

Use EM to infer the missing values and estimate the parameters simultaneously.

## Inference

- Bayesian networks and Markov networks are compact representations of probability distributions.
- Each network describes a unique probability distribution $P$.
- How do we answer queries about $P$? We use *inference* to refer to the process of computing answers to such queries.

## Queries

There are many types of queries we might ask, and most of them involve evidence. Evidence $e$ is an assignment of values to a set $E$ of variables in the domain. The simplest query computes the probability of observing the evidence:

$$P(e) = \sum_{x_1} \cdots \sum_{x_k} P(x_1, \ldots, x_k, e)$$

where $X_1, \ldots, X_k$ are the variables outside $E$.

We may also be interested in the conditional probability of a variable given the evidence:

$$P(X \mid e) = \frac{P(X, e)}{P(e)}$$

This is used for:

- Prediction: what is the probability of an outcome given the starting condition? (The target is a descendant of the evidence.)
- Diagnosis: what is the probability of a disease given the symptoms? (The target is an ancestor of the evidence.)

The direction of the edges between variables does not restrict the direction of the queries.

## General inference

- Theorem: computing $P(X = x)$ in a Bayesian network is NP-hard.
- This implies that we cannot find a general inference procedure that works efficiently for all networks. For particular families of networks (e.g., polytrees), we have provably efficient inference procedures.
- Exact inference methods: variable elimination, clique tree propagation.
- Approximate inference methods: Markov chain Monte Carlo, variational methods, loopy belief propagation.

## Summary

- Graphical models are used for both description and prediction.
- A set of conditional independencies describes the data.
- To use one as a descriptive model, we need to learn the model structure.
- Optimal structure learning is NP-hard, but we can use heuristic search.
- How should we evaluate descriptive models?

## Next class

- Topics: DM evaluation.
- Due: Homework 4.
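As a concrete illustration of the network factorization $P(C,A,R,E,B) = P(B)\,P(E)\,P(R \mid E)\,P(A \mid E,B)\,P(C \mid A)$ discussed in these notes, the sketch below encodes a small burglary/earthquake-style network as plain Python dictionaries and computes the joint as the product of the local CPDs. All probability values are invented for illustration and are not from the original slides.

```python
from itertools import product

# Hypothetical CPD values; outer keys of P_R_given_E index the parent E.
P_B = {True: 0.01, False: 0.99}                      # P(B)
P_E = {True: 0.02, False: 0.98}                      # P(E)
P_R_given_E = {True: {True: 0.9, False: 0.1},        # P(R | E)
               False: {True: 0.05, False: 0.95}}
P_A_given_EB = {(True, True): 0.95, (True, False): 0.3,   # P(A=True | E, B)
                (False, True): 0.9, (False, False): 0.01}
P_C_given_A = {True: 0.8, False: 0.05}               # P(C=True | A)

def joint(c, a, r, e, b):
    """Joint probability via the network factorization (product of CPDs)."""
    p_a = P_A_given_EB[(e, b)] if a else 1 - P_A_given_EB[(e, b)]
    p_c = P_C_given_A[a] if c else 1 - P_C_given_A[a]
    return P_B[b] * P_E[e] * P_R_given_E[e][r] * p_a * p_c

# Sanity check: the factorized joint must sum to 1 over all 2^5 assignments.
total = sum(joint(*assign) for assign in product([True, False], repeat=5))
```

Note that the full joint here is specified by 10 independent numbers instead of the $2^5 - 1 = 31$ a raw table would need, which is exactly the compactness argument made above.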
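The likelihood decomposition above means each tabular CPD can be fit independently by simple counting. Below is a minimal sketch of that maximum-likelihood estimate; the `mle_cpd` helper and the toy records are assumptions for illustration, not part of the original notes.

```python
from collections import Counter

def mle_cpd(data, child, parents):
    """Estimate the tabular CPD P(child | parents) from dict-records by counting."""
    joint_counts = Counter()    # counts of (child value, parent assignment)
    parent_counts = Counter()   # counts of each parent assignment
    for record in data:
        pa = tuple(record[p] for p in parents)
        joint_counts[(record[child], pa)] += 1
        parent_counts[pa] += 1
    # MLE: relative frequency of the child value within each parent assignment.
    return {(x, pa): n / parent_counts[pa] for (x, pa), n in joint_counts.items()}

# Toy data set over two binary variables, A -> B.
data = [{"A": 1, "B": 0}, {"A": 1, "B": 1}, {"A": 0, "B": 0}, {"A": 1, "B": 1}]
cpd = mle_cpd(data, child="B", parents=["A"])
# cpd[(1, (1,))] is the estimate of P(B=1 | A=1)
```

Because the likelihood factors into one term per family, running this counting procedure once per node recovers all parameters of the network; no joint optimization is needed.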
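The greedy hill-climbing loop described above can be sketched abstractly. In real structure learning the states would be DAGs and the neighbors would be the add/delete/reverse-edge moves scored by penalized likelihood; here a toy one-dimensional score stands in so the loop itself is runnable. The `hill_climb` helper is hypothetical.

```python
def hill_climb(start, neighbors, score):
    """Repeatedly move to the best-scoring neighbor; stop at a local maximum."""
    current = start
    while True:
        candidates = neighbors(current)
        if not candidates:
            return current
        best = max(candidates, key=score)
        if score(best) <= score(current):   # no single move improves: stop
            return current
        current = best

# Toy example: maximize -(x - 3)^2 over the integers with +/-1 moves.
result = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)
```

This also makes the pitfalls visible: the loop terminates at the first state no single move improves, which on a multimodal score may be a local maximum rather than the global one, motivating random restarts and TABU search.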
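The query definitions above, $P(e)$ by summing out the non-evidence variables and $P(X \mid e) = P(X, e) / P(e)$, can be illustrated by brute-force enumeration on a tiny two-variable network. The Rain/WetGrass network and its numbers are invented for illustration; enumeration is exponential in general, which is why the exact and approximate methods listed above exist.

```python
# Hypothetical two-variable network: Rain -> WetGrass.
P_rain = {True: 0.2, False: 0.8}             # P(Rain)
P_wet_given_rain = {True: 0.9, False: 0.1}   # P(Wet=True | Rain)

def joint(rain, wet):
    """Joint probability from the factorization P(Rain) * P(Wet | Rain)."""
    p_wet = P_wet_given_rain[rain] if wet else 1 - P_wet_given_rain[rain]
    return P_rain[rain] * p_wet

def query_rain_given_wet(wet):
    """Diagnostic query P(Rain=True | Wet=wet): sum out Rain for P(e), then divide."""
    p_evidence = sum(joint(r, wet) for r in (True, False))   # P(e)
    return joint(True, wet) / p_evidence                     # P(X, e) / P(e)

posterior = query_rain_given_wet(True)
```

This is a diagnostic query in the sense above: the target (Rain) is an ancestor of the evidence (WetGrass), yet the edge direction poses no obstacle to asking it.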
