Popular in Course
verified elite notetaker
Popular in Statistics
This 14 page Class Notes was uploaded by Miss Sabina Grimes on Monday October 19, 2015. The Class Notes belongs to STAT 631 at Rice University taught by Volkan Cevher in Fall. Since its upload, it has received 18 views. For similar materials see /class/225036/stat-631-rice-university in Statistics at Rice University.
Reviews for GRAPHICAL MODELS
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/19/15
JUNCTION TREE ALGORITHM Tuesday September 9 2008 Rice University STAT 631 ELEC 639 Graphical Models Scribe David KAHLE Instructor Terrance SAVITSKY Dri Volkan CEVHER Stephen SCHNELLE 1 INTRODUCTION In the previous scribes we introduced general objects used in graphical models as well message passing schemes such as the sumproduct algorithm and the maxsum algorithm for efficient computation of local marginals over subsets of the graph as well as the most probable state of the graph For graphs which are trees we noted that the algorithms are exact In this article we discuss a more exible tool known as the junction tree algorithm which is used to compute local marginals of subsets of arbitrary graphs and yet still retains the property of being exacti The article draws largely from the exposition contained in Bishop2 Lauritzen4 Barberl and Wainwright and Jordan5I 2 THE JUNCTION TREE ALGORITHM The junction tree algorithm comprises 7 steps listed below which are expounded in the 7 subsections of this section Moralize directed graphs only Triangulate Form the junction treei Assign the potentials to the junction tree cliques and initialize the separator potentials to unity 5 Select an arbitrary root clique 6 Carry out message passing with absorption to and from the root clique until updates passed along both directions of every link on the junction tree 7 Read off the clique marginal potentials from the junction tree AAAA hm co to H VVVV 21 Moralizing the graph Directed graphs only For uniform applicability directed graphs require the additional step called moralization in order to be converted into an undirected graphi Thus to perform the junction tree algorithm on an already undirected graph one proceeds directly to step The procedure described in this section is only necessary if we begin with a directed graph Moralization the transformation of the directed graph C7 N5 55 to an undi rected graph Q N959 plays an important role in the junction tree algorithmi lnformally moralizing entails adding edges between parents of nodes and dropping the directions More formally the transformation from C7 to Q requires the addition to 597 of two sets 5949 and 55H The former joins the parents and the latter drops the directions by adding edges in the reverse direction They are properly de ned 59449 I XiXj E N2 HXk E N 3 XiXk E 55 and XjXk E 55 8 5ng XjXk E N2 XjXk 85 but XkXj E 894 I 2 Thus the moralization step is simply 55 U 5549 U 55H and the moralized graph is 1 Q1 Ng g U g Hg Example 1 Suppose we are presented with the directed graph Q Ng g W ere Na ABCDEFG 5 I A707A7D737D707FDyF7DyG7EyG LQL A pictorial representation of Q is contained in Figure 1 To moralize the graph we join the parents and remove all directions Here A and B are both parents of D so the edges AB and BA must be added to 55 in the moralization stepi Similarly C D D C D E E D all need to be added All of these are part of the joining step that is in 5549 Thus Eg g A B B A C D D C D E E D All that is left to do is remove the directions This is done by adding all the re verses of the edges which aren7t in the graph For example A C E 55 but C A 597 so C A needs to be added to 55 Similarly DA DB F C FD G D and G E should be added The entire set is therefore simply the mirror image of 5 a 85H C A D A DB F C F D GD GEi The undirected graph resulting from the moralization we simply call Q1 2 Q1 Ng gU gg g U where Ng I ABCDEFG 3 59 1 1470 AD 19le CF DvFl D G E G 143 EAL 071 D7 0 DyEL EyDL 071407 DyAL DBL F7 0 FyDL 071 97113 The undirected graph Q made from the moralization of Q is displayed in Figure 2 with the edges added by moralization in dashed blue 22 Triangulating the graph The second step rst for already undirected graphs is triangulationi To begin we need to have a notion of a triangulated graphi An undirected graph Q is said to be thangulated if and only if for every cycle of length n 2 4 possesses a chord Thus in triangulating the undirected graph Q we are adding edges so that the new graph QA is triangulatedi3 The 1This is instead of introducing new notation which makes explicit that it came from moralizing Q Wherever the distinction is ambiguous we will use the notation Qm 2Recall that a wcycle is a path with the same beginning and end a chord is an edge joining a pair of nonconsecutive Vertices in the cycle Also recall that by convention the counting of cycle length begins with 0 as opposed to 1 3In eneral we will reserve the superscript A on an undirected graph Q to emphasize that Q is triangulated FIGURE 2 Undirected graph Q in Example 1 resulting from the moralization of Q with moralization edges in dashed blue process is not unique as is demonstrated in Example 2 but the general idea is to continue the triangulation procedure adding edges until we obtain a triangulated graph and stop at that step Example 2 Consider the undirected graph Q N9 59 de ned by N9 7 AB 0DEgt 59 AyB7Byc7CyD7DyE7E7A 37A0737D707E7D714719 pictured in Figure 3 Any cycle for example A 7 B 7 C 7 D 7 E 7 A is of length 5 and possesses no chordsi Thus to begin we will add the undirected edge edges to speak precisely A C and C A to fgi The result is the graph Q0 Ngfg1 where 5glt1gt 4be370707D7D7E7E7A7A70 3714b073M177C7E7D7A7E707A The superscript 1 is used to denote that this graph is the result of the addition of one undirected edge the point is that we no longer have Q but a manipulated version of it For our discussion the number itself is not of any interest Notice that only the edge set 5g changes in each step of the triangulation process The picture on1 is contained in Figure 4 with the triangulation undirected edge in dashed redi Upon inspection the new graph 91 is still not a triangulated graph since it exhibits for example the 4cycle A 7 C 7 D 7 E 7 so we must continue adding edgesi Thus we will add the undirected edge A D and DAi As before the result is 92 Ng 592 where 592 7 Ath 37 0 07D RE E A A7 0 AD BACBDCEDAECADAi However unlike its predecessors 92 is triangulatedi Thus we set GA 92 completing the triangulation processi Note that the triangulation procedure is not unique The selection of which edges to add in Example 2 was completely arbitrary we could have selected many different edges to add to triangulate Q Which edges are of greatest interest to add is a question which will not be examined in this article FIGURE 3 The untriangulated graph Q in Example 2 FIGURE 4 The graph 91 from Example 2 with the triangulation undirected edge in dashed red 23 Forming the junction tree As in step 2 we begin with a few de nitions A hypergmph H is a collection of nonempty subsets of a nite set H known as the base set The elements of H subsets of H are referred to as FIGURE 5 The graph 92 GA from Example 2 with the trian gulation undirected edges in dashed red hyperedgesi4 For example if Q is a nite undirected graph the set of cliques of Q denoted CQ forms a hypergraph known as the clique hypergraph Recall that a tree T N357 is any connected undirected graph without cycles and that the key property of such graphs is uniqueness of path between any two vertices A junction tree is not surprisingly a particular kind of tree Speci cally a junction tree T3 H73573 is a tree whose nodes H73 are a hypergraph with respect to some base set with the additional property that the intersection U N V of any two nodes U V 6 HT is contained in every node W in the unique path joining U and Vi The property is referred to as the running intersection property or the junction property e nition of a junction tree is daunting at rst because of the tiers of new de nitions Fortunately a familiar example can help make the de nition more tangible It will also indicate why the formation of a junction tree is the third step in the algorithmi Example 1 Continued Recall our graph achieved via moralization in Example 1 de ned in 2 and 3 and pictured in Figure 2 Consider the clique hypergraph CQl To be precise to describe it we need to determine H The base set is of course Ng and from the edge set 5g we determine that 4 79 ACDFCDDABGDE each set being a maximal clique of Q Now the de nition of the junction tree speci es that such a hypergraph is thought of as the nodes of a tree with a special propertyi So if we set HT CQ the only thing that stands in the way of us and having a junction tree is the edge set of the tree 57 which can be anything as long as the graph which it generates is undirected connected contains no cycles and satis es the running intersection property meaning that it has to be both a tree and satisfy the running intersection property 4The de nition of hypergraph is somewhat unfortunate lntuitively we would like a hypergraph to be a generalization of an undirected graph however precisely speaking this is in fact not the case The de nition provided is closer to a generalization of what we have been referring to as the edge set of an undirected graph 5g A more appropriate de nition would be an ordered air H NH H where 5H is a set of subsets no longer restricted to pairs of NH In this de nition a hypergraph H would be akin to the familiar notions of topological space and measurable space the difference being that the set Q1 is not de ned through a set of axioms For the purposes of this article however the de nition will be the one provided in the main body For ease of notation let s write the maximal cliques as strings of letters instead of sets this will be particularly helpful later For example we will call ACD I A CD and similarly with the other cliquesi An edge of a graph is simply an ordered pair so an example of an element of 573 would be ACD GDE Again since H7 is xed we need only consider different possibilities for 573 To ensure that 57 de nes an undirected graph we just require that for each ordered pair in 573 the ordered pair with the reverse ordering is also in 573 so that takes care of one of the conditions and reduces the number of different possibilitiesi From here we simply posit a few different possibilities 51 7 FCDDABDABACDDABGCD TS T DABFCDACDDABGCDDAB 52 7 FCDDABDABACDACDGCD TS T DABFCDACDDABGCDACD 53 7 GODDABDABACDACDFCD TS T DABGCDACDDABFCDACD The pictorial representations of 5713 55723 and 55733 are contained in Figures 6 7 and 8 We now consider each individuallyi Regarding 5713 we note that it is clearly fully connected with no cycles and undirected so all that needs to be checked is whether or not it satis es the running intersection propertyl But FCD GOD CD g DAB since C is not in the clique DABi Thus 503 does not create a junction tree The second edge set 5723 also creates a fully connected graph with no cycles so we check the running intersection propertyi FCD ACD C D g DAB so it is also not a junction treel Again for 573 all we need to check is the junction propertyi GDE ACD D Q DAB so no problem there GDE FCD D which is contained in both DAB and ACD so again we are safer All that remains is to check DAB and FCD so DAB FCD D Q ACDi Thus the edge set in fact generates a junction tree which we will call T3 C Q It turns out that to every triangulated graph corresponds at least one junction treel Although we didn t have the ability to realize it before the moralized graph Q in Example 1 is in fact a triangulated graph so step 2 required no addition of edges The subsequent junction tree T is therefore just one example demonstrating a more general relationship between triangulated graphs and junction trees The general strategy is to begin with the hypergraph CQA de ned by the maximal cliques of the moralized and triangulated graph GA and then search for an appropriate edge set which satis es the properties of a junction treel While the ef cient determination of a such an edge set for an arbitrary triangulated graph is an important topic in its own right for now we will be satis ed with the fact that we have found a junction tree and proceed from there There is one additional step Which is taken When displaying the junction tree Which makes explicit the running intersection property Which Will be so important in enforcing certain consistency properties later on in our discussion that of introducing separator nodesi Displayed in the middle of each edge is placed a separator node Which is conventionally boxed instead of circled7 it contains the variables Which are common to both of the nodes the intersection Thus the junction tree we found in Example 2 is more commonly displayed as Figure 9 we Will use this representation for the duration of the article FIGURE 6 Graph on CQ With edge set 503 v J FIGURE 7 Graph on CQ With edge set 55723 FIGURE 8 Graph on CQ With edge set 533 FIGURE 9 Junction tree T3 With separator nodes L I L 24 quot 39 39 r 39 and 39 39 From the HammersleyClifford theorem we know that the jo39nt density of all the variables in the original graph factors into an appropriately scaled product of potential functions over the maximal cliquesi For the junction tree algorithm the clique potentials are set to the original potentials over the undirected graph or even more explicitly if they are known from a known directed graph structure the conditional distributions themselves just as they would be in the sumproduct algorithmi The potentials for the separator nodes are set to unity 25 Selecting an arbitrary root node The previous steps of the algorithm have been used to set up the nodescliques in a way that is suitable to apply a message passing algorithmi Now we must select a root node to begin Each link between nodes and separators will be used twice during message passing once in each direction This is done by propagating messages up from each leaf to a root and then in reverse from the leaves to the root Although trees are usually represented with a vertical orientation we will represent them sideways as in 9 to preserve spacer Example 1 Continued In Figure 10 we select node GDE as root Root node FIGURE 10 Junction tree T3 with node GDE emphasized as root 26 Carrying out message passing Potentials were assigned in step Because we have created a junction tree there will be at least one node of the junction tree that is connected to only one neighbor and is not our arbitrarily chosen root of the tree Do not choose the root as the idea is for our first pass through the tree we pass towards the root Pass the messages using the standard message passing algorithms for graphical modesi Select other nodes that are connected to only one neighbor if present and pass towards the root Once an internal node of the tree more than one neighbor has received messages from all those nodes which it separates from the root pass its updated message along towards the root Finally reverse this process We can think of this in the sense of a traditional tree with a root node at the top messages are passed up the tree at each stage updating nodes along the path to the root with information from all of its children before moving up to the next level Then with the updated root we pass revised messages down the chain Fortunately messages must be passed along the links in each direction only once Forming the junction tree ensures that this algorithm will convergei Example 1 Continued In Figure 11 passing messages are quite straightforward up the tree and then back down First Pass Up Second Pass Down Root node FIGURE 11 Message passing on the junction tree T3 with root GDE 27 Evaluating desired marginal potentials After completing the bidirectional message passing scheme in step 6 the messages at each node correspond to the potential functions of each clique which correspond to the joint probability distribution of the nodes in the clique Because the nodes form a clique and thus are all connected to nd the probablity densities of a subset of these nodes those the clique we need only to marginalize the resulting potential function over the remaining variables Ideally we would like to nd a triangulated graph with minimal maximal clique size to make the computations of marginals inside each node nearly trivial This provides us with at least some guidelines by which we can discriminate against different triangulations Example 1 Continued Suppose in our example our variable of interest is in fact just A Then from Figure 12 we can marginalize over either nodes DAB or ACD to nd the density of A A consistency property exhibited by junction trees to be discussed below ensures that these two methods of obtaining the marginal of A are the same Root node Nodes which contain A FIGURE 12 Junction tree T3 and nodes via which the marginal distribution of A can be acquired 3 CONSISTENCY The concept of consistency came into the discussion at the end of the junction tree algorithm This concept is explored in more detain in this section in three parts the idea the importance and the algorithm 31 Idea Given two adjacent nodes of the junction tree V and W we wish the potentials resulting from marginalizing over V or W to their separator S to give the same potential for S EM 1088 2111 1055 s ws v5 We call this condition of equality consistency Global consistency would imply that for any two nodes V and W with intersection I we have 2 W Z W wi vi Hence the joint probability distribution of the intersection of any two nodes is the same whether marginalizing within either node even if the two nodes are not neighbors 32 Importance Ensuring consistency is important The idea of the junction tree algorithm and many other message passing schemes is that marginalization to nd variables in a cluster need only be done over that cluster Essentially each cluster is localized and effects of other clusters are factored in during the message passing algorithm but need not be considered later However localization should require that if we look at a variable in the intersection of two or more cliques we get the same results for its density whether we marginalize over one clique or another Fortunately consistency holds for a junction tree under a variety of circumstances including as data is observed and used to obtain better estimates of other parameter This fact is outline in further detail below 33 Algorithm Suppose one or more variables in V 39U is observed and clamped to a particular state 11 represents the updated potential function due to the observed data Our task is to modify w and 118 to satisfy 2 mm ms Emu w5 v5 Absorption replaces 11 s and w with 95 Z 110 v5 L1W8 98 I w w Then 1 s 1 s 2 w wz w 1 8 i 7 i S s 7 1 s 7 2 11 1 and consistency is reestablished We say we have absorbed 390 into W through Si v5 4 EXAMPLE 2 A nice example is carried throughout 3 attributed to Lauritzen and Spiegelhalteri The following problem is set up Shortnessof breath Dyspnoea my be due to Tuberculosis Lung cancer or Bronchitis or none of them or more than one of themi A recent Visit to Asia increases the chances of Tuberculosis while Smoking is known to be a risk factor for both Lung Cancer and Bronchitis The results of a single Xray do not discriminate between Lung Cancer and Tuberculosis as neither does the presence or absence of Dyspnoeai77 They set up the graphical model as shown in Figure 13 FIGURE 13 Original graphical model in Example 2 For the rst step in the algorithm we moralize the graph as shown in Figure 14 Next we triangulate the graph as demonstrated in Figure 15 Then we form a corresponding junction tree as in Figure 16 Separator nodes are added in Figure 17 ow we need to select a root node so we select SBL and start passing messages up to and down from SBL as in Figure 18 and Figure 19 From here we could obtain any marginal we desire FIGURE 15 Triangulated in red moralized graph in Example 2 REFERENCES 1 D Barber Piobabilistic modelling and Teasoning The junction tTee algoi ithm Working Paper 2004 C M Bishop PatteTn Tecognition and machine leui ning Springer Oxford 2007 Robert Cowell IntTodnction in infeTence in bayesian netwoiks 19 9 7 6 SL Lauritzen Giuphical Models 1996 New York Oxford University Press MJ Wainwright and M ordan Giuphical models monential families and variational infei ence UC Berkeley Dept of Statistics Technical Report 649 2003 Q PP N RICE UNIVERSITY DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING 6100 MAIN ST HOUSTON TEXAS 77005 Eimail a T888 dkahleQriceedu TerranceDSavitskericeedu srsZQriceedu VolkanQrice edu FIGURE 16 Junction tree of graph in Example 2 FIGURE 17 Junction tree of graph in Example 2 With separators initialized to l Root node First Pass Up FIGURE 18 Passing messages up the junction tree Root node Second Pass Down FIGURE 19 Passing messages down the junction tree
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'