Artificial Intelligence CMPSCI 683
Popular in Course
Popular in ComputerScienence
This 204 page Class Notes was uploaded by Roman McCullough on Friday October 30, 2015. The Class Notes belongs to CMPSCI 683 at University of Massachusetts taught by Staff in Fall. Since its upload, it has received 7 views. For similar materials see /class/232283/cmpsci-683-university-of-massachusetts in ComputerScienence at University of Massachusetts.
Reviews for Artificial Intelligence
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 10/30/15
Victor Lesser CMPSCI 683 Fall 2004 Colliiw 39 The structure of a learning agent 39 Basic problems bias Ockham s razor expressiveness 39 Decisiontree algorithms Vlmsta lm Learning is change within a system that improves its performance This admits a lot of different behaviors but identifies the basic preconditions of learning Leaming systems must be capable of change Learning systems must do something dillerenllyas a result of the change Mummy 39 A viable alternative to problem solving 39 Learning can simplify the complexity of problem solving Replace procedural knowledge inferencing search with learned functions and policies 39 Learning increases ef ciency robustness survivability and autonomy of system Key to operating in open environments 39 A learning program can become better than its teacher Vlmsta lm What changes as a result of learning How does the system nd out change is needed How does the system localize the problem to nd out what changes are necessary What isthe mechanism of change element enemtor Supervised learning 7 istpiit by a teacner wnai actrpn is best in a specrnc situation Reinforcement Learning 7 Geisieeitpacii appui ine consequences ola specrnc sequence piacirpns in a certain situation 7 Can aso De lhoughloi asupervtseo learning vnlh aless iniorm at ye ieeopadi sgna Unsupervised Learning 7 No leedhackahoul acirpns 7 Learns to prertrciiuiure precepts given its previous precepts 7 Can t iearn wnallo on unless italready nas a utririyiuncimn Learning element modi es performance element in response to feedbac Critic tells learning element how well agent is doing Fixed standard of performance Problem generator suggests actions that will lead to new and informative experiences Related to decision to acquire information G oal s Learn better actions Speed up performance element Which components of the performance element are to be improved What representation is used for those components What feedback is available What prior information is available v mama nmt a Information about the results of possible actions the agent can take Utility information indicating the desirability of world states Actionvalue information indicating the desirability of particular actions in particular states Goals that describe classes of states whose achievement maximizes the agent s utility v mama nmt M o A direct mapping from conditions on the current state to actions Weighting of parameters of multi attribute decision process A means to infer relevant properties of the world from the percept sequence Information about the way the world evolves Allow prediction of future events u uxxi m nmt 1D The type of training instances a tne beginning data fortne learning task The language used to represent knowledge Speclflctl all lll lg instances rnust be translated into tnis representation language ln sorne programs tne training instances are in tne sarne language as tne internal knowledge base and tnis step is unnecessary A set of operations on representations 7 Typlcal operations generalize orspecialize ewsting knowledge combine units or knowledge crotnerwise modlfythe program s eXlSIl g knowledge or tne representauon ortne training instances u uxxi m nmt 2 The concept space 7 The operations that define a space ofpossible knowledge structures that is searched to ind the appropriate characterization ofthe training instances and similar problems The learning algorithms and heuristics employed to search the concept space 7 The orderoithe search arid the use of heuristics to glJide the search v New nmt 13 All learning can be seen as learning the representation of a function Choice of representation ofa function e Traderoff between expressiveness and efficiency is Whatyou Want representa le is Whatyou Want learriable of examples cost of search Choice of training data 7 Correctly reflects past experiences 7 Correctly predicts future experiences How to judge the goodness of the learned function v New nmt is numerical parameters decision trees formal grammars production rules logical theories graphs and networks frames and schemas computer programs procedural encoding t mm ml 14 0 Importance of Prior Knowledge Prior knowledge can significantly speed up learning process EBL explanationbased learning 0 Learning as a search process Finding the best function 0 Incremental Process online vs offline t mm ml 16 Let an example be x fx Give a collection of examples off return a function h that approximatesf This function h is called a hypothesis Feedback is relation between x and hx x x could only be approximately correct Noise missing components v New nmt i7 0 Simple hypotheses that are consistent with data are preferred 0 We want to maximize some metric of consistency and simplicity in the choice of the most appropriate function v New nmt ie Many hypotheses h s are approximately consistent with the training set Curvefitting linLvmutlQixiiniiii iUn quotMun a int A preference for one hypothesis over another beyond consistency is called Bias t mm m is Restricted Learn based on conditions ofthe situation representation of gica Whetherto Wait at a restaurant for a table sentences In 5 390 5 1025 o Eu 2L 3 Tree of property value tests 7 Terminals are dEEiSiDnS t in r m mi 2n A classi cation decision tree takes as input a situation described by a set of attributes and returns a decision Can express any boolean function ofthe input attributes How to choose between equally consistent trees vummm 1i E mpk mmx Gull u m m m in mm lit w m mum Xi h M Na in Saint 335 M In quh 771139 Y X h M Na 139 mil 3 i Nu Mm ill 017 ND x v m w in mm M w W nin m n m m i Mil w 1an mm m x m A M An hill to M m 1mm I vquot x v i n v i nmr E m in nin m x v m i Mr W i m w W m vquot x v A n v i nmr it n m 1an nin m x v m u m i K W n x m m m m in rt v m Imimn mm vquot XJ iv A K K i Mi 1M1 rli iVn it in m m in Mill i M w W m m vumcsiwzm 2 Alternate Price Bar Raining FriSat Reservation Hungry Type French Italian Patrons NoneSome ThaiBurger Full WaitEstimate 010 1030 3060 gt60 uunmsixmim 12 Construct a root node tnat includes all tne examples tnen for eacn node i if tnere are both positive and negative examples 0 oose tne best attribute to split tnem 2 if all tne examples are pos neg ansneryes no 3 if tnere are no exam les for a case no observed examples tnen cnoose a default based on tne majority classification at e parent 2 or raining under hungry yesaitemate yes 4 if tnere are no attributes left but We have botn pos and neg examples tnis means tnattne selected features are not sufficientfor classification or tnat tnere is error in tire examples can use mayority vote o 3 uunmsixmim n 3 r ii A prfect attribute divides the xarnples intn sets that are all positive and negative X1 X3 X4 X6 X8 X12 X2 X5 X7 X9 X10 X11 6 X8 X12 9 X10 X11 Basic idea is to build the tree greedily Decisions once made are not revised llo searc Choose most signi cant attributequot to be the root Then split the dataset in two halves and recurse De ne signi cance using information theory based on information gain or entropy Finding the smallest decision tree is an intractable problem Cannot use decision tree to represent tests that refer to two or more different objects 3r2 Nearbyr2r A Hicerp A Hicer2p A Cheapertpnp New Boolean attribute CheaperRestaurantNearby but intractable to add all such attributes Sometruth tables cannot be compactly represented in ree decisiont ieoiinst it aio onlyitan eiien nonoei otinnots aiet exoonenoaiiylaioe decision oee will he ne ed 7 Maioiitynin ion whiciiieoiinst itnioiettiai iiattotitsmnots aiet vumcsinmn 3t Any Boolean function can bewritten as a decision tree VI PatronsrFul0 A WaitEsfimafer1030 A HungryrN WilliVaitr Row of truth table path in decision tree 2quot rows given it literals 27quot functions ointment so Expected amount of information provided by an a ri u e 7 Siiniiaito tne concept olvaltte of perfect information Amount of information content in a set of examples V is the possible answers p positivei n negative mo Ptngtgt DtoogiPlto Example 12 cases dpos Eneg information ibil P P P P P P I if to iilo i Wm W mm m saw v number ofallnbules ofAllnbule A r2matnd2rt fmi ommm 1 PM Pt v n GatrA 117quot rmamAzrA pn m Gmrthns 1 Z 4 6 24 7101 l D itquot 0541 12 1th ilz 616 t Z 11 Z 22 4 22 Gum ypz1EE ZHUbtts nu nun mm min run as m an In mm m nun HE Partition on m gives least Impurity QB KM EVER GAS KIM EVES QM KIM EVES sIrr m minim m III u 2 2 2 2 7 11 rm T41402 rziao 17mm tut EYES ARE IETI39ER ATTRIBUTE How do we measure how close our hypothesis is to r Try ho on a test set Learning curve Measure quot0 correct predictions on the test set as a function ofthe size ofthe training set 9 unmet on last sel 39n 2n 40 ll an inn Training my size Almning MW at the decisinn tree aignrithm an inn randmnly gmu39zta l Examples in the mm dnmain The graph summarilzs 2 trials v Mammal How correct is this 7 Can we even iudge this idea 7 Notall attributes used on correctness 37 v a mm mm 38 Finding meaningless regularities in the data With enough attributes you re likely to find one which captures some ofthe noise in your data One solution is to prune the tree Collapse subtrees which provide only minor improvements Using information gain as a criteria v minimum Handling examples with missing data Add new attribute value unknown lnstantiated example with all possible values of missing attribute but assign weights to each instance based on likelihood of missing value bein a particular value given the distribution of examples in the parent node Modify decision tree algorithm to take into account weighting 39 v a mm mm on Handling multivalued large attributes and classification Need another measure of information gain Information gain measure gives inappropriate indication of attributed usefulness because of likelihood of singleton values Gain ratio Gain over intrinsic information content The version space algorithm Neural Networks Continuousvalued attributes Discretize Example Preprocess to find out which ranges give the most useful information for classification purposes 0 Incremental construction H mm m Why is search the key problemsolving technique in Al Formulating and solving search problems Understanding and comparing several blind search techniques Victor R Lesser Overview of Material You Should Knowll CMPSCI 683 Fall 2004 You should have read material In Chapters 13 vmxcemrm Z The state space model provides a formal definition ofa problem and what constitutes a solutionto a problem Complete and Partial Solutions A solution is found by searching through the state space until a lgoall state with specific properties is found Local and Global Properties A solution is asequehce oloperatorsthat Will change an initial state to a goal state the attributes ota state that has certain properties move movie v Search involvesexplaring lexplicitly generatingl parts ofthe 39 Cryptarithmetic PTObIGMSI statespace until a solution is found lorthe entire space is exploredl SEND MORE The point or search isto find a solution Without exploring the ehtire Mm state space Without generatingthe complete state space Vincemim z vimsmrm l A popular benchmark for Al search Solved for n up to 500000 Signal interpretation eg speech understanding Theorem proving eg resolutiontechniques Combinatorial optimization eg VLSI layout Robot navigation eg path planning Factory scheduling eg flexible manufacturing Symbolic computation eg symbolic integmtion Can we find closed form solutions to these problems Hoffman Loessi and Moore 1969 If n is even but not ofthe form 6k2 For39 12 nl2 place queens on elements L 2Di flMill If n is even but not ofthe form 6k Forj 12 nl2 place queens on elements 0 1ZH n2e 1 mod n MM M204 t me 1 mod n lfn is odd Use case A or B on n1and extend with a queen at nn Is this a good benchmark problem for testing search techniques t uxxlcssmrmm States and state spaces Operators representing possible actions Successor function Initial state and goal test Path cost function Examples 8puzzle 8queen path planning map coloring What are the corresponding states operators initial state goal test and path cost t uxxlcssmrmm Star State Goal Slate States integer locations of tiles Actions move blank left right up down Goal test goal state given Path cost 1 per move Note Solving npuzzle problems optimally is NPhard v LesserCSBECl qum Emmi States realvalued coordinates of robot joint angles parts of the object to be assembled Goal test complete assembly Path cost time to execute E v Lesser c5553 Fz39um Actions continuous motion of robot joints State space representation There is a state corresponding to each city Initial state is the start city state Goal state is the destination city state Operators correspond to roads there is an operator citya gtcityb ff there is a road from citya to cityb v Lesser c 5583 and Initial state is Arad Goal state is Bucharest Partial Search tree a The initial state b Alter expanding Arad Sibiu c Alter expanding Sibiu Sibiu Arad Fagaras Oradea Arad Arad Timisoara Zerind Arad Timisoara Zerind Rimnicu lcea The nal search tree shows six partial solutions open search nodes 11 v Lesser csaaa qum A solution to a search problem is a sequence of operators that generate a path from the initial state to a goal state An optimal solution is a minimal cost solution Solution cost versus search cost Search tree tree or graph of explored by the search proces a search tree orsearch graph is a subgraph orthe state space Search involves maintaining and incrementally ext I ions states really nodes s ending a set of partial so u We refer to these partial solutions as search nodes nodes in the search tree The process of extending a partial solution is called expanding a node 7 Basically expanding a node involves USll lg all7 operators ll able to the latest state ofthe node to ldentify reachable states and so generate new partial solutions nodes 7 lt is common to referto nodes by theirlatest states but a node really represents a partial solution operatorsequence State space a graph ofthe set of states reachable from the initial states via some operator sequence The state space is sometimes also called the problem space or the search space Path a sequence of operators from one state to some other state Solution a path from an initial state to a goal state Partial solution a path from an initial state to a nondeadend intermedia e state 7 Encompasses family orpossibie solutions coal test predicate that tests if a state is a goal state goal states may be explicitly listed orspeciried by a property Path cost function that assigns a cost to a path often denoted g It is the Sum of costs of the operatorsactions of the path u taxxicssmrmm Key issues in de ning states a which obiectsreiations to represent a which configurations need to be mapped into separate states Key issues in de ning operators 7 may have to make expllclt unstated assumptions in the problem descriptlolli a how staterspeci cgeneral should operators be a how much domall lrspeclflc knowledge should be compiled into the operators Developing an effective state space representation of a problem is choosing an appropriate abstraction r Wlthout abstraction agel its Would be swamped by the detalls of the realrworld u taxxicssmrmm i Thara ara two main asnaats olabstraction a iamming unneLessan detail liom tne stale aasriiptinns and so tne l7 eiatois a iamming legal opeiatms tnat aia useless Di inainriani ini arhiating oals Agooil abstraction a iemmesas innh detail aspossibleto mite ll eas enougntolnd a solution a maintainsthavaiiiihy oltne solutions loitne LonLeptual goals An abstraclsolution raprasants a large ninniiar ol iiataiiaii paths onan thara is a trariaoii beMeen siniiriiahy anii generality itha repmentation iraaornas sospecilicto tha givenproblem that it cannot be iisaii tor avan vary similar problems Straightfoniirard representation of states boatlocm1ocm2locc3loc loc E side1side2riverlboat Results in 37 2181 states Can you simplify by abstraction wo stan concept ol Missionaries and Cannibals Thraa missionaries and thra 39 athat ca r iitha rii Alsearch problems can he iisaiito explore tha ila abstraction ibals ara on onesiile ola n hoiii upto two people and avar outnumber meLThele is a boat availabl 39 Cannibals missionali and Cannibals across the 39 Tripmule Planning DeterminehowtogetlrornonelocationtoanotherAssunietliat y iiirnownnratahyyoiiarain and hair ania iiaaar Abstraction Simpli cation articular missionaries and cannibals on eacn side do not matter on num ers do ot nay to nave explicit states Witn people in nce in boat Will only Want to cross to tne boat 0 otner side now once know number of a know number on tne otner si Abstract states boatside1m39sside1c39sside1 tes Results in 2 x4x4 32 sta type on one side de State abstraction also usually reduces the number of o erators move 1 m and 1 c from side1 to side2quot vs the previous move m1 and c1 from side1 to side2quot move m1 and c2 from side1 to side2quot etc Useless operators can also be removed 1mC 2im 16 single missionary goes to goal side in boat The abstract solution using move number of people operators is still a valid solution to the conceptual goal simply have to randomly select particular people when executing t uxxlcssmrmnt in its full generality states for this problem would be very t complex since they would describe con gurations of the world comple e at latitude and longitude xy time is t radio on raining car 1 meters ahead etc To simplify we focus on the problem of nding a sequence g al of city to city traversals that accomplish the in this case our abstract states simply become in city x Wecan further simplify by identifying important cities ie quot 39 tionsandidentirying major cities and cltles wlth road LII39IC the subset of relevant cities we don39t need to include Amherst in the state space irwe are trying to get to Boston from Worcester Likewise in its full generality there would be a very large number of operators to e considered and it would take a very large number of operators to achieve a solution 7 e 9 go heading h at speed 5 turn radio on etc Mth the abstract states operators are of the form go from city a to city bquot where there is a road from city a to city b A solution to the abstract problem solves the basic goal but does not give us the detail required for say a robot vehicle to actually navigate the trip till the abstract problem solution allows us to see if a solution is even possible and to judge its approximate cos Search control strategies Order in which the search space is explored The extent to which partial solutions are kept and used to guide the search process The degree to which the search process is guided by domain knowledge The degree to which control decisions are made dynamically at runtime v uxxvcssmrmm 25 A key issue in search is limiting the portion of the state space that must be explored to find a solution The portion of the search space explored can be affected by the order in which states and thus solutions are examined The search strategy determines this order by determining which node partial solution will be expanded next v uxxvcssmrmm 27 Choose states to expand next Choose operators to expand the states Execute the set of state operator pairs Update the search graph with some of the new states created Decide if search should be terminated Finding an acceptable solution There are no solutions to the problem Wtth resource constraints t minimum 23 The extent to which partial solutions are kept and used to guide the search process 0 The degree to which the search process is guided by domain knowledge 0 The degree to which control decisions are made dynamically at runtime t minimum 23 Completeness does it guarantee to nd a solution when there Is one Time complexity how long does it take to nd a solution Space complexity how much memory does it require Optimality does it return the best solution when there are many A search tree is a graph representing the search process Nodes are the data structures from which the search tree is constructed Implicit graphs and explicit graphs Branching factor and solution depth Generating and expanding states Open and closed lists of nodes mnmsumm an mnmsumm 2 Basic idea o line simulate exploration of state s ce by generating successors of already explored states mm D IREDSEARL II problem swing returns a solution or raiiiiie initialize tie seamh lien llsmg die initial state of pmblnm loop do it39there am no candidates rei expansion then return raiinie choose d lea n or Expansion emitting to iiiimny irine node ontzins a go State t i i um the miiesponding solution else expand the node and add the ranking noda m the starch tree end Blind or uninformed strategies use only the information available in the problem definition Breadthfirst search open list is FIFO queue Uniformcost search shallowest node first Depthfirst search open list is a LIFO queue Depthlimited search DFS with cutoff Iterativedeepening search incrementing cutoff Bidirectional search forward and backward Search strategies can be classified in the Multilevelm ultidimensionalmultidirection Garne Search deals Witn the presence of an opponent tnat takes actions that diminisn an agent s performance see AlMA nodes goalp successors Time and space complexity defun bfs nodes goalp successors cond null nodes nil funcall goalp first nodes first nodes t bfs append rest nodes funcall successors first nodes goalp successors l Time and space complexity Failure to detect repeated states can turn a linear problem into an exponential one Do not regenerate the state you just came from Do not create paths with cycles Do not generate any state that was generated before using a hash table to store all generated nodes There are four phases to problem solving 1 Goal formulation 7 based on current World state deterrnlne an appropnate goal 7 oescnoes deslraole states ortne World 7 goal formulatlorl may ll39lvolve general goals or speclrlc goals 2 Problem formulation 7 rorrnallze tne problem ln terrns or states and actlons 7 state space representation 7 nnd 5equence5 or actlons tnatlead to goal states e posslbly select oest ortne sequences 4 Executi n hase 7 ca ry OUt aCth S ln selected sequence v uxxlcssmrmm A completely autonomous agent would have to carry out all four p ases Often goal and problem formulation are carried out prior to agent design and t e agent is given speci c goal instances agents perform only search and general goal formulation problem formulation speci c goal formulation etc For nonagentquot problem solving a solution may be simply a speci c goal that is achievable reachable there may be no execution phase The execution phase for a realworld agent can be complex since the agent must deal with uncertainty and rrors Adopting goals and using them to direct problem solving can simplify agent design lntelligentJrational agent means selecting best actions relative to a PM but PMs may be complex multiple attributes with tradeoffs Goals simplify reasoning by li ng agent objectives but still organize Irect behavior Optimal vs satis cing behavior best performance vs goal achieved May use both goals to identify acceptable states plus PM to differentiate among goals and their possible solutions v uxxlcssmrmm Pred l Complete searchbased problem solving involves both the search process and the execution of the selected action sequence Problems can vary i n a number of ways that can affect the details of how problemsolving search agents are bul t One categor39 ation is presented in AIMA related 7 Total cost or searcn7based problem solving is tne sum of to accessi it eterminism t e searcn costs and tne patn costs operatorsequence cost Dealing with total cost may require 7 singlestate problems Agent knows initial state and exact errect or eacn acllo Searcn oxersingle states 7 Combining apples and oranges e g travel miles and CPU time Multiplestate problems 9 0 make a traderoff betweel l Search tlme and solution cost optimality resource alloc tiori Agent c n ot know its exact initial state andortne exact errector its actions 7 Tnese lSSUeS must be nandled in tne perrormance measure MUSt Search overstate Setsi May or may not be able to fll ld a guaranteed Solutlol39l t taxxicssmrmm Contingency problems Exact prediction is impossible but stat determi Continuation of Simple Searc 7 How t es may be Search ned during execution via sensing Must calculate tree of actions for every h o use neunstics domain knowledge in order to accelerate contingency 7 Readll lg Sections 4 l4 2 Characteristics of More Complex Search lnterleaving search and execution may be better SUbPI39Dblem interaCtiDn More compl x View of operatorcontrol costs Exploratlon problems Uncertainty in sear Agent ay have no information about the effects of its actions and must experiment and learn Search in real world vs model Nonmonotonic domains Search redundancy t taxxicssmrmm Victor Lesser CMPSCI 683 Fall 2004 Operate using a single current state Paths lollovved bysearch are not retained Contrastvvrth open and closed node llSlS searchtree Little memoryrequired usuallya constant Sometimes the path to the goal is irrelevant Squeens problemjobshop scheduling circui design computer con guration automatic programming automatic graph drawing Optimization problems mayhave no obvious goal test or path cost Continuous lunclions Vl l mm K Local Search HillClimbinglterative Improvement Simulated Annealing Stochastic Hill Climbing Beam Search Genetic Algorithm Mammy Very simple to implement Very little memory is needed Can often find reasonable solutions in very large continuous state spaces for which systematic algorithms are not suitable Mammy Unsolvability Is there a solution 7 Systematic can require exhaustive examination ofexponential Search space 7 Stochastic cannot determine unsolvability CompletenessOptimality 7 Systematic complete 7 Stochastic incomplete Speed 7 Neither l5 uniformly superior each does better fordifterentsorts of oblems Local Search is an example of Stochastic Search 5 3 Start Search with complete but nonoptimal so u ion Modify incorrectlnonoptimal solution to move it closer to correctoptimal solution Me mamJam What is the search space Search space of complete solutions vs partial 39 s solution When useful reasonable complete solution can be generated Operator to modify complete solution Some notion ofprogress Measure of complete Solution m terms of constraint violations Oran evaluate lu ctlo The main iterative improvement algorithm is hillclimbing Continually move in the direction of increasing value of a successor states until a maximum is reached Tradeoff between time to select a move more and time to reach goal state less This is sometimes called steepest ascent HC and is called gradient descent search if the evaluation function is based on cost rather than quality t uxxvcssmrmm A simple form of local search Continually move in the direction of increasing value abjecm funczmn glohal mnwnum Greedy Local search grabs good neighbor kmvlrm mmn without thinking ahead gr 1an mnmhum mm qwrr uncnl n v swam mm function HHLCUMBLNGl39pIUbIem returns a state that is a local maximum inputs probEm a pro locnlvariables current a node neighbor a node L39lll l eni 39 MAKENODEHNITIAL STATE prublemD loop do Looks at all neighbar a inghestV39alned successor of current successo s if VALIEneighbur lt VALUE email then return STATE current neighth current v laseresnxa mm L6 G 21 inilial sale L4 Gaza LaGza blmkthaus mung undue mug thing 39 GomL Fur each blueklnalhaslhecunecl51mmquot slvucluve a line cummele smmure undemealh n is wow as l shuum be and 1 pmntluv every black m We suppun 5mm Fm each blurklnal has an moaned suppunmnmg mm une Pumlluv my mumquot my mama suppu slmclure v swam mm L4 G 45 v layman mm 7 v y 1 Ti iiiiiiia siiiim imfg Pr Can get stuck at a local maximum Ways to overcome the weaknesses solution Stochastic hillclimbing Simulated Annealing Local maXimum choose at random an uphill move among the successors Unable to find its way off a plateau olu on Sometimes take downhill steps Plateau You may have to get worse to get better Cannot climb along a narrow ridge when almost all steps go down continuous space Firstchoice hill climbing generate successors if randomly until finding an uphill move Rd K f d Randomrestart hill climbing restart search from 39 ge m ee ges 39 i i randomly generated initial states V Lesser C8683 F2004 13 V Lesser C8683 F2004 Sim Li a s Simulated annealing is a variation of hill climbing in 1 Evaluate the initial State If it is also a which at the beginning of the process some downhill goal State then return it and CW moves may be made The idea is to do enough OtherWises continue With the initial exploration of the whole space early on so that the State as the current State final solution is relatively insensitive to the starting state This should lower the chances of getting caught 2nitialize BEST30FAR as the current at a local maximum a plateau or a ridge state Randomlyjump to alternative nonlocally optimal moves based on p e39AET more time goes less willing to explore nonoptimal path 3lnitialize Taccording to the annealing schedule V Lesser C8683 F2004 15 V Lesser C8683 F2004 Loop until a solution is fan left to be applied in the curre a Select an operator randomly that has not yet been applied to the current state and apply it b Evaluate the new state Compute AE value of current value of new state 7 if the new state is a goal state return it and quit a if it is not a goal state but is better than the current state then make it the current state u s BESTSOFARIn Ihls new sate iihetterthan current BESTSOFAR m n iii tn mm n a nuanotn Miami c Revise Taccording to the annealing schedule oi Return BESTSOFAR as the answer Keep track of K states rather thanjust one a Modified breadtnafirst contour created dynamically Start with K randomly generated states Stop if any goal state Multiple Successors generated for each of the k states Choose top K successor states in next cycle Contrast with Random Restart 7 Positive quotSharing information across differentseai ches 7 Negative 7 May eliminate diversity coming from random starting function Sin ULAI bD ANl aAnmutpitiiiiuii suwduiy rrtunls a solution state iupllls pwbsm aproblcm rt lit dule a mapping from lime in temperature local variables titl i L ni u node rim 3 node 1 ti lenipeiaiuiequot controlling the piuoabiluy of duvinwc39ll d steps WVI m sMAKENODEHMTIALSTATE metileniii or re do n rennetaim if T7 lien return Furcm liexi a ti randomly selected succeseol of ailmil A VAI iillldl7VAi i tl l11 lil it A gt 0 ii A mil Pile else L ili mi i7 icitiiiily with pinbaliiliiyrA M u uxxlcssmrmm Beam Search patterned after biological evolution Learning as Search Metaphor of Natural Selection Offspring are similar to their parents The more t are more likely to have children Occasionally random mutations occur it uxxlcssmrmm Localized Beam Search Speciaiized approacn for generating successors and for seiecting next states An indiViduai solution is represented by a sequence of genes Tne seiection strategy is randomized Witn probability of seiection proportionai to fitness indiViduais selected for reproduction are randomly paired and nme ar mutated Selection More fit members are iikeiy to be in next generation Mutation Random aitering of characteristics Crossover popuiation Crossoverisorooaoiylne keyidea v Exploll relative independence of certain subpmblem solutions imbedded in diveieni members What is the fitness function How is an individual represented How are individuals selected How do individuals reproduce Represent Outlook Ovemmf v Ram A Wmda i mng by Outlook Mm 011 10 LgtSunnyno LgtWeakno Represent IF WmFSLmng THEN PlayTenmrjie by Ou ook Wind PlayTenm 1 1 1 10 1mm Shrugs ammmh D sumg spammm mmumum wmmmm gt WHUWUUU lt mummmm WWW mu Twpa39ntcmmn HMUW HUMWWU gt UWHWUUU lt mnmmm ummunmm llnilumcvu U Emmi W001 mm W gt WWDWW lt OMEN WMWM Hmmm mmumm gtwmmmn 75 Whrle maxh mm lt mathmum 1 Select Pmbabrhsucally select 1 0p membmmua qu lmmag 2Plnmmhj z L muuver mbabmsmany salt251 pm ufhypntheses frva Fur each Pm pairh1h2 pruduce twu uffspnng by applyinng Crussuver 9mm Add a1 uffspnng msx 3 Mulate Invert a randomly selected bit in m p random members of P5 4 Update P P 5 Evaluate for each Irin P compute Fitnessh Return the hypolhesis from P that has the highest lness 39 m r em mWeMWWWWW mm quotmm Mmmm er cream mm mm WWWWWmmmmmmmmmmm Fitness proportionate selection Fitnexxthl PW 39Efnmttgt can lead to crowding lack oi diversity Tournament Selection Ptck in h at random thn vnttorm propabtltty thn probabllttyp selecttne more ttt Rank Selection Sort all nypotneses by tttness Probablltty of selectton 1s proporttonal to rank Lean msuncttve set at propostttonat rules oompettttve vvtn CA5 Fltness Fitnexxhcwn39ecth neptesentatlon IF aTn a2F THEN T IF a2T THEN cF tepesented by a1 a 5 a1 a 5 10 I1 1 11 1D I Geneva opaatnts Wmtva1aplelengtntule sets a solutons Wmtonly Welttonned utstnng nypotneses a1 a 393 a1 a 393 n10 01 1 11110 0 n01i10 10 010 1 choose tossovet polnts tot hegattetblts1v 2 Nowtesttlct polnts In n to those that ptoauce bltsttlngs vvttn wellaetlnea semantlcseg 13 186v Itvvecnoose 1 it tesultls a1 a 393 a1 a 5 I13 11 10 0 11 10 0 a1 a 393 a1 a 393 4 00 01 1 10 01 0 h Add new genetic operators also applied probabilistically 1 Add Alternative generalize constraint on a by changing a Dto 1 2 Drop Condition generalize constraint on a by changing every 0 And add newlield to bitstring to daermine whether to allowthese aagc aagc AADC 01 11 0 10 01 0 1 0 So nowthe learning strategy evolves Performance of GABIL comparable to P I f f t d b symbolic ruldtree learning methods C45 quotPquot a quot quot ngmms Fraser 9 y ID5RAQ14 trees slnlx iWy Average performance on a set of12 synthetic problems GABILMAA and DC operators 921 accuracy GABIL AA and DC operators 952 accuracy Symbolic learning methods ranged from 912 to 966 vulmsixmlm 1 mnmsumm at Determining the parameters Determining the method for designating a result and the criterion for terminating a run Determining the set of terminals Determining the set of functions Determining the tness measure vulmsixmlm 3 mnmsumm an 1 TERMINAL SETT A CONTROL PARAMETERS TOIIOI Oz Dk Population Sin M 4000 1 FUNCTION SETF GENETa1ioNs c 51 F AND OR NANO NOR a FITNESS MEASURE 5 TERMINATION CRITERIA ANO RESULT DESIGNATION Thefnnass cam amhe Z39comhinations mm klum nds iN T 39TenlIiNate when tness equals II 239 mm The dandamixm fnnass of a Smmssion istha sum ovum z39rmuss M Mar 6 51 generations mus mm mm manning NiaamI hmwem the Boolean que um hytha Sexpress39on aNIHhE coma valua oIIIIeIargaIunaion Boolean mnApaNyIunmonI 39 D ignat h lGoTal individual as the solution vulmsixmlm I mmnsumm x OR AND INOTOIII quot01 OR NOT D1 IANO no In IANO no In IOR IOR D1 INOT no IOR IOR n1 INOT no INOTOIII IANO INOT OIII INOT OIII vulmsixmlm 3 mmnsumm w More interesting example design electronic filter circuits Have been applied to a wide range of problems Individuals are programs that transform beginning circuit to final circuit by o Results are somet39mes very good and addingsubtracting components and Somet39mes Very Poor39 connections The technique is relatively easy to Use population of 640000 run on 64node apply and in many cases it is beneficial parallel processor to see if it works before thinking about Discovers circuits competitive with best another approach human desIgns v uxxmssmrmm M u uxxvcssmrmnt RepairDebugging GSAT Interacting Subproblems More on CSP problems texture measures Multilevel Search blackboard Continuation of Reinforcement learning Qleaming Instance Based Learning Victor Lesser Case based learning CMPSCI 683 Fall 2004 vmxua lm Z In FlL the environment is usually modeled as an MDP defined by Flndapollcyll seS gt aeAls could be stochastic that maximizes the valuuutllltyieimected lulure reward 01 each s 5 set of states of the environment V 5EllY39lt2 7 l39qu A5 set of actions possible in state S 65 PlSS39 probability of transition from S to 839 given a and each nu pair Mlquot 5 l l Rssa expected reward on transition S to S given a dwibm Wm Y 2W 1515 H a y discount rate for delayed reward discretetirneil0 12 o Hl Hz n3 r M W v mxcmlm K v mxumim 4 These are called value lunelions cl evaluation lunetions in Al Experience Build Predictions Value Function Vara Q Q Continual online Simultaneous acting and learning A Markov model ofthe environment Essie pmhahimyoi traudtionirom s to 539 given a Rss39a 4mm rewml ontrmition s to 539 given a States with probabilistic actions Terminal states have rewardsutilities Problem Learn expected utility of each state 1 1 V b x a m s Percepts tell you State Reward Terminal vJexxeicssmnnm 9 Developed in the late 195039s in the area of adaptive control theory Just keep a running average of rewards for each te For each training sequence compute the reward to go for each state in the sequence and update the utilities Accumulate reward as you go back Generates utility estimates that minimize the mean square error LMSupdate v mmcssm mm M A training sequence is an instance of world transitions from an initial state to a terminal state The additive utility assumption utility of a sequence is the sum of the rewards over the states of the sequence Underthis assumption the utility of a state is the expected rewardtogo of that state v uxxucssxwmm m l fie lb Fill 0 01 02 VH3 Hi 11 12 13 isampjsM 39 Uini1Ri U0quot ni U0n n1 ni ni1 I t v mxucssxlrmm 12 Cnnverges very slowly because it ignores the relationship between neighboring states Utilities ofneighboring states are mutually nstrained Ui Ri 2 PJ U0 Can apply dynamic programming to solve the system of equations one eq per state Can use value iteration initialize utilities based on the rewards and update allvalues based on the above equation mnmsinmm u Approximate the constraint equations without solving them for all states Modify Ui wheneverwe see a transition from ito j using the following rule 7 Ui Ui a mi Um Ui v2 mmquot The modi cation moves Ui closerto satisfying the original equation Flewritethis V5 ltV5 Dt7 y VS39 V5 TD error to get V5 e1 aV5 otyyV539 mnmsinmm m VSlt1 aVS ar yVS39 Sutton 1988 vse1 aVS aREWARDpmh A er each 5 antmn update the State s I x s muva n39s muva x smuve u smuve Aisumeammp mnppnnem r m hemmlmes malls Immkgs x smnve vumcsiwzm w mnmsixmlm 2n 1 Make a table With one eme per mm Take mmquot 0 vumcsiwnm S 16M 7 summed mha g nfwmmn 4 s 7 if 5 7 2 Now play lots of games 3 i W To ple our moves look ahead one step 3 n I E yanmzspmm 3 n draw nmslais d mammmmmm ma pmh nfwmmng 7 melatgstm syn mm mm nmenmepmk ammeatmdnm muplamlmjnww gumuuemhackugs hclemen ly mmpms a mum a mxnuenlmese hackupsssmesae usmu NH 0 a Is a 7 mp TD 11 SlmpleMnnhe Cal LinM I u 3 Emma me backups 34E Nnmetevmwmslm nutmmebase an mm He nnpnnem 3 mm a lt7 mphmmry39 mm s 7 h m hefme uurgzedy mnve s 7 mnve We lummemealzh m mm m 7 abachuu m 7m mm 7 m axmallpmmvefranmn eg a 71 the my w plumb Emwsu v Search Dynsmu pmgrammmg Backup m 6 AR 1mm an naclupi H termma smes We 7 mm nacxups Ema mm snaucw I as 3 mm mm 1 Choose best action from any state s using learned Vquot nsargamax rv a vV5v a A problem 0 This works well if agent knows 5 SanS and r Sx Asia 0 But when it doesn t it can t choose actions this way How Much To do we Need to Know To Learn v mew m 25 0quot Optimal policy Define new function very similar to Vquot Q Warm VV6ampa lf agent learns Q it can choose optimal action even without knowing r or 6 71s argamaxrsa yV6sa 71s argamch s a Q is the evaluation function the agent will Note Q and V closely related VYJ max Q6419 11 Which allows us to write Q recursively as Qapafr 6902 ill60302 7 Jya 7mg Q Sena Let Q denote learner s current approximation to Q Consider training rule Qsa r r maxQltsza39 a Where v is the state resulting from applying action a in state anda is the set of actions from v v Wis mi 22 v 392 Learning d Foreach 512 initialize table entry Qae 0 Observe current state 5 Do forever Select an action a and execute it Receive immediate reward r Observe the new state Updatetnetable entry for Qa as follows Qa rymaxQ39a39 a 759539 vumcsiwnm mum v mm c Mar tl tylu 4 l l men nHmmxln Lilllt fill mum dtcwnrtl limlr it39vmllt39 llirn ivquot t in or wiglu1i m gt hum v lt 10K 21310 mam What if reward and next state are non deterministic We redefine VQ by taking expected values V SEEnmmvmz J m QWEEM a WW 50 vumcsiwnm Q learning generalizes to nondeterministic worlds Alter training rule to NW lammn anlHuaagt mi39 1 1 vuztm 511 Where on Can still prove convergence of to Q Watkins and Dayan 1992 Q learning reduce discrepancy between successive Q esLinmes QA St at 1 AQle at it QZxt at 1293 s at One step u39me difference E 31 9137 at H y gagomy a qmv ent expressmn Tvrm step Lime difference A QA51391yl AmaXQstatAQA511al1 Q2szaz a y rt1 VzmaaXQsz2L1 N step time difference 1 A TD A algorithm uses above training rule QKGBM 39 H 7 39t1 701 gtVtquot1 Vnm XQStngt 0 Sometimes converges faster than Q leaIning converges for learning Vquot for any 0 S A 1 Dayan Blend 211 nthese 1992 QA st at 1 AQlst gt at A 92 st W 12 93 5 gt at Tesauro s TDGammon uses this algorithm e closer lzmda is m the tune more impunan later differences as i we m 14 Problem choosing actions with the highest Is it betterto learn a model and a utility function orto learn an actionvalue function with no expeged Ut39l39ty 39gnores the Contr39bunon to model learning I Tradeo 39befween immediate good and long This is a fundamental question in Al where much term good exploration VS exploitation of the research is based on a knowledgebased approaCh39 A randomwalk agent learns faster but never Some researchers claim that the availability of uses that knowledge model free methods such as Qlearning means that the KB approach is unnecessary or too A greedy agent learns Very SIDWIY and acts complex based on current inaccurate knowledge v uxxi m m 35 r mus amt Give some weight to actions that were not tried very often in a given state but counter that by knowledge that utility may be low Key idea is that in early stages of learning estimations can be unrealistic low Similar to simulated annealing in that in the early phase of search more willing to explore Too many states Can define Q as a weighted sum of state features or a neural net Adjust the previous equations to update weights rather than updating Q Can have different neural networks for each action This approach used very successfully in TD Gammon neural network Continuous statespace Can discretize it Polebalancing example 1968 Time steps need not refer to fixed intervals of real time Actions can be low level eg voltages to motors or high level eg accept a job offer mental eg shift in focus of attention etc States can be lowlevel sensations or they can be abstract symbolic based on memory or subjective eg the state of being surprised or lost Encode specific experiences in memory rather than abstractions Carry out generalizations at the retrieval time rather than the storage time lazy learning In their most general form Based on partial match on a similarity metric retrieve a set of casesinstances most relevant to the present Adapt the retrieved experiences to new situations This could be based on algorithms ranging from a simple nearest neighbor classi cation to chains of reasoning Key idea juststore all training examplntx jtx Neamt neighbor 7 Gwen ouery indanLexz liist loLateneaiesttiaining exempiexm then estimate f WVf n Kiieamnreigirhor ac n takeyoteamongitskneaiestneighboismoisuete valueotaigetlunrtion 7 Take mean or neiues olkneaiest neighbors riealryalueo www On lell Positive and Negative Training Examples 39Nearest neighboryes for x 57k ctassilies it as nega On right is decision surlace or nearest neighbor query in region will have same value amuse u Miglt wantweight nearer neighbors more mm wwmm Where tit rim And dxq x is distance between yr 1 Note now it makes sense to use all training examples instead ofjust k 7 Classification much slower Imagine instances described by 20 attributes but only 2 are relevant to target function Curse o dimensionality nearest neighbor is easily mislead when highdimensional X Similarto cyerfitting One approach Stretch yth axis by Weight 21 Where 21 H 2 chosen to minimize prediction error 39 Length the axesthat correspond to the more relevant attributes Use crossvalidation to automatically choose Weights in rzn 39 Minimize errorin classi cation 39 Setting 4 to Zero eliminatesthis dimension all together amuse u V 7 I 7 L is no Instances map to points in W Continuous real values Less than 20 attributes per instance Lots of training data Advantages Training is very fast Robust to noise training data Learn complex target functions Don t lose information Siowa ueiy time Easiiv fooled by irrelevant attributes Can apply instancebased learning even whenX in Need different distance metr39c CaseB ed Rea applied to instances with symbolic logic t39ons soning is instancebased learning descrip I usercomplalnt errorsaonshutdown PowerPC operatingsystem windows networkconnectmn Pom memory iameg Installedapplicatmns Excel Netscape Vlrusscan dlsk lglg likelycause 7 a pf Jr i i Note mmormsloml wpmximaionto nor eadi guerypoint xq not him an explic approximation fx ior region sunoundingxq 7 Fri iirimi MYCHDMD mares neignbas mewn39wnan39 wnzn 7 Fri uadraiic e Pmdums pimmseappmumaim ini Sweial choices onnono minimixe r Smaim eiminvei knmieatn neighbors 1 E n i 2 J he omnan x ffxii e Disamewagned squared em nveiaii neigntmis 1 1 E In ELfIfz mo 1 Key elements of problem solving CBR are Cases represented as solved problems index cases under goals satisfied and planning problems avo ded Retrieve prior case sharing the most goals amp aoiding the most prob ems Adapt solution of prior case to solve a new case Mav require resoiving problems andor repairing solutions index new case and solution under goals satisfied and planning problems avoided CADET 75 stored examples of mechanical devices Each training example qualitative function mechanical structure New query desired function Target value mechanical structure for this function Distance metric match qualitative function descriptions Size of largest subgraph between two function graphs CHEF consists or six processes 7 Problem anticipation the planner anticipates planning problems by noticing features in the current input that haye preyiously participated in past planning pro ems 7 Plan retrieval The planner searches for a plan that satisfies as many of its current goals as possible While ayoiding the problems that it has predicted 7 Plan modification The planner alerts the plans it has found to sausfy any goals from the input that are not already achieved 7 Plan re air When a plan failstne planner fixes thefaculty plan by building up a casual explanation ofwny tne failure has occurred and using it to find the different strategies for repairing it 7 Credit assignment Along Wlth repairing a failed plan the planner Wants to repalrthe Chafacteflzatloh of the World that allowed lt to Create the failed plan in thefirst place ltdoes this by using the casual explanation ofwny the failure occurred to identify the features in tne lhputthat led to problem a d then arkthe as predictive of lt 7 Plan storage The planner places successful plans in memory indexed by the goals that they satisfy and the problems that they avold t in new nmt chop i warm 9 Umitiit i s 9 a u i quotT l i I up i Awnmwaumi mm iaxai rim t a n Mm A nar pound of beef Twotablespoons of soy sauoe e ne teaspoon sugar A half pound of green bean One teaspoon d salt the garlic into pieces the size of matchheads red the beef sh Mann rf surfry the spices rice wme green bea ate the beef in the garlic sugar corn starch ncewine and soy sauce ry the spices rice Wine and beeffor one minute he green bean to the spices rice Wine and beef n and beef forthree minutes beef Addthe salttothespices ricewtne green bean and t in new nmt Rectpetot BEEFVANDVBROCOU Found heatest tectpe ts BEEFVWTTHVGREENVBEANS Modtthhg tectpe BEEFVWtTt tGREENrBEANS To sattsfy thctude thCCOtt th the dtsh Ptacthg some thOCOtt th tectpe BEEFrWtTHrGREENrBEANS rCmStdettttg thgtec ertrcttttc Before dothg step SttthytherVattabte do Chop the broccott thto pteces the stze of chunks rtttgtedtettt CttttC apptted Chet alters otd planstosatisty new goals using a set at semen modi cation rules and a sa 1 nah obi hhattpntthn ntheet Ohe easpnnh mugs Twn antespnnhs ntsnyattae A hath pm at tmcmt Ohe ta pmh ntttce he one Eath ntsatt Ahatta espnnhntmth tatch one want ntgalttt chopthe game m pacesthe sue mutants ream be megamt sugar mm stack nae wnean 50 sun was A mesntnthe SIDES nee WM WEEDquot ll he The dtsh hEIW tastes satty The dtsh hEIW tastes sweet e dtsh hEIW tastes tttegama The tteettsheh tehde Thedtsh hEIW tastes satEly metmeottlinntcnq Th 39 ALTERVPLAN SDEEFFECT Reptaee the step that Causesthe vtdtatthg Ddhdttmh thh dhe that ddes hdt have the s the etTeDt but aDhtevesthe same gdat 39 SLPtTrANDrREFORht Sptttthe step thtd th separate steps W and tuh them thdepehdeht 39 ADJUNTVPLAN REMOVE Add a new step tn be run atth thh that Causes a stderetteUt that temdvesthe stderetTeDt as tt ts Created tttdetttttg BEEFVANDVBROCCOU under goats and ptobteths tt thts ptah ts suocessfut thetdtowthg shoutd bettue The beet ts how tender The thOCOtt ts how cttsp thctude beet th the dtsh thctude thCCOtt th the dtsh M atlte a Stttrtty dtsh The ptah avotds tattute exemptttted bythe state The thOCOtt ts how soggy th tectpe BEEFVANDVBROCCOU Searching lor plan that satisties input goals lndud d1idlten in the dish lndude snow pea in the dish Make a stirtry dish Collecting and activating tests IS the dish STYLESTIRERY Is the item a MEAT IS the item a VEGETABLE IS the TEXTURE 0T item CRISP ChidltenSnow peaStir Frying Failure Meat sweats when it is stirtriedquot Stirlrying in too mud1 li uid makes aisp vegetables soggyquot Reminded ol alailure in the BEEFANDBROCCOLI plan Failure The vegetable is now soggyquot v gxxr m m 57 Learnin First work done at Umass on learning rules of Baseball Analytical Learning ExplanationBased g o A Quick Overview of Planning v gxxr m m 59 Driving down on Make a stirin dish Suoceeded Driving down on Avoid lailure exemplilied bythe state The brocooli is now soggyquot in redpe BEEFANDBROCCOLI Suoceeded Driving down on lndude d1icken in the dish Failed Trying more general go Driving down on nclude meat in the dish Sucoeeded Driving down on lndude snow pea in the dish FailedTwing more general goal Driving down on Include vegetable in the dish Found redpegt RECQ BEEFANDBROCCOLI u in mm m 59 39 Example 1 You consider buying a program to 39 manage your nances that costs 100 There is a prior probability of 07 that the program is suitable in which case it will have a positive effect on your workworth 500 There is a probability of 03 that the program is not suitable in which case it will have no effect Victor Lesser 39 What is the value of knowing whether the CMPSCI 683 program is suitable before buying it Fall 2004 vtmwamrm Z Expected utility given information The general case We assume that 07500100o30 exact evidence can be obtained about the value of some random variable El Expected utility not given information The agent39s current knowledge is E 07l500100lw3lo100l The value of the current best action 1 is de ned by 39 Value Of information Euer maxA g PResultyADoAE 07500100030 07500100030 UResult A 100 28025030 v mxcmlm 2 v mxcmrm l Vl th the information the value of the new best action will be EUuEEE maxA zPResuItA DoAEE UResultA But E is a random variable whose value is currently unknown so we must average over all possible values e1k using our current belief VPIE E Xy yLElEUwepl E E elk EUu E vummm 5 A decision tree is an explicit representation of all the possible scenarios from a given state Each path corresponds to decisions made by the agent actions taken possible observations state changes and a nal outcome node Similar to a game played against nature vummm 7 Decision Trees Decision Networks Markov Decision Processes MDPs D Decision node or Chance nude 380K n1 50K splay 275K e 7 310K 490K minurmangesPi 210K mayurdiangesPilS 400K Emma atm ko hui ww ih mm ruse I6 75KlU6i 2iml hlEIE MBElKF lLM W1 7 321UKlU3 36mK131E1 There are two candidate cars C1 and Cl each can be of good quality or bad quality 7 There are two possibietests T1 on C1 on C1 costs 20 C1 costs 1500 500 below market value but it it is of bad quality repair cost is 7 son gain oer ll39l lost CZ costs 1150 250 below market value but it it is of bad quality repair cost is 1 7 250 gain or 100 gain Buyer must buy one of the cars and can perform at most ohetest What other information costs 50 and T2 The chances that the cars are of good quality are 070 for C1 and 080 for CZ Test T1 will con rm good quality with probability 080 and will con rm bad quality with probability 065 Test T2 will con rm good quality with probability 075 and will con rm bad quality with probability 070 mnmsixmlm W Demsinh DnTest n nu lailstzny c2 eisemiyc T2 man u i Traverse the tree in a depthfirst mahher Buyer knmws car 01 i5 9quot quality iai Assign avalueto each lealnode based on the 7M P01900d 7 ihi Calculate the avera e utility at each chahce Buyer knows car CI is 900d quality a 13 Pczgood 1 Calculatethe maximumutility at each decision nodewhile markingthe maximum hrahch Test 1 check qual39ty ofcar 0 Pt1passlc1good 8 Pt1passlc1bad 35 2 Trace bachthe marked brahches from the roct hode down to fihd the desired optimai cohditiohai piahi Test oftz check quality of car 01 Ptzpasslczgood 75 Ptpasslcbad 3 Finding the vaiue of perfect or imperfect information in a decision tree vummm u mm Case1 e Pcignmm2raiip 010000 7 e Utiiity 2000450020 400 Case2 e Human04am pcihaa e Pt2daii02gnnd P020000Pt2iaii e 25x B2Pt2iaii e nrmaiize 234 i034 warm 7 Utiiity 14004 150720 230 ase e PE2badi04aii e madamawn P02badPi2iaii a 7x2 i4P0daii e i e Utiiity14007i150720715080 Eggected Utility 0f Chance Node of 170cignnd 3 e Utiiity 2000i 5007700720 7220 Expected Utility of Chance Node of 1amp2 r 7 X480 3X7220 270 an run 7 59 gt030 41 X80 iBB 5 vummm is mm What is the decision if Decide to do test 12 It comes out false Do you buy c1 or c2 Ecltest t2iail Expected Utility of Chance Node of 1amp2 270 Ec2test t2lail Expected Utility of Chance Node of 3amp4 l68 5 vumcsiwzm l 5 nite set of domain states A nite set of actions P la state transition function ra reward function can get reward at any point 50 initial state The Markov assumption Hill x17s72saPxl 5411 vumcsiwzm W A model of sequential decisionmaking developed in operations research in the 1950 s Allows reasoning about actions with uncertain outcomes MDPs have been adopted ythe Al community as a framewor for Decisiontheoretic planning engeanet 19951 Reinforcement learning a a Bano et ai loam b k uumizsixmim ix A palsy is a slalsa aiwlai aslal to slaasa at aasl slate Al Optimal Palsy is a palsy where yaa are always Dhnnsin ila 2 lilaimalmlasilalaiam r ily o asalalisia 111 f 611388 Actions succeed With probabilityO 8 and move at right angles With probabilityO l remain in the same postion when there is a wall Actions incur a small cost 0 04 uumizsixmim aa dummy 2 Decision networks or in uence diagrams are an extension of belief networks that allow for reasoning about actions and utility The network represents information about the Solutioiiisa agent s current state its possible actions the simple palll solution is m solution is a ossible outcome ofthose actions and their detamIHISIIE acyclic la n cycIIc glap1 pt 1 Nondelamlnisllc Allows lol lnllnlte quotquotY ce 0 outcomes EEIIOH dumesm a Decision trees are not convenient for representing domain knowledge Requires tremendous amount of storage 39Multiple dedisidns nodesquot expands tree Rain v oddiidalidn di knowledge aidnd ditlelent pains e Julnl Probability Dislnddlidn vs Bayes Net WeatherRep ort O Generate decision tree on the y from more b l Utlhty economical forms of knowled e Um re a Depthfirst expansion oitree for computing optimal deemquot Parameters PRain PWeatnerReporthaini PWeatnerReportleain UtilityRainiUmbrella Chance nodes ovals have CPTs conditional probability tables that depend on the states of the parent nodes chance or decision Decision nodes squares represent options available to the decision maker Utility nodes Diamonds orvalue nodes represent the overall utility based on the states of the parent nodes v mm m 25 1 The directed graph has no cycles 2 The utility nodes have no children 3 There is a directed path that contains all of the decision nodes 4 A CPT is attached to each chance node specifying PAparentsA 5 A real valued function over parentsU is attached to each utility node v mm m 27 Causal knowledge about how events influence each other in the domain Knowledge about what action sequences are feasible in any given set of circumstances Lays out possible temporal ordering of decisions Normative knowledge about how desirable the consequences are t mm m za Links into d on nodes are called information linksquot and they Indicate that the state ofthe parent is known prior to the decision The directed path that goes through all the decision nodes de nes a temporal sequence of decisions It also partitions the chance var39ables into sets I0 is the vars observed before any decision is made I1 is the vars o served after the rst and beforet e second decision etc Iquot is the set of unobserved vars The noforgettingquot assumption is that the decision mbers pas observations and decisions Nun 39 t in mm m 22 Ulilily dealhsnniserosl PcnthighairportsitsDrieimlrlra lFlqulllgatlthiuh constructionniglil Two months before the harvest of a vvheat eld the farmer observes the state 0 of the crop and he observes whether it has been attacked by mildew M If there is an attack he will decide on a treatment with fungicides There are five variables 0 fair f not too bad n average a good g M no no little I moderate m severe s H state on plus M rotten rbad b poor p 00 observation of Q imperfect information on Q 0M observation of M imperfect information on M i Set the eVidence variables forthe current state 2 For each possible value of the decision node a Set the decision node to that value b Calculate the posterior probabilities for the parent nodes ofthe vtilt n de c Calculate the expected utility forthe action 3 Return the action With the highest utility Similar to Cutset Conditioning of a Multiply Connected Belief Network 0 or d 39 Asingle decision nodeD may have links to some chance nod Aset oi utility quotnations U U over domains X X Goal nd the decision d that maximizes EUDI1 v Howtosowesuch piohlems using astandaid aayesian netwoik package Bowie UXPXiaa 2UAXAPXA ID E Need a more complex evaluation technique since generating a policy mmizsixmim at fTnotest a Buyiiauyzi fTdotestt1 a iiiipass Buyi eise ii ii daii Buyz a iiiipass BuyZ eise ii iidaii Buyi a Buyi 7 Eu 2 fTdotestt2 7 Same as above A POLICY IS A SEQUENTIAL SET OF DECISIONS EACH POTENTIALLY BASED ON THE OUTCOME OF PREVIOUS DECISIONS Basic idea Ross Snacntei Peiioiin a sequence uiiransiurrnaiiuns iu ine diagram inai pieseive ine nptimai pniicy and itsvaiue uniii oniy t e UTILI nu e remains a Siniiarto ideas Mtransiorrnation into polytiee Four basic vaiueutiiityEreserving reductions Barren node removai Cnance node removai marginaiization Decision node removai maximization Arc reversai Bayes ruie mmizsixmim ad LetX represent a subset of nodes of interest in an influence diagram Let Xk represent a subset of evidence nodes We are interested in mpg 9 A node is barren if it has no successors and it is not a member anl or k The elimination of barren nodes does not affect the value of Fwy 9 mnmsinmm ax Nude i diremiy imked in mm nude v nodes connectedto tnottoi and nodes connected to i butnottov Assume null o 10 now mnmsinmm m r mn l liaan Nude Reillm ll waft EDNABIIJJ FiAlPlB lNPlCl BAIguPlD l C Rmnuval into Value Nude hyExpectntion x l E mw VlDlZleXD P00 Ln vi 39 n vummm u Given an in uence diagram containing an arc from i to j but no other directed path from i to j it is possible to to transform the diagram to one with an arc from to i lfj is deterministic probabilistic ifquot Q 4 XI Arc Reversal f then it becomes Y gt X PaAPa V V PtlAlvl39albl X Pa EMMA Q 1 HA l BX 1Zl Pill lAYZil tl l XYVPiB lX ZJ L arents PaAPaB parents of Awho are notparents ofB w r r u Decision Example 1 Rcwl39sc XY AI 1 Remuval ofX by Expeclaliun in v 1 Removal oID by maximiznliun illm V J g1 w you LIJe uplimal pulley furD given y Victor R Lesser CMPSCI 683 Fall 2004 Search and Agents Material at the end of last lecture Continuation of Simple Search The use of background knowledge to accelerate search Understand how to devise heuristics Understand the A and IDA algorihms Reading Sections 4142 CharacteristicsofMoreComplexSearch Subproblem interaction More complex view of operatorcontrol costs Uncertainlyin search Nonmonotonic domains Search redundancy Vtmltai im There are four phases to problem solving 1 Goal formulation based on currentvvorld state determine an appropriate goal descnbes desirable states of the world goal formulation may involve general goals or specific goals 2 Problem formulation formalize the problem in terms of states and actions state space representation 3 Problem solution via search find sequence s of actions that lead to goal statefs possibly select best of the sequences 4 Execution phase carry out actions in selected sequence Mummy A completely autonomous agent would have to carry out all four phases 0ften goal and problem formulation are carried out priorto agent design and the agent is given speci c goal instances agents perform only search and execution genera goal formulation problem formulation speciic goal formulation etc For nonagent problem solving a solution maybe simplya speci c goalthal is achievable reachable there maybe no execulion phase The execution phase fora realworld agent can be complex sincethe agent must deal with uncertainty and errors Vtmltai im solving can slmplify agent design Adopting goals and using them to direct problem lntelligentJrational agent means selecting best actions relative to a PM but PMs may be complex multiple attributes with tradeoffs Goals simplify reasoning by limlting agent objectives but still organizedirect behavior Optimal vs satis cing behavior best performance vs goal achleved May use both goals to identify acceptable states plus PM to e 39 among goals and their possible solutlons Complete searchbased problem solving involves both the search process and the execution of the selected action sequence 7 Total cost or search7based problem solylng ls the sum or the search costs and the path costs operatorsequence cost Dealing with total cost may require 7 Comblmng apples and oranges e g trayel mlles and CPU tlme 7 Haylng to make a tradeorr between search tlme and solutlon cost optlmallty resource allocatlon 7 These lSSueS must be handled ln the performance measure y omega nmt 5 u Wrasse nmt Problems can vary in a number ofways that can affect the details of how problemsolving search agents are ui t Contingency problems One catego 39zation is presented in AIMA related dumg exec mo Ma 5 to accessi ty and determinism 7 slnglestate problems 7 Must calculate tree or actlons ror eyery contll lgency 7 lnterleaving search and executlon may be better Agent knows lnltlal state and exact errect oreach acllol l Respond to state or world after executlon or actlon wlth Search oyerslngle slates mew OU Come WW 7 Multlple7state problems Exploration problems Agent cannot know lts exact lrlltlal state andorthe exact A gen may haye no lnrormatlon about the errects or lts effect of lts 60mm actlons and must expenment and learn MU Seam Over State 5 7 Search H l real world ys model May or may notbe able to nnd a guaranteed Solutlon y mm m 7 t mm m r Exact predlctlol l l5 lmposslble but States may be determll led ensll lg Uninformed strategies do not use any information about how close a node might be to a goal additional cost to reach goal They differ in the order that nodes are expanded and operator cost assumptions v New m a While uninformed search methods can in principle find solutions to any state space problem they are typically too inefficient to do so in practice Informed search methods use problem speci c knowledge to improve average search performance I mm amt m Heuristic problemspeci c knowledge that reduces expected search effort Informed search uses a heuristic evaluation function that denotes the relative desirability of expanding a nodestate often include some estimate ofthe cost to reach e nearest goal state from the current state In blind search techniques such knowledge can be n oded only via state space and operator representation v New m M Travel planning Euclidean distance 8puzzle Manhattan distance Number of misplaced tiles Traveling salesman problem Minimum spanning tree 39 Where do heuristics come from t mm amt 2 Heuristics can be generated via simplified models of the problem Simplification can be modeled as deleting constraints on operators Key property Heuristic can be calculated efficiently v New m 13 Start with OPEN containing just the initial state there are no nodes left on OPEN do N a C a a a o L39 G a z o 2 c For each successor do i If it has not been generated before evaluate it add it to OPEN and record its parent ii If it has been generated before change the parent if this new path is better than the previous one In that case update the cost of getting to this node and to any successors that this node may already have v New m 15 Idea use an evaluation function for each node which estimates its desirability Expand most desirable unexpanded node Implementation open list is sorted in decreasing order of desirability t mesa amt 14 Do not regenerate the state you just came from Do not create paths with cycles Do not generate any state that was generated before using a hash table to store all generated nodes i mesa amt a Simple form of best rst search Heuristic evaluation function hn estimates the cost from n to the closest goal Example straightline distance from n to Bucharest Greedy search expands the node that appears to be closets to the goal Properties of greedy search Uses a lot of space The resulting algorithm is complete in finite trees but not optimal v New nmt n t mm m is There are a variety of factors that can affect the choice Search strate ies f h t at d d t f h rBestrfirStsearcMaka ordered searcn seam S r egyaquot quotsewnquot seaquot 39 reedy a k a bestefirst a Branching factor Ar 7 Expected depthlength ofsolutiori ordered deptnrirsuaka hiltrclimbmg 7 Time v5 Space iirnitations nMemonrbounded search e Mditipie goai states aridor initiai states iterative deepenin A iDA t a e mpllcltOrEXplicitGoalState specirication Timjm g gzzwmmed A W 7 Uniform v5 rioriruriiforrn operatorcostsi Anytime Ar 7 Anysolution v5 optirnai Solution eamhmg and MW 7 Soiution patn v5 statei eiterative imprdement algorithms generatearidrtest approaches Number of acceptable Solution stateai steepest ascentniiiciirnoing e Dirrerent rorward v5 backward prancning ractors Randomerestart hilleclimbmg r can the same state be reached with different operator sequences Simulated anneaiing 7MultirLevelMultirDirnenSionat Search v New nmt ig t mm m 2o Example tracing Aquot with twa different heuristics an an M E Wmmm mum 39 Similar to best rst search except that the evaluation is based on total path solution cost n gn Mn where gn cost of path from the initial state to n Iln estimate of the remaining distance ml unmltl llen1uumlylrvvn at W H Um Wilmal 5mm 75125 tn m me ln Yum mmmu MadPrint Lyrrllwwn Maw usmm ltnu u m Admissible heuristic never overestimates the actual cost to reach a goal Monotone heuristic thefvalue never decreases along any path When h is admissible monotonicity can be maintained when combined with pathmax n39 mumquot gn hn39 Does monotonicity infimply admissibility 3 Intuitive explanation for monotone h lfh is a lowerbound thenfis a lowerbound on shortestpath through that node Therefore f never decreases It is obvious that the rst39solution found is optimal as long as a solution is accepted when snlutinn s nmie for every other nude t in mm m Let 0 be an optimal solution with path cost 2 Let SO be a suboptimal goal state that is 350 Suppose that A terminates the search with SO Let n be a leaf node on the optimal path to 0 Z n admissibility of n z SU nwas not chosen for expansion z n z sa SUgS0 S0l5agoalhSO0 1quot 3850 contradictionl A is complete unless there are infinitely many nodes withfn ltfk A is complete when 1 there is a positive lower bound on the cost of operators 2 the branching factor is finite t in mm m locally admissible ilhlllil Milli g costilli Nil 3 nigoalill For a given heurlstlc function no optlmal Each state reached has the minimal gull algorithm is guaranteed to do less work in terms of nodes expanded s mange methty b2 Aside from ties inf A expands every node K Mg necessaryfor the proof that we ve found the quot1 b quot i shortest path and no other nodes N1 quot quotM lt Mills rill1 via NJ slim via Ni ii always xp d dhelorell Not necessary to expand lipN1 quotexpanded llp ll1 also llll l MM 2 uunmsixmlm What is the implications of local monotonicity Amount of storage What happens if h1lth2lth for all states n2 dominates ni What are the implications of overestimating h Suppose you can bound overestimation uumcsixlmb 3i uunmsixmlm Rising While informed search can produce dramatic real iaveragecasei ini rovernents in comoiexity n typically does not eliminatethe potential for exponential iwoistcasei performance metrics e Averageliunbeiafnodes expandedW e PeneliamePdN 7 Effective branching factor If llsolulion depth is o then of is the bianLliing laLtoitnat a union seam tree would nave to nave to geneiate Nnodes Niitritzi D ju EBFtenusto be ielatnel independent oftne solution oeom Note that these de nitions completer ignore the Cost orapptyino the heuristic function uumcsiwnm 11 sew L ml Ellwtiie aiithuiii ruiqi 1 mm new IDS i inn iw I u 39v ti l39 l l n It Ht iii a a m a l u i it E c 9 15 2V l3 i x s m in mum m H t WM l 223 H1 1 e m e t l E it u e n e if 1 LI l2 n M i if 14 1M Ht ll a Search cost involves both the cost to xpand nodes and the cost to apply heuristic function Typically there is a tradeoff between the cost and performance of a heuristic function E g We can always get a perfect heuristic function by naVing tne function do a searcn to find the solution andtnen use that solution to compute timode uumcsiwnm I This tradeoff is o en referred to as the meta level vs baselevel tradeoff Baselevel refers to the operator level at which the problem will actually be solved Metalevel refers to the control level at which we decide how to solve the problem We must evaluate the cost to execute the neun39stic function relative to the cost of expanding nodes and the reduction in nodes expanded mmiosixmmt so A requires open amp close list for remembering nodes Can lead to very large storage requirements Exploit the idea that i a h s i actual cost create incremental subspaces that can be searched depthfirst much less storage Key issue is how much extra computation How bad an underestimate if how many steps does it take to get l Worse case N computation for A versus Nlior lDA Beginning with an fbound equal to the fvalue of the initial state perform a depth rst search bounded bythefbound instead ofa depth bound Unless the goal is found increasethe fbound to the lowest fvalue found in the previous search that exceeds the previous fbound and restart the depth rst search Use depth rst search with fcost limit instead of depth limit IDA is complete and optimal but it uses less memory 0bf lc and more time than A I n I D I n l n o lmmtl n l mmmnl 1mm mount Algorithm lterativeDeepeningA contours 1 Set THRESHOLD the heuristic evaluation ofthe start state 2 Conduct a depth rst search based on minimal cost from current node pruning any branch when its total cost function g h39 exceeds THRESHOLD If a solution path is found during the search return it 3 Otherwise increment THR 5 OLD by the minimum amount it was exceeded during the previous step and then go to Step startstate aiwavs on path so initiai estimate is aiwavs overestimate and neveroeereasing v in new nml lDA is asvmptotieaiiv same time as A but oniv 0a in space 7 versus rorm 7mm mm Ana 0od e Avoids overheadofsorted queueofnodes Sw mtsumxa wlulmd sign mm Nana nsg We Rimmuhg lDA is simpier to impiement e no ciosed iists iimited open iist was finr gas ow M News xgmim ii w ifquot in Korf s isepuzzie experiments lDA soived aii problems ran faster We 5 arm pimpsk 33m to even though itgenerated more node tn unavng is seim iei Via i 3 1 Nodes are lab 7 A solved no problems due to insm cient space ran slower than IDA sled wm r g M The n values are the stralthlne distances to Eucharest What is the next conmum v in new nml Continuation of Discussion of DA Other Time and Space Variations of A v igxxy m m 45 Victor Lesser CMPSCI 683 Fall 2004 Local Search Heuristic Repair CSP and 3SAl Solving CSPs using Systematic Search The relationship between problem structure and complexity vimwemim Z Aset of variablesX1X and a set of constraints C1Cm Each variable has a domain D of possible values A solution to a CSP a complete assignment to all variables that satisfies all the constraints Representation of constraints as predicates Visualizing a CSP as a constraint graph vmmrm Constraint graph nodes are variable arcs Show constraints ti o Maquot inMiiniu Variables WA NT Q NEW V Sl T v Domains D redpreenblue Constraints adjacent regions must have dillerent colors on WA NT ll the language allorrs llils or lll39A NT 6 rerl preenl lredlluel green mull green blue Wm sumquot are assignments satisfying all mmquot c g ll39AredNTgreenQred1 7l l reenl red4blueTgIeen TWO FOUR oo n1ox1 x1ww u1ox2 X139l3939l39 o1ox3 x3 r 39 alldimri39l39iurwrkro Snrmnrevariables Between09FTUWR0 Constraints BetweenOl x1x2x3 Finite domains squeens matching cryptarithmetic jnb assignment Finitedomain C Boolean C 3SAT NPcomplete In nite domainsjnb scheduling Cannot enumerate all possibilities 39Overlne range olinlegers Need a constraint language 39Sla nprSlar n 3 Bound range vummm 7 Representing preferences versus absolute constraints Weighted by constraints violatedsatis ed Constraint optimization is generally more complicated Can be solved using local search techniques Hard to find optimal solutions Create a complete but inconsistent assignment Successor state change value of one variable Use heuristic repair methods to reduce the number of conflicts iterative improvement The min con icts heuristic choose a value for a variable that minimizes the number of remaining con icts Hill climbing on the number of violated constraints Repair constraint violations until a consistent assignment is achieved of 50 ste Can solve the millionqueens problem in an average ps function MIN 39ONFLICTsmp mainstay returns a solution or failure inputs Sp 394 CUnSll llllll satisfaction pwb maxsteps the number of steps allowed before giving up local variables urr39erir a complete assignment a Varin lL value a Value for a Variable a E t39ziriwi lt an inilial complete assignineul for my for 1quot l to minty slit a Mr 4 a randomly chosen con ictedvnrinhle from VARTAHT 511 mine i ll value or var that minimizes C iJNlitiL39i sitor v current mp sort mmliie in Current it39ciiwnr 395 a solution for Cap then return current end l lul l l lll f t t tense eat in Preprocessing phase to generate initiai assignment 7 Greedy aigontnin tnat iterates tniougn rOWS placing each queen on tne column wneie it conni ts Witn tne fewest previously placed que c ens Repair phase seiest randomly a queen in a specin rim tnatis in con ictandrnoves ittutne uiumn Within the I same ruw wnErE it cunnicts witn tne renest other ens Atwoastep Solution oran 8aqueen5 problem The numberorremaining conflicts foreacn new position of the Selected queen l5 Shown Algorithm rnoves the queen to the rninaconriict square breaking ties randomly t Marissa amt i2 Nonsystematicsearch hypothesis 7 Deptnnia seam oaoiu organized W R9 alphas 7 Pameunmtesaiempinieunisiaieatnmamnpain rum whim jfn 53212 7 MmemimmnswitnnrsiqueenpiaLe inLenteinMiaiW n1 5 57 J Ta i i tni misc eimnoauueusnnnaueeanyi a n 1n Am mini in new 56 M m m mm mm mm H m 7 BaLk iaLking pingiamtnatian nmv mumsimwam Loiurrnswiinin mm t 5 m mWS iipe oirmpooiiu W 5 7 Distribution oisolutions n1 3 2H 7 Depth rst lines not perinirnweii Where sniutinns Liusteie intiee memedconpnatimrmmes 7 Random miniatan LasVegas aigoinnrn does better but siiii piobiern vunmsiwnm i3 utmmsixmim Informedness hypothesis Heuristic repair is better because it has more Given a propositional sentence determine if it is satis able and if it is show which information that is not avaiiabie to a constructive b cktracking more encompassing View of search Prnpnsmnns have to be true to make the space sentence true SSAT is the problem of nding a satisfying truth assignment for a Minicon ict heuristic seiect a variabi e that is in sentence in a special format con ict and assign it a vaiue that minimizes the number of con icts number or otner variabies tnat Wiii need to be repaired vunmsiwnm is utmmsixmim A hierai is a proposition symbol or n negation mm Mquot n 39 A Diause is a disjnlmion oli elals iiiclause is a disjunction oi 39 Asemence ill CNF omnmunciwe nnrmaiinrm is a conjnlmion 0 is a ZVCNF sememewun inuv Liauses am we pmpnsmnn smbnis vacm Av vmwvcw AT AF AF EF ET 31 CF CF CT LL DF DF DF m 11 m 11 m LSAT pulymmlal hmehutnan t may all pmhlm mm 27 SAT momwowan A Ava A cumsows 03va M M F T B X Pmblem aiuenalonnulamlie pnmsmoml calculus llm aninlerpelanm one uamallesunuerwnen lielonnula curries mllnle oneponlmlnme exisls Dmce urs GSAT nnul aselolelausesu nlx FUPS andMAXTRlES output a sallslwlglmh asslgmens m unwound neon 7 lo MAX TRlES T rammlygemraleulmnassignmen MAX FUPS ion l39TsallSllEs unen Mum T l a mammal varla e suenlml a emngeinusumnassimmemuueslne largesl inueaseinlolal nmbermclauses a more salislieuny T mlnlneuun assignmen mp mversed enu luv eno luv Mum m sallslymgasslmmemlmm eno vumcsixlmll 2i Bl39asedRaldum Walk Willi prohahimy plollowllie slamlanl GSAT scheme 7 le makelhebeslpnaslble lp Willi prohahimy 1 p pink avaiahle oculllillg in some unsaisfleli clause amlllip ilslnnli assignment lllole a possible upnill mm m nunan am we nolex nkglu on me mum scur eon un nine n mm M sci leeneel vumcsixlmll 2e GSAT versls Dausmmm a hadwadwg ayle agmlmn Duuuein mint mmnm 1cm Domain want immulmll mimole 39 in me a Easy Sast 39lahla problems where many solutions llanl Saamalileprolilems unierelewsolmions Easyrewsausrialileprolilans no n my l Assumes onlmlrlalt search in quotI satis dila spaee aml quotI non satis alile space i nepaion o1 proposnionl mnmsixmlm 1b Na39I39ve application of search to CSPs Branching factor is n d at the top level then n1d and so on for 11 levels The tree has n dn leaves even though there are only dn possible complete assignments Na39I39ve form ulation ignores commutativity of all CSPs Solution consider a single variable at each depth of the tree Initial state the empty assignment Successor function a value can be assigned to any variable as long as no constraint is violated Goal test the current assignment is complete Path cost a constant cost for every step n in mm m function UM KTRM KliSHRJHFSpJ returns a nlution orfailure relurn RECL39RstVEBACKTRACKIMNup function 397 U quot T m39 failure if asuigmnonl is complete then return mignmrm vnrlt Srirrr bnwsmn nVmum EthRHEl minimaurigrmzentcnil for each value in ORDERVDOWAIVVALUES l l th JlSSigtlll llf cup do mull i RECLlRSthBACK tnnmn ul hm VNLZEHSSigmllPllIL mp if remr 1211111 then return result end rein rnjallm v We m n in mm m CSP search complexity may be affected by The order in which variables are assigned values The domain values chosen for assignment Variableordering heuristics reduce the bushiness of the search tree by moving failures to upper levels Valueordering heuristics move solutions to the left ofthe search tree so they are found more quickly by backtracking search Good heuristics can reduce search complexity by nearly an order of magnitude vummm z Relate decisions about search control to characteristics of the problem space Characterize the problem topology by a set of texture measures Static and Dynamic MetaLevel Information vummm 3 Key questions 1 Which variable should be assigned next and in what order should the values be tried 2 What are the implications ofthe current variable assignments for the other unassigned variables 3 When a path fails can the search avoid repeating this failure in subsequent paths mmusixmim an Variable ordering The mostconstrainedvariable heuri 7 has lhelewesl legal values 7 reduce branchihglamor 39 The mostconstrainingvariable heuristic 7 involved in largest number olcohstra his 7 likely reduce future branching lactors Value ordering The leastconstrainingvalue heuristic 7 rulesoulthelewestchoiceslorheighborihgvars 7 reduce likelihood nl backtracking De ne The probabilitythat the assignment of a particular value to a particular variable leads to an overall solution to the problem Compute The ratio of complete assignments that are solutions to the problem and have that value for the variable over the total number of possible assignments Heuristic The number of constraints on the variable involving that value ALL BM n Bur dbn 1 Domain Lal3 to we rt Sglmtn more ruv t in 91quot a n E u stquot 5 mariner is 39 istentwith allthe variable do not ckquot c innute The ratio oltlre nurnber olsolutions to tire problem witn constraints on trieuariable in question rernoueiltlrat coulll not be solutions to tire lullyconstrainell problem to tire total nurnber ol solutions to tire problem with constraints on tire variable rernoueil e Let o lne set ol Donstlalnls involving v rLe atne problemwnhoul o in A solullonslo a not solutionsto A solutions a Hioti meansvarlable snoulo be bound early 39 Heuristic The number 0 constraint on the variable asking t euariable in question is tire last one instantiateil ExaetVanable Tighmess Texmres Measures Deemsualred Soluqu NunSuluLluns Hemsbe Variable Tightness Texture Mmme El 48 El 12 I I J Number ctCuustralnto E 12 3 Diana Variable Tightress El El El 57 El El El 33 El El El El ml etl Millie tile ln39l bell Hell in n n n or u isiinitiinrnaerlii Mostconstraining variable Reduce the branching factor by deleting values that are Select for assignment the variable that is involved in quotOtc quot5i5t9nt With the values the Sig the largest number of constraints on unassigne variables v ria Fonrvard checking a simple kind of propagation lesi Also called the searchrearrangement method WA VT 0 MSW V S T lrilialdomains Ai ierV lAiej Leastconstraining value WWW Select a value for the variable that eliminates the WM smallest number values or variables connected With the variable by constraints i e i maximize the number of assignment options still open vumcsiwom mm An arc frotho Yin the constraint ra h is a consistent if for every value of x thgrepis some value of Ythat is consistent with X 6 w Can detect more inconsistencies than fonivard o checking m m a w v at v Can be applied as a preprocessing step before IJIZIDEEIIlZIIIEI search or as a propagation step after each 9 assignment during search m m n m V at Process must be applied repeatedly until no lZIIlEIEIIIIIEI more inconsistencies remain Whi W vumcsiwmi 3 town to emerging liaisonadhemxunriMem my Wain mini Palm is M dxmh APquot y mi m A binary CSP has ai mnsl oinzi arts Eaen are ix n ean nnly ire insened an die agenda d urnes heeause ai mnsl dvaiues niyean be deleted Checking cansislency nian are can he dnne in em e Wnrsl easeiirne enrnniexiiy is oinZdzi Dues nnl reveal every nnssrhie inenns39sieney rum 1mm MAI MIMIk1 IS e n A r spars m WWW w man wan qiis u m a x m in ii 3 Win AU Lb nsnw we ixquot x m nan r rm 2 A thequot H mm A4 ndr y n warm r mm i r M W ir Jrnwnr immunemire m wrung may nan Mimime rnvviizwiirymniiiu m Wmng in UmA m mnrwn le l e um he mm m seine Illlle y immune and ddeie xnnm nomumi mamare A graph Is krcnnsislen ii inr any set nikvanahies inere is arways a enns39sienivaiueinrine klhv 39 hie given any ennsisieni naniai assignrnenunrine diner M v ides r Agvaph is smneiy Krmnssem mi is mansistem in ir Knumhev m nadesinan nd backtmckmg eney niiersirnngerinrrns nl H nerinrrns die cnnslrzinlprnngz n e ddee amaum m haddvawng rmiems aaianee ninnw rnuen nrenrneessing In gen grann In ire k enns39sieni versus rnnre searen Chronological backtracking always backtrack to most recent assignment Not ef cient Con ict set A set of variables that caused the failure Backjumping backtrack to the most recent variable assignment in the con ict set Simple modi cation of BACKTRACKINGSEARCH Every branch pruned by backjumping is also pruned by forward checking Con ictdirected backjumping better de nition of con ict sets leads to better performance v Wis m 45 The complexity of solving a CSP is strongly related to the structure of its constraint graph I 39 39 into 39 yields substantial savings 0dquot u 0dCnc Tree structured problems can be solved in linear time 0nd1 cutset conditioning can reduce a general CSP to a treestructured ne and is very efficient if a small cutset can be found v Wis m 47 Procedure lNFORMEDrEACKTRACK VARSVLEFT VARSVDONE lr all variables are onslsterimhen sblutibn found STOP Let VAR a variable in VARSVLEFTthat is in bnrliet SVDONE Let VALUES list at bussible valuesforVAR urgereg in ascending brger aeebrging tb number at ebnrliets Wim variables in VARSVLEFT For eaen VALUE in VALUES urtil solutlorlfourld lr VALUE gbes net ebrrlietWitn anvvanable tnat is anARSrDONE tnen Assign VALUE tb VAR call lNFORMEDrEACKTRACKOARSVLEFlVARSVDONE engir engrur eng brbeegure Eegln brbgrarn LethRstEFr list at all variables eaen assigned an initial state Let VARSVDONE nil call lNFORMEDVEACKTRACKWARSVLEFT VARSVDONE End brbgrarii 1 Choose a variable as root order variables From root to leaves such that every node 5 parent precedes it in the ordering f 3 E 8 LP 2 For from n down to 2 apply TiElleV EINCONSISTENTPul crltinl NJ 3 Forj from 1 to n assign X consistently with Pur crltX7 t in mm m as 39 JJ CSP an a speniai kind of pmHem smes de ned b va ues of xed sek of variaHas goal tes de ned by mnstlaims on variable va ues Backtracking deplhr m scald with one mime assigned per nod Vamme mdmng and um se ectinn hemistis help signifm y Forwavd checkmg preveMs assignments that guarantee My aim ConsJain proyagatiun mgq are cansxslenq docs addiLRVnd walk to constrain vahxes and detect inconsistencies The CSP ye regardth anws ana sis cf mblem structure Cutset condltwonmg mslantlate In all ways a set of vanaka p y P suc t at the remammg constramt graph is a tree Tue mama csm can be so ved 1 linear a 3mm gze 5 Mme 01f H ch12 my fan for Emu c W Imam mincon ius is usquy effective in practice W Interacting Subprnblems Multilevel Search b ackboard vumcsiwnm 5 Victor Lesser CMPSCI 683 Fall 2004 connections per neuron 39 Neuron tires when its inputs exceed a threshold 39 Inputs are weighted and can have excitory or inhibitory effect 39 individual ring is slow 001 second but bandwidth is very high roll bitssec 39 The brain performs many tasks much faster than a computer Scene recognition time 1 second 39 Learning and graceful degradation vtuuemrm K 39 Neural Networks Representing functions using networks of simple arithmetic computing elements Learning such representations from examples Waterman Computational architectures and cognitive models that are neurallyinspired Faithful to coarse neural constraints not neural models Large numbers of simple neuronlike processing units interconnected through weighted links They do not compute by transmitting symbolically coded messages inhibitory and excrtory signals program raids in the structure oftlre interconnection massive parallelism and no centralized control Waterman g Ability to bring large numbers of interacting Automobile automatic guidance systems constraints to bear on problem solvrng so Credquot application evaluation mongage wastrel screening real estate appraisal Norse resistance error tolerance graceful Ob t t f h t de Jec recognIIonacesc arac ers Ability to do complex multilayer recognition 39 SpeeCh recognition and V0 synthESis with a large number of inputsoutputs quickly Market forecasting automatic bond trading Learning with generalization Robot control process control Biological plausibility Breast cancer cell analysis Potential for speed of processing through ne on and gas exploration gramed Paralle39lsm Image and data compression vunmsixmlm 5 mnmsumm a ciated with n so Dnlpnl 39Pmcessing nuns compute weighted sum oltheir inputs and than u 7 Linear tundlnn mmhines mpus 51m W al w ray Nnnllmartundlmtmnslnrms gmmhined inputs amth value mm mm mm m vunmsixmlm 7 mnmsumm x ulstepfunctjon bSignfunction cSigmoidfunction Can make each nlthese lunmmns athreslmld such that it nutputs al Robust approach to approximating realvalued discretevalue and vectorvalued target functions Learning the Weights and Connectivity wN 0 implies no connectivity among nodes a and a a x Figure 196 um with slap fummntbxthz amvmm lmncanrtas lay gm gym when the input lS greaterthan threslmld can alsn thmugh dummy link v gt XOR mm multilayer nemrk mnmsllmm ml in FeedForward Networks unidirectional links No cycles DAG No internal state other than weights Layered feedforward Each unit is linked only to units in the next layer Synchronized movement of information from layer to layer Relatively understood Wu A rv mini mamp man nutputmde vulmsixmlm u Have bidirectional connections Wltn symmetric Welgnts All units are botn input and output units Activation function is tne sign function can only be i or 7 1 Functions as an associative memory a new example Will cause the net to settle into a training pattern that most closely resembles tne new example Training set cl pnctograpns 7 Each weight is a partial encoding olall photographs 7 Mewstimulus small piece olone olthetlained photographs Aaivation levels olneural units will Mled coma photoglap vulmsixmlm i5 Recurrent Network arbitrary links Activation is fed back to units that caused it Internal state stored in activation levels Can be unstable oscillate etc Can represent more complex functions onionsml H Singlelayered feedforward networks studied in the late 1950 s x1 wl quotll1 1 n lifwnwxL wnxngt0 1 otherwise onionsml in Represents some useful functions linearly separable But some functions not representable vulmsixmim n Perceptron learning rule wlewll era ope reduce difference between obxerved and predicted Value in mall increment where r is the target value oftraining example o is the perceptron output or is a small constant eg 1 called the learning rate r1 is either 1 or1 Stan out with randomly assigned weights between 5 05 vulmsixmim w Local encoding Each attributed single input value Pick appropriate number of distinct values to correspond to distinct smbolic attributed value Distributed encoding One input value for each value of the attribute Value is one or zero Whethervalue has that attribute XbetWeenUan lMistinttinpmsvl vlvlv i X4 vluiv2uiv3uvll tomsmi ix The perceptron learning rule will converge to a set of weights that correctly represents the examples as long as the examples represent a linearly separablequot function II and a is suf mently sma Why does it work Perceptron is doing gradient descent in weight space that has no local minima 7 Optimization in the weight space based on sum of squared errors tomsmi 1o Learning the Will Win nredimte 1 1 Leaming the ma39ority function of 11 inputs 5 r rufwj wLVN ttwkwmw t 09 8 a 72 E a 18 a I elceptmn 1 if Decihinnlree 3 0 3 39 8 t 3 5 06 Perceptmn r 3 01 f Dectsinn tree 59 05 05 o 04 0 4rrrrrrrrr ll Ill 21 30 40 ilt 6t 70 Xlt 9 III II III 20 31 40 ill rt 70 81 9 llltl Training set size Training set size Delta Rule wle wl 012504 04de Drs set of entrre trarnrng examptes tf trarnrng examptes are not hnearty separabte Detta rute converges towards best t approxrrnatron to target concept Use gradrent descent search to searcn nypotnesrs space of possrbte Wergntvectors to nd the Wergnts tnat bestft tne trarnrng exampte Arbrtraryrnrtratwergntvectnr t eacn step Wergnt vector rs attered tn the drrectron tnat produces tne steep escent atong error surface N r r Fm ahmzrumththtwn weights the typnthstsspane Hs m w m mm The mum 1x15 W Q Oba mm mum error 5 reached intimatestheemuni ewmgm i gwe vshxhypndmk mm m a xedxeinf mug mmplex The mw snaws the negated gmdmnt at an particular pant mm the duectmmntle wquot m dang gno ncn39g sleepst descent 31mg the emrsnrfzae vunmsixmlm 2 tauntsmu 2t least mean square ermrnver alt training examples Convergence guaranteed for perceptron since error surface contains only a single global minimum and learning rate sufficiently small large number of iterations Larger learning rate Possibly overshoot minimum in the error surface Can use larger learning rate ifgradually reduce value of learning rate over time Similarlo simulated annealing v uxxicssmrmm 25 Problem with Perceptrons is coverage many functions cannot be represented as a network sum threshold function But with one hidden layer and the sigmoid threshold function can represent any continuous function Choosing the right number of hidden units is still not well understood With two hidden layers can represent any discontinuous function v uxxicssmrmm 27 Incremental gradient descent by updating weights per example 7 wie w at 03x Looks similar to perceptron rule 7 0 not thresholded perceptron no 9 output rather lhresholded linear combinations of inputs wx Reduces cost of each update cycle Needs smaller learning rate More update cycles than gradient descent u uxxicssmrmm BackPropagation Learning How to assess the blame for an error and divide it among the contributing weights at the same and different layers Gradient descent over network weight vector u uxxicssmrmm 012 Wjlaj First level of Back propagation to hidden layer WJtew a39aJ39m AtTt ot39g 2WJt39aJ tJ a onaenmremmogwtmmspmnw econd level of Back propagation to input layer m W W W AJg 2wlgIt2W m 0t J t t Gmdlenl Allen at WW 05 l l Summing the errorterms for each output umt influence by wK thru all weighting each by the WW the degree to which hidden unit is responsible for error in output 3 Provides a way of dividing the calculation of gradient among the units so that change in each weight can be calculated bythe unitto which the weight is attached using only local information Based on minimizing MW 2T raj tmulhple output units t Compute the delta values for the output units using the observed error Starting with output layer repeat the following for each layer in the network Propagate delta values back to previous layer Update the weights between the two layers Typically use sigmoid function g 1 1 equot d Nlceproprty goin gon Gradient of error E with respect to weight w 95 200 odiodil 04 correct on test set 2 z 1 06 all ivIultiluyernelwork h Decision tree 05 1 M i i l l i l l i i I 10 20 3t 50 on 70 80 90 ltlll 10 Training set size Initialize all weights to small random numbers Repeat until satisfied For each training example 1 Compute the network outputs 2 For each output unitk 6 e r 0A 0A1 0A 3 For each hidden unith 6h 0h1 0h ZWM A Aammi 4 Update each weight w 6w Awu where Aw 1611 mnmslmlm at Ndimensional Euclidean Space of network weights Continuous Contrast with discrete space of decision tree Error Measure is differentiable with respect to continous parameters Results in wellde ned error gradient that provides a useful structure for organizing the search for the best hypothesis mnmslmlm 1 Smooth Interpolation between data points Smootnlv varving decision regions Tend to label points in between positive examples as positive examples if no negative examples hm um N ng mam lrciinpk ii K overlilling nml minimum Backprop is susceptible to over tting After initial learning Weights are being tuned to fit idiosyncrasies of training examples and noise Overly complex decision surfaces constructed Weight Decay decrease weight by some small factor during each iteration thru data Keep Weight values small to bias learning against complex decision surfaces Exploit Validation Set Keep track of error in validation set during searcli Use Weight setting that minimizes error Use as stopping crlterla Error Surface can have multiple local minimum Guaranteed to converge only to local minimum Momentum model Weight update partiallv dependent on the nl iteration r Aw ni7 131 aAwJ rt1 r Helps not to get stuckin local minimum 7 Graduallyincreasing the step size of the search in regions where the gradient is unchanging Gradient descent over network weight vector Easily generalizes to any directed graph Will find a local not necessarily global error minim um Minimizes error overtraining examples will it generalize well to subsequent examples I Training is slow can take thousands of iterations lliililoii thiri ii if ii z lliJJDLtil 7 v fi IJiii i ii mmammmmmmm nmzxmmmmmmim Using network alter training is very fast idem mirghugl39 tmmrgmmplsilwm Amsooomrgepcis it lmlwhlc me heimdutnctmtsmrgthzmmdmzimmmh my lime dmmmthz mm the hidden m in mm 1mm 3 If inmtlisthestar ai hmaxymmdirgfnmg hrL lWhlc i plumsmi u Instances are re resentedb man attribute valuepairs p y y Reinforcement Learning The target function output may be discrete valued realvalued or a vector of several real or discretevalued attributes The training examples may contain errors Long training times are acceptable Fast evaluation ofthe learned target function may be required The ability of humans to understand the learned target function is not important vulmsixmim u plumsmi u Victor Lesser CMPSCI 683 Fall 2004 Review of Neural Networks MarkovDecision Processes Reinforcement learning vlmuamim 39 Gradient descent over network weight vector 39 Easily generalizes to any directed graph 39 Will nd a local not necessarily global error minimum 39 Minimizes error over training examples will it generalize well to subsequent examples 39 Training is slow can take thousands of iterations 39 Using network allerlraining is very fast y unimlm K 39 instances are represented by many attribute value pairs 39 The target function output may be discrete valued realvalued or a vector of several real or discretevalued attributes 39 The training examples may contain errors 39 Long training times are acceptable 39 Fast evaluation of the learned target function may be required 39 The ability of humans to understand the learned targetfunction is not important vlmuamim Supervised learning is sometimes unrealistic 39 Using feedbackrewards to learn a where will correct answers come from successful agent function Rewards may be provided following single reward after a long sequence of eaCh aetioni or only When the agent actions reaches a terminal state In many cases the agent will only receive a Rewards can be components of the EnVIronments change and so the agent must actual utility function or they can be ad39ust its action choices H n H H Joweyssue hints nice move bad dog etc vuimmm 5 vunmmrmi a Tmmnglnfu desired target mich Tmmnglnfu evalmtmrs lemdsquotl pem1hesquot Inputs Input Output sumo Objective Minimize Ermr auger output actual output Objective get as much reward as possible vuimmm i vunmmrmi x i Environment Agmt and mvlmnmmt mtEraztatdumEtE iimexiepx r i L 2 K Utilityreward depends on a sequence of deciSions WE M ME WEE Sm Howtolearnbestactionmagtltimizeexpected o m M m a 4a 42 42 reward totake at each state of Agent vumum v vmmmm m s finite set of domain states Augmenting the completely observable MDP with the following elements 0 A finite set of actions 0 Ps lsa state transition function 39 039 set OfObseNatlons I Pol a observation function ma 39 reward fundwn Discrete probability distribution over SO initia state starting states The Markov assumption Can be mapped into MDP Psrl sway sa PSrl 3711 Explades state space vumum n vmmmm u Specify how to combine rewards over multiple time steps 0 Finitehorizon and infinitehorizon problems 0 Additive utility sum of rewards 0 Using a discount factor 0 Utility as averagereward per time step v luxutsm mi ll is a scalar reward signal an adequate notion of a goal maybe not but it is surprisingly flexible A goal should specify what we want to achievo not how we want to achieve it A goal must be outside the agent s direct control thus outside the agent The agent must be able to measure success explicitly frequently during its lifespan v misses em 14 Suppose the sequence of rewards after step tis K What do we wantto maximize Wm M v 5m to general we wamm maximize the expected return ER for each siep I Episodic tasks intemctiou breaks uatumlly imo episodes e g p1ays ofa game trips through a maze R lrllrlZL rTv where Tis a final time step at which a terminal state is reached ending an episode v luxutsm mi W5 Continuing tasks interaction does not have natural episodes Expected Return becomes infinite Discounted return R iii WV We L 2mm i where y 0 s y s 1 is the discount rate shortsighted 0 lty gt1 farsighted v misses em 15 Avoid failure the pole falling beyond a critical angle orthe cart hiding end of i lJi tmck As an episodic task where episode ends upon failure 1 1 for each step before failure gt return number of steps before failure As a continuing task with discounted return re e1 upon failure 0 otherwise return is related to 7 7 2 fork steps before failure In either case return is maximized by avoiding llure for as long as possible v urrrisssurmu A policy is a choice ofwhatactlon to choose at each state A Optimal Policy is a policy Where you are always choosihg the actioh that maxlleeS the return utility otthe current state t tE a tions succeed with probability 08 and move a right angles with probabi tya 1 remain in the same position when there is a wall Actions incur a small cost 004 What happens when cost increases Why move to 655 instead of 611 w v urrrisssurmu Getto the top ofthe bill as quickly as possible reward e 71 for each step where notat top ofhill gt return e enumberorsteps before reaching top ofhill Return is maximized by minimizing number ofsteps reach the top orthe hill v lute ism le Optimal policy defined by policyquot i argmaaxg P isaUj W R0 mng PJ is aUj Can be solved using dynamic programming 19 7 Bellman 5 How to compute Ujwhen it s def recursive v lute ism le repeat U 9 U39 for each state i do U39i e Ri maxim ix aUj end until Cloernoughw U39 Final Versmn We car think at Vi as maximizing ouruliih over some tired horizon in We will calculate i iyi the maximum value attainable in in steps starring in stae r llor all slates s We amused baa warm i anii 0 nu ulillryior taking nostepslarilnalvali v m max Irlrinirmln unlywe step simply mooseme anion vim me man Vii ilrl maxirimmin I me m win the tignesr ut ity ircluuirg its viii rv oi result ig 31m li rim late u39malnr Mm W vunmmmn Slow to converge Convergence occurs out from goal Information about shortcuts propagates out from goa Greedy policy is optimal before values completely settle Optimal value function is a xed point of VI vunmmmn Statevalue updats can ha parlonued in any orriar in vain neration This sugg is tryingto decidew atstates to update to maximin convergence speed Prioritindsweeping is avariation olvalue naration more computationally ef cient llocusedi Put all mm in a priority quail in will or now rnncn w thinklhairvalttu might chm given quotfor oivaina 39 aration v Very ef cient in practice Moore a Atkeson1993i nleV39 enzeVlzeanveeer improvement POW Greedlication 5 WOHOtmic Evaiuation step step Generalized Policy Iteration Intermixt etwo steps at a liner scale state by state action by auion etc Solve in nitehorizon discounted MDPs in nite ti Start With vaiue function Vu Letm be greedy poiicylor VD Evaiuate 771 and let V be the resulting vaiue function Let 77 be greedy poiicy for V Let Vm eva ue cln Each policy is an improvement until optimal policy is reached another xed point Since nite set of policies convergence in nite time vunmmmn vunmmmn repeat H H39 U eValueDelermiruzliorKH lor each statei do H39i arngPcns onj a I end until HH39 Can be implemented using Value Iteration Um Ri 2 P03 iXiHiU39 or I I Fewer iterations than Vi b 39 Dynamlc Programmmg xpenswa ui each iteration more Ui Ri 2 Pj i 3LHiUj Source of disagreement among practioners Pi vs Vl 29 v mm nmb cm Learning Model of Markov Decision Learner is not told which actions to take Process TrialandError search Possibility of delayed reward sacrifice shortterm gains for greater longterm Learning Optimal Policy gains The need to explore and exploit Considers the whole problem of a goal directed agent interacting with an uncertain environment 34 v mm nmb 32 Learning from interaction Learning about from and while interacting with an external environment Learning what to do how to map situations to actions so as to maximize a numerical reward signal A collection of methods for approximaing the solutions of stochastic optimal control problems Policy what to do Reward what is good Value what is good because it predicts reward Model what follows what mum The agent may learn Utility function on states or histories which can be used in order to select actions Requires a model of the environment Actionvalue function for each state also called Qlearning Does not require a model ofthe enVIronmen A passive learner simply watches the world going by and tries to learn the utility of being in various states An active learner must also act using the learned information and can use its problem generator to suggest explorations of unknown portions of the environment mum Given A Markov model of the environment States with probabilistic actions Terminal states have rewardsutilities Problem Learn expected utility of each state In RL the environment is usually modeled as an MDP delined by S 7 set nlstates nlthe envimnment A Pss39ae pinbamlity nitransitmn lmm s m 539 given a a 7 set eieelmns pessmie in state 5 ES Rss ae expected reward en transitinn s m 539 given a y r dlSDDlHlt rate ldr delayed reward disuretetime l0 1 2 7m M Q m Q ll M1 4 mmmm Flnda polleyn sES9 11643 could he stochastic lhatmaxlmlzes the mummy expected lumretewatd oleach S Wi mthw harm andeadl 541 pal Pwmds ea Er w m y Zrug J Y my Jr These are called valueiundlons d evaualon iundlons In AI There exist optimal valuelundions V39J m xV QTM m XQ39M And corresponding optimal polldes It d argrrler U a n is the greedy policy with respect to Q mmmm Build 00 Experience Predidions V Q Value mullilllglrmr Funaion yaw ureadmmnon QQ Select Actions Continual oniine policy Simultaneous acting and learning nw 3 V Q mmm i WWW 1 Given A Markov model ofthe environment States with probabilistic actions l Terminal states have rewardsutilities l S Problem Learn expected utility of each state Percepts tell you State Reward Termina vumum u mmmm u A training sequence is an instance of Developed in the late 195039s in the area of o adaptive control the ry Just keep a running average of rewards for each world transitions from an initial state to a terminal state state 39 The additive Utility assumption utility 0f For each training sequence compute the reward a sequence is the sum of the rewards over togo tor each state in the sequence and update the utilities the states of the sequence Accumulate reward as you go back 39 under thls assumptloni the my Of a State Generates utility estimates that minimize the is the expected rewardtogo of that state mean square error LMSupdale vumxcmrmr is vmmcmfml o Utilities of neighboring states are mutually constrained Converges very slowly because It Ignores the relationship between neighboring states U0 R0 2r M U0 Can apply dynamic programming to solve the system of equations one eq per state Can use value iteration initialize utilities based on the rewards and update altvalues based on the above equation vumxcmrmr n vmmcmfml 8 Approximate the constraint equations without Rewrite this solving them for all states VS e VS a Wm mg Modify Ui whenever we see a transition from i toj using the following rule TD error Vi Vi on Ri V0 Vi to get The modi cation moves Vi closer to satisfying the WI 61 MW GD VVS39 original equation mummy w vunmmf m 51 Vse1 aVS otv yVS39 Sutton 1988 Vse1 aVS aREWAkptpam Alter each i aotm updat the State s mummy 5i vunmmf m a 0 based Reinfnrecement Learning Additional Topics in Machine Learning vumum m vunmmfnm 55 Victor Lesser CMPSCI 683 Fall 2004 MultiLevel Search Integrated Search across multiple problem representations Interdependence of Search Paths Information can be mined from the results oiothersearch pat s Leads to problem solving associated vv39th control NonMonotonic Domain Can t useA type of heuristic to guarantee completeness Cost of Control ExpensiveNonuniform cost of operatorapplication Node evaluation cost is dynamic and expensive Ratings need to he reevaiuated when nevinodes are created More complex choice process or next node to expand v iaxe into aacount ctost oi operator application WliiEli canvarv depending ra or on node an ope mainline Example of More Complex Search Paradigm Erman LD HayesRothF Lesser VR and Reddy DR 1930 The HEARSAYII Speech Understanding System Integrating Knowledge to Resolve Uncertainty quot Computing Svrveyr 12 2 Carver N am Lesser V 1992 The Evolution of Blackbeard ControlArcliitecmres Compuher Science Technical Report 92771 University of Mmchmetts Amherst i39his is a revised and expanded version of paper with sane title in Expert Syrian Willi Applicaiialir Special me on he Backboard Paradigm and Ir PM 39 an ion ola KS 7 Knnw when n has snmemmg usemm nnmbme datamvemed solmmn nnmng IS Solvmg quotmmmmm 7 some Woes sum wars 7 Eva uatehypmheses J lgsaw 7on dmevem eve s m ahsmdmn 7oquot dmevenl SmmumolaKS independent andsepalahle Puzzles Pewemves Laygeeyamc mpom 0quot We WW 7 omev K55 ave nm dependent on emaence mmnev K55 sn mmns Attention locus ing at KSs 7 Human Enmemm a K35 exermmn INumannmmnmeWWW Mum 7 Cnmmmemumedimmda aenwmnment Esme mamnppnnuneuccnmnmn mmmeanm SWSWE 7 5 quot9 9 Cunpemwe mow mam momma 7M4 M 539 WW m m s mgg m MHWHWEWW A sysiem is composed ofmany diverseknowiedge sources Kssp WWW 5 a Carmeva Leve s nf Representatmn fur Speec Understanmng Systems The quotmm mm vmmzsi llm 7 yunmsixmlm x mum 7 r r 7 r r mu mmm39 eg in mmm e my rm mm vumcsiwzm v mnmsinmm w App g 39 Problem With Very large 3quot complex Problem which requires large amounts of search spaces knowledge 39 ComputationaHy Wradab e 0 99Wate ems Apphcab e know edge covers Wrde dwerse set of search space areas 39GeneraHy mpnssrb e m guarantee npumamy nr smurmn Wedge may be pammned m terms of Epsom ncrementa generatwon of partwa sohmons areas Aggregat oquot of Comramts Know edge and input data maybe errorfu r incomp ete Search space maybe viewed in terms of dwfferent and approx mate perspectwes and Me s cf abstrac won Constramt optwmwzatwon vumcsiwzm u mnmsinmm u Opportunistic Processing 7 Problem sotvrng should be onven by current state or problem Solvtng and available knowledge apptrcaote to tan state 7 Cooperatron among orrrerent sources ofknowledge wnrcn perrmt resolution orarnorguous sttuatron and correctron or rncorrect decrsrons Knowledge Acquired Along Different search Path can be exploited In making control decIsIons Adapting Control Strategies based on state of search v mm m 13 Sophisticated Problem Solving Search multilevel incremental 1 Plumb A r opportunistic 5 PM nonmonotonic expensive and non uniform operator costs sophisticated control v mm m 15 Planning and scheduling Intelligent Material Handling Factory Scheduling Data InterpretationSituation Assessment Speech and Vision Understanding Multisensor Fusion Layout and Arrangement Protein Molecular Layout Building Design t mm m 14 tmn s e quotJ t mm m a Application of knowledge is triggered by current state of blackboard data directed Based on blackboard events39 m e m Trigger evaluation of preconditions ofrelevant KS se preconditions are satis ed is instantiated with appropriate context and placed on scheduling queue agenda Focus of attention mechanism evaluates agenda and chooses for execution KSs that are most promising for urther system progress KSs are executed and alter state of blackboard trigger new blackboard events 7 Change in the blackbna dilimi dElEllDfl modi cation 7 NonrnDDuiienDe Dian ex Chan e ad ed x n Nodes partial solutions exist at particular level and associated with a primitive element 7 Each level nas associated win it a vocabviaiylnal aaiinasina range nlpiimilive elements 7 Each node has a set nialliibulesllial can be ieveirdependenl Nodes can be related to other nodes atthe same or different levels 7 Expiimiiyinmugn links and implicitly based on node aiinnuias Nodes may represent alternative competing partial solutions 7 Permits direct comparison nlailemalive saamn pains 7 integrated representation nfallemalive saamn pains Partitioned into distinct information levels 7 Each level nnias a miiarani representation niina pmniam space niin ilS own pfimilive elements KS decomposition relates naturallyto one or a few information levels 7 Localization niks activity Levels form a loose hierarchical structure 7 Abstraction olelemenls niina next lowerievei 7 An apiioiiliamewoik nla plan or pmniam sunny 7 Analysissynlliesis action between levels mnmsixmim ix 39 LEXICAL l sndlill eil Trigger speci es a set of event predicates that need to be true or KS to be considered for execution Precondition speci es a set of state predicates that need to be true for KS to execute Context speci es where KS will be applied KSAR Obviation condition speci es a set of state based predicates that if all true indicate KSContext is to be removed from agenda KS action arbitrarily complex program Declarative Information used for scheduling u E a tl 1v mam m m m nowm m m EVENHEVEHS minim 5min txzvmwm mm 16mm tamis APPiiEDmNsrmiNrSi mmmoammommmommmoamanoEmBAEUlli mm txsmmm swim upsmmmmt mm mm msmml imam tumult rvawnotrvukmnmammmhamln WilliE who Mam Muhamt wwwmmm livmuEAmnamI v lrdhmhimll mm comm Nil gt5 mm Newimubeirwtmmmt mmmomnwmt pmmmmna pom twmmm t irml mmwun wwwmmmmz GeneaeJomblellbel pmmm llEleNMUEAmmruZ iemlaunmwllll Durwardthunmnrkrrvane Psw mv nnhomt mhomlll D unvc lmheruumpcrrhwohe Psanmamnmamr man m t Examsthmmmm my mnch mmmr tamttxsmmmm mummy moan mamwzxmmz Wimmwm maimsmmmt uman maimsmmmz mth 196mVabtnwiulvanmwuvmammimewllll l1 in 1 h MJHEJQFT I TRlGGERrEVENT VANCHORHELlendmirg atmmlesanELle WMquot f S nmextvarsr PSrAnmm Herrxt Arr reel em M D O Anmnreez Helm 59 M SrVke mares D I H Emndvarsr NeancLaheanrAnmnreel Hertrmeras NevtndahanrAndmran Hat rnHelH EmmiquotMquot mmm VmeHdr rdeelr ramundHelm D E I J I earlrplt theHdr ramrHemammdrHelm mm J Exemtaaimycre ta sm mreacyu NlL 39 D Exemteacyuem WW D P statussExEcurAELE recmdrirm 3 DP 1 A mummme msmrerenem mggrmihy mummam quottram a aararerrreee aypneemmrs TlnsKSAquusmtstitehladthnm crmtmmwlnultlmusl an new mad kalemmmtsmmrmeanmltnandmmltdm Smuehotitltannnsltmeplmnusly V h Wm mmn unncatmns titeKSARis xemtahl W O vumcsiwm z aamsm in Howto decide which of many potential KS instantiations are the most preferred 7 Howto Compare apples and oranges Information Retrieval Based on Interpreting Connected Speech 7 Dilierenl levels and parts nl seaerr space Sample sentences Howto control the potential for combinatorial explosion of hypotheses on the blackboard Which abstracts referto theory of computationquot 7 Overhead srgrrrrrearrtiy rrrereases as large number nl partral lulmns are placed on BB Limhose articlesquot Howto decide when the system has an acceptable What has Mccanhy written 5iquot091974quot solution Nonrmnnnlnmc Character Diseamh vumrsnm z amuse 1x Large search space 7 Zlonlegal sentences Uncertainty and Approximate Knowledge 7 Sensors 7 Acoustic ononenc knowledge Knowledge costly to apply Dif cult to subdivide problem39solving Interacting constralnts Coeamculauon onenomenon Wide variety of knowledge needs to be applied BB levels Knowledge sources Control Strategy Trace Combinatorial number of possible interpretations Datarelated uncertainty 7 Noisy uncertam andormlssmg data 7 Masking onenomena e incomplete domam model Correlation ambiguity 7 Multiple m elel mll lale number of instances ofeacn interpretation and da a ype e datarassoclatlon problem Volume of data too large to be completely processed Multisensor fusion u m r m m Mama u m r m m Signaiaegu39 ionparaneieiexiraeiionsegnneniaiionand e 7 SEC digitizesthe signal neasmes parameters and pm uLas a labeledseginenlalmn Woidspnmng 7 POM ueaiessgiiameniassmpmnesesimmsegmeils 7 MOW Lieatesvmmirvpmhesesl able Llasses 7 women milieislnendnnei DlWDl inpmneseslnalidow names a a i Phraseisland generation 7 WORDS o Liealeswmdseuuememonthesesthaliepiesenl potential pmaseslmmwmd montheses and Weak grammatiLal quotwile ya 7 WORDSEQCTL anmlslhe number dimpeinesesinaiwom SEO Lieales 7 PARSE attemplsto parse a mid Sequeme and d sneessinl Lieales a pniase hypotheslslmm d Kg WORD7SFQ 5 mam AK Pnrase extending 7 mom maids all inssible Wzmsma migm smamcalvwecme nilnllim ivenpmase 7 VERlFV rats me mnssemvamensagmemnvmnesesanda mmguaus waidpiimsepai 7 CONCAT aaaiesapmsenvpdmessmm avermm mmgunusw dWVEE mu Rating halting andinierpreiaiion 7 RPOL iaeslne aemmmvmeam newmnmdned mmnessusmg l innialianpiacad nntnenvpmnessbvmhemss 7 STOP deudeslu halt panesang deledsa mmplele senenae win a smuenvvmgi lavng ainmesme sisem has mused its available names and seiedsme baa pmsehvpmnessm set Mmmplemenavv miase wmneses asme mlpul 7 SEMANT generatesanunambigumsl eipmlalmnlmmelnlnimallnmelnevd sSeni Wllm me user nas ddened mmusixmlm 3b Scheduler invokes highestrated KS with specific context 7 Check deidie running wnelnei precondition Slillvalld KS modifies blackboard 7 Fdeusdimnlmi database is updated 7 Relevant preedndmdn procedures are ndlined Relevant precondition procedures are evaluated 7 New KS instances are posted on suheduleiwilii Context Priority of new KS instances are calculated and those old ones are affected by change in control database mmusixmlm an MEET r l Bottomup processing to word level 7 Surhcrehtrehabrhty toropportuhrstrc processrhg 39 If SearCh n t progress39ng39 remgger Kss for more hypotheses KS as generator functions 7 L m ted generamn Ofa temawes Implement with control KSs stimulated by agenda 7 Retnggered to generate addrtronal hypotheses as search staghates Search termination Select sequence of word hypotheses as candidates Special mode when a spanning hypothesis is for phrase hypotheses constructed of suf cient credibility 39 opponuniStiC seaI39Ch at Phrase Leve39 Use hypotheses to constrain further search e islandsfofrreltabtltty e integrate partial phrases comrhg rrorh drrrereht drrectrons 7 FM outwards hot bottomrhypotheslzed v New nmt 37 t mm m 1 22 mxxy m nmt a alarm amt 1 KS 5 a stimulus creation onAPDASH parameters torthe utterance Action create segment hypotheses 2 KS WORD cTL stimulus start of processing Action create goal hypotheses at the worq level Thesewlll control the amount of hypothesization that Mow WIII qo Action create goal hypotheses at the worqsequence level Thesewlll control the amount of hypothesization thatwoRDsEa WIII qo KSFOM stimulus New segment hypothese Action create syllableclass hypotheses 5 KS MOW stimulus Mewworo createq bottomup Action create 4woro sequence hypotheses ANDVFELDMANVI tau 145 225 limp law 23 still AiinMARiimvsvz 157 EIG quotnasal 57 7 KSPARSE stimulus ARE Mord sequence Action create phrase ARE 57022 t in new m E KSPARSE stimulus AMDFELDMAMr word sequence Action Create phrase AND FELDMAN r 50145225 9 K ARSE stimulus EIGHT word sequence Action Create phrase EIGHT 354557 10KSPARSE imu u SHAWANDMARVIN word sequence Action Create phrase SHAWANDMARVIN 1757215 11 KSPREDCT 8t VERIFY stimulus ARE 1phrase Action Predict from the grammar 252 worqs followmg w uldsale REDDY1552 2 ANY is 2 ll HUGH 1555035 anq VOU155392 39 12 KSCDNCAT stimulus ARE 1phrase REDDY Moro Action create phrase ARE REDDY51052 13 s DNCAT stimulus ARE phrase Aw word Action create phrase ARE Aw items t in new m 14 KSPREDCT amp VERIFY stimulus AND FELDMAN r phrase 39 Predict 100 words preceding Reject 76 orthern The best of the verified 24 in descending ratin order are FEIGENBAUM 3072250 WEIZENBAUM 7072250 ULLMAN 70116150 NORMAN 0 103150 and NEWBDRN To 103150 15 KSPREDCT amp VERIFY stimulus EIGHT phrase Action Predict the word NINE following and verify it 305232 Predict SEVEN preceding but reject this because orrnisrnatch with the acoustic segrnents ii M 13 KSCDNCAT stimulus BY word FEIGENBAUM AND FELDMAN r phrase Action Create phrase BY FEIGENBAUM AND FELDMAN 1 3452225 19 KSCDNCAT stimulus ABOUT word FEIGENBAUM AND FELDMAN 1 Action Create phrase ABDUTFEIGENBAUMANDFELDMAN 3343225 20 KSPREDICT at VERIFY stimulus A30UTPElCEN3AUMANDPELDMAN 1 phrase Action Predict one preceding word WHAT Verify it 102049 phrase 16KSCDNCA39I stimulus FEIGENEAUNF wordAND FELDMAN r phrase Action create phrase FEIGENEAUM AND FELDMAN r 3572225 17 KS PREDICT ampVERIFY stimulus FEIGENEAUM AND FELDMAN r phrase Action Predict aght preceding words Reject one Discuss Find two already on the blackboard EV 305222 and AEOUT1754572 Verify ve others NoT75 4532 ED75s 72 CITE 01522 auoTETo4 clTEs65 52 32 t in new nail 21 KSCDNCAT stimulus ClTE word FEIGENBAUM AND FELDMAN 1 phrase Action Create phrase CITE FEIGENBAUM AND FELDMAN 1 3349225 22 KSPREDCT amp VERIFY stimulus CITE FEIGENBAU AND FELDMAN 1 phrase Action Predict four preceding words Reject two orthern Verify 2523219 YEAR2o3o49 47 t in new nail 23 KS PREDICT ampVERFY stimulus lav FEIGENEAUM AND FELDMAN r lphrasel Action Predictlo preceding words Reiecttive AESTRACTS ARE EOOKS PAPERs REFEREMcED Findtwo already on the blackboard AN 24 KS CONCAT stimulus NOT Mordl FEIGENEAUM AND FELDMAN 1 Action create phrase NOT FEIGENEAUM AND FELDMAN mes15mm v mere m 49 stimulus A RE REDDY Action Predict tiireetollowmg wordsVerity CITED lso 52asl oRlao52s7l AND255252 28 KSCDNCAT stimulus ARE 1phrasej HUGH lwordl Action create phrase HUGH you ids assay 29 KSCDNCAT stimulus ARE iphrasei vou word Action create phrase ARE you 133055 30 KSCDNCAT stimulus ARE REnnv iphrasei CITED Mord Action create phrase 1 ARE REnnv CITED llamas v mere m 51 25 KS CONCA39I stiniulus Amr lwordl lav FEIGENEAUM AND FELDMAN r lphrasel Action create phrase AM lav FEIGENEAUM AND FELDMAN r 1522425 ARE AM lav FEIGENEAUM AND FELDMAN 1 33592251 is also created from ARE AMv and an FEIGENEAUMAND FELDMAN 1 26 KSSTDP stimulus ARE AM lav FEIGENEAUM AND FELDMAN r iconiplete sentence Action Deactivation of several dozen competing hypotheses t in new mi 5 KSPREDICT ampVERIFY stimulus ARE REDDv anD lphrasel Action Predicttwo followlng words Verify 1755335 in zoasn Ml 32 KSCDNCAT stimulus ARE REDDv anD phrasel av lwordl Action create phrase ARE REDDv CITED av 150039 33 KSPREDICT ampVERIFY Action Predict one followlng word Verify ANY130105126 34 KSPREDICT ampVERIFY stimulus ARE HUGH hrasel Action Predict one following word Verify MAcELlao 6 t in new mail 52 35 KSPREDICT amp VERIFY sumuius ARE you ipiirasei Action Predict three following words Reject USUALLY Veriw REGULARLY1253511S ALWAYS153572 36 KSCDNCAT sumuius ARE REDDV ipiirasei oRiwordi Action create phrase ARE REDDY oR l750s7l 37 KSCDNCAT sumuius ARE REDDV ipiirasei AND Mord Action Create phrase ARE REDDY AND 73032 v new m 53 Many and diverse sources of knowledge can participate in forming and modifying the emerging solution Linking partial solutions at the same level of abstraction and those at different levels Each knowledge source can be implemented using The most appropriate representation of it knowledge The most ef cient inference engine for its reasoning v new m 55 as KSSTDP Stimulus Stagnation Action Stop search and accept ARE ANY BY FEIGENBAUM AND FELDMAN 39 KSSEMANTquot Stimulus Recognized L terance ARE ANY BY F BAUM AND FELDMAN Action Generate an interpretation for the database retrieval system u in new m 54 No a priori commitment to tne order oflrlferel39lcll39lg steps Bottomeup orTopdown e Dataedlrected or ModelGoal directed Eacn knowledge source can contribute opporiul39llstlcallv since eacn nas continual accesst tne current state ofthe searcn e The rlght knowledge can be applied at the rignt time 7 Permits leferentlal diagnosis Control knowledge can explolt a global view of tne emerging set of potential solutions and tneir relationsnips u in new m as New Section on Reason About Uncertainty on October 10 v Wye m 57 The sources of uncertainty in intelligent systems Representing and reasoning with uncertain information Bayesian reasoning Chapter13l 136 In complex environment agent almost never have access to the whole truth about their CMPSCI 933 environment RussellampNorvig Fall 2004 Victor Lesser Wamth Z lmprecise model of the environment Noisy sensory data medical diagnosisvveather forecast object identi cation and tracking Stochastic environment lmprecise model of the system random processes moving obstacles Medical science Theoretical Ignorance Limited computational resources Exceptions to our knowledge can chess planning vvih partial information Practical never be fully enumerated item I I All birds ly Laziness Limited communication resources distributed systems MAS vvithout global view a way of Primal mm summarizing this uncertainty v invemim i v mxvmim l Making decisions without knowing everything relevant but using the es of What We do know Exploiting background and commonsense knowledge which is knowledge about what is generally true Dmicult to easily represent in classical logic 7 nitoduce lequllemenis ror vagueness marialily incomplete an contlsdlctory information Very different approaches based on type of reasoning required and assumptions about independence ofevidence Crucial to the architecture of an agent that is interacting with the real world 7 The challenge is hoW to acquire the necessary qualitative and quantitative relationships and devising erricientrnethods for computing useful answers from uncertain knowledge v Wrasse nmt 5 FirstOrder Loglc F0 i commitment that facts are either true false o unknown eContrast With Probabilithheorv Degree or Beiiei in Proposition same epistemological commitment as eContract With Fuzzy Logic Degree orTruth in Proposition Deductive inference can be done only with categorical facts de nitely true statements Tnus FOL and logical agents cannot deal With uncertainty This is a rnaior limitation since virtually all realrvvortd domains involve uncertainty Eliminating uncertainty would require that ethe Worid be accessible Static and deterministic ethe agent has complete and correct knowledge eit is practical to do complete sound inrerence v Wrasse nmt 7 Because uncertainty a fact of life in most domains agents must be able to act in spite of uncertainty How should agents behave What is the right thing to do axlmlze their the information they hav Decision Theory The rational agent model agents should do what is expected to m 39 39 erformance measure given Thus a rational decision involves knowing e The relative likelihood or achieving dirrerent statesgoals W Probability Theory 7 The relative importance payroffjforvarlous statesgoals quotUtility Theory Probability is about the agent s belief not directly about the world Analogous to saying whether a given logical statement is entailed by the knowledge base Beliefs depend on the percepts that the agent has received to date Percepts constitute the evidence on which probability assertions are based Probabili ies can change when more evidence is acquired Most real domains are inaccessible dynamic and non deterministic at least from the agent s perspective in these domains it is impossible for an agent to know the exact state of its environment 7 Also agents can rarely be assumed to have complete correct knowle go of a domain The qualitication problem many rules about a domain will be incompleteincorrect because there are too many conditions to explicitly enumerate them a a E g birds fly unless they are dead Honrllylrlg types have broken a wing are caged etc Finally even where exact reasoning may be possible it will typically be impractical computationally y uxxl m nmt a More realistic approach to many applications but reasoning under uncertainty is more dif cult Nonmonotonic need to examine previously made conclusions based on new or modi ed evidence Nonmodular all the available evidence must be considered Uncertainty measures oharaderizo invisible fads how do the exceptions to A r B interact with the exceptions 0 yield the exceptions to A a C7 7 Probability as a way of summarizing the uncertainty that mes from our laziness and ignorance y uxxl m nmt M Leads to the property of monotonicity once a fact is truebelieved it must remain so 7 Thus adding new knowledge always increases the size ofthe knowledge base Nonmonotonicity the addition of new knowledge may require the retractionremoval of previously derived conclusions dlluIUl nonmonotonicity 7 WM require that assumptions be rn de a lnterences may not be deducllvey viau Probability exhibits nonmonotonicity KANE E2 not determined by RAlEQ u mesa nmt u Symbolic approaches represent the different possibilities without measuring their likelihood Can potentially combine There are several different numeric approaches probabilities certainty factors belief functions Dempster Shafer fuzzy logic We will focus on probabilistic reasoning faced lots of early objections u mesa nmt 2 McCarthy and Hayes claimed that probabilities are epistemologically inadequate leading AI researchers to stay away from it Some philosophical problems from the standpoint of arti cial intelligencequot Machine Intelligence 4463502 1969 Arguments against a probabilistic approach Use of probability requires a massive amount ofdata Use of probability requires the enumeration of all possibilities Hides details of character of uncertainty People are bad probability estimators We do not have those numbers We nd their use inconvenient y mama mm a When I began writing Probabilistic Reasoning in Intelligent Systems 1988 l was working within the empiricist tradition In this tradition probabilistic relationships 39 the founda ions 0 uman nowle ge whereas causalit simply provides useful ways of abbreviating and organizing intricate patterns ofprobabilistic relationships Today my view is quite different I now take causal relationships to be the fundamental building blocks both of physical reality and of human understanding of that reality an regard probabilistic relationships as but the s ce phenomena of the causal machinery that underlies and propels our understanding of the worldquot Judea Pearl CAUSALlTY Models Reasoning and inference Cambridge University Press January 2000 y mama mm 15 The only satisfactory description of uncertainty is probability By this it is meant t at e be combined using the rules of probability and that the calculation ofprobabilities is adequate to handle al situations involving unce ain y n particular alternative descriptions of uncertainty are unnecessaryquot DV Lindey Statistical Science 2 1 724 1987 Probability theory is really about the structure of reasoningquot Glen Shafer y Wrasse mm 4 A rulebased expert system developed In the mid 1970 s for automated diagnosis of infectious diseases Used certainty factors to represent likelihood Example if the stain of the organism is grampositive and the morphology ofthe organism is coccus and the growth conformation ofthe organism is clumps then 7 the identity orthe organism is staphyloccus Early and Simplified Approach to Dealing with and Evidence data y Wrasse mm a Certainty factors are real numbers between 1 and 1 attached to facts and rules Positive and negative values indicate increase and decrease in the degree of belief Certainty factors are relative measures do not translate to absolute level of belief v mm m 7 The user provides uncertain observations with certainty factors attached to them Ex 09 organismis grampositive 04 morphology of the organism is coccus 07 the organism grows in clumps Belief in a conjunction of premises is calculated by max0min090407 04 Belief in conclusion CF x belief in premises 07 x 04 028 H mm m a B12 B1 Bz 1 B1 when both positive or negative B12 B1 Bzl1 minB1B2 with opposite signs Ex Combining 04 with 06 gives 04 06 1 04 076 More positive evidence will always increase the certainty factor Evidence combination rule is commutative and associative hence order is unimportant v mm m g Evaluations of MYCIN show that it is as good or better than most human experts But certainty factors have no operational definition Hard to use in decision making Surprisingly good with appropriate knowledge engineering and limited forms of deduction H mm m 2 Connecting information derived from different paths Bidirectional inferences explaining away Correlated nonindependent sources of evidence Retracting conclusions monotonicity v mews m 21 Let At leave for airport t minutes before the ight VWI At get me there on time Problems inaccessible world road state noisy sensors traf c reports uncertain actions blowout Suppose PA25U4 PAeo6i PAJZO9i PAlAAO399995 Which action to choose Depends on preferences utilities Decision theory probability utility v mews m 23 Scenario c Events 5 summer was an in miim w mass is we R ii minee last niiihi MchMswlerules 1mm mun 2 was in use Inth men nine 2 uqeemve meme ml um me ems ml he wet ms mmne mm em 2 wet in mum um mm 2 sueeestm meme mu um n mnee me man Combining rules weget WSan snunkler suiiiies wei IVBIRM F n3 n3 n72 wet suiiiiess mlquot so sprinkler made us believe rain i in new amt 22 Assign a numerical degree of belief to propositions and ground sentences 0 s PA s 1 PTrue 1 PFalse D PA V B PA PB PA A B Other properties can be derive cl 1 PTrue PA v A PA P A PA A A PA P A So P A 1 PA i in new amt 24 Random experiments and uncertain outcomes Event Set refer to possible outcomes of a random experiment Elementary events the most detailed events of interest The number of distinct events and their definitions are totally subjective and depend on the decisionmaker v New m 25 Represent the result of random experiments Notation x y 2 represent particular values of the variables X Y Z Sample space the domain of a random variable set of all elementary events Ex Sample space graduating students Elementary events John Mary Event set Females graduating in civil engineering H mm m za For random variables we are interested in equalities like PX x1 07 E PX xi 1 since the values are exhaustive and mutually exclusive Can refer to the probabilities of all values at once as a vector PX 070102 Eg for Weather mnny cloudy rainy snowy can have PWeather 0702008002 Propositions are Boolean random variables v New m 27 An assignment of probability to each event in the sample space Discrete vs continuous distributions Ex PWeather 07 02 008 002 sunnyraincloudysnow Q What are those numbers Where do they come from H mm m 22 Probabilities are precise properties ofthe 39 rse Value can be obtained by reasoning for example if a coin is perfect use symmetry When probability of elementary events are equally likely Prevent size of event set size of sample space Exist only in artificial domains Require high degree of symmetry v New m 29 Represent degrees of belief More realistic approach to representing expert opinion Examples The likelihood ofa patient recovering 39om a heart attack The quality of life in a certain city t mus nmi Probability as frequency of occurrence Prevent number of time event occurs I number of repeated random experiments Problem Need to gather infinite amount of data and assume that the probability does not change over time Some experiments cannot be repeated 0 Success ofoil drilling at a particular location 0 Success ofmarketing a new PC operating system a Success ofthe UMass basketball team in 2005 v New m 1 The appropriate probability to associate with a proposition depends on the knowledge Information that Is available PA denotes the prior probability prior The probability that A is true in the absence ofany speci c knowledge Once an agent has some knowledge evidence the prior is no longer applicable t mus nmi PA E denotes the conditional probability posterior probability the probability that A is true given that e Probabilistic reasoning is inherently rlorlrmorlotorllc because c nstraints on how conditional probabilities can yary can naye PA lEylbutPAlE1 52o 7 Contrast this With FOL in which it K51 l otnen K51 E l o Ex PCavityToothacheD8 Similarly can use PABC etc o The notation PXY refers to the two dimensional table PXxiYyi Conditional probability can be defined in terms of unconditional probabilities PAIB PABIPB when PB gt 0 or PAB PAlB PB the product rule A probabilistic model consists of a set of random variables that can take on part39 39 probabilities Icular combinations ofvalues with certain An atomic event is an assignment of values to all the variabies eg X1x Wquot Atomic events are mutually exc exhaustive nzxmn lusive and collectively Theioint probability distribution loint PXX assigns pro abilities to all possible atomic events Thus it completely specifies the probability assignments for all propositions in the domain PA A B P AB PA i B PA PB PAB PA Elmwt margiriaiizatiori or summing out PA EMA Bi PBi conditioning Given X1 PX1 Xquot assigns probab es to possible values of the variables Example Xquot thejoint probability distribution each set of Toothache Toothache Cavity 004 quotCavity 001 089 From the joint distribution we can compute the probability of any complex proposition such as PCavity v Toothache or PCavity Toothache Why not use thejoint probability distribution From the product rule PlAB PAB ME PBA PlA Hence PBA PAB PlBIPA Don t really need PlA Iormalizatlon PlBIA a PlAIB PB Pl BIA 1 PW 39 Bl Pl B PU l x PM We can condition on background knowledge PlBIAiEl PlAIBiEl PlBIEll PlAIEl Abduction if As can cause Bs and know ofa B then hypothesize A as an explanation for the B Abductive inferences are uncertainlplausible inferences as opposed to deductivellogical inferences The existence of B provides evidence for A ie a reason to believe A Evidence from abductive inference is uncertain because there may be some other causelexplanation for B Abduction is the basis for medical diagnosis f disease D can cause symptom Sthen ifa atient has symptom S hypothesize that she suffers from disease D Plobject image proportional to Plimage object Plobject Plsentence audio proportional to Plaudio sentence Plsentence Plfault symptoms Plsymptoms fault Plfault Abductive Inferencell Buml si u may nm we explamm CONCLUSION unlmnwnvs negative EXPLANATION Maybe me alnlflniemute is I valid due tn luterlaln annhues cuntlusmu may um have in pieiiine and mailman enmpieie Suppumugevldeu e negative piemine may ha we alemanye explanamn ennsmenni pnsnue aeaiienwe immune E E ea piemne maybe ineenaV due tn Mammy in Suppumugevldeute HypotheSis Ebased on the ewdencslAi where the comple evidence ix Aland A ClAI Potenual xourcex of uncemintyin hypothexix rPanial evidenceric A A1 rUncerlain evidence satis es the inference axinmi Bi uncenainsnmel E A ll E Al rUncerlain pmmiSErlE somel E AIMS uncertain anssihle altamalive interpretations fEIl39EVldBnEBrl Bi faminel E Allhecnnect1nferenleisAquot a C anssihle allmmalive evidence furlhe hypothesis 7 1 e i for same Aquot 6 Al the annealevidence 13 actually AI 3 pennies are placed in a box 2headed 2 tailed fair A coin is selected at random and tossed What is the probability that the 2H win was selected given that the outcome is H P2HH PH2H P2 PH2HP2H PH2TP2T PHFPF 11311301I31I21l3 2l3 PxlyPy P i A 0 X Enwnm mexplauati u quotlquotka Punian xlylxe zyz vmymzi pamatmusiseuay l VDLVIDZiisVlZl pamatsuppunm E mismgsnppunmm pusnhlealtrexplauatmrehyp g te mayhepanufan g aiemaiwemk E menmntymsnppu mgevideulze mm Emmi Vehicle pusnhlealtrexplauatmmypes P n MW anuistwghcst amuslIHimse W amth Eumrrmal mlmn vmgtwmwmm Consider a diagnosis problem Witli multiple symptoms Pdlsrsl PdPsrslldlPsrsl For eacli pair of symptoms We need to know Psrslld and Pssl Large amount of data is needed Need to make independence assumptions Pslsl 135 Or conditional independence assumptions Ps lsyd Psld Psrslld Hsld Psld 1mphcilly d causes 3 and s Witli conditional independence Bayes rule becomes 17 MY oiPZ PXlZ PYlZ Given PCavityToothache 08 PCavityCatch 095 Compute PCavityToothacheCatch PToothacheCatchCavity PCavity I P Need to know PToothacheCatchCavity Assuming conditional independence PXYZ PXZ and Bayes39 rule becomes PZXY u PZ PXZ PYZ PCatchCavity PToothacheCavity v mm m 45 Three prisoners A B and C have been tried for murder and their verdicts will be read and their sentences executed tomorrow morning They knowthat only one of them will be declared guilty and will be hanged while the other two will be set free the identity of the condemned prisoner is revealed to a very reliable prison guard but not to the prisoners themselves v mm m 46 In the middle of the night Prisoner A calls the guard and makes me 39 F39 539 quot39 39 to one of my friends to one who is to be released You and I know that at least one of them will be freedquot The guard takes the letter and promises to do as told An hour later prisoner A calls the guard again and asks Can you tell me which of my friends you gave the letter to It should give me no clue regarding my own status because each of my friends has an equal chance of receiving my letterquot v mm m 47 The guard answers I gave the letter to prisoner B he will be released tomorrowquot Prisoner A returns to his bed and thinks Before ltalked to the guard my chances of being executed were one in three Nowthat lwas told that B will be released only C and I remain and my chances of dying have gone from 333 to 50 What did I do wrong I made certain not to ask any information relevant to my own fate Problem Did the guard reveal any information to prisoner A regarding his fate v mm m as Let X stand or prisoner X will he declared innocentquot Let GX stand or prisoner X will he declared guiltyquot Then RuaiGArRlGAl PlGAl m 1 GAME PllBl PllBl m 2 WWW o But IE B will he declared innocentquot was inlened mm a more dim observation I39B Guam said that B received the letterquot ilwe compute PlGAlI39Bl we get the Conan answer Ru39aiGArRlGA 1mm 1 PlGAll39BF Pll39Bl m 3 5p Consider the following results ofa test oftwo types of drugs on a group of people Survival or men DRUG A052 DRUG a Survival lorwomen DRUGAMSDRUG 42 Forthetotal population DRUGA m DRUG a 01 Is this a paradox vumcsiwzm 5i Probabilistic reasoning with belief networks mnmsixmlm 52 Extra Slides Victor Lesser CMPSCI 683 Fall 2004 Homework due Tuesday at 5 on November 29 Vlmltf lm 39 Continuation of Decisiontree Algorithms 39 The Version Space Algorithm 39 Neural Networks Viaysuamm u Complete space of finite discretevalued functions relative to available attributes Maintains only a single current hypothesis decision tree Performs no backtracking in its search Uses all training examples at each step in the search to make statisticallybased decisions regarding how to refine current hypothesis Vlmltf lm Selects in favor of shorter trees over longer ones Selects trees that place the attributes with highest information gain closest to the root Stop growing the tree earlier before it reaches the point where it perfectly classifies the training data Postprune the tree Use nontraining instances to evaluate based on a statistical test to estimate whether pruning a particular node is likely to produce an improvement beyond the training set A hypothesis over ts the training examples if there is some other hypothesis that fits the training examples less well yet actually performs better over the entire distribution of instances Causes Noisy Data construct tree to explain noisy data Lack of Examples small number of examples associated with leaf coincidental irregularities cause tne construction or more detail tree tnan warranted t uxxicssmrmm a Ne Add new attribute value unknown Estimate missing value based on other examples for which this attribute has a known value Assign value that is most common among training examp as e Instantiated example with all possible values of missing attribute but assign weights to each instance based on likelihood of missing value being a particular value given the distribution of examples in the parent node Modify decision tree algorithm to take into account weighting t uxxrcssmrmnt a 0 Handling multivalued large attributes and classification Need another measure of information gain lnformation gain measure ives inappropriate indication of attributed usefulness because of likelihood of singleton values Gain ratio Gain over intrinsic information content v uxxvcssmrmm g Sort instances based on value of an attribute eg temperature Identify adjacent examples that differ in their target classification Generate a set of candidate thresholds midway between corresponding examples Use information gain to decide appropriate threshold v uxxvcssmrmm M Continuousvalued attributes Discretize Example Preprocess to find out which ranges give the most useful information or classification purposes v uxxvcssmrmnl u MllWaitr ltgt Patronr Somev PatronrFull eHungryU A Typeg ench v Patrormr Fuii A eHungryU A TypeU Thai A FriSatr v patronmmim eHungryU A TyperBurger ach example is a logical sentence Altr A aBarr A Frir A V llWaitr A decision tree is consistent with the data iff the corresponding KB is consistent v uxxvcssmrmnl 2 Can use Simpler Approach than Decision Tree Algorithm ssttming Complete Consistency lncrementally present examples lncrementally re ne hypothesis Add Unknan Example pustive s m negative MnnntnniL yen niEvnivnnh nlCttnent Best Hypothesis heyei mn ily to eliminate any example and adittst Cuiientvanlnesis Maintain single hypothesis Adjust to new example in nrdertn maintain consiste cy 7 An example can be consistent with the current 39 all he39 hypothesis or t c lalse negative iithe hypothesis says his negative but in tact h is positive or lalse pnsitye iithe hypothesis says his positive but in this negative Generalizationspecializatinn Dropping conditions Adding conditions function llKRiN erllt lrllARNlNrgtxv1mpltr rethrni a hypntliesis H Hasty hypothesis consistent with the rst example in ennmyjes tin etch iciimilimg cxuiitplc in examples do if e ii t aisepnshhe for n then H 9 than so a specialization of H consistent with mmpien else if e is Mine negative tor 1 then a inlizuLiuuuchunaieLeltlwilhexmple if no comment specializationi gcncmlizanon can be found then fall and return H Xi is posrtiye AlternateXlistrue 7 H1lnlll I o thesisvxvmuwmtxnbnanatetx AW Gull Emile A a I w 1 mp R I r WW X2isriegatiye lalsepostiye quot quotquotquot quot 39 7mWIIIWaItxAIternatetxianopahonsccsome r Iltu An in Xbm as n m mm on m x m n An m m s n rm on my 39 X3l5P 5ltherldSe egatWe X my in An n n my on m 7mwm llxPalionsxsome 14 U 39vn lax in Full 39w No hm 7 In x m n m M7 in as 39w m mm gto rw 39 M spos twerfdsenegatwe a m m o m Xbm m m run 0710 m 7H4iniIWaittxPatronstxsomewPatronstmiinno x m m o n nor 5 m to raw 0710 m nusmtx X t n m Emu SS in In hm 710 1393 x w m m n m s m to raw gtou m 39 OthelHypdheses xr m m m m m as m ran 11710 r w 7mwmwmtxPahonsmsonrevtpatronsccnrihano n my h n not 5 w n on 0710 m7 Wallislmalex1lllll rt 1 m m m m s n n rm 1177M m 7WWWmnmwmisummmm n a A least commitment approach keep all the hypotheses that are consistent With all the examples so far Very large search space 7 Mo backtracking N0 900d hegrlstlcs Problem howto represent the current set of remaining Nondetermlnlsnc searCh hypotheses the version space efficiently 7 May need to backtrack Using boundary sets Updatingchecking hypothesis is Sset most specific consistent hypotheses expensive in terms of number of 7 every member MS is consistent with all observations solar and there are no consistent hypotheses that are more speci c examples Gsetmostgeheral cohsisteht hypotheses Need to reevaluate every modi ed e my man on a consistent on n cannabis so at hypothesis on aquot exampes presented andthere are no consistent hypotlmes that are more general mama to mnmsilmm 2r Mom ppmin More speci c Initialize th specific e sets 8 and G to the sets of maximally and maximally general hypothesis that are consistent with the first observed posmve training instance G set of hypotheses that represent disjunction of each single attributevalue pair S the hypothesis which is the conjunction of the attributevalue pairs in the training instance For each subsequent instance i do v uxxlcssmrmm If i is negative Remove from S the hypotheses which match i 7 False positive for si too general Make hypotheses in G that match imore speci c only to the extent required so that they no longer ma ch i 7 False positive foer too general Remove from G any element that is no longer more general than some member 0 Remove from G any element that is more speci c than some other member in G If i is positive Remove from G the hypotheses which do not match i 7 False negative for Gk too speci ic ake hypotheses in S that do not match imore M general only to the extent required so that they match i 7 False negative for S too specific replace by immediate generaliza Ion Remove from S any element that is no longer more speci c than some member of G Remove from S any element that is more general than some other member in v uxxlcssmrmm S Termination One hypothesis is le in the version space indicating that it is the desired concept de nition The version space collapses with either S or G becoming empty indicating that there are no consistent hypothesis for the given training set The algorithm runs out of examples with more than one hypothesis left can use the result for classi cation if all agree ne otherwise can use majority vote origin Japan origin Japan origin Japan rnfr Honda rnfr Toyota color blue color green c ilecaile 1970 ilecaile 1990 type Econorny type Sportx type Econorny t 6 t s w origin USA orign Japan rnfr Chryxler rnfr Honda color blue color white decade 1980 ilecaile 1980 type Econorny type Econorny t 13 Jopon Honda one 1980 5mm G 1 x x 1 15 S Jopori Honda one 1980 5mm E7 3 Joplxri Honda one 1980 Economy 13 Japon mm one 1990 Emmy G 1 12 one 14 15 1 x x 1 Emmy 3 Jopari x one 10 Economy W X2 5M X X5 x1 x2 xamp xy Economy Japan x2 Ellue gtltw Economy E USJ Chrysler Ellue 1980 Economy G Japan x2 Ellue X x5 Japan x2 x3 gtltw Economy SJapan x2 Ellue gtltw Economy E Japan Honda White 1980 Economy G Japan x2 xg gtltw Economy S Japan x2 x x Economy Elegant Algorithm Limited Applicability There are no errors in the training examples Will remoye correct hypotheslslrom set as soon as encounters raise negative hypothesis Does not handle unlimited dlsyunctlons in hypothesis space Extensionsallowllmlledlormsoldlsiuncllon Generalization Hleararchy or more general predicates that represent disiuncllon How do we knowthe hypothesis h is close7to the is target functionfif we on t knowwhatf Sample Complexity Can We decide how many examples We need to rain on Underlying principle An hthat is seriousl Wron will almost certainly be found out Wlth high proba ility a er a small number of examples An h that is consistent with a large set of training examples is unlikely to be seriousy W ong Probably Approximately Correct PAC Learning Stationary assumption training and test data drawn randomly from same population of examples using same distribution Compare HypotheSls f To correct concept F Vie Probability of mlsclassliylrig an instance Probability of instance being in he F Fx not equal Fx HypotheSls is good to extort it classrles instances correctly Hypothesis approximatdy correct lF ne eEPtnlse Valiant Accuracy param eter Hypothesis Probably Approximately Correct AC Pu E QEHubEJd Con dence parameter Hypothesis space showing the Eballquot around the true function f Assume worst case All herr hbadH Have Error gt Spite ofpossible hypotheses Upperboundis Phn agrees With M examples sUS M For lHl hypotheses probability of some it E Hbeing consistent With all M examples Palm contains a consistent M hypothesis with MexamplesiHi 1e To guarantee that is PAC lilo E s 6 Because E rl l areknown can solve forM Blumer et a1 l l M2ELN3LNlHD Given 666 Any hEHconsistentwrthMexamples M 2 1sPACll Bylooking at H forvanousrepresentauons can determine corresponding Ml givingbound on sample complexity for PAC learning Space of H is 215W n attributes Sample complexity of space grows as 2quot Number of examples is at most 2quot Learning algorithm will no better than a lookup table in terms of PAC Problems occurs because ofworstcase complexity analysis and size of 7 Do nel necessarily relleciiire averagercase sample complexity Can we reduce the size of H and still learn reasonable Boolean functions Series of Tests each with conjunction of literals PatronsxSome gt yes Patronsxfull and FriSatx gt yes Ngt kDL restrict size of test to k literals More expressive power than depth k decision tree PAClearn in a reasonable number of examples for small k v uxxicssmrmm 7 i UL J Computational architectures and cognitive models that re neurallyinspired Faithful to coarse neural constraints not neural models Large numbers of simple neuronlike processing units interconnected through weighted links They do not compute by transmitting symbolically coded messages quotprogramquot resides in the structure of the interconnec ions massive parallelismquot and no centralized control v uxxicssmrmm 9 7 Aw 7 7 Approxrmately 10M neurons 104 synapses connections per neuron Neuron fires when its inputs exceed a threshold Inputs are weighted and can have excitory or inhibitory effect Individual firing is slow 001 second but bandwidth is very high 10M bitssec The brain performs many tasks much faster than a computer Scene recognition time 1 second Learning and graceful degradation t uxxrcssmfmnt 1 Ability to bring large numbers of interacting constraints to bear on problem solving soft constraints Noise resistance error tolerance graceful degradation Ability to do complex multilayer recognition with a large number of inputsoutputs quickly Learning with generalization Biological plausibility Potential for speed of processing through fine grained parallelism t uxxrcssmfmnt m Automobile automatic guidance systems Credit application evaluation mortgage screening real estate appraisal Object recognition faces characters Speech recognition and voice synthesis Market forecasting automatic bond trading Robot control process control Breast cancer cell analysis Oil and gas exploration Image and data compression anoulpm Unm mm imiinv Inpul m vulmsixmim n unionsuni u Processing units computeweighted suni oftheir inputs and than apply a threshold function i Wui 39 input output 39 links inks in Input Activation Output runnuun fumth a Step function h Sign function c sigmoid function vulmsixmim n unionsuni u uH w W Continuation of Neural Networks a 1 lm 196 mummy mummme Ihwhmmnmmcanm as by gm ngn gt XOR requires mullihyerneiwurk vulmsixmlm Wu g mnmsumm w Victor Lesser CMPSCI 683 Fall 2004 Continuation of Inference in Belief Networks Automated Belief propagation in PolyTrees V Leswr DEM HUN 2 Network construction Conditional independence of a node and its predecessors given its parents The absence of a link between two variables does not guarantee their independence Effective inference needs to exploit all available conditional independences Which set of nodes X are conditionally independent of another set Y given a set of evidence nodes E PXYE PXE PYE Limits propagation of information Comes directly from structure of network v Lessei CSEEJ mm Definition If X Y and E are three disjoint subsets of nodes in a DAG then E is said to d separate X from Y if every undirected path from X to Y is blocked by E A path is blocked if it contains a node Z such that 1 Z has one incoming and one outgoing arrow or 2 Z has two outgoing arrows or 3 Z has two incoming arrows and neither Z nor any of its descendants is in E v Lessei saaa mm v umvcsmmm Property of belief networks ifX and Y are d separated by E then X and Y are conditionally independent given E An ifandonlyif relationship between the graph and the probabilistic model cannot always be achieved v msucsmmm Whether there is Gas in the car and whetherthe car Radio plays are independent given evidence about whether the SparPlugs fire ignition case 1 man POM PGI PGIR PGI v maussmmm Gas and Radio are conditionalIyindependent if it is known if the battery works c5392 POVBG POUR PGBR PGB v nssuc b E STARTSKM OVES lNZ EUTALSO W E FOR CASE 3 Gasand Rado are independent given no evidence at all But theyare dependent given evidence about whetherthe car Stals For example it the cardoes not start then the radio playing is increased evidence that we are out olgas Gas and Radio are also dependent given evidence about whether the car Moves because that is enabled bythe car starting PGaisadioPGas PRadiolGasPRadio PGasl RadioStart not PGaslStart v umvcsmmm o BNs are fairly expressive and easily engineered representation for knowledge in probabilistic domains 0 They facilitate the development of inference algorithms 0 They are particularly suited for parallelization Current inference algorithms are efficient and can solve large realworld problems v umvcsmmm itemon F Topology trees sinIyconnected sparsely oonnected DAGs Size number of nodes Type of variables discrete cont functional noisylogical mixed Network dynamics static dynamic v msucsmmm m Polytree belief network where nodes are singly connected Ex1ct inference Lineu in size of network Mnltieonneeted belief network This is aDAG but not a polytree Exnct inference Worst use NP 1 v umvcsmmm 12 simpie Bxamplas HI 4 pattEms uf rFa nm n ihniran he handledhy heliefn twnrks E mpmsems an mama Wriahlai Q is a tury mama whining AW Inmmuml Diagnostic Causll Mixed PQE v maussmmm 13 What is pY5Y1Y4 r DefinemmstPTspY5Y4Y3Y2Y1 T yz 7 pY5Y3Y4pY4pY3Y1Y2 pY2 pY1 7 pY5Y1Y4 pY5Y1Y4pY1Y4 r seqnosum owrmi ngvan39abls y 7 pY5Y1Y4 SumY2Y3 pY5Y4Y3Y2Y1 r aSuming van39abls mke on only truth or falsity pY5Y1Y4 pY5Y3Y1 Y4 PY5 not Y3Y1Y4 y5 7 Connect to mmms of Y5 not aheady part of expmon by marginalization sums pY5Y3iY1Y4 v nmvcsmmm 14 SUMY3p 5m Yli4 pY31 Y1 Y4 7 31449615111 Plaid Y1 y SUMY3 gYSlYS Y4 p YSl Y1 Y4 7 Y1 conditionally independent of Y5 given Y3 7 Y3 repments all the contributions d Y1 0Y5 3 7 Ca 1 anodeixmmimmlly imipememoimn dmdanlsgiveuimm ems SUMY3pYSlY3 Y4 pY31Y1 YS 7 Y4 conditionally independent of Y3 given Y1 7 Cw3 Y3 not admmhnt of Y5 which d separats Y1 and Y4 v 1m mam mm SUMY3 pYSlY3 Y4 Sum Y2pY3 Y2 lYl y y 7 Connect men39s on3 mtaireadypanof 1 Z expemion SUMY3 pY51Y3 Y4 Sum Y2 y3 y4 prSlYLY2EY21Y1 7 psisjld psil Syd psjd product rule L SUMY3 pY51Y3 Y4 SUMY2 y pY31Y1Y2M 7 Y2 iniependeut of Y1ypY2Y1pY2 7 De nilion of Baysean network v 11mm mm What is pY1Y5 yl yz 7 pY1Y5pYlY5pY5 7 pYlY2Y3Y4Y5 in terms nfcpt 7 pY5Y3Y4pY3Y1Y2pYlpY2pY4 y3 4 pY1Y5 pY5Y1pY1pY5 7 Bayes Rule 7K pY5Y1pY1 is mam mm 17 K My upw 1 KquotSUMMN5 Y3DY31Y1gtp 1 Y1 Y2 7 Connectw Y3 entonSnotahmdypartofexpreshn 7 NW SUMdPg d mg 7 Y1 oondiu39omlly independent of Y5 givenw 7 pY5 Y3YlpY5Y3 y3 YA Kquot SUMY3 SUMY4JPV5 Y3 Y4EY41Y3p 31 1 L W1 7 Connecth W parent of Y5 not already part of expmhn 7 P615 SUMdPg 5 d Pd 5 YS K quot SUMY3SUMY4pY5Y3 Y4pY4pY31Y1 p 1 7 Y4 independent om pv41Y3pm USES mm m K SUMY3 SUMY4pY5Y3Y4pY4pY3iY1pY1 yl yz 39 K SUMY3 SUMY4p Y5iY3 Y4p Y4SUMY2p Y 3 Y1Y2JY21Y1 pY1 y3 YA r Connecttn Y2 parent on3 not almadypartof expession 7 Psilsj SUMdPsi sj d Pd Isj i I K SUMY3 SUMY4pY5Y3Y4pY4SUlY2pY3i YLYDM pm 7 Y2 independent of Y1 7 Expression that an be calculated fromcpt v msucsmram Can remove a lot of recalculationmultiplications in expression K SUMY3 SUMY4pY5tY3 Y4pY4SUMY2pY31Y1Y2pY2 pY1 Sumrmtions over each Variable are done only for those portions of the expression that depend on wria le Save results of inner summing to avoid repeated calculation 7 Create Innerrnediane Emotions 7 F Y2Y 3Y1 SUMY 2pY 31Y1Y2pY 2 v msucsmmm Ifthere is evidence both above and below PY3Y5Y2 we ssoaaums mums into above 5 am below 539 portions and use aversion o1 aayss mmo mm i 5 is AWN QVWEQ ps is wetreat 1 1k asanormalizingiaaoranmns yew 2 PlQiS tP 4 iQFPQi q isepaiaes 31rqu f so Plain 4999 WW 9 We cdmlatethe am prohahimy in this wound as part olthetopdown pmuduieloi mlmlating m2 is The ssoomi prohahimy is mlmlatai diiectw hythe bottomup pioceduu v msucsmram 21 Most probable explanation MPE OI most likely hypothesis The imtantiation of all the Iennjning vaIiablu U with the highm Folnh39lity given the evidence MPEU l e a39gImXu Pue Maximum a postmod MAP The imtantiation of mine variables V with the highest pmlnbility given the evidence MAPV l e arguing Pve None that the axsignment to Ain MAPAle night he mnpletely different fromthe assignment to Ain MAPABl e sum aver Wines nl39 E vs individml Wines nf E Other queriu probability of an arbitrarylogical expmsion over query varialiesdecision policiu infnmntion value mitng evidence infomntion gathering planning etc v nssuc m 22 Notation M vix Conditional probability Inatix e The evidence Be1x Px i e Posten39or distn39bution of x x Mvix E fxMvix v maussmmm 23 do non e ee39 6 Represents the causal evidence e39 Represents the evidential evidence Need to compute Begtlt v mieic b 24 Belx Px l e Pe39 lxePx l e 7 Bayes rule Pe l e ocPe lx ePx l e Normalization ocPe l xPxl f x desep ee39 a Mx 7600 is mam mm 25 Mx represents the degree to which x might explain the evidential support Pe39IX rrX represents the direct causal support for x PXle Both Mx and not can be calculated in terms of the A and n values ofthe neighbors of x mm mm 25 Mx Pe x 2PM xyPyx Y 2Pe yPyx ydisepm39 y 2 mm x Y 3PltxmePme u EPOC uPu e u disepmJr u 2Pxu7ru u u39Mx M v ussucsmmm Each node must combine the impact of messagesfrom several children Each node must distribute a separate 1 message to each child v ussucsmmm 72745 Props v Mummy 3931 The evidence E can be decomposed into two subsets o E the subset of E that can be accessed from A through its parents o E the subset of E that can be accessed from X 1 through its children M nssucsmmm 32 The current strength of the causal support 7r contributed by each incoming link UI gt x MU PltU l elk The current strength of the diagnostic support A contributed by each outgoing link x gt Y I AyJOc 136 l x The fixed conditional probability matrix Px l u1K un v Lessev CSEEJ mm 33 PM Step 1 Belief updating Inspect msgs from parents amp children and compute Belc aAcJrc where 111ij1 Jrc E Pclu1KunHerul mm i Step 2 Bottomup propagation Compute A msgs to send up A xU is the msg X sends to parent U 1x l5 2 100 2 PXlU1Un H x Uk x UEK 1 w Bis an arbitrary constant factor out contributions to belx from Ui v Lessev csasa mm 54 Step 3 TopDown Propagation Computer msgs to send down quotWm is sent from x to child Y In Xi um MM 2 PXiU1U n MU mu u i n i a Bel x factor out contributions to belx from YJ 1 X Boundary Conditions 1 Rootaxixj istheprior prob dist 2 childless node11x i1 1i 3 Evidence node MX 0 pi v msucsmmm Pulxltu 0mm 1mm mum Eqimimniur cm Equalicut39oi my Eqimunnfoz V 3 Pr Emmi rm 75m ii HY Fquamm fnr awry Px y xi mm M39xtxi Mng Approximate inference techniques Alternative approaches to uncertain reasoning v Lesser CSEEJ mm 37 Victor Lesser CMPSCI 683 Fall 2004 Designed to Deal with the distinction between uncertainty and Ignorance Ratherthan computing the probabiliyol a proposition it computes the probabilitylhal evidence supports the proposition Applicability of 08 assume lack of suf cient data to accurately estimate the prior and conditional probabilities to use Bayes rule incomplete model gt rather than estimating probabilities it uses belief intervals to estimate how close the evidence is to determining the truth ofa hypothesis vrncaamm K Alternative Models of Dealing with Uncertainty informationEvidence DempsterShafier Theory of Evidence Fuzzy logic Logical ways of dealing with uncertainty viucaamm 1 Allows representation of ignorance about support provided by evidence allows reasoning system to be skeptical For example suppose we are informed that one of three terrorist groups A B or C has planted a bomb in a building We may have some evidence the group C is gurlty P C UE We would notwant to say the probability of the othertwo groups being guilty is it in traditional theory forcedto regard belief and disbelief as functional opposites pa t pnct alb l and to distribute an equal amoun of the remaining pro ability to each group DS allows you to leave relative beliefs unspecified viucaamm A 4 ReilGreenBlue 06RedGreeu RedBlue GraemBlue Red Green Blue Snwosethat memento snponsredgremm manage a in mm mmquot wll heasyied m mammal mile naysian mm assumesom the reuniting support IS HSQM DME negation of he mm or its comment line Given a population F bluered green of mutually exclusive elements exactly one ofvvhich f is true a basic probability assignment m assigns a number in 01 to every subset of F such that the sum of the rs is 7 Mass as a represenlalion ol evidence supper true value offis In su s bluereogreenblue red bivegreenredgreeriredbiuegreenemoly sel There are 2l i propositions corresponding to the et A A belief in a subset entails belief39in subsets containingtha subset ireo entails Beiiel in redgreenredbiueredbiuegreen l A E mmicsumm Random switch model Think ofthe eidence as a sWitch that oscillates randomly between two or more positions Theiunction mrepresentstheiraction of the time spent by the sWitch in each position Voting model m represents the proportion of vctes castior each cl several results oithe eidence possibly including that the evidence is inconclusive Envelope model Think cf the evidence as a sealed envelope and m as a probability distribution on What the contents may be Belief or possibility is the probabilitythat B is provable supported bythe evidence Be A 2EMmB SupporlcommilledloA r it mmwiimmv Plausibilityis the probabilitythat B is compatible with the available evidence cannot be disproved Upper beiiei limit on the proposition A PllA 2 a M 4 mlB Supportthat can move intoA r immmw mumWaiwmwwmmm minim s new ii M PllA1BelA mmicsumm Belieflnterval BeAPIA confidence in A Interval width is good aid in deciding when you need more ev39 ence 01 no belief in support of proposition otalignorance 00 beliefihe proposition is false 1 1 beliefihe proposition is true 31 partial belief in the proposition is true 08 partial disbelief in the proposition is true 27 belieffrom evidence both for and against propostion Suppose that mD8 and m2D9 DpD oz DnD2 D 18 wow m2D 72180898 m2D0 m12 FD02 Using intervals 81 and 91 981 Given two basic probability assignment functions m1 and mZ howto combinet e 7 we diverantsoums of evidence 7 FwdPodPqidmnmnndi nnalindzpendmu Bel C Sum m1A m1Bl where A intersects C Normalized by amount of nonzero mass left after intersection Sum m1A m1Bl where A intersect Bl not empty mnmsllmlb ED 18 pro oz Need to normalize 180802 by 72 m2D29 m2aD64 m2DD07 Using intervals 81 and 01 2936 mnmsllmlb Cold cold Pneu pneumonia When we begin with no illlormation m is supposem conupohiisio our heiier after observing lever m Cold n1ull 9 um Suppose m1 corresponds to our heiier altar observing a nllllly hose Mi vi 07111 ml 9 all pummel u Normalizing hi gt til at the heiiernr ism annc39nterl with l gives m5 would lo1nrlmmii Aimyfmpoid 100696 namii FmCoIdPneu loom nizmii Allergy n7iizilziiiimziiii 0017l00846 What is the hem possibility olAllergy pummel l5 Thehwe can enmhhie whim w lt08 02gt ECP06 5C 048 REP 012 6 04 A50 032 a 008 nwepeueeznewenmmem impala lull Arjuna Ml Flu nl nzu MI 8 MR Suppose m enneipnmis in our belief that the problem goes my on trips and thusixzmizlu l with an allergy W l 5 M mnmsilmm Addresses questions about necessity and possibility that Bayesian approach ot answer Prior probabilities not required but uniform distribution cannot be used when priors are unknown Useful for reasoning in rulebased systems mnmsilmm M u NJ lib Source of eVidence notindependenti can lead to misleading and counterintuitive results The normalization in Dempster s Rule loses some meta information and this treatment of conflicting eVidence is controversial r Dillicult to develop theory or utility since Eel is not defined precisehr with respect to ilecision making Bayesian approach can also do something akin to confidence interval by examining how much one s belief would change it more eVidence acquired e hnplich uncertainty associated with various possible changes ounimnni n Methoiltor reasoning with logical expressions oescrilring tuzzy set membership 7 Knowledge representation hasep on ileoreasorrnein ratherthan a Crisp rnernpership orpinary logic natheithan storhastir pioresses 7 Degree ntiillh in proposrion iarhai than negree or mini Logic otgroaunlpropartics as well as calculus lor incomplete intorniation e precision in reasoning is costly and shouip not be pursued more than necessary ApplicationFuzzy Controlrhuorn e gecoiloitarurzyrnlrs m sspecdtesicwthonttahrannpioreoislighr 7 computer uontrclonwlilerongs olmicae 39vmhillg 39 wit sumac ounimnni io Normalization process can produce strange conclusions when con icting evidence is involved 9 ABC quot 1 A 099li 5 001 quot 1 Cl 099li 5 001 39quot1 mzBl 0 Certain of B even though neither piece of evidence supported it well No representation of inconsistency and resulting uncertaintyl ignorance unionsmin ix Fuzziness is a way nfde ning concepts or categories that admit vagueness and degree nothing to do with degree of belief in mething and need not be related to probabilities We believe 5 it will rain today example of Was it a rainy dayquot fuzziness misty all day long but neer breaks into a shower rain for a few minutes and then sunny heavy showers aii daylong unionsmin 2n Giventwo inputs 5 di erence between current temperature and target temperature normalixed bythetargettemperature as thelime derivativeolE ldEdll Compute one output w the changein heat lorcoolingl source Fully variahln NB neg higl NS20PS PB Example rule HE is 20 and IE is NS than Wis PS 39 20 and IE isPBthanis NB Sign icant Reduction in number olnlles needed and handl noisy sensors warm 1r Fuuy variahielakes on aluuyset as avalue 7 Aluzzy set Class Arn x rs enaraelerrzea by a memaersmp lurmlmn va lnal assigns eaen pman rn x a real number between on Height Examplequot in thetall class 7 va x 1 lm any persen wer leetlall 7 We x 0 lm any persen underS leetlall 7 We x rn between lnr nereni gt5 and lt a Piecsw39n LlnoarFunclion rorvarun In Delween 7 vemnrlall ll406 15 17 Hedge 7 systematic mearnealen m a Bharacterislmlunmmn m represent a linguistic speeralrzalmn 7 very lauwnerylali xvlall x2 warm 1 A different type of approximation related to vagueness rather than uncertainty Measures the degree of membership in certain sets or categories such as age height red several did many Example Several 23 35 411 91 Representation A ulalu u E U mmusumm 12 sf Geneullytrue 4 Mlyhetrue 4i Geneullyblxe e a a 7 Heighth Snort Marin m quot s s Height rn kl Express pretmceson possiblevdues of amiable Mme em van is not known 7 mnmmmmemw 7 Usmimmsmmmmzzymiesels Cwsismncyuiaamwilhmpedtoamnoept tallpeoplequot 7 mailman mamas ERay56 7 PnlnyMaaslrel39Kt MrrmxmvIallwenl lnensueulmemyeelu wanmmlamn Hummuselm rmrlnhesvl lall maple l mssimiyinl a NlAll lcnmnememmM immemttudeg enu mmmm islnawwisnezessmyml nmpizmmkisa swaps 11 swaysuw mama mm 55 L a c39 n n39 befuzzy 53 um mumnm mm wra 39 gas x is c n x Is I mm 1 E n y is d Visibility lsmlnw ll visibility is innthen cmdmm is may Cnmmm is H mm HIVcit leiedby e39l nnawdirmzdby d39 lhefuzzy um IS based on Mn nmepts i luzzylmpllmlmn 2 me mmpusnnnai rule mimevmce Fuzzyirmllntim wind 5 a 7 h a and n ae ruzzy sets Represmtation A ualul i u E u sa waders ova sanewrialzle U E umaxau hul E U A H E lumWagr m0 i ll EU a A lul 7aulueui For exam Young am Ridi v n Rl mites39ar product ovutwo variants AXE uMnl39nlqul my in EUVEV other set operators ae mmaimsuguor exarrple A U B iilam i hu 7am hul u an i39 A n B May W l u E U BIC lunmsumm A fuzzy relation is represented by a matrix A fuzzy relation 3 associated with the implication a a l is a fuzzy set of the carteslan product U x V R u vImuv uE UY v E V One ofthe most commonly used fuzzy implications is based on the min op lew mm au bv lunmsumm n Ris arelaionnom A m a 39 dation mm 1 c Thecomvos ion 01R and s is arelaiannom A to c denoted R s F quot3 Tattmll 39mmdahh quot39slhChD I1 Simila y c111 heused as mle mmrme when R 15 lzzy rel Inquot to v x 15 my ans u u v 15 my ans u v v lsinmad by 1 an R 1yY x R quotInduspollen v vhm mintmltut 111711134quot T 11 EU no oethat whenR aah 1 Hspeed 15han men brakmg me 1s medmm speed NnrmaTOD 100 SNOW60 TEO OTOO hrakmgmme Medmm0 ST VZ M 24 05 When uharamanshu when are mam155 hueaw nhe waym carry mA maxrmm hTerehDe sm de ne a mamxin he Dnmpnsdmn The mamx M W DDNDDSTUDM Thierehee Ts dehhed 817 had ma mInaX b The mamxTnHms examp e TS asTanWs mnmsllmm Map 10 The malllx TDHMS examp e Ts asTanWS M 13 n 1 z 3 e s mum mmm sy mm mm mm mm 111 11 11111111 mm mnle mm mm mm mm mm mum 11111151 mm mm 11111121 mum 111111111 mm 1111111 11111111 mm mm 1m mmmm mmm sy mun mm mm mm Slmphfymgterms we have me fu nwmg 11 n n 1 1 1 11 s e 11 s 1 n 11 11 n n n w 5qu 11e mm mewmm mmmmees 11e fuzzy Wmm nmmnmplymg 1e mmXMtymemymsp 2 111111 111 mmng The gmeva 1am 1sg1ven w v 1 gm 151 Thus Vthe gven weed vednv me had m mnmmmn m mun mun mewn mn vlmnx 5 mm 5 mm mnmnmw urmn vl mn Zmn 1 maxlrmnmmwmm 1 11111111111101 mew H115 11 8 2w Equw emw 1n the vmmnmannnwe mvethe vaunwngmzzwepvssnannn mymemamng mu m be apphed e mm in T m 313 yen5 Induced fuzzyset mnmsllmm Nnrvrmi speed Q n 0 w 4 60801011 Give speed mm v mqu mm Given twa inputs g 1quot L 1 temperature narmalized by the target temperature dE the time derivative af E dEdt Campute one autput W the change in heat ar cauling source Fuzzy variables NB neg big quot520 PS PB Example rule if E is 20 and dEis NS thenWis PS if E IS 20 and dE is PB thenWis NB v mqu mm Fuzzy canrmd Fuzzy mmm How Q decide What 0110 W What breaking force to appiy vipnucmrmk Tame Penmsenrmg AH e Ri es Defiiimrm w zariabies iSSCiCiatEd wrm Mes vipnucmrmk NE NS 20 P5 pa NE pas NE w 20 pa P51 207 NS Nas P5 NS Pa N85 NS 20 PS PB a1 0 1 Suppose that E75 and dE0 Hence E is PB with degree 5 E is P5 with degree 5 E is 20 with degree 1 Two rules are applicable a mm A mme 5 A 1 5 g mple A mme 5 A 1 5 39 quot xlwl I It mitst mslwl rIs A quotan vunrmnu 71w 0 1 r neruzzincation trans iating a ruuy category to a precise output neruzzincation using tire center at grav y wnyisiuuyiogicsosuccessrui tumitsumm ax Easily and succinctly represent expert system rules that involve continuous variables Model environment variables in terms of piece wise linear characteristic functions vunrmnu More on Logical Reasoning about Uncertainty Learning tumitsumm tn Victor Lesser CMPSCI 683 Fall 2004 InstanceBased Learning Analytical Learning ExplanationBased Learning First work done at Umass on learning rules of Baseball Maleriallake from Mitchell s Machine Leaming Begin Resource Bounded Reasoning Vlmyta lm Encode specific experiences in memory rather than abstractions Carry out generalizations at the retrieval time rather than the storage time lazy learning In their most general form Based on partial match on a similaritymetric retrieve a set of caseshnstances most relevant to the present context Adapt the retrieved experiences to new situalionsThis could be based on algorithms ranging from a simple k nearest neighborclassi cation lo chains of reasoning vmmrm Key idea juststore all training examples x x l Nearestneighbor Given query instancexq hrsl locale nearest training example xm it fh KNearestneighbor Given xq laxevole among its hnearest neighborsildiscrele valuedlargellunction lake mean ollvalues olhnearest neighbors l realvalued ft v Vlmyta lm i Might want weight nearer neighbors more heavil A i i V fri 2iwfn r iquot Where 1 251 t x J I i J u i 39 And 109 1 is distance between x x Oii lell Pnsiiiie and Negatiieiiaining Examples Note now it makes sense to use all training 39Neaiest neignbniyeslui xq 5i Blassilies i as negative examples 39quotStead OfJust k 7 Classi cation much slower On HEM is decision surface lui nearest naigmi quaii in iegiun Will We same value uumcsiwmi 5 Imagine instances described by 20 attributes but only 2 are relevant to target function Curse o dimensionality nearest neighbor is easily mislead when highdimensional X Similarto overiitting One approach Instances map to points in SRquot Continuous real values Less than 20 attributes per instance Lots of training data Advanta es Stretch i quot axis by Weignt zli Wnere 21 i 2 chosen to gagnmtgt s very fist d t minimize prediction error 39 5 quot0 59 WW 3 a Lengttittie axesinaimiiespnnainine more relevant Learquot COWPlEXtaFQEtiUHCtiOHS was Don t lose information Use crossvalidation to automatically cnoose Weignts Disadvantages ii i n Minimize aim in uiassiiiuaiimi 39 SW at Wequot me Setting z tnzem eliminates inis dimensiun all together EaSllWOOlEd by liremama ributes i l J 4 Note Ann lorms loml approximation to nor eadi query point x Why not form an miicn approximation f z lorregion smoumlin x 7 Fl llmalmndlmln kneaea nagmnls Romanian via 7 anuaiianc 7 Picnics pieuewse appmxinialinn tn i Swud choices of mono nl39nimixe explo graling lils m 7 Squared enninvei kneaiesttn neignms we in 2 WWW fxfx 7 Diaanmveighted squaedeimi we all neigmmis 51W 100fI KdwI Key elements of problem solving CBR are Cases represented as solved problems lndex cases under goals satisfied and planning problems avoided Retrieve prior case sharing the most goals amp avoiding the most problems Adapt solution of prior case to solve a new cas i May require resolVing problems andor repairing solutions lndex new case and solution under goals satisfied and planning problems aoided Can apply instancebased learning even whenX a M Need different distance metric CaseBased Reasoning is instanc applied to instances with symbolic lo s ebased learning gic description Cann nnn usercomplamt errorsaonshutdown cpuiiiodel PowerPC operatingsystem windows networkconnectlon ecm memo Installedappllcatmns Excel Netscape Vlrusscan 15k 1 1 ikely cauae 722 CADET 15 stored examples of mechanical devices Each training example qualitative ivnctioni mechanical structure New query desired function Targetvalve mechanical structure for thisivnction Distance metric match qualitative function descri tions Size of largest svbgraph between the function graphs i 39 39 EMEEW Exploits dornian as matching more likely umuaiiaammiiii mm A 7gt B to A gtgtltgt B y uxeiese nmt is A half pound of beef Twot bles chop the garlic into pieces the size of matchheads Shred the beer Marinate the beer in the garlic sugar corn starch ncewme and soy sauce surfry the spices rice erle and beeffor one minute Add the green bean to the spices nce wine and beef surfry the spices rice Wine green bean and beef torthree minutes Add the salt to the spices ncewine green bean and beef y uxeiese nmt is CHEF consists of six processes a Problem anticipation the planner anticipates planning problems by noticing features in the current input that haye preyiously participated in past planning pro lerns 7 Plan retrieval The planner searches for a plan that satisfies as many of its current goals as possible while ayoiding the problems that it has pre ic e 7 Plan modification The planner alerts the plans it has found to satisfy any goals from the inputthat are not already achieyed 7 Plan re air When a lan fallsthe planner fixes thefaculty plan by building up a casual explanation ofwhy the failure has occurred and using it to find the different strategies for repairin it 7 Credit assignment Along with repairing a failed plan the planner wants to repairthe characterization of the world that allowed it to create the failed plan in the first place ltdoes this by using the casual explanation ofwhy the failure occurred to identify the features in the inputthat led to the problem and hem as pre ic iye o 7 Plan storage The planner places successful plans in memory indexed by the goals that they satisfy and the problems that they avoid u lame amt i4 Recipe for BEEFANDBROCOLI Found nearest recipe is BEEFWITHGREENBEANS Modifying recipe BEEFWITHGREENBEANS To satisfy include broccoli in the dish Placing some broccoli in recipe BEEFWITHGREENBEANS Considering ingredientcritic re doing step Stir fry the Variable Chop the broccoli into pieces the size of chunks ngredient critic applied Chef alters old plans to satisfy neNgoals using a set of umresaami modi cation rules and a set of nail objects 5 Anattpnttn ntheet enemaxpnnnntmgar Twa tatlespnnn ntsnysaum t natt pmth at tmccntt One a punt ntttce we tt natttatlespnnn ntmtn with one teaspnnn ntstt one chunk ntgalttc Chth 91quotquot mo pacesme sue a mtmuns s the be mnnaemeneq mm game sugar cum smut mwne am 50 sun at the mum m asme sin a dunks M le Sn bathe sung III Will WM and he The haettsnuw tendet neatst nuw tastes satty The tsh nuw tastessavnty nentsnnuwtastessweet Thlhmtmlt nmum hamsquot nuw tastestttegatttu 39 ALTERPUW StDErEFFECT Reptauetne step that Causes the vtntattng Dnndtttnn thn mte that dues nnt have the same state EWEUt but auntevestne same gnat SLPtTrANDrREFORttt SDtttthe step mm twn separate steps and run them tnaependentty 39 ADJUNTVPLAN REMOVE Add a new Step tn be run atnng thn a step that Causes a stderetteutthat temnves the stderettect as tt ts Created tndextng BEEFVANDVBROCCOU under gods and ptobtents tfthts ptan ts successfut the fottcwtng shoutd betrue he ed ts now tender The brocoott ts now atsp tnctude beet tn the dtsh tnctude broccott tn the dtsh Make a ittrfry dtsh The ptan avedsfatture exernptttted bythe state The broccdt ts now soggy tn tectpe BEEFVANDVBROCCOU Seatehtng tot pan that sattsttes tnput goas tnduue htcken tn the dtST trtdttd mowpeatn the dtsh Mate asttrttyd 9t Cottectng m0 actttattng tests ts the dtST STVLESTtR FRV tsthettetn aME T ts the ttetn a VEGETABLE ts the TEXTUREot ttetn CRtSP n ttss trrtrted Stttrthng tn so much ttqtttd maltes cttsp vegetaues soggy Remtnded 0t atatute trt the BEEFANDVBROCCOU ptan Fatute the vegetaue ts nowsoggy ChtekenSnow peatsttt Ftytng Fatute Meat sweats the t t Driving down on Make a stirIn dish Succeeded Driving down on void Iailure exemplilied bylhe state The broccoli is now soggyquot in rea39pe BEEFANDBROCCOLI Succeeded Driving down on Indude chicken in the dish Failed Trying more general goal Driving down on Include meal in the dish Succeeded Driving down on Indude snow pea in the dish FailedTwing more general goal Driving down on Include vegetable in the dish Succeeded Found redpegt RECQ BEEFANDBROCCOLI Give Space of Hypotheses Training examples of target concepts Domain theory for explaining examples Determine Hypothesis consistent with both the training examples and the domain theory May need fewer training examples to distinguish relevant attributes because generalization can be based on logical rather than statistical reasoning Given Space of Hypotheses Training examples of target concept Determine Hypothesis consistent with the training examples May need a lot of training examples to distinguish relevant from irrelevant features u mm m Normal Learning Involves a long process ofuncovering the consequences of prior knowledge Guided by a speci c training examples Use prior knowledge to reduce the complexity of the hypothesis space to be searched Reducing sample complexity Improving generalization accuracy u mm m Given Instances pairsolobjects 7 represented owe predicatesiype minimum Owen Material Densly and On Hypotheses mamom W then rules horn clauses Targa Concept Sateroman y Wsis matueeds to helmed Training Example SdetodadrlOBJmBJZl Dmsnyroan 1 Initialize hypothesis l For each positive training example not covered by hypothesis 1 Explain how training example satis es target concept in terms of domain theory 2 Analge the explanation to determine the most general conditions under which this explanation proof holds 3 Refine the hypothesis by adding a new rule whose preconditions are the above conditions and whose consequent asserts the target concept Domain Theory SafeToStackw y lt Not Fragiiey SafeToStackmy lt Lighterxy Lighterxy lt Weightxwx and Weightwrwy and esswxwy Weightmw lt Voiumexv and Densitxd and Equaimrv d Weightx5 lt Typex ENDTABLE Determine Hypothesis h consistent with domain theorle if B does not entail not h mnmsixmlm in Exn oulwn39 sawM an 11 m um m l mm r 39l l Wm I mmummmn mm Alumnus 2 huhgb39mvuie mnmsixmlm 1x Justi ed generalization from single example Compiling relevant parts of domain theory Explanation determines feature relevance Regression determines needed feature constraints Generality of result depends on domain theory Still require multiple examples SaleTnStach y lt77 vniume x vx menstmxnx KEuuaKvx tvneqvx dx amp LessThanMx S amp Typey ndtable Theoryguided generalization from examples Exampleguided operationalization oftheories Just restating what learner already knows Is it learning Are you learning when you get better overtime at chess Even though you already know everything in principle once you know rules of the game Are you learning when you sit in a math class Even though thosetheorems follow deductively from e axioms you ve already learned Knowledge of the domain is represented as logical rules View learning as a form of reasoning Background knowledge is used to construct proofs or explanations of experience These proofs are then generalized based on a type of goal regression and compiled into rules It is a process of transforming existing knowledge into another form so as to be able to apply it efficiently also called speedup learning Resource Bounded Reasoning Learning as a search for the best hypothesisfunction that matches the training data Learning technology has become very formal and closely related to statistics Credit Assignment is a complex problem Many different approaches to learning based on e sdpeivised vs Unsd eivised 7 Learning Single Step decision process vs multiple Step one We havejust scratche the c Suppo Vector Machines Hidden Markov Models Bavsean NetworkL i g 7 Field is evoiving veiv quckly t mew m Agents have limited computational power They must react within an acceptable time Computation time delays action and reduces the alue of the result Must cope with uncertainty and missing information Limited planning horizon The appropriate level of deliberation is situation dependent It is beneficial to build systems that can tradeoff computational resource for quality of results t mew m Al deliberation techniques have a large variability in execution time memory consumption etc Difficult to predict reasonable upper bounds Nonrdetermini iD Variable amount o Variable amount oluncerlainty The computational resources required to reach an optimal decision normally reduce the overall utility of the result Not only an issue of more computing Need intelligent control How to build agents that satis ce rather than optimize How to get the best answer within available resources uunmsixmim ax Proposed by Herbert Simon in 1957 to denote decision making that searches until an alternative is found th me 5 e agent s aspiration level criterion It appears probable that however adaptive the 39 anisms in Iearnin and choice ideal maximizing postulated in economic theory Evi ently organisms adapt well enough to satisfice they do not in general optimize quot It is not feasible computationally or desirable economically to compute the optimal answer The appropriate level of deliberation is situation dependent It is bene cial to build systems that can tradeoff computational resources for quality of results Key to Applying AI in Open Environments uunmsixmim in o Computation Time versus Quality of Results 0 Memory versus Time versus Power 0 Cost of Information versus Quality of Resuns 0 Communication Bandwidth versus Quality 0 Manymore vummm u Satis cing approximate reasoning Satis cing approximate modeling Satis cing approximate metareasoning Satis cing bounded optimality Satis cing a combination of the above minimum u G 0d 7 Within in semmsuwivire vehicle tvpes msnlnns am movement mamdeiiaicsloiasmanv vendesmovingthmugmhe enviromienl as visible giwngpreiermeeiniming veniuesnnvpe w am determining vehide medians win mgn preuson v 5 d1 amassed 39 v 39 v2 2 12 d2 7 The lollowng evens and iner mirewondvvg laatures are ueiiain Venue ivpe vi lomied Bill movvv mm weed 51 am diledlnn m Vemdelvpe v2 at 12 moving wiweeu 2 am mremnn d2 No Wis vehids ae present in ire environment 39 Best within Dealli 7 From a mimryanalvsis ii isiixevinaiinere exiasa vehide mivpe vi lomled nea ll quotraving will weed between andsi in dvemon d1 Olhel vemdes quotvigil he WEE vummm u 39 Appropriate approximation is based both on the needs of the user and on the current state minimum is Medical diagnosis of treatments Combinatorial Optimization Evaluation of belief networks Realtime heuristic search Information Gathering Mobile robot navigation Database query optimization Software validation Realtime Graphics no 4 Mmbvrinhgnonrpumbnvrk Fig 3 Roleojmuolmm Weseek sprrurem Tahmnlakesadvmlageof mm mmwmmm watmlandtempnmlcnheienaeby f 1 f mm mmmdmmcs 5 My tovanmlsczSesn in MW graphicsSegmth wat and Recmrgles cameo aroma 1m f lm m 3 boundspnteswaxpdtn tanewfmme 39g quotWquot attenm nsnr mimgsofinhdhnesbnundm n WWW m m 5m s adlzaemyrdaslan em ngmund zandmobonnfwnles Hnmwmgvei worm Notes illneAAAFallsyw on Heme mmpuomn woe wle a mom Methods l39nneducmg Fig 3 mrngwexpnmlsolulwm OUJEES canbe sampkdatavmetynl39 moi remlmmnsbel39me burg sputum ol39models mm on m1 model to cnmwslled min nal scenes allowing pmgremvelysimpkrmodels 39l39hel39vlly foratmdem bementhe clantyol39an detailed model el t isdescdoedby 795 object andthe computaan reqmiedtn vemces The ampli ed model right is renderle object descxioedbvaemces Honill Mengyel Working NolasomeAAAFalsymp on Flexible Computation 1995 A methodologyfor building satis cing systems by addressing the following four major issues 1 Elementary algorithm construction 2 Performance measurement and prediction 3 Composabillty of methods subsystems 4 Monitoring and metalevel control In the context of an Overall System Architecture mnmsixmlm ix Develop computational methods that allow small quantities of resources such as time emory or information to be traded for gains in the value of computed results Quality measures replace correctness r Certaintyr Likelihood that the answer is DDllEDt 7 Precisionr ALDuraDy ol the answer 7 Speci city Level ol detail in the answer 7 Com letenessrPart olprohlem solved 7 Cost 7 Overall solution DDSt 7 Multidimensional quality measum What s are the baselevel computational methods How does the system represent and project resourcdqualitytradeoffs What kind of metalevel control is used What are the characteristics of the overall system mnmsixmim sir An anytime algorithm is an algorithm whose quality of results improves gradually as computation time increases Anytime algorithms have been designed for planning Bayesian inference CSPs combinatorial optimization diagnosis vnmcsnm 5i Deusrun Quality Ideal maximal quality in no time Traditional quality maximizing Anytime utility maximizing mnmsixmim 52 Interruptlble algorithms are anytime algorithms whose run time need not be determined in advance They can be interrupted at any time during execution and return a result Contract algorithms are anytime algorithms that take the deadline as input No assumption is made about the results produced before the deadline v New m 53 Both interruptible and contract algorithms have performance profiles Qt which return quality as a function of time Note that for contract algorithms the horizontal axis is the time that was given in advance u w my amt 54 Good contract algorithms can be easier to design because they can optimize with respect to the extra time input Examples Depthbounded or costbounded tree search Discretizing a continuous problem Composition of anytime algorithms v New m 55 What if we want to use a contract algorithm in a setting where we don t know the deadline We can repeatedly activate the contract algorithm with increasing run times t w my amt 56 When the deadline occurs we can return the result produced by the last contract to nish Deadhne I Mme Return resu t from this vumcsiwnm a mmsmm 5x More on Resource Bounded Reasoning vumcsiwnm m Victor Lesser CMPSCI 683 Fall 2004 Continuation of Systematic Search for CSPs More Complex Search Vlmltf lm Initial state the empty assignment Successor function a value can be assigned to any variable as long as no constraint is violated Goal test the current assignment is complete Path cost a constant cost for every step yummy function BACllTllACllthSlillflffapf returns isoiufion or failure relvrvlrrviuivrBirmvnnuuiinpl function REf URSlllEBACKIRlleiGlvulgvnrrvfrrpf refurnsr solution or failure if vulgvnvnl is complete filtll refvrv arsigmvuv viriSrlzurhvisucvrMirnaLril39luinlrslnyvulgmvnup loreathmiveiv 01mlllDnnulVllvirlmnnuignvvvtrrn do 39l urine is consistent l iiil vulgvnnvrccording in Covsmlwslnpl flev runliRErUiSIl iBnriinlrllllslimlullu nsrigimuifup if result falfvre hm refvm ivsvfi end refvrvallviv Vlmltf lm WAied Wiigiseii WAtliie Wi iaud WlDd Nigrezii A An arc from Xto Yin the constraint graph is consistent if for every value of x there is some value of Ythat is consistent with X Can detectmore inconsistencies than forward checking Can be applied as a preprocessing step before search As a propagation step after each assignment during search how is this advantageous Process must be applied repeatedly until no more inconsistencies remain W Reduce the branching factor by deleting values that are not consistent with the values ofthe assign variables FonNard checking a simple kind of propagation lrilial domains Alter WAiei Ana owequot A er VDlle anamdiechnu WA in W nliljiinii 9 Lpdale VINSW m IIIIIEI IIEI hamw sum No possible solution with WAred and Qgreen lm nrC m nttte ntmtttymthretu ddtmmflh inpuls 05pttbllm LSPMlhtnrltlblkUQ local variables genre a queue at ates uttrmlly our 5 ms in exp be ltmu while uttuw wmt rmply tin X3 X t mm Erl RUNquht39tlt l if RM m l rlt tlll ltNlAl lll 39 tame AiuYElGHBORSlX do mum when it mum tnte the removea value function REMO Erlt 0slsTENrrVZUlEsl X munutt ut w loop tar em r m Do ttAINlYgl do quot301 etc mum cycat no he nned mt some vane yln Donating men duet xnnm DOMNMW lemweltnue and I39elu rn runhut end A graph is k consistent if for any set of k variables 5 c there is alway onsistent value forthe kth variable given any consistent partial assignment for the other k 1 variables 7 A graph ts strongly krconslstent if its laconslstefft for l l k e lF Kffttfnber ofnoees than no backtracking Higher forms of consistency offer stronger forms of constraint propagation 7 Reduce amount ofbacktrackmg 7 Reduce effecttve branching factor 7 Detemtng lrfDOrfSlSIerfl parllal asstgnrnents Balance of h wmuch preprocessing to get graph to be k consistent versus more searc Abinary CSP has at most 0n7 arcs Each arc X gtY can only be inserted on the agenda d times because at most d values of Ycan be deleted Checking consistency of an arc can be done in 0d7 ime Worst case timecomplexity is 0n2d Does not reveal every possible inconsistency l lll llullllllfdl Datllllatltlllg always backtrackto most recent assignment Not Ncient Cttltlllfl Gel A et olvariahles that caused the lailnre Eatkltlmpllfq backtracktothe most recent variable assignment in the con i set 39 Simple motl ication 039 BACKYRACKING SEARCH leed variable emerne G NSW V T SA WA NT G NSW V U Sk batkup in T makes no sense What Variables Caused the Cenntct Backtracktn v nest ferenlvaflable set n enfltct set 39 Forward Checking can also generate conlliclset based on variables that remove elements lroln domain l onllml dilelled Dalklumpllig better de nition 0 con ict sets leads to better perlonnance holtomuptopdown state integration o WAnee NSWred can never be solved T red inen assign mow anemone 3 9 0 Howlo knowlnal Willem oonlllol sel MM is Cnnllml sel MM is selnlpreuedmg yarlableslhal Caused NT lngelher nnn any subsequenlvanamee lo have no oonslslenl solutions WAand NEW slime they don t CnnlllDlWllh NT SAlalls oonlllollWA NT G based on lorward pmpagalmn banklump lo a o absorbs ennnm sel olSA minus a WA NSW NTL menump in NT NT absorbs unnnm sel olG minus NT WA NSWL vulmsixmlm u The complexity of solving a CSP is strongly related to the structure of its constraint graph Decomposition into independent subproblems yields substantial savings Didquot u Oldcrnc Treestructured problems can be solved in linear time Oln d1 Cutsel conditioning can reduce a general CSP to treestructured one and is very e icient if a small cutset can be found vulmsixmlm l5 Pnneemne lNFORMEDrENDKTRNDK VARSlEFT VARSVDONE llall variables are corslaenl lnen mlullon lomd srov a variable in VARSVLEFl mat lslncmllld mam random mm VAR mm VARSVDONE Lei VALUES ha nlm lhle valueslorVAR ordered in asenmng order amidvvg in nunme mennms win varieties anARSrLEFT mmmmmhevnsnc Foream VALUE in VALUES unn minim lound nva does ml mnllld win anyvanadelnal lsan RSDONE mm Assign VALuE in VAR Call lNFORMEDrEAEKTRACKCARHEFT VARSVDONE end n end my end pmuedure Begin pmgmm Let VARSlEFT ha m all vaiahles eam assgnee m lmllal sale at ARSVDONE ml Call lNFORMEDEMKTRA KOARSLEFT VARSDONE End pmgmm mnmsumm u 1 loose a Variable as men order vanables lrum rool to leaves such that every node s palm precedes it in the oidcrlng 2 For J lrom 71de m 2 applyHEHOVEINCOKSISTENTletnthHQ 3 For y lnunn 1 to n aeign or consistently wnn I m39e vl mnmsilmm in A r 9 Q WA WA U n g M of d 6 Cutset conditioning instantiate in all ways a set of variables such that the remaining constraint graph is a tree Cotset size 5 a rpniime OifVI739LiZi veiyfastforsmall c CSPs are a speciai kind or probiem states defined by values of iixed set of variables goal tes denied by constlainr on variable vaiues Backtracking depthrlirst midi with one ariabie assigned per nod Variable ordering and vziiie selection iieuiistis help significantly Forward checking prevenls assignments that guarantee iatev failure Consirainl propagation tgq are cansutcncv does additionai mull to constrain values and detect inconsistenc es The CS representation alinws anaiysis 0i min structure Tree summed CSPs can be soived in linear time Iterative mincon ius is usually eneuiie in practice MultiLevelHierarchical Search lnterdependence of Search Paths NonMonotonic Domain Cost of Control Nonuniform cost of operator application Operators applicableto a search node are not affect by path to node Markw like assumption Rating of one search node doesn t affect rating of others nodes on different paths Near independence of search Patiis Contrast Drilling example samples laken from one well may change the likelihood of olher well being successful Keel Heunsnc Search Km Ti StatesOperators7 Each numbered row L418 and column 356 Independence of StatesOpemiors7 If 4 Linequot1hen no my to fill 1n5 Crossword Puzzle Search as Interacting Subproblems Walt Filteiin Exploiting PailMs Constraints w 39klWXIIE Dim EDthi EDquot awmtznmwnwwm mva APm inw What happens if you add in more constraints among words Grammar heme What happens if you add in speech input 7 Probabhstm knowiedge about Word iikeiihnnd 7 Constraint satisiabimnvsbonsiramtoptimizatmn 39 Ham and suit Lnnsivahts How does the interaction among subproblems change Subgoals B and 0 cannot be solved independen y El gm aim From the pespective of subgoai Bi subgoai D appears to be the best soiution cost of2 vs cost of 5 Hang subgod C but since 0 must aiso be satiiied to sdve A the overaii best sdution is subgoai C n o Simple blocks world problem initia Hna In ialstate ONCACLRBC c Goal state 0NBC0MABCLRA Operators 1i clearaik leyl A CLRM CLRy 2i Put0nCLRw n cuzm 0mm vunmsixmim 75 Take into account the existence of other states or solution to other subproblems ARC consistency Reevaluate rating nodeoperator Reduce variance uncertainty in rating Decrease cost of operator application V eliminates certain states as mleaslble vunmsixmim 1 Cannot just combine the solutions to omega and arms since eachsolution mak the othersolution invalid onionsuni 2n Some problems are nearly decomposable Their subproblems have only a small amount of interaction A goal that can be decomposed into a set of subgoals is nearly decomposable if most ofthe time independently considered solutions to the subgoals can be combined into a consistent solution to the goal Only a subset ofthe subgoal solutions interact so as to be incons39stent39 Consistent solutions can be found without completely resolving the joint subproblem onionsuni 2x Two states both apply to same operatorampdata Two nodes A amp B ratingA gt ratingB If extended by same data then ratingA1 gt Many Al techniques have been developed to handle nearly decomposable problems rating B1 Typically this Involves Independently soIVIng the subproblems and then repairing the 3 solutions to deal with any interactions quot M a How does it relate to success of Heuristic mum Mm mintmm 39 w minimum mil I RepairLocal Search success I I WMHymnal I A W m w lt I Dynamically evolvmg Interacting subproblems NourMouomue Example 3 I mwm I Blackboard Systems