Artificial Intelligence CSCI 538
Popular in Course
Popular in ComputerScienence
This 169 page Class Notes was uploaded by Janiya Renner MD on Friday October 30, 2015. The Class Notes belongs to CSCI 538 at Texas A&M University - Commerce taught by Derek Harter in Fall. Since its upload, it has received 9 views. For similar materials see /class/232407/csci-538-texas-a-m-university-commerce in ComputerScienence at Texas A&M University - Commerce.
Reviews for Artificial Intelligence
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 10/30/15
CS 538 Artificial Intelligence Fall 2007 Lecture 4 Heuristic Search Derek Harter Texas AampM University Commerce Many slides over the course adapted from Srini Narayanan Dan Klein Stuart Russell and Andrew Moore Announcements Today 39 A Search 39 Heuristic Design 39 Local Search Recap Search 39 Search problems 39 States successors costs start and goal tests 39 Search trees 39 Nodes represent paths have costs 39 Strategies differing fringe management 39 Tree vs graph search Uniform Cost Problems 39 Remember explores Increasmg cost contours 39 The good UCS is complete and optimal 39 The bad 39 Eltplores options in every direction 39 No information about goal location BestFirst Greedy Search Arml Buclmres Pinst Rimuicll Vilcea 1 n1 Timisoam Urziceui V aslul Zeriml suaAgmruue dismnce to Bucllares 366 0 BestFirst Greedy Search 39 Expand the node that seems closest Arad 329 374 366 380 193 39 Bucharest 0 253 39 What can go wrong BestFirst Greedy Search BestFirst Greedy Search 39 A common case 39 Bestfirst takes you straight to the goal on a wrong path 39 worstcase ike a badly gUIded DFS In the worst case 39 Can explore everything 39 Can get stuck in loops if no cycle checklng 39 Like DFS in completeness fInIte states w cycle checking Best First Greedy Search Algorithm Complete Optimal Time Space Greedy BestFirst Search Y Obm Obm 39 What do we need to do to make it complete 39 Can we make it optimal Combining U08 and Greedy 39 Uniformcost orders by path cost or backward cost gn 39 Bestfirst orders by goal proximity or forward cost hn 39 A Search orders by the sum fn gn hn Example Teg Grenager When should A terminate 39 Should we stop when we enqueue a goal 39 No only stop when we dequeue a goal Is A Optimal 39 What went wrong 39 Estimated goal cost gt actual good goal cost 39 We need estimates to be less than actual costs Admissible Heuristics 39 A heuristic is admissible optimistic if Mn S VOL where mm is the true cost to a nearest goal 39 Eg Euclidean distance on a map problem 39 Coming up with admissible heuristics is most of what s involved in using A in practice timality f A lokin l a l 39 a r 7 77 4 L in 7 a r 7 7 7 3 I ll l l til it g 1 ll 1 l V l39fgiiIV l if l L quot V l39igiijIW Lilli a l l 39 P r0 0 f l f 39 What could go wrong 39 We d have to have to pop a suboptimal goal off the fringe queue 39 This can t happen x V 77 I t Y V17 77 f N N G 39 Imagine a suboptimal goal 6 is on the queue 39 Consider any unexpanded fringe node n on a shortest path to optimal G 39 n will be popped before G fut S 93 93 lt 9G 9G fG fn lt fG What to do with revisited states h 100 1 The heuristic h iS 1 2 clearly admissible What to do with revisited states h 100 f 1100 21 1 2 90 490 100 7 I 0 104 If we discard This new node Then The search algoriThm expands The goal node nexT and r39eTur39ns a nonopTimal soluTion What to do with revisited states 1 2 100 1 mm 21 1 2 90 mo 49o 100 O 102 104 InsTead if we do noT discard nodes revisiTing sTaTes The search Termina res wiTh an op rimal soluTion Optimality of A Contours 39 Consider what A does 39 Expands nodes in increasing total fvalue f oontours 39 Proof idea optimal goals have lowerf value so get expanded first Cnsistency Wait how do we know we expand in increasing fvalue Couldn t we pop some node n and find its child n to have lower f value YES g10 What do we need to do to fix this Consistency Cm a x 2 Mn hn Real cost always exceeds reduction in heuristic Admissibility and Consistency 39 A consis ren r heur39is ric is also admissible Lef r as an exercise 39 An admissible heur39is ric may no r be consis ren r bu r many admissible heur39is rics ar39e consis ren r UCS vs A Contours 39 Uniformcost expanded in all directions 39 A expands mainly toward the goal but does hedge its bets to 608 ensure optimality Properties of A UniformCost A Admissible Heuristics Most of the work is in coming up with admissible heuns cs Good news usually admissible heuristics are also conSIstent More good news inadmissible heuristics are often quite effective especially when you have no choice Very common hack use or x hn for admissible h or gt 1 to generate a faster but less optimal inadmissible h Example 8Puzzle III II I ll Ell IE Start State Goal State 39 What are the states 39 What are the actions 39 What states can I reach from the start state 39 What should the costs be 8Puzzle Number of tiles misplaced Why is it admissible hstart 8 This is a relaxed problem heuristic E E E Stan State Goal State Average nodes expanded when optimal path has length 4 steps 8 steps 12 steps ID 112 6300 36 x106 TILES 13 39 227 8Puzzle ll What if we had an easier 8puzzle where any tile could slide any one direction at any ti me Total Manhattan distance Why admissible hstart 312 18 E E E Stan State Goal State Average nodes expanded when optimal path has length 4 steps 8 steps 12 steps TILES 13 39 227 MAN 12 25 73 HATTAN 8Puzzle 39 How about using the actual cost as a heuns c 39 Would it be admissible 39 Would we save on nodes 39 What s wrong with it 39 With A tradeoff between quality of estimate and work per node Trivial Heuristics Dminance 39 Dominance emact Vn 2 han 2 hcn man h h 39 Heuristics form a semilattice a b 39 Max of admisSIble heuristics is admissible ha Mn ma ham hbn I he 39 Trivial heuristics 39 Bottom of lattice is the zero heuristic what Zero does this give us 39 Top of lattice is the exact heuristic Course Scheduling From the university s perspective 39 Set of courses C1 c2 cn 39 Set of room times r1 r2 rn I Each pairing ck rm has a cost wkm 39 What s the best assignment of courses to rooms States list of pairings Actions add a legal pairing Costs cost of the new pairing Admissible heuristics Other A Applications Pathing routing problems Resource planning problems Robot motion planning Language analysis Machine translation Speech recognition Summary A 39 A uses both backward costs and estimates of forward costs 39 A is optimal with admissible heuristics 39 Heuristic design is key often use relaxed problems On Completeness and Optimality 39 A with a consistent heuristic function has nice properties completeness optimality no need to revisit states 39 Theoretical completeness does not mean practical completeness if you must wait too long to get a solution spacetime limit 39 So if one can t design an accurate consistent heuristic it may be better to settle for a non admissible heuristic that works well in practice even through completeness and optimality are no longer guaranteed Local Search Methods 39 Queuebased algorithms keep fallback options backtracking 39 Local search improve what you have until you can t make it better 39 Generally much more efficient but incomplete Example NQueens What are the states What is the start What is the goal What are the actions What should the costs be Types of Problems 39 Planning problems 39 We want a path to a solution examples 39 Usually want an optimal path 39 Incremental formulations 39 Identi cation problems 39 We actually just want to know what the goal is examples 39 Usually want an optimal goal 39 Completestate formulations 39 Iterative improvement algorithms Example NQueens h0 39 Start wherever move queens to reduce conflicts 39 Almost always solves large nqueens nearly Instantly Hill Climbing 39 Simple general idea 39 Start wherever 39 Always choose the best neighbor 39 If no neighbors have better scores than current quit 39 Why can this be a terrible idea 39 Complete 39 Optimal 39 What s good about it Hill Climbing Diagram objecti efunction oba maximum shoulder local maximum quotflatquot local maximum quot space curren state 39 Random restarts 39 Random sideways steps m msmbm 9a m mmmlt 30563 5 ea in 5 La 3 95 SEQ S Imam u m o msmu m v n O U 0 3 u A U I 00 0UDt 39 D U 039 39 n The Shape of a Yet Harder Problem Remedies to drawbacks of hill climbing 39 Random restart 39 Problem reformulation 39 In the end Some problem spaces are great for hill climbing and others are terrible Monte Carlo Descent 1 S initial state 2 Repeatktimes 3 b C d If GOALS then return 8 S 6 successor of S picked at random if hS S hS then S 6 8 else Ah hS39h with probability exp AhT where T is called the temperature 5 6 539 Metropolis criterion 3 Return failure Simulated annealing lowers T over the k iterations It starts with a large T and slowly decreases T Simulated Annealing 39 Idea Escape local maxima by allowing downhill moves 39 But make them rarer as time goes on function SIMI LATED ANNEALINGproblem schedule returns a solution state inputs problem a problem schedule a mapping from time to temperature local variables current a node next a node T a temperature controlling prob of downward steps current e llAKE NODEINITIAL STATEproblem for tlt 1 to 00 d0 TH scheduldt if T 0 then return current nerte a randomly selected successor of current AER VALUEne17t VALUEurrerzt if AE gt 0 then currentlt nert else curreutg n e rt only with probability 6A ET Simulated Annealin 39 Theoretical guarantee Em 39 Stationary distribution 1923 OC e W 39 If T decreased slowly enough will converge to optimal state 39 Is this an interesting guarantee 39 Sounds like magic but reality is reality 39 The more downhill steps you need to escape the less likely you are to every make them all In a row 39 People think hard about ridge operators which let you jump around the space In better ways Beam Search Like greedy search but keep K states at all times if o i o 30 o o o Greedy Search Beam Search Variables beam size encourage diversity The best choice in MANY practical settings Complete Optimal Why do we still need optimal methods Genetic Algorithms 32752411 32748552 327482 24748552 24752411 24752411 20 26 32752411 32752124 322124 11 14 24415124 24415411 6 244154 F ness Semc on Pans CrossOver 24748552 Genetic algorithms use a natural selection metaphor 39 Like beam search selection but also have painvise crossover operators with optional mutation Probably the most misunderstood misapplied and even maligned technique around Example NQueens Why does crossover make sense here When wouldn t it make sense What would mutation be What would a good fitness function be The Basic Genetic Algorithm 1 Generate random population of chromosomes 2 Until the end condition is met create a new population by repeating followmg steps 1 Evaluate the fitness of each chromosome Select two parentchromosomes from a population weighed by their fitness 2 3 With probability pC cross over the parents to form a new offspring 4 With probability pm mutate new offspring at each position on the chromosome 5 Place new offspring in the new population 3 Return the best solution in current population Search problems Blind search Heuristic search bestfirst and A Construction of heuristics Variants of A Local search Continuous Problems 39 Placing airports in Romania 39 States x1y1x2y2x3y3 39 Cost sum of squared distances to closest city Timisoara X 212 I Hirsova I Urziceni Bucharest Dobreta l l Eforie Grint Mths 39 How to deal with continous therefore infinite state spaces 39 Discretization bucket ranges of values 39 Eg force integral coordinates 39 Continuous optimization 39 Eg gradient ascent W 8f 8f 8f 8f 8f 8f 8178y178278y278378y3 a e I Iozfa 39 More on this next class l l Image from viasorg CSci 538 Artificial Intelligence Fall 2007 Lecture Bayes Nets 11122007 Derek Harter Texas AampM University Commerce Slides adapted from Srini Narayanan ICSI and UC Berkeley railistio Mels A probabilistic model is a joint distribution over a set of variables PX1 X2 X I Given a joint distribution we can reason about unobserved variables given observations eVIdence 39 General form of a query P I3q 13 1 l3 k Stuff you Stuff you care about k already know I This kind of posterior distribution is also called the belief function of an agent which uses this model Independence Two variables are independent if PX Y PXPY This says that theirjoint distribution factors into a product two simpler distributions Independence is a modeling assumption I Empirical joint distributions at best close to independent I What could we assume for Weather Traffic Cavity Toothache How many parameters in the joint model How many parameters in the independent model Independence is like something from CSPs what Examle Ineenence I N fair independent coin flips PX1 PX2 PXn H 05 H 05 H 05 T 05 T 05 T 05 V PX1X2Xn 2 lt K Examle Ineenence Most joint distributions Pg 133 are not independent Most are poorly modeled as independent PT S Conditional Independence PToothache CavityCatch If I have a cavity the probability that the probe catches in it doesn39t depend on whether I ave a toothache I Pcatch toothache cavity Pcatch cavity The same independence holds if I don t have a cavity I Pcatch toothache cavity Pcatch cavity Catch is conditionally independent of Toothache given Cavity I PCatch Toothache Cavity PCatch Cavity Equivalent statements I PToothache Catch Cavity PToothache Cavity I PToothache Catch Cavity PToothache Cavity PCatch Cavity Cnitinal lneenence Unconditional absolute independence is very rare why Conditional independence is our most basic and robust form of knowledge about uncertain environments PXYZ PXZPYZ What about this domain I Traffic 39 Umbrella I Raining What about fire smoke alarm The Chain Rule ll Can always factor anPijint distribution as an incremental product 0 conditiona Istnbutlons PX1 X2 X PX1PX2X1PX3X2 X1 PX17X27 39 39 39 Xi 1 7L Why This actually claims nothing What are the sizes of the tables we supply The Chain ule lll Trivial decomposition PTraffic Rain U mbrella PRainPTraffic RainPU mbrella Rain Traffic With conditional independence PTraffic Rain Umbrella PRainPTrafficRainPUmbreIIaRain Conditional independence is our most basic and robust form of knowledge about uncertain environments Graphical models help us manage independence The Chain Rule IV Write out full joint distribution using chain rule I PToothache Catch Cavity PToothache Catch Cavity PCatch Cavity PToothache Catch Cavity PCatch Cavity PCavity PToothache Cavity PCatch Cavity PCavity PCaVitY Graphical model notation Each variable is a node The parents of a node are the other variables which the decomposed joint conditions on MUCH more on this to come PToothache Cavity PCatch Cavity Cmining Evience Pcavity toothache catch 06 Ptoothache catch cavity Pcavity 06 Ptoothache cavity Pcatch cavity Pcavity C This IS an example of a naive Bayes model PCauseEffect1 Effect PCause H PEffectilCause 7L Graphical Models I Models are descriptions of how a portion of the world works I Models are always simplifications I May not account for every variable I May not account for all interactions between variables I What do we do with probabilistic models I We or our agents need to reason about unknown variables given eVIdence I Example explanation diagnostic reasoning I Example prediction causal reasoning I Example value of information Bayes Nets Big Picture I Two problems with using full joint distributions for probabilistic models I Unless there are only a few variables the joint is WAY too big to represent explicitly I Hard to estimate anything empirically about more than a few variables at a time I Bayes nets more properly called graphical models are a technique for describing complex joint distributions models using a bunch of simple local distributions I We describe how variables locally interact I Local interactions chain together to give global indirect interactions Graphical Model Notation Nodes variables with domains Can be assi ned observed or una55gnedunobsewed Arcs interactions Similarto CSP constraints I Indicate direct influence between variables For now imagine that arrows mean causation Toothache Example Coin Flips I N independent coin flips No interactions between variables absolute independence Example Traffic Variables R It rains I T There is traffic Model 1 independence Model 2 rain causes traffic Why is an agent using model 2 better Example Traffic ll 39 Let s build a causal graphical model Variables I T Traffic R It rains L Low pressure D Roofdrips B Ballgame I C Cavity Example Traffic II I Variables o I T Traffic I R It rains I L Low pressure a a I D Roofdrips I B Ballgame Example Alarm Network Variables B Burglary A Alarm goes off M Mary calls 39 J John calls E Earthquake Example Alarm Network Burglary Earthquake ayes Net Semantics Let s formalize the semantics of a Bayes net A set of nodes one per variable X A directed acyclic graph A conditional distribution for each node I A distribution over X for each combination of parents values PXA A PXa1an I 1 n I CPT conditional probability table I Description of a noisy causal process A Bayes net Topology graph Local Conditional Probabilities railities in Ns I Bayes nets implicitly encode joint distributions I As a product of local conditional distributions I To see what probability a BN gives to a full assignment multiply all the relevant conditionals together TL P131 m2 mm H Pm7parentsXZ i1 I Example Pcavity catch atoothache I This lets us reconstruct any entry of the full joint I Not every BN can represent every full joint I The topology enforces certain conditional independencies Examle Cin Flis PX1 PX2 PXn h 05 h 05 O h 05 t 05 t 05 t 05 Ph h t h Only distributions Whose variables are absolutely independent can be represented by a Bayes net with no aros r Traffic P0 t Example Alarm Network Burglary B E PABE T T 95 T F 94 F T 29 F F 001 PbAeAIaAjAmPjaPmaPa bA39IeP X0999XO998000062 09 XO7 0b x 0001 PB PE 001 Earthquake 002 A PJIA A PMA T 90 T 70 F 05 F 01 P e Example Alarm Network TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE M TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE A TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE Sum TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE PJ MABE 000000120 000059102 000036503 00006281 1 000000000 000000003 000000071 000049800 000000051 000025329 000015644 000026919 000000000 000000296 000007022 004930225 000000013 000006567 000004056 000006979 000000000 000000057 000001348 000946205 000000006 000002814 000001738 000002991 000000009 000005632 000133417 093674270 099999998 Examle Nai39ve ayes Imagine we have one cause y and several effects x Py 1 2 71 PyP1yP2ly Pny PCause Effect 1 Effect PCause HiPEffecti Cause This is a naive Bayes model Ch 13 pg 482 We ll use these for classification later Size f a ayes Net How big is a joint distribution over N Boolean variables How big is an Nnode net if nodes have k parents Both give you the power to calculate BNs Huge space savings Also easier to elicit local CPTs Also turns out to be faster to answer queries next class PX1 X2 X uilin the Entire Jint I We can take a Bayes net and build the full joint distribution it encodes TL P131 m2 mm H Pm7oarentsXZ i1 I Typically there s no reason to build ALL of it I But it s important to know you could I To emphasize every Bll over a domain implioitly aepresents some JOIl lt distribution over that omain Examle Traffic I Basic traffic net Let s multiply out the joint 0 r 14 r 34 P T IR r t 34 t 14 r t 12 t 12 Examle everse Traffic I Reverse causality t 916 T t 716 PRT t r 13 0 r 23 t r 17 r 67 Causality I When Bayes nets reflect the true causal patterns I Often simpler nodes have fewer parents I Often easier to think about I Often easier to elicit from experts I BNs need not actually be causal I Sometimes no causal net exists over the domain especially if variables are missmg I Eg consider the variables Traffic and Drips I End up with arrows that reflect correlation not causation I What do the arrows really mean I Topology may happen to encode causal structure I Topology really encodes conditional independencies Creating Bayes Nets 39 So far we talked about how any fixed Bayes net encodes a jomt distribution Next how to represent a fixed distribution as a Bayes net Key ingredient conditional independence The exercise we did in causal assembly of BNs was a kind of intuitive use of conditional independence I Now we have to formalize the process After that how to answer queries inference Estimation I How to estimate the a distribution of a random variable X I Maximum likelihood I Collect observations from the world I For each value x look at the empirical rate of that value countm total samples I This estimate is the one which maximizes the likelihood of the data PCB I Elicitation ask a human I Harder than it sounds I Eg what s Praining cold I Usually need domain experts and sophisticated ways of eliciting probabilities eg betting games Estimation I Problems with maximum likelihood estimates I If I flip a coin once and it s heads what s the estimate for Pheads I What if I flip it 50 times with 27 heads I What if I flip 10M times with 8M heads I Basic idea I We have some prior expectation about parameters here the probability of heads I Given little evidence we should skew towards our prior I Given a lot of evidence we should listen to the data I How can we accomplish this Stay tuned 080i 538 Artificial Intelligence Fall 2007 Adversarial Search amp Game Playing 10162007 Derek Harter Texas AampM University Commerce Many slides over the course adapted from Srini Narayanan Dan Klein Stuart Russell or Andrew Moore Games Motivation I Games are a form of multiagenz environment What do other agents do and how do they affect our success Cooperative vs competitive multiagent environments Competitive multiagent environments give rise to adversarial search aka games Why study games Games are fun Historical role in Al Studying games teaches us how to deal with other agents trying to foil our plans Huge state spaces Games are hard Nice clean environment with clear criteria for success Game Playing Axes 39 Deterministic or stochastic 39 One two or more players Perfect information can you see the state Want algorithms for calculating a strategy policy which recommends a move in each state Deterministic SinglePlayer Deterministic single player perfect information Know the rules Know what actions do Know when you win Eg Freecell 8 Puzzle Rubik s cube it sjust search Slight reinterpretation I Each node stores the best outcome it can reach I This is the maximal outcome of its children I Note that we don t store path sums as before After search can pick move that leads to best node Deterministic TwoPlayer I Eg tictactoe chess checkers I Minimax search I A statespace search tree I Players alternate I Each la er or ply consists of a round 0 moves I Choose move to position with hi hest minimax value best ac ievable utility against best Play I Zerosum games I One player maximizes result I The other minimizes result Tictac toe Game Tree MAX X MIN 0 MAX X MIN 0 TERMINAL Utility Minimax Example Minimax Search function MAX VALUE5tate returns a utility value if TERMINALTESTstate then return UTILITYstate v 9 00 for a 5 in SUCCESSORSstate do 11 MAXU MIN VALUES return U function MIN VALUEstate returns a utility value if TERMINALTESTstate then return UTILITYsmte v e 00 for a 5 in SUCCESSORS5tate do We MINv MAX VALUES return U Minimax Properties Optimal against a perfect player Otherwise Time complexity I Space complexity 39 Obm For chess b z 35 m z 100 I Exact solution is completely infeasible I But do we need to explore the whole tree Resource Limits Can not search to leaves Limited search I Instead search a limited depth of the tree Replace terminal utilities with an eval function for nonterminal posmons Guarantee of optimal play is gone More plies makes a BIG difference Example Suppose we have 100 seconds can explore 10K nodes sec So can check 1M nodes per move I ocB reaches about depth 8 decent chess program Evaluation Functions I Function which scores nonterminals Black to move White to move White slightly better Black winning 39 Ideal function returns the utility of the position In practice typically weighted linear sum of features EWKS u11f18 w2f28 ann8 egf1s num white queens num black queens etc Iterative Deepening Iterative deepening uses DFS as a subroutine 3 Do a DFS which only searches for paths of length 1 or less DFS gives up on any path of length 2 4 If 1 failed do a DFS which only searches paths of length 2 or less 5 If 2 failed do a DFS which only searches paths of length 3 or less and so on This works for singleagent search as well Why do we want to do this for multiplayer games AlphaBeta Pruning I A way to improve the performance of the Minimax Procedure Basic idea If you have an idea which is surely bad don t take the time to see how truly awful it is Pat Winston We don t need to compute the value at this node No matter what it is it can t effect the value of the root node ocB Pruning Example Pruning in Minimax Search ocB Pruning I General configuration I or is the best value the Player can get at any choice point along the current path Player Opponent I lfn is worse than or MAX will avoid it so prune n s branch Player I Define B similarly for MIN Opponent ocB Pruning Pseudocode function hIAX VALUE8tat returns a utility value if TERMINALTESTstate then return UTILITYstate u lt oo for a s in SUCCESSORSstate do w MAXu MINVALUEs return 39U function MAX VALUEstate oz returns a utility value inputs state current state in game oz the value of the best alternative for MAX along the path to state 6 the value of the best alternative for MIN along the path to state if TERMINAL TESTstate then return UTILITYstate u H 00 for a s in SUCCESSORSstate do B u H MAXu M1NVALUEs 13 if u 2 5 then return 1 or MAXa 7 return 1 oc B Pruning Properties Pruning has no effect on final result Good move ordering improves effectiveness of pruning With perfect ordering Time complexity drops to Obm2 I Doubles solvable depth I Full search of eg chess is still hopeless A simple example of metareasoning here reasoning about which computations are relevant NonZero Sum Games Similar to minimax Utilities are now tuples I Each player maximizes their own entry at each node I Propagate or back up nodes from Children 126 432 l612 741 511 152 771 545 Stochastic SinglePlayer What if we don t know what the result of an action will be Eg I In solitaire shuffle is unknown I In minesweeper don t know where the mines are Can do expectimax search I Chance nodes like actions except the environment controls the action chosen I Calculate utility for each node I Max nodes as in search I Chance nodes take average expectation of value of children Later we ll learn how toformalize this as a Markov DeClSlon Process Stochastic TwoPlayer Eg backgammon MAX Expectiminimax Environment is an extra player that moves after CHANCE each agent Chance nodes take MIN expectations othenNise like minimax if state is a MAX node then return the highest EXPECTIMINIMAXVALUE of SUCCESSORSQSHU C if state is a MIN node then return the lowest EXPECTIMINIMAX VALUE of SUCCIESSORS51 Lt if state is a chance node then return average of EXPECTIRIINIMAX VALUE of SUCCESSORSstute Game Playing StateoftheArt Checkers Chinook ended 40yearreign of human world champion Marion Tinsley in 1994Used an endgame database defining Berfect play for all pOSItions involvmg 8 or fewer pieces on the oard a total of 443748401247 positions Chess Deep Blue defeated human world champion Gary Kasparov In asiXgame match In 1997 Deep Blue examined 209 million positions per second used verysophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply Othello human champions refuse to compete against computers which are too good Go human champions refuse to compete against computers which are too bad In go b gt 300 so most rograms use pattern knowledge bases to suggest plauSIb e moves Stochastic TwoPlayer Dice rolls increase b 21 possible rolls with 2 dice I Backgammon z 20 legal moves Depth 4 20 x 21 x 20312 x109 As depth increases robability of reaching a given no e shrinks 39 80 value of lookahead is diminished So limiting depth is less damaging But pruning is less possible TDGammon uses depth2 search very good eval function reinforcement 25 24 23 22 21 21 19 111111151413 learning worldchampion level play 080i 538 Artificial Intelligence Fall 2007 Ch 15 Hidden Markov Models 11202007 Derek Harter Texas AampM University Commerce Slides adapted from Srini Narayanan ICSI and UC Berkeley Reasoning over Time Often we want to reason about a sequence of observations I Speech recognition I Robot localization I User attention I Medical monitoring Need to introduce time into our models Basic approach hidden Markov models HMMs Using Continuous Variables Kalman Filters More general dynamic Bayes nets Time and Uncertainty Previous ch 14 Bayes Nets techniques probabilistic reasoning in the context of static worlds 39 each random variable has a single fixed value Treating diabetic patient 39 assess current state of patient 39 actual blood sugar level and insulin level vary over time Dynamic aspects of the problem are essen al States and Observations Process of change can be viewed as a series of snapshots 39 each of which describes the state of the world at a particular time Each snapshot or time slice contains a set of random variables 39 Some observable some not Observables or evidence Et Unobservable state variables hidden Xt The observation at time t is E et Stationary Assumption Assume changes in world state are caused by a stationary process 39 A process of change that is governed by laws that do not themselves change over time In practical terms the conditional probability can be stated generically for all time t P Ut ParentsUt is the same for all t Given the stationary assumption need only specify variables within a representative time slice Markov Assumption I The current state depends on only a finite history of previous states First studied by Markov and are called Markov Processes or Markov Chains firstorder current state only depends on previous 39 secondorder thirdorder etc F Xt X PXtXt1 0t 1 I Called the transition model Markov Assumption In addition to restricting the parents of the state variables we must restrict parents of the evidence variables 39 PEt X01 E0t1 PEt X9 Called the sensor model or sometimes the observation model it describes how the sensors that is the evidence variables are affected by the actual state of the world Example R2 1 Ple 07 03 N I An HMM is Initial distribution PRain0 Transition Model P Raint RainM Sensor Model P Umbrellat Raint Markov Models A Markov model is a chainstructured BN Each node is identically distributed stationarity I Value of X at a given time is called the state 39 As a BN PX1 PXX 1 Parameters called transition probabilities or dynamics specify how the state evolves over time also initial probs Conditional Independence I Basic conditional independence I Past and future independent of the present I Each time step only depends on the previous I This is called the first order Markov property I Note that the chain is just a growing BN I We can always use generic BN reasoning on it if we truncate the chain Joint Distribution P x0 x1 xt E1 Et P099 Him P09 Xi1 PEi Xi Examle Markv Chain I Weather 0391 I States X rain sun I Transitions 09 0 9 x This is a CPT not a 01 BN I Initial distribution 10 sun I What s the probability distribution after one step PX2SUH PltX2SUHIX1 sunPX1 SUH PX2 SUHIXl rainPX1 2 rain 0910 01 00 09 MiniFrwar Algrithm 39 uestion probability of being in state x at time t 39 Slow answer Enumerate all sequences of length twhich end in s I Add up their probabilities PXt 2 sun 2 Z P131 Et1sun 131Cl3t1 PX1 sunPX2 sunX1 sunPX3 sunX2 sunPX4 sunX3 sun PX1 sunPX2 raianl sunPX3 sunX2 rainPX4 sunX3 sun MiniForward Algorithm Better way cached incremental belief updates A PM Z Ptt 1Pvt 1 CI3t 1 P1 known Forward simulation Examle I From initial observation of sun 38gt 8 33gt I From initial observation of rain 38gt 839 33 PX1 P09 P09 Stationary Distributions I If we simulate the chain long enough I What happens I Uncertainty accumulates I Eventually we have no idea what the state is I Stationary distributions I For most chains the distribution we end up in is independent of the initial distribution I Called the stationary distribution of the chain I Usually can only predict a short time out Web Link Analysis I PageRank over a web graph I Each web page is a state I Initial distribution uniform over pages I Transitions I With prob c uniform jump to a random page dotted lines I With prob 1c follow a random outlink solid lines I Stationary distribution I Will spend more time on highly reachable pages I Eg many ways to get to the Acrobat Reader download page I Somewhat robust to link spam Google 10 returned the set of pages containing all your keywords In decreasmg rank now all search engines use link analySIs along With many other factors Mst Likely Exlanatin 39 uestion most liker sequence ending in x at t Eg if sun on day 4 what s the most likely sequence I lntuitively probably sun all four days 39 Slow answer enumerate and score PXt 2 sun 2 P131 Et1sun PX1 sunPX2 sunX1 sunPX3 sunX2 sunPX4 sunX3 2 sun PX1 sunPX2 raianl 2 sunPX3 sunX2 rainPX4 816an3 2 sun MiniViteri Alrithm I Better answer cached incremental updates vvv A A A rain rain rain rain I I 39 mthj P1It 17 at arg maXP1t 17 1i 1 I Read best sequence off of m and a vectors MiniViteri mtx 2 133 P1t 17 1t 1 maXPat13t1 max P1It 1 azt1 CEli 2 g1axPxtmt1mt 1 t l m1M P1 Example Va tsls s 71s s gt gtooo gt Example s s t sl 3 35 25 15 45 13 23 Example ltsgt rain 0 o 1 A S 45 r way lt s gt Example ltsgt 11 rain 0 sun 39 45 45 ltsgt Hidden Markov Models I Markov chains not so useful for most agents I Eventually you don t know anything anymore I Need observations to update your beliefs I Hidden Markov models HMMs I Underlying Markov chain over states 8 I You observe outputs effects at each time step I As a Bayes net Conditional Independence I HMMs have two important independence properties I Markov hidden process future depends on past via the present I Current observation independent of all else given current state 3 I Quiz does this mean that observations are independent given no eVIdence I No correlated by the hidden state Filtering PXte1t PXte1t1 et Apply Bayes Rule P etl Xv e11 1 PXte1t 1Petle1t1 Use Markov Property PetXt X PXte1t1Pete1t 1 Predict from previous state PXte1t1 th1 PXtXt1 e1t 1 PXt1e1t1 Same form Use recursion FnNar Alrithm I Can ask the same questions for HMMs as Markov chains I Given current belief state how to update with evidence I This is called monitoring or filtering Formally we want PXt tlel39t P33t 1t QC PC3715 e113 Z Pent 1733157 81215 3375 1 2 Pt 1761t 1Ptt 1P tt CI3t 1 P tt Z P33t37t 1P37t 17 611 1 CI3t 1 Examle Pt 1t Olt ft37t Pta 6115 0500 0500 ft P tt Z Ptt 1ft 1t 1 CI3t 1 Viteri Alrithm 39 Question what is the most liker state sequence given the observations 39 Slow answer enumerate all possibilities I Better answer cached incremental version 33T arg maxP395L 1T 1T 1 i mtkvt P1t 17 517157 e116 P371t 1781t 1Ptt 1P tlt P tlt rpagtlt13tlt 1 max P1t 1a 1t 1 t l 1i 2 P tlt QB Ptt 1mt 1 15 11 Example Rain 1 Rain 2 Rain 3 Rain 4 Rain 5 State true r true r true r true r true paths false r false k false false v false umbrella true true false true true 8182 gt5155 gt0361 0334 0210 most likely lt paths 1818 0491 1237 0173 0024 III11 In12 In13 In14 m125 Real HMM Examples I Speech recognition HMMs I Observations are acoustic signals continuous valued I States are specific positions in specific words so tens of thousands I Machine translation HMMs I Observations are words tens of thousands I States are translation positions dozens I Robot tracking I Observations are range readings continuous I States are positions on a map continuous Dynamic Bayes Nets DBN Multiple Hidden State Variables Each State is a BN Structured Probabilistic Inference Factorization CPRM CSPN k PRM TDPRM BN DBN HMM CBN 9 6600 RMM lt29 M N Coordination CSci 538 Artificial Intelligence Spring 2007 Ch 7 Logical Agents 10302007 Derek Harter Texas AampM University Commerce Many slides over the course adapted from Srini Narayanan Dan Klein Stuart Russell or Andrew Moore Logical Agents 39 Reflex agents find their way from Arad to Bucharest by dumb luck 39 Chess program calculates legal moves of its king but doesn t know that no piece can be on 2 different squares at the same time 39 Logic KnowledgeBased agents combine 39 general knowledge amp 39 current percepts 39 to infer hidden aspects of current state prior to selecting actions 39 Crucial in partially observable environments Outline Knowledgebased agents Wumpus world example Logic in general models and entailment Propositional Boolean logic Equivalence validity satisfiability Inference Knowledge bases Inform on engine I dom ain in dependant algorithms Knowledge been i dom ain spmific content Knowledge base set of sentences in a formal language Declarative approach to building an agent or other system 39 Tell it what it needs to know Ellgen it can Ask itself what to do answers should follow from the Agents can be viewed at the knowledge level ie what they know regardless of how implemented Or at the implementation level 39 ie data structures in KB and algorithms that manipulate them A simple knowledgebased agent function KBeAGEN pemept returns an action b tatic KB 3 knowledge ase t munm initially 0 indicating time TELLKB MAKEPERCEPTSENTENCM percept t nation ASKKB MAKEACTIOMQUEKYU TELLKB MAKE ACTIONrSENTENCE c 0n fl 2 f return action 39 The agent must be able to Represent states actions etc Incorporate new perce s Update internal representations of the world Deduce hidden properties of the world Deduce appropriate actions Wumpus World PEAS description Performance measure 39 gold 1000 death 1000 39 1 per step 10 for using the arrow Environment 39 Squares adjacent to wumpus are smelly 39 Squares adjacent to pit are breezy 39 Glitter iff gold is in the same square 39 Shooting kills wumpus if you are facing it 39 Shooting uses up the only arrow 2 3 A 39 Grabbing picks up gold if in same square 39 Releasing drops the gold in same square Sensors Stench Breeze Glitter Bump Scream Actuators Leftturn Right turn Forward Grab Release Shoot Wumpus world characterization 39 Fully Observable No only local perception 39 Deterministic Yes outcomes exactly specified 39 Episodic No sequential at the level of actions 39 m Yes Wumpus and Pits do not move 39 Discrete Yes 39 Singleagent Yes Wumpus is essentially a natural feature Exploring the Wumpus World emugnum in o The KB initially contains the rules of the environment 0 11 The first percept is none nonenonenonenone Move to safe cell eg 21 0 21 Breeze indicates that there is a pit in 22 or 31 Return to 11 to try next safe ce Exploring the Wumpus World 12 YET El Age11 We Gold are square mm K much mm 9M isiled Wumpus lt Stench in cell wumpus is in 13 or 22 not in 11 not in 22 or stench would have been detected in 21 wumpus is in 13 22 is safe because of lack of breeze in 12 pit in 3 1 Move to next safe ce 22 Exploring the Wumpus World ltltm1gn 22 Detect nothing Move to unvisited safe cell eg 23 23 Detect glitter smell breeze Thus pick up gold Thus pit in 33 or 24 Logic in general Logics are formal languages for representing information such that conclusions can be drawn Syntax defines the sentences in the language Semantics define the quotmeaningquot of sentences 39 ie define truth of a sentence in a world Eg the language of arithmetic 39 x2 2 y is a sentence x2y gt is not a sentence 39 x2 2 y is true iff the number x2 is no less than the number y 39 x2 2 y is true in a world where x 7 y 1 39 x2 2 y is false in a world where x 0 y 6 Entailment 39 Entailment means that one thing follows from another KB a 39 Knowledge base KB entails sentence or if and only if or is true in all worlds where KB is true 39 Eg the KB containing the Giants won and the Reds won entails Either the Giants won or the Reds won 39 Eg xy 4 entails 4 xy 39 Entailment is a relationship between sentences ie syntax that is based on semantics Schematic perspective Sente H see m Represez rmrmn m E E Jihad 391 Aspects of th teat world Entaits Sentence sauuemag teat world F l If KB is true in the real world then any sentence 0 derived from KB by a sound inference procedure is also true in the real world Models Logicians typically think in terms of models which are formally structured worlds with respect to which truth can be evaluated We say m is a model of a sentence or if or is true in m Mor is the set of all models of or Then KB 0i iff MKB g Mor 39 Eg KB Giants won and Reds won or Giants won Entailment in the wumpus world Situation after detecting nothing in 11 moving right breeze in 21 Consider possible models for KB assuming only pits 3 Boolean choices gt 8 possible models Wumpus models xg Wumpus models 39 KB wumpusworld rules observations Wumpus models 39 KB wumpus world rules observations G1 quot12 is safequot KB a1 proved by model checking Wumpus models 39 KB wumpusworld rules observations Wumpus models 39 KB wumpusworld rules observations a2 quot22 is safequot KB F02 Inference Procedures KB i or sentence or can be derived from KB by procedure i Soundness iis sound if whenever KB id it is also true that KB or no wrong inferences but maybe not all true statements can be derived 39 Completeness i is complete if whenever KB I or it is also true that KB ior all true sentences can be derived but maybe some wrong extra ones as well Propositional logic Syntax 39 Propositional logic is the simplest logic illustrates basic ideas The proposition symbols P1 P2 etc are sentences 39 If S is a sentence 8 is a sentence negation If 81 and 82 are sentences S1 82 is a sentence conjunction If 81 and 82 are sentences 81 v 82 is a sentence disjunction If 81 and 82 are sentences 81 gt 82 is a sentence implication If 81 and 82 are sentences S1 ltgt 82 is a sentence biconditional