Special Topics CAP 6938
University of Central Florida
Popular in Course
Popular in System Engineering
This 77 page Class Notes was uploaded by Khalil Conroy on Thursday October 22, 2015. The Class Notes belongs to CAP 6938 at University of Central Florida taught by Gita Sukthankar in Fall. Since its upload, it has received 39 views. For similar materials see /class/227213/cap-6938-university-of-central-florida in System Engineering at University of Central Florida.
Reviews for Special Topics
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/22/15
K CAP693802 Plan Activity and Intent Recognition Lecture 3 Event Hierarchy Circumscription cont Event Tracking in SOAR Instructor Dr Gita Sukthankar Email gitarseecsucfedu Schedule T amp Th 130245pm Location CL1 212 Office Hours HEC 232 T 3430pm Th 101130am Outline Finish discussing Kautz paper relevant pages pp 124 pp 4647 pp 6365 Student presentations of final research project New domain Opponent modeling for games and battlefield analysis Background on SOARfl39acAir SOAR Event tracking in SOAR CAP6938 Dr Gita Sukthankar Kautz s Model First order predicate calculus Event hierarchy logical encoding of a semantic network Event predicates Abstaction axioms Decomposition axioms General axioms hardest to use for inference Includes temporal constraints between the steps Equality constraints between the agents executing steps or objects involved in steps Preconditions Special event predicates Eng AnyEIent top level abstraction CAP6938 Dr Gita Sukthankar Event Hierarchy Circumscription Event hierarchy General axioms My Evenl V d apg mcilihih D Enu Event Entrapment Prepare MES Was immin Disnss anzzmm Temporal i a 6 05 EDII Fas a Make Dish 2 Meal sq Dish Put15mm Make Noooles 15quot Make Make Make Fenucini Spaghetti Spaghem Make Anreda Pesto Marmara auae MEKE Cthken Mannm s2 55 Make mm Marmara Y I Sau e as a Make Spaghelll H Kautz A Formal Theory of Plan Recognition and its Implementation in Reasoning about Plans CAP6938 Dr Gita Sukthankar Closedworld Assumptions What are they Exhaustiveness Disjointedness Componentuse assumptions Minimum cardinality assumptions Are observations assumed to be complete CAP6938 Dr Gita Sukthankar Exhaus veness Known ways of specializing an event type are the only ways of specializing it Example the only pasta dishes which exist are fettucini alfredo spaghetti pesto spaghetti marinara Allows Sherlock Holmes style conclusions Not fettucini alfredo Not spaghetti pesto Must be spaghetti marinara CAP6938 Dr Gita Sukthankar Disjointedness Types are disjoint unless one abstracts the other or they abstract a common type Can t invent new dishes meat ravioli that abstract both the meat dish and the pasta dish Similar to exhaustiveness but for event types Allows the conclusions to be made Making a pasta dish Therefore agent is not making a meat dish since neither abstracts each other CAP6938 Dr Gita Sukthankar ComponentUse Assumption Seeing an event implies the disjunction of the plans which include it as a component Agent is boiling water must be a pasta dish since nothing else includes that event Allows for missing but not erroneous observations CAP6938 Dr Gita Sukthankar Minimum Cardinality Assumption Assume parsimony the minimum number of plans to explain the observations Without this assumption each event could always belong to a separate plan If you observe the event get gun and go to woods assume that both are related to the plan hunt rather than believing that the person is going hunting and robbing a bank CAP6938 Dr Gita Sukthankar 9 Kautz s Inference Procedure Note that there have been faster more specialized inference procedures developed to handle Kautz s model From observations apply component use assumption and abstraction axioms to find all top level plans contain the observed event Apply other constraints expressed by general axioms locally this is where most of the work occurs Combine information from multiple observations using the minimum cardinality assumption to minimize the number of plans under consideration CAP6938 Dr Gita Sukthankar 10 Summary Handles well Incomplete sequences of observations Plans that lack tota ordering Handles poorly Errors in observations Situations with large numbers of possible but improbable plans In contrast probabilistic frameworks handle those cases quite well CAP6938 Dr Gita Sukthankar Student Project Presentations CAP6938 Dr Gita Sukthankar Domain Opponent Modeling How does plan recognition differ in adversarial domains than in nonadversarial ones More time pressure Smart opponents will deliberately mislead you Performance is usually measured by an improvement in agent s planning rather than recognition accuracy Gametheoretic methods work well assume the opponent is strengthening its position and harming you CAP6938 Dr Gita Sukthankar SOAR Stands for State Operator And Result URL httpsitemakerumichedusoarhome Developed from Newell and Simon s General Problem Solver GPS Original purpose to create a cognitive architecture that could integrate both goal driven and reactive behavior Now mainly used as a planningexecution system for simulated agents especially in military simulation applications What s the difference between cognitive architecture and any other type of planning CAP693 m3kthankar 14 SOAR Definitions Problem space set of states situations plus set of operators actions System cycles through proposing selecting and applying operators Knowledge encoded as productions condition action rules All relevant productions trigger in parallel whenever changes in goals state and perception cause conditions to be met Impasses solved through subgoaling solution remembered by chunking CAP6938 Dr Gita Sukthankar 15 TacAir SOAR Testbed SOAR plus set of perceptual and motor interfaces that allow agents to pilot aircraft in DIS Distributed Interactive Simulation Focus is on beyondvisualrange combat where pilots rely on radar and communication Demonstrated in Simulated Theater of War STOW97 Mission types defensive counter air close air support suppression of enemy air defense strategic attack escorts tankers airborne early warning and reconnintel Demonstrated that 2 people could monitor 70 simultaneously active agents CAP6938 Dr Gita Sukthankar 16 Recognize Flight Manuevers as growth we iii il iai ibl 223 Edi l Observations enemy flies towards you then turns to a certain angle Want to recognize enemy fired an unseen missile then did an FPOLE maneuver Agent should execute missile evasion maneuver CAP6938 Dr Gita Sukthankar Problem Characteristics Events are not the result of a single agent s ac ons Agent must consider the actions of multiple agents simultaneously Agent must consider the effect of its own actions Realtime continuous vs one shot recognition Ambiguity in the opponent s behaviors CAP6938 Dr Gita Sukthankar Solution Simple insight model what you would do if you were in the opponent s position What are problems with this High overhead must program an agent capable of solving the problem Modeling the opponent s world state can be difficult what is the opponent s sensor model Maintaining multiple hypotheses is even more expensive What are the strengths Allows designer to leverage extra domain knowledge Does not require enumerating chains of possible events CAP6938 Dr Gita Sukthankar 19 Ambiguity in Event Tracking Ambiguity the bane of plan recognition Potential solutions Maintain multiple operator hierarchies continue considering all valid hypotheses Delay until more evidence presents itself Tambe solution attempt to resolve ambiguity and commit to a single interpretation Passive ambiguity resolution gametheoretic Active resolution modify agent s actions to resolve ambiguity Detect incorrect interpretation through match failure Recovery mechanisms assumption injection backtracking CAP6938 Dr Gita Sukthankar 20 Other Issues Spending time on recognition vs computation Feature selection which details should the agent pay attention to SOARspecific issues with maintaining multiple problem spaces worldcentered problem space Incomplete plan libraries What category does this type of plan recognition system fall under How did Tambe evaluate this system CAP6938 Dr Gita Sukthankar 21 Next Time Conclusion of student presentations Application area monitoring teammates s acUons Efficiency improvements for symbolic plan recognition Kaminka paper CAP6938 Dr Gita Sukthankar 22 K CAP693802 Plan Activity and Intent Recognition Lecture 2 Overview cont Event Hierarchy Circumscription Instructor Dr Gita Sukthankar Email gitarseecsucfedu Schedule T amp Th 130245pm Location CL1 212 Office Hours HEC 232 T 3430pm Th 101130am Outline Recap of last lecture Homework for next week Application area Quality of Life Technology General approaches Background Types of planners Circumscriptionframe problem Event hierarchy circumscription Kautz paper Reminder course signup deadline Friday CAP6938 Dr Gita Sukthankar Recap Course expectations Definitions of planactivityintent recognition ChaHenges Application areas Robocup domain CAP6938 Dr Gita Sukthankar Homework for next week One page writeup of final project plan due end of class on Thursday Aug 30 Should definitely include General overview of the problem you re trying to solve Potential sources of data A vague idea of approach hammers of interest Your vision for a final demo Optional related work evaluation procedure references Oral presentation of plan about 5 minutes no power point required Graded on a checkcheckcheck basis Reading on web page M Tambe and P Rosenbloom Event tracking in a dynamic multi agent environment Technical Report ISIRR 94393 USCInformation Sciences Institute 1994 D AvrahamiZilberbrand and G Kaminka Fast and complete svmbolic plan recognition In Proceedings of IJCAI 2005 CAP6938 Dr Gita Sukthankar Elder Care Assistance If you re interested in doing a project in the area of smart homespervasive computing consider focusing on eldercare assistance Potential opportunity for students to present a poster of their work as part of an NSF site visit to the Quality of Life Technology Center W at CMU Get to visit Pittsburgh in Feb CAP6938 Dr Gita Sukthankar Aging Population Total number nf persons age 65 er older by age group 19 tn ease in minions Ell El 10 II 7 135cm GLUE 391 l I I I I u I I u I I I v a L 1911 195i ZCICID BEBE prayutter Hebe Dxefz er the years HIM tn mm are middlescribes pmjwections M the pupula nn Mem39ente pnpululinn These data refer tn the resident popu ation Sewrce ILLS team Bureau Deeennial l Census ute and Populat un ijediun CAP6938 Dr Gita Sukthankar Quality of Life Technology Center d herrei eds C rnarl vcummunlzy K muggy mcer1gt EKIZ 7 Research mm Thrusts r quot 39 munn gm up Ptg m a P e r 5 an Acuve Home Persona Mohillty a M anipulalion I Safe I CAP6938 Dr Gita Sukthankar Quality of Life Technology Center McKiz and other testbed User centric system em 39 39 must understand user s P Aa intent Datadriven learning models should be learned from data Multiple data sources E3335quot combining data from multiple types of sensors 7 7 7 7 Grand Challenge Dam V CAP6938 Dr Gita Sukthankar Possible Projects Intelligent medicine cabinet Reminds user when to take their medicine Monitors user s actions to ensure safe drug dosages Suggests correct course of action after incorrect dosages have been taken Activity coach Designed for users with mild memory impairment Recognizes household activities visionbased wireless tracking data Coaches user through household activities like cooking and using appliances Driver assistance systems CAP6938 Dr Gita Sukthankar General Approaches to PAIR Symbolic consistencybased Probabilistic likelihoodbased Graphical models using graphs to represent joint probability distributions Decisiontheoretic utilitybased Gametheoretic examining players strategies payoffs CAP6938 Dr Gita Sukthankar Symbolic Consistencybased Based on the idea that plan recognition is a consistencychecking process A model matches the set of observations if the observed actions don t violate any of the constraints specified in the plan library Example techniques first 2 weeks of reading Event hierarchy circumscription Kautz Event trackingmodel tracing Tambe Fastcomplete symbolic plan recognition Kaminka Output return complete set of models that pass consistency checking CAP6938 Dr Gita Sukthankar Probabilistic Likelihoodbased Based on the idea of selecting the plan that has a high probability based on the observed evidence Belief is usually calculated using some variant on Bayesian belief update but DempsterShafer evidential reasoning has also been used Includes both directedundirected graphical model based procedures Examples dynamic Bayes networks DBNs hidden MarkovsemiMarkov models HMMs conditional random fields CRFs Output model with the maximum likelihood at the current time step given the set of previous observations CAP6938 Dr Gita Sukthankar Decisiontheoretic Utilitybased Based on the idea that the agent is rational and acts to maximize a known utility function Plan recognition process occurs by calculating utility of all plans in current situation Gametheory is applicable for adversarial reasoning when the agent is simultaneously trying to maximize their utility while minimizing their opponents Output a rankordering of models by utility Note this method is wellsuited for prioritizing or pruning the search process and is often used in combination with one of the previous methods CAP6938 Dr Gita Sukthankar 13 Other Issues Layered architectures and hybrid approaches Format of plan library Exact vs approximate inference Learningestimating model parameters from data Use of domainspecific heuristics CAP6938 Dr Gita Sukthankar Types of Planning Decompositional planner eg Hierarchical Task Network planners Stateaction space is largeill defined Plans are set of predefined recipes for action Reach goal by completing recipe Small search space CAP6938 Dr Gita Sukthankar Forwardchainingstate space eg path planners Welldefined set of action primitives Often uses heuristics for evaluating nearness of state to goal Large search space Relevant to goal recognition CircumscriptionFrame Problem Circumscription A type of nonmonotonic logical procedure to formalize the commonsense assumption that things are expected unless otherwise specified Frame problem Formulated as the problem of expressing a dynamical domain in logic without explicitly specifying which conditions are notaffected by an action Discussed in the Robot s Dilemma Pylyshyn definitions from Wikipedia CAP6938 Dr Gita Sukthankar Event Hierarchy Circumscription Strengths Weaknesses Application areas H Kautz A Formal Theory of Plan Recognition and its Implementation in Reasoninq about Plans CAP6938 Dr Gita Sukthankar 17 Event Hierarchy Circumscription Good points Very general no domain specific heuristics Spells out the closedworld assumptions that are implicitly used by many recognition systems Uses deductive inference so could be implemented using a general purpose theorem prover depending on the general axioms Bad points Inference procedure is poor slow and undecidable Most applications don t care about justifying inferencesquot care about completeness or if incomplete precisionrecall Weak at handling temporal constraints CAP6938 Dr Gita Sukthankar 18 Next Time Discuss paper M Tambe and P Rosenbloom Event trackinci in a dvnamic multiacient environment Technical Report ISIRR 94393 SCInformation Sciences Institute 1994 Do about half the inclass presentations CAP6938 Dr Gita Sukthankar K CAP693802 Plan Activity and Intent Recognition Lecture 10 Sequential DecisionMaking Under Uncertainty part 1 MDPs and POMDPs Instructor Dr Gita Sukthankar Email gitarseecsucfedu SP2 1 Reminder Turnin questionnaire Homework due Thurs 1 page describing the improvements that you plan to make to your project in the next half of the semester SP2 2 Model POMDP Applications Strengths Weaknesses How does a POMDP differ from an HMM SP2 3 Model POMDP Applications Humanrobot interaction Dialog management Assistive technology Agent interaction Strengths Integrates action selection directly with state estimation Weaknesses Intractable for realworld domains How does a POMDP differ from an HMM MDP and POMDP are for calculating optimal decisions from sequences of observations HMMs are for recognizing hidden statefrom sequences of observations MDP and POMDP actions and rewards SP2 4 Markov Decision Processes Classical planning models logical representation of transition systems goalbased objectives plans as sequences Markov decision processes generalize this view controllable stochastic transition system general objective functions rewards that allow tradeoffs with transition probabilities to be made more general solution concepts policies SP2 5 Markov DeCISIon Processes An MDP has four components 8 A R Pr finite state set S S n finite action set A A m transition function Prsat each Prsa is a distribution over S represented by set of n x n stochastic matrices bounded realvalued reward function Rs represented by an nvector can be generalized to include action costs Rsa can be stochastic but replacable by expectation Model easily generalizable to countable or continuous state and action spaces SP2 6 System Dynamics Finite State Space Q State 51013 Loc 236 Joe needs printout Craig needs coffee SP2 7 System Dynamics Finite Action Space A Pick up Printou rs Go To Coffee Room SP2 8 System Dynamics Transition Probabilities Prsi a Sj Prob 095 SP2 9 System Dynamics Transition Probabilities Prsi a sk S1 2 Sn 31 09 005 00 PM 03905 sz 00020 01 SP2 1O Reward Process Reward Function Rsi action costs possible R Reward 10 52 Q5 SP2 11 Assumptions Markovian dynamics history independence Prst1 AtstAt1 st1 so Prst1Atst Markovian reward process PrRtAtstAt1st1 so PrRtAtst Stationary dynamics and reward Prst1Atst Prst 1At st for all t t Full observability though we can t predict what state we will reach when we execute an action once it is realized we know what it is SP2 12 Polic1es Nonstationary policy 11S x T gt A 11st is action to do at state s with tstagestogo Stationary policy 118 gt A 11s is action to do at state s regardless of time analogous to reactive or universal plan These assume or have these properties full observability historyindependence deterministic action choices MDP and POMDPs are methods for calculating the optimal lookup taglei policies Value of a Policy How good is a policy Tr How do we measure accumulated reward Value function V S gtR associates value with each state sometimes 8 x T V1Ts denotes value of policy at state s how good is it to be at state 3 depends on immediate reward but also what you achieve subsequen y expected accumulated reward over horizon of interest note Vns at Rs it measures utility SP2 14 Value of a Policy con t Common formulations of value Finite horizon n total expected reward given 1T Infinite horizon discounted discounting keeps total bounded SP2 15 Value Iteration Bellman 1957 Markov property allows exploitation of DP principle for optimal policy construction no need to enumerate ATn possible policies Value Iteration Bellman backup V0s Rs vs Vks 2 125 max ZSPI S a S39 Vk1s39 a 7r S k arg max ZSPI S a S39 Vk1s39 a Vk is optimal ks rogetogo value function SP2 16 Value Iteration vs4 Rs4max 07 v 1 s1 03 v1 4 I 04 v 1 s2 06 vM s3 l SP2 17 Value Iteration Hs4 max I I SP2 18 Value Iteration l9 5 2 U n all s f 1 loop iquot 1 lnnp fuzn all s E 5 and fur all a E A l39ll39quot quotif H Ea39ES 5 397 3 11mm end 1001 until Ill5quot H4 all at F frin39 all E 539 SP2 19 Complexity T iterations At each iteration A computations of n x n matrix times nvector OAn3 Total OTAn3 Can exploit sparsity of matrix OTAn2 SP2 20 MDP Application Electric Elves Calculating optimal transfer of control policy in an adjustable autonomy application Dynamically adjusts users meetings State ofworld is known future actions ofusers are unknown SPZVZl Recognizing User Intent SP2 22 POMDPS Partially observable Markov Decision Process POMDP a stochastic system 2 S A P as before A finite set 0 of observations aos probability of observation 0 in state s after executing action a Require that for each a and s 2 Paos 1 oinO 0 models partial observability The controller can t observe s directly it can only observe o The same observation 0 can occur in more than one state Why do the observations depend on the action a Why do we have Paos rather than Pos This is a way to model sensing actions which do not change the state but return information make some observation available eg from a sensor SP2 23 Example of a Sensing Action Suppose there are a state 31 action a1 and observation 01 with the following properties For every state s Pa1SS 1 a1 does not change the state Pa101S1 1 and Pa1O1IS O for every state s a5 s1 After performing a1 01 occurs if and only if we re in state 31 Then to tell if you re in state 31 just perform action a1 and see whether you observe o1 Two states 3 and s are indistinguishable if for every 0 and a Paos Paos SP2 24 Belief States At each point we will have a probability distribution bs over the states in S bs is called a belief state our belief about what state we re in Basic properties OsbsS1foreverysin S ZsinS bS 1 Definitions ba the belief state after doing action a in belief state b Thus bas Pin s after doing a in b 2m 5 Pass bs bao Pobserve 0 after doing a in b ZS in SPaOS bS ba s Pin s after doing a in b and observing o Marginalize over states Belief states are ndimensional vectors representing the probability of being in every state SP2 25 Belief States Continued Recall that in general PXyz Pylz PXyz Thus Pa0S bas Pobserve 0 after doing a in s Pin s after doing a in b Pin s and observe 0 after doing a in b Similarly ba s ba0 Pin s after doing a in b and observing o Pobserve 0 after doing a in b Pin s and observe 0 after doing a in b Thus ba s PaOS baS bao Formula for updating belief state Can use this to distinguish states that would othenvise be indistinguishable SP2 26 be ief state b amm athZ 1752U5 l7530 Robot r1 can move mums 17540 between I1 and I2 K at H 1 amuz m Ver1 1 2 inmmz mc1l2 mover121 There may be a container 0 in location I2 i c12 a mover112 beliet state b O fun empty alr11 alr112 dam s ansIo nzs4n 5 at1r1l2 in1c1 52 abbreviate full as f and empty as e belief state b xa mple aim m alr1l2l Continued WW whom Neither move action blsl05 17540 retums useful observations V at H 1 atm lz PENIS PJSIS PMS PMS 05 a mover1l1l2 Thus if there are no other V actions hen bellet state b 51 and 52 are ar1l1 alr1l2 indistinguishable l Juana 6 bnlslo nzs4in 5 atlrl l2 inlcl lzl indistinguishable Example Continued Suppose there s a sensing 27 action see that works perfectly in location l2 Pseef54 P5250953 1 Pseef53 P5250954 0 see does not work elsewhere P522 eIS1 P522652 P5250952 05 bl and s2 are still indistinguishable d s4 are now distinguishable aim m aim lzl 52105 l7530 lstD5 1754o aim lil atm lz inc1 l2 inc1l2 a mover1l1l2 beliet state b alr1l1 alr1l2 d 171152 530 5 bumps 540 5 am J1 atlrt l2 inlet lzl waltz belief state b Example Continued amtm biSZJZU 5 By itself see doesn t tell us the state with certainty bi 0105 bsEEES3 P es3 253 lbseee 25 l 05 05 If we first do amovel12 then do see this will tell the state with certainty Let b39 ba arm H see at SE inmrz 1 0 beiief state b 17540 V atm iz rnct r2 a mover112 beliet state b b b39 e 53 is 3 3mm alr112 522els b39seelts3gtbgeeltegt WEEH Q SM 5 1 0505 1 b isl0 540 5 alr1it atirtJZ mm rz inicHz belief state b M 0d ified am m am J2 Example blts21os 52 Magma 1711 05 b54 V Suppose we know the mm mm Inmal bellef state IS b imam mmm Policy to tell if there s a container in I2 amover112 Tr b mover112 b see belie state b 1 mm m ar1l2 b ISZUbnsan5 bum n nts4o5 ar11 aurHZ inc 2 in1c1 IZ Solving POMDPs Informationstate MDPs Belief states of POMDP are states in new MDP Continuous state space Discretise Policytree algorithms SP2 32 Policy Trees steps a go H p to go 012 0k 9 0 2 Steps to gm 1 steps a go Policy tree an agent s nonstationary tstep policy TreeaT create a new policy tree with action a at root and observation zTz Vp vectorforvalue function for policy tree p with one component per state Actp action at root of tree p Subtreepz subtree of p after obs z Stvalazp vector for probabilityweighted value oftree p after az SP2 33 Application Nursebot Robot assists elderly patients Model uncertainty about the user s dialog and position Exploit hierarchical structure to handle large state space SP2 34 Value Functions p expected P tstep dlscoumed value 3state 2state SP2 35 References Most slides were taken from Eyal Amir s course CS 598 Decision Making under Uncertainty lectures 12 and 13 L Kaebling M Littman and A Cassandra Planning and Acting in Partially Observable Stochastic Domains Artificial Intelligence Volume 101 pp 99134 1998 SP2 36
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'