Outline for EECS 700 with Professor Kulkarni at KU
These 53 pages of class notes, for a course at Kansas, were uploaded on Friday, February 6, 2015.
Date Created: 02/06/15
EECS 700 Virtual Machines, Spring 2009

Process Virtual Machines: Outline
- Structure of a process VM
- Compatibility issues
- Guest-to-host state mapping issues
- Emulation of memory, instructions, exceptions, and OS calls
- Profiling
- Optimization issues

Background
- Compiled applications are bound by the ABI to work on only one OS/ISA pair.
  - Process VMs overcome this limitation.
- Example: the IA-32 EL process VM, with file-sharing interfaces for Windows and Linux.
[Figure: host processes, the runtime, and the host OS]

Structure of a PVM
[Figure: block diagram of a process VM; legible labels include Emulation Engine, Initialize signals, and OS Call Emulator]

Structure of a PVM (2)
- Loader
  - loads guest code and data
  - loads runtime code
- Initialization block
  - allocates memory
  - establishes signal handlers
- Emulation engine
  - interpreter and/or translator
- Code cache manager
  - manages translated guest code
  - flushes outdated translations
- Profile database
  - holds program profile information
  - block, edge, and invocation profiles
- OS call emulator
  - translates OS calls
  - translates OS responses
- Exception emulator
  - handles signals
  - forms precise state
- Side tables
  - structures used during emulation

Compatibility
- How accurately does the emulation of the guest's functional behavior compare with its behavior on its native platform?
  - Two systems are compatible if, in response to the same sequence of input values, they give the same sequence of output values.
- Intrinsic compatibility
  - precise behavior
  - difficult to achieve
- Extrinsic compatibility
  - accuracy within some well-defined constraints
  - acceptable for most systems

Intrinsic Compatibility
- Compatibility requires 100% accuracy for all programs, all the time.
  - compatible for all possible input sequences
  - no further verification needed to confirm emulation accuracy
  - difficult to achieve
- Based entirely on the properties of the VM.
  - e.g., hardware designers use intrinsic compatibility to guarantee microarchitectural ISA compatibility

Extrinsic Compatibility
- Compatible for a well-defined subset of input sequences.
  - based on the VM implementation, architecture/OS specifications, and external guarantees or certificates
  - places some burden on users to ensure that the guarantees are met
  - e.g., the VM may only guarantee accuracy for programs compiled with a particular compiler
  - e.g., a program may be compatible as long as it has limited resource requirements

Verifying Compatibility
- Too complex to prove theoretically, except in simple systems.
- In practice
  - use informal reasoning
  - use test suites
- Sufficient conditions
  - decompose compatibility into parts
  - allows the reasoning process to be simplified
- Assume the state of the guest is mapped 1-to-1 to the host, but the same type of state is not necessary.

A Compatibility Framework
- The need for a framework
  - rigorously proving that compatibility holds is hard
  - a framework allows us to reason about compatibility issues
  - decide when and where during program execution compatibility should be guaranteed or verified
- Model of program execution
  - machine state defined by registers, memory, I/O, etc.
  - operations that change state

A Compatibility Framework (2)
- Guarantee an isomorphic mapping between guest and host states.

Compatibility Framework (3)
- Changes to program state are managed at two levels.
- User-managed state
  - main memory, registers
  - straightforward mapping between guest and host states
  - operated on by user-level instructions
- OS-managed state
  - disk contents, I/O state, networks
  - operated on via OS calls, traps, and interrupts
  - these operations can affect user-level state as well

Compatibility Framework (5)
- Conditions for compatibility
  - guest state should be equivalent to host state at
    - control transfer from user instructions to the OS
    - control transfer from the OS to user instructions
  - all user-managed state must be compatible
  - instruction-level equivalence is not required

Trap Compatibility
- If the source traps, then the target traps.
- If the target traps, then the source would have trapped.
  - the runtime can filter target traps to remove false ones
- Page faults are a special case.
  - page fault behavior is nondeterministic with respect to the user process
[Example: the source sequence assigns r4 from r6, then r1 from r2 and r3 (an instruction that may trap), then r1 from r4 and r5, then r6 from r1 and r7. The target removes the first assignment to r1 as a dead assignment, so that potential trap no longer occurs in the target.]

Register State Compatibility
- At the time of an exception, is the register state exactly as in the real machine, including dead register values?
[Example: the source computes R1 from R2 and R3, then R9 from R1 and R5, then R6 from R1 and R7, then R3 from R6. The rescheduled target computes R6 before R9, so a trap raised between the two reordered instructions can expose a different value in R9.]

Memory State Compatibility
- Memory state compatibility is maintained if, at the time of a trap or interrupt, the contents of memory are exactly the same in the translated target program as in the original source program.

    Source                 Target
    R7 <- R6 << 8          R7 <- R6 << 8
    mem(R6) <- R1          B: mem(R7) <- R2
    mem(R7) <- R2          A: mem(R6) <- R1   => protection fault

Memory Ordering Compatibility
- Maintain an equivalent consistency model.
- Important for multiprocessors.

    Initially: A = Flag = 0
    Process P1             Process P2
    A = 1                  while (Flag == 0) ;
    Flag = 1               read A

Undefined Architecture Cases
- Some (most) ISAs have undefined cases.
  - example: self-modifying code with I-caches
  - unless special actions are performed, the result may be undefined
- Different undefined behavior is compatible behavior.
- This can be tricky.
  - What if the undefined behavior is different from all existing implementations?
  - What if existing implementations do the logical thing, e.g., self-modifying code works as expected?

Constructing a Process VM
- Mapping of user-managed state
  - held in registers
  - held in memory
- Perform emulation: operations to transform state
  - memory architecture emulation
  - instruction emulation
  - exception emulation
  - OS emulation

State Mapping
- Map user-managed register and memory state.
  - guest data and code map into the host's address space
  - the host address space includes runtime data and code
  - guest state does not have to be maintained in the same type of resource
- Register mapping
  - straightforward
  - depends on the number of guest and host registers
[Figure: guest state mapped onto the host register space and the host ABI address space]

Memory State Mapping
- Memory address space mapping
  - map the guest address space to the host address space
  - maintain protection requirements
- Methods, which result in different performance and flexibility levels
  - software-supported translation table
  - direct translation

Software Translation Tables
- VM software maintains a translation table.
  - maps each guest memory address to a host address
  - similar to hardware page tables and TLBs
  - used when all other approaches fail
  - provides the most flexibility and the least performance
[Figure: guest application address space mapped through a software translation table into the host application address space]

Software Translation Tables (2)
Initially, r1 holds the source address and r30 holds the base address of the mapping table.

    srwi  r29,r1,16     ; shift r1 right by 16
    slwi  r29,r29,2     ; convert to a byte address
    lwzx  r29,r29,r30   ; load block location in host memory
    slwi  r28,r1,16     ; shift left/right to zero out
    srwi  r28,r28,16    ; source block number
    slwi  r29,r29,16    ; shift up target block number
    or    r29,r28,r29   ; form address
    lwz   r2,0(r29)     ; do load

Direct Memory Translation
- Use the underlying hardware.
  - guest memory is allocated in contiguous host space
  - guest address space + runtime < host address space
  - minimal overhead, most performance
  - the guest base address is either zero (zero offset) or a fixed nonzero offset

Memory State Mapping Summary
- Runtime + guest space < host space
  - direct memory translation
  - can achieve performance and intrinsic compatibility
- Runtime + guest space > host space
  - software translation
  - will lose intrinsic compatibility, performance, or both
- Guest space = host space
  - happens often (same-ISA dynamic translation)
  - no room for the runtime
  - use software translation
  - extrinsic compatibility

Memory Architecture Emulation
- Aspects of the ABI memory architecture that need to be emulated
  - Address space structure: segmented or flat
  - Access privilege types: a combination of N, R, W, E
  - Protection/allocation granularity: the size of the smallest block of memory that can be allocated by the OS
[Figure: address space layout with addresses 0000 0000, 0001 0000, and 7FFE FFFF marked]

Guest Memory Protection
- Access restrictions are placed on different regions of memory.
- Can be achieved during software-supported translation.
  - slow and inefficient, but very flexible
- Host-supported memory protection
  - the runtime sets access restrictions using OS system calls
  - the OS delivers signals to the runtime on access violations
  - protection faults are reported to the runtime
  - requires host OS support

Host OS Support
- Direct mechanism
  - the runtime sets protection levels via system calls (mprotect)
  - protection faults trap to a handler in the runtime (SIGSEGV)
- Indirect mechanism
  - map a region of memory to a file with access protections (mmap)
[Figure: physical memory, a file, VM memory, and free pages mapped into the virtual machine's virtual address space; read-only mappings cause protection faults]

Guest Memory Protection (2)
- Implementation issues
  - host and guest ISAs provide different protection types
    - host provides a superset of guest protections
    - host provides a subset of guest protections
  - host and guest support different page sizes
    - difficult to map access privileges
    - simple if the guest page size is a multiple of the host page size

Self-Referencing/Modifying Code
- A program may either refer to itself or attempt to modify itself.
- Solution: maintain the guest program code memory image.
  - load/store addresses are mapped into the source memory region
  - loads from the code region are OK
  - writes to the code region
    - trigger a segfault
    - flush the relevant cache entry
    - enable writes to the code region
    - interpret the code block that caused the fault
    - re-enable write protection

Self-Referencing/Modifying Code (2)
[Figure: original code and translated code, shown for self-referencing code and for self-modifying code; the original code is write protected]

Protecting Runtime Memory
- The runtime and the guest application share the same process address space.
  - the guest program can read/write portions of the runtime
- Addressing the problem
  - software translation tables
  - hardware address translation, software protection checking
  - hardware for both address translation and protection checking
    - the OS sets protections for emulation mode and runtime mode (see Figure 3.16)

Protecting Runtime Memory (2)
- Change protections on a context switch from the runtime to translated code.
- Translated code can only access the guest memory image.
- Translated code cannot jump outside the code cache.
  - the emulation software sets up links
- Multiple system calls at context switch time.
  - high overhead
[Figure: memory protections in runtime mode and in emulation mode; legible labels include Runtime, Guest Data (RW), and Runtime Data (N)]

Instruction Emulation
- Techniques for instruction emulation
  - interpretation
  - binary translation
- Startup time (S)
  - cost of translating code for emulation
  - one-time cost for translating code
- Steady-state performance (T)
  - cost of emulation
  - average rate at which instructions are emulated

Instruction Emulation (2)
- Overall performance: S + N*T
  - N is the number of times an instruction is executed
[Figure: total emulation time (0 to 2500) versus N (0 to 100); annotations: S = 1000, T = 2 and T = 20, tradeoff point at about 55 instructions]

Staged Emulation
- Application of emulation techniques in stages
  - start with a low-startup-overhead technique (interpretation)
  - profile data determines hot dynamic blocks of code
  - if execution count > threshold, then compile
  - place in the code cache, update links and side table entries
  - optimize hotter code further
[Figure: Emulation Manager connected to the Interpreter and the Translator/Optimizer]

Emulation Engine Execution Flow
[Figure: flowchart; legible labels: return from exception emulator, return from OS emulator, trap condition, call exception emulator (trap via signal), call cache manager, call OS emulator]

Exception Emulation
- Types of exceptions
  - trap: produced by a specific program instruction during program execution
  - interrupt: an external event not associated with a particular instruction
- Precise exceptions
  - all prior instructions have committed
  - none of the following instructions have committed
- Further division of exceptions for a process VM
  - ABI visible: exceptions returned to the application via an OS signal
  - ABI invisible: the ABI is unaware of the exception's occurrence

Trap Detection
- Detecting trap conditions
  - interpretive trap detection: checking trap conditions during the interpretation routine
  - trap condition detected by the host OS
- Implementation
  - the runtime registers all exceptions with the host OS
  - all signals registered by the guest OS are recorded
  - on receiving an OS signal
    - if the signal is guest-registered, then send it to the guest signal-handling code
    - else the runtime handles the trap condition
  - special tables are needed during binary translation

Interrupt Handling
- Interrupts are not associated with any instruction.
  - a small response latency is acceptable
  - maintaining precise state is easier than for traps
- Receiving an interrupt during interpretation
  - complete the current routine
  - service the interrupt
- Receiving an interrupt during binary translation
  - execution may not be at an interruptible point
  - precise recovery at arbitrary points is difficult
  - no idea when control will return to the EM from the code cache

Interrupt Handling (cont.)
- Solving the interrupt response time problem during binary translation
  - on an interrupt, control is passed to the runtime
  - the runtime unlinks the current translation block from the next block
  - control is returned to the translated code
  - control returns to the runtime after the end of the current block
  - the runtime handles the interrupt

Determining Precise State
- Interpreter: easy
  - each source instruction has its own routine
  - source PC and state are updated in each instruction routine
- Binary translation: hard
  - first determine the source PC
    - the source PC is not continuously updated
    - maintain a reverse translation table mapping target PC to source PC
    - inefficient
  - a target instruction can map to multiple source instructions
  - target code may be optimized and reordered

Reverse Translation Table
[Figure: a trap occurs in the code cache and the signal returns the target PC; the side table is searched for the corresponding source block. Each side table entry (block A, block B, ..., block N) holds Start PC, End PC, Src PC 1, Src PC 2, ..., Src PC n; Start PC and End PC are target PCs, and the Src PC fields are source PCs.]

Restoring Precise State
- Register state during binary translation
  - two cases, based on whether the source-to-target register mapping remains constant throughout emulation
  - if it is not constant, side tables can be maintained, or the code can be analyzed again from the start of the translation block
- Memory state during binary translation
  - changed by store instructions
  - do not reorder stores, or other potentially trapping instructions, with stores
  - restricts optimizations

OS Call Emulation
- A PVM emulates the function or semantics of the guest's OS calls.
  - it does not emulate individual instructions in the guest OS
- Different from instruction emulation
  - given enough time, any function can be performed on the input operands to produce a result
  - most ISAs perform the same functions
  - ISA emulation is always possible
  - with an OS, it is possible that providing some host function is impossible
  - operation semantic mismatch

OS Call Emulation (2)
- Different source and target OS
  - semantic translation or mapping required
  - may be difficult or impossible
  - ad hoc process on a case-by-case basis
- Same source and target OS
  - emulate the guest calling convention
  - a guest system call jumps to the runtime, which provides wrapper code

OS Call Emulation (3)

    Source code segment    Target code segment    Runtime wrapper code
    s_inst1                t_inst1                copy/convert arg1
    s_inst2                t_inst2                copy/convert arg2
    s_system_call X   =>   jump to runtime        t_system_call X
    s_inst4                t_inst4                copy/convert return val
    s_inst5                t_inst5                return to t_inst4

- Same source and target OS (cont.)
  - the runtime may handle some guest OS calls itself
    - signals, memory management
  - handling abnormal conditions like callbacks
  - runtime maintaining program control
  - lack of documentation

Code Cache
- Storage space for holding translated guest code.
- A code cache is different from ordinary caches.
  - code cache blocks do not have a fixed size
  - code cache blocks are chained with each other
  - code cache blocks are not backed up
  - this has implications for code cache management and the replacement algorithms used
- Code cache space is limited.
  - blocks need to be replaced if the cache fills up

Code Cache Replacement
- Least recently used (LRU)
  - good in theory, problematic in practice
  - overhead of keeping track of the LRU block
  - backpointers are needed to eliminate chained links
  - fragmentation problem due to variable-sized blocks
- Unlink blocks before removing them.
  - maintain backpointers

Code Cache Back Pointers
[Figure: code cache blocks and the hash table, with backpointers]

Code Cache Replacement (2)
- Cache flush when full or on a phase change
  - gets rid of stale blocks
  - minimal maintenance overhead
  - even actively used blocks may be removed and may need retranslation
[Figure: new translations over time; detect a working set change and flush]

Code Cache Replacement (3)
- First In First Out (FIFO)
  - non-fragmenting, as the cache can be maintained as a circular buffer
  - alleviates LRU problems at lower hit rates
  - needs to maintain backpointers
- Coarse-grained FIFO
  - partition the code cache into large FIFO blocks
  - links are only maintained between blocks that span replacement boundaries (see the figure on the next slide)

Code Cache Replacement (4)
- Coarse-grained FIFO (cont.)
[Figure: code cache divided into FIFO blocks A, B, ..., D, with backpointer tables]

PVM Performance
- Important for VM acceptance
  - optimization framework along with staged emulation
- Differences from static optimization
  - conservative
  - over small code regions (traces, superblocks)
  - high-level semantic information is not available
  - profiling and architectural information can be used
- Will study in the next chapter.
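
The PowerPC sequence on the Software Translation Tables (2) slide can be restated in a higher-level form. The following is a minimal sketch, assuming 32-bit guest addresses and 64 KB blocks (the 16-bit shifts in the slide); the names `translate`, `guest_load`, `block_table`, and `host_memory` are hypothetical and chosen only for illustration.

```python
BLOCK_BITS = 16                       # 64 KB blocks, matching the 16-bit shifts
OFFSET_MASK = (1 << BLOCK_BITS) - 1   # low 16 bits: offset within a block


def translate(guest_addr, block_table):
    """Map a 32-bit guest address to a host address via a block table."""
    # srwi: extract the source block number (upper 16 bits)
    source_block = (guest_addr & 0xFFFFFFFF) >> BLOCK_BITS
    # slwi + lwzx: index the mapping table to get the host block number
    host_block = block_table[source_block]
    # slwi/srwi pair: zero out the source block number, keeping the offset
    offset = guest_addr & OFFSET_MASK
    # slwi + or: shift the host block number up and form the address
    return (host_block << BLOCK_BITS) | offset


def guest_load(guest_addr, block_table, host_memory):
    """Emulate a guest load: translate first, then do the load (lwz)."""
    return host_memory[translate(guest_addr, block_table)]
```

For example, with `block_table = {0x0001: 0x00A3}`, the guest address `0x00011234` would map to the host address `0x00A31234`. Every guest load or store pays for this extra lookup, which is why the slides rank this method as the most flexible and the slowest.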
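
The S + N*T model from the Instruction Emulation slides can be checked with a small calculation. This sketch assumes the plot compares an interpreter with negligible startup cost and T = 20 against a translator with S = 1000 and T = 2; that reading of the figure annotations is an assumption, and the function names are hypothetical.

```python
def total_time(startup, per_instruction, n):
    """Total emulation time S + N*T for an instruction executed n times."""
    return startup + n * per_instruction


def break_even(s_slow, t_slow, s_fast, t_fast):
    """Execution count N at which the faster steady-state technique wins.

    Solves s_slow + N*t_slow = s_fast + N*t_fast for N.
    """
    return (s_fast - s_slow) / (t_slow - t_fast)


# Assumed values: interpreter (S=0, T=20) versus translator (S=1000, T=2).
crossover = break_even(0, 20, 1000, 2)   # 1000 / 18, roughly 55.6
```

Under these assumed values the crossover is 1000 / 18, about 55.6 executions, which is consistent with the tradeoff point of about 55 instructions noted on the slide. Below that count interpretation is cheaper; above it, translation pays for its startup cost.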
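
The threshold rule on the Staged Emulation slide (interpret first, compile once the execution count exceeds a threshold) can be sketched as a small dispatcher. This is an illustrative sketch, not the slides' implementation: the class `StagedEmulator` and the `interpret` and `translate` callbacks are hypothetical, and link and side table updates are omitted.

```python
class StagedEmulator:
    """Interpret cold blocks; translate a block once it becomes hot."""

    def __init__(self, threshold, interpret, translate):
        self.threshold = threshold
        self.interpret = interpret    # interpret(pc) -> result
        self.translate = translate    # translate(pc) -> callable translated block
        self.counts = {}              # profile data: execution count per block
        self.code_cache = {}          # pc -> translated block

    def execute_block(self, pc):
        translated = self.code_cache.get(pc)
        if translated is not None:
            return translated()       # hot path: run translated code
        self.counts[pc] = self.counts.get(pc, 0) + 1
        if self.counts[pc] > self.threshold:
            # Block is hot: pay the one-time translation cost and cache it.
            self.code_cache[pc] = self.translate(pc)
            return self.code_cache[pc]()
        return self.interpret(pc)     # cold path: low startup overhead
```

With a threshold of 2, a block is interpreted on its first two executions, translated on the third, and served from the code cache afterwards. A further stage that re-optimizes hotter code would follow the same pattern with a second, higher threshold.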
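
The side table lookup on the Reverse Translation Table slide can be sketched as a search by target PC. This sketch makes two simplifying assumptions that the slides do not state: target instructions have a fixed size of 4 bytes, and each entry stores one source PC per target instruction. The names `SideTableEntry` and `find_source_pc` are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SideTableEntry:
    start_pc: int    # first target PC of the translated block
    end_pc: int      # last target PC of the block (inclusive)
    src_pcs: list    # src_pcs[i]: source PC for the i-th target instruction


def find_source_pc(side_table, target_pc, instr_size=4):
    """Return the source PC for a trapping target PC, or None if not found.

    side_table must be sorted by start_pc.
    """
    starts = [entry.start_pc for entry in side_table]
    index = bisect_right(starts, target_pc) - 1   # last block starting at or before target_pc
    if index < 0:
        return None
    entry = side_table[index]
    if target_pc > entry.end_pc:
        return None                               # falls in a gap between blocks
    return entry.src_pcs[(target_pc - entry.start_pc) // instr_size]
```

When translated code has been optimized or reordered, a single source PC per target instruction is not enough to rebuild precise state, which is the difficulty the Determining Precise State slide points out.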
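
FIFO replacement with backpointers, as described on the Code Cache Replacement slides, can be sketched as follows. This is a simplified model under stated assumptions: blocks are tracked by size only (no contiguous circular buffer is modeled), and the class `FifoCodeCache` and its methods are hypothetical names.

```python
from collections import deque


class FifoCodeCache:
    """Variable-sized blocks evicted in FIFO order, unlinked via backpointers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.order = deque()      # block PCs, oldest first
        self.blocks = {}          # pc -> block size
        self.links = {}           # pc -> set of successor PCs it is chained to
        self.backpointers = {}    # pc -> set of predecessor PCs chained to it

    def insert(self, pc, size):
        if size > self.capacity:
            raise ValueError("block larger than the code cache")
        if pc in self.blocks:
            return
        while self.used + size > self.capacity:
            self._evict_oldest()
        self.order.append(pc)
        self.blocks[pc] = size
        self.links[pc] = set()
        self.backpointers[pc] = set()
        self.used += size

    def link(self, src, dst):
        """Chain src to dst and record the backpointer needed for unlinking."""
        if src in self.blocks and dst in self.blocks:
            self.links[src].add(dst)
            self.backpointers[dst].add(src)

    def _evict_oldest(self):
        victim = self.order.popleft()
        # Unlink before removing: no remaining block may jump into the victim.
        for pred in self.backpointers.pop(victim):
            self.links[pred].discard(victim)
        for succ in self.links.pop(victim):
            self.backpointers[succ].discard(victim)
        self.used -= self.blocks.pop(victim)
```

Coarse-grained FIFO would evict a whole group of blocks at once, so backpointers would only be needed for links that cross group boundaries.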