Class Note for EECS 700 with Professor Kulkarni at KU
Disco: Running Commodity Operating Systems on Scalable Multiprocessors
Edouard Bugnion, Kinshuk Govil, Scott Devine, Mendel Rosenblum (Stanford University)
Presented by Arturo Ramos, The University of Kansas, EECS 700 Virtual Machines, Spring 2009

Introduction
Some things to note before starting:
- This paper was published in 1997; VMware was founded in 1998.
- This paper was written by the founders of VMware.
- At the time of publishing, scalable computers were making their first introductions into the commercial marketplace.

What is a scalable computer system?
- A computer with multiple processors which share the same memory or resources.
- Multiple computers tied together which act as one unified computer.

What kind of OS should we use on scalable computers?
All computers need operating systems, even scalable computers. The options:
- Write a brand-new OS for the scalable computer.
- Port/reprogram an existing OS to run on the scalable computer.
- Use a virtual machine monitor (VMM) to run existing OS's on the scalable computer.

The Scalable Computer Problem
Writing a new operating system for new hardware with the functionality of existing operating systems is hard, takes time, and is impractical. Porting existing operating systems to work on new hardware also takes a lot of time:
- Development costs: existing operating systems are millions of lines of code.
- Resource handling: drivers must be rewritten for changed devices, and memory handling must be modified (CC-NUMA).
- Scalability: the system must be divided into scalable units.
- Fault containment: fault containment is needed for each unit.
- Distribution: a single OS image must be built across all units.
The result: hardware is released with late, buggy, or even no software to support it. For new technology to succeed, hardware needs reliable software that lets users keep using their existing library of applications.

The Solution: Virtual Machine Monitors
- Rather than creating a new OS or modifying existing OS's, use a VMM.
- The VMM can run existing commodity OS's, and also specialized OS's, all at the same time on one scalable computer.
- With small changes to existing OS's (making them VMM-aware), the different virtual machines can share their data with each other.
Using a VMM solves the following issues involved in porting an OS:
- Development costs: the time needed to develop a VMM for a scalable computer is much less than developing a complete OS.
- Resource management: the VMM handles all of the resource management.
- Scalability: each virtual machine is one scalable unit (i.e., one VM per processor/computer).
- Fault containment: since each unit contains only one VM, if a unit fails only one VM crashes.
- Distribution: only the VMM needs to be distributed across all units in the scalable computer.

Virtual Machine Monitor Disadvantages
- Overhead: more exception processing, more instructions executed, more memory needed, device I/O virtualization.
- Resource management: VMMs make bad resource decisions when the OS's running on the VMs are VM-unaware. An OS's idle loop looks the same as real computation if the OS does not tell the VMM that the work is unimportant, and the VMM cannot know when a VM is no longer using a page if the VM does not tell it so.
- Communication and sharing: virtual disks in use by one VM cannot be used by another VM until the first VM releases the disk. Even though data is stored on the same hardware, it cannot be shared between VMs if the VMs are VM-unaware.

What kind of VMM to use?
Now that we know the scalable computing problem and have a possible solution, we need to try it out. One example of a good VMM to address the scalable computing problem is Disco:
- What is Disco?
- The interface
- The implementation
- The simulated results
- The actual results

Disco: A Virtual Machine Monitor
- Disco is a VMM designed for the FLASH multiprocessor, a scalable computer cluster.
- Each unit has its own CPU, DRAM, and I/O devices; units are connected together by a high-speed interconnect.
- Software running on this machine sees one computer with multiple processors and one bank of shared memory.
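The resource-management disadvantage above (an idle guest looks as busy as a computing one) is exactly what VMM-aware changes fix: the guest hints to the monitor instead of spinning. A minimal sketch of that idea, not code from the paper; the class, method names, and vCPU labels are all illustrative:

```python
class ToyVMM:
    """Toy scheduler: vCPUs that have hinted they are idle are skipped."""

    def __init__(self, vcpus):
        self.vcpus = list(vcpus)   # vCPU names, scheduled round-robin
        self.idle = set()          # vCPUs that reported themselves idle

    def hint_idle(self, vcpu):
        # A VMM-aware guest calls this instead of spinning in its idle loop.
        self.idle.add(vcpu)

    def hint_runnable(self, vcpu):
        # Called when the guest has real work again (e.g., on an interrupt).
        self.idle.discard(vcpu)

    def schedule(self):
        # Without hints every vCPU looks busy; with hints, idle vCPUs give
        # up their time slices to vCPUs that are actually computing.
        return [v for v in self.vcpus if v not in self.idle]

vmm = ToyVMM(["vcpu0", "vcpu1", "vcpu2"])
vmm.hint_idle("vcpu1")
print(vmm.schedule())   # vcpu1 is skipped until it becomes runnable again
```

A VM-unaware guest simply never calls `hint_idle`, so the monitor keeps scheduling it as if it were doing useful work.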
Disco is just an example VMM which can be used to address the scalable computer problem.

Disco: The Interface
- Processors: Disco emulates a MIPS R10000 processor, which maps directly to the actual processors contained in the FLASH machine. Disco also extends additional CPU features to improve performance for Disco-aware VMs (e.g., reduced trap emulation overhead).
- Physical memory: Disco emulates a contiguous physical address space. The actual FLASH system has a NUMA (Non-Uniform Memory Access) address space: each scalable unit has its own RAM, so the total memory available to the machine is spread out non-uniformly across several units. By emulating a contiguous space, existing OS's can run without modification.
- I/O devices: every I/O device is virtualized (hard disk, network interface, interrupt timers, etc.). Virtual disks have special virtualization options that allow mounting across multiple VMs, and Disco implements an additional network device type (other than Ethernet) which allows faster and larger transfers between two VMs.

Disco: The Implementation
- A multithreaded shared-memory program.
- Very lightweight: 13,000 lines of code, 72 KB executable footprint.
- Strives to be efficient: careful attention to NUMA placement, cache-aware data structures to lower cache-miss rates, and careful interprocessor communication patterns.
- Uses as few locks as possible so that multiple processors can access data at the same time, and uses wait-free synchronization for the same reason.

Disco: The Implementation (continued)
- Virtual CPUs: direct execution of virtual-CPU code on the real CPU allows most operations to run at the same speed as hardware, except privileged instructions. Disco runs in kernel mode, the guest OS in supervisor mode, and programs in user mode. Disco schedules each virtual CPU to be time-shared across the physical processors, and tries to run a virtual CPU on a real CPU that is close to the memory it needs to access.
- Virtual physical memory: Disco performs the VM-to-real memory mapping by using the translation lookaside buffer (TLB) of the MIPS processor.
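The contiguous "physical" space each guest sees is a level of indirection: guest physical pages are backed by machine pages that may live on different NUMA nodes. A hypothetical sketch of such a physical-to-machine map (the real Disco pmap is per-VM and driven through the TLB; the names and 4 KB page size here are illustrative):

```python
PAGE_SIZE = 4096  # illustrative page size

class PhysToMachineMap:
    """Per-VM map from guest 'physical' page numbers to machine pages.

    The guest sees pages 0..n-1 as one contiguous address space, even
    though the backing machine pages are scattered across NUMA nodes.
    """

    def __init__(self, machine_pages):
        # machine_pages: list of (node_id, machine_page_number) pairs,
        # handed out by the VMM's allocator in any order.
        self.pmap = {ppn: mp for ppn, mp in enumerate(machine_pages)}

    def translate(self, phys_addr):
        # Split a guest physical address into page number and offset,
        # then look up where that page actually lives on the machine.
        ppn, offset = divmod(phys_addr, PAGE_SIZE)
        node, mpn = self.pmap[ppn]
        return node, mpn * PAGE_SIZE + offset

# Guest physical pages 0-2 backed by machine pages on two different nodes.
m = PhysToMachineMap([(0, 17), (1, 4), (0, 99)])
print(m.translate(PAGE_SIZE + 8))   # guest page 1 is on node 1, machine page 4
```

Because the guest only ever sees the left-hand side of this map, the VMM is free to change the right-hand side (migrating or replicating pages) without the OS noticing.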
Due to a limitation of the MIPS processor, Disco cannot use the TLB to remap memory that is within a kernel-mode direct-access memory segment; in this case Disco must force the OS to use only memory that can be mapped. Since Disco uses the TLB to remap memory, the TLB misses more often, so Disco implements a secondary TLB in software to "increase" the size of the TLB.

Disco: The Implementation (continued)
- NUMA memory management: Disco uses a dynamic page migration and page replication system to help keep data in local memory. Pages heavily accessed by one node are moved to that node; pages that are read by several nodes are copied to the nodes most heavily accessing them. Each page may only move a limited number of times, to avoid constant movement overhead.

[Fig. 3: Major data structures of Disco.]
[Fig. 2: Transparent page replication.]

- Virtual I/O devices: each virtual device uses a custom device driver written for the guest operating system. The custom driver passes all command arguments in one trap rather than several traps. Disk and network devices send direct memory access (DMA) information through the trap, allowing Disco to remap memory. Since all DMA information is sent to Disco, memory and disk resources can be shared between VMs (e.g., two VMs with the same exact virtual disk).

[Fig. 4: Memory sharing in Disco.]

- Virtual network interface: the custom Disco network interface device does not limit the maximum transfer unit (MTU) of packets. When a network device receives a packet that already exists in memory, Disco remaps the existing pages rather than copying them.
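The NUMA policy above can be sketched as a small bookkeeping class. This is a toy version under stated assumptions: the real Disco drives these decisions from hardware cache-miss counters, and the paper only says the per-page move count is bounded; the threshold and limit values here are made up:

```python
MIGRATE_LIMIT = 4   # illustrative cap on migrations per page (made up)

class NumaPagePolicy:
    """Toy Disco-style page migration and replication bookkeeping."""

    def __init__(self):
        self.home = {}      # page -> node currently holding the master copy
        self.replicas = {}  # page -> set of nodes holding a read-only copy
        self.moves = {}     # page -> number of migrations so far
        self.reads = {}     # (page, node) -> read count

    def record_read(self, page, node, threshold=3):
        # A page read repeatedly by a node gets replicated to that node.
        self.reads[(page, node)] = self.reads.get((page, node), 0) + 1
        if self.reads[(page, node)] >= threshold:
            self.replicas.setdefault(page, set()).add(node)

    def record_write(self, page, node):
        # A write invalidates replicas; a page heavily written by a remote
        # node migrates there, but only up to MIGRATE_LIMIT times.
        self.replicas.pop(page, None)
        if self.home.get(page) != node and self.moves.get(page, 0) < MIGRATE_LIMIT:
            self.home[page] = node
            self.moves[page] = self.moves.get(page, 0) + 1

p = NumaPagePolicy()
for _ in range(3):
    p.record_read(page=7, node=2)   # node 2 reads page 7 repeatedly
p.record_write(page=9, node=1)      # node 1 writes page 9
print(p.replicas[7], p.home[9])
```

The move cap is the key practical detail: without it, a page shared between two writers would ping-pong between nodes and the migration overhead would swamp the locality benefit.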
This saves memory and allows VMs to share memory through network communication.

[Fig. 5: Example of transparent sharing of pages over NFS.]

Disco: Running Commodity Operating Systems
- The OS virtualized for this test was IRIX, a UNIX-based operating system.
- To virtualize an existing operating system using Disco, some changes need to be made: MIPS architecture workarounds, custom device drivers, and hardware abstraction layer (HAL) modifications.
- MIPS architecture workarounds: MIPS always allows direct memory access to the first kernel segment of memory, but Disco needs to remap memory before allowing an OS to access it. The OS must be changed so that it never uses this first kernel segment.
- Device drivers: custom Disco-aware device drivers, written specifically to run on the guest OS, allow devices to be optimized for Disco. In the case of IRIX, no custom drivers were written.
- The hardware abstraction layer (HAL): the HAL is a layer that comes with most modern OS's and allows an OS to be modified to run on different hardware. Common changes to the HAL include reducing the number of privileged instructions the OS uses, informing the VMM of resource utilization, and informing the VMM of idle time.

Disco: Running Specialized Operating Systems
- Disco can also support large-scale parallel applications, or applications that do not need a full-function or existing OS.
- SPLASHOS is a specialized OS for Disco. It runs all applications in the same address space as the OS and gives Disco full control over page faults. It is a very simple OS, but a custom one that can run in parallel with other commodity OS's.

Experimental Results
- Experimental setup: the FLASH machine didn't exist yet, so it had to be simulated using SimOS, a machine simulator. The MIPS R10000 simulation model was too slow, so a simpler CPU model was used.
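The zero-copy network path described above can be sketched as a remapping of page-table entries rather than a copy of bytes. A hypothetical sketch (the data structures and names are illustrative, not Disco's; a real VMM would also mark the shared page read-only/copy-on-write):

```python
class MachineMemory:
    """Machine pages shared by all VMs; each VM has its own page table."""

    def __init__(self):
        self.pages = {}      # machine page number -> page contents
        self.next_mpn = 0

    def alloc(self, data):
        mpn, self.next_mpn = self.next_mpn, self.next_mpn + 1
        self.pages[mpn] = data
        return mpn

def send_packet(mem, sender_pt, receiver_pt, src_page, dst_page):
    """Deliver a packet between VMs by remapping the sender's page into
    the receiver's page table instead of copying the payload."""
    mpn = receiver_pt[dst_page] = sender_pt[src_page]
    return mem.pages[mpn]    # both VMs now map the same machine page

mem = MachineMemory()
vm0_pt, vm1_pt = {}, {}                       # per-VM page tables
vm0_pt["buf"] = mem.alloc(b"NFS reply data")  # packet built by VM 0
payload = send_packet(mem, vm0_pt, vm1_pt, "buf", "rxbuf")
print(vm1_pt["rxbuf"] == vm0_pt["buf"])       # same machine page, no copy
```

This is why the custom device can skip the MTU limit: "transferring" a large buffer costs one mapping update regardless of its size.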
- Even so, simulation was too slow to get useful data for large workloads.
- Workloads:
  - Software development (pmake): parallel compile of GNU chess; many short-lived processes; OS/VMM-intensive.
  - Hardware development (Flashlite + VCS): concurrent run of the Flashlite and VCS simulators; long-lived processes; OS-unintensive.
  - Scientific computing (raytrace): a single parallel process renders a "car" model; OS-unintensive.
  - Commercial database (Sybase): database queries on a preloaded database already in memory; memory-intensive.

Experimental Results (continued)
- Execution overheads: [Fig. 6: Overhead of virtualization — normalized execution time of IRIX vs. Disco for the pmake, engineering, raytrace, and database workloads.]
- Memory overheads: [Fig. 7: Data sharing in Disco between virtual machines — memory footprint for 1, 2, 4, and 8 VMs, with and without NFS.]
- Scalability: [Fig. 8: Workload scalability under Disco — IRIX, 1/2/4/8 VMs, 8 VMs with NFS, and SplashOS, for the pmake and RADIX workloads.]
- Dynamic page migration and replication: [Fig. 9: Performance benefits of page migration and replication for the engineering and raytrace workloads.]

Real Hardware Results
- Disco was ported to hardware that did exist in order to confirm the test results: an SGI Origin200 with a single 180 MHz MIPS R10000 and 128 MB RAM.
- Porting Disco: first IRIX boots and discovers the hardware drivers; Disco is jumped into before IRIX begins init; Disco then takes control of the system using the hardware drivers discovered by IRIX.
- Virtualization overheads: slightly different workloads were used. Pmake compiles Disco; Engineering simulates the FLASH memory system. The results are consistent with the simulation: Pmake was 8% slower with Disco, and Engineering was 7% slower.

[Table III: Origin200 execution time — user/kernel/idle/total seconds and IRIX-to-Disco ratios for the Pmake and Engineering workloads.]

Conclusion
- Virtual machine monitors are a viable solution to the problem of developing system software for scalable shared-memory multiprocessors.
- The Disco prototype experiment shows that the overheads of VMMs are low.
- Implementation costs using this technique will be lower than developing system software from scratch.
- The growing trend toward multiprocessors promotes the growth of VMMs.

Questions?
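The "normalized execution time" used throughout the results is just the workload's runtime under Disco as a percentage of its native IRIX runtime. A quick sketch of that arithmetic (the 30.0/32.4-second figures are hypothetical, chosen only to reproduce the 8% Pmake slowdown reported above):

```python
def normalized_time(disco_seconds, irix_seconds):
    """Normalized execution time: 100 = native IRIX, 108 = 8% slower."""
    return 100.0 * disco_seconds / irix_seconds

# Hypothetical runtimes illustrating the reported 8% Pmake slowdown.
print(round(normalized_time(32.4, 30.0), 1))
```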