CONCEPT OPER SYSTEMS (CS 533)
This 131-page set of class notes was uploaded by Orrin Rutherford on Tuesday, September 1, 2015. The notes belong to CS 533 at Portland State University, taught by Jonathan Walpole in Fall. Since its upload, it has received 83 views. For similar materials see /class/168272/cs-533-portland-state-university in Computer Science at Portland State University.
The Mach System

Design goals:
- Support for multiple architectures
- Varying network speeds
- Simple kernel structure
- Distributed operation
- Integrated memory management & IPC
- Heterogeneous system support

Basic abstractions: Task, Thread, Port, Port Set, Message, Memory Object
(Diagram: a task comprises memory regions, threads, ports, a program counter, and a port set.)

Tasks & threads:
- A task plus its threads corresponds to a process
- All threads are kernel-supported
- Synchronization implemented with IPC
- C Threads library provides the standard facilities: mutual exclusion implemented with spinlocks; blocking waits with condition variables
- Simple model: only threads are scheduled
- Multiple queues

Exception handling:
- Made up of kernel primitives
- Handler is a thread; RPC supplies the logic
- Victim notifies handler via IPC and waits for resolution
- Handler receives the notification with information, then clears the exception or terminates the victim
- Difficult to implement
- Process-to-process exceptions converted to IPC
- Hardware exceptions handled by the kernel

IPC:
- Traditional way: global names, untyped streams
- Mach way: location-independent ports, typed data
- Ports managed by the kernel; all objects have standard ports
- Ports can be grouped into Port Sets
(Message diagram: destination/reply port, size/operation, pure typed data, port rights, out-of-line data, message control, memory cache objects; address maps "A map", "kernel map", "B map"; send and receive operations.)
- All objects should be location-independent
- NetMsgServer handles naming & transport at user level

Memory management:
- Everything is a Memory Object
- Control via IPC
- Memory managers are user-level, with a default backup

UNIX emulation:
- BSD system calls
- Emulation and servers at user level
- Threading library
- Stubs

Scalable Data Structures with Relativistic Programming
Josh Triplett, June 3, 2009

Review:
- Motivations for relativistic programming
- Correctness requires performance
- Performance requires scalability
- Scale by reducing the need to agree
- What does "need to agree" mean?
- Why do we think "need to agree" matters?

Topics:
- How do we create scalable data structures using relativistic programming techniques?
- How do we test the
scalability of these data structures?
- How can we make this process easier in the future?

Hash table example:
- Sample to illustrate the relativistic programming methodology
- Maps key to value
- Chaining: the hash function maps a key to a bucket holding a list of nodes containing keys and values
- Lookup: use the hash to find the bucket, then check the key of each node
- Allows lookups in constant average time
- Used frequently in operating systems

Move operation semantics:
- If a reader doesn't see the old item, subsequent lookups of the new item must succeed
- If a reader sees the new item, subsequent lookups of the old item must fail
- The move operation must not cause concurrent lookups for other items to fail
- Semantics based on filesystems

Why the move operation?
- Trivial to implement with mutual exclusion: insert then remove, or remove then insert; intermediate states don't matter
- Hash table buckets use linked lists, and relativistic linked list implementations already exist
- But the move semantics are not implementable using the relativistic linked list operations insert and remove

Solution characteristics:
- Principles: one semantically significant change at a time; intermediate states must not violate semantics
- Need a new move operation specific to relativistic hash tables, making a move a single semantically significant change with no broken intermediate state
- Must appear to simultaneously move the item to its new bucket and change its key

My solution: cross-link
end of new bucket to node in old bucket
- While the target node appears in both buckets, change the key
- Need to resolve the cross-linking safely, even for readers looking at the target node
- First move the target node to the end of its bucket, so readers can't miss later nodes
- Memory barriers

Benchmarking with rcuhashbash:
- Run one thread per CPU
- Continuous loop: randomly read or write
- Configurable algorithm and read/write ratio
- Run for 30 seconds; count reads and writes
[Results charts: reads/sec and writes/sec vs. number of CPUs at read/write ratios of 999999:1, 999:1, and 1:1, comparing locking variants (spinlock, rwlock, per-bucket locks) against the relativistic implementation; plotted data not recoverable from the source]

Relativistic data structures:
- Determine the required semantics and design to those only
- Within these semantics, reduce the algorithmic need to agree
- One semantically significant change at a time
- Intermediate states must not violate semantics
- Changes potentially become visible to other processors immediately, or after an arbitrarily long delay
- Ordering not guaranteed unless explicitly requested

Future data structures:
- Resizable hash tables
- Heaps (priority queues), balanced trees, skip lists, Judy arrays

Problems:
- The hash table required inventing a new algorithm
- Coding the algorithm required direct use of low-level operations such as memory barriers
- How can we make this easier?

Goals:
- High-level building blocks for relativistic data structures
- No inventiveness required to create algorithms for new data structures or semantics
- No manual placement of memory barriers or other low-level synchronization operations

Lightweight Remote Procedure Call
Brian N. Bershad, Thomas E. Anderson, Ed Lazowska, and Henry M. Levy (SOSP '89)
Presented by Tim Chevalier for CS 533 Concepts in Operating Systems, April 22, 2009

RPC is a protection mechanism:
- Design systems by putting different subsystems in different protection domains
- RPC was originally used in large-kernel OSes: different protection domains on different machines
- Machines communicate with each other over a network

Unintended consequences:
- RPC caught on in small-kernel OSes too
- Different subsystems still lived in different protection domains, but all protection domains lived on the same machine
- RPC as a way to enforce protection vs. RPC as a means of communicating over a network

RPC is overkill:
- RPC was designed for communication over a network; for calls within the same machine it suffers from overly general, overly expensive machinery
- As a result, system designers used RPC only for communication between applications and the kernel, and put all kernel subsystems into a single protection domain
- Want to make RPC fast enough that different kernel subsystems can talk to each other through RPC too

The common case:
- Small arguments
- Fixed-size arguments
- Simple arguments
[Table comparing the frequency of cross-machine vs. local calls in several systems; the numbers are garbled in the source]

Sources of overhead:
- In theory: one procedure call, two kernel traps, two VM context switches
- In reality: stub overhead, message buffer, access validation, message transfer, scheduling, VM context switches, dispatch

LRPC design goals / Traditional RPC interfaces:
- A server exports an interface; a client binds to that interface
- An interface lists all the procedures that the server wants to let clients call
[LRPC binding, calling, and argument-copying diagrams; content garbled in the source]

Multiprocessors:
- Exploit multiprocessors to minimize context switching
- Minimize shared data structures, thus maximizing concurrency
- Cache domain contexts with multiple processors

Stubs / Avoid needless scheduling:
- Running the client thread in the server domain
- Processor switching avoids CPU context switches

Avoid needless indirection:
- Simple stub design lets the kernel
jump to the server entry stub directly, avoiding the indirection of an extra procedure call
- The argument and execution stacks are the key to avoiding this indirection; threads and stacks are decoupled

Avoid redundant access validation:
- The Binding Object bypasses extra validation after the initial binding
- Stacks avoid the need to verify a thread's right to return to its caller

Avoid needless copying:
- A-stacks are pairwise shared by client and server; allocation of A-stacks avoids copying
- Server and client share a private communication channel
- The client also copies results directly off the A-stack

Avoid needless contention:
- No shared data structures except for A-stack queues
- One A-stack queue for each procedure in each binding

Corner cases:
- Expect arguments to be small and handle larger ones specially
- Servers may terminate at any time; linkage stacks allow pending calls to the server to return to their callers

Summary:
- Performance of LRPC is close to the theoretical minimum
- LRPC optimizes for common cases and minimizes overhead: simple calls, simple stubs, good use of multiprocessors, minimal argument copying
- Minimizes thread context switches, avoids indirection, minimizes access validation, avoids contention
- Does well for the majority of cases

The Structure of the "THE" Multiprogramming System
"THE" = Technische Hogeschool Eindhoven; Edsger W. Dijkstra, 1968
Presented by Payal Agrawal

Who is Dijkstra? (Remember him from algorithms?)
- Dutch professor of mathematics
- Designed and implemented one of the first modern operating systems for a multiprocess computer, using a layered scheme

Outline: history revisited; goal of project; platform/tool; system structure; system hierarchy (Levels 0-5); software engineering concepts; contributions

Goal of project:
- Process smoothly a continuous flow of user programs
- Specific objectives: reduction of turnaround time for programs of short duration; economic use of peripheral devices; efficient use of memory and CPU; the ability to run large, flexible user programs that can communicate with each other; testability
- Not designed as a multiuser
operating system

Tool: the Dutch Electrologica EL X8 computer
- 32K core memory (vs. 1-2 GB today); 512K-word drum (vs. 100 GB)
- Memory cycle time 2.5 µs (vs. 5-70 ns); address size 27 bits (vs. 64 bits)
- Low-capacity channels supporting peripherals: 3 paper tape readers and punches, 1 printer, 1 plotter, and 2 teleprinters

Terms used differently:
- Core memory <-> main memory
- Drum <-> secondary storage (disk)
- Segments <-> virtual pages
- The whole system is a "society of sequential processes", not sequential execution

System structure: storage allocation, processor allocation, system hierarchy

Storage allocation:
- Von Neumann machine (stored-program computer); information in drum addresses
- Distinction between memory units (pages) and information units (segments)
- Pages: core pages (page frames) and drum pages (disk blocks)
- Segments (virtual pages) have a "virtual address" that is mapped to the page's "physical address"
- Virtual space >> physical space
- A page returned from core to drum can reside anywhere in the drum; a program need not be saved in consecutive drum pages

Processor allocation:
- Society of sequential processes: a process per program, a process per peripheral, and processes for the segment controller and message interpreter
- Delaying a process's execution has no harmful effects on the internal logic of the process
- Numerous processes execute independently of the number of actual processors, as long as processors can switch among the various processes
- Mutual synchronization of parallel sequential processes is implemented via "semaphores" (a virtualization of the CPU)

Semaphores:
- The "thread" had not been invented by that time
- Integer variable initialized to 0 or 1
- Two atomic operations: P ("passeren", to let pass) and V ("vrijgeven", to give up)

    P(Semaphore S):   // Wait
        S.val = S.val - 1
        if S.val < 0: add the calling process to S.list and block (sleep)

    V(Semaphore S):   // Signal
        S.val = S.val + 1
        if S.val <= 0: remove a process from S.list (wakeup)

Semaphores for Mutual
Exclusion:

    mutex = 1
    P(mutex)
      critical section
    V(mutex)

- One process at a time in the critical section
- P and V are indivisible (atomic) actions on "mutex"
- Mutex is globally accessible
- Protection of critical sections

Private semaphores (conditional synchronization):

    mutex = 1, privateSemaphore = 0
    P(mutex)
      modify global state variables
      inspect condition; V(privateSemaphore) if it holds
    V(mutex)
    P(privateSemaphore)

- Private semaphores are used for condition synchronization: wait until a condition holds before proceeding; signal when the condition holds so others may proceed
- Each sequential process has associated with it a number of private semaphores, initialized to 0
- Max value equals 1; min value equals -1
- A private semaphore can be released globally but only locked privately

Proving harmonious cooperation:
- Homing position: the neutral point in which all cyclic processes are when the system is at rest
- A process waiting for a V operation on its private semaphore (a signal) is in its homing position
- A process leaves its homing position, performs its task, then returns to its homing position
- All processes will eventually be in their homing positions: no deadlock

System hierarchy: a layered operating system
- Lower layers provide abstractions of resources; higher layers access lower levels for resources
- Access always proceeds from the top down
- Each layer can be tested and verified independently
- Six levels (layers), Level 0-5

Level 0: CPU virtualization
- Contains: processor allocation; real-time clock interrupt; priority scheduling (CPU scheduler); semaphores to implement synchronization
- Provides: processor virtualization
- Abstraction: each process appears to have its own processor

Level 1: Memory virtualization
- Contains: the segment controller; unique identifiers for each segment of memory
- At all higher levels, identification of information takes place in terms of segments
- Provides: memory virtualization
- Abstraction: each process has a segment of memory to use
(Layer diagram, call direction downward: Segment Controller above Processor Allocation above the EL X8.)

Level 2: Console virtualization
- Contains: the message interpreter
- All processes share one physical console using mutual synchronization
- Messages passed from operators to processes; handles user input and process output
- Provides: console virtualization
- Abstraction: each process has its own individual console

Level 3: I/O virtualization
- Contains the sequential processes associated with buffering of input streams and unbuffering of output streams
- Manages all I/O between the devices attached to the computer
- Abstraction: devices wrapped by buffers and abstracted to logical communication units

Level 4: User programs
- Consists of the independent user programs

Level 5: The actual user ("not implemented by us")

Experience: conception, construction, verification (proof of correctness of the system)

Conception:
- All the concepts are born; it took a long time
- Learnt that a society of mutually synchronized processes can, in its time behavior, satisfy all the requirements

Construction:
- Done in a rather traditional manner
- Changes of specification have been rare

Verification:
- Testing done in stages: the bottom layer thoroughly tested before implementing the layer above
- Prevent bugs instead of debugging; allows full system testing in manageable time
- At the time of writing, testing had not yet been completed, but the resulting system is guaranteed to be flawless

Software engineering observations:
- Production speed severely slows down when working with part-time people: people lose time and energy in switching over, and the group loses decision speed (similar to context switching in an OS)
- This type of work is very difficult, and every effort to do it with other than the best people is doomed to either failure or moderate success at
enormous expense.

Mistakes made:
- Paying too much attention to eliminating what was not the real bottleneck
- Trying for the "perfect installation"
- Not much thought given during the design phase
- Late debugging; lesson learnt that prevention is better than cure

Contributions:
- Layered structure of an OS
- Concurrent programming (semaphores)
- Memory segments, virtual addresses: the "THE" system introduced the first forms of software-based memory segmentation, freeing programmers from being forced to use actual physical locations on the drum memory
- Proof of correctness of the system: a refined multiprogramming system whose implementation removes the need for exhaustive testing

References:
- Reused material by Jin Li (CS 533, Spring 2006), Jimmy Pierce (CS 533, Spring 2005), Navya Jammula (CS 533, Winter 2008)
- Additional information from http://en.wikipedia.org/wiki/THE and a THE.ppt from qstream.org/krasic/cs508 2006 summaries

THANK YOU

Virtual Memory Primitives for User Programs
Andrew W. Appel and Kai Li, 1991
Presented by Nish Aravamudan, May 17, 2006

Overview: background, necessary primitives, performance, considerations, system design, examples

Background:
- Virtual memory traditionally increases the size of a process' address space
- Can also perform tricks via protection, including user-space handlers for protection violations, if certain primitives are available to user space

Necessary primitives:
- TRAP: user-space page-fault handler
- PROT1 / PROTN: decrease the accessibility of one page / of many pages
- UNPROT: increase the accessibility of a page
- DIRTY: list of dirtied pages since the last call
- MAP2: map the same physical page at two different virtual addresses, with different protections, in the same address space

Performance comparison:
- The better the primitives perform, the better the algorithms perform
[Table: primitive costs on Sun 3/80 (SunOS 4.1 and Mach), SparcStation 1 (SunOS 4.1 and Mach), DECstation 3100 (Mach 2.5); numbers garbled in the source]
[Graphical performance comparison; machine labels garbled in the source]
Examples' primitive usage:
[Table: which primitives (TRAP, PROT1, PROTN, UNPROT, MAP2, DIRTY, PAGESIZE) each method uses, for concurrent GC, concurrent checkpointing, generational GC, persistent store, extending addressability, data-compression paging, and heap overflow; the individual checkmarks are garbled in the source]

Concurrent garbage collection:
- Synchronizes collector and mutator; based on Baker's algorithm
- At the beginning of a collection, all objects are in from-space; to-space is empty
- The collector traces the graph of reachable objects, copying each into to-space
- Every time the mutator allocates a new object, invoke the collector

Baker's algorithm's invariants:
- The mutator only sees to-space pointers in registers
- Objects in the new and scanned areas contain only to-space pointers
- Objects in the unscanned area may contain both to- and from-space pointers

Using VM protection:
- VM protections detect from-space references and synchronize mutator and collector
- The unscanned area is set to no-access, so the mutator gets a page-access trap on unscanned objects
- The collector handles the trap, copying objects and forwarding pointers, then unprotects the page and restarts the mutator
- The collector also scans the unscanned area concurrently with mutator execution

(To-space layout: scanned objects | unscanned objects | unused | new objects.)
- Collector: copies from-space objects to the unscanned area; scans objects for pointers to from-space objects, copying targets to the unscanned area if necessary and forwarding pointers to to-space; once scanned = unscanned, all of from-space is garbage
- Mutator: allocates at "new"; in Baker's version, blocks while the collector runs

Worked example:
- Time 0: from-space holds A..G with A (RW) -> B -> C -> D, E -> F (no access), G -> A; registers R1 -> A, R2 -> B; to-space empty
- Time 1: the collector copies the register-reachable objects into to-space (A1, B1) and forwards the pointers stored in registers: R1 -> A1, R2 -> B1

Concurrent Garbage
Collection (example continued):
- Time 2: the collector protects the unscanned area and starts the mutator
- Time 3: the mutator uses R1, R2; the collector concurrently scans the next page, copying reachable objects and forwarding pointers
- Later: the mutator uses object E in the protected unscanned area and page-faults; the collector handles the fault, remaps the page, copies and forwards (assume G is also reachable), unprotects, and restarts the mutator

Shared virtual memory:
- Multiple processor/memory nodes connected by a fast message-passing network
- SVM presents a large, coherent shared-memory address space
- Maintain single-writer / multiple-reader coherence at page level
- Many copies of read-only pages; one copy of read-write pages

Concurrent checkpointing:
- Overlap time-consuming checkpoint operations with program execution
  1. Stop all threads in the program
  2. Save heap, globals, and stacks; set the address space to read-only; restart the threads. A copying thread sequentially scans the address space, copying pages to new addresses, then sets access back to read-write
  3. If a restarted thread writes a page, it gets a trap

Sequence:
1. Mark the address space read-only
2. Restart application threads
3. The copier thread copies pages to a separate space
4. A thread accesses an uncopied page
5. The copier handles the page fault
6. The copier restarts the thread

Generational garbage collection:
- Two properties of dynamically allocated records:
  1. Younger records are more likely to die soon
  2. Younger records tend to point to older records
- All records in G_i are older than those in G_{i+1}
- The 2nd property means few or no pointers from G_i to G_j (i < j)
- Collect in the youngest generation; detect assignments to old objects

Persistent stores:
- A dynamically allocated heap persisting from one invocation to the next
- On stable storage
- Memory-map the persistent store (a disk file)
- The application can modify the store, but must commit its modifications (dirty pages)

Extending addressability:
- A persistent store can exceed 2^32 objects, though not likely in any one run
- Use 64-bit addresses in disk objects and 32-bit addresses in core objects
- A translation table stores 64-bit pointer -> 32-bit pointer translations on page-in
- Accessing a 32-bit pointer may result in a page fault

Data-compression paging:
- Pages of 32-bit words can often be compressed to 1/4 of a page
- Compress instead of paging out; uncompress instead of paging in
- Can be done in the OS or in the garbage collector

Heap overflow detection:
- Must protect against overflow of the stack: mark pages above the stack no-access
- The kernel usually handles such page faults by transparently allocating physical pages for the stack
- Heap overflow in a garbage-collected system is ordinarily detected with a compare and conditional branch
- Instead, terminate the heap with a guard page; on fault, collect garbage

Primitive performance matters; the algorithms share several common traits:
- Memory is made less accessible in large batches, and more accessible page by page
- Almost all fault handling is done by the CPU and takes time proportional to the page size
- Every page fault makes a page more accessible
- The frequency of faults is inversely related to locality
- User-mode service routines need access to pages protected from user-mode client routines
- Service routines don't care about the client's CPU state

Relevance to system design:
- TLB consistency: with less-accessible pages, batch the shootdown of TLB entries; same for paging out
- Optimal page size: fault handling by the CPU is O(PAGESIZE)
- Access to protected pages: allow one thread access while preventing others, via multiple mappings, kernel copies, shared pages, or an in-kernel thread

Conclusions:
- VM design has a significant impact on user-space performance and program design
- If you provide good services, applications will run better; perhaps more important to provide the right services
- Other primitives matter as well, e.g. pinning pages (mlock)
- Must also ensure correctness, e.g. mprotect not flushing the TLB

Spin Lock
Alternatives

The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors
Thomas E. Anderson

Introduction:
- Spinlocks are commonly used to protect small critical regions
- Problem: on multiprocessor machines, spinning consumes memory bandwidth, slowing other processors
- Question: are there more efficient algorithms for spin waiting that work on existing hardware?

Simple approaches to spin waiting:
- Spin on test-and-set:

    lock = CLEAR
    while (TestAndSet(lock) == BUSY) ;
    critical section
    lock = CLEAR

- Spin on read ("test-and-test-and-set"):

    lock = CLEAR
    while (lock == BUSY or TestAndSet(lock) == BUSY) ;
    critical section
    lock = CLEAR

- Ran simulations on a Sequent Symmetry Model B (20 80386 processors)
- Performance degrades badly with many processors spin-waiting
[Fig. 1: principal performance comparison, elapsed time (seconds) to execute the benchmark vs. number of processors, for ideal, spin on test-and-set, and spin on read; each processor loops one million / P times: acquire lock, do critical section, release lock, compute]

Software alternative 1: delay after the spinning processor notices the lock has been released
- Idea: reduce the number of unsuccessful test-and-set operations
- Statically assign each processor a delay from 0 to P, or dynamically adjust the delay depending on how many collisions are experienced (similar to Ethernet backoff)

    while (lock == BUSY or TestAndSet(lock) == BUSY)
    begin
        while (lock == BUSY) ;
        Delay()
    end

Software alternative 2: delay after every read of the lock
- Idea: limit the total communication bandwidth consumed by spinning
- Delays set statically or dynamically, as in alternative 1

    while (lock == BUSY or TestAndSet(lock) == BUSY)
        Delay()

Software alternative 3: queueing in shared memory
- Idea: reduce cache invalidations from spinning processors
- Each waiting processor enqueues itself and spins on a separate location in memory
- A processor leaving the critical section notifies the next processor in line
- Higher latency with a small number of processors

    Init:   flags[0] = HAS_LOCK
            flags[1..P-1] = MUST_WAIT
            queueLast = 0
    Lock:   myPlace = ReadAndIncrement(queueLast)
            while (flags[myPlace mod P] == MUST_WAIT) ;
            flags[myPlace mod P] = MUST_WAIT
    Unlock: flags[(myPlace + 1) mod P] = HAS_LOCK

[Fig. 3: principal performance comparison, spin-waiting overhead (seconds) executing the benchmark vs. number of processors, for spin on read, static release backoff, static reference backoff, and queue; each processor loops one million / P times]
[Fig. 4: spin-waiting overhead (seconds) vs. number of slots: spin on read vs. 1, 4, 16, and 64 slots]

The Design and Implementation of a Log-Structured File System
Mendel Rosenblum and John K. Ousterhout
Presented by Ian Elliot