Adv Operating Systems (CS 4210)
Class notes for CS 4210 at the Georgia Institute of Technology, taught in Fall 2015.
Virtual Memory Primitives for User Programs (Andrew Appel and Kai Li)

Objectives
User programs can benefit from the use of VM primitives. Efficiency is important: use should not be dominated by disk-access overhead. Many examples. What is the performance on a set of OS/architecture combinations? Some design considerations.

VM Primitives
- Trap: handle page-fault traps in user mode.
- Prot1: decrease the accessibility of a page.
- ProtN: decrease the accessibility of N pages; more efficient than calling Prot1 N times.
- Unprot: increase the accessibility of a page.
- Dirty: return the list of pages written since the last call; can be emulated with ProtN, Trap, and Unprot.
- Map2: map the same physical page at two different virtual addresses, at different access levels, in the same address space.

Example: Concurrent Garbage Collection
From-space and to-space: traverse reachable objects and move them from from-space to to-space, eventually discarding from-space. ProtN the from-space and initiate the GC (collector); do not block the running threads (the mutator). On a mutator Trap, move the object and Unprot. Collector and mutator must be in the same process but have different privileges: Map2. No need to worry about synchronization; it is taken care of by the protection restriction. Benefits from a small page size.

Example: Virtual Shared Memory
One writer, multiple readers. Trap, Prot1, and Unprot modify read/write access to shared pages and fetch the current copy of a remote page. Map2: the trap handler needs access to the page that is protected from clients. A small page size is useful against false sharing.

Example: Concurrent Checkpointing
Simple approach: stop all threads, copy all state, restart all threads. More efficient approach: ProtN the state and start copying/checkpointing; on Trap, copy and Unprot. For repeated checkpoints, Dirty works best. Suggestion: medium page sizes (why?).

Other examples: generational garbage collection, persistent stores, extending addressability, data-compression paging, heap overflow detection.

[Table (garbled in extraction): which primitives each method uses (TRAP, PROT1, PROTN, UNPROT, MAP2, DIRTY) and its preferred page size, for concurrent GC, shared virtual memory, concurrent checkpointing, generational GC, persistent stores, extending addressability, data-compression paging, and heap overflow detection.]
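As an illustration of how these primitives compose, here is a toy, user-level simulation of the concurrent-GC protocol sketched above. The PageTable class and all of its names are invented for this sketch; a real implementation would use OS facilities such as mprotect and page-fault signals, not Python objects.

```python
# Toy simulation of the Appel/Li primitives ProtN, Unprot, and Trap.
# All names are invented; real code would use mprotect + fault signals.

class PageTable:
    def __init__(self, npages):
        self.access = ['rw'] * npages    # per-page accessibility
        self.data = [0] * npages
        self.trap_handler = None         # Trap: user-mode fault handler

    def protN(self, pages):              # ProtN: one call protects N pages
        for p in pages:
            self.access[p] = 'none'

    def unprot(self, page):              # Unprot: restore accessibility
        self.access[page] = 'rw'

    def read(self, page):                # mutator access; may fault
        if self.access[page] == 'none':
            self.trap_handler(page)      # fault delivered to user handler
        return self.data[page]

# Concurrent-GC flavor: the handler "copies" the object to to-space and
# unprotects the page, so the mutator never observes from-space data.
pt = PageTable(4)

def gc_trap(page):
    pt.data[page] += 100                 # stand-in for the copy step
    pt.unprot(page)

pt.trap_handler = gc_trap
pt.protN(range(4))                       # collector protects from-space
```

A first read of a protected page faults, runs gc_trap, and sees the copied value; a second read finds the page already unprotected and traps no further, which is the "trap once, then run at full speed" behavior the paper relies on.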
How did real systems perform?

[Table (garbled in extraction): measured times for ADD, TRAP, TRAP+PROT1+UNPROT, TRAP+PROTN+UNPROT, MAP2 support, and page size on Sun 3/60 (SunOS, Mach), SparcStation 1 (SunOS, Mach), DEC 3100 (Ultrix, Mach), microVAX 3 (Ultrix), and i386 iPSC/2 (NX/2).]

[Figure (garbled): the same measurements scaled by each machine's ADD time.]

The numbers suggest it is possible to get good performance, but that is not necessarily what is done.

Deadlocks (Silberschatz Ch. 7); also the priority inversion problem.

The Deadlock Problem
A set of blocked processes, each holding a resource and waiting to acquire a resource held by another process in the set.
Example: the system has 2 disk drives; P1 and P2 each hold one disk drive and each needs another one.
Example: semaphores A and B, initialized to 1:
P0: wait(A); wait(B)
P1: wait(B); wait(A)

Deadlock Characterization
Deadlock can arise if four conditions hold simultaneously:
- Mutual exclusion: only one process at a time can use a resource.
- Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes.
- No preemption: a resource can be released only voluntarily by the process holding it, after that process has completed its task.
- Circular wait: there exists a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn-1 is waiting for a resource held by Pn, and Pn is waiting for a resource held by P0.

Examples: a resource-allocation graph; a resource-allocation graph with a deadlock; a graph with a cycle but no deadlock.

Basic Facts
If the graph contains no cycles: no deadlock. If the graph contains a cycle: with only one instance per resource type, deadlock; with several instances per resource type, possibility of deadlock.

Methods for Handling Deadlocks
- Ensure that the system will never enter a deadlock state.
- Allow the system to enter a deadlock state and then recover.
- Ignore the problem and pretend that deadlocks never occur in the system; used by most operating systems, including UNIX (the Ostrich Algorithm).

Deadlock Prevention
Restrain the ways requests can be made.
- Mutual exclusion: not required for sharable resources; must hold for non-sharable resources.
- Hold and wait: must guarantee that whenever a process requests a resource, it does not hold any other resources. Require a process to request and be allocated all its resources before it begins execution, or allow a process to request resources only when it holds none. Low resource utilization; starvation possible.
- No preemption: if a process that is holding some resources requests another resource that cannot be immediately allocated to it, then all resources it currently holds are released. Preempted resources are added to the list of resources for which the process is waiting; the process will be restarted only when it can regain its old resources as well as the new ones it is requesting. (How do you preempt a robot movement?)
- Circular wait: impose a total ordering of all resource types and require that each process requests resources in an increasing order of enumeration.

Deadlock Avoidance
Requires that the system has some additional a priori information available. The simplest and most useful model requires that each process declare the maximum number of resources of each type that it may need. The deadlock-avoidance algorithm dynamically examines the resource-allocation state to ensure that there can never be a circular-wait condition. The resource-allocation state is defined by the number of available and allocated resources and the maximum demands of the processes. Resource-allocation graph: the single-resource-instance version. Banker's Algorithm (Dijkstra): the multi-instance version. Similar objective: keep the system in a safe state. (Example: an unsafe state in a resource-allocation graph.)

Deadlock Detection
Allow the system to enter a deadlock state; run a detection algorithm; apply a recovery scheme.
Detection-algorithm usage: when and how often to invoke it depends on how often a deadlock is likely to occur and how many processes will need to be rolled back (one for each disjoint cycle). If the detection algorithm is invoked arbitrarily, there may be many cycles in the resource graph, and we would not be able to tell which of the many deadlocked processes caused the deadlock.

Recovery from Deadlock: Process Termination
Abort all deadlocked processes, or abort one process at a time until the deadlock cycle is eliminated. In which order should we choose to abort? Consider: priority of the process; how long the process has computed and how much longer until completion; resources the process has used; resources the process needs to complete; how many processes will need to be terminated; whether the process is interactive or batch.

Recovery from Deadlock: Resource Preemption
Selecting a victim: minimize cost (again, consider the factors above). Rollback: return to some safe state and restart the process from that state; this needs a logging facility/mechanism. Starvation: the same process may always be picked as victim; include the number of rollbacks in the cost factor.

Livelock
A different type of starvation problem: processes continue executing (as opposed to waiting, as in deadlock) but without making progress, or at least some of them don't. E.g., a high-priority thread accepting requests runs so frequently that a low-priority thread never gets a chance to do the rest of the request processing. E.g., processes call each other in an infinite loop: none makes progress, but both are executing.

Priority Inversion
Locking can have complex interactions with priorities. Example: low-priority thread A locks mutex M; medium-priority thread B runs for a long time; high-priority thread C waits on M. Problem: B will complete before C ever gets to run, because B has priority over A, and C waits on a resource that A holds. Even worse, C may have a time deadline to complete; the deadline may expire, and C may conclude there is a system problem (deadlock, etc.) and restart. Solution: priority inheritance. Once C blocks on M, A (which holds M) gets C's priority or higher.

The Sprite Distributed File System
Based on "Caching in the Sprite Network File System" by Nelson et al. This is another distributed file system case study and will serve as a review too. The traces of the Sprite study have been used repeatedly in the literature. The paper is very readable.

Overview
There are two main ideas in Sprite: cache consistency (no stale data) and caches that vary in size dynamically (the latter is not really a distributed file systems issue, but it is a good topic to cover). Sprite caches files in the main memory of clients and servers. Clients only cache file data; servers can cache either data or metadata, e.g. directory information (recall the distinction between real file service and directory service). The end result is a consistent file system, but still not good enough to use for communication/synchronization.

Motivation
Sprite was motivated by a study of file system usage that has been widely used. Findings of the study: one third of all file accesses are writes; 75% of files are open less than 0.5 seconds; 90% of files are open less than 10 seconds; 20-30% of new data is deleted within 30 seconds, and 50% within 5 minutes; file sharing is rare. What do these numbers mean? Think in terms of the caching mechanism.

Write Policy in Sprite
Recall that we want delayed writing for performance; minor reliability problems are generally ignored. Several file systems employ write-on-close instead of write-through. Problem: it may still be too soon; 75% of files are open less than 0.5 secs, 90% of files are open less than 10 secs.
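To make the write-policy trade-off concrete, here is a minimal sketch (class and method names are invented) of an age-based delayed write-back policy of the kind Sprite adopts instead of write-through or write-on-close: a dirty block is flushed only once it has sat unmodified for a fixed delay.

```python
# Minimal sketch of age-based delayed write-back (names invented here).
class DelayedWriteCache:
    def __init__(self, delay=30):
        self.delay = delay        # seconds a block must sit unmodified
        self.dirty = {}           # block id -> time of last modification
        self.flushed = []         # blocks written back to the server

    def write(self, block, now):
        self.dirty[block] = now   # (re)dirty the block; resets its age

    def sweep(self, now):
        # Run periodically: flush blocks unmodified for >= delay seconds.
        for block, t in list(self.dirty.items()):
            if now - t >= self.delay:
                self.flushed.append(block)
                del self.dirty[block]
```

Given the usage statistics above, most short-lived files are deleted or rewritten before their blocks age past the delay, so their data never reaches the server at all.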
Sprite does periodic writes instead: every 30 seconds, blocks that have not been modified for 30 seconds are written back. A block will go to the server's cache in 30-60 secs and to disk in 60-120 secs. Problem: this may be too infrequent if files are used for communication/synchronization.

Cache Consistency
Let's distinguish between sequential write sharing and concurrent write sharing. Sequential write sharing: the file is shared but is opened for reading and writing in turn, not simultaneously. Concurrent write sharing: readers and writers can open the file concurrently. Most distributed file systems guarantee strong consistency for sequential write sharing, even NFS (the NFS write policy is write-on-close; when a file is opened, the version of the cached blocks is checked against the server's version). Sprite also guarantees strongly consistent concurrent write sharing.

Sprite Cache Consistency
Main idea: all file requests go through a server, and the server keeps state about the clients so that it knows when a file is about to be concurrent-write shared; in that case, caching is disabled on all clients. Mechanism for disabling caching: the server notifies all clients; the writing client has to write all changed blocks back to the server, and reading clients have to discard their cached blocks and direct all future read requests to the server. Clearly this is not too efficient, but concurrent write sharing is supposed to be rare (not so if files are used for communication/synchronization).

Sprite Cache Consistency (cont'd)
Recall that Sprite delays writes for fixed amounts of time instead of doing write-on-close. Thus the NFS policy is not enough to prevent sequential-write-sharing inconsistencies. Instead, when a file is opened, the last writer is notified so that it can write back the changed blocks.

Dynamically Varying Cache Size
RAM is used for caching both file blocks and virtual memory pages; many systems use two separate caches. Sprite uses an approximate LRU algorithm for dynamically sizing the two caches: the time of last access is kept for every block, and the oldest block is replaced (for virtual memory, "oldest" is approximate). This common optimization can be complicated by different page/block sizes. For swap files, the file block cache is bypassed (i.e., no delayed write-back), but pages will be cached in server RAM and accessed faster than on local disk (maybe). Executable (i.e., read-only) file pages need to be mapped to VM, but they are already in the file cache (e.g., just compiled); the pages move from one cache to the other and get marked for replacement in the file cache.

Performance Measurements
Client caches with delayed write are important for client performance, server utilization, network utilization, and scalability. Caching directory/naming/attribute information further reduces server and network utilization, but needs more state at the server.

[Benchmark table and figures (garbled in extraction): the Andrew benchmark (developed by M. Satyanarayanan for benchmarking the Andrew file system; see HOWA87), Fsmake (uses make), Simulator, Sort, Diff (compare two identical 1-Mbyte files), and Nroff (format the text of this paper), with the average rates at which file data were read and written in Kbytes/sec when run on Sun-3s with local disks and warm caches, measuring each benchmark's I/O intensity; plots of performance versus megabytes of cache, and of performance versus number of clients with and without client caches, comparing Sprite and Andrew.]
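The negotiation between the file cache and virtual memory described above can be sketched as a single frame pool in which the globally oldest entry, from either cache, loses its frame under pressure. All names here are invented for the sketch; Sprite's real mechanism tracks VM page ages only approximately.

```python
# Sketch: file cache and VM page cache share one pool of frames; on
# pressure, the entry with the oldest last-access time loses its frame.
class DualCache:
    def __init__(self, frames):
        self.frames = frames
        self.file_cache = {}      # block id -> last-access time
        self.vm_cache = {}        # page id  -> last-access time

    def _evict_oldest(self):
        entries = [(t, k, c) for c in (self.file_cache, self.vm_cache)
                   for k, t in c.items()]
        _, victim, cache = min(entries, key=lambda e: e[0])
        del cache[victim]

    def touch(self, kind, key, now):
        cache = self.file_cache if kind == 'file' else self.vm_cache
        if key not in cache and \
           len(self.file_cache) + len(self.vm_cache) >= self.frames:
            self._evict_oldest()  # steal a frame from whichever cache
        cache[key] = now
```

With two frames, touching a file block, then a VM page, then a second file block evicts the first file block: the two caches compete for the same RAM rather than each owning a fixed share.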
Implementing Lightweight Threads
Paper: "Implementing Lightweight Threads" by Stein and Shah, an implementation report: thread-specific data, signals and threads, signal handling. Also in Nichols et al., Ch. 5.

Library Model
The described library is many-to-many, scheduling user-level threads onto LWPs; threads can be bound or unbound. This is not a Pthreads implementation, but it is close.

Thread Structure
Each thread has the standard set of data associated with it: thread ID, execution context, signal mask, priority, and a pointer to the thread stack. Stacks have a red zone if they are allocated automatically (the page after the stack is invalid); or users can specify their own stack and do whatever they want.

Thread-Specific Data
Thread-specific data is useful; the Pthreads library offers an interface for maintaining it. Implementation techniques: through a library API, as with Pthreads; or with compiler/linker support: the compiler recognizes special variables (e.g. #pragma unshared) and the linker defines a symbol to represent their size, and the library then allocates space from the base of the stack. Object files with thread-specific data cannot be loaded dynamically (dlopen), since the stack will then already be active. Hardware support makes this more convenient: the %g7 register on a SPARC is reserved (compilers should not use it when generating code) and can be used to hold a pointer to thread-specific data.

Preemption
Two cases: a newly runnable thread has a higher priority than an active one, or an active thread's priority is lowered below that of a runnable thread. The library handles preemption by recognizing these cases (they happen through library calls) and sending a SIGLWP signal to the LWP that needs to take action; a handler is installed by the library. Possible problem: the preempted thread may be in the kernel executing a system call; the library has no way of doing anything (it can only see the user stack), and the call will be restarted (unsafe).

Threads and LWPs
The thread library offers a way to ask for a number of LWPs (thr_setconcurrency). New LWPs are created when all existing ones are blocked and runnable threads exist; the kernel sends a SIGWAITING signal when all LWPs in a process are blocked. An LWP can idle (wait for more work) or can be parked (wait for a specific thread to become runnable again); for bound threads, always park.

Traditional Signal Processing
A process can associate an action with the signals it may receive (sigaction): SIG_IGN (ignore the signal); SIG_DFL (the default action, usually terminate or ignore, but it can be stop (SIGSTOP), suspend (SIGTSTP), or resume (SIGCONT)); or execute a handler. Signals can be generated by the system or by other processes, and can be synchronous or asynchronous. A process can block (mask) signals; if they are later unmasked, they get delivered then, but they are not queued: the total number of signals received <= the number of signals sent.

Signal Handling in Thread Libraries
A single-threaded process should continue to use signals the way it always has, even if linked with a threads library; solution: signals are still delivered to the entire process, i.e., the sigaction table is common for the entire process. In a multithreaded process, one thread must be selected to process the signal (run the signal handler); solution: per-thread signal masks let the threads declare what they want to handle. A signal handler has to work in a way that does not interfere with the thread's execution; solution: async safety.

Signal Handling: Kinds of Signals
Synchronous undirected: SIGFPE (e.g. divide by zero), SIGSEGV (access to protected memory), SIGPIPE (use of a broken pipe). Synchronous directed: thread kill. Asynchronous undirected: kill, SIGKILL, SIGALRM. The first two kinds are delivered to the thread directly; the last can be delivered to any one thread that can handle it (only one).

Thread Libraries and Async Signals
Problem: user-level threads and their masks are invisible to the kernel. Signal delivery depends on the thread signal mask, which the kernel cannot see, so it cannot determine whether to
deliver an asynchronous signal to the thread or not. The kernel cannot deliver a signal to a thread that has it masked, but it cannot drop the signal just because the currently executing thread has blocked it.

Async Safety
Another goal of thread libraries: async-safe critical sections. A function is async-safe if it is reentrant while handling a signal. If a call is not async-safe, then calling it again from a signal handler may result in a deadlock, i.e., the signal handler will block waiting for the interrupted code. Pthreads mutexes and condition variables are not async-safe; POSIX semaphores are. Async safety is a standard property of system routines.

Async Safety in User Libraries
How does a library make some operations async-safe? Just mask the signals in the critical section: e.g., an async-safe mutex-lock operation will mask the signals and a mutex-unlock will unmask them. Such critical sections could be in user code or in the thread library code itself (since the first signal handler to run is inside the library). Masking signals can have very low overhead: just change library data, no system call.

Solutions in Thread Libraries
How can signals be handled by a user-level thread library? The LWP signal mask is set to the union of all thread signal masks, or at least something less restrictive than the currently active thread's mask. The library installs its own signal handlers: signals are not first handled by the current thread but by the library. When a signal is delivered, the library signal handler finds a thread that can receive it and delivers it to that thread; this possibly causes an inactive thread to be scheduled. If the signal is masked by all threads, it is not really masked but deferred (for async safety): the LWP signal mask is changed and the signal is re-sent to the process (if undirected) or to the LWP (if directed), so that the default action is taken. Similarly inside the kernel, substituting process for LWP and LWP for thread.

Destroying Threads
Detached threads, on exit, are put on death row and periodically destroyed by a reaper thread. Undetached
threads which exit are reaped on thread join; the join code should implement the appropriate operations. Thread structures/stacks are placed on a stack, so there is no need to wait for new page allocations, etc.

Consistency Models
Based on Tanenbaum and van Steen's Distributed Systems, Ch. 6, section 6.2. These models apply to DSM but also to regular multiprocessors with real shared memory; i.e., either hardware or software could be enforcing these protocols.

A consistency model is a contract between the software and the memory: it states that the memory will work correctly, but only if the software obeys certain rules. The issue is how we can state rules that are not too restrictive but allow fast execution in the most common cases. These models represent a more general view of sharing data than what we have seen so far.

Conventions we will use: W(x)a means a write to x with value a; R(y)b means a read from y that returned value b; "processor" is used generically.

Strict Consistency
Strict consistency is the strictest model: a read returns the most recently written value (changes are instantaneous). It is not well-defined unless the execution of commands is serialized centrally; otherwise the effects of a slow write may not have propagated to the site of the read. This is what uniprocessors support: a = 1; a = 2; print(a) always produces 2. To exercise our notation:
P1: W(x)1
P2: R(x)0 R(x)1
Is this strictly consistent?

Sequential Consistency
Sequential consistency (serializability): the results are the same as if the operations from different processors were interleaved, but the operations of a single processor appear in the order specified by the program. Example of a sequentially consistent execution:
P1: W(x)1
P2: R(x)0 R(x)1
Sequential consistency is inefficient; we want to weaken the model further.

Causal Consistency (by Tech people)
Causal consistency: writes that are potentially causally related must be seen by all processors in the same order. Concurrent writes may be seen in a different order on different machines. Causally
related writes: one write comes after a read that returned the value of the other write.
Examples (which one is causally consistent, if any?):
[Two example histories (garbled in extraction), each with processes P1-P4 issuing writes W(x)1, W(x)2, W(x)3 and reads of x in different orders.]
Implementation needs to keep track of dependencies.

Pipelined RAM (PRAM) or FIFO Consistency
PRAM consistency is even more relaxed than causal consistency: writes from the same processor are received in order, but writes from distinct processors may be received in different orders by different processors.
P1: W(x)1
P2: R(x)1 W(x)2
P3: R(x)2 R(x)1
P4: R(x)1 R(x)2
A slight refinement: processor consistency is PRAM consistency plus the rule that writes to the same memory location are viewed everywhere in the same order.

Weak Consistency
Weak consistency uses synchronization variables to propagate writes to and from a machine at the appropriate points: accesses to synchronization variables are sequentially consistent; no access to a synchronization variable is allowed until all previous writes have completed on all processors; and no data access is allowed until all previous accesses to synchronization variables by the same processor have been performed. That is, accessing a synchronization variable flushes the pipeline: at a synchronization point, all processors have consistent versions of the data.

Weak Consistency Examples
Which one is valid under weak consistency? (Convention: S means an access to a synchronization variable.)
P1: W(x)1 W(x)2 S
P2: R(x)1 R(x)2 S
P3: R(x)2 R(x)1 S

P1: W(x)1 W(x)2 S
P2: S R(x)1
Tanenbaum says the second is not weakly consistent. Do you agree? What is the order of synchronizations? How do the rules prevent this execution? Weak consistency means that the programmer has to manage synchronization explicitly.

Release Consistency
Release consistency is like weak consistency, but there are two operations, "lock" and "unlock", for synchronization (acquire/release are the conventional names). Doing a lock means that writes on other processors to protected variables will be known; doing an unlock means that writes to protected variables are
exported and will be seen by other machines either when they do a lock (lazy release consistency) or immediately (eager release consistency).
Example (valid or not?):
P1: L W(x)1 W(x)2 U
P2: L R(x)2 U
P3: R(x)1
Variant: entry consistency is like lazy release consistency, but all data variables are explicitly associated with synchronization variables.

Summary of Consistency Models
Consistency models not using synchronization operations:
- Strict: absolute time ordering of all shared accesses matters.
- Linearizability: all processes must see all shared accesses in the same order; accesses are furthermore ordered according to a (non-unique) global timestamp.
- Sequential: all processes see all shared accesses in the same order; accesses are not ordered in time.
- Causal: all processes see causally related shared accesses in the same order.
- FIFO: all processes see writes from each other process in the order they were issued; writes from different processes may not always be seen in that order.
Models with synchronization operations:
- Weak: shared data can be counted on to be consistent only after a synchronization is done.
- Release: shared data are made consistent when a critical region is exited.
- Entry: shared data pertaining to a critical region are made consistent when a critical region is entered.

Exercise: consider the following sequence of operations:
P1: W(x)1 W(x)3
P2: W(x)2
P3: R(x)3 R(x)2
P4: R(x)2 R(x)3
Is this execution causally consistent? Add or modify an event to change the answer.
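For checking small histories like the ones in this section, here is a brute-force sketch (all names invented) that decides sequential consistency: it searches for one interleaving that respects each processor's program order and in which every read returns the latest write (initial values are 0). Causal consistency needs a weaker check that only orders causally related writes, so this tool can reject histories that are causal but not sequentially consistent; still, it is handy for the W(x)/R(x) examples above.

```python
# Brute-force sequential-consistency check for tiny histories.
# histories: {proc: [(kind, var, value), ...]} with kind 'W' or 'R'.
def is_sequentially_consistent(histories):
    procs = list(histories)
    total = sum(len(ops) for ops in histories.values())

    def search(done, counts, mem):
        if done == total:
            return True            # a legal interleaving was found
        for p in procs:
            i = counts[p]
            if i == len(histories[p]):
                continue
            kind, var, val = histories[p][i]
            counts[p] += 1
            if kind == 'W':
                old = mem.get(var, 0)
                mem[var] = val
                if search(done + 1, counts, mem):
                    return True
                mem[var] = old     # undo the write on backtrack
            elif mem.get(var, 0) == val and search(done + 1, counts, mem):
                return True        # read matched current memory
            counts[p] -= 1
        return False

    return search(0, {p: 0 for p in procs}, {})
```

On the sequential-consistency example above (P1: W(x)1; P2: R(x)0 R(x)1), the checker finds the interleaving R(x)0, W(x)1, R(x)1 and accepts; reversing P2's reads to R(x)1 R(x)0 makes it reject.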