CMPSCI 377 Operating Systems                                         Fall 2008
Lecture 3: September 9
Lecturer: Prashant Shenoy                                  Scribe: Shashi Singh

3.1 Operator Overloading in C++

Operator overloading allows you to provide an intuitive interface to users of your class, and it makes it possible for templates to work equally well with classes and built-in (intrinsic) types. It lets C/C++ operators have user-defined meanings on user-defined types (classes). Overloaded operators are syntactic sugar for function calls:

    class Fred {
    public:
      // ...
    };

    #if 0
    // Without operator overloading:
    Fred add(const Fred& x, const Fred& y);
    Fred mul(const Fred& x, const Fred& y);

    Fred f(const Fred& a, const Fred& b, const Fred& c)
    {
      return add(add(mul(a, b), mul(b, c)), mul(c, a));   // Yuk...
    }
    #else
    // With operator overloading:
    Fred operator+ (const Fred& x, const Fred& y);
    Fred operator* (const Fred& x, const Fred& y);

    Fred f(const Fred& a, const Fred& b, const Fred& c)
    {
      return a*b + b*c + c*a;
    }
    #endif

By overloading standard operators on a class, you can exploit the intuition of the users of that class. This lets users program in the language of the problem domain rather than in the language of the machine. The ultimate goal is to reduce both the learning curve and the defect rate.

3.2 The C++ Standard Template Library

As you know already, it is tough to program without good data structures. Fortunately, most C++ implementations have them built in: the Standard Template Library (STL). The two containers that you may need for 377 are queues and maps. You should know what these are already. Using the STL is pretty straightforward. Here is a simple example:

    #include <iostream>
    #include <queue>
    using namespace std;

    queue<int> myQueue;

    int main(int argc, char** argv)
    {
      myQueue.push(10);
      myQueue.push(11);
      cout << myQueue.front() << endl;   // 10
      myQueue.pop();
      cout << myQueue.front() << endl;   // 11
      myQueue.pop();
      cout << myQueue.size() << endl;    // Zero
    }

The type inside the angle brackets says what the queue myQueue will hold. push() inserts a COPY of what you pass it. front() gives you a reference to the object, unless you copy it. pop() pops it off the queue and discards it. Often you will have a queue or a map of pointers. Here is a more complex example you should now understand:

    #include <iostream>
    #include <map>
    using namespace std;

    class IntCell {
    public:
      IntCell(int initialValue = 0);
      int getValue();
      int setValue(int val);
    private:
      int storedValue;
    };

    IntCell::IntCell(int initialValue) { storedValue = initialValue; }
    int IntCell::getValue() { return storedValue; }
    int IntCell::setValue(int val) { storedValue = val; return storedValue; }

    // In a map, the first template parameter is the key.
    map<int, IntCell*> myMap;

    int main(int argc, char** argv)
    {
      IntCell* a;
      int i, max = 100;
      for (i = 0; i < max; i++) {
        a = new IntCell;
        a->setValue(max - i);
        myMap[i] = a;                  // Inserts a copy of the pointer
      }
      for (i = 0; i < max; i++) {
        a = myMap[i];
        cout << a->getValue() << endl;
        delete a;
        myMap[i] = NULL;               // Good idea
      }
      myMap[0]->setValue(0);           // Quiz: can I do this?
    }

Think about what the output from this should be.
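The quiz line deserves a closer look. Here is a minimal sketch (not part of the original notes) of what actually happens there: the call compiles, because myMap[0] has type IntCell* and the compiler cannot know the pointer is NULL, but at run time it dereferences a null pointer, which is undefined behavior and typically a crash. A safer pattern is to look the pointer up and test it first:

    #include <iostream>
    #include <map>
    using namespace std;

    int main()
    {
      map<int, int*> m;
      m[0] = NULL;                       // like myMap[i] = NULL above

      // *m[0] = 0;  would compile, but dereferences NULL: undefined
      //             behavior, typically a segmentation fault.

      // Safer: find the entry and check the pointer before using it.
      map<int, int*>::iterator it = m.find(0);
      if (it != m.end() && it->second != NULL)
        *it->second = 0;                 // only reached for a live object
      else
        cout << "no live object at key 0" << endl;
    }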
3.3 Computer Architecture

Although computer architecture problems underlie a lot of the science involved in designing operating systems, in this class we will review just some of the important topics that are relevant to making OSs FAST. Specifically, we will focus on the memory hierarchy and on good policies for improving caching mechanisms.

3.3.1 Memory Hierarchy

Modern computers have several types of memory. The most obvious ones are the RAM and the hard disk. There are, however, at least two other types of memory that are important: the registers and the cache memory.

3.3.1.1 Registers (and a little bit on context switching)

Registers are specialized pieces of memory in the CPU, and can be seen as dedicated names for specific words of memory. Some of them are general purpose, such as AX, BX, and CX, which on an x86 can be used for addition, multiplication, etc. Other registers have special purposes: the SP (Stack Pointer), for instance, is used to point to one of the ends of the stack; the PC (Program Counter) is used to point to the current instruction being executed; and so on.

One of the problems that an operating system has to deal with is how to make a machine with several simultaneous processes get by with just one set of registers. In other words, we know that the CPU has only one Program Counter, one Stack Pointer, etc.; yet each process has a different instruction being executed, a different stack pointer, and so on. How, then, is it possible for the OS to manage multiple concurrent processes with just one set of registers available?

The secret is to realize that, in fact, on a single CPU no two processes are ever running at the same time. What happens is that the operating system keeps alternating between processes very fast, giving us the impression of true concurrency. From now on we will call this replacement of the active process by another one a context switch. During a context switch, the OS has to save the current registers' values (i.e., copy them from the CPU to the RAM) and reload the saved registers of the process being activated (i.e., copy them from the RAM to the CPU). In this way, even though at any given time there may be many saved register sets, there is only one set of registers actually in use: that of the currently active process.
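As an illustration of the save/reload idea, here is a toy sketch (not from the original lecture; a real context switch is done in hand-written assembly inside the kernel, and the register names below are made up for the example):

    #include <iostream>
    using namespace std;

    // Hypothetical register set; on real hardware this is the CPU itself.
    struct Context {
      unsigned long pc;   // program counter
      unsigned long sp;   // stack pointer
      unsigned long ax;   // one general-purpose register
    };

    Context cpu;          // the single physical set of registers
    Context saved[2];     // one save area in RAM per process

    void context_switch(int from, int to)
    {
      saved[from] = cpu;  // copy current registers from "CPU" to "RAM"
      cpu = saved[to];    // reload the next process's registers
    }

    int main()
    {
      Context p0 = { 0x1000, 0x7fff0000, 1 };  // process 0's saved state
      Context p1 = { 0x2000, 0x7ffe0000, 2 };  // process 1's saved state
      saved[0] = p0;
      saved[1] = p1;

      cpu = saved[0];                 // process 0 is running
      context_switch(0, 1);           // timer fires: switch to process 1
      cout << hex << "now executing at pc=0x" << cpu.pc << endl;  // 0x2000
    }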
3.3.1.2 Cache memory

As mentioned before, registers are incredibly fast pieces of memory located right on the CPU; they are, however, few. RAM, on the other hand, is much slower than registers but has a much bigger capacity. Is there any type of memory in between? Yes: the cache memory. Consider that every access to main memory is expensive in terms of CPU cycles; typical access times are around 100 cycles. The role of the cache is to be a faster-than-RAM memory, but at a more affordable cost.

Since cache memory is not as big as main memory, we need smart ways to manage it. Usually caches hold recently accessed data or instructions. The crucial assumption that justifies this approach is that data accessed recently might be needed again in the near future, and that data near the data just accessed might also be needed soon. Notice that even though "cache" is the generic term for this type of memory, there are in fact several levels of cache. The L1 cache is an on-chip (on the CPU) memory: smallish, very fast, very expensive, but with very low capacity (usually 32-64 KB). The L2 cache, usually on or next to the processor, is larger than L1 and still more expensive than RAM, but faster, with capacity on the order of megabytes. The L3 cache is pretty large and sits on the bus. (For comparison, the orders of magnitude of the cycles required to access each type of memory are: registers, 1 cycle of latency; L1 cache, 2 cycles; L2 cache, 7 cycles; RAM, 100 cycles; disk, 40 million cycles; network, 200 million cycles.)

One of the most important ways to measure the performance gained by caching is the hit rate: the percentage of accesses to data that is cached and that therefore does not need to be fetched from main memory. Very small differences in the hit rate can make a huge difference in performance; a drop of even a fraction of a percentage point multiplies the number of slow main-memory accesses and can slow the machine down dramatically.

When the cache starts being used, we say that it is cold, meaning that the data we will soon need is not yet cached. When the cache is cold, some initial misses are unavoidable; these are called compulsory misses. When they occur, the computer fetches the needed data from main memory and uses it to populate the cache. One important detail, however, is that the cache is not populated with individual bytes; instead, full lines are brought into the cache (e.g., 128 bytes at a time). The assumption behind this decision is that data usually presents spatial locality, i.e., bytes near the one just accessed are likely to be needed in the near future. After the cache has been filled in, we say it has been warmed up; the cache is then expected to be holding the most frequently used data. Besides compulsory misses, there are two other types of misses: capacity misses, which occur due to the finite size of the cache (i.e., the needed data had to be evicted for lack of space and is no longer present), and conflict misses, which occur whenever a different memory location had to be loaded into the same cache line as the one used by the needed data.

Since the cache is finite (and because of associativity, to be discussed next), a policy to manage the cache is needed. Ideally the cache would be as large as the memory, in which case we wouldn't need any main memory at all! Since it cannot be that large, we must find a smart way to decide which data to keep and which data to evict when we run out of space. The most popular policy for managing caches is known as Least Recently Used (LRU). According to LRU, whenever we need to free some space in the cache, we throw away the data we used the furthest in the past (a toy sketch of this policy appears at the end of this subsection). The assumption that justifies this behavior is that we can often use the past as a predictor of the future; thus it is reasonable to guess that the things we used the furthest in the past will probably be needed the farthest in the future. Of course this is not a certainty; maybe the line just evicted will be needed two instructions ahead. We don't know. Even so, LRU turns out to be a pretty decent policy for managing caches.

Another important issue related to caching is associativity. Ideally, we would want any position in main memory to be storable in any part of the cache. This would be a fully associative cache; however, such a cache is very expensive to manufacture and requires very complicated logic. Thus it is usual to restrict the places in the cache where each memory position can be saved. For example, in a 2-way set-associative cache, any piece of memory can end up in one of two places; in a 1-way (direct-mapped) cache, on the other hand, each piece of memory always goes to the same place in the cache (this can be a problem if, for instance, A and B map to the same place and we need to read A and B in turns).

Finally, it is important to understand the implications of a context switch on the cache. Since the new process being activated is unlikely to share data or instructions with the current process, it is reasonable to assume that the cache will turn cold again. This brings us to the conclusion that context switches make the data in the cache useless, causing misses and increasing latency.
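To make the LRU eviction rule concrete, here is a small illustrative sketch (not from the original notes): an LRU-managed "cache" of fixed capacity, kept as a list ordered from most to least recently used plus a map for fast lookup. Hardware caches only approximate this in silicon; the code is just meant to show the rule.

    #include <iostream>
    #include <list>
    #include <map>
    using namespace std;

    class LRUCache {
      size_t capacity;
      list<int> order;                       // front = most recently used
      map<int, list<int>::iterator> where;   // key -> position in 'order'
    public:
      LRUCache(size_t cap) : capacity(cap) {}

      // Access 'key': a hit moves it to the front; a miss inserts it,
      // evicting the least recently used entry if the cache is full.
      bool access(int key) {
        map<int, list<int>::iterator>::iterator it = where.find(key);
        if (it != where.end()) {             // hit
          order.erase(it->second);
          order.push_front(key);
          it->second = order.begin();
          return true;
        }
        if (order.size() == capacity) {      // miss on a full cache
          where.erase(order.back());         // evict the LRU entry
          order.pop_back();
        }
        order.push_front(key);
        where[key] = order.begin();
        return false;
      }
    };

    int main() {
      LRUCache c(2);
      cout << c.access(1) << c.access(2)     // 00 (two compulsory misses)
           << c.access(1)                    // 1  (hit; 1 becomes MRU)
           << c.access(3)                    // 0  (miss; evicts 2, the LRU)
           << c.access(2) << endl;           // 0  (miss; 2 was evicted)
    }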
3.3.2 Kernel mode and system calls

It is a function of the kernel to protect the OS itself from users, and users from other users. In order to do so, the kernel only allows privileged code to execute in what is called kernel mode. E.g., when a system call is made to read something from the disk, the OS makes sure, among other things, that the user who requested the read is not trying to access other users' files. Every time the CPU goes into kernel mode, all pipelines are flushed, the context is saved, the corresponding code is executed in kernel land, and then the system returns to user mode, restoring the context. It is also important to realize that the hardware actually does have some support to ease the implementation of this separation between user and kernel modes.

3.3.3 Timers and interrupts

Modern OSs obviously have to deal with multiple processes running at the same time. Also, the system must remain responsive, in the sense that I/O must be permitted even when the CPU is busy processing. In order to make the system respond to certain events periodically, timers and interrupts are used. E.g., when the timer that bounds a process's execution time goes off, the processor is interrupted: the current process stops, the OS takes control through the interrupt handler, and finally the scheduler chooses the next process. Interrupts also signal I/O events, such as the arrival of a network packet, the completion of a disk read, etc.

3.3.4 Traps and Interrupts

A great deal of the kernel consists of code that is invoked as the result of an interrupt or a trap. While the words "interrupt" and "trap" are often used interchangeably in the context of operating systems, there is a distinct difference between the two. An interrupt is a CPU event that is triggered by some external device. A trap is a CPU event that is triggered by a program; traps are sometimes called software interrupts. They can be deliberately triggered by a special instruction, or they may be triggered by an illegal instruction or an attempt to access a restricted resource.

When an interrupt is triggered by an external device, the hardware saves the status of the currently executing process, switches to kernel mode, and enters a routine in the kernel. This routine is a first-level interrupt handler. It can either service the interrupt itself or wake up a process that has been waiting for the interrupt to occur. When the handler finishes, it usually causes the CPU to resume the process that was interrupted; however, the operating system may schedule another process instead.

When an executing process requests a service from the kernel using a trap, the process status information is saved, the CPU is placed in kernel mode, and control passes to code in the kernel called the system service dispatcher. It examines parameters set before the trap was triggered (often information in specific CPU registers) to determine what action is required; control then passes to the code that performs the desired action. When the service is finished, control is returned either to the process that triggered the trap or to some other process.

Traps can also be triggered by a fault, in which case the usual action is to terminate the offending process. On some systems it is possible for applications to register handlers that will be invoked when certain conditions occur, such as a division by zero.
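As a concrete illustration of that last point, here is a minimal POSIX sketch (not from the original notes; it assumes a Unix-like system, where arithmetic faults are delivered to the process as the SIGFPE signal):

    #include <csignal>
    #include <cstdio>
    #include <cstdlib>

    // Invoked by the OS when the process receives SIGFPE.
    void on_fpe(int sig)
    {
      // Note: a production handler should restrict itself to
      // async-signal-safe calls; printf/exit are used here for brevity.
      printf("caught SIGFPE (signal %d), terminating cleanly\n", sig);
      exit(1);
    }

    int main()
    {
      signal(SIGFPE, on_fpe);   // register the handler with the kernel
      raise(SIGFPE);            // deliver the signal to ourselves
      return 0;                 // never reached
    }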
3.3.5 Synchronous and Asynchronous I/O

There are two types of input/output (I/O) synchronization: synchronous I/O and asynchronous I/O (the latter is also referred to as overlapped I/O). In synchronous I/O, a thread starts an I/O operation and immediately enters a wait state until the I/O request has completed. A thread performing asynchronous I/O, instead, sends an I/O request to the kernel by calling an appropriate function. If the request is accepted by the kernel, the calling thread continues processing another job until the kernel signals to the thread that the I/O operation is complete; it then interrupts its current job and processes the data from the I/O operation as necessary.

In situations where an I/O request is expected to take a large amount of time, such as a refresh or backup of a large database or a slow communications link, asynchronous I/O is generally a good way to optimize processing efficiency. However, for relatively fast I/O operations, the overhead of processing kernel I/O requests and kernel signals may make asynchronous I/O less beneficial, particularly if many fast I/O operations need to be made. In this case, synchronous I/O would be better.

3.3.6 Memory-Mapped I/O

Memory-mapped I/O is an I/O scheme in which the device's own on-board memory is mapped into the processor's address space. Data to be written to the device is copied by the driver into the device memory, and data read in by the device is available in the shared memory for copying back into system memory. Memory-mapped I/O is frequently used by network and video devices. Many adapters offer a combination of programmed-I/O and memory-mapped modes, where the data buffers are mapped into the processor's memory space while the internal device registers that provide status are accessed through the I/O space. The adapter's memory is mapped into the system address space through the PCI BIOS, a software setup program, or by setting jumpers on the device. Before the kernel can access the adapter's memory, it must map the adapter's entire physical address range into the kernel's virtual address space using the functions supplied by the driver interface.

3.3.7 Virtual Memory

Virtual memory is a technique which gives an application program the impression that it has contiguous working memory, while in fact its memory may be physically fragmented and may even overflow onto disk storage. Systems that use this technique make programming of large applications easier and use real physical memory (e.g., RAM) more efficiently than those without virtual memory. Note that "virtual memory" is not just "using disk space to extend physical memory size." Extending memory is a normal consequence of using virtual memory techniques, but it can be done by other means, such as overlays or swapping programs and their data completely out to disk while they are inactive. The definition of "virtual memory" rests on tricking programs into thinking they are using large blocks of contiguous addresses. All modern general-purpose operating systems use virtual memory techniques for ordinary applications such as word processors, spreadsheets, multimedia players, accounting, etc. Few older operating systems, such as DOS of the 1980s or those for the mainframes of the 1960s, had virtual memory functionality.
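To see the "large blocks of contiguous addresses" illusion in action, here is a small sketch (not from the original notes; mmap with MAP_ANONYMOUS is a Linux/POSIX-specific assumption). It reserves a gigabyte of contiguous virtual address space, yet physical frames are committed only for the pages the program actually touches:

    #include <sys/mman.h>
    #include <cstdio>

    int main()
    {
      // Ask for 1 GB of contiguous *virtual* addresses. No physical
      // RAM is committed yet; the kernel just records the mapping.
      size_t len = 1UL << 30;
      void* p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (p == MAP_FAILED) { perror("mmap"); return 1; }

      // Touch one byte per 4 KB page in the first megabyte: only these
      // 256 pages get backed by physical frames (demand paging).
      char* bytes = (char*) p;
      for (size_t off = 0; off < (1UL << 20); off += 4096)
        bytes[off] = 1;

      printf("mapped 1 GB at %p, touched 1 MB of it\n", p);
      munmap(p, len);
      return 0;
    }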