Class Note for CMPSCI 377 at UMass(15)
Class Note for CMPSCI 377 at UMass(15)
Popular in Course
Popular in Department
This 4 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at University of Massachusetts taught by a professor in Fall. Since its upload, it has received 14 views.
Reviews for Class Note for CMPSCI 377 at UMass(15)
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 02/06/15
CMPSCI 377 Operating Systems Spring 2009 Lecture 21 April 23 Lecturer Mark Corner Scribe Bruno Silva 211 Garbage Collection The dynamic memory allocator is a layer between the application and the OS managing heap objects When a program requests memory from the allocator via malloc for instance the allocator will return a pointer or reference to a piece of memory of the appropriate size When the program is done with the memory the memory should be released back to the allocator Languages such as C and C leave this job to the programmer to perform manually for example by using free On the other hand languages such as Java python etc automatically manage dynamicallyallocated memory which makes the programmers life easier and can eliminate entire classes of memory management bugs Although using free and delete is relatively simple it can be tricky to get them right A signi cant fraction of bugs in C and C Programs are related to manual memory management If we forget to free objects we end up with memory leaks if we free memory too soon we end up with dangling pointers also we can try to do weird things like performing double frees etc Therefore a process that manages memory automatically is clearly useful The most important concept for correctly implementing a garbage collector is that of live objects a live object is any object that can still be reached through one or more pointers Letls analyze the following C code node x new node quothappyquot note class node implicitly is a pointer node ptr x delete x node y new node quotsadquot can allocate into same space x pointed to cout ltlt ptrgtdata ltlt endl can print quotsadquot Even after the call to delete x is still alive because it can be reached through ptr But C doesnlt keep track of that After the call to delete the allocator considers memory area that used to belong to x as free So when new space is allocated to y the space which x used to point to and which ptr still points to could be reused This type of program can lead to insidious hardto nd bugs 212 Garbage Collection Algorithms Any garbagecollection algorithm has to reclaim all objects that are in memory but that can no longer be reached through any pointers These objects are clearly useless and their memory areas can be returned to the system Even though garbage collection can be very helpful from a programmerls point of view it can be slow and can degrade cache performance and page locality The main question for all garbage collection algorithms is how can we know whether or not something is still reachable77 Some of the classical algorithms for determining this are the marksweep algorithm reference counting and semispace 211 212 Lecture 21 April 23 2121 Marksweep The objects that a program can access directly are those objects which are referenced by local variables on the processor stack or by any globalstatic variables that refer to objects or by variables in CPU registers In the context of garbage collection these variables are called the roots An object is indirectly accessible if it is referenced by a eld in some other directly or indirectly accessible objecti An accessible object is said to be live Conversely an object which is not live is garbage Note that heap objects which are live are indirectly accessible from the roots or other heap objects The idea of marksweep is relatively straightforward We start at the roots and recursively visit every object accessible through pointers marking them as liver At the end of the process everything not marked is considered garbage and will be deleted Notice that marksweep can perform lazy garbage collection in the sense that it does not necessarily need to remove the garbage immediatelyi Note that marksweep does not clean up memory which is allocated but simply never used Also periodically we have to visit all objects recursively starting from the roots For a large program this will be slowi This is a problem with the traditional marksweep algorithmi 212 2 Reference counting The idea of reference counting is to maintain for every object the total number of references to that object ie the number of incoming pointersi Whenever the number of references is zero we know that the object is not accessible through any pointers and thus it is garbage Also whenever we choose to delete some useless object we have to recursively check that object to see if it contains pointers to other objects if so we have decrement the reference counters for those objects as well One problem of reference counting though is how to deal with cyclesi Suppose object A points to B and B points to A If these objects can only be reached through each other we have to free them too even though their counters are not zero Note that cycles can common in many data structures for example consider a doublylinked list The commonly adopted solution to this problem is to run marksweep now and then in order to remove cyclic references and then run normal reference counting on the rest of the timer Reference counting spreads out the overhead of garbage collection Traditional marksweep does all of the garbage collection operations in one large pass whereas with reference counting every reference operation will have some additional overhead spreading the garbage collection costs out in time Relative to traditional marksweep reference counting scales better even for large programsi 2123 Semispace GC Semispace works by maintaining two disjoint areas from which memory can be allocated These areas are called the fromspace and the tospacei At rst the algorithm allocates memory only from the fromspace without ever worrying about garbage collection Allocation then is typically performed using simple pointer bumping which simpli es the whole process a lot When we nally run out of space in the fromspace we sweep through all allocated objects that can be somehow reached those are the ones that are still live We then move each of these live objects to the to space taking care to update all pointers to the live objects to point to its new location in the to spacei Hence semispace is called a copying collectori After having done all this moving only live objects are in the to spacei From this moment on we can start using the to space to perform allocation The process is repeated again when the to space eventually runs out of space This method has as advantages the fact that it might compact objects in the memory thus increasing locality and minimizing fragmentationi Also when performing allocation from one of the spaces it can use simple Lecture 21 April 23 213 pointer bumping which is very fast However this approach doubles the memory requirements 2124 Generational GC Generational GC is an optimization used by copying collectors lts key assumption is that most objects die young this is called the generational hypothesis When this assumption holds we can optimize the GC for the common case To do so we first allocate objects into a nursery which is a small area of memory We can collect in that area very frequently since according to the generational hypothesis most of them will be garbage Also notice that since most of them will indeed be garbage it will be very fast to traverse them using eg MarkSweep After performing GC in the nursery we copy out all the survivors into the mature space by doing so we are assuming that all objects that survived the nursery will most likely live for a long time Note that we still have to collect the mature space however since those objects are not likely to die soon we don7t have to garbagecollect that area as frequently as we garbagecollect in the nursery A key idea is that we need to keep track of pointers from the mature space into the nursery 2125 Conservative GC Conservative GC can be used for languages such as C and C which were not explicitly designed for garbage collection This is a noncopying technique A conservative garbage collector is one that instead of knowing the exact location of each object discovers the set of live objects by scanning a region of memory looking for areas that may be objects Because C and C allow casting anything that can hold a pointer could conceivably be a pointer such as an unsigned long which is cast as a pointer or a floatingpoint type which is cast as a pointer etc Conservative GC more or less uses the duck test paraphrased here as if it looks like a pointer and acts like a pointer it s probably a pointer Starting from the roots we can find all objects which look like pointers and the objects to which they would point if they were in fact pointers and mark those objects as live By doing this we can provide garbage collection to languages which otherwise would not support it The possible objects found may or may not actually be objects but we are ensured that all live objects referred to from that particular area are found by the garbage collector Since we must discover the areas that might be objects we also have to know how to identify pointers usually if something looks like a pointer we assume it is a pointer Then conservative GC traces through those pointers marking everything that is still live Some of the drawbacks of this method are 1 areas of memory that look like pointers to objects but aren7t cause garbage objects to be retained as long as the fake pointer exists this increases memory usage of the garbage collector and can cause other limited resources to be exhausted and 2 the methods of finding exactly which areas to trace aren7t always portable 2126 Comparing GC to malloc Garbage collectors trade space for time If we collect all the time which requires a lot of processing time the GC allocator will use the least memory possible On the other hand if we have a lot of space to use we can collect very infrequently and thus be very fast In general with enough extra memory GC can take almost no extra time extra when compared to explicit memory management For example if we are willing to spend 4x more space than usual garbage collection can run as fast as explicit memory management if on the other hand we don7t give it too much space it will have to run very frequently and thus will demand much more processing time than explicit memory management 214 Lecture 21 April 23 Faced with this tradeoff7 when should we use a garbage collector instead of manual memory management Generally7 use a garbage collector 1 if you have lots of memory and 2 if we really want to avoid bugs or extra programming effort related to memory management If however7 hardware resources are the limiting factor and programmer resources are less of a limiting factor7 then manual memory management is probably a better idea
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'