Class Note for CMPSCI 691 at UMass
CMPSCI 691W Parallel and Concurrent Programming                 Spring 2006
Lecture 2: February 6
Lecturer: Emery Berger    Scribe: Richard Chang

2.1 Overview

This lecture gives an introduction to processes and threads. Specifically, it covers how both can be used for parallel programming and concurrency. Both can be used to hide latency, maximize CPU utilization, and handle multiple asynchronous events. Issues such as programming style, communication, and synchronization are discussed, the tradeoffs of using either are considered, and a bakeoff of processes vs. threads is presented.

2.2 Processes

A process can be thought of as a program in execution running on an operating system. It consists of an execution context (program counter, registers), an address space, an open files list, a process id, a group id, etc. Processes can be used for parallel programming by spawning processes to execute concurrently and using some form of interprocess communication to allow them to share data. We will examine the APIs for creating processes, methods of interprocess communication (IPC), and some example programs that illustrate these concepts.

2.2.1 Process API

On UNIX operating systems, new processes are created using the fork() system call [6]. Fork creates a new copy of the current process. The original process (the parent) and the newly created process (the child) continue to execute the same program the parent was running before fork was called, but the return value of the fork call differs in the two processes: in the parent, fork returns the process id (pid) of the child; in the child, fork returns 0. This allows the programmer to specify different behavior for the parent process and the child process. Because processes created using fork() continue to execute the same program as their parent, the exec() system call [6] is often used to replace the current process with a new executable program after fork() has been called.

Figure 2.1 shows the source code for a C program that uses fork and exec to create a new process and execute a new program. On line 8, the name of the program to be executed by the child process is read from stdin using a call to gets(). Then a call to fork is made, which creates a new process that is a copy of the current process; both the child and the parent execute concurrently. The if statement on line 10 checks the return value of the fork call. If the return value is 0, then the process currently executing is the child, and a call to execlp() is made to run the program whose name was read from stdin. If the return value of the fork call is nonzero, then the currently executing process is the parent, and the value returned by fork is the pid of the child process. The parent process sleeps for 1 second and then waits until the child process terminates.

```c
 1  #include <unistd.h>
 2  #include <sys/wait.h>
 3  #include <stdio.h>
 4
 5  int main() {
 6    int parentID = getpid();        /* ID of this process */
 7    char prgname[1024];
 8    gets(prgname);                  /* Read the name of the program we want to start */
 9    int cid = fork();
10    if (cid == 0) {                 /* I'm the child process */
11      execlp(prgname, prgname, 0);  /* Load the program */
12      /* If the program named prgname can be started, we never get to this
13         line, because the child program is replaced by prgname */
14      printf("I didn't find program %s\n", prgname);
15    } else {                        /* I'm the parent process */
16      sleep(1);                     /* Give my child time to start */
17      waitpid(cid, 0, 0);           /* Wait for my child to terminate */
18      printf("Program %s finished\n", prgname);
19    }
20  }
```

Figure 2.1: An example program using fork and exec.

On Windows operating systems, processes are not created by forking the current process. Instead, new processes are created using the function CreateProcess [2], which takes 10 arguments that specify parameters such as the name of the application to execute and process attributes.

2.2.2 Translation Lookaside Buffer

The translation lookaside buffer (TLB) is a fast, fully associative memory that is used as a buffer for virtual-address-to-physical-address translation. Entries in a process's page table are buffered in the TLB in order to speed up memory address translation. If a page number is found in the TLB, then the frame number can quickly be determined; if it is not found, a TLB miss has occurred, the page table in memory has to be queried, and this can lead to a huge performance loss.

Because the entries in a TLB are process specific, whenever a context switch occurs all of the TLB entries become invalid. This leads to what is called a TLB shootdown. Immediately after a context switch, the entries in the TLB are incorrect because processes have distinct address spaces, so memory accesses after a process context switch are very expensive because of all of the TLB misses that occur.

2.2.3 Copy-on-Write

When fork is called, conceptually all resources of the parent process are copied so that the child process has its own address space, execution context, etc. This would potentially mean that all of the page frames used by the parent would have to be copied, and then, if exec were called to run a new program, all those copied pages for the child would be invalidated. To avoid this, fork is usually implemented using copy-on-write. Instead of copying the page frames of the parent as new copies for the child, the parent and child processes initially share page frames. While these page frames are shared, their data cannot be modified, so after a call to fork all shared page frames are marked as read-only. When either process attempts to write to a read-only shared page, a new copy of that page is made for that process. The original page frame is then writable by the other process.

2.2.4 Interprocess Communication

Processes can communicate using the system calls we have already seen. One can think of the input to a process as its state before calling fork(). Processes can provide simple output by passing an integer argument to the exit function; this value can be read by another process using the wait system call.

There are also methods of communication that allow running processes to communicate during execution. These methods of interprocess communication (IPC) include signals, pipes, sockets, and mmap.

Signals allow processes to send and receive integer values. While there are user-defined signals in UNIX, signals are not terribly useful for parallel or concurrent programming; they are mainly intended for interrupts, not general communication.

Pipes are a method of IPC that is easy and fast. They can be used much like pipes on the UNIX command line to communicate between processes. Figure 2.2 shows an example program that uses a pipe to redirect the output of a call to ls to be the input of a call to wc -l. This has the same effect as running ls | wc -l on the command line in UNIX.

```c
#include <unistd.h>

int main() {
  int pfds[2];
  pipe(pfds);
  if (fork()) {
    close(1);          /* close normal stdout */
    dup(pfds[1]);      /* make stdout same as pfds[1] */
    close(pfds[0]);    /* we don't need this */
    execlp("ls", "ls", NULL);
  } else {
    close(0);          /* close normal stdin */
    dup(pfds[0]);      /* make stdin same as pfds[0] */
    close(pfds[1]);    /* we don't need this */
    execlp("wc", "wc", "-l", NULL);
  }
}
```

Figure 2.2: An example program using pipes for IPC.

Sockets can be used for IPC, but they do require explicit message passing. One benefit of sockets is that processes communicating over sockets can be distributed over networks. Running a program written using sockets on a local machine is usually quite efficient, while giving the programmer the option to distribute that program over a network.

Mmap is a UNIX system call that allows files to be mapped into memory. It can be used for IPC by having multiple processes map the same file into their own distinct address spaces. This way, changes made in the address space of one process that map to the shared file are reflected in the other processes mapping the same file. Because the mapped file is in memory, disk I/O is avoided, and this is relatively efficient. Synchronization can be handled by the flock system call [1]: when a process wants to ensure that no other processes are writing to the shared file, it can obtain a file lock on that file using flock. It should be noted that calls to flock are very expensive.

2.3 Threads

A thread consists of a thread ID, a program counter, register values, and a stack [6]. Unlike processes, which each have a distinct address space, threads share the same address space, files, sockets, etc. Like processes, threads can be used for parallel programming. We will now discuss how threads can be created and how they communicate, and we will see an example program using threads on a UNIX operating system.

2.3.1 Threads API

On UNIX operating systems, the threads API used is called pthreads, which stands for POSIX threads. Threads are created using the function pthread_create, which takes as an argument the name of the function that should be executed as a separate thread. The function pthread_join is used to wait for a thread to complete. All threads created using pthread_create in a given process execute within that process.

Figure 2.3 shows an example program that creates two new threads to execute an expensive computation. Because the pthreads API [5] specifies that the argument passed to a function started as a new thread must be a single void pointer, the function run is defined to take a void pointer. The main function creates two threads that will run for 100 iterations each. Then the main thread waits for both threads to finish executing and prints their results.

```c
#include <pthread.h>
#include <stdio.h>

int expensiveComputation(int i);    /* defined elsewhere */

void * run(void * d) {
  int q = (int) d;
  int v = 0;
  for (int i = 0; i < q; i++)
    v = v + expensiveComputation(i);
  return (void *) v;
}

int main() {
  pthread_t t1, t2;
  int r1, r2;
  pthread_create(&t1, NULL, run, (void *) 100);
  pthread_create(&t2, NULL, run, (void *) 100);
  pthread_join(t1, (void **) &r1);
  pthread_join(t2, (void **) &r2);
  printf("r1 = %d, r2 = %d\n", r1, r2);
}
```

Figure 2.3: An example program using threads.

In Windows, the function used to create threads is called CreateThread [3], which takes 6 parameters that specify attributes like the function to be executed and the stack size.

2.3.2 Thread Communication

In threads, everything is shared except stacks, registers, and thread-specific data. The old way of accessing this thread-specific data was to use pthread_setspecific and pthread_getspecific to modify and access data that a programmer wanted to be specific to each thread, and thus not shared. A newer way to achieve the same result is to declare a variable using the static __thread modifier; variables defined this way are thread-specific.

Because data is shared among threads by default, updates to this data must be synchronized. Mutual exclusion, allowing only one thread into a critical section at a time, can be enforced using calls to pthread_mutex_lock(&l) and pthread_mutex_unlock(&l). Critical sections of code that contain updates to shared data can be wrapped in a pair of those calls to obtain a lock and then release it.

2.4 Bakeoff

There are tradeoffs in almost all design decisions, and the same can be said of using either processes or threads for parallel programming. There is no one correct answer for all situations; whether threads or processes are best for parallel programming is situation dependent.

2.4.1 Performance

Much of the performance of threads and processes is determined by the time and work done when a context switch occurs to execute a new thread or process. Context switches between threads are much cheaper, because the only data that needs to be stashed and loaded is the registers, the program counter, and the stack pointer; all other data is shared amongst threads. For processes, all of that data must be stashed and loaded, plus the process context must be stashed and loaded. The TLB shootdown mentioned earlier occurs as well, which causes performance hits every time a page in memory is accessed until the TLB is repopulated with entries for the new process. Because context switches between processes are so expensive, longer time quanta are required to amortize the cost of the context switch. There is a tradeoff between time quanta and system responsiveness: longer time quanta usually mean the system is less responsive.

2.4.2 Flexibility

Processes are much more flexible than threads. It is very easy to spawn processes remotely, and parallel programs using sockets and processes can very easily be distributed across a local network or the Internet. One downside to using processes is that communication must be done explicitly (sockets) or through some kind of hack (mmap). Threads, on the other hand, communicate through memory and must be on the same machine, which reduces their flexibility. Also, many programmers find it difficult to ensure that their code is thread-safe, which means that their code maintains data consistency even when being executed concurrently by multiple threads [7].

2.4.3 Robustness

Processes are in general much more robust because they are isolated from other processes. In principle, if one process dies, the other executing processes can just continue to execute with no effect. Apache [4], an open-source web server, comes in two versions. The Apache 1.x branch is implemented using multiple processes to handle requests from users, which allows for more robustness: if a user's request causes a process to die, the server can continue to execute and serve other users' requests. Conversely, threads are not as robust as processes. If a thread crashes, because of a dereference of NULL for example, then the entire process terminates. Apache version 2.x is implemented using multiple threads to handle requests from users. The motivation behind this design decision is to increase performance because, as we have already seen, context switches between threads are much cheaper than between processes. The downside of this decision is the loss in robustness: if a thread crashes, the whole server process might terminate.

References

[1] D. Bovet and M. Cesati. Understanding the Linux Kernel. O'Reilly, 2002.
[2] Microsoft Corporation. Process and thread functions: CreateProcess, 2005. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/createprocess.asp.
[3] Microsoft Corporation. Process and thread functions: CreateThread, 2005. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/createthread.asp.
[4] The Apache Software Foundation. Welcome! The Apache HTTP Server Project, 2005. http://httpd.apache.org.
[5] Lawrence Livermore National Laboratory. POSIX threads programming, 2006. http://www.llnl.gov/computing/tutorials/pthreads/.
[6] M. McKusick, K. Bostic, M. Karels, and J. Quarterman. The Design and Implementation of the 4.4 BSD Operating System. Addison-Wesley, 1996.
[7] A. Silberschatz, P. Galvin, and G. Gagne. Applied Operating System Concepts. John Wiley and Sons, 2000.
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'