# Computer Architecture CPSC 5155U

This 708-page set of class notes was uploaded by Earlene Cremin III on Sunday, October 11, 2015. The notes belong to CPSC 5155U at Columbus State University, taught by Edward Bosworth in Fall. Since their upload, they have received 23 views. For similar materials see /class/221204/cpsc-5155u-columbus-state-university in Computer Science at Columbus State University.


## Views of Memory

We begin with a number of views of computer memory and comment on their use.

The simplest view of memory is that presented at the ISA (Instruction Set Architecture) level. At this level, memory is a monolithic addressable unit: a repository for data and instructions with no internal structure apparent. The CPU issues addresses and control signals; it receives instructions and data from the memory, and writes data back to the memory. For some very primitive computers this is the actual structure, and this view suffices for many high-level language programmers. In no modern architecture does the CPU write instructions to the main memory.

## The Logical Multi-Level View of Memory

In a course such as this, we want to investigate the internal memory structures that allow for more efficient and secure operations. The logical view for this course is a three-level view, with cache memory, main memory, and virtual memory. The primary memory is backed by a DASD (Direct Access Storage Device), an external high-capacity device. While "DASD" names any device that meets certain specifications, the standard disk drive is the only device currently in use that fits the bill; thus DASD = disk. This is the view we shall take when we analyze cache memory.

## A More Realistic View of Multi-Level Memory

A more realistic picture adds detail to the main memory itself: a 16 Mb DRAM chip organized as 4K rows of 4 Kb each, fronted by a 4 Kb SRAM buffer; 64-bit lines; 8-way interleaved, byte-addressable memory; data bursts between the DRAM cache and the system disk.

## Generic Primary / Secondary Memory

This lecture covers two related subjects: Virtual Memory and Cache Memory. In each case we have a fast primary memory backed by a bigger secondary memory. The actors in the two cases are as follows.

| Technology | Primary Memory | Secondary Memory | Block |
| --- | --- | --- | --- |
| Cache Memory | SRAM Cache | DRAM Main Memory | Cache Line |
| Virtual Memory | DRAM Main Memory | Disk Memory | Page |

Let TP be the primary access time and TS the secondary access time. The Effective Access Time (TE) is:
TE = h · TP + (1 − h) · TS

where h, the primary hit rate, is the fraction of memory accesses satisfied by the primary memory; 0.0 ≤ h ≤ 1.0. This formula extends to multi-level caches. For example, a two-level cache has

TE = h1 · T1 + (1 − h1) · h2 · T2 + (1 − h1) · (1 − h2) · TS

NOTATION WARNING: In some contexts the DRAM main memory is called "primary memory". I never use that terminology when discussing multi-level memory.

## Examples: Cache Memory

Suppose a single cache fronting a main memory with an 80-nanosecond access time, and suppose the cache memory has a 10-nanosecond access time.

If the hit rate is 90%, then
TE = 0.90 · 10.0 + (1 − 0.90) · 80.0 = 9.0 + 8.0 = 17.0 nsec.

If the hit rate is 99%, then
TE = 0.99 · 10.0 + (1 − 0.99) · 80.0 = 9.9 + 0.8 = 10.7 nsec.

Now suppose an L1 cache with T1 = 4.0 nanoseconds and h1 = 0.90, and an L2 cache with T2 = 10.0 nanoseconds and h2 = 0.99 (defined as the fraction of references that miss at L1 but hit at L2). Suppose a main memory with TS = 80.0 nanoseconds. Then

TE = 0.90 · 4.0 + 0.1 · 0.99 · 10.0 + 0.1 · 0.01 · 80.0 = 3.6 + 0.99 + 0.08 = 4.67 nanoseconds.

Note that with these hit rates, only 0.1 · 0.01 = 0.001 = 0.1% of the memory references are handled by the much slower main memory.

## Precise Definition of Virtual Memory

Virtual memory is commonly associated with one particular implementation of it; however, I shall give its precise definition first: virtual memory is the mapping of the logical addresses generated by an executing program into actual physical memory addresses. This definition alone provides a great advantage to an Operating System, which can then place a program anywhere in physical memory. The common implementation pairs a fast DRAM main memory with a bigger, slower backing store on secondary storage. Swapping, originally a separate mechanism, is now part of the common definition of virtual memory: a program and its data could be swapped out to the disk to allow another program to run, and then swapped in later to resume.

## Common, Accurate Definition of Virtual Memory

Virtual memory allows the program to have a logical address space much larger than the computer's physical address space. It maps logical addresses onto physical addresses, and moves pages of memory between disk and main memory to keep the program running.
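The effective-access-time arithmetic in the examples above can be checked with a short sketch (not part of the original notes; the function name is my own). It generalizes the two formulas: each level i contributes its time weighted by the probability of reaching it and hitting there.

```python
def effective_access_time(cache_times, hit_rates, ts):
    """cache_times[i] / hit_rates[i] describe cache level i (L1 first);
    hit_rates[i] is the hit rate over references that missed all
    earlier levels; ts is the backing (main memory) access time."""
    te, p_reach = 0.0, 1.0          # p_reach: probability we get this far
    for t, h in zip(cache_times, hit_rates):
        te += p_reach * h * t
        p_reach *= (1.0 - h)
    return te + p_reach * ts        # the rest go to the slow memory

print(effective_access_time([10.0], [0.90], 80.0))            # ~17.0 nsec
print(effective_access_time([10.0], [0.99], 80.0))            # ~10.7 nsec
print(effective_access_time([4.0, 10.0], [0.90, 0.99], 80.0)) # ~4.67 nsec
```

The single-level calls reproduce the 17.0 and 10.7 nsec results; the two-level call reproduces the 4.67 nsec result.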
An address space is the range of addresses, considered as unsigned integers, that can be generated. An N-bit address can access 2^N items, with addresses 0 through 2^N − 1.

- 16-bit address: 2^16 items, 0 to 65,535
- 20-bit address: 2^20 items, 0 to 1,048,575
- 32-bit address: 2^32 items, 0 to 4,294,967,295

In all modern applications, the physical address space is no larger than the logical address space; it is often somewhat smaller. As examples, we use a number of machines with 32-bit logical address spaces.

| Machine | Physical Memory | Logical Address Space |
| --- | --- | --- |
| VAX-11/780 | 16 MB | 4 GB (4,096 MB) |
| Pentium (2004) | 128 MB | 4 GB |
| Desktop Pentium | 512 MB | 4 GB |
| Server Pentium | 4 GB | 4 GB |

NOTE: The MAR structure usually allows the two address spaces to be equal.

## Generic Primary / Secondary Memory View

A small, fast, expensive memory is backed by a large, slow, cheap memory. Memory references are first made to the smaller memory.

1. If the address is present, we have a hit.
2. If the address is absent, we have a miss, and must transfer the addressed item from the slow memory. For efficiency, we transfer as a unit the whole block containing the addressed item.

The mapping of the secondary memory to primary memory is many-to-one, in that each primary memory block can hold contents from a number of secondary memory addresses. To distinguish among these, we associate a tag with each primary block.

For example, consider a byte-addressable memory with 24-bit addresses and 16-byte blocks; the memory address would have six hexadecimal digits. Consider the 24-bit address 0xAB7129. The block containing that address comprises every item with address beginning with 0xAB712: 0xAB7120, 0xAB7121, ..., 0xAB7129, 0xAB712A, ..., 0xAB712F. The primary block would have 16 entries, indexed 0 through F, and would have the 20-bit tag 0xAB712 associated with it, either explicitly or implicitly.
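The tag/offset split just described is plain bit manipulation; a sketch (helper name is my own, not from the notes):

```python
BLOCK_BITS = 4                      # 16-byte blocks -> 4-bit offset

def split_address(addr, offset_bits=BLOCK_BITS):
    """Split an address into (block tag, offset within block)."""
    tag = addr >> offset_bits                  # high-order N-K bits
    offset = addr & ((1 << offset_bits) - 1)   # low-order K bits
    return tag, offset

tag, offset = split_address(0xAB7129)
print(hex(tag), hex(offset))        # 0xab712 0x9

# Every byte of the block shares the 20-bit tag 0xAB712:
assert all(split_address(a)[0] == 0xAB712
           for a in range(0xAB7120, 0xAB7130))
```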
## Valid and Dirty Bits

At system start-up, the faster memory contains no valid data; data are copied into it as needed from the slower memory. Each block would have three fields associated with it. The tag field, discussed above, identifies the memory addresses contained. The two status bits are:

- Valid bit: set to 0 at system start-up; set to 1 when valid data have been copied into the block.
- Dirty bit: set to 0 at system start-up; set to 1 whenever the CPU writes to the faster memory; set to 0 whenever the contents are copied back to the slower memory.

## Associative Memory

Associative memory is content-addressable memory: the contents of the memory are searched in one memory cycle. Consider an array of 256 entries, indexed from 0 to 255 (0x0 to 0xFF), and suppose we are searching the memory for the entry 0xAB712. Normal memory would be searched using a standard search algorithm, as learned in beginning programming classes: if the memory is unordered, it would take on average 128 probes to find an item; if the memory is ordered, binary search would find it in 8 probes. Associative memory would find the item in one search. Think of the control circuitry as broadcasting the data value (here 0xAB712) to all memory cells at the same time; if one of the memory cells holds the value, it raises a Boolean flag and the item is found. We do not consider duplicate entries in the associative memory; duplicates can be handled by some rather straightforward circuitry, but this is not done in associative caches.

## Associative Cache

We now focus on cache memory, returning to virtual memory only at the end.

- Primary memory: cache memory (assumed to be one level).
- Secondary memory: main DRAM.

Assume a number of cache lines, each holding 16 bytes, and a 24-bit address. The simplest arrangement is an associative cache; it is also the hardest to implement. Divide the 24-bit address into two parts: a 20-bit tag and a 4-bit offset.

| Bits | 23 – 4 | 3 – 0 |
| --- | --- | --- |
| Field | Tag | Offset |

A cache line in this arrangement would have the following format.

| D bit | V bit | Tag | 16 indexed entries |
| --- | --- | --- | --- |
| 0 | 1 | 0xAB712 | M[0xAB7120] ... M[0xAB712F] |

The placement of a 16-byte block of memory into the cache would be determined by a cache line replacement policy. The policy would probably be as follows.
1. First, look for a cache line with V = 0. If one is found, it is empty and available: nothing is lost by writing into it.
2. If all cache lines have V = 1, look for one with D = 0. Such a cache line can be overwritten without first copying its contents back to main memory.

## Direct-Mapped Cache

This is the simplest to implement, as the cache line index is determined by the address. Assume 256 cache lines, each holding 16 bytes, and a 24-bit address. Recall that 256 = 2^8, so we need eight bits to select the cache line. Divide the 24-bit address into three fields: a 12-bit explicit tag, an 8-bit line number, and a 4-bit offset within the cache line. Note that the 20-bit memory tag is divided between the 12-bit cache tag and the 8-bit line number.

| Bits | 23 – 12 | 11 – 4 | 3 – 0 |
| --- | --- | --- | --- |
| Cache view | Tag | Line | Offset |
| Address view | Block Number | Block Number | Offset |

Consider the address 0xAB7129. It would have Tag = 0xAB7, Line = 0x12, Offset = 0x9. Again, the cache line would contain M[0xAB7120] through M[0xAB712F], and the cache line would also have a V bit and a D bit (Valid and Dirty bits). This simple implementation often works, but it is a bit rigid. A design that blends the associative cache and the direct-mapped cache might be useful.

## Set-Associative Caches

An N-way set-associative cache uses direct mapping, but allows a set of N memory blocks to be stored in each line. This allows some of the flexibility of a fully associative cache without the complexity of a large associative memory for searching the cache.

Suppose a 2-way set-associative implementation of the same cache memory: again, 256 cache lines each holding 16 bytes, and a 24-bit address, with eight bits selecting the cache line. Consider the addresses 0xCD4128 and 0xAB7129. Each would be stored in cache line 0x12. Set 0 of this cache line would hold one block and set 1 would hold the other; each set carries its own D and V bits.

| Set | V | Tag | Contents |
| --- | --- | --- | --- |
| 0 | 1 | 0xCD4 | M[0xCD4120] to M[0xCD412F] |
| 1 | 1 | 0xAB7 | M[0xAB7120] to M[0xAB712F] |
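The direct-mapped field split, and the line collision that motivates set-associative caches, can be sketched as follows (helper name is my own, not from the notes):

```python
def direct_mapped_fields(addr, line_bits=8, offset_bits=4):
    """Split a 24-bit address into (12-bit tag, 8-bit line, 4-bit offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    line = (addr >> offset_bits) & ((1 << line_bits) - 1)
    tag = addr >> (offset_bits + line_bits)
    return tag, line, offset

print([hex(f) for f in direct_mapped_fields(0xAB7129)])
# ['0xab7', '0x12', '0x9']

# 0xCD4128 and 0xAB7129 contend for line 0x12 even if every other
# line is empty; a 2-way set-associative cache can hold both blocks.
assert direct_mapped_fields(0xCD4128)[1] == direct_mapped_fields(0xAB7129)[1] == 0x12
```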
## Virtual Memory Again

Suppose we want to support 32-bit logical addresses in a system in which physical memory is 24-bit addressable. We can follow the primary/secondary memory strategy seen in cache memory. We shall see this again when we study virtual memory in a later lecture. For now, we just note that the address structure of the disk determines the structure of virtual memory.

Each disk stores data in blocks of 512 bytes, called sectors. In some older disks it is not possible to address each sector directly, due to the limitations of older file organization schemes such as FAT-16. FAT-16 used a 16-bit addressing scheme for disk access, so 2^16 sectors could be addressed. Since each sector contained 2^9 bytes, the maximum disk size under pure FAT-16 is 2^25 bytes = 2^5 · 2^20 bytes = 32 MB. To allow for larger disks, it was decided that a cluster of 2^K sectors would be the smallest addressable unit; thus one would get clusters of 1024 bytes, 2048 bytes, etc. Virtual memory transfers data in units of clusters, the size of which is system dependent.

## Examples of Cache Memory

We need to review cache memory and work some specific examples. The idea is simple but fairly abstract; we must make it clear and obvious. While most of this discussion does apply to pages in a virtual memory system, we shall focus on cache memory.

To review, consider the main memory of a computer. This memory might have a size of 384 MB, 512 MB, 1 GB, etc. It is divided into blocks of size 2^K bytes, with K > 2. In general, the N-bit address is broken into two parts: a block tag and an offset. The most significant N − K bits of the address are the block tag; the least significant K bits represent the offset within the block.

We use a specific example for clarity: byte-addressable memory, a 24-bit address, and a cache block size of 16 bytes, so the offset part of the address is K = 4 bits. Remember that our cache examples use byte addressing for simplicity.

## EXAMPLE: The Address 0xAB7129

In our example, the address layout for main memory is as follows. Divide the 24-bit address into two parts: a 20-bit tag and a 4-bit offset.
| Bits | 23 – 4 | 3 – 0 |
| --- | --- | --- |
| Field | Tag | Offset |

Let's examine the sample address in terms of the bit divisions above.

| Bits | 23 – 20 | 19 – 16 | 15 – 12 | 11 – 8 | 7 – 4 | 3 – 0 |
| --- | --- | --- | --- | --- | --- | --- |
| Hex digit | A | B | 7 | 1 | 2 | 9 |
| Field | Tag = 0xAB712 | | | | | Offset = 0x9 |

So the tag field for this block contains the value 0xAB712. The tag field of the cache line must also contain this value, either explicitly or implicitly; more on this later. Remember: it is the cache line size that determines the size of the blocks in main memory. They must be the same size, here 16 bytes.

## What Does the Cache Tag Look Like?

All cache memories are divided into a number of cache lines. This number is also a power of two, usually between 256 = 2^8 and 2^16 for larger L2 caches. Our example in this lecture calls for 256 cache lines.

Associative cache: as a memory block can go into any available cache line, the cache tag must represent the memory tag explicitly: Cache Tag = Block Tag. In our example, it is 0xAB712.

Direct-mapped and set-associative cache: for any specific memory block, there is exactly one cache line that can contain it. Suppose an N-bit address space and 2^L cache lines, each of 2^K bytes.

| Address bits | N − L − K bits | L bits | K bits |
| --- | --- | --- | --- |
| Cache view | Cache Tag | Cache Line | Offset |
| Memory view | Memory Block Tag | Memory Block Tag | Offset |

To retrieve the memory block tag from the cache tag, just append the cache line number. In our example: Memory Block Tag = 0xAB712, Cache Tag = 0xAB7, Cache Line = 0x12.

## Example: Associative Cache for Address 0xAB7129

Suppose that the cache line holds valid data, and that the memory at address 0xAB7129 has been read by the CPU. This forces the block with tag 0xAB712 to be read in: Cache Tag = 0xAB712, Valid = 1, Dirty = 0.

| Offset | Contents |
| --- | --- |
| 0x0 | M[0xAB7120] |
| 0x1 | M[0xAB7121] |
| ... | ... |
| 0x9 | M[0xAB7129] |
| ... | ... |
| 0xF | M[0xAB712F] |
## Example: Direct-Mapped Cache for Address 0xAB7129

Suppose that the cache line holds valid data, and that the memory at address 0xAB7129 has been read by the CPU. This forces the block with tag 0xAB712 to be read in: Cache Line = 0x12, Cache Tag = 0xAB7, Valid = 1, Dirty = 0. Because the cache line number is always the low-order bits of the memory block tag, those bits do not need to be part of the cache tag. The 16 contents entries are the same as in the associative example: M[0xAB7120] through M[0xAB712F].

## Reading and Writing in a Cache Memory

Let's begin our review of cache memory by considering the two processes: CPU reads from cache, and CPU writes to cache. Suppose for the moment that we have a direct-mapped cache with line 0x12 as follows.

| Line Number | Tag | Valid | Dirty | Contents (array of 16 entries) |
| --- | --- | --- | --- | --- |
| 0x12 | 0xAB7 | 1 | 0 | M[0xAB7120] to M[0xAB712F] |

Since the cache line has contents, by definition we must have Valid = 1. For this example we assume that Dirty = 0, but that is almost irrelevant here.

Read from cache: the CPU loads a register from address 0xAB7123. This is read directly from the cache.

Write to cache: the CPU copies a register into address 0xAB712C. The appropriate block is present in the cache line, so the value is written and the dirty bit is set: Dirty = 1.

Now what? Here is a question that cannot occur when reading from the cache. Writing to the cache has changed the value in the cache; the cache line now differs from the corresponding block in main memory. The two main solutions to this problem are called write-through and write-back.

Write-through: every byte written to a cache line is immediately written back to the corresponding memory block. Allowing for the delay in updating main memory, the cache line and memory block are always identical. Advantage: this is a very simple strategy; no dirty bit is needed. Disadvantage: writes to cache proceed at main memory speed.
Write-back: CPU writes to the cache line do not automatically cause updates of the corresponding block in main memory; the cache line is written back only when it is replaced. Advantage: this is a fast strategy; writes proceed at cache speed. Disadvantage: a bit more complexity in the circuitry.

## Example: Cache Line Replacement

For simplicity, assume a direct-mapped cache. Assume that memory block 0xAB712 is present in cache line 0x12. We now get a memory reference to address 0x895123. This address lies in memory block 0x89512, which must be placed in cache line 0x12. The following process holds for each of a memory read from, or a memory write to, 0x895123.

1. The valid bit for cache line 0x12 is examined. If Valid = 0, go to Step 5.
2. The memory tag for cache line 0x12 is examined and compared to the desired tag 0x895. If Cache Tag = 0x895, go to Step 6.
3. The cache tag does not hold the required value. Check the dirty bit. If Dirty = 0, go to Step 5.
4. Here we have Dirty = 1. Write the cache line back to memory block 0xAB712.
5. Read memory block 0x89512 into cache line 0x12. Set Valid = 1 and Dirty = 0.
6. With the desired block in the cache line, perform the memory operation.

## More on the Mapping Types

We have three different major strategies for cache mapping.

- Direct mapping: the simplest strategy, but rather rigid. One can devise almost-realistic programs that defeat this mapping; it is possible to have considerable line replacement with a cache that is mostly empty.
- Fully associative: offers the most flexibility, in that all cache lines can be used. It is also the most complex, because it requires a large associative memory, which is complex and costly.
- N-way set associative: a mix of the two strategies, using a smaller and simpler associative memory. Each cache line holds N = 2^K sets, each the size of a memory block, and has N cache tags, one for each set.
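The six-step replacement process in the example above can be sketched in code. This is a hypothetical helper of my own devising, not from the notes; the callbacks stand in for the memory traffic.

```python
class CacheLine:
    def __init__(self):
        self.valid, self.dirty, self.tag = False, False, None

def access(line, tag, is_write, write_back, read_block):
    """Steps 1-6 of the replacement process for one direct-mapped line."""
    if not (line.valid and line.tag == tag):     # steps 1-3: miss
        if line.valid and line.dirty:            # step 4: save old block
            write_back(line.tag)
        read_block(tag)                          # step 5: fetch new block
        line.tag, line.valid, line.dirty = tag, True, False
    if is_write:                                 # step 6: do the operation
        line.dirty = True

# Replacing dirty block 0xAB712 with 0x89512 in line 0x12
# (tags shown in their 12-bit direct-mapped form):
line = CacheLine()
line.valid, line.dirty, line.tag = True, True, 0xAB7
evicted, fetched = [], []
access(line, 0x895, is_write=False,
       write_back=evicted.append, read_block=fetched.append)
print(hex(evicted[0]), hex(fetched[0]), line.dirty)   # 0xab7 0x895 False
```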
## Example: 4-Way Set-Associative Cache

Based on the previous examples, let us imagine the state of cache line 0x12.

| Tag | Valid | Dirty | Contents (arrays of 16 bytes) |
| --- | --- | --- | --- |
| 0xAB7 | 1 | 1 | M[0xAB7120] through M[0xAB712F] |
| 0x895 | 1 | 0 | M[0x895120] through M[0x89512F] |
| 0xCD4 | 1 | 1 | M[0xCD4120] through M[0xCD412F] |
| 0 | 0 | 0 | Unknown |

For a memory reference to a block possibly mapped to this cache line:

1. extract the cache tag from the memory block number;
2. compare the tag to that of each valid set in the cache line. If we have a match, the referenced memory is in the cache.

Say we have a reference to memory location 0x543126, with memory tag 0x54312. This maps to cache line 0x12 with cache tag 0x543. The replacement policy here is simple: there is an empty set, indicated by its valid bit being 0, so place the memory block there. If all sets in the cache line were valid, a replacement policy would probably look for a set with Dirty = 0, as it could be replaced without being written back to main memory.

## Relationships Between the Cache Mapping Types

Consider variations of mappings to store 256 memory blocks.

| Organization | Cache Lines | Sets per Line |
| --- | --- | --- |
| Direct-Mapped Cache (1-way set associative) | 256 | 1 |
| 2-Way Set Associative | 128 | 2 |
| 4-Way Set Associative | 64 | 4 |
| 8-Way Set Associative | 32 | 8 |
| 16-Way Set Associative | 16 | 16 |
| 32-Way Set Associative | 8 | 32 |
| 64-Way Set Associative | 4 | 64 |
| 128-Way Set Associative | 2 | 128 |
| 256-Way Set Associative (Fully Associative Cache) | 1 | 256 |

N-way set-associative caches can be seen as a hybrid of direct-mapped caches and fully associative caches. As N goes up, the performance of an N-way set-associative cache improves; after about N = 8, the improvement is so slight as not to be worth the additional cost.
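The table above is pure arithmetic: for a fixed capacity, lines times sets per line is constant. A one-loop sketch (not from the notes):

```python
CAPACITY = 256          # memory blocks the cache can hold at once

for n in [1, 2, 4, 8, 16, 32, 64, 128, 256]:
    lines = CAPACITY // n
    print(f"{n:3d}-way set associative: {lines:3d} lines x {n:3d} sets")
    assert lines * n == CAPACITY    # same capacity in every organization
```

The endpoints are the two pure designs: n = 1 is the direct-mapped cache, n = 256 is the fully associative cache.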
## Example: Both Virtual Memory and Cache Memory

Any modern computer supports both virtual memory and cache memory. Consider the following example, based on results in previous lectures:

- byte-addressable memory;
- a 32-bit logical address, giving a logical address space of 2^32 bytes;
- 2^24 bytes of physical memory, requiring 24 bits to address;
- virtual memory implemented using page sizes of 2^12 = 4096 bytes;
- cache memory implemented using a fully associative cache with a cache line size of 16 bytes.

The logical address is divided as follows.

| Bits | 31 – 12 | 11 – 0 |
| --- | --- | --- |
| Field | Page Number | Offset in Page |

The physical address is divided as follows.

| Bits | 23 – 4 | 3 – 0 |
| --- | --- | --- |
| Field | Memory Tag | Offset |

## VM and Cache: The Complete Process

We start with a 32-bit logical address. The virtual memory system uses the 20-bit page number to index the page table, producing the 12 high-order bits of the 24-bit physical address; the 12-bit page offset is carried over unchanged. The cache then splits the 24-bit physical address into a 20-bit memory tag and a 4-bit offset to search for a cache line. This is a lot of work for a process that is supposed to be fast.

## The Virtually Mapped Cache

Suppose that we turn this around, using the high-order 28 bits of the logical address as a virtual tag. If the addressed item is in the cache, it is found immediately. Only a cache miss accesses the virtual memory system: the 20-bit page number and the page table then produce the 24-bit physical address.

## More on Virtual Memory: Can It Work?

When there is a cache miss, the addressed item is not in any cache line, and the virtual memory system must become active. Is the addressed item in main memory, or must it be retrieved from the backing store (disk)? The page table is accessed; if the page is present in memory, the page table holds the high-order 12 bits of that page's physical address. But wait: the page table is in memory. Does this imply two memory accesses for each memory reference?

This is where the TLB (Translation Look-aside Buffer) comes in. It is a cache for the page table, more accurately called the Translation Cache. The TLB is usually implemented as a split associative cache: one associative cache for instruction pages, and one associative cache for data pages. A page table entry in main memory is accessed only if the TLB has a miss.
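The complete translate-then-cache path above can be sketched as follows. The page-table contents here are invented purely for illustration (a hypothetical mapping of page 0x00123 to frame 0xAB7, chosen to land on the running example address).

```python
PAGE_BITS = 12                       # 4096-byte pages

def translate(logical, page_table):
    """32-bit logical address -> 24-bit physical address."""
    page = logical >> PAGE_BITS                  # 20-bit page number
    offset = logical & ((1 << PAGE_BITS) - 1)    # 12-bit offset in page
    frame = page_table[page]                     # 12 high-order physical bits
    return (frame << PAGE_BITS) | offset

page_table = {0x00123: 0xAB7}        # assumed mapping, for illustration
physical = translate(0x00123129, page_table)
print(hex(physical))                 # 0xab7129

# The cache then splits the physical address into tag and offset:
tag, offset = physical >> 4, physical & 0xF
print(hex(tag), hex(offset))         # 0xab712 0x9
```

A real TLB would cache recent `page -> frame` pairs so that the dictionary lookup rarely touches the in-memory page table.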
## Branch & Bound: The Assignment Problem

Our previous discussion of the knapsack problem has already introduced part of this strategy, when we evaluated each partial solution for maximum possible value. This lecture develops the branch-and-bound strategy in a more formal way. The outline of this lecture is as follows.

1. We first define the branch-and-bound strategy.
2. We then describe the assignment problem.
3. We apply the branch-and-bound strategy to the solution of an instance of the assignment problem.

The assignment problem is one of many problems for which the only known solutions take exponential time in general. In general, the assignment of N jobs to N workers (1 job per worker) requires examining N! possibilities. While we cannot reduce the complexity to polynomial, we can use a number of tricks to speed up the solution considerably.

### Optimization Problems

The branch-and-bound strategy is applied to optimization problems. Recall that an optimization problem attempts to maximize profits or minimize costs, subject to constraints. The value to be maximized or minimized is called either the value function or the objective function. Here are some terms associated with the study.

- A feasible solution satisfies the constraints.
- An optimal solution is a feasible solution that optimizes the objective function.
- A bounding solution (not necessarily feasible) optimizes the unconstrained objective function. Please note that the term "bounding solution" may be one that only your instructor uses.

### Bounding the Solution

A key part of the branch-and-bound strategy is placing upper and lower bounds on:

1. any possible solution to the problem, and
2. any possible solution arising from a given partial solution.

The branch-and-bound strategy is quite similar to backtracking. We generate and evaluate a sequence of partial solutions, rejecting ones that can't work. In assessing the solution to the entire problem, we use both:

1. feasible solutions, which are easily generated, and
2. bounding solutions, which might not be feasible.
By problem type, here is how we use the two solution types.

| Solution Type | Minimization Problem | Maximization Problem |
| --- | --- | --- |
| Feasible | No optimal solution can cost more than this solution. | No optimal solution can give less profit than this solution. |
| Bounding | No solution of any sort can cost less than this solution. | No solution of any kind can give more profit than this one. |

Any optimal solution must be at least as good as a feasible solution already produced.

### The Assignment Problem

The assignment problem is a minimization problem. For N ≥ 2, we are given N workers and N jobs, along with an N-by-N matrix of costs, with entry C[I][J] being the cost of assigning worker I to job J. We want to minimize the total cost of the assignments. The feasibility constraints are obvious:

1. each worker is to be given exactly one job, and
2. each job is to be assigned to exactly one worker.

In terms of the cost matrix, the translation of the requirements is also obvious:

1. each row is to have exactly one element selected, and
2. each column is to have exactly one element selected.

The answer is a total cost, together with either an array assigning jobs to workers or an array assigning workers to jobs. In either case, the array will have N elements and be a permutation of the set of integers {1, 2, ..., N}.

### An Instance of the Assignment Problem

Here we use the textbook's example: four workers (A, B, C & D) and four jobs (1, 2, 3, 4).

| | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| A | 9 | 2 | 7 | 8 |
| B | 6 | 4 | 3 | 7 |
| C | 5 | 8 | 1 | 8 |
| D | 7 | 6 | 9 | 4 |

The solution vectors are as follows.

- Assigning jobs to workers: (2, 1, 3, 4), a total cost of 13. Worker A gets job 2, worker B gets job 1, worker C gets job 3, and D gets 4.
- Assigning workers to jobs: (B, A, C, D), again a total cost of 13. Job 1 goes to worker B, job 2 goes to worker A, job 3 to C, and job 4 to D.

It should be obvious that either representation of the solution conveys the same information; neither representation is better than the other.
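For an instance this small we can verify the claimed optimum by brute force, checking all N! = 24 permutations (a sketch, not the branch-and-bound method itself; indices are 0-based internally):

```python
from itertools import permutations

COST = [[9, 2, 7, 8],    # worker A
        [6, 4, 3, 7],    # worker B
        [5, 8, 1, 8],    # worker C
        [7, 6, 9, 4]]    # worker D

best_cost, best_jobs = min(
    (sum(COST[w][j] for w, j in enumerate(jobs)), jobs)
    for jobs in permutations(range(4)))

print(best_cost, [j + 1 for j in best_jobs])   # 13 [2, 1, 3, 4]
```

This confirms the optimal cost of 13 with the jobs-to-workers vector (2, 1, 3, 4).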
### Generating the Bounding Solutions

Each bounding solution is generated without any regard to row or column constraints.

Row minimum: give each worker the cheapest job, forgetting the constraints. The row minima are 2 (worker A), 3 (worker B), 1 (worker C), and 4 (worker D), for a total of 10.

Column minimum: give each job the cheapest worker, again forgetting the constraints. The column minima are 5 (job 1), 2 (job 2), 1 (job 3), and 4 (job 4), for a total of 12.

### Examination of the Bounding Solutions

First note that neither bounding solution is feasible; this is expected. Now we examine what each of the bounding solutions is telling us. The row minimum solution says: no optimal solution can cost less than 10. The column minimum solution says: no optimal solution can cost less than 12. Both of these statements must be true. From this we conclude that no optimal solution can cost less than 12.

NOTE: Frequently, these lower bounds on costs (and upper bounds on profits) are not actually used in the branch-and-bound solutions. Your instructor just likes to calculate them.

### Generating Trial Feasible Solutions

We want to generate a few feasible solutions very quickly, to aid in providing an upper bound to the optimal cost. Recall that the assignment problem for N workers has N! feasible solutions; we are trying to avoid generating all, or a significant part, of these. Two obvious quick tries are (1, 2, ..., N) and (N, N−1, ..., 2, 1). For our four workers, the two quick tries are:

- (1, 2, 3, 4): cost 9 + 4 + 1 + 4 = 18;
- (4, 3, 2, 1): cost 8 + 3 + 8 + 7 = 26.

Here we have discovered a feasible solution with a total cost of 18. This solution is more costly than the bounding solution, as expected, so it might not be the optimal solution; but it is the best solution so far.

### The Branch-and-Bound Strategy

Branch and bound must begin with the complete generation of one feasible solution. Most implementations do this by completely generating one path from the root node of the search tree to a leaf node. Our trial feasible solutions are essentially two complete paths from root to leaf nodes. At each point in the generation of the remaining part of the search tree, branch and bound does the following.

1. It updates the best known solution as new solutions are found.
2. It examines each partial solution, comparing the best solution that can be generated from this partial solution to the best known solution.
3. A partial solution, and its subtree in the search tree, is pruned if it has no possibility of generating a solution better than a known solution.

### Bounding Solutions

Here are the bounds we have derived for the optimal solution of our problem: a lower bound of 12 (from the column minima) and an upper bound of 18 (our best feasible solution; 26 is a weaker upper bound). We want a way to bound the cost of any feasible solution arising from a given partial solution. This obviously involves the actual cost of the partial solution:

- A gets job 1: cost 9.
- A gets 1 and B gets 2: cost 13.
- A gets 1 and B gets 3: cost 12.
- A gets job 2: cost 2.
- A gets 2 and B gets 1: cost 8.
- A gets 2 and B gets 3: cost 5.

To each partial solution we add an estimate of the cost remaining, so that we can establish a lower bound on the cost of any completion of the partial solution.

### Lower Bounds on the Partial Solutions

We derive a figure such that any feasible solution based on this partial solution must cost at least as much as the estimate. If that figure is greater than the actual cost of a known feasible solution, drop the partial solution and prune its subtree. Each node in the search tree records the choices made, the estimated remaining cost, and the estimated total cost. For example, at the node where A has been given job 1, the sum of the minima in the remaining rows of the matrix is 3 + 1 + 4 = 8, so the estimated total cost is 9 + 8 = 17.

### Better Lower Bounds on the Partial Solutions

We get a better lower bound for the remaining cost if we compute the minima of the remaining rows excluding the columns that have already been chosen.

### Evaluating the Assignment for Worker A

Recalling our quick feasible solution (1, 2, 3, 4) with total cost 18, we now use the above information to make a tentative choice for worker A.

| Choice | Cost so far | Estimated remaining | Lower bound |
| --- | --- | --- | --- |
| A = 1 | 9 | 8 | 17 |
| A = 2 | 2 | 8 | 10 |
| A = 3 | 7 | 13 | 20 |
| A = 4 | 8 | 10 | 18 |

The best choice is A = 2.
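The "better" lower bound above can be sketched directly (helper name and the dictionary representation of a partial solution are my own; indices are 0-based):

```python
COST = [[9, 2, 7, 8],    # worker A
        [6, 4, 3, 7],    # worker B
        [5, 8, 1, 8],    # worker C
        [7, 6, 9, 4]]    # worker D

def lower_bound(partial):
    """partial maps worker index -> chosen job index (both 0-based).
    Returns actual cost so far plus, for each unassigned worker,
    the minimum cost over the still-unchosen columns."""
    actual = sum(COST[w][j] for w, j in partial.items())
    free_cols = set(range(4)) - set(partial.values())
    remaining = sum(min(COST[w][j] for j in free_cols)
                    for w in range(4) if w not in partial)
    return actual + remaining

# The four tentative choices for worker A (jobs 1..4):
print([lower_bound({0: j}) for j in range(4)])   # [17, 10, 20, 18]
```

Expanding A = 2 the same way gives bounds 13 (B = 1), 14 (B = 3), and 17 (B = 4), matching the five live-node values 13, 14, 17, 17, 18 discussed next.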
What we see here is that no solution based on the assignment of job 3 to worker A can be better than the solution we have already produced; we prune that subtree. We now have three open nodes (nodes that can be further expanded into solutions). We choose to practice best-first branch and bound, always expanding the node with the best (here, lowest) cost bound on the solution.

Expand A = 2: we now have five live nodes, with lower-bound values 13, 14, 17, 17, and 18. The best-first strategy indicates that we expand the node with lower bound 13.

## Instruction Set Architecture (ISA) of the Boz-5

This chapter of the textbook begins a three-chapter sequence on the design of a didactic computer called the Boz-5, the 5th design by your instructor. The design is called "didactic" because many of its features exist only to be used within a context of teaching. The Boz-5 shares many RISC characteristics, including being a Load/Store design, but retains some CISC characteristics as well. The Boz-5 is a stored-program computer following the von Neumann model. It is a fetch-execute design with no pipelining and few modern features.

Topics for the ISA include:

1. the general-purpose register set and its use;
2. the special-purpose registers, including the PSR, and their use;
3. the address space and management of memory;
4. the handling of the Input/Output (I/O) devices.

As much as possible, I shall try to justify my choices in designing the ISA.

### Specification of the Boz-5

1. It is a stored-program computer; i.e., the computer memory is used to store both data and the machine language instructions.
2. It is a Load/Store machine: only register loads and stores access memory. All arithmetic and logic operations are done on registers.
3. The Boz-5 is a 32-bit machine. The basic unit of data is a 32-bit word. This is in contrast to machines such as the Pentium class, in which the basic data unit is the byte (8 bits).
4. This is a two's-complement machine: negative numbers are stored in two's-complement form. The range for integer values is from −2,147,483,648 to 2,147,483,647 inclusive.
5. Real-number arithmetic is not supported.
We may envision the computer as a typical RISC with an attached floating-point unit that we will not design.
6. The CPU uses a 26-bit Memory Address Register (MAR) to address memory. This implies 2^26 words (64 M words) of main memory. Cache memory is not used.

Specification of the Boz-5 (Part 2)

7. The memory uses a 32-bit Memory Buffer Register (MBR) to transfer data to and from the Central Processing Unit. The MBR is also used by Instruction Fetch to transfer instructions to the CPU.
8. The Boz-5 uses isolated I/O, with the dedicated instructions GET and PUT, each of which takes an address on the I/O bus as an argument.
9. The CPU uses a 16-bit I/O Address Register (IOA) to address I/O registers. This implies 2^16 = 65,536 registers associated with I/O.
10. The CPU uses a 32-bit I/O Data Register (IOD) to put and get I/O data.
11. The Boz-5 uses 20-bit addressing, and the entire address space is occupied. The memory is 32-bit word-addressable, for a total of 2^20 = 1,048,576 words. It is not byte-addressable. One advantage of this addressing scheme is that we may ignore the byte-ordering problem known as "Big-Endian vs. Little-Endian".
12. The Boz-5 has a 5-bit opcode, allowing for a maximum of 2^5 = 32 different instructions. As yet, not all opcodes have been assigned.

Specification of the Boz-5 (Part 3)

13. The Boz-5 has four addressing modes: direct, indirect, indexed, and indexed-indirect. In addition, two instructions explicitly call for immediate addressing, in which the operand is in the instruction itself. Indexed-indirect addressing is implemented as pre-indexed indirect. This decision allows implementation of register-indirect addressing, a fifth address mode.
14. The Boz-5 has eight general-purpose registers, denoted R0 through R7. Each of these registers is a 32-bit register, able to hold a complete memory word. R0 is identically 0: it stores only the constant 0 and is not changed. R1 through R7 are read/write registers, used to store the results of computation. Each of the eight registers can be used as an index register or as the source operand for an instruction. Only registers R1 through R7 can be changed by arithmetic or register load operations; attempts to change R0 are undertaken for side effects only, such as setting the ALU status flags in the PSR.

Structure of the Memory Address Space

The physical memory of the Boz-5 is quite small, comprising only 64 M words, which is 67,108,864 32-bit words, or 256 megabytes. It is not byte-addressable. The MMU (Memory Management Unit) breaks the memory into 64 pages of 1 M words (1,048,576 words) each. Each executing process, except the Operating System, is allocated exactly one page of physical memory and cannot generate addresses outside that page. The mapping of logical addresses to physical addresses is achieved as follows:

    Physical Address = (Page Number) * 2^20 + Logical Address

All processes, including the Operating System, view the MAR as a 20-bit register that generates a 20-bit logical address. The Operating System executes in page 0, but has the privilege to change its page number so that it can access any physical address in memory. The goals of this strange design are:
1. To introduce a simple memory-management system.
2. To motivate discussion of simple Operating System security.

Operation of the Memory Management Unit

The MMU takes the address part of the IR and the page bits in the PSR to generate an address. The address part of the Instruction Register is IR(19–0), the low-order 20 bits. The page number is found in bits 21–16 of the PSR. A 26-bit address is formed by concatenation, which has the same effect as multiplication followed by addition. (Figure: the 6-bit page number is concatenated with the 20-bit address part of the IR to form the 26-bit physical address.)

The Program Status Register (PSR)

The PSR (Program Status Register) of the Boz-5 is a 32-bit word. It contains Boolean flags and bit fields used to control the status of the CPU. All of the bits in the PSR can be read by any program, User or OS; the write status of these bits varies by function. The ALU status bits (V, C, Z, and N) are set directly by ALU actions; no program, OS or User, can write directly to these bits. The OS can write to the other bits, and does so in order to manage the execution of I/O and User processes. In particular, it uses some fields as follows:
- the CPU priority, to match execution priority to the I/O device priority;
- the I bit, to turn off interrupts until they can be handled safely;
- the Security Flags, to grant each program appropriate access rights;
- the Page Number, to allocate a page of memory to a user process and thereby set the range of addresses it can access.

The PSR (Part 2)

Here is the layout of the PSR. Bits 31–22 are not presently assigned; bits 21–16 hold the 6-bit Page Number. The ten bits marked "not presently assigned" are reserved for future options in support of the Operating System. Software designers who give in to the temptation to use these ten spare bits for other purposes might get a surprise when these bits are assigned by the hardware architect. This has happened. The low-order half (bits 15–0) holds the Security Flags, the ALU status bits V, C, Z, and N, the I bit, and the 3-bit CPU Priority; the I bit and the priority bits occupy the four low-order bits. The security flags are yet to be determined; they are mentioned here as a part of your author's campaign for more effective hardware support of security. The I/O design will call for eight priority levels for interrupts, thus three bits to specify the CPU priority. The control unit design is facilitated by grouping the I bit and the 3 priority bits as the four low-order bits of the PSR.

More on Handling Interrupts

At any given time, the CPU is executing with a given priority, PCPU. For user programs, normally PCPU = 0. PCPU is stored in the PSR. At any given time, either interrupts are enabled (I = 1) or disabled (I = 0). Each I/O device is assigned an interrupt priority, PDEV. All hardware interrupts are handled by an Interrupt Handler, which issues a general interrupt signal (INT) to the CPU only when:
1. a device interrupts,
2. PDEV > PCPU, and
3. I = 1.
The normal CPU response to the general INT signal is as follows:
1. Set I = 0, to disable further interrupts until this one can be identified.
2. Determine the source of the interrupt and its priority. At this point, we know that PDEV > PCPU.
3. Schedule the interrupt handler to be run. Set PCPU = PDEV and set I = 1.
4. Run the device handler for the I/O device.

Data Formats in the Boz-5

Each word in the Boz-5 has 32 bits, numbered left to right as 31 through 0. Bit 31 is the most significant bit; bit 0 is the least significant bit.
Character data format (8-bit characters, ASCII): bits 31–24, 23–16, 15–8, and 7–0 hold characters 3, 2, 1, and 0, respectively.
16-bit characters (Unicode): bits 31–16 and 15–0 hold characters 1 and 0.
Integer data format: thirty-two-bit two's-complement arithmetic is used, leading to a range from -2^31 to 2^31 - 1, or -2,147,483,648 to 2,147,483,647 inclusive.
Real-number format: real numbers and other data formats are not supported by this simple CPU.

The Assembly Language of the Boz-5

The Boz-5 uses bits 31–27 of the IR as a five-bit opcode. Of the possible 32 opcodes, only 26 are implemented.

OpCode  Mnemonic  Description
00000   HLT       Halt the computer
00001   LDI       Load register from immediate operand
00010   ANDI      Logical AND register with immediate operand
00011   ADDI      Add signed immediate operand to register
00100   NOP       Not yet defined; at present it does nothing
00101   NOP       Not yet defined; at present it does nothing
00110   NOP       Not yet defined; at present it does nothing
00111   NOP       Not yet defined; at present it does nothing

This strange gap in the opcodes, caused by not using 4, 5, 6, and 7, is an example of adjusting the ISA to facilitate a simpler design of the control unit. With this grouping, the instructions with a 00 prefix either have no argument or have an immediate argument; in short, no memory reference.

More Opcodes

Here are the opcodes in the range 8 to 15.

OpCode  Mnemonic  Description
01000   GET       Input from I/O device register to register
01001   PUT       Output from register to I/O device register
01010   RET       Return from subroutine
01011   RTI       Return from interrupt (not implemented)
01100   LDR       Load register from memory
01101   STR       Store register into memory
01110   JSR       Call a subroutine
01111   BR        Branch on condition code to
address.

Here we begin to see some structure in the ISA. The 01 prefix is shared by all instructions that generate memory address references. This grouping simplifies the design of the Major State Register, which is an integral part of the control unit.

The Last Opcodes

OpCode  Mnemonic  Description
10000   LLS       Logical left shift
10001   LCS       Circular left shift
10010   RLS       Logical right shift
10011   RAS       Arithmetic right shift
10100   NOT       Logical NOT (one's complement)
10101   ADD       Addition
10110   SUB       Subtraction
10111   AND       Logical AND
11000   OR        Logical OR
11001   XOR       Logical exclusive OR

The instructions with the 10 and 11 prefixes (in short, all with the single-bit prefix 1) are register-to-register operations with no memory reference. Again, one sees this grouping in an attempt to simplify the control unit.

Privileged Instructions

In modern computer design, certain instructions are not available to programs executing in User Mode. The assembler will translate these into traps, which are requests to the Operating System for it to do something. Here are the Boz-5 instructions that should be privileged.
HLT  Stop the computer. This will be translated into a return to Operating System control; the operating system will terminate program execution, resume control, and pass execution to another program.
GET  Input directly from a physical device. This will be translated into a trap to the OS input routine.
PUT  Output directly to a physical device. This will be translated into a trap to the OS output routine.

Addressing Modes

The Boz-5 uses a number of modes to generate memory addresses. For this discussion, a direct reference to a general-purpose register, such as is used by the arithmetic instructions, will not be called an addressing mode. We shall devote a lecture to addressing modes; for the moment, we give the names of these modes and define one. The modes are:
1. Immediate
2. Direct
3. Indirect
4. Indexed
5. Indexed-Indirect
6. Register Indirect
These six addressing modes are a bit too many for a pure RISC design, but do not compare with the twenty or more seen in a fully CISC design.

Immediate Addressing

It is convenient, and good design practice, to encode some small arguments directly into the assembly-language instruction. The Boz-5 has three instructions with immediate arguments:
LDI   Load the register with the immediate argument,
ANDI  Mask the register with the immediate argument, and
ADDI  Add an immediate argument to the register.
In immediate addressing, the twenty low-order bits of the instruction (stored in bits 19–0 of the IR) are used to hold a constant. For the ANDI, the 20 bits are viewed as a bit mask, not an integer value. This mask is represented as five hexadecimal digits. As a result of the ANDI, the high-order 12 bits of the register are changed to 0. For the LDI and ADDI, this constant is viewed as a 20-bit two's-complement integer, having values in the range -2^19 to 2^19 - 1, or -524,288 to 524,287. The most common immediate arguments will be -1, 0, and 1; the unusually large range in the Boz-5 is a result of a desire to keep the control unit simple.

Immediate and Direct Addressing

To clarify the idea of immediate addressing, we investigate three instructions:
LDI  Load register immediate,
LDR  Load register from memory, and
STR  Store register into memory.
Consider the following instructions.
LDI R1, 100   Load the value 100 into register R1.
LDR R1, 100   Load the contents of memory location 100 into register R1: R1 <- M[100].
STR R2, 100   Store the contents of register R2 into memory location 100: M[100] <- R2.
There is no such thing as a "Store Immediate" instruction.

Direct Addressing: Memory vs. I/O

The Boz-5 uses isolated I/O, in which there are two address spaces: a 20-bit address space for memory references and a 16-bit address space for I/O references. Consider the following two instructions.
LDR R3, 200   Load register R3 from memory location 200: R3 <- M[200].
GET R3, 200   Load register R3 from the 32-bit I/O device register at address 200 on the I/O bus.
For simplicity of design, all references to I/O device registers will use direct addressing only.

NOTE: Some I/O registers are read-only. Attempts to write to these registers using a PUT instruction will have no effect.

Implementation of Boolean Logic by Digital Circuits

We now consider the use of electronic circuits to implement Boolean functions, and the arithmetic functions that can be derived from these Boolean functions. Digital circuits are built from standard analog components, such as transistors. It is the manner in which these transistors are used that causes them to display the properties required of a digital circuit. Early digital circuits were based on electromechanical relays: automatic switches that were either on or off. (Figure: a relay in the On position and in the Off position.)

In 1937, George Stibitz of Bell Labs developed what he called the "Model K". It was a binary full adder, based on relays implementing Boolean logic. He developed the device at home, in his kitchen, hence the name. In 1938, Konrad Zuse developed a relay-based digital computer, the Z1, in his parents' apartment in Berlin. It was lost to bombing during the war.

Digital Technologies

There are quite a few ways to build digital circuits. The choice of which to use in any given device is based on a trade-off of cost, speed, and power usage. This course is based on an older technology that is a bit simpler to understand; this technology is still seen in digital labs used for teaching. The technology is called TTL (Transistor-Transistor Logic). It is based on the use of transistors in a mode in which they act as switches, much like relays. Logically, each TTL device is a Boolean device: all inputs to the device, and all outputs from it, are either logic 0 or logic 1. Electrically, these TTL devices are built to a standard that determines how voltages into the device will be interpreted and what voltage is output. Here are the voltage standards for active-high TTL, the variety we study.

          Input to Device     Output by Device
Logic 1   2.0 to 5.0 volts    2.4 to 5.0 volts
Logic 0   0.0 to 0.8 volts    0.0 to 0.4 volts

Note the greater latitude on the input specifications, to allow for voltage degradation.

Implementation of Basic Circuits

These circuits use simple switches to implement NOT, NOR, and NAND gates. (Figure: a switch circuit with a 5-volt supply and a 10-kilohm pull-up resistor.) In the circuit at right, if both switches are closed (logic 1), the output is 0 volts. If neither is closed, or only one is closed, the output is 5 volts. This is a NAND gate.

Basic Digital Circuit Elements

We have already discussed these gates, but present them again.

NOT: (Figure: the NOT gate drawn as a triangle; the circle at the right end of the triangle is important.) Algebraically, this function is denoted X'. The evaluation of the function is simple: 0' = 1 and 1' = 0.

Basic Boolean Operators (Part 2)

Logic OR: this is a function of two Boolean variables. We denote the logical OR of two Boolean variables X and Y by X + Y; some logic books will use X v Y. The evaluation of the logical OR function is shown by a truth table.
X  Y  X+Y
0  0  0
0  1  1
1  0  1
1  1  1

Basic Boolean Operators (Part 3)

Logic AND: this is a function of two Boolean variables. We denote the logical AND of two Boolean variables X and Y by X • Y; some logic books will use XY. The evaluation of the logical AND function is shown by a truth table.
X  Y  X•Y
0  0  0
0  1  0
1  0  0
1  1  1

Another Boolean Operator

While not a basic Boolean operator, the exclusive OR is very handy. Logic XOR: this is a function of two Boolean variables. We denote the logical XOR of two Boolean variables X and Y by X ⊕ Y. Most logic books seem to ignore this function. The evaluation of the logical XOR function is shown by a truth table.
X  Y  X⊕Y
0  0  0
0  1  1
1  0  1
1  1  0
From this last table, we see immediately that X ⊕ 0 = X and X ⊕ 1 = X'.

Other Logic Gates

The top line shows the NOR gate and its logical equivalent; the bottom line shows the NAND gate and its logical equivalent. (Figure: NOR drawn as OR followed by NOT; NAND drawn as AND followed by NOT.) In my notes, I call these "derived gates", as they are composites of Boolean gates that are more basic from the purely theoretical approach. In actual fact, the NAND and NOR gates are more
primitive than the AND, OR, and NOT gates, in that they are easier to build from transistors. The student should note that the lab experiment discussed above is based on NAND and NOR gates.

The NAND Gate as a Universal Gate

We show how to use a NAND gate to implement the three basic gates of Boolean logic: AND, OR, and NOT. We begin with a simple NAND gate, whose output is (X • Y)', and its truth table. We now use the NAND gate to implement the basic Boolean devices.

The NAND Gate as a NOT Gate or an AND Gate

Note, in the NAND truth table, that if Y = X, then (X • Y)' = (X • X)' = X'. Here is the NAND implementation of the NOT gate: tie both inputs to X, and the output is X'. Since the NAND gate is logically equivalent to NOT(AND), we may use double negation to say that the AND gate is equivalent to NOT(NAND). Here is the AND gate as implemented from two NAND gates: the first NAND produces (X • Y)', and the second, wired as a NOT gate, yields X • Y.

The NAND Gate as an OR Gate

In order to fabricate an OR gate from NAND gates, we must recall DeMorgan's laws. One of DeMorgan's laws is usually stated as (X + Y)' = X' • Y'. This can be changed to the form X + Y = (X' • Y')'. Here is the circuit: invert each input (using NAND gates as inverters), then NAND the results to produce X + Y.

Multiple-Input Gates

The standard definitions of the AND and OR gates call for two inputs, but 3-input and 4-input varieties of these gates are quite common. Here we give informal, but precise, definitions.

Gate   Number of Inputs   Output
NOT    Exactly 1          0 if the input is 1; 1 if the input is 0
AND    2 or more          0 if any input is 0; 1 if and only if all inputs are 1
OR     2 or more          1 if any input is 1; 0 if and only if all inputs are 0
NAND   2 or more          1 if any input is 0; 0 if and only if all inputs are 1
NOR    2 or more          0 if any input is 1; 1 if and only if all inputs are 0

Example: Changing the Number of Inputs

Some lab experiments call for gates with input counts other than what we have. We begin with two ways to fabricate a 4-input AND gate from 2-input ANDs. (Figure: two cascaded arrangements of 2-input AND gates producing A•B•C•D.)

Another Example

We now consider how to take a 4-input AND gate and make it act as if it were a 2-input AND gate. (Figure: A•A•B•B = A•B, and A•B•1•1 = A•B.) There are many others.

Fan-Out

By definition, the fan-out of a logic gate is the number of other logic gates receiving input from it. Considerations based on electrical engineering limit the fan-out of any gate. (Figure: an OR gate with a fan-out of 5; it drives five other gates of some kind.) When the fan-out of a circuit element gets too large, there is a voltage sag. This is similar to what can happen in a building when a large motor or a large electric heater turns on.

Real Numbers

We have been studying integer arithmetic up to this point. We have discovered that a standard computer can represent a finite subset of the infinite set of integers; the range is determined by the number of bits used for the integers. For example, the range for 16-bit two's-complement arithmetic is -32,768 to 32,767. We now turn our attention to real numbers, focusing on their representation as floating-point numbers. The floating-point representation of decimal numbers is often called "scientific notation".

Most of us who use real numbers are more comfortable with fixed-point numbers: those with a fixed number of digits after the decimal point. For example, normal U.S. money usage calls for two digits after the decimal point, as in 123.45. Most banks use a variant of fixed-point numbers to store cash balances and similar accounting data; this is due to the round-off issues with floating-point numbers. It might be possible to use 32-bit two's-complement integers to represent the money in pennies: we could represent -2,147,483,648 to 2,147,483,647 pennies.

Floating-Point Numbers

Floating-point notation allows any number of digits to the right of the decimal point. In considering decimal floating-point notation, we focus on a standard representation, often called scientific notation, in which the number is represented as a product
    (-1)^S • X • 10^P, where 1.0 <= X < 10.0.
The restriction that 1.0 <= X < 10.0 insures a unique representation. Examples:
0.09375 = (-1)^0 • 9.375 • 10^-2
-23.375 = (-1)^1 • 2.3375 • 10^1
1453.0  = (-1)^0 • 1.453 • 10^3
6.022142 • 10^23 (Avogadro's number, already in the standard form)
Avogadro's number, an experimentally determined value, shows two uses of the notation: (1) without it, the number would require 24 digits to write; (2) it shows the precision with which the value of the constant is known. This says that the number is between 6.0221415 • 10^23 and 6.0221425 • 10^23.
QUESTION: What common number cannot be represented in this form? HINT: Consider the constraint 1.0 <= X < 10.0.

Zero Cannot Be Represented

In standard scientific notation, a zero is simply represented as 0.0. One can also see numbers written as 0.0 • 10^P for some power P, but this is usually the result of some computation, and is usually rewritten as simply 0.0 or 0.00. The constrained notation (-1)^S • X • 10^P, with 1.0 <= X < 10.0 (not normally a part of scientific notation), is the cause of the inability to represent the number 0. Argument: solve X • 10^P = 0. Since X > 0, we can divide both sides by X. We get 10^P = 0. But there is no value P such that 10^P = 0. Admittedly, 10^-1000000 is so small as to be unimaginable, but it is not zero. Having considered this non-standard variant of scientific notation, we move on to discuss normalized binary numbers. Our discussion of floating-point numbers will focus on a standard called IEEE Floating-Point Standard 754, Single Precision.

Normalized Binary Numbers

Normalized binary numbers are represented in the form (-1)^S • X • 2^P, where 1.0 <= X < 2.0. Again, the constraint on X insures a unique representation. It also allows a protocol based on the fact that the first digit of the number X is always 1; in other words, X = 1.Y. Here are some examples.
1.0  = 1.0 • 2^0,   thus P = 0,  X = 1.0,  and Y = 0
1.5  = 1.5 • 2^0,   thus P = 0,  X = 1.5,  and Y = 5
2.0  = 1.0 • 2^1,   thus P = 1,  X = 1.0,  and Y = 0
0.25 = 1.0 • 2^-2,  thus P = -2, X = 1.0,  and Y = 0
7.0  = 1.75 • 2^2,  thus P = 2,  X = 1.75, and Y = 75
0.75 = 1.5 • 2^-1,  thus P = -1, X = 1.5,  and Y = 5
The unusual representation of Y will be explained later. The standard calls for representing a floating-point number with the triple (S, P, Y).

Representing the Exponent

The exponent is an integer; it can be negative, zero, or positive. In the IEEE Single Precision format, the exponent is stored as an 8-bit number in excess-127 format. Let P be the exponent. This format calls for storing the number P + 127 as an unsigned 8-bit integer. The range of 8-bit unsigned integers is 0 to 255 inclusive. This leads to the following limits on the exponent that can be stored in this format:
    0 <= P + 127 <= 255, or -127 <= P <= 128.
Here are some examples.
P = -5:  -5 + 127 = 122 decimal = 0111 1010 binary, the answer.
P = -1:  -1 + 127 = 126 decimal = 0111 1110 binary, the answer.
P = 0:    0 + 127 = 127 decimal = 0111 1111 binary, the answer.
P = 4:    4 + 127 = 131 decimal = 1000 0011 binary, the answer.
P = 33:  33 + 127 = 160 decimal = 1010 0000 binary, the answer.

IEEE Floating-Point Standard 754, Single Precision

The standard calls for a 32-bit representation. From left to right, we have:
- one sign bit: 1 for a negative number and 0 for a non-negative number;
- eight exponent bits, storing the exponent in excess-127 notation;
- 23 bits for the significand (defined below).
The standard calls for two special patterns in the exponent field:
0000 0000 (P = -127): reserved for denormalized numbers (defined below);
1111 1111 (P = 128): reserved for infinity and NAN (Not A Number), each defined below.
The range of exponents for a normalized number is -127 < P < 128.

Normalized Numbers in IEEE Single Precision

In this standard, a normalized number is represented in the format (-1)^S • X • 2^P, where 1.0 <= X < 2.0 and -126 <= P <= 127. The smallest positive number that can be represented as a normalized number in this format has value 1.0 • 2^-126. We convert this to decimal. Log10(2) = 0.301030, so Log10(2^-126) = -126 • 0.301030 = -37.92978 = 0.07022 - 38. But 10^0.07022 ≈ 1.2. We conclude that 2^-126 is about 1.2 • 10^-38, the lower limit on this format. The largest positive number that can be represented as a normalized number in this format has value (2 - 2^-23) • 2^127, minutely less than 2 • 2^127 = 2^128. Now Log10(2^128) = 128 • 0.301030 = 38.53184, and 10^0.53184 ≈ 3.4. We conclude that 2^128 is about 3.4 • 10^38, the upper limit on this format. The range for positive normalized numbers in this format is about 1.2 • 10^-38 to 3.4 • 10^38.

Denormalized Numbers

Consider the product of two very small numbers. Each is a positive number
that can be represented as a normalized number in this format. Let X = 10^-20 and Y = 10^-20. Then X • Y = 10^-40, a number that cannot be represented in normalized form. This leads to what is called an underflow error. There are two options: either say that X • Y = 0.0, or store the number in another format. The designers of the IEEE floating-point format devised denormalized numbers to handle this underflow problem; these are numbers with magnitude too small to be represented as normalized numbers. The one very important denormalized number that is the exception here is zero. Zero, denoted as 0.0, is the only denormalized number that will concern us. The standard representation of 0.0 is just thirty-two 0 bits:
0000 0000 0000 0000 0000 0000 0000 0000 = 0x00000000.

Infinity and NAN (Not A Number)

Here we speak loosely, in a fashion that would displease most pure mathematicians.

Infinity: What is the result of dividing a positive number by zero? This is equivalent to solving the equation X / 0 = Y, or 0 • Y = X with X > 0, for some value Y. There is no value Y such that 0 • Y > 0. Loosely, we say that X / 0 = ∞. The IEEE standard has a specific bit pattern for each of +∞ and -∞.

NAN: What is the result of dividing zero by zero? This is equivalent to solving the equation 0 / 0 = Y, or 0 • Y = 0. This is true for every number Y. We say that 0 / 0 is "not a number". It is easy to show that the mathematical monstrosities ∞ - ∞ and ∞ / ∞ must each be set to NAN; this involves techniques normally associated with calculus. An implementation of the standard can also use this "not a number" to represent other results that do not fit as real numbers. One example would be the square root of -1.

Normalized Numbers: Producing the Binary Representation

Remember the structure of the single-precision floating-point number:
- one sign bit (1 for a negative number, 0 for a non-negative number),
- eight exponent bits, storing the exponent in excess-127 notation,
- 23 bits for the significand.
Step 1: Determine the sign bit. Save this for later.
Step 2: Convert the absolute value of the number to normalized form.
Step 3: Determine the eight-bit exponent field.
Step 4: Determine the 23-bit significand. There are shortcuts here.
Step 5: Arrange the fields in order.
Step 6: Rearrange the bits, grouping by fours from the left.
Step 7: Write the number as eight hexadecimal digits.
Exception: 0.0 is always 0x0000 0000 (space used for legibility only). This is a denormalized number, so the procedure does not apply.

Example: The Negative Number -0.750

Step 1: The number is negative. The sign bit is S = 1.
Step 2: 0.750 = 1.5 • 0.50 = 1.5 • 2^-1. The exponent is P = -1.
Step 3: P + 127 = -1 + 127 = 126. As an eight-bit number, this is 0111 1110.
Step 4: Convert 1.5 to binary: 1.5 = 1 + 1/2 = 1.1 in binary. The significand is 1000...0; to get the significand, drop the leading "1." from the number. Note that we do not extend the significand to its full 23 bits, but only place a few zeroes after the last 1 in the string.
Step 5: Arrange the bits.
Sign  Exponent   Significand
1     0111 1110  1000 ... 00
Step 6: Rearrange the bits: 1011 1111 0100 0000 ..., etc.
Step 7: Write as 0xBF40; extend to eight hex digits: 0xBF40 0000.
The trick with the significand works because it comprises the bits to the right of the binary point, so 1000...0 is the same as 1000 0000 0000 0000 0000 000.

Example: The Number 80.09375

This example will be worked in more detail, using methods that are more standard.
Step 1: The number is not negative. The sign bit is S = 0.
Step 2: We shall work this out in quite some detail, mostly to review the techniques. Note that 2^6 <= 80.09375 < 2^7, so the exponent ought to be 6.
Convert 80 by repeated division:
80 / 2 = 40, remainder 0
40 / 2 = 20, remainder 0
20 / 2 = 10, remainder 0
10 / 2 = 5,  remainder 0
5 / 2 = 2,   remainder 1
2 / 2 = 1,   remainder 0
1 / 2 = 0,   remainder 1
So 80 = 101 0000 in binary (read bottom to top).
Convert 0.09375 by repeated multiplication:
0.09375 • 2 = 0.1875
0.1875 • 2 = 0.375
0.375 • 2 = 0.75
0.75 • 2 = 1.50 (drop the leading 1)
0.50 • 2 = 1.00
The binary value is 101 0000.00011.

Example: The Number 80.09375 (Continued)

Step 2 (continued): We continue to convert the binary number 101 0000.00011. To get a number in the form 1.Y, we move the binary point six places to the left. This moving six places to the left indicates that the exponent is P = 6: 101 0000.00011 = 1.0100 0000 011 • 2^6.
Step 3: P + 127 = 6 + 127 = 133 = 128 + 5. In binary, we have 1000 0101.
Step 4: The significand is 0100 0000 011, or 0100 0000 0110 0000. Again, we just take the number 1.0100 0000 011 and drop the "1.".
Step 5: Arrange the bits.
Sign  Exponent   Significand
0     1000 0101  0100 0000 0110 0000
Step 6: Rearrange the bits: 0100 0010 1010 0000 0011 0000 ..., etc.
Step 7: Write as 0x42A030; extend to eight hex digits: 0x42A0 3000.

Example in Reverse: 0x42E8 0000

Given the 32-bit number 0x42E8 0000, determine the value of the floating-point number represented, if the format is IEEE 754 Single Precision. Just do the steps backwards.
Step 1: From left to right, convert all non-zero hexadecimal digits to binary. If necessary, pad out with trailing zeroes to get at least ten binary bits: 4 = 0100, 2 = 0010, E = 1110, 8 = 1000.
Step 2: Rearrange the bits as 1 bit, 8 bits, the rest.
Sign  Exponent   Significand
0     1000 0101  1101000
Step 3: Interpret the sign bit. S = 0; the number is non-negative.
Step 4: Interpret the exponent field: 1000 0101 binary = 128 + 4 + 1 = 133. P + 127 = 133, so P = 6.
Step 5: Extend and interpret the significand. Extend to 1.1101000; drop the trailing 0's: 1.1101 binary = 1 + 1/2 + 1/4 + 1/16 = 1.8125.
Step 6: Evaluate the number. I show three ways to compute the magnitude.
6a. Just do the multiplication: 1.8125 • 2^6 = 1.8125 • 64 = 116.0.
6b. Consider the fractional powers of 2: 1.1101 binary = (1 + 1/2 + 1/4 + 1/16), so we have (1 + 1/2 + 1/4 + 1/16) • 64 = 64 + 32 + 16 + 4 = 116.0.
6c. The binary representation is 1.1101 • 2^6. Move the binary point six places to the right to remove the exponent, but first pad the right-hand side of the significand to six bits: 1.110100 • 2^6 = 111 0100 binary = 64 + 32 + 16 + 4 = 116.0.
REMARK: Whenever the instructor gives more than one method to solve a problem, the student should feel free to select one and ignore the others.

Example in Reverse: 0xC2E8 0000

This is an example rigged to make a particular point.
Step 1: From left to right, convert all non-zero hexadecimal digits to binary. If necessary, pad out with trailing zeroes to get at least ten binary bits: C = 1100, 2 = 0010, E = 1110, 8 = 1000.
Step 2: Rearrange the bits as 1 bit, 8 bits, the rest.
Sign  Exponent   Significand
1     1000 0101  1101000
Here we take a shortcut that should be obvious. Compare this bit pattern with that of the previous example, which evaluated to 116.0. This is the same number, just with a different sign bit. The answer is the negative number -116.0.

A Final Example: 0xC000 0000

Step 1: From left to right, convert all non-zero hexadecimal digits to binary: C = 1100. Just to be thorough, I pad the number out to twelve binary bits: C 0 0 = 1100 0000 0000.
Step 2: Rearrange the bits as 1 bit, 8 bits, the rest.
Sign  Exponent   Significand
1     1000 0000  000
Step 3: Interpret the sign bit. S = 1; the number is negative.
Step 4: Interpret the exponent field: 1000 0000 binary = 128. P + 127 = 128, so P = 1.
Step 5: Extend and interpret the significand. Extend to 1.000; this is exactly 1.0.
Step 6: Evaluate the number. The magnitude is 1.0 • 2^1 = 2.0. The number is -2.0.

Precision

How accurate is this floating-point format? Recall again the bit counts for the various fields of the number:
- one sign bit (1 for a negative number and 0 for a non-negative number),
- eight exponent bits, storing the exponent in excess-127 notation,
- 23 bits for the significand.
It is the 23 bits for the significand that give rise to the precision. With the leading 1 that is not stored, we have 24 bits, thus accuracy to 1 part in 2^24. Now 2^24 = 2^4 • 2^20 = 16 • 2^20 = 16,777,216. One part in 10,000,000 would imply seven significant digits; this is slightly better, so we can claim seven significant digits. The IEEE double-precision format extends the accuracy to more digits. Bankers and other financial types prefer exact arithmetic, so they use another format (BCD) for all of their real-number money calculations.

IBM S/370 Floating-Point Data

We now discuss the
representation used by IBM on its mainframe computers the Sytem360 System370 and subsequent mainframes All oating point formats are of the form S E F representing lSoBEoF It is the triple S E F that is stored in memory S B E F the sign bit 1 for negative and 0 for non negative the base of the number system one of 2 10 or 16 the exponent the fraction The IEEE 754 standard calls for a binary base The IBM 370 format uses base 16 Each of the formats represents the numbers in normalized form For IBM 370 format this implies that 00625 lt F g 10 Note 116 00625 S370 Floating Point Storing the Exponent The exponent is stored in excess 64 format as a 7 bit unsigned number This allows for both positive and negative exponents A 7 bit unsigned binary number can store values in the range 0 127 inclusive The range of exponents is given by The leftmost byte of the format stores both the sign and exponent 0 g E 64 g 127 or 64 E 63 1 Bits 1 0 1 1 1 2 1 3 1 4 6 1 7 1 1 Field 1 Sign Exponent in Excess 64 format 1 Examples Negative number Exponent 8 E 64 56 48 8 X 38 B 011 1000 0 1 2 1 4 1 5 6 1 7 Sign 3 8 1 0 1 1 1 1 0 0 1 0 The value stored in the leftmost byte is 1011 1000 or B8 Converting Decimal to Hexadecimal The first step in producing the IBM 370 oating point representation of a real number is to convert that number into hexadecimal format The process for conversion has two steps one each for the integer and fractional part Example Represent 12390625 to hexadecimal Conversion of the integer part is achieved by repeated division with remainders 123 16 7 with remainder 11 X B 7 16 0 with remainder 7 X 7 Read bottom to top as X 7B Indeed 123 7016 11 112 11 Conversion of the fractional part is achieved by repeated multiplication 090625 0 16 145 Remove the 14 hexadecimal E 05 o 16 80 Remove the 8 The answer is read top to bottom as E8 The answer is that 12390625 in decimal is represented by X 7BE8 Converting Decimal to IBM 370 Floating Point Format The decimal number is 12390625 Its 
hexadecimal representation is 0x7B.E8. Normalize this by moving the hexadecimal point two places to the left; the number is now 16^2 · 0.7BE8. The sign bit is 0, as the number is not negative. The exponent field is E + 64 = 2 + 64 = 66 = 0x42, so the leftmost byte is 0x42. The fraction is 7BE8. The leading part of the floating point datum is 42 7B E8; in single precision this would be represented in four bytes as 42 7B E8 00.

CPSC 5155, Chapter 7, Slide 1 of 17 slides
Sample Design: A Controller for a Simple Traffic Light

Slide 2 of 17 slides
Assumption: Two Linked Pairs of Traffic Lights
(Figure: an intersection with North at the top; Light 2 governs one street and Light 1 the cross street.) If one light is Green, the cross light must be Red.

Slide 3 of 17 slides
Assumed Cycling Rules
One Light | Cross Light | Comments
Green  | Red   | Traffic moving on one street
Yellow | Red   | Traffic on cross street must wait for this light to turn red
Red    | Red   | Both lights are red for about one second
Red    | Green | Cross traffic now moves
This is the basic sequence for a traffic light without turn signals or features such as an advanced green, etc.

Slide 4 of 17 slides
Name the States
State | Light 1 | Light 2 | Alias
0 | Red    | Red    | RR
1 | Red    | Green  | RG
2 | Red    | Yellow | RY
3 | Red    | Red    | RR
4 | Green  | Red    | GR
5 | Yellow | Red    | YR

Slide 5 of 17 slides
Step 1a: State Diagram for the System
(Figure: a six-state design and a five-state design.) Notation: L1L2, so RG means Light 1 is Red and Light 2 is Green. The six-state design is more easily implemented.

Slide 6 of 17 slides
Step 1b: Define the State Table
Present State | Next State
0 RR | 1 RG
1 RG | 2 RY
2 RY | 3 RR
3 RR | 4 GR
4 GR | 5 YR
5 YR | 0 RR
At the moment this is just a modulo-6 counter with unusual output. We shall add some additional circuitry to allow for safety constraints. The choice of Red/Red as state 0 is arbitrary but convenient.

Slide 7 of 17 slides
Step 2: Count the States and Determine the Flip-Flop Count
There are six states, so we have N = 6. Solve 2^(P-1) < N ≤ 2^P for P, the number of flip-flops: 2^(P-1) < 6 ≤ 2^P
gives P = 3, because 2^2 < 6 ≤ 2^3. We denote the states by Q2Q1Q0, because the symbol Y is taken to indicate the color Yellow.

Slide 8 of 17 slides
Step 3: Assign a 3-bit Binary Number to Each State
This is a modified counter, so the assignments are quite obvious:
State | Q2 Q1 Q0
0 | 0 0 0
1 | 0 0 1
2 | 0 1 0
3 | 0 1 1
4 | 1 0 0
5 | 1 0 1
We have two possible additional states, 6 and 7. Normally these are ignored, but we consider them due to safety constraints.

Slide 9 of 17 slides
Redefine the State Diagram to Add Safety
States 6 and 7 should never be entered. Each is RR for safety.

Slide 10 of 17 slides
Step 4a: Derive the Output Equations
Alias | Q2 Q1 Q0 | R1 G1 Y1 | R2 G2 Y2
0 RR | 0 0 0 | 1 0 0 | 1 0 0
1 RG | 0 0 1 | 1 0 0 | 0 1 0
2 RY | 0 1 0 | 1 0 0 | 0 0 1
3 RR | 0 1 1 | 1 0 0 | 1 0 0
4 GR | 1 0 0 | 0 1 0 | 1 0 0
5 YR | 1 0 1 | 0 0 1 | 1 0 0
6 RR | 1 1 0 | 1 0 0 | 1 0 0
7 RR | 1 1 1 | 1 0 0 | 1 0 0
Here are the output equations:
G1 = Q2·Q1'·Q0'   G2 = Q2'·Q1'·Q0
Y1 = Q2·Q1'·Q0    Y2 = Q2'·Q1·Q0'
R1 = (G1 + Y1)'   R2 = (G2 + Y2)'

Slide 11 of 17 slides
Step 4a: Derive the Output Equations (page 2)
Here are the equations again:
G1 = Q2·Q1'·Q0'   G2 = Q2'·Q1'·Q0
Y1 = Q2·Q1'·Q0    Y2 = Q2'·Q1·Q0'
R1 = (G1 + Y1)'   R2 = (G2 + Y2)'
We derive the Green and Yellow signals, which are easier, and we stipulate that if a light is not Green or Yellow, it must be Red. Now add a safety constraint: if a light is Green or Yellow, the cross light must be Red:
R1 = (G1 + Y1)' + G2 + Y2 and R2 = (G2 + Y2)' + G1 + Y1
These equations may lead to a light showing two colors. This is obviously an error situation.

Slide 12 of 17 slides
Step 4b: Derive the State Transition Table
Present State Q2Q1Q0 | Next State Q2Q1Q0
0 | 000 | 001
1 | 001 | 010
2 | 010 | 011
3 | 011 | 100
4 | 100 | 101
5 | 101 | 000
6 | 110 | 000
7 | 111 | 000

Slide 13 of 17 slides
Step 5: Separate the Table into Three Tables
One table per flip-flop, giving the next state of each of Q0, Q1, and Q2 as a function of the present state:
NS Q0: PS 000 → 1, 001 → 0, 010 → 1, 011 → 0, 100 → 1, 101 → 0, 110 → 0, 111 → 0
NS Q1: PS 000 → 0, 001 → 1, 010 → 1, 011 → 0, 100 → 0, 101 → 0, 110 → 0, 111 → 0
NS Q2: PS 000 → 0, 001 → 0, 010 → 0, 011 → 1, 100 → 1, 101 → 0, 110 → 0, 111 → 0
Color added to emphasize the
transitions of interest.

Slide 14 of 17 slides
Step 6: Select the Flip-Flops to Use
Use JK flip-flops (what a surprise). The excitation table for a JK flip-flop is given again: for the transition Q = 0 → 0, J = 0 and K = d (don't care); for 0 → 1, J = 1 and K = d; for 1 → 0, J = d and K = 1; for 1 → 1, J = d and K = 0.

Slide 15 of 17 slides
Step 7: Derive the Input Tables
Applying the excitation table to each flip-flop's transition gives the required inputs (d = don't care):
PS Q2Q1Q0 | J2 K2 | J1 K1 | J0 K0
000 | 0 d | 0 d | 1 d
001 | 0 d | 1 d | d 1
010 | 0 d | d 0 | 1 d
011 | 1 d | d 1 | d 1
100 | d 0 | 0 d | 1 d
101 | d 1 | 0 d | d 1
110 | d 1 | d 1 | 0 d
111 | d 1 | d 1 | d 1

Slide 16 of 17 slides
Step 8: Derive the Input Equations
Here they are:
J2 = Q1·Q0    J1 = Q2'·Q0   J0 = Q2' + Q1'
K2 = Q1 + Q0  K1 = Q2 + Q0  K0 = 1
There is no need to summarize the equations.

Boolean Satisfiability and Related Problems
Edward L. Bosworth, PhD
TSYS Department of Computer Science
Columbus State University, Columbus, GA 31907
bosworthedwardcolstateedu
Slide 1 of 37 slides, CPSC 5115, Revised on November 24, 2008

Boolean Satisfiability: Contents
1. Brief discussion of Boolean algebra
2. Definition of standard Boolean forms
3. SAT: the Boolean Satisfiability Problem
4. Relation of SAT to Boolean simplification
5. Discussion of Karnaugh maps as a method for Boolean simplification

Slide 2 of 37 slides

Boolean Algebra & Digital Logic
Boolean algebra was developed by the Englishman George Boole, who published the basic principles in the 1854 treatise "An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and Probabilities". The applicability to computing machines was discovered by three Americans:
Claude Shannon: "A Symbolic Analysis of Relay and Switching Circuits", 1938.
George Stibitz: an employee of Bell Labs, he developed a binary adder using mechanical relays in 1937, the "Model K" adder, because he built it at home on his kitchen table.
John Atanasoff: he was probably the first to use purely electronic relays (vacuum tubes) to build a binary adder.
Boolean algebra is a two-valued algebra based on the constant values denoted as either FALSE/TRUE
or 0/1. The use of this algebra for computation is based on the fact that binary arithmetic is based on two values, always called 0 and 1.

Slide 3 of 37 slides

Basic Boolean Operators
Boolean algebra is defined in terms of the two constants defined above, which we call 0 and 1; other courses will call these values F and T. Boolean algebra is defined in terms of three basic operators: NOT, AND, and OR. Each corresponds to a circuit element called a logic gate. We present the gates along with the definitions.
Logic NOT: a function of one Boolean variable, drawn as a triangle with a circle at its right end; the circle is important. Algebraically this function is denoted X-bar, X', or NOT(X). The evaluation of the function is simple: 0' = 1 and 1' = 0.

Slide 4 of 37 slides

Basic Boolean Operators, Part 2
Logic OR: This is a function of two Boolean variables. We denote the logical OR of two Boolean variables X and Y by X + Y. Some logic books will use X ∨ Y. The evaluation of the logical OR function is shown by a truth table:
X Y | X + Y
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1
The logical OR is extended to three or more inputs by noting that the function has value 1 if any of its inputs has value 1, and has value 0 only if all of its inputs have value 0.

Slide 5 of 37 slides

Basic Boolean Operators, Part 3
Logic AND: This is a function of two Boolean variables. We denote the logical AND of two Boolean variables X and Y by X·Y. Some logic books will use X ∧ Y. The evaluation of the logical AND function is shown by a truth table:
X Y | X·Y
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1
The logical AND is extended to three or more inputs by noting that the function has value 0 if any of its inputs has value 0, and has value 1 only if all of its inputs have value 1.

Slide 6 of 37 slides

Standard Boolean Forms
Standard forms are either canonical forms or normal forms. The standard expressions are written in one of the two
standard forms:
SOP: Sum of Products form, also called Disjunctive Normal Form.
POS: Product of Sums form, also called Conjunctive Normal Form.
Each of the two forms may be canonical or normal, so that we have: Canonical Sum of Products, Normal Sum of Products, Canonical Product of Sums, and Normal Product of Sums.
IMPORTANT: These forms use only the 3 basic Boolean functions AND, OR, NOT. Specifically, XOR is not used.

Slide 7 of 37 slides

Variables and Literals
We start with the idea of a Boolean variable. It is a simple variable that can take one of only two values: 0 (False) or 1 (True). Following standard digital design practice, we use the values 0 and 1. Following standard teaching practice, we denote all Boolean variables by "A", "B", "C", "D", ..., "W", "X", "Y", "Z". A literal is either a Boolean variable or its complement.
Literals based on the variable X: X and X'.
Literals based on the variable Y: Y and Y'.
NOTE: X and X' represent the same variable, but they are not the same literal. X and Y' represent different variables and are distinct literals.

Slide 8 of 37 slides

Product and Sum Terms
A product term is the logical AND of one or more literals, with no variable represented more than once. A sum term is the logical OR of one or more literals, with no variable represented more than once.
The following are all valid product terms over the two variables X and Y: X·Y, X·Y', X'·Y, X'·Y'.
Forms such as X·X·Y' and X·X'·Y are not considered: as X·X = X and X·X' = 0, we have X·X·Y' = X·Y' and X·X'·Y = 0·Y = 0.
The following are all valid sum terms over the two variables X and Y: X + Y, X + Y', X' + Y, X' + Y'.

Slide 9 of 37 slides

Sum of Products and Product of Sums
A SOP (Sum of Products) expression is the logical OR of product terms. A POS (Product of Sums) expression is the logical AND of sum terms.
Sample SOP expressions: F1(X, Y) = X·Y + X·Y' and G1(X, Y) = X'·Y + X·Y'.
Sample POS expressions: F2(X, Y) = (X + Y)·(X + Y') and G2(X, Y) = (X' + Y)·(X + Y').
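Expressions in either standard form can be evaluated mechanically for any assignment of truth values, which is useful for checking work by brute force. The sketch below is illustrative only; the representation and the function names are mine, not from the slides. A literal is a pair (variable name, complemented-flag), an SOP formula is a list of product terms, and a POS formula is a list of sum terms:

```python
from itertools import product

# Illustrative representation (not from the slides): a literal is a pair
# (variable_name, complemented); an SOP formula is a list of product terms,
# a POS formula is a list of sum terms; each term is a list of literals.

def literal(lit, assignment):
    name, complemented = lit
    return not assignment[name] if complemented else assignment[name]

def eval_sop(terms, assignment):
    # Sum of Products: OR over the terms, AND within each product term
    return any(all(literal(l, assignment) for l in term) for term in terms)

def eval_pos(terms, assignment):
    # Product of Sums: AND over the terms, OR within each sum term
    return all(any(literal(l, assignment) for l in term) for term in terms)

# A two-variable POS formula (X + Y)·(X + Y'), chosen only for illustration
f = [[("X", False), ("Y", False)],
     [("X", False), ("Y", True)]]
for x, y in product([False, True], repeat=2):
    print(f"X={x} Y={y} -> {eval_pos(f, {'X': x, 'Y': y})}")
```

Enumerating all 2^N assignments this way is exactly the brute-force check used later in the discussion of satisfiability; it is practical only for small N.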
Note: POS expressions almost always have parentheses to indicate the correct evaluation.

Slide 10 of 37 slides

Inclusion
A product term T1 is included in a product term T2 if every literal in T1 is also in T2. A sum term T1 is included in a sum term T2 if every literal in T1 is also in T2. Consider the following examples:
F(A, B, C) = A·B + A·C + A·B·C. Each of A·B and A·C is included in A·B·C.
G(A, B, C) = (A + B)·(A + C)·(A + B + C). Each of (A + B) and (A + C) is included in (A + B + C).
There is no inclusion in the next expression:
F(A, B, C) = A·B + A·C + A'·B·C. The literal A does not appear in the third term, and the literal B does not appear in the second term. The inclusion rule is based on literals, not just variables.

Slide 11 of 37 slides

More on Inclusion
Consider F1(A, B, C) = A·B + A·C + A·B·C and F2(A, B, C) = A·B + A·C. We claim that the two are equal for every value of A, B, and C.
Let A = 0. Clearly F1(A, B, C) = 0 = F2(A, B, C).
Let A = 1. Then F1(A, B, C) = B + C + B·C and F2(A, B, C) = B + C. Notice that we still have inclusion in F1, as each of B and C is included in B·C. We prove these versions of F1(A, B, C) = F2(A, B, C) using a truth table:
B C | B·C | B + C | B + C + B·C
0 0 | 0 | 0 | 0
0 1 | 0 | 1 | 1
1 0 | 0 | 1 | 1
1 1 | 1 | 1 | 1

Slide 12 of 37 slides

Last Word on Inclusion
If a SOP or POS expression has included terms, it can be simplified:
F1(A, B, C) = A·B + A·C + A·B·C is identically equal to F2(A, B, C) = A·B + A·C.
G1(A, B, C) = (A + B)·(A + C)·(A + B + C) is identically equal to G2(A, B, C) = (A + B)·(A + C).
The conclusion is that Boolean expressions with included terms are needlessly complicated; we can simplify them by the application of trivial rules. Note that duplication is a form of inclusion: the expression F3(A, B) = A·B + A·B has 2 terms, each included in the other.

Slide 13 of 37 slides

Normal and Canonical Forms
A Normal SOP expression is a Sum of Products expression with no included product terms. A Normal POS expression is a
Product of Sums expression with no included sum terms. A Canonical SOP expression over a set of Boolean variables is a Normal SOP expression in which each product term contains a literal for each of the Boolean variables. A Canonical POS expression over a set of Boolean variables is a Normal POS expression in which each sum term contains a literal for each of the Boolean variables.
Note: a canonical expression on N Boolean variables is made up of terms, each of which has exactly N literals.

Slide 14 of 37 slides

The CNF Boolean Satisfiability Problem Defined
We first give the definition of the problem in the terms used above.
Definition: Let B be a Boolean expression of N Boolean variables X1 to XN, expressed as a Normal Product of Sums (POS) expression. The expression is said to be satisfiable if and only if there is an assignment of truth values to the variables that causes the expression to evaluate to true. The SAT Problem (Satisfiability problem) is to determine whether a given Boolean expression is satisfiable.
Recall that a Boolean expression in Normal Product of Sums form is also said to be in CNF (Conjunctive Normal Form); it is the latter terminology that is commonly used in discussions of SAT. There is another form of the SAT problem, called DNF SAT, in which the Boolean expression is stated in Disjunctive Normal Form (also called Sum of Products). The two forms of the problem are logically equivalent.

Slide 15 of 37 slides

CNF Satisfiability: Two Easy Examples
Consider the following two examples of CNF SAT:
B1 = (X' + Y')·(X' + Y)·(X + Y')
B2 = (X' + Y')·(X' + Y)·(X + Y')·(X + Y)
The expression B1 is easily seen to be satisfiable. If X = 0 and Y = 0, then B1 = (1 + 1)·(1 + 0)·(0 + 1) = 1·1·1 = 1.
The expression B2 is shown to be not satisfiable, but this requires a bit more work; we shall show that on the next slide.
NOTE: Each of these two is easy for two reasons: (1) it is a small problem instance, and (2) it is expressed in Canonical POS. It is
easy to show that a CNF expression over N Boolean variables expressed in Canonical POS form is satisfiable if and only if it has fewer than 2^N terms.

Slide 16 of 37 slides

CNF Satisfiability: Two Easy Examples, Part 2
If X = 0 and Y = 0, then B2 = (1 + 1)·(1 + 0)·(0 + 1)·(0 + 0) = 1·1·1·0 = 0.
If X = 0 and Y = 1, then B2 = (1 + 0)·(1 + 1)·(0 + 0)·(0 + 1) = 1·1·0·1 = 0.
If X = 1 and Y = 0, then B2 = (0 + 1)·(0 + 0)·(1 + 1)·(1 + 0) = 1·0·1·1 = 0.
If X = 1 and Y = 1, then B2 = (0 + 0)·(0 + 1)·(1 + 0)·(1 + 1) = 0·1·1·1 = 0.

Slide 17 of 37 slides

Standard Forms of CNF SAT
A Boolean expression is said to be in k-CNF (k-conjunctive normal form) if it is in CNF and each of its terms contains exactly k literals. The following expression is in 2-CNF; note that the expression is defined over three Boolean variables, but each term contains only two literals:
F2(A, B, C) = (A + B)·(A + C)·(B + C)
In terms of digital design, this is the Normal POS expression of the carry output of a full adder. The two most commonly discussed forms of CNF satisfiability are 2-SAT (more properly called 2-CNF SAT) and 3-SAT (more properly called 3-CNF SAT). There exists a polynomial time algorithm to determine the satisfiability of any expression in 2-CNF; this is a special case.

Slide 18 of 37 slides

3-SAT
The problem 3-SAT refers to the determination of whether or not a Boolean expression in 3-conjunctive normal form is satisfiable. Recall that a Boolean expression over N Boolean variables is said to be in 3-CNF if each of its terms contains exactly three literals. There are a number of significant facts about 3-CNF satisfiability:
1. It can be shown that any Boolean expression over N Boolean variables can be converted to a logically equivalent Boolean expression in 3-CNF.
2. There is no known polynomial algorithm that will determine whether or not a 3-CNF expression is satisfiable.
3. To date, there has been no proof that the problem is
intractable; that is, no proof that no polynomial algorithm can solve it.
4. This problem was first posed in 1971 and has been examined extensively since that time.
In a theoretical sense, the 3-SAT problem is just as difficult as any variety of the problem. As it is easier to discuss, it is the more popular variant.

Slide 19 of 37 slides

Circuit Satisfiability
As a problem, SAT originated in formal logic and migrated into Boolean algebra. The problem can be expressed in terms of the Boolean combinational circuits that are used in digital design and commonly discussed in courses on Computer Organization and Computer Architecture. A Boolean combinational circuit with N Boolean inputs is said to be satisfiable if and only if there is a truth assignment to the inputs that causes the output to be 1. The Circuit Satisfiability Problem is thus: given a Boolean combinational circuit composed of AND, OR, and NOT gates, is it satisfiable?
One of the interesting results of this equivalence is that one can use some of the tools developed for simplification of Boolean expressions in digital design to determine the satisfiability of any Boolean expression. Put another way, both Karnaugh Maps and the Quine-McCluskey method will solve the 3-SAT problem; unfortunately, neither is very good at it. We now discuss Karnaugh Maps (also called K-Maps) and then apply the results to an understanding of Boolean satisfiability.

Slide 20 of 37 slides

Logical Adjacency
K-Maps are a standard method for simplification of Boolean expressions. The difference between K-Maps and Quine-McCluskey (Q-M) is that the K-Map is well suited to manual solutions of small problems, while Q-M lends itself to automated solutions. Logical adjacency is the basis for all Boolean simplification methods. The K-Map approach is a manual procedure that transforms logical adjacency into physical adjacency on the paper; simplification is done by inspection. The key
idea behind logical adjacency is expressed in the following simplifications of Boolean expressions:
X·Y + X·Y' = X·(Y + Y') = X·1 = X
(X + Y)·(X + Y') = X·X + X·Y' + Y·X + Y·Y' = X + X·Y' + X·Y + 0 = X + X·(Y' + Y) = X + X·1 = X + X = X

Slide 21 of 37 slides

Logical Adjacency, Part 2
Two Boolean terms are said to be logically adjacent when they contain the same variables and differ in the form of exactly one variable. Put another way, only one literal is different: a variable appears negated in one term and is not negated in the other term, and all other variables appear in the same way. Consider the following lists of terms:
List 1: X, X'
List 2: X·Y, X·Y', X'·Y', X'·Y
List 3: X + Y, X + Y', X' + Y', X' + Y
In each of the lists, each term is logically adjacent not only to the term before it but also to the term following it. In particular, X·Y' is adjacent to both X·Y and X'·Y', and (X + Y') is adjacent to both (X + Y) and (X' + Y'). In other words, the adjacency list is circular: the first item follows the last one.

Slide 22 of 37 slides

K-Maps for Small Variable Count: 2, 3, and 4
Some textbooks show K-Maps for 5 and 6 variables; these are so complex as to be useless for manual computation, and in those cases the Quine-McCluskey method is used. K-Maps are shown as rectangles. (Figure: K-Maps for 2, 3, and 4 variables.) A row or column assigned to two variables is labeled in the order 00, 01, 11, 10. These labels are in the form of a Gray code, in which each term differs in exactly one bit from the terms preceding and following it; this makes logical adjacency match physical adjacency.

Slide 23 of 37 slides

The Complete K-Map and Its Common Forms
Every entry in a K-Map is either a 0 or a 1; these are the two values that a Boolean function can assume. While a complete K-Map has all entries filled, all common forms have only a subset filled: either the 1's are placed and the 0's omitted, or vice versa. (Figure: three K-Maps containing identical information.) The second K-Map shows the entries with values equal to 1; those entries not shown have value 0. The third K-Map shows the entries with values equal to 0; those entries not shown have value 1.

Slide 24 of 37 slides

Creating a K-Map
The standard K-Map is used to simplify Boolean expressions in canonical form (each term has exactly one literal for every Boolean variable). The method can be expanded to Boolean expressions not in canonical form.
SOP expressions: place a 1 in the K-Map for every term in the expression. The positions not filled with a 1 are assumed to have a 0.
POS expressions: place a 0 in the K-Map for every term in the expression. The positions not filled with a 0 are assumed to have a 1.
(Figure: a POS K-Map and an SOP K-Map.)

Slide 25 of 37 slides

Locating Terms for Sum of Products Expressions
In SOP, each term is to be represented by a 1 in the K-Map. Where should that term be put? Apply the SOP copy rule to each of the terms:
1. Write the variables in a standard, uniform order.
2. Replace each complemented variable with a 0; replace each plain variable with a 1.
Consider F(X, Y, Z):
the term X'·Y'·Z' corresponds to 0 0 0
the term X'·Y·Z corresponds to 0 1 1
the term X·Y'·Z corresponds to 1 0 1
the term X·Y·Z' corresponds to 1 1 0
the term X·Y·Z corresponds to 1 1 1

Slide 26 of 37 slides

Locating the SOP Terms, Part 2
Let us examine the SOP expression F(X, Y, Z) = X'·Y·Z + X·Y'·Z + X·Y·Z' + X·Y·Z. We place four 1's, one for each of the terms:
For X'·Y·Z, the 1 is placed at X = 0, Y = 1, Z = 1.
For X·Y'·Z, the 1 is placed at X = 1, Y = 0, Z = 1.
For X·Y·Z', the 1 is placed at X = 1, Y = 1, Z = 0.
For X·Y·Z, the 1 is placed at X = 1, Y = 1, Z = 1.
(Figure: the K-Map with these four cells marked.)

Slide 27 of 37 slides, CPSC 5115, Revised on November 24, 2008. Boolean
Satisfiability: Combining the SOP Terms
Terms are combined into rectangles containing 2, 4, 8, or 16 squares. Valid combinations: 2-by-1, 1-by-2, 4-by-1, 2-by-2, 1-by-4, 8-by-1, 4-by-2, 2-by-4, 1-by-8, etc. Terms can be combined if the squares are adjacent, either vertically or horizontally. Any term can be a part of more than one combination.
The terms X'·Y·Z and X·Y·Z, represented as 011 and 111, combine to _11, or Y·Z.
The terms X·Y'·Z and X·Y·Z, represented as 101 and 111, combine to 1_1, or X·Z.
The terms X·Y·Z' and X·Y·Z, represented as 110 and 111, combine to 11_, or X·Y.
F(X, Y, Z) = X·Y + X·Z + Y·Z. Note: 111 is used three times.

Slide 28 of 37 slides

Simplification Using Algebra Alone
The straight algebraic simplification illustrates the K-Map approach. It uses two basic Boolean identities: for any X, X + X = X; and for any X, X + X' = 1. Given
F(X, Y, Z) = X'·Y·Z + X·Y'·Z + X·Y·Z' + X·Y·Z
This is F(X, Y, Z) = X'·Y·Z + X·Y·Z + X·Y'·Z + X·Y·Z + X·Y·Z' + X·Y·Z
This is F(X, Y, Z) = (X' + X)·Y·Z + X·(Y' + Y)·Z + X·Y·(Z' + Z)
This is F(X, Y, Z) = Y·Z + X·Z + X·Y, or F(X, Y, Z) = X·Y + X·Z + Y·Z.

Slide 29 of 37 slides

Locating Terms for Product of Sums Expressions
In POS, each term is to be represented by a 0 in the K-Map. Where should that term be put? Apply the POS copy rule to each of the terms:
1. Write the variables in a standard, uniform order.
2. Replace each complemented variable with a 1; replace each plain variable with a 0.
Consider F(X, Y, Z):
the term (X + Y + Z) corresponds to 0 0 0
the term (X + Y + Z') corresponds to 0 0 1
the term (X + Y' + Z) corresponds to 0 1 0
the term (X' + Y + Z) corresponds to 1 0 0
the term (X' + Y' + Z) corresponds to 1 1 0

Slide 30 of 37 slides

Locating the POS Terms, Part 2
Let us examine the POS expression F(X, Y, Z) = (X + Y + Z)·(X + Y + Z')·(X + Y' + Z)·(X' + Y + Z). We place four 0's, one for each of the terms:
For (X + Y + Z), the 0 is placed at X = 0, Y = 0, Z = 0.
For (X + Y + Z'), the 0 is placed at X = 0, Y = 0, Z = 1.
For (X + Y' + Z), the 0 is placed at X = 0, Y = 1, Z = 0.
For (X' + Y + Z), the 0 is placed at X = 1, Y = 0, Z = 0.
(Figure: the K-Map with these four cells marked.)

Slide 31 of 37 slides

Example: K-Map on 3 Variables
(Figure: a K-Map.) The K-Map shown above can be shown to represent the Boolean expression
F(X, Y, Z) = (X + Y + Z)·(X + Y + Z')·(X + Y' + Z)·(X' + Y + Z)
This is a K-Map representation of a POS (Product of Sums) expression, also called an expression in Conjunctive Normal Form. The K-Map procedure works by grouping adjacent squares. Here are the adjacencies:
000 and 010 group to form 0_0, representing the term (X + Z).
100 and 000 group to form _00, representing the term (Y + Z).
000 and 001 group to form 00_, representing the term (X + Y).
The expression is F(X, Y, Z) = (X + Y)·(X + Z)·(Y + Z).

Slide 32 of 37 slides

Handling Non-Canonical Terms
Suppose we have F(X, Y, Z) = (X + Y)·(X + Y' + Z)·(X' + Y + Z). How do we handle this?
For (X + Y' + Z), denoted as 010, the 0 is placed at X = 0, Y = 1, Z = 0.
For (X' + Y + Z), denoted as 100, the 0 is placed at X = 1, Y = 0, Z = 0.
But what about (X + Y), which is denoted as 00_? The answer is to view the term as equivalent to the two terms (X + Y + Z), denoted as 000, and (X + Y + Z'), denoted as 001. For the term (X + Y), we would fill boxes 000 and 001, each with a 0.

Slide 33 of 37 slides

A K-Map for CNF Satisfiability
Consider the Boolean expression B3 = (W' + X)·W·(X' + Y')·(X' + Z). Is this satisfiable? (Figure: the K-Map generated for this expression, with a 0 plotted for each sum term.) Note that there is one square in the K-Map that is not covered by these terms; it corresponds to the term (W' + X' + Y + Z'). The assignments that make this missing term equal 0 are W = 1, X = 1, Y = 0, and Z = 1. For these values, B3 = (0 + 1)·1·(0 + 1)·(0 + 1) = 1·1·1·1 = 1. B3 is satisfiable.

Slide 34 of 37 slides

An Obvious Comment on the Above
Consider again the expression B3 = (W' + X)·W·(X' + Y')·(X' + Z). It should be obvious that if W = 0, then B3 = 0; the only way to get B3 = 1 demands the assignment W = 1. Assuming W = 1, we have
B3 = (0 + X)·1·(X' + Y')·(X' + Z)
   = X·(X' + Y')·(X' + Z)
   = (X·X' + X·Y')·(X' + Z)
   = X·Y'·(X' + Z)
   = X·Y'·X' + X·Y'·Z
   = X·Y'·Z
We now have W = 1, X = 1, Y = 0, and Z = 1.

Slide 35 of 37 slides

Another K-Map for SAT
Consider the Boolean expression B4 = (W + Y)·(W + Y')·(W' + Y)·(W' + Y'). (Figure: the K-Map generated for this expression; every square is covered by a 0.) As a K-Map, this simplifies to B4(W, X, Y, Z) = 0; it is not satisfiable.

Slide 36 of 37 slides

Complexity and Intractability: A Discussion of NP-Completeness
Edward L. Bosworth, PhD
TSYS Department of Computer Science
Columbus State University, Columbus, GA
Slide 1 of 37 slides, Created November 27, 2008

Complexity and Intractability: Basic Terms and a Reference
We begin with a few definitions, taken directly from Reference 1.
A problem is a general question to be answered, usually possessing several parameters, or free variables, whose values are left unspecified. A problem is usually specified by giving: (1) a general description of all its parameters, and (2) a statement of what properties the answer, or solution, is required to satisfy.
An instance of a problem is obtained by specifying the problem and the values of all of its parameters.
An algorithm is a general step-by-step procedure for solving a problem. It is required that the algorithm can be applied to any instance of the problem and produce a solution for that instance.
The primary reference for all work in algorithmic complexity is the classic text: Computers and Intractability, Michael R. Garey and David S. Johnson, W. H. Freeman & Company, New York, 1979, ISBN 0-7167-1044-7. As of November 2008, this text was still in print.

Slide 2 of 37 slides

Algorithm: Definition
An algorithm is a finite set of instructions which, if followed, will accomplish a particular task. In addition, every algorithm must satisfy the following criteria:
(i) input: there are zero or more quantities which are externally supplied;
(ii) output: at least one quantity is produced;
(iii) definiteness: each instruction must be clear and
unambiguous;
(iv) finiteness: if we trace out the instructions of the algorithm, then for all valid cases the algorithm will terminate after a finite number of steps;
(v) effectiveness: every instruction must be sufficiently basic that it can, in principle, be carried out by a person using only a pencil and paper. It is not enough that each operation be definite as in (iii); it must be feasible. [Ref. 3]

Slide 3 of 37 slides

Algorithmic Effectiveness
Note the above requirement that an algorithm, in principle, be carried out by a person using only a pencil and paper. This limits the operations that are allowable in a pure algorithm. The basic list is addition and subtraction, multiplication and division, and the operations immediately derived from these. One might argue for expansion of this list to include a few other operations that can be done with pencil and paper. I know an easy algorithm for taking the square root of integers and can do that with a pencil on paper; however, this algorithm will produce an exact answer only for those integers that are the square of other integers (their square root must be an integer). Operations such as the trigonometric functions are not basic operations. These can appear in algorithms only to the extent that they are understood to represent a sequence of basic operations that can be well defined.

Slide 4 of 37 slides

More on Effectiveness
Valid examples:
Sort an array in non-increasing order. (Note: the step does not need to correspond to a single instruction.)
Find the square root of a given real number to a specific precision.
Invalid examples:
Find four integers A > 0, B > 0, C > 0, and N ≥ 3 such that A^N + B^N = C^N. (This is generally thought to be impossible.)
Find the exact square root of an arbitrary real number. The square root of 4.0 is exactly 2.0; the square root of 7.0 is some number between 2.645751311064590 and 2.645751311064591. This last example becomes valid only
when we specify a precision that is required for the answer; the above has 15 significant digits.

Slide 5 of 37 slides

Difficulty
Some problems are easy to solve. Some problems are hard to solve. Some problems cannot be solved at all. The issue here is a theoretical analysis of problems in general, with the goal of classifying problems by the difficulty of their solutions. We shall discover a different way to view the classes stated above:
There are some problems for which every instance of the problem may be solved efficiently.
There are some problems for which some instances can be solved efficiently, but some instances resist efficient solution.
There are some problems for which almost no problem instances admit efficient solutions.

Slide 6 of 37 slides

Garey & Johnson's Example
We place this subject in context by quoting from an example in this standard text. Your boss gives you a difficult problem to solve: you are to determine if a specific component can be manufactured to meet a given set of specifications. Some weeks later, you have failed to discover any method for solving the problem that is more efficient than searching all possible designs and checking each for compliance. You go to your boss to report the lack of progress. What do you say? "I can't find an algorithm. I guess I'm just too dumb."

Slide 7 of 37 slides

Garey & Johnson's Example, Continued
There is a better way to approach your boss than to suggest that he lay you off. You could prove that the problem is intractable, in which case you could say: "I can't find an efficient algorithm, because no such algorithm is possible!" Unfortunately, it is extremely difficult to prove that a problem is inherently intractable, in the sense that efficient algorithms to solve it cannot exist. Indeed, there is a class of problems called NP-Complete with the property that anyone proving any of these
problems inherently intractable would gain instant world fame.

Slide 8 of 37 slides

Garey & Johnson's Example, Part 3
The theory of NP-Completeness will give you a third option for approaching your boss. (Figure: the well-known cartoon of a long line of researchers.) "I can't find an efficient algorithm, but neither can all these famous people." Proof that a problem is NP-Complete does require some skill, which is mostly gained from experience working with such problems. A successful proof would have a number of advantages:
1. It would show that firing you and hiring another expert would do no good.
2. It would allow you to examine a number of other approaches.

Slide 9 of 37 slides

Time Complexity: Polynomial and Otherwise
The time complexity function for an algorithm expresses its time requirement by giving, for each possible input length, the largest amount of time needed by the algorithm to solve a problem instance of that size. The input size for a problem instance is the number of symbols (characters) required to describe that instance using the most efficient unambiguous representation. In complexity theory we are interested in the upper limit of the complexity; this leads to the Big-O notation.
Definition: A function F(N) is O(G(N)) whenever there exist constants C > 0 and N0 > 0 such that F(N) ≤ C·G(N) for all N ≥ N0.
Informally, the classes include polynomial (linear, quadratic, cubic), logarithmic, and exponential; other classes do exist. The time complexity function for a problem is defined in terms of the time complexity functions of the algorithms that solve that problem. For example, some algorithms sort in time that is O(N^2) and others in time that is O(N·log N). As we are interested in worst-case complexity, and N·log N < N^2, we can say that the problem of sorting has polynomial time complexity.

Slide 10 of 37 slides

Time Complexity: Examples and Discussion
Here
is a table generated by MS Excel, adapted from Garey & Johnson [Ref. 1]. The basic assumption is that the time to solve a problem of size 1 is 0.0001 second (100 microseconds): T(1) = 1.0E-4. Times below are in seconds.

Complexity    N = 10     N = 20     N = 30       N = 40      N = 50      N = 60
Log N         0.0001000  0.0001301  0.0001477    0.0001602   0.0001699   0.0001778
Linear        0.0010     0.0020     0.0030       0.0040      0.0050      0.0060
N*Log N       0.0010     0.0026     0.0044       0.0064      0.0085      0.0107
Quadratic     0.010      0.040      0.090        0.160       0.250       0.360
Cubic         0.100      0.800      2.700        6.400       12.500      21.600
N^5           10.0       320.0      2430.0       1.0240E04   3.1250E04   7.7760E04
2^N           0.1024     104.9      1.073742E05  1.0995E08   1.1259E11   1.1529E14
3^N           5.905      3.487E05   2.059E10     1.216E15    7.179E19    4.239E24
N!            362.88     2.433E14   2.6525E28    8.1592E43   3.0414E60   8.3210E77

The basic point of this chart is to show that, while polynomial functions do grow fast, exponential functions grow at truly explosive rates. We shall use these observations to divide problems into two major classes: polynomial time and exponential time.

Slide 11 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Time Complexity, Suppose a Faster Machine. Here we quote another table from Garey & Johnson. Suppose a given computer can solve a problem instance of size N in one hour. Suppose we change the machine to one either 100 times faster or 1000 times faster. The size of the problem now solvable in an hour depends on the algorithm's time complexity. Note that the size barely increases at all for exponential-time algorithms.

Time Complexity   Present Computer   100 times faster   1000 times faster
N                 N1                 100*N1             1000*N1
N^2               N2                 10.0*N2            31.6*N2
N^3               N3                 4.64*N3            10.0*N3
N^5               N4                 2.51*N4            3.98*N4
2^N               N5                 N5 + 7             N5 + 10
3^N               N6                 N6 + 4             N6 + 6

As an example, suppose the computer can solve a problem instance of size 1000 in one hour. Now we make the computer 1000 times faster. If T(N) is O(N^3), the faster computer can solve a problem instance of size 10000. If T(N) is O(2^N), the faster computer can solve a problem instance of size 1010.

Slide 12 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Some Inequalities and Classifications. Here are a few observations. For N
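The first table can be regenerated with a short script. A sketch under the slide's own assumptions: 0.0001 second per "step", with the Log N row taken base 10 (which is what the table's values imply); the dictionary of growth functions is my own scaffolding.

```python
import math

# Step counts for each complexity class; logs are base 10 to match the table.
complexities = {
    "log N":   lambda n: math.log10(n),
    "N":       lambda n: n,
    "N log N": lambda n: n * math.log10(n),
    "N^2":     lambda n: n ** 2,
    "N^3":     lambda n: n ** 3,
    "2^N":     lambda n: 2 ** n,
}

STEP_TIME = 1.0e-4  # seconds per step: T(1) = 1.0E-4, as in the table

for name, f in complexities.items():
    # One row of the table: running time in seconds at N = 10, 20, ..., 60.
    row = ["%11.4g" % (f(n) * STEP_TIME) for n in (10, 20, 30, 40, 50, 60)]
    print("%-8s %s" % (name, " ".join(row)))
```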
> 10, 1 < log N < N, so N < N*log N < N^2. For N >= 7, 2^N < 3^N < N!. The latter is a simple inductive proof, noting that 3^7 = 2187 and 7! = 5040; use N = 7 for the base case. If N! > 3^N, then (N+1)! = (N+1)*N! > (N+1)*3^N > 3*3^N = 3^(N+1). Here are the classifications. Logarithmic: T(N) is O(log N). Polynomial: T(N) is O(N), O(N*log N), O(N^2), O(N^2*log N), O(N^3), etc. Super-polynomial: T(N) is, for example, O(N^log N). Exponential: T(N) is O(2^N), O(3^N), O(N!). Presburger Arithmetic is doubly exponential: T(N) is O(2^(2^N)).

Slide 13 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Counting vs. Enumeration. At this point we make an important distinction between counting problems and enumeration problems. By definition, enumeration problems are almost always intractable; counting problems often are not. Consider the permutations of the finite subset of integers denoted by {1, 2, ..., N}, say {1, 2, 3, 4}. There are exactly N! distinct permutations of this set. The value N! may be determined by N - 1 multiplications; the counting problem is O(N). The enumeration problem calls for listing each permutation. As each of the N! answers must be listed, the problem takes time at least proportional to N!. By definition, that is an intractable problem. 4! = 24: 1234 1243 1324 1342 1423 1432 2134 2143 2314 2341 2413 2431 3124 3142 3214 3241 3412 3421 4123 4132 4213 4231 4312 4321. In general, this type of intractability is seen as a sign that the problem is not well formulated or defined realistically. The theory focuses on problems that are inherently intractable in that no solution runs in polynomial time.

Slide 14 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Inherently Intractable Problems. Informally, a problem is inherently intractable if it can be proved that there is no fast algorithm for it. More formally, a problem is inherently intractable if it can be proved that any algorithm that solves the problem must have exponential time complexity. According to Garey & Johnson: "Most exponential time algorithms are merely variations on exhaustive search, whereas polynomial time algorithms generally are made possible only through the gain of
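The counting-versus-enumeration gap above is easy to demonstrate: counting the permutations of {1, ..., N} costs N - 1 multiplications, while enumerating them must produce all N! outputs no matter how cleverly it is coded. A sketch using the standard library:

```python
import itertools
import math

# Counting: N-1 multiplications, O(N).
def count_permutations(n):
    return math.factorial(n)

# Enumeration: the output itself has N! entries, so no algorithm can be
# faster than N! -- this is the slide's intractable-by-definition case.
def enumerate_permutations(n):
    return list(itertools.permutations(range(1, n + 1)))

print(count_permutations(4))           # → 24
print(len(enumerate_permutations(4)))  # → 24 tuples, 1234 through 4321
```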
some deep insight into the structure of a problem." We might extend the notion of problem classifications to include a new term: probably intractable. (This is your instructor's terminology.) A problem may be said to be probably intractable when there is no known algorithm of polynomial time complexity that solves it, and it appears likely that none exists. Note that the inability to discover a polynomial-time algorithm for a problem, despite over 30 years of hard work, is not the same as a proof that such an algorithm cannot exist.

Slide 15 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Decision Problems. As a matter of theoretical convenience, the theory of NP-Completeness is designed to be applied only to decision problems. A decision problem is one that has a Yes/No answer, e.g., "is this list of numbers sorted?" Abstractly, a decision problem P consists simply of a set of problem instances, denoted D(P), and a subset Y(P) of D(P) of Yes instances: instances for which a proper algorithm solves the decision problem and returns the result "Yes". Examples. Is the integer a square of another integer? D(P) = the set of non-negative integers (negative integers, not being the square of any integer, are not valid instances of the problem); Y(P) = {0, 1, 4, 9, 16, 25, etc.}. Is the integer a prime number? D(P) = the set of integers greater than 1, that is, {2, 3, 4, etc.}; by definition, primality is restricted to the positive integers, and 1 is not a prime number. Y(P) = {2, 3, 5, 7, 11, 13, etc.}.

Slide 16 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Verification and Certificates. One of the key processes in the study of NP-Completeness is the verification of an answer, usually a Yes answer. Here we are not asking to find a solution, but to verify that an alleged solution really is correct. Quite often the method uses what is called a certificate. Consider a problem P and an instance I in Y(P); for this problem instance, the algorithm returns a Yes answer. The inclusion of I in Y(P) is verified by passing the problem description and
its certificate, seen as a pair (I, C(I)), to an algorithm R that recognizes that the instance I really does have a Yes answer. What we shall want is for algorithm R to operate on the pair (I, C(I)) in polynomial time. Quite often R is a linear-time algorithm.

Slide 17 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Example, Travelling Salesman. Air travel for five cities: Atlanta, Dallas, Mexico City, Miami, and Seattle. These are actual air fares for direct connections. [The fare map did not survive extraction; one legible fare is 960 on the SEA leg.] Note that the triangle inequality does not apply to air fares: consider MIA to DFW at 865 vs. MIA to ATL to DFW at 232. This illustrates an instance of the TSP (Travelling Salesman Problem). Question: does there exist a tour with total cost less than 2600?

Slide 18 of 37 slides. Created November 27, 2008.

Complexity and Intractability: The Certificate for This Instance. For this instance, the certificate is the list of cities visited in order, together with the total cost. The problem instance I is best described as the array of inter-city costs; the problem size is the number of cities to be visited, here 5. C(I): SEA, ATL, DFW, MEX, MIA, SEA. [The per-leg fares did not survive extraction.] This can be shown in linear time to have a total cost of 2540. As this cost is less than 2600, we have recognized the instance as belonging to Y(P).

Slide 19 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Encoding Decision Problems. Recall that a decision problem is one that has only two answers: Yes and No. Consider a problem P, and D(P), the set of all valid instances of the problem. An encoding scheme for the problem P is an efficient way to describe each instance of the problem. Each encoding scheme is based on an alphabet of symbols, Sigma. Recall that we use Sigma* to denote the set of all finite-length strings over the alphabet. There are two conditions that describe the idea of a reasonable encoding scheme: 1) the encoding of an instance I should be concise and not be padded with unnecessary information or symbols, and 2) numbers occurring in I should be represented in binary or
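The linear-time recognizer R for a TSP certificate can be sketched directly: one pass checks that the tour visits each city exactly once, returns to the start, and beats the bound. The fare values below are hypothetical placeholders (the slide's actual fare matrix did not survive extraction); only the shape of the check follows the slide.

```python
# R takes the instance I (the fare table) and the certificate C(I)
# (a proposed tour) and decides membership in Y(P) in O(#cities) time.
def verify_tour(fares, tour, bound):
    cities = set(fares)
    # A tour lists every city once and returns to its starting city.
    if tour[0] != tour[-1] or len(tour) != len(cities) + 1 or set(tour) != cities:
        return False
    cost = sum(fares[a][b] for a, b in zip(tour, tour[1:]))
    return cost < bound

# Hypothetical symmetric fares for four of the slide's cities.
fares = {
    "ATL": {"DFW": 232, "MIA": 250, "SEA": 960},
    "DFW": {"ATL": 232, "MIA": 865, "SEA": 700},
    "MIA": {"ATL": 250, "DFW": 865, "SEA": 980},
    "SEA": {"ATL": 960, "DFW": 700, "MIA": 980},
}
print(verify_tour(fares, ["ATL", "DFW", "SEA", "MIA", "ATL"], 3000))  # → True
```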
any equivalent notation; base-1 notation is not to be used. This encoding must be based on an alphabet. Quite often the assumption is that Sigma = {0, 1} and that binary encodings such as ASCII are used for non-numerics. Consider a specific instance of the problem, I in D(P). Let e = e(I) be the encoding of the problem instance. Then e is in Sigma*: it is a sequence of symbols from Sigma.

Slide 20 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Encodings and Languages. Consider a decision problem P, which is described with symbols from an alphabet Sigma. The problem and its encoding scheme partition the set Sigma* into three classes of strings: the set of strings that are not encodings of instances of the problem, the set of strings that encode instances of P for which the answer is No, and the set of strings that encode instances of P for which the answer is Yes. We may use this to formulate an equivalence between decision problems and language-recognition problems. Define the language L[P, e] = { x in Sigma* : Sigma is the alphabet used by the encoding scheme e = e(P), and x is the encoding under e of an instance I in Y(P) }. For each decision problem P and encoding scheme e = e(P), it is possible to construct a Turing Machine T with the property that any problem instance I in Y(P) is equivalent to a string accepted by the Turing Machine. The set Y(P) can be considered as the language accepted by the Turing Machine.

Slide 21 of 37 slides. Created November 27, 2008.

Complexity and Intractability: The Problem Class P. The problem class P is the set of decision problems for which there is an algorithm A and an encoding e such that, for every instance I in D(P), the algorithm will produce a solution in time O(N^K) for fixed K. Informally, P is the set of easy decision problems. More formally, a problem belongs to the class P if there exists a DTM (Deterministic Turing Machine) that will recognize the encoding of any instance I in Y(P) in polynomial time. Each valid description of an instance of the problem for which the answer is Yes is often called
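The ban on base-1 notation can be made concrete: the encoding of the number N has length proportional to log N in binary but length N in unary, so a unary encoding inflates the instance size exponentially. A minimal sketch:

```python
# Encoding length of the number N: binary grows like log2(N),
# unary (base-1, a run of N marks) grows like N itself.
n = 1000
binary_len = len(format(n, "b"))  # number of bits in the binary encoding
unary_len = len("1" * n)          # number of marks in a unary encoding

print(binary_len, unary_len)  # → 10 1000
```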
a word in the language L[P, e]. It will become important later to note that any problem in the class P may have a proposed answer verified in polynomial time: if nothing else, just solve the problem in polynomial time, and the solution is automatically verified.

Slide 22 of 37 slides. Created November 27, 2008.

Complexity and Intractability: The Problem Class NP. There are many correct ways to describe the problem class NP. Again, all problems in this class are decision problems, which are equivalent to language-recognition problems. Here is one accurate definition. A decision problem belongs to NP if there is an algorithm A that does the following. 1) Associated with each word of the language (each instance I for which the answer is Yes) there is a certificate C(I) such that, when the pair (I, C(I)) is input to algorithm A, it recognizes that I belongs to the language. 2) If I is some word that does not belong to the language, then there is no choice of certificate C(I) that will cause A to recognize I as belonging to the language. 3) The algorithm A operates in polynomial time. We have just shown that the above three criteria hold for any problem in the class P. In set notation, it is obviously the case that P is a subset of NP. The big question: is P = NP? More on this later.

Slide 23 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Deterministic Turing Machines. Here is a depiction of a standard 1-tape Deterministic Turing Machine. The specification of the Turing Machine includes the following. 1) A finite set Gamma of tape symbols, including a subset Sigma of input symbols and a distinguished blank symbol. 2) A finite set Q of states for the finite automaton that serves as the control of the Turing Machine. This set must include three distinguished states: the start state, denoted q0; a halt state qY, called the accept state; and a halt state qN, called the reject state. 3) A transition function that causes the finite-state controller to move through the states, possibly reading from the tape
and writing to the tape as it goes.

Slide 24 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Operation of a Turing Machine. At the start, the finite-state controller is in state q0, the read/write head is accessing square 0 of the tape, and the input string x is on squares 1 through |x| of the tape. The Turing Machine begins by moving to square 1 and reading the first symbol of the string. After that, it proceeds as directed by the transition function. There are three possible outcomes: 1) it arrives at the accept state qY, at which time it halts and accepts the string; 2) it arrives at the reject state qN, at which time it halts and rejects the string; 3) it loops, which is to say it never arrives at a halt state but loops infinitely. In general, we say that a DTM M with input alphabet Sigma accepts x in Sigma* if and only if M halts in state qY when it is applied to input x. If x is a string not in the language L(M), then the computation on M might halt in the state qN, or it might continue forever without halting.

Slide 25 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Nondeterministic Computation and the Class NP. All models of computation, including Finite State Machines and Turing Machines, have two major classes: deterministic and nondeterministic. The theory of NP-Completeness is based on the theoretical construct called a NDTM, a Non-Deterministic Turing Machine. Such a machine cannot be designed or built. At this stage it is very important to define nondeterministic computing in terms of what it is not: it is not random computing. Consider a decision problem such as TSP (the Travelling Salesman Problem). At each city, the algorithm must make a choice of what city to visit next. A deterministic algorithm would pick a city after some computation; after the tour is completely calculated, the algorithm would then try a large number of other tours. A random algorithm would pick a city at random at each stage; the tour generated would be chaotic. A nondeterministic algorithm would
pick exactly one city to be next, but it would always be the correct city. There is no randomness here; it is a perfect guesser.

Slide 26 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Non-Deterministic Turing Machines. A Non-Deterministic Turing Machine (NDTM) is a Deterministic Turing Machine (DTM) to which has been added one new component: a guessing module. The guessing module is used to write the guess onto the tape for analysis by the finite-state control of the DTM. After the guessing module has written its guess onto the tape, it ceases functioning and the deterministic part begins computation. At this point the tape holds two strings: 1) the encoded representation of the problem instance, and 2) the encoded answer from the nondeterministic guessing module. [The diagram of the guessing module, finite-state control, read/write head, and tape did not survive extraction.] The guesser is always correct; at this stage, the DTM part of the device always accepts the string and terminates in its Yes state.

Slide 27 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Comments on the Class NP. The class NP can be defined as that set of decision problems that, under reasonable encoding schemes, can be solved in polynomial time by a NDTM. This is equivalent to solvability by a polynomial-time nondeterministic algorithm. While the formal definition of the class NP rests on polynomial-time solvability by a nondeterministic algorithm, the essence of the class is far simpler. There are some problems that can be solved in polynomial time. There are more problems for which a proposed solution can be verified in polynomial time. To quote Garey & Johnson: "It is this notion of polynomial time verifiability that the class NP is intended to isolate. It should be evident that a polynomial time nondeterministic algorithm is basically a definitional device for capturing the notion of polynomial time verifiability, rather than a realistic method of solving decision problems." If there does not exist an
algorithm that can verify a solution in polynomial time, it is quite evident that no algorithm can solve the problem in polynomial time.

Slide 28 of 37 slides. Created November 27, 2008.

Complexity and Intractability: The Relationship Between P and NP. To show the well-known result that P is a subset of NP, we note the following. Every decision problem solvable by a polynomial-time deterministic algorithm is also solvable by a polynomial-time nondeterministic algorithm. To see this, one simply needs to observe that any deterministic algorithm can be used as the checking stage of a nondeterministic algorithm. Let decision problem P1 be in P, with A as a polynomial-time deterministic algorithm that solves it. Construct the nondeterministic algorithm for P1 by ignoring the guess and using the same deterministic algorithm A as the checking stage. Thus membership in P implies membership in NP. Here is an important theorem that I shall not attempt to prove: if a problem is in NP, then there exists a polynomial p such that the problem can be solved by a deterministic algorithm having time complexity O(2^p(N)). Thus Traveling Salesman (TSP, which is in NP) is solved in linear time by a nondeterministic algorithm and in exponential time by a deterministic algorithm.

Slide 29 of 37 slides. Created November 27, 2008.

Complexity and Intractability: The Classes NP and Co-NP. Recall the definition of the class NP: a decision problem belongs to NP if, for each Yes instance I, there is an algorithm A and a certificate C(I) such that, when the pair (I, C(I)) is input to algorithm A, it recognizes that I is a Yes instance. Consider the complement of a problem that is in NP. The complement of a problem is the same problem with the answers reversed. The sets of problem instances for the problem and its complement are identical, and the Yes instances of the complement are exactly the No instances of the original. Put simply, Co-NP is the set of complements of problems in NP.

Slide 30 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Example, the composite
number problem. Given a positive integer N, are there two positive integers K > 1 and M > 1 such that N = K*M? If the answer is Yes, the certificate C(N) is just the two factors; the composite number problem is clearly in the class NP. If the answer is No, the number N is a prime number. Thus the complement is the primality problem, which is immediately seen to be in Co-NP: given a positive integer N, is it the case that its only divisors are 1 and N? In 1975, V. Pratt showed that the primality problem is in the class NP. This showed that the composite number problem is in the class Co-NP. Here both the problem and its complement are in the intersection of NP and Co-NP. Historic note: at a 1903 American Mathematical Society meeting, a conjecture about a conjectured class of prime numbers was disproven in the shortest talk on record: 2^67 - 1 = 193,707,721 x 761,838,257,287.

Slide 31 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Reducibility. The process called polynomial-time reduction is a key part of the theory of NP-Completeness. Given two decision problems P1 and P2, a reduction operates as follows. If someone comes with an input X1 to problem P1 and wants the Yes-or-No answer, we use the reduction algorithm to transform X1 into X2, an input to problem P2, such that the answer of P2 on X2 is identical to the answer of P1 on X1. If this reduction algorithm operates in polynomial time, it is called a polynomial-time reduction. The key feature of this process is that it shows that problem P2 is at least as hard to solve as problem P1. Suppose the following. 1) You are given a problem P1 that is supposed to be hard, in the sense that no polynomial-time deterministic algorithm solves it. 2) You reduce any instance X1 of P1 into an equivalent instance X2 of P2, and do this in polynomial time by a deterministic reduction algorithm. 3) You solve problem P2 using a polynomial-time deterministic algorithm. You have just constructed a polynomial-time deterministic algorithm for P1.

Slide 32 of 37 slides.
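The composite-number certificate check is about as simple as verification gets: the certificate is the factor pair (K, M), and the recognizer is one multiplication. The function name is mine; the factorization is Cole's 1903 result quoted in the historic note.

```python
# Verify a "Yes" certificate for the composite-number problem:
# accept exactly when K > 1, M > 1, and N = K*M.
def verify_composite(n, k, m):
    return k > 1 and m > 1 and n == k * m

# Cole's factorization of the Mersenne number 2^67 - 1:
print(verify_composite(2**67 - 1, 193707721, 761838257287))  # → True
```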
Created November 27, 2008.

Complexity and Intractability: The Class NP-Complete. Formally, a problem is said to be NP-Complete if 1) it is a member of the class NP, and 2) all other problems in NP can be reduced to it in polynomial time. Cook's Theorem: SAT is NP-Complete. The proof offered by Cook in 1971 that Boolean Satisfiability is NP-Complete is long and tedious, but not overly complex. Here is an outline of the proof. 1) It is easily shown that SAT is in NP. 2) Imagine an arbitrary NDTM (Non-Deterministic Turing Machine). Cook constructs an instance of SAT with the following properties: a) the NDTM terminates in state qY if and only if the instance is satisfiable; b) the NDTM terminates in state qN if and only if the instance is not satisfiable. This means that SAT is as hard as any problem in the class NP: it is NP-Complete.

Slide 33 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Showing NP-Completeness. To show that a problem belongs to the class NP-Complete, one must follow a four-step procedure: 1) show that the problem is in NP; 2) select a known NP-Complete problem; 3) construct a reduction R from the known problem to the new one; 4) show that the reduction R operates in polynomial time. Due to Cook's theorem, the first problem to be used was SAT. Properties of problems in the class NP-Complete: 1) all are solvable by exponential-time deterministic algorithms; 2) none can be solved by any known polynomial-time algorithm; 3) there is no proof that any of the problems cannot be solved in polynomial time; 4) if any one is shown to be tractable (solvable in polynomial time by a deterministic algorithm), then every member is tractable; 5) if any one is shown to be intractable, then all of the problems are intractable.

Slide 34 of 37 slides. Created November 27, 2008.

Complexity and Intractability: Some Problems in the Class NP-Complete. Here are a few of these important problems; the reference Garey & Johnson [Ref. 1] lists over 320 such problems. 1) Hamiltonian Cycle. Given a graph G = (V, E), is it possible to find a simple cycle that includes every vertex exactly once,
except that the start vertex is also the end vertex? 2) Traveling Salesman. Given a weighted graph G = (V, E, W) that is a complete graph (each vertex adjacent to every other vertex), is it possible to specify a Hamiltonian cycle of total weight less than a specific value? 3) Partition. Given a set X such that each element x in X has an associated size s(x), is it possible to partition the set X into two subsets with exactly the same total size? 4) 0/1 Knapsack. Given a set X of items, each having size s(x) and value v(x), is it possible to create a subset Y of X such that the total size of Y is less than a fixed value S, while the total value of the items in Y is at least another fixed value V? 5) 3-Coloring. Given a graph G = (V, E), is it possible to assign one of three colors to each of the vertices so that no two adjacent vertices have the same color?

Slide 35 of 37 slides. Created November 27, 2008.

Complexity and Intractability: More Problems in the Class NP-Complete. 6) Clique. Given a graph G with N vertices and M edges, and an integer L with 2 < L < N, determine whether G contains a clique of size greater than or equal to L, that is, a subgraph isomorphic to K(L), the complete graph on L vertices. 7) Dominating Set. Let G be a graph with N vertices and M edges. A dominating set D is a set of vertices in G such that every vertex in V(G) is either in D or adjacent to a vertex in D. Given an integer L, determine whether or not G has a dominating set containing not more than L vertices. 8) Vertex Cover. Let G = (V, E) be an undirected graph. A vertex cover is a set C of vertices such that every edge in E(G) is incident to at least one vertex in C. Given an integer L, determine whether G has a vertex cover containing not more than L vertices. 9) 3SAT. Given a Boolean expression in Conjunctive Normal Form (CNF) such that each clause contains literals for exactly three variables, determine whether it is satisfiable. 10) Minimization to 0. Given a Boolean expression B in Product-of-Sums form, is it possible to minimize this expression to the identically false expression, B = 0?

Slide 36 of 37 slides. Created November 27,
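All of the problems listed share the NP pattern discussed earlier: a proposed certificate is checkable in polynomial time even though finding one appears hard. A sketch for Vertex Cover (problem 8); the function name and the small example graph are mine:

```python
# Certificate check for Vertex Cover: one pass over the edges decides
# whether `cover` is a vertex cover of size at most `limit` -- polynomial
# time, even though *finding* a small cover is NP-Complete.
def verify_vertex_cover(vertices, edges, cover, limit):
    cover = set(cover)
    if len(cover) > limit or not cover <= set(vertices):
        return False
    # Every edge must be incident to at least one vertex of the cover.
    return all(u in cover or v in cover for u, v in edges)

edges = [(1, 2), (1, 3), (2, 4), (3, 4)]  # a 4-cycle
print(verify_vertex_cover([1, 2, 3, 4], edges, [1, 4], 2))  # → True
```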
2008.

Slide 9.1: Importance of Growth Order. Suppose that a given algorithm takes 1 minute to solve a problem of size 10. How long to solve a problem of size 20? How long for size 30?

Complexity    Ratio T(20)/T(10)           Time (20)    Ratio T(30)/T(10)           Time (30)
O(log N)      log 20 / log 10 = 1.301     1.3 minutes  log 30 / log 10 = 1.477     90 seconds
O(N)          20/10 = 2                   2 minutes    30/10 = 3                   3 minutes
O(N log N)    2 x 1.301 = 2.6             2.6 minutes  3 x 1.477 = 4.4             4.4 minutes
O(N^2)        20^2/10^2 = 4               4 minutes    30^2/10^2 = 9               9 minutes
O(N^3)        8000/1000 = 8               8 minutes    27000/1000 = 27             27 minutes
O(2^N)        2^20/2^10 = 2^10 = 1024     ~17 hours    2^30/2^10 = 2^20 = 1048576  ~728 days

Slide 10.1: Asymptotic Growth Rates. There are three sets of interest to our investigation of the big picture when studying algorithmic time complexity: O(G(N)), the Big-O notation; Omega(G(N)), the Big-Omega notation; and Theta(G(N)), the Big-Theta notation. The set O(G(N)) is the set of functions that grow no faster than G(N). The set Omega(G(N)) is the set of functions that grow no slower than G(N). The set Theta(G(N)) is the set of functions that grow exactly as fast as G(N). This terminology is a bit odd, arising as it does from set theory. The function 2N^2 + 5N + 17 is a member of the set Theta(N^2), which is the set of quadratic functions; we normally just say that it is a quadratic.

Slide 10.2: Formal Definition of the Set O(G(N)). Definition: a function T(N) is said to be O(G(N)), denoted T(N) in O(G(N)), if there exist some positive constant C and some nonnegative integer N0 such that T(N) <= C*G(N) for all N >= N0. Example: T(N) = 5N + 17 is both O(N) and O(N^2). T(N) is O(N) because T(N) <= 10*N for all N >= 4. T(N) is O(N^2) because T(N) <= 10*N^2 for all N >= 2. Notice I did not solve any exact equation to determine the point at which C*G(N) is bigger than T(N); I just picked an integer. Exactly: 5N + 17 <= 10*N becomes 17 <= 5N, or N >= 3.4; N0 = 4 is just a good guess. Solving 5N + 17 <= 10*N^2 is equivalent to solving F(N) = 10*N^2 - 5N - 17 >= 0; F(1) = -12 and F(2) = 13, so there is a root between the two values of N.

Slide 10.3: Loose and Exact Definitions, O(G(N)) and Theta(G(N)). Consider T(N) = 13N^2 + 52N + 17. T(N) is both O(N^2) and O(N^3). T(N) is O(N^2) because 13*N^2 + 52*N + 17 <= 19*N^2 for all N >= 10. T(N) is O(N^3) because 13*N^2 + 52*N + 17 <= 19*N^3 for all N >= 10. What we are saying here is that 1) T(N) grows no faster than a quadratic, and 2) T(N) grows no faster than a cubic. Loosely, we would say T(N) is O(N^2) because N^2 is the smallest polynomial that bounds T(N) from above. Really, that is saying that T(N) is Theta(N^2): T(N) is a member of the set of functions that grow exactly as fast as N^2.

Slide 10.4: Some Sets of Functions. Theta(log N): the set of logarithmic functions, including log2(N), ln(N), and log10(N). Theta(N): the set of linear functions. Theta(N^2): the set of quadratic functions. Theta(N^3): the set of cubic functions. NOTE: T(N) is in Theta(G(N)) if and only if both T(N) is in O(G(N)) and T(N) is in Omega(G(N)).

Slide 14.1: Use of Calculus in Establishing Complexity. Consider a function T(N). We say the following. 1) If lim (N -> inf) T(N)/G(N) > 0, then T(N) is Omega(G(N)); note that the limit could be infinity. 2) If lim (N -> inf) T(N)/G(N) = C for some positive constant C, then T(N) is Theta(G(N)). 3) If lim (N -> inf) T(N)/G(N) < infinity, then T(N) is O(G(N)); note that the limit could be 0. When we call for C > 0 as a constant, we mean a finite number.

Slide 14.2: Calculus for Complexity Measures, Part 2. Consider T(N) = 13*N^2 + 52*N + 17. We use the calculus methods to prove some properties. Claim: T(N) is Omega(N), because lim (N -> inf) T(N)/N = lim (N -> inf) (13*N + 52 + 17/N) = infinity > 0. Claim: T(N) is Omega(N^2), because lim (N -> inf) T(N)/N^2 = lim (N -> inf) (13 + 52/N + 17/N^2) = 13 > 0. Claim: T(N) is Theta(N^2), because lim (N -> inf) T(N)/N^2 = 13, a finite positive constant, as just shown.

Slide 14.3: Calculus for Complexity Measures, Part 3. Consider T(N) = 13*N^2 + 52*N + 17 again. Claim: T(N) is O(N^2), because lim (N -> inf) T(N)/N^2 = 13 < infinity, as just above. Claim: T(N) is O(N^3), because lim (N -> inf) T(N)/N^3 = lim (N -> inf) (13/N + 52/N^2 + 17/N^3) = 0 < infinity.

Slide 15.1: L'Hopital's Rule. L'Hopital's Rule states that lim (N -> inf) T(N)/G(N) = lim (N -> inf) [dT(N)/dN] / [dG(N)/dN], provided that both T(X) and G(X), as functions of real variables, are differentiable, and either 1) both lim T(N) = infinity and lim G(N) = infinity, or 2) both lim T(N) = 0 and lim G(N) = 0. Derivatives: for K > 0, d(X^K)/dX = K*X^(K-1); for K > 0, d(X^-K)/dX = -K*X^(-K-1); d(log X)/dX = 1/X.

Slide 15.2: L'Hopital's Rule, Part 2. T(N) = N*log N + N + 3. Show T(N) is O(N^2). We evaluate lim (N -> inf) T(N)/N^2 = lim (N -> inf) (N*log N)/N^2 + lim (N -> inf) N/N^2 + lim (N -> inf) 3/N^2 = lim (N -> inf) (log N)/N + 0 + 0. But lim log N = infinity and lim N = infinity, so the rule applies: d(log N)/dN = 1/N and dN/dN = 1, so lim (N -> inf) (log N)/N = lim (N -> inf) (1/N)/1 = 0, and so lim (N -> inf) T(N)/N^2 = 0 + 0 + 0 = 0. T(N) is O(N^2).

Slide 15.3: More Calculus. T(N) = N*log N + N + 3. Show T(N) is Theta(N*log N). We evaluate lim (N -> inf) T(N)/(N*log N) = lim (N -> inf) [1 + 1/log N + 3/(N*log N)] = 1 + 0 + 0 = 1, since lim (N -> inf) 1/log N = 0 and lim (N -> inf) 3/(N*log N) = 0. So T(N) is Theta(N*log N).

Slide 19.1: Introduction to Loop Counting. We need one new sum formula to consider loop counting. Consider the following loop structure:

For J = 1 to N do        -- the outer loop
    For K = 1 to N do    -- the inner loop
        Something

For each execution of the outer loop, the inner loop is executed N times. The inner loop is executed N*N = N^2 times. This is easy.

Slide 19.2: Loop Counting, Part 2. But consider:

For J = 1 to N do        -- the outer loop
    For K = J to N do    -- the inner loop
        Something

For J = 1 the inner loop is executed N times. For J = 2 the inner loop is executed N - 1 times. For J = N the inner loop is executed 1 time. The number of executions of the inner loop is N + (N - 1) + (N - 2) + ... + 2 + 1. But the sum of K for K = 1 to N is N*(N + 1)/2, which is in Theta(N^2).

Slide 27.1: Time Complexity from Recurrences. Consider the time complexity specified by T(1) = 1, T(N) = 2*T(N - 1) + 1. Express T(N) as a well-known algebraic function. T(1) = 1 = 2^1 - 1. T(2) = 2*1 + 1 = 3 = 2^2 - 1. T(3) = 2*3 + 1 = 7 = 2^3 - 1. Hypothesis: T(N) = 2^N - 1. Then T(N + 1) = 2*T(N) + 1 = 2*(2^N - 1) + 1 = 2^(N+1) - 2 + 1 = 2^(N+1) - 1. T(N) is Theta(2^N). This is the number of moves for Towers of Hanoi.

Slide 27.2: Time Complexity, Fibonacci Numbers. There are many algorithms to calculate Fibonacci numbers; the recursive algorithm is as follows. Fibonacci numbers, introduced by Leonardo Fibonacci in 1202, are defined by the following recurrence: F(0) = 0, F(1) = 1, F(N) = F(N - 1) + F(N - 2) for N > 1. The sequence begins 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, etc. The time complexity for the recursive computation is given by T(N) = T(N - 1) + T(N - 2) for N > 1.

Slide 27.3: Time Complexity, The Master Theorem. Theorem: let T(N) satisfy the following recurrence equation: T(N) = A*T(N/B) + F(N), T(1) = C, where A >= 1, B > 1, and C > 0. If F(N) is in Theta(N^d), where d >= 0, then T(N) is in Theta(N^d) if A < B^d; T(N) is in Theta(N^d * log N) if A = B^d; and T(N) is in Theta(N^P), with P = log base B of A, if A > B^d. Please note that the recurrence T(N) = A*T(N - B) + F(N) cannot be solved
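The induction for the Towers of Hanoi recurrence on Slide 27.1 can be checked mechanically: compute T(N) = 2*T(N-1) + 1 directly and compare it with the closed form 2^N - 1. A small sketch (the function name `t` is mine):

```python
# The Towers of Hanoi recurrence: T(1) = 1, T(N) = 2*T(N-1) + 1.
def t(n):
    return 1 if n == 1 else 2 * t(n - 1) + 1

# Closed form claimed by the induction: T(N) = 2^N - 1.
assert all(t(n) == 2**n - 1 for n in range(1, 15))
print(t(3), t(10))  # → 7 1023
```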
using the Master Theorem.

What Is It? Consider the following set of 32 binary digits, written in blocks of four so that the example is not impossible to read: 0010 0110 0100 1100 1101 1001 1011 1111. How do we interpret this sequence of binary digits? Answer: the interpretation depends on the use made of the number. Where is this 32-bit binary number found? Instruction Register: if in the IR, this number will be decoded as an instruction, probably with an address part. Address Register: if in the MAR or another address register, this is a memory address. Data Register: if in a general-purpose data register, this is data: possibly a 32-bit real number, possibly a 32-bit integer, possibly four 8-bit character codes.

Hexadecimal Numbers. But first we present a number system that greatly facilitates writing long strings of binary numbers: the hexadecimal system. The hexadecimal system (base 16 = 2^4) has 16 digits: the normal ten decimal digits and the first six letters of the alphabet. Because hexadecimal numbers have base 2^4, each hexadecimal digit represents four binary bits. Hexadecimal notation is a good way to write binary numbers. The translation table from hexadecimal to binary is as follows:

0 = 0000   4 = 0100   8 = 1000   C = 1100
1 = 0001   5 = 0101   9 = 1001   D = 1101
2 = 0010   6 = 0110   A = 1010   E = 1110
3 = 0011   7 = 0111   B = 1011   F = 1111

Consider the previous example, 0010 0110 0100 1100 1101 1001 1011 1111. As a hexadecimal number it is 264CD9BF, better written as 0x264C D9BF. The 0x is the standard C and Java prefix for a hexadecimal constant.

Conversions between Hexadecimal and Binary. These conversions are particularly easy, due to the fact that the base of hexadecimal numbers is a power of two, the base of binary numbers. Hexadecimal to binary: just write each hexadecimal digit as four binary bits, then string the binary bits together in a legible form. Binary to hexadecimal: group the binary bits by fours; add leading zeroes to the leftmost grouping of binary bits so that all groupings have exactly four binary bits; convert each set of four bits
to its hexadecimal equivalent, then write the hexadecimal number. It is better to use the 0x prefix. The numbering system is often called "hex". Early proponents of computer security noted many similarities between their subject and that of disease prevention, and called for "safe hex".

Three Number Systems. This course is built upon three number systems and conversions between them. Binary: base 2, digit set {0, 1}. Decimal: base 10, digit set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Hexadecimal: base 16, digit set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}. We shall discuss five of the six possible conversion algorithms. I don't know a good algorithm for direct conversion from decimal to hexadecimal, so I always use binary as an intermediate point. Octal (base 8) notation is useful in certain applications, but we won't study it. Other number systems, such as base 5 and base 7, are useless teaching devices.

Binary, Decimal, and Hexadecimal Equivalents.

Binary  Decimal  Hexadecimal
0000    0        0
0001    1        1
0010    2        2
0011    3        3
0100    4        4
0101    5        5
0110    6        6
0111    7        7
1000    8        8
1001    9        9
1010    10       A
1011    11       B
1100    12       C
1101    13       D
1110    14       E
1111    15       F

Conversion between Binary and Hexadecimal. This is easy: just group the bits. Recall that A = 1010, B = 1011, C = 1100, D = 1101, E = 1110, F = 1111. Problem: convert 10011100 to hexadecimal. 1) Group by fours: 1001 1100. 2) Convert each group of four: 0x9C. Problem: convert 1111010111 to hexadecimal. 1) Group by fours, moving right to left: 11 1101 0111; add leading zeroes: 0011 1101 0111. 2) Convert each group of four: 0x3D7. Problem: convert 0xBAD1 to binary. 1) Convert each hexadecimal digit: B A D 1 becomes 1011 1010 1101 0001. 2) Group the binary bits: 1011 1010 1101 0001.

Conversion from Hexadecimal to Decimal. Remember (or calculate) the needed powers of sixteen in decimal form: 16^0 = 1, 16^1 = 16, 16^2 = 256, 16^3 = 4096, 16^4 = 65536, etc. 1) Convert all of the hexadecimal digits to their decimal form; this affects only the digits in the set {A, B, C, D, E, F}. 2) Use standard positional conversion. Example: 0xCAFE. Convert each digit: 12, 10, 15, 14. Positional conversion: 12*16^3 + 10*16^2 + 15*16^1 + 14*16^0 = 12*4096 + 10*256 +
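The hand conversions above can be checked mechanically. A sketch using Python's built-in base handling (`int(s, base)` parses a string in a given base, and `format(n, "b")` produces the binary digits):

```python
# Binary -> hexadecimal, as in the worked problems above:
assert int("10011100", 2) == 0x9C
assert int("1111010111", 2) == 0x3D7

# Hexadecimal -> binary: 0xBAD1 is 1011 1010 1101 0001 (16 bits).
assert format(0xBAD1, "016b") == "1011101011010001"

# Positional conversion of 0xCAFE: 12*16^3 + 10*16^2 + 15*16 + 14.
assert 12 * 16**3 + 10 * 16**2 + 15 * 16 + 14 == 0xCAFE == 51966

# The 32-bit opening example, grouped by fours with digit separators:
print(hex(0b0010_0110_0100_1100_1101_1001_1011_1111))  # → 0x264cd9bf
```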
15·16 + 14·1 = 49,152 + 2,560 + 240 + 14 = 51,966.
NOTE: Java class files begin with the following 32-bit (8 hex digit) identifier: CAFE BABE. This is an inside joke among the Java development team.

Conversion between Binary and Decimal. Conversion between hexadecimal and binary is easy because 16 = 2^4; in my view, hexadecimal is just convenient shorthand for binary. Thus four hex digits stand for 16 bits, 8 hex digits for 32 bits, etc. But 10 is not a power of 2, so we must use different methods.
Conversion from Binary to Decimal: this is based on standard positional notation. Convert each position to its decimal equivalent and add them up.
Conversion from Decimal to Binary: this is done with two distinct algorithms, one for the digits to the left of the decimal point (the whole-number part) and one for digits to the right. At this point we ignore negative numbers.

Powers of Two. Students should memorize the first ten powers of two.
2^0 = 1         2^-1 = 1/2 = 0.5
2^1 = 2         2^-2 = 1/4 = 0.25
2^2 = 4         2^-3 = 1/8 = 0.125
2^3 = 8         2^-4 = 1/16 = 0.0625
2^4 = 16        2^-5 = 1/32 = 0.03125
2^5 = 32        2^-6 = 1/64
2^6 = 64        2^-7 = 1/128
2^7 = 128       2^-8 = 1/256
2^8 = 256       2^-9 = 1/512
2^9 = 512       2^-10 = 1/1024
2^10 = 1024

Example: 10111.011 = 1·2^4 + 0·2^3 + 1·2^2 + 1·2^1 + 1·2^0 + 0·2^-1 + 1·2^-2 + 1·2^-3 = 16 + 0 + 4 + 2 + 1 + 0 + 0.25 + 0.125 = 23.375.

Conversion of Unsigned Decimal to Binary. Again, we continue to ignore negative numbers.
Problem: Convert 23.375 to binary. We already know the answer. One solution: 23.375 = 16 + 4 + 2 + 1 + 0.25 + 0.125 = 1·2^4 + 0·2^3 + 1·2^2 + 1·2^1 + 1·2^0 + 0·2^-1 + 1·2^-2 + 1·2^-3 = 10111.011. This solution is preferred by your instructor, but most students find it confusing and opt to use the method to be discussed next.
Side point: conversion of the above to hexadecimal involves grouping the bits by fours as follows. Left of the binary point: by fours from the right. Right of the binary point: by fours from the left. Thus the number is 1 0111 . 0110, or 0x17.6. But 0x17.6 = 1·16 + 7·1 + 6/16 = 23 + 3/8 = 23.375.

Conversion of the Whole-Number Part. This is done by repeated division, with the remainders forming the binary number. The set of remainders is read bottom to top.
        Quotient  Remainder
23/2 =  11        1
Thus decimal 23 = binary 10111.
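Both halves of the decimal-to-binary conversion (repeated division for the whole part, and the repeated multiplication used next for the fractional part) can be sketched in a few lines of Python. This is an illustrative sketch of the method, not code from the notes.

```python
def decimal_to_binary(x: float, frac_bits: int = 12) -> str:
    """Whole part by repeated division by 2 (remainders read bottom to top);
    fractional part by repeated multiplication by 2 (integer parts, top to bottom)."""
    whole, frac = int(x), x - int(x)
    digits = []
    while whole > 0:
        digits.append(str(whole % 2))     # the remainder is the next bit
        whole //= 2
    result = "".join(reversed(digits)) or "0"
    if frac > 0:
        result += "."
        for _ in range(frac_bits):
            frac *= 2
            bit = int(frac)               # the integer part is the next bit
            result += str(bit)
            frac -= bit
            if frac == 0:
                break
    return result

print(decimal_to_binary(23))       # 10111, as in the worked example
print(decimal_to_binary(23.375))   # 10111.011
```

For a value like 0.20, the loop simply runs to the bit limit, reflecting the non-terminating representation discussed later.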
The remaining divisions:
11/2 = 5, remainder 1
5/2 =  2, remainder 1      Remember to read the binary
2/2 =  1, remainder 0      number from bottom to top.
1/2 =  0, remainder 1
As expected, the number is 10111.
Another example: 16.
        Quotient  Remainder
16/2 =  8         0
8/2 =   4         0
4/2 =   2         0        Remember to read the binary
2/2 =   1         0        number from bottom to top.
1/2 =   0         1
The number is 10000, or 0x10.

Convert the Part to the Right of the Decimal Point. This is done by a simple variant of multiplication; it is easier to show than to describe. Convert 0.375:
Number          Product  Binary
0.375 × 2 =     0.75     0
0.75 × 2 =      1.5      1     Read top to bottom as .011
0.5 × 2 =       1.0      1
Note that the multiplication involves dropping the leading ones from the product terms, so that our products are 0.75, 1.5, 1.0, but we multiply only the numbers 0.375, 0.75, 0.50, and of course 0.0.
Another example: convert 0.71875.
Number           Product  Binary
0.71875 × 2 =    1.4375   1
0.4375 × 2 =     0.875    0     Read top to bottom as .10111,
0.875 × 2 =      1.75     1     or as .1011100000000,
0.75 × 2 =       1.5      1     with as many trailing
0.5 × 2 =        1.0      1     zeroes as you like.
0.0 × 2 =        0.0      0

Convert an Easy Example. Consider the decimal number 0.20. What is its binary representation?
Number        Product  Binary
0.20 × 2 =    0.40     0
0.40 × 2 =    0.80     0
0.80 × 2 =    1.60     1
0.60 × 2 =    1.20     1
0.20 × 2 =    0.40     0    (but we have seen this: see four lines above)
So 0.20 decimal has the repeating binary representation 0.0011 0011 0011 ...

Terminating and Non-Terminating Numbers. A fraction has a terminating representation in base K notation only if the number can be represented in the form J / K^B for non-negative integers J and B. Thus the fraction 1/2 has a terminating decimal representation because it is 5/10^1; it can also be written 50/10^2, etc. Also 1/4 = 25/10^2 and 1/8 = 125/10^3.

The Memory Component. The memory stores the instructions and data for an executing program. Memory is characterized by the smallest addressable unit:
Byte addressable: the smallest unit is an 8-bit byte.
Word addressable: the smallest unit is a word, usually 16 or 32 bits in length.
Most modern computers are byte addressable, facilitating access to character data. Logically, computer memory should be considered as an array. The index into this array is called the address, or memory
address. A logical view of such a byte-addressable memory might be written in code as:
    const MemSize
    byte Memory[MemSize]    // indexed 0 .. MemSize - 1
The CPU has two registers dedicated to handling memory. The MAR (Memory Address Register) holds the address being accessed. The MBR (Memory Buffer Register) holds the data being written to the memory or being read from the memory; this is sometimes called the Memory Data Register.

Primary Memory. Also called core memory, store, or storage. Beginning with the MIT Whirlwind and continuing for about 30 years, the basic technology for primary memory involved cores of magnetic material. [A very large picture has been removed.]

Requirements for a Memory Device.
Random access by address, similar to use of an array: a byte-addressable memory can be considered as an array of bytes, byte memory[N], with addresses ranging from 0 to N - 1.
Binary memory devices require two reliable stable states. The transitions between the two stable states must occur quickly. The transitions between the two stable states must not occur spontaneously, but only in response to the proper control signals.
Each memory device must be physically small, so that a large number may be placed on a single memory chip.
Each memory device must be relatively inexpensive to fabricate.

Varieties of Random Access Memory. There are two types of RAM:
1. RAM: read/write memory.
2. ROM: read-only memory.
The double use of the term RAM is just accepted. Would you say "RWM"?
Types of ROM:
1. Plain ROM: the contents of the memory are set at manufacture and cannot be changed without destroying the chip.
2. PROM: the contents of the chip are set by a special device called a PROM programmer. Once programmed, the contents are fixed.
3. EPROM: same as a PROM, except that the contents can be erased and reprogrammed by the PROM programmer.

Memory Registers. MAR (Memory Address Register): this specifies the address of the instruction or data item. For a byte-addressable memory, each byte has a distinct address. For a word-addressable memory, only the words have
individual addresses. MBR (Memory Buffer Register): this holds the data read from memory or to be written to memory; occasionally called MDR, for Memory Data Register. In a byte-addressable memory the MBR is usually 8 bits wide, that is, it holds one byte. In a 16-bit word-addressable memory the MBR would be 16 bits wide. The size of the MBR is the size of an addressable item.

Memory Control Signals: Read/Write. Memory must do three actions:
READ: copy the contents of an addressed word into the MBR.
WRITE: copy the contents of the MBR into an addressed word.
NOTHING: the memory is expected to retain the contents written into it until those contents have been rewritten.
One set of control signals: Select (the memory unit is selected) and R/W (if 0, the CPU writes to memory; if 1, the CPU reads from memory).
Select  R/W   Action
0       0     Memory contents are not changed
0       1     Memory contents are not changed
1       0     CPU writes data to the memory
1       1     CPU reads data from the memory
A ROM has only one control signal, Select. If Select = 1 for a ROM, the CPU reads data from the addressed memory slot.

Memory Timings. Memory Access Time is defined in terms of reading from memory: it is the time between the address becoming stable in the MAR and the data becoming available in the MBR. Memory Cycle Time, less used, is defined as the minimum time between two independent memory accesses.

The Idea of an Address Space. The memory size is defined in terms of the amount of primary memory actually installed. The address space, determined by the size of the MAR, indicates the range of addresses that can actually be generated. Absent such kludges as Expanded Memory and Extended Memory (both obsolete, dating to about 1980), the memory size does not exceed the size of the address space. An N-bit MAR can address 2^N distinct memory locations: 0 to 2^N - 1.
Computer        MAR bits  Address Range
PDP-11/20       16        0 to 65,535
Intel 8086      20        0 to 1,048,575
Intel Pentium   32        0 to 4,294,967,295

Memory-Mapped Input/Output. Though not a memory issue,
we now address the idea of memory-mapped input and output. In this scheme, we take part of the address space that would otherwise be allocated to memory and allocate it to I/O devices. The PDP-11 is a good example of a memory-mapped device. It was a byte-addressable device, meaning that each byte had a unique address. The old PDP-11/20 supported a 16-bit address space, which supported addresses in the range 0 through 65535, or 0 through 177777 in octal. Addresses 0 through 61439 were reserved for physical memory; in octal, these addresses are given by 0 through 167777. Addresses 61440 through 65535 (octal 170000 through 177777) were reserved for registers associated with input/output devices. Examples, for the CR11 Card Reader: 177160 Control & Status Register, 177162 Data Buffer 1, 177164 Data Buffer 2. Reading from address 0177162 would access the card reader data buffer.

The Linear View of Memory. [Figure removed: a byte-addressable memory of N bytes, addresses 0 through N - 1, with a decoder generating a word-select signal for each addressed location.]

Memory Chip Organization. Consider a 4-megabit memory chip in which each bit is directly addressable. Recall that 4M = 2^22 = 2^11 · 2^11, and that 2^11 = 2,048. The linear view of memory on the previous slide calls for a 22-to-2^22 decoder, also called a 22-to-4,194,304 decoder. This is not feasible. If we organize the memory as a two-dimensional grid of bits, then the design calls for two 11-to-2048 decoders. This is still a stretch. [Figure removed: a 2048-by-2048 bit array, with an 11-bit row address and an 11-to-2048 column decoder.]

Managing Pin-Outs. Consider now the two-dimensional memory mentioned above. What pins are needed?
                       One address cycle    Row/column cycles
Address lines          22                   11
Row/Column strobes     0                    2
Power & Ground         2                    2
Data                   1                    1
Control                3                    3
Total                  28                   19
Separate row and column addresses require two cycles to specify the address.

Four-Megabyte Memory. One common solution is to have bit-oriented chips. This facilitates the
two-dimensional addressing discussed above. [Figure removed: eight bit-oriented chips, Bit 7 through Bit 0, sharing 11 address lines.] For applications in which data integrity is especially important, one might add a ninth chip holding a parity bit. Errors, when they occur, will be localized in one chip. Parity provides a mechanism to detect, but not correct, single-bit errors; such designs also detect all two-bit errors.

Memory Interleaving. Suppose a 64 MB memory made up of the 4Mb chips discussed above. We now ignore parity memory, for convenience and also because it is rarely needed. We organize the memory into 4 MB banks, each having eight of the 4Mb chips; the figure in the slide above shows such a bank. The memory thus has 16 banks, each of 4 MB. Since 16 = 2^4, 4 bits select the bank; since 4M = 2^22, 22 bits address each chip. Not surprisingly, 64M = 2^26.
Low-Order Interleaving:
Bits 25 - 4: address to the chip.    Bits 3 - 0: bank select.
High-Order Interleaving (Memory Banking):
Bits 25 - 22: bank select.           Bits 21 - 0: address to the chip.

Faster Memory Chips. We can use the two-dimensional array approach discussed earlier to create a faster memory. This is done by adding an SRAM (static RAM) buffer onto the chip. Consider the 4Mb (four-megabit) chip discussed earlier, now with a 2Kb SRAM buffer. [Figure removed: the 2048-by-2048 bit array with a 2048-bit SRAM row buffer, and an 11-to-2048 column decoder driven by CAS.] In a modern scenario for reading the chip, a row address is passed to the chip, followed by a number of column addresses. When the row address is received, the entire row is copied into the SRAM buffer; subsequent column reads come from that buffer.

Memory Technologies: SRAM and DRAM. One major classification of computer memory is into two technologies:
SRAM: Static Random Access Memory.
DRAM: Dynamic Random Access Memory, and its variants.
SRAM is called static because it will keep its contents as long as it is powered. DRAM is called dynamic because it tends to lose its contents, even when powered; special refresh circuitry must be provided.
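The two interleaving schemes above differ only in which bits of the 26-bit address select the bank. A small sketch for this 64 MB example (the function names and sample address are my own, for illustration):

```python
# 64 MB = 2^26 bytes, organized as 16 banks (4 bank-select bits)
# of 4 MB each (22 address bits per bank).

def split_low_order(addr: int) -> tuple[int, int]:
    """Low-order interleaving: bits 3-0 select the bank, bits 25-4 address the chip."""
    return addr & 0xF, addr >> 4

def split_high_order(addr: int) -> tuple[int, int]:
    """High-order interleaving (banking): bits 25-22 select the bank."""
    return addr >> 22, addr & 0x3FFFFF

addr = 0x2ABCDEF                   # an arbitrary 26-bit address
print(split_low_order(addr))       # consecutive addresses fall in different banks
print(split_high_order(addr))      # each contiguous 4 MB region stays in one bank
```

Low-order interleaving spreads consecutive addresses across the banks, which helps sequential access; high-order interleaving keeps each contiguous region within a single bank.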
SRAM is faster, more expensive, and physically larger: fewer memory bits per square millimeter. SDRAM is Synchronous DRAM: DRAM that is designed to work with a synchronous bus, one with a clock signal. The memory bus clock is driven by the CPU system clock, but it is always slower. Suppose a 2 GHz system clock; it can easily generate memory bus clock rates of 1 GHz, 500 MHz, 250 MHz, 125 MHz, etc. (other rates are also possible). Consider a 2 GHz CPU with 100 MHz SDRAM. The CPU clock speed is 2 GHz = 2000 MHz; the memory bus speed is a 100 MHz clock rate. This memory bus clock is always based on the system clock. In plain SDRAM, the transfers all take place on the rising edge of the memory bus clock. In DDR SDRAM (Double Data Rate Synchronous DRAM), the transfers take place on both the rising and falling clock edges. [Timing diagram removed: clock signal with SDRAM transfers marked on the edges.]

More on SDRAM. Plain SDRAM makes a transfer every cycle of the memory bus; for a 100 MHz memory bus, we would have 100 million transfers per second. DDR SDRAM makes two transfers for every cycle of the memory bus: one on the rising edge of the clock cycle, and one on the falling edge. For a 100 MHz memory bus, DDR SDRAM would have 200 million transfers per second. To this we add wide memory buses; a typical value is a 64-bit width. A 64-bit-wide memory bus transfers 64 bits at a time, that is, 8 bytes at a time. Thus our sample DDR SDRAM bus would transfer 1600 million bytes per second. This might be called 1.6 GB per second, although it more properly is 1.49 GB per second, as 1 GB = 1,073,741,824 bytes.

Byte Addressing vs. Word Addressing. The addressing capacity of a computer is dictated by the number of bits in the MAR. Suppose the MAR (Memory Address Register) contains N bits; then 2^N items can be addressed. In a byte-addressable machine, the maximum memory size is 2^N bytes. If the machine supports longword addressing, but not byte or word addressing, the maximum memory size
is 2^(N+2) bytes. Word- and longword-addressable machines might have their memory size quoted in bytes, but they do not access individual bytes. Example: the 256 KB PDP-11/70 vs. the CDC 6600 with 256 K words of 60 bits each. The CDC 6600 could have been considered to have 1920 KB of memory; however, it was not byte addressable, and the smallest addressable unit was a 60-bit integer. Almost every modern computer is byte addressable, to allow direct access to the individual bytes of a character string. A modern computer that is byte addressable can still issue both word and longword instructions; these just reference data two bytes at a time or four bytes at a time.

Word Addressing in a Byte-Addressable Machine. Each 8-bit byte has a distinct address. A 16-bit word at address Z contains bytes at addresses Z and Z + 1. A 32-bit word at address Z contains bytes at addresses Z, Z + 1, Z + 2, and Z + 3. Note that computer architecture refers to addresses rather than variables. In a high-level programming language, we use the term "variable" to indicate the contents of a specific memory address. Consider the statement Y = X: go to the memory address associated with variable X, get the contents, and copy the contents into the address associated with variable Y.

Big-Endian vs. Little-Endian Addressing. The "big end" of a multi-byte number is its most significant byte; the "little end" is its least significant byte. Big-endian addressing stores the big end at the lowest address; little-endian stores the little end there. For the 32-bit value 0x01020304 stored at address Z:
Address  Big-Endian  Little-Endian
Z        01          04
Z + 1    02          03
Z + 2    03          02
Z + 3    04          01

Example: Core Dump at Address 0x200. Note: powers of 256 are 256^0 = 1, 256^1 = 256, 256^2 = 65,536, 256^3 = 16,777,216. Suppose one has the following memory map as a result of a core dump. The memory is byte addressable.
Address   0x200  0x201  0x202  0x203
Contents  02     04     06     08
What is the value of the 32-bit long integer stored at address 0x200? This is stored in the four bytes at addresses 0x200, 0x201, 0x202, and 0x203.
Big-Endian: the number is 0x02040608. Its decimal value is 2·256^3 + 4·256^2 + 6·256^1 + 8·1 = 33,818,120.
Little-Endian: the number is 0x08060402. Its decimal value is 8·256^3 + 6·256^2 + 4·256^1 + 2·1 =
134,611,970. NOTE: read the bytes backwards, not the hexadecimal digits.

Transitive Closure and Warshall's Algorithm. We now begin the study of a number of algorithms related to paths in graphs. The first algorithm computes the transitive closure of a graph. This algorithm can be applied to either directed or undirected graphs; the textbook discusses it within the context of directed graphs. This lecture has the following structure:
1. It begins with a review of basic graph theory.
2. It then presents Warshall's Algorithm.
3. Lastly, it argues for the correctness of Warshall's Algorithm: that the algorithm does indeed compute what it claims to compute.

Basic Graph Theory. Definition: a graph G is a finite nonempty set of vertices, denoted V(G), together with a possibly empty finite set E(G) ⊆ V(G) × V(G). If we define |V(G)| = N and |E(G)| = M, the graph G is called an (N, M) graph. In the formal theory we identify each vertex by an integer: if |V(G)| = N, we say that V(G) = {1, 2, 3, ..., N} and E(G) ⊆ {(J, K) | 1 ≤ J ≤ N and 1 ≤ K ≤ N}. There are a number of special classes of graph that we find interesting. When speaking of graphs, we normally mean simple graphs, formally defined as V(G) = {1, 2, 3, ..., N} and E(G) ⊆ {(J, K) | 1 ≤ J ≤ N, 1 ≤ K ≤ N, and J ≠ K}. A directed graph is a simple graph in which the edge (J, K) is not the same as the edge (K, J). An undirected graph is a simple graph in which the edge (J, K) is defined to be the same as the edge (K, J).

Additional Restrictions on Graphs. [Figure removed: examples of disallowed features, such as an edge from a vertex to itself.] We shall restrict our discussions to simple graphs: no self-loops, no multiple edges. Each edge may have a weight (cost or capacity) associated with it; for any edge e ∈ E(G), w(e) is the weight of e. Any graph can be considered a weighted graph by setting w(e) = 1 for all e ∈ E(G). We restrict ourselves to graphs with positive edge weights, w(e) > 0. Warshall's Algorithm will be applied to graphs in which the edge weights are, in effect, either 0 or 1: an edge is either absent or present.

Adjacency and Incidence. Let G be a graph with vertex set V(G) and edge set E(G). We have a set of definitions for undirected graphs and a set
for directed graphs.
G is an undirected graph: if vertices J and K are in V(G) and edge (J, K) ∈ E(G), then vertices J and K are said to be adjacent (J is adjacent to K and K adjacent to J), and edge (J, K) is said to be incident on each of vertices J and K.
G is a directed graph: if vertices J and K are in V(G) and edge (J, K) ∈ E(G), then vertex J is said to be adjacent to vertex K, vertex K is said to be adjacent from vertex J, edge (J, K) is said to be incident from J, and edge (J, K) is said to be incident to K. In a directed graph, an edge is often called an arc.

The Adjacency Matrix. One common way to store adjacency information for a graph is to use a data structure called an adjacency matrix. For an (N, M) graph, the adjacency matrix is an N-by-N square matrix. The adjacency matrix of a simple graph will have zero elements on its diagonal. An element of the matrix is denoted A[J, K], with A[J, J] = 0 for 1 ≤ J ≤ N. For a graph without edge weights, A[J, K] = 1 if and only if (J, K) ∈ E(G), and A[J, K] = 0 if (J, K) is not an edge in the graph. For a weighted graph, A[J, K] = w(e) if e = (J, K) ∈ E(G); the value of A[J, K] when (J, K) is not an edge depends on the algorithm. If the graph is undirected, then the adjacency matrix is symmetric and A[J, K] = A[K, J]. If the graph is directed, we do not have the requirement that A[J, K] = A[K, J]; however, this might be the case for a large number of matrix elements.

Graphs: Walks, Paths, Cycles, and Connectivity. Let u and v be vertices in a graph G, with u and v not necessarily distinct. A u-v walk of G is a finite alternating sequence of vertices and edges, starting with u and ending with v, thought of as u = u0, e1, u1, e2, ..., u(s-1), es, us = v, such that ei = (u(i-1), ui). It is often denoted by listing the vertices only: u0, u1, u2, ..., u(s-1), us. The number s, the number of edges in the sequence, is called the length of the walk. A u-v path is a u-v walk in which no vertex is repeated. A directed u-v path is a u-v path in a directed graph that follows the edge directions. A cycle is a u-v walk in which all vertices are distinct, with the sole exception that u = v. A cycle of length 3 is called a
triangle. An undirected graph is said to be connected if there exists a u-v path between any pair of distinct vertices u and v. A directed graph (digraph) is said to be strongly connected if there exist both a directed u-v path and a directed v-u path between any pair of distinct vertices u and v. A directed graph is said to be weakly connected if it is not strongly connected, but every pair of distinct vertices u and v is connected by either a directed u-v path or a directed v-u path.

Example: Undirected Graph. Consider the following sample undirected graph. Here V(G) = {1, 2, 3, 4, 5} and E(G) = {(1,2), (1,3), (2,3), (2,4), (2,5), (3,5), (4,5)}. The adjacency matrix:
    0 1 1 0 0
    1 0 1 1 1
A = 1 1 0 0 1
    0 1 0 0 1
    0 1 1 1 0

Definition of Transitive Closure. Transitive closure is defined on a set of elements with a binary relation. The binary relation is often denoted by R or ~; the latter is called "squiggle". We build sets by transitivity: if a, b, c ∈ S, and both a ~ b and b ~ c, then a ~ c. The transitive closure of a graph is built on the set of vertices, with the binary relation being adjacency. The adjacency matrix of the graph G is defined by A[J, K] = 1 if and only if J is adjacent to K. The transitive closure of the graph G is defined by T[J, K] = 1 if and only if there is a nontrivial directed path from J to K. In this application we are not interested in the number of edges in the path, but only in the existence of the path. For undirected graphs we have T[J, K] = T[K, J]. Each of the adjacency matrix and its transitive closure is best viewed as a Boolean matrix, with values 1 (True) and 0 (False).

Example Again. Consider the same sample undirected graph. Because the graph is connected, its transitive closure has a 1 in every off-diagonal position:
    0 1 1 1 1
    1 0 1 1 1
T = 1 1 0 1 1
    1 1 1 0 1
    1 1 1 1 0
It is easily shown that a connected graph (or strongly connected digraph) has the property that T[J, K] = 1 whenever J ≠ K.

Boolean Operations on Matrix Elements. The elements of Boolean matrices are Boolean constants, 0 or 1, so we apply Boolean operations to these elements.
Logical AND: A ∧ B = 1 if and only if A = 1 and B = 1; A ∧ B = 0 if either A = 0 or B = 0, or both.
Logical OR: A ∨ B = 1 if either A = 1 or B = 1, or
both; A ∨ B = 0 if and only if A = 0 and B = 0.
There is an implementation issue with Warshall's Algorithm that we address here. The key step has an expression that depends on the discipline. Pure mathematics will write Z = W ∨ (X ∧ Y). The programming implementation more closely resembles the following sequence:
    Z = W
    If Z = 0 Then Z = X ∧ Y

Warshall's Algorithm: The Key Idea. Warshall's Algorithm is built on the construction of a sequence of Boolean matrices R(0), R(1), ..., R(K), ..., R(N-1), R(N). There are N + 1 matrices in this sequence. These are defined as follows: R(K)[I, J] = 1 if and only if there is a path from I to J whose intermediate vertices are not numbered above K. The claim is that R(N) = T, the transitive closure. This seems plausible, as R(N)[I, J] = 1 if and only if there is a path from vertex I to vertex J that involves vertices in {1, ..., N}; but this is the complete set of vertices in V(G).

Examples of the R Matrix. R(0)[I, J] = 1: there is an edge between vertex I and vertex J, that is, (I, J) ∈ E(G). R(1)[I, J] = 1: either (I, J) ∈ E(G), or both (I, 1) ∈ E(G) and (1, J) ∈ E(G), or possibly all three. Consider the matrix R(2). The possibilities for R(2)[I, J] = 1 are:
1. (I, J) ∈ E(G)
2. (I, 1) ∈ E(G) and (1, J) ∈ E(G)
3. (I, 2) ∈ E(G) and (2, J) ∈ E(G)
4. (I, 1) ∈ E(G), (1, 2) ∈ E(G), and (2, J) ∈ E(G)
More than one of these can be true at the same time. All we need is one path; there can be quite a few.

Some Obvious Recurrences. If R(K)[I, J] = 1, then R(K+1)[I, J] = 1. In general, for 0 ≤ K ≤ L ≤ N, we have the following: if R(K)[I, J] = 1, then R(L)[I, J] = 1. This follows from the fact that the integer set {1, ..., K} is a proper subset of {1, ..., L} when L > K. A logical corollary to the above: for 0 ≤ K ≤ L ≤ N, if R(L)[I, J] = 0 then R(K)[I, J] = 0.

Another Example. Suppose R(0)[I, J] = 0 but R(1)[I, J] = 1. There is no edge (I, J) in E(G), but there is a path I → 1 → J involving only vertex 1: both (I, 1) ∈ E(G) and (1, J) ∈ E(G). If I = 1 or J = 1, neither of these is possible, as a simple graph has no self-loops.

Extending the Sequence of R Matrices: R(K+1) from R(K). If R(K)[I, J] = 1, then R(K+1)[I, J] = 1. If R(K)[I, K+1] = 1 and R(K)[K+1, J] = 1, then R(K+1)[I, J] = 1. What do these equations say? If there is a path from I to J involving only vertices in the set {1, ..., K}, then there obviously is a path involving only vertices {1, ..., K+1}. Otherwise, we add vertex K + 1 to the mix and ask if both of the following are true:
1. There is a path from I to K + 1 involving only vertices in the set {1, ..., K}, and
2. There is a path from K + 1 to J involving only vertices in the set {1, ..., K}.
If both are true, then we have our path from I to J with only vertices {1, ..., K + 1}.

Compressing the Sequence. The last observation of importance to our algorithm is the fact that new matrix values can overwrite old ones: once R(K+1) is computed, R(K) is no longer important. The algorithm makes use of only one N-by-N matrix, R[I, J]. For 1 ≤ K ≤ N, the Kth iteration can change R[I, J] from 0 to 1. Once R[I, J] = 1, it stays at that value for every further iteration.

One Version of the Algorithm.
    Algorithm Warshall(A[1..N, 1..N])
    // Implements Warshall's algorithm for computing the transitive closure
    // of a graph on N nodes with adjacency matrix A. Conceptually this uses
    // a sequence of N + 1 matrices, each N by N, labeled R(0), R(1), ..., R(N).
    R := A                      // set matrix R(0) equal to matrix A
    For K := 1 to N Do
        For I := 1 to N Do
            For J := 1 to N Do
                R[I, J] := R[I, J] Or (R[I, K] And R[K, J])
            End Do // J
        End Do // I
    End Do // K
    Return R                    // this is R(N)

Another Version of the Algorithm.
    Algorithm Warshall(A[1..N, 1..N], R[1..N, 1..N])
    // This implements the version of Warshall's algorithm that uses only
    // two matrices, one for the input and one for the output.
    // Input:  A, the N-by-N adjacency matrix of the graph.
    // Output: R, the N-by-N transitive closure of the graph.
    For I := 1 to N Do          // set R := A
        For J := 1 to N Do
            R[I, J] := A[I, J]
        End Do // J
    End Do // I
    For K := 1 to N Do
        For I := 1 to N Do
            For J := 1 to N Do
                If 0 = R[I, J] Then R[I, J] := R[I, K] And R[K, J]
            End Do // J
        End Do // I
    End Do // K
    Return R

Example Graph for Warshall's Algorithm. Consider the following directed graph (figure removed). Here is the adjacency matrix from which the algorithm will work:
    0 0 0 0
A = 0 0 1 1
    0 1 0 0
    1 0 1 0
This is clearly the adjacency matrix of a directed graph. Note that it is not symmetric: A[1, 4] = 0 but A[4, 1] = 1, and A[2, 4] = 1 but A[4, 2] = 0.

Step 0: Generate R(0). This appears to be quite simple and straightforward: R(0) = A. We should note in passing that some authors define the matrix R(0) slightly differently, with R(0) = A except that R(0)[J, J] = 1 while A[J, J] = 0. [Reference: Introduction to Algorithms, Second Edition; Cormen, Leiserson, Rivest, and Stein; The MIT Press / McGraw-Hill, 2001; ISBN 0-07-013151-1.]

Step 1: Start the Computation. I shall follow our textbook's usage and start with the matrix R = R(0) = A. The key step: If 0 = R[I, J] Then R[I, J] := R[I, K] ∧ R[K, J].
K = 1, I = 1:
J = 1: R[1,1] is 0; compute R[1,1] := R[1,1] ∧ R[1,1] = 0 ∧ 0 = 0
J = 2: R[1,2] is 0; compute R[1,2] := R[1,1] ∧ R[1,2] = 0 ∧ 0 = 0
J = 3: R[1,3] is 0; compute R[1,3] := R[1,1] ∧ R[1,3] = 0 ∧ 0 = 0
J = 4: R[1,4] is 0; compute R[1,4] := R[1,1] ∧ R[1,4] = 0 ∧ 0 = 0
K = 1, I = 2:
J = 1: R[2,1] is 0; compute R[2,1] := R[2,1] ∧ R[1,1] = 0 ∧ 0 = 0
J = 2: R[2,2] is 0; compute R[2,2] := R[2,1] ∧ R[1,2] = 0 ∧ 0 = 0
J = 3: R[2,3] is 1
J = 4: R[2,4] is 1
So far we have no change in the matrix.
Step 1b: Continue for K = 1.
K = 1, I = 3:
J = 1: R[3,1] is 0; compute R[3,1] := R[3,1] ∧ R[1,1] = 0 ∧ 0 = 0
J = 2: R[3,2] is 1
J = 3: R[3,3] is 0; compute R[3,3] := R[3,1] ∧ R[1,3] = 0 ∧ 0 = 0
J = 4: R[3,4] is 0; compute R[3,4] := R[3,1] ∧ R[1,4] = 0 ∧ 0 = 0
K = 1, I = 4:
J = 1: R[4,1] is 1
J = 2: R[4,2] is 0; compute R[4,2] := R[4,1] ∧ R[1,2] = 1 ∧ 0 = 0
J = 3: R[4,3] is 1
J = 4: R[4,4] is 0; compute R[4,4] := R[4,1] ∧ R[1,4] = 1 ∧ 0 = 0
Still no change.

Step 2: K = 2. The matrix R is still equal to A. The key step: If 0 = R[I, J] Then R[I, J] := R[I, K] ∧ R[K, J].
K = 2, I = 1:
J = 1: R[1,1] is 0; compute R[1,1] := R[1,2] ∧ R[2,1] = 0 ∧ 0 = 0
J = 2: R[1,2] is 0; compute R[1,2] := R[1,2] ∧ R[2,2] = 0 ∧ 0 = 0
J = 3: R[1,3] is 0; compute R[1,3] := R[1,2] ∧ R[2,3] = 0 ∧ 1 = 0
J = 4: R[1,4] is 0; compute R[1,4] := R[1,2] ∧ R[2,4] = 0 ∧ 1 = 0
K = 2, I = 2:
J = 1: R[2,1] is 0; compute R[2,1] := R[2,2] ∧ R[2,1] = 0 ∧ 0 = 0
J = 2: R[2,2] is 0; compute R[2,2] := R[2,2] ∧ R[2,2] = 0 ∧ 0 = 0
J = 3: R[2,3] is 1
J = 4: R[2,4] is 1
Still no change. This is getting boring.
Step 2b: Continue for K = 2.
K = 2, I = 3:
J = 1: R[3,1] is 0; compute R[3,1] := R[3,2] ∧ R[2,1] = 1 ∧ 0 = 0
J = 2: R[3,2] is 1
J = 3: R[3,3] is 0; compute R[3,3] := R[3,2] ∧ R[2,3] = 1 ∧ 1 = 1
J = 4: R[3,4] is 0; compute R[3,4] := R[3,2] ∧ R[2,4] = 1 ∧ 1 = 1
K = 2, I = 4:
J = 1: R[4,1] is 1
J = 2: R[4,2] is 0; compute R[4,2] := R[4,2] ∧ R[2,2] = 0 ∧ 0 = 0
J = 3: R[4,3] is 1
J = 4: R[4,4] is 0; compute R[4,4] := R[4,2] ∧ R[2,4] = 0 ∧ 1 = 0

Examine the Results of Step 2. Consider the following computations:
K = 2, I = 3, J = 3: R[3,3] was 0; R[3,2] ∧ R[2,3] = 1 ∧ 1 = 1.
K = 2, I = 3, J = 4: R[3,4] was 0; R[3,2] ∧ R[2,4] = 1 ∧ 1 = 1.
What do these equations say? The first says that there is a loop from vertex 3 back to itself.
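The second version of the pseudocode translates almost line for line into Python. Here is a sketch (my own, not from the notes) run on the example graph's adjacency matrix:

```python
def warshall(a):
    """Transitive closure of a digraph given as a 0/1 adjacency matrix."""
    n = len(a)
    r = [row[:] for row in a]            # R(0) = A
    for k in range(n):                   # vertex K+1 in the 1-based notation
        for i in range(n):
            for j in range(n):
                if r[i][j] == 0:         # the 'If 0 = R[I,J]' key step
                    r[i][j] = r[i][k] and r[k][j]
    return r

# Adjacency matrix of the example digraph (vertex 1 has no outgoing edges).
a = [[0, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 1, 0, 0],
     [1, 0, 1, 0]]
for row in warshall(a):
    print(row)
```

The printed matrix agrees with the result obtained by tracing the K = 1 through K = 4 passes by hand.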
It goes through vertex 2. The second says that there is a path from vertex 3 to vertex 4; it also goes through vertex 2. Neither of these paths involves only vertex 1, so they did not appear during the K = 1 loop.

Step 3: K = 3. At this point:
    0 0 0 0
R = 0 0 1 1
    0 1 1 1
    1 0 1 0
The key step: If 0 = R[I, J] Then R[I, J] := R[I, K] ∧ R[K, J].
K = 3, I = 1:
J = 1: R[1,1] is 0; compute R[1,1] := R[1,3] ∧ R[3,1] = 0 ∧ 0 = 0
J = 2: R[1,2] is 0; compute R[1,2] := R[1,3] ∧ R[3,2] = 0 ∧ 1 = 0
J = 3: R[1,3] is 0; compute R[1,3] := R[1,3] ∧ R[3,3] = 0 ∧ 1 = 0
J = 4: R[1,4] is 0; compute R[1,4] := R[1,3] ∧ R[3,4] = 0 ∧ 1 = 0
K = 3, I = 2:
J = 1: R[2,1] is 0; compute R[2,1] := R[2,3] ∧ R[3,1] = 1 ∧ 0 = 0
J = 2: R[2,2] is 0; compute R[2,2] := R[2,3] ∧ R[3,2] = 1 ∧ 1 = 1
J = 3: R[2,3] is 1
J = 4: R[2,4] is 1
We have discovered another loop: vertex 2 to vertex 3 and back to vertex 2.
Step 3b: Continue for K = 3.
K = 3, I = 3:
J = 1: R[3,1] is 0; compute R[3,1] := R[3,3] ∧ R[3,1] = 1 ∧ 0 = 0
J = 2: R[3,2] is 1
J = 3: R[3,3] is 1
J = 4: R[3,4] is 1
K = 3, I = 4:
J = 1: R[4,1] is 1
J = 2: R[4,2] is 0; compute R[4,2] := R[4,3] ∧ R[3,2] = 1 ∧ 1 = 1
J = 3: R[4,3] is 1
J = 4: R[4,4] is 0; compute R[4,4] := R[4,3] ∧ R[3,4] = 1 ∧ 1 = 1
Here we have discovered a new path and a new loop: 4 → 3 → 2, and 4 → 3 → 2 → 4.

Step 4: K = 4. At this point:
    0 0 0 0
R = 0 1 1 1
    0 1 1 1
    1 1 1 1
The key step: If 0 = R[I, J] Then R[I, J] := R[I, K] ∧ R[K, J].
K = 4, I = 1:
J = 1: R[1,1] is 0; compute R[1,1] := R[1,4] ∧ R[4,1] = 0 ∧ 1 = 0
J = 2: R[1,2] is 0; compute R[1,2] := R[1,4] ∧ R[4,2] = 0 ∧ 1 = 0
J = 3: R[1,3] is 0; compute R[1,3] := R[1,4] ∧ R[4,3] = 0 ∧ 1 = 0
J = 4: R[1,4] is 0; compute R[1,4] := R[1,4] ∧ R[4,4] = 0 ∧ 1 = 0
K = 4, I = 2:
J = 1: R[2,1] is 0; compute R[2,1] := R[2,4] ∧ R[4,1] = 1 ∧ 1 = 1
J = 2: R[2,2] is 1. J = 3: R[2,3] is 1. J = 4: R[2,4] is 1.
Step 4b: Continue for K = 4.
K = 4, I = 3:
J = 1: R[3,1] is 0; compute R[3,1] := R[3,4] ∧ R[4,1] = 1 ∧ 1 = 1
J = 2: R[3,2] is 1. J = 3: R[3,3] is 1. J = 4: R[3,4] is 1.
K = 4, I = 4:
J = 1: R[4,1] is 1. J = 2: R[4,2] is 1. J = 3: R[4,3] is 1. J = 4: R[4,4] is 1.

Result: The Transitive Closure of the Graph. Here is the matrix representing the transitive closure of the graph:
    0 0 0 0
T = 1 1 1 1
    1 1 1 1
    1 1 1 1
What does this say about the graph?
1. Vertex 1 is a pendant vertex: there are no edges incident from vertex 1. We can see this by examining either the adjacency matrix or the graph itself.
2. There are directed paths from every other vertex to vertex 1, just no directed paths from vertex 1 to any other vertex.
3. For every other pair of distinct vertices J and K, there is a directed path from J to K and a directed path from K to J.
4. For every
vertex other than vertex 1, there is a nontrivial directed path from the vertex back to itself.
5. The graph is weakly connected, but not strongly connected.

Chapter 7: Finite State Machines and Design of Sequential Circuits. (CPSC 5155, July 30, 2010; 15 slides.)

Design of Sequential Circuits, Page 1:
1. Derive the state diagram and state table for the circuit.
2. Count the number of states in the state diagram (call it N) and calculate the number of flip-flops needed (call it P) by solving the equation 2^(P-1) < N ≤ 2^P. Guess the value of P.
3. Assign a unique P-bit binary number (state vector) to each state. Often the first state is 0, the next state is 1, etc.
4. Derive the state transition table and the output table.
5. Separate the state transition table into P tables, one for each flip-flop. WARNING: things can get messy here; neatness counts.

Design of Sequential Circuits, Page 2:
6. Decide on the types of flip-flops to use. When in doubt, use all JK's.
7. Derive the input table for each flip-flop, using the excitation tables for the type.
8. Derive the input equations for each flip-flop, as functions of the input and current state of all flip-flops.
9. Summarize the equations by writing them in one place.
10. Draw the circuit diagram. Most homework assignments will not go this far, as the circuit diagrams are hard to draw neatly.

Design Problem: A Modulo-4 Counter.
Step 1a: Derive the state diagram for the circuit. [State diagram figure removed.]
Step 1b: Derive the state table for the circuit.
Present State  Next State
0              1
1              2
2              3
3              0
Step 2: Count the number of states; determine the number of flip-flops. There are four states, so N = 4. We have 2^1 < 4 ≤ 2^2, so P = 2: we need two flip-flops. We shall number the two flip-flops as 1 and 0. Flip-flop 1 is the more significant bit. We denote
the state by Y1Y0.

Step 3: Assign a unique P-bit binary number to each state. This is often called a state vector.
State  2-bit Vector
0      00
1      01
2      10
3      11
For some designs the state names are not obvious. The names and vectors for a modulo-N counter are so obvious as almost to be dictated.

Step 4: Derive the state transition table and the output table. There is no output from this circuit, so there is no output table.
Present State  Next State
0  00          01
1  01          10
2  10          11
3  11          00
The output of the circuit is achieved by copying the state of the two flip-flops; there is no computed output.

Step 5: Separate the state transition table into P tables, one for each flip-flop. Here P = 2, so we use two tables.
Flip-Flop 1                     Flip-Flop 0
Present State  Next State       Present State  Next State
Y1 Y0          Y1               Y1 Y0          Y0
0  0           0                0  0           1
0  1           1                0  1           0
1  0           1                1  0           1
1  1           0                1  1           0

Step 6: Decide on the types of flip-flops to use. When in doubt, use all JK's. We elect to use JK flip-flops (my favorite) in this design. Here is the excitation table for a JK flip-flop:
Q(T)  Q(T+1)  J  K
0     0       0  d
0     1       1  d
1     0       d  1
1     1       d  0

Step 7: Derive the input table for each flip-flop, using the excitation tables for the type.
Flip-Flop 1
PS       NS   Input
Y1 Y0    Y1   J1 K1
0  0     0    0  d
0  1     1    1  d
1  0     1    d  0
1  1     0    d  1
Flip-Flop 0
PS       NS   Input
Y1 Y0    Y0   J0 K0
0  0     1    1  d
0  1     0    d  1
1  0     1    1  d
1  1     0    d  1
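Step 7 can be mechanized: for each present state, look up the required (J, K) pair in the excitation table. A sketch (the dictionary encoding is my own, for illustration):

```python
# JK flip-flop excitation table: (present Q, next Q) -> required (J, K).
# 'd' marks a don't-care input.
EXCITATION = {(0, 0): ("0", "d"),
              (0, 1): ("1", "d"),
              (1, 0): ("d", "1"),
              (1, 1): ("d", "0")}

# Input table for flip-flop 1 of the modulo-4 counter: next Y1 = Y1 XOR Y0.
rows = []
for y1 in (0, 1):
    for y0 in (0, 1):
        next_y1 = y1 ^ y0
        j1, k1 = EXCITATION[(y1, next_y1)]
        rows.append((y1, y0, next_y1, j1, k1))
        print(y1, y0, next_y1, j1, k1)
```

The printed rows reproduce the flip-flop 1 input table, in which both the J1 and K1 columns match Y0.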
page 8

Step 8: Derive the input equations for each flip-flop as functions of the input and current state of all flip-flops. We try to develop a Boolean expression that matches the values in each column. Note that the "d" (don't care) entries are ignored. We can use Karnaugh Maps (K-Maps), but I prefer simpler rules.

1. If a column does not have a 0 in it, match it to the constant value 1. If a column does not have a 1 in it, match it to the constant value 0.
2. If the column has both 0's and 1's in it, try to match it to a single variable, which must be part of the present state (PS). Only the 0's and 1's in a column must match the suggested function.
3. If every 0 and 1 in the column is a mismatch, match to the complement of a function.
4. If all the above fails, try for simple combinations of the present state. One can always set d = 0 and develop a Sum-of-Products expression.

Slide 10 of 15 slides.

Design Problem, page 9

Step 8, Flip-Flop 1: Neither J1 nor K1 can be matched to a constant, as each column contains both 0 and 1. Neither J1 nor K1 can be matched to Y1. Both J1 and K1 can be matched to Y0. Match to the present state (PS), not the next state (NS).

    PS        NS    Input
    Y1 Y0     Y1    J1  K1
     0  0      0     0   d
     0  1      1     1   d
     1  0      1     d   0
     1  1      0     d   1

    J1 = Y0        K1 = Y0

Slide 11 of 15 slides.

Design Problem, page 10

Step 8, Flip-Flop 0: Neither J0 nor K0 has a 0 in its column, so each of J0 and K0 can be matched to the constant 1.

    PS        NS    Input
    Y1 Y0     Y0    J0  K0
     0  0      1     1   d
     0  1      0     d   1
     1  0      1     1   d
     1  1      0     d   1

    J0 = 1         K0 = 1

Step 9: Summarize the equations by writing them in one place. Here they are:

    J1 = Y0        K1 = Y0
    J0 = 1         K0 = 1

Slide 12 of 15 slides.

Design Problem, page 11

Step 10: Draw the circuit diagram. [Figure: the counter built from two JK flip-flops on a common clock, with J0 = K0 = 1 and J1 = K1 = Y0.] Here is the same circuit using T flip-flops. [Figure: two T flip-flops on the common clock, with T0 = 1 and T1 = Y0.]
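As a sanity check on the derived equations (J1 = K1 = Y0 and J0 = K0 = 1), here is a small Python simulation of my own; it is not part of the original notes. It applies the standard JK characteristic equation Q+ = J·Q' + K'·Q to both flip-flops and steps the counter through one full cycle.

```python
def jk_next(q, j, k):
    """JK flip-flop characteristic equation: Q+ = J*Q' + K'*Q."""
    return (j & (1 - q)) | ((1 - k) & q)

def step(y1, y0):
    """One clock tick, using the input equations derived above:
    J1 = K1 = Y0 and J0 = K0 = 1."""
    return jk_next(y1, y0, y0), jk_next(y0, 1, 1)

state = (0, 0)
seen = [state]
for _ in range(4):
    state = step(*state)
    seen.append(state)
# The counter should cycle 00 -> 01 -> 10 -> 11 -> 00.
```

Running this confirms that the two equations really do implement the modulo-4 count sequence.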
Slide 13 of 15 slides.

Design Problem, page 12

Here is one version of the complete circuit. [Figure: the complete modulo-4 counter, drawn with two T flip-flops and a common clock.]

Slide 14 of 15 slides.

The Modulo-4 Counter Using a Shift Register

[Figure: four flip-flops connected in a ring, with outputs T0, T1, T2, and T3.] This is known as a "one-hot" design, as only one flip-flop at a time is set to 1. Initially Y0 = 1 and all the other flip-flops are 0.

Slide 15 of 15 slides.

Divide and Conquer

The divide-and-conquer strategy may be more sophisticated than brute force, but it is not always more efficient. Consider this example.

    Function RecursiveAdd (A, 1, N)
        If N = 1 Then
            Return A[1]
        Else
            M = N / 2                       // Integer division
            S1 = RecursiveAdd (A, 1, M)
            S2 = RecursiveAdd (A, M + 1, N)
            Return S1 + S2
        End If

The recurrence relation is T(N) = 2·T(N/2) + 1. Use the Master Theorem with A = 2, B = 2, and D = 0. B^D = 2^0 = 1, so A > B^D. Compute P = log_B(A) = log_2(2) = 1, and T(N) ∈ Θ(N^P) = Θ(N). But the brute-force approach is also a linear function.

Recursive Sorts: Mergesort and Quicksort

The standard divide-and-conquer strategy for sorting arrays calls for:
1. Splitting the array into two sub-arrays of approximately equal size;
2. Applying the sort algorithm to each of the sub-arrays;
3. Recombining the two sub-arrays into one big sorted array.

We consider the strategy for splitting the array. Consider array A[1..N].

Mergesort: Split into two sub-arrays by index: M = N/2; split A[1..N] into A[1..M] and A[M+1..N].

Quicksort: Split the array by value. Pick an index S and a value P = A[S]. Split A[1..N] into two parts: one part with elements only less than or equal to P, and one part with elements only not less than P.

Standard Mergesort

Here is a presentation of the standard mergesort algorithm, rewritten for 1-based arrays A[1..N].

    Algorithm Mergesort (A, 1, N)
        If N > 1 Then
            M = N / 2
            NM = N - M                      // N minus M
            Copy A[1..M] to B[1..M]
            Copy A[M+1..N] to C[1..NM]
            Mergesort (B, 1, M)
            Mergesort (C, 1, NM)
            Merge (B, C, A)                 // Recreate the big array
        End If

Common Variant of Mergesort

Most people use a slight variant of the algorithm, especially for
paper solutions.

    Algorithm Mergesort (A, 1, N)
        If N > 2 Then
            M = N / 2
            NM = N - M                      // N minus M
            Copy A[1..M] to B[1..M]
            Copy A[M+1..N] to C[1..NM]
            Mergesort (B, 1, M)
            Mergesort (C, 1, NM)
            Merge (B, C, A)                 // Recreate the big array
        Else If N = 2 Then
            If A[1] > A[2] Then Swap (A[1], A[2])
        End If

A Standard Merge Algorithm

    Algorithm Merge (B[1..P], C[1..Q], A[1..N])
        // Merges two ordered arrays B and C to form an array A
        J = 1                               // Index into array B
        K = 1                               // Index into array C
        L = 1                               // Index into array A
        While (J ≤ P) and (K ≤ Q) Do
            If B[J] ≤ C[K] Then
                A[L] = B[J];  J = J + 1
            Else
                A[L] = C[K];  K = K + 1
            End If
            L = L + 1
        End While
        While J ≤ P Do:  A[L] = B[J];  J = J + 1;  L = L + 1
        While K ≤ Q Do:  A[L] = C[K];  K = K + 1;  L = L + 1

Mergesort Example

Sort the array A = [7, 2, 1, 6, 4]. N = 5, M = 2, NM = 3; B = [7, 2] and C = [1, 6, 4].
Sort B = [7, 2]: 7 > 2, so it becomes [2, 7].
Sort C = [1, 6, 4]: N = 3, M = 1, NM = 2; here B = [1] (already sorted) and C = [6, 4], becoming [4, 6]. Merge to get [1, 4, 6].
Now merge B = [2, 7] and C = [1, 4, 6] to get A = [1, 2, 4, 6, 7].

Another Mergesort Example

Sort the array [3, 1, 4, 1, 5, 9, 2, 6] using the variant algorithm.

Efficiency Analysis of Mergesort

The basic observation is that the time efficiency of merge is Θ(N): two lists, each of size about N/2, for a total size of N, can be merged with almost exactly N comparisons. The recurrence relation for mergesort is thus T(N) = 2·T(N/2) + N. Use the Master Theorem with A = 2, B = 2, and D = 1. A = B^D, so T(N) ∈ Θ(N^D · log N), or T(N) ∈ Θ(N·log N). Later we shall see that any algorithm that sorts by comparison only must have time complexity T(N) ∈ Ω(N·log N). Thus mergesort is about as efficient as possible.

Goal of Each Stage of Quicksort

The array is split by value, not by index. The pivot element is chosen to divide the array into two sub-arrays and a distinct element. The picture is as follows.

    | Unsorted array: all A[J] ≤ P | Pivot element: P = A[S] | Unsorted array: all A[J] ≥ P |

After this partitioning, each sub-array is sorted. Note the partial duplication of the sub-arrays: each of the two may contain values equal to the pivot. This is not a problem, as further sorting will move the elements correctly.

Merging in the Quicksort Algorithm

Unlike mergesort, there is no explicit merge step in quicksort. The reason is easily explained by looking at a given step in the sort, both before the recursive calls and after its
recursive calls.

The array has just been partitioned:

    | Unsorted array: all A[J] ≤ P | Pivot element: P = A[S] | Unsorted array: all A[J] ≥ P |

The two recursive calls to sort the sub-arrays have each been completed:

    | Sorted array A[1..S-1]: A[1] ≤ A[2] ≤ ... ≤ A[S-1] ≤ P | A[S] = P | Sorted array A[S+1..N]: P ≤ A[S+1] ≤ ... ≤ A[N] |

Notice that this array is already sorted. No merge step is necessary.

Selecting the Pivot Element

Look again at the result of partitioning the array A[1..N]:

    | Unsorted array: all A[J] ≤ P | Pivot element: P = A[S] | Unsorted array: all A[J] ≥ P |

It is desirable for the two sorted sub-arrays to be of about the same size. Mergesort achieves this end by selecting S = N/2 and splitting by index. Ideally, the pivot element would be chosen as the median of the values in the array, so the array [3, 1, 4, 1, 5, 9, 2, 6, 7] would be partitioned as [3, 1, 1, 2] 4 [5, 9, 6, 7]. This idea leads to a difficulty: putting too much work into selection of the pivot element slows down the sorting process, possibly making it Θ(N^2). Two common pivot strategies:

1. The most common is to select P = A[1]. This usually works well.
2. Another commonly suggested strategy is to compute the index M = (N + 1)/2 and let P be the median of A[1], A[M], and A[N].

More on Pivot Element Selection

Experience has shown that clever pivot-element selection strategies (1) produce performance improvements only in rare instances, and (2) usually degrade performance in commonly seen instances.

Consider P = A[1] on the example array [3, 1, 4, 1, 5, 9, 2, 6, 7]; this gives the partitioned array [1, 1, 2] 3 [4, 5, 9, 6, 7]. This is not too badly balanced. But consider [1, 2, 3, 4, 5, 6, 7, 8, 9]; this leads to [] 1 [2, 3, 4, 5, 6, 7, 8, 9], where the left sub-array is empty. Consider [9, 8, 7, 6, 5, 4, 3, 2, 1]; this leads to [8, 7, 6, 5, 4, 3, 2, 1] 9 [], where the right sub-array is empty. It is easily shown that the performance of quicksort on a sorted array is no better than brute force, and may be considerably worse than bubble sort.

Overview of Quicksort

Quicksort was developed by C. A. R. Hoare (there is a silly joke here).

    Algorithm QuickSort (A, L, R)
        // Sort that part of the array beginning at index L and ending at index R
        If L < R Then
            S = Partition (A, L, R)         // Find the split index
            // Note: index S is not included in either sub-array
            QuickSort (A, L, S - 1)
            QuickSort (A, S + 1, R)
        End If
    End Algorithm

Time Efficiency Analysis of Quicksort

The recurrence for quicksort is T(N) = 2·T(N/2) + Q(N), where Q(N) is the time efficiency of the partition process. Let Q(N) ∈ Θ(N^D) and apply the Master Theorem with A = 2 and B = 2.

If D = 0, then 2 = A > 1 = B^D = 2^0, and quicksort has T(N) ∈ Θ(N^1). This is unlikely, since the partitioning of N elements requires the examination of N elements; we can argue convincingly that Q(N) ∈ Ω(N^1).

If D = 1, then 2 = A = B^D = 2^1, and quicksort has T(N) ∈ Θ(N·log N). Our example partition algorithm has Q(N) ∈ Θ(N^1), hence D = 1.

If D ≥ 2, then 2 = A < B^D, and quicksort has T(N) ∈ Θ(N^D) for D ≥ 2; quicksort on the average is then no faster than brute-force methods. It is for this reason that we are looking for efficient partitioning strategies.

The Partition Algorithm: Setting Up the Algorithm

Partition (A, L, R). [Figure: the array A[L..R], with marker I at index L and marker J at index R + 1.] Note that J = R + 1 is off the end of the array, at present an invalid index. This is allowed because all references to this index are as follows:

    Repeat J = J - 1 Until A[J] ≤ P

The first reference is to A[(R + 1) - 1], or A[R].

    Pivot Value P = A[L]
    I = L;  J = R + 1        // Set up the situation shown in the figure above

Note the last two lines in the book's algorithm. They are

    swap (A[i], A[j]);  swap (A[l], A[j]);

not

    swap (A[i], A[j]);  swap (A[i], A[j]);

as I originally read them.

The Partition Algorithm

    Partition (A, L, R)
        // Note: J = R + 1 is off the end of the array, at present an
        // invalid index. But note that A[R + 1] is never referenced.
        Pivot Value P = A[L]
        I = L;  J = R + 1
        Repeat
            Repeat I = I + 1 Until A[I] ≥ P
            Repeat J = J - 1 Until A[J] ≤ P
            Swap (A[I], A[J])
        Until I ≥ J
        // On the last pass through the loop, when I ≥ J, the algorithm
        // does one swap too many. The most efficient way to handle that
        // is just to undo it here.
        Swap (A[I], A[J])
        Swap (A[L], A[J])
        Return J

Quicksort Sample Problem

Most studies of quicksort just focus on giving examples of the partition algorithm. This is mostly what we shall do. The situation at the start, for sorting [3, 1, 4, 1, 5, 9, 2, 6, 4], is as follows. [Figure: the array with marker I at L and marker J one position past R.] Note again that index J is off the end. The first reference
using the index J will have it decremented, so that the first element actually referenced is A[R].

Example of Partition, Loop 1

[Figure: the array with markers I and J.]

    Repeat I = I + 1 Until A[I] ≥ P
    Repeat J = J - 1 Until A[J] ≤ P
    Swap (A[I], A[J])

[Figure: the array after the first swap, with markers I and J.]

Example of Partition, Loop 2

[Figure: the array after the second pass and its Swap (A[I], A[J]).]

Example of Partition, Loop Exits

At the end of the loop: Swap A[I] and A[J] again, then Swap (A[L], A[J]) and Return J. Quicksort then proceeds with two recursive calls: one to sort the sub-array [1, 1, 2], the other to sort the sub-array [9, 4, 6, 5, 4].

Binary Search: Search an Ordered List

    Algorithm BinSearch (A[0..N-1], K)
        // Search for value K
        L = 0
        R = N - 1
        While L ≤ R Do
            M = (L + R) / 2
            If K = A[M] Then
                Return M
            Else If K < A[M] Then
                R = M - 1
            Else
                L = M + 1
        End While
        Return -1                // Indicate value not found (return -M, if one prefers)

Chapter 1: Boolean Algebra and Some Combinational Circuits

We begin this course in computer architecture with a review of topics from the prerequisite course. It is assumed that the student is familiar with the basics of Boolean algebra and two's-complement arithmetic, but it never hurts to review.

Boolean algebra is the algebra of variables that can assume two values: True and False. Conventionally, we associate these as follows: True = 1 and False = 0. Formally, Boolean algebra is defined over a set of elements {0, 1}, two binary operators AND and OR, and a single unary operator NOT. These operators are conventionally represented as follows: • for AND, + for OR, and ' for NOT; thus X' is Not(X). The Boolean operators are completely defined by truth tables.

    AND:  0•0 = 0     OR:  0+0 = 0     NOT:  0' = 1
          0•1 = 0          0+1 = 1           1' = 0
          1•0 = 0          1+0 = 1
          1•1 = 1          1+1 = 1

Note that the use of + for the OR operation is restricted to those cases in which addition is not being discussed. When addition is also important, we use different symbols for the binary Boolean operators, the most common being ∧ for AND and ∨ for OR.

Implementation of Boolean Logic by Circuitry

The Boolean values are represented by specific voltages in the electronic circuitry. As a result of experience, it has been found desirable only to have two voltage levels, called High and Low (or H and L). This leads to two types of
logic:

    Negative Logic:  High = 0, Low = 1
    Positive Logic:  High = 1, Low = 0

This course will focus on Positive Logic and ignore Negative Logic. As a matter of fact, we shall only occasionally concern ourselves with voltage levels. In Positive Logic, 5 volts is taken as the high level and 0 volts as the low level. These are the ideals. In actual practice, the standards allow some variation, depending on whether the level is output or input.

                  Output of Logic Gate    Input to Logic Gate
    Logic High    2.4 to 5.0 volts        2.0 to 5.0 volts
    Logic Low     0.0 to 0.4 volts        0.0 to 0.8 volts

The tighter constraints on output allow for signal degradation due to fan-out effects. We shall discuss fan-out briefly at a later time; basically, it describes the maximum number of devices that can take the output of a logic gate before the voltage drops too low.

Basic Gates for Boolean Functions

The OR gate is shown below. In general, if any input to an OR gate is 1, the output is 1.

    A  B  A+B
    0  0   0
    0  1   1
    1  0   1
    1  1   1

The AND gate is shown below. In general, if any input to an AND gate is 0, the output is 0.

    A  B  A•B
    0  0   0
    0  1   0
    1  0   0
    1  1   1

The NOT gate denotes the NOT function: NOT(A) is denoted as either A' or Ā; we use Ā.

The Exclusive OR (XOR) gate is the same as the OR gate, except that the output is 0 (False) when both inputs are 1 (True). The symbol for XOR is ⊕. We include it at this level because it is extremely convenient. This is the two-input Exclusive OR gate, the function of which is shown in the following truth table.

    A  B  A⊕B
    0  0   0
    0  1   1
    1  0   1
    1  1   0

XOR and the NOT function: for any A, A ⊕ 0 = A and A ⊕ 1 = Ā. The proof is by truth table.

    A  B  A⊕B   Result
    0  0   0      A
    0  1   1      Ā
    1  0   1
    1  1   0

This result is extremely useful when designing a ripple-carry adder/subtractor.

The basic logic gates are defined in terms of the binary Boolean functions. Thus the basic logic gates are: two-input AND gates, two-input OR gates, NOT gates, two-input NAND gates, two-input NOR gates, and two-input XOR gates. It is common to find three-input and four-input varieties of AND, OR, NAND, and NOR gates; nor are they hard to understand. Consider a four-input AND gate. The output function is described as an easy generalization of the two-input AND gate: the output is 1 (True) if and only if all of the inputs are 1; otherwise the output is 0. One can synthesize a four-input AND gate from three two-input AND gates, or easily convert a four-input AND gate into a two-input AND gate (see the figure on the next page of these notes, or page 24 of the textbook).

    AND(A, B, C, D) = AND( AND(A, B), AND(C, D) )

Here is the general rule for N-input AND gates and N-input OR gates.

    AND: Output is 0 if any input is 0. Output is 1 only if all inputs are 1.
    OR:  Output is 1 if any input is 1. Output is 0 only if all inputs are 0.
    XOR: For N > 2, N-input XOR gates are not useful and will be avoided.

Derived Gates

We now show 2 gates that may be considered as derived from the above: NAND and NOR. The NOR gate is the same as an OR gate followed by a NOT gate. The NAND gate is the same as an AND gate followed by a NOT gate. Electrical engineers may validly object that these two gates are not "derived" in the sense that they are built from AND and OR; in fact, NAND gates are often used to make AND gates. As always, we are interested in the non-engineering approach and stay with our view: NOT, AND, and OR are basic. We shall ignore the XNOR gate; if needed, its function can be implemented as an XOR gate followed by a NOT gate.

We close this chapter with a discussion of a special integrated circuit that is of great utility: the tri-state buffer. There are two variants: enabled-high and enabled-low. The circuit diagrams for these are shown below.

[Figure: two tri-state buffers, each with Input, Output, and Enable lines; one labeled "Enabled High", the other "Enabled Low".]

We shall discuss the tri-state buffer by comparison with a NOT gate. The output of a NOT gate has two values: logic 0 (0 volts) and logic 1 (5 volts). For either of these outputs, the gate is trying to assert a voltage on the output line. A tri-state buffer has a special output state called High-Z, or High Impedance; in this state, the buffer asserts no voltage at all on the output line. If the output of a disabled tri-state buffer is connected to any other line, the value on this line must be determined by another circuit.

A Bit of Electrical Engineering

We now mention a few engineering details. Many textbooks show the implementation of two-input NAND gates and two-input NOR gates using two transistors each; note the use of only two transistors per gate in TTL. There are a number of logic conventions, falling into two classifications: positive logic and negative logic, and active-high and active-low. In the convention used here, logic 1 is represented by a positive voltage and logic 0 is represented by 0 volts. The positive voltage is sometimes denoted by V, and the zero voltage by ground. The symbol for ground is shown at right; it means that the voltage input is being forced to 0. It is often used to denote the completion of a circuit.

There is another voltage state that is so simple that it is hard to define. The engineers call this high-impedance, or high-Z, as they use the symbol Z for impedance. Many of you may have heard of the idea of electrical resistance; in the study of high-frequency AC circuits, the idea is generalized to impedance. We shall not worry about the term in its precise definition; we just use the term high-Z as a label.

First, think of a water line connected to a tap. Positive voltage is analogous to the state in which the tap is turned on; 0 volts, to the tap turned off. A more realistic scenario is the use of jumper cables to start a car. Consider the jumper cable before it is attached to the battery: were a light bulb attached to the cable, there would be no light seen. The cable is asserting no voltage at all; it is at high-Z. Attach the cable to the positive terminal of the battery, and its voltage goes from high-Z to 12 volts; remove it, and the voltage goes from 12 volts back to high-Z, not to 0 volts. Now, without removing the cable from the positive terminal, attach the other end to the positive terminal as well, so that one end is asserting 12 volts and the other end is asserting ground. Your car will catch on fire.

This is the main difference between a high-Z state and a ground state. In the ground state, the line actively pulls the voltage down to 0 volts. A wire at high-Z asserts nothing, and it is tempting to think of it as being at 0 volts. For many usages this informal way of thinking works, but it fails when we consider digital circuits.

We now turn our attention to the actual voltage levels in TTL devices. We have said that a logic level corresponds to a range of voltages. To allow for this fact, and to allow for noise margins, the voltage levels shown in the table above are defined for TTL logic.

The Red Light / Green Light Problem

Consider an attempt to monitor a remote process by having a red light in a control room. This circuit may appear to work: if a problem occurs, the monitor raises a voltage and illuminates the red light. But suppose the red light is off. What does it mean? It may be the case that there is no problem, and the monitor is raising a logic 0. Another possibility is a break in the monitor line.

[Figure: a monitor connected by a transmission line to two lights; the red light is at the top and the green light at the bottom.] In this diagram, the monitor asserts a 1, in which case the red light is illuminated, or a 0, in which case the green light is illuminated. But suppose a break in the transmission line. The value of the voltage on the line goes to high-Z, or undefined, and neither light is driven. The resulting interpretation is indicated by the truth table below.

    Red Light   Green Light   Interpretation
    OFF         OFF           Break in the transmission line, or trouble in the monitor
    OFF         ON            Process is OK
    ON          OFF           A problem has been detected
    ON          ON            Trouble in the control room; the process is OK

Tri-State Buffers

Here is a tri-state buffer next to a simple non-inverting buffer, with a comment on its function. Logically, a buffer does nothing. Electrically, the buffer serves as an amplifier, converting a degraded signal into a more useable form: an input of 2.5 volts (logic 1) might be output as 4.5 volts (also logic 1) for use by other gates. A tri-state gate may be enabled-high (C = 1) or enabled-low (C = 0). Consider an enabled-high tri-state buffer: when C = 1, the buffer is enabled and F = A; when C = 0, the buffer is disabled and F is at high-Z. The figure above illustrates this disabled state as a break in the circuit. We now consider an enabled-low tri-state, as seen in the figure below: when C = 0, the buffer is enabled and F = A; when C = 1, it is disabled and F is at high-Z.

[Figure: a circuit with two Boolean inputs A and B, one output F, and an enable signal C; A drives an enabled-low tri-state, B drives an enabled-high tri-state, and the two outputs are tied together.] When C = 0, the top tri-state is enabled and the bottom tri-state is disabled, so F = A; when C = 1, the bottom tri-state is enabled, so F = B. The effect of the enables is that exactly one buffer drives the line at any time, so there is no possibility of conflict or short-circuit. This is a 2-to-1 multiplexer built from tri-states.

A Programmable Logic Array (PLA) is an
example of such a package: a number of NOT gates, a number of AND gates, and one OR gate per output. The PLA provides a convenient one-chip solution to the implementation of such functions. [Figure: a PLA on the three variables A, B, and C, with four AND gates feeding the OR gates.] Note that each of the four AND gates has six possible inputs, of which only three are connected. In general, each AND gate in a PLA will have a possible input for each variable and its negation. In this example, each OR gate has four possible inputs, one for each of the AND gates.

The output of gate AND1 is Ā•B̄•C. The output of gate AND2 is Ā•B•C̄. The output of gate AND3 is A•B̄•C̄. The output of gate AND4 is A•B•C. And thus F1 = Ā•B̄•C + Ā•B•C̄ + A•B̄•C̄ and F2 = Ā•B•C̄.

Input / Output and I/O Strategies

Topics: The Four Major Input/Output Strategies; Preliminary Definitions; A Silly Example to Illustrate Basic Definitions; A Context for Advanced I/O Strategies.

The Four Strategies

Here are the simple definitions of the four I/O strategies.

Program-Controlled I/O. This is the simplest to implement. The executing program manages every aspect of I/O processing. I/O occurs only when the program calls for it. If the I/O device is not ready to perform its function, the CPU waits for it to be ready; this is "busy waiting". The next two strategies are built upon program-controlled I/O.

Interrupt-Driven I/O. In this variant, the I/O device can raise a signal, called an interrupt, when it is ready to perform input or output. The CPU performs the I/O only when the device is ready for it. In some cases, this interrupt can be viewed as an alarm indicating an undesirable event.

Direct Memory Access. This variant elaborates on the two above. The I/O device interrupts, and is sent a word count and a starting address by the CPU. The transfer takes place as a block.

I/O Channel. This assigns I/O to a separate processor, which uses one of the above three strategies.

I/O Strategies: A Silly Example

I am giving a party to which a number of people are invited. I know exactly how many people will attend. I know that the guests will
not arrive before 6:00 PM. All guests will enter through my front door. In addition to the regular door, which can be locked, it has a screen door and a doorbell. I have ordered pizzas and beer, each to be delivered; all deliveries come to the back door. I must divide my time between baking cookies for the party and going to the door to let the visitors into the house. I am a careless cook and often burn the cookies. We now give the example by I/O categories.

The Silly Example

Program Controlled: Here I go to the door at 6:00 PM and wait. As each one arrives, I open the door and admit the guest. I do not leave the door until the last guest has arrived; nothing gets done in the kitchen.

Interrupt Driven: Here I make use of the fact that the door has a doorbell. I continue working in the kitchen until I hear the doorbell. When the doorbell rings, I put down my work, go to the door, and admit the guest.
Note 1: I do not drop the work, but bring it to a quick and orderly conclusion. If I am removing cookies from the oven, I place them in a safe place to cool before answering the door.
Note 2: If I am fighting a grease fire, I ignore the doorbell and first put out the fire. Only when it is safe do I attend to the door.
Note 3: With a guest at the front door and the beer truck at the back door, I have a difficult choice, but I must attend to each quickly.

The Silly Example, Part 2

Direct Memory Access: I continue work in the kitchen until the first guest arrives and rings the doorbell. At that point, I take a basket and place some small gifts into it, one for each guest. I go to the door, unlock it, admit the guest, and give the first present. I leave the main door open. I place the basket of gifts outside with instructions that each guest take one gift and come into the house without ringing the doorbell. There is a sign above the basket asking the guest taking the last gift to notify me, so that I can return to the front door and close it again. In the Interrupt-Driven analog, I had to go to the door once for each guest. In the DMA analog,
I had to go to the door only twice.

I/O Channel: Here I hire a butler and tell him to manage the door any way he wants. He just has to get the guests into the party and keep them happy.

Another Simple Example

We first examine program-controlled I/O. We give an example that appears to be correct, but is not. We are using the primitive command Input to read from a dedicated input device. It is the ASCII codes for characters that are read, with a 0 used to indicate no more input.

          Get R1, DeviceData
          Skip if R1 > 0           // ASCII code 0 means no more
          Jump Done
    Loop: Store R1, X[J]           // Not really a MARIE instruction
          J = J + 1                // Again, pseudocode
          Get R1, DeviceData
          Skip if R1 = 0
          Jump Loop                // Go back and get another
    Done: Continue

What's wrong? Simply put: what is the guarantee that the dedicated input device has a new character ready when the next Input is executed?

Program-Controlled Input and the Busy Wait

Each input or output device must have at least three registers.

Status Register: This allows the CPU to determine a number of status issues. Is the device ready? Is its power on? Are there any device errors? Does the device have new data to input? Is the device ready for output?

Control Register: Enable the device to raise an interrupt when it is ready to move data. Instruct a printer to follow every <LF> with a <CR>. Move the read/write heads on a disk.

Data Register: Whatever data is to be transferred.

Suppose our dedicated input device has three registers, including DeviceStatus, which is greater than zero if and only if there is a character ready to be input.

    Busy: Get R2, DeviceStatus
          Skip if R2 > 0
          Jump Busy
          Get R1, DeviceData

When Is Program-Controlled I/O Appropriate?

Basically put, it is appropriate only when the I/O action can proceed immediately. There are two standard cases in which this might be used successfully.

1. The device can respond immediately when polled for input. For example, consider an electronic sensor monitoring temperature or pressure. However, we shall want these sensors to
be able to raise interrupts.

2. When a device has already raised an interrupt, indicating that it is ready to process data.

In a modern computer, the basic I/O instructions (GET and PUT on the Boz-5) are considered privileged. They may be issued only by the Operating System. User programs issue traps to the operating system to access these instructions. These are "system calls", made in a standardized fashion that is easily interpreted by system programs.

Diversion: Process Management

We now must place the three advanced I/O strategies within the proper context, in order to see why we even bother with them. We use an early strategy, called Time Sharing, to illustrate the process management associated with handling interrupts and direct memory access. In the Time Sharing model we have:
1. A single computer, with its CPU, memory, and sharable I/O resources;
2. A number of computer terminals attached to the CPU; and
3. A number of users, each of whom wants to use the computer.

In order to share this expensive computer more fairly, we establish two rules:
1. Each user process is allocated a time slice during which it can be run. At the end of this time, it must give up the CPU, go to the back of the line, and await its turn for another time slice.
2. When a process is blocked and waiting on completion of either input or output, it must give up the CPU, and cannot run until the I/O has been completed.

With this convention, each user typically thinks he or she is the only one using the computer. Thus, the computer is "time shared".

The Classic Process Diagram

Here is the standard process state diagram associated with modern operating systems. [Figure: the process state diagram, with an arc labeled "I/O Complete" leading back to the Ready state.] When a process (think "user program") executes an I/O trap instruction (remember that it cannot execute the I/O directly), the OS suspends its operation and starts the I/O on its behalf. When the I/O is complete, the OS marks the process as ready to run. It will be assigned to the CPU when it next becomes available. [Figure: a user program requests input, is blocked, later copies the data from a buffer, and resumes processing.]

The Three
Actors for Input

    Operating System                          Input Device
    Block the process.
    Reset the input status register.          Status = 0; interrupt is enabled.
    Enable the device interrupt.
    Command the input.                        Input begins.
    Dispatch another process.
                                              Input is complete; raise interrupt.
    Acknowledge the interrupt.                Place the input data into a register.
    Place data into a buffer for the process.
    Mark the process as ready to run.

Obviously, there is more to it than this.

Dynamic Programming

Dynamic programming is, put another way, a strategy based on the solution of a number of overlapping subproblems. Rather than push for a precise but confusing definition of dynamic programming, we shall illustrate it by applying the dynamic programming solution to a problem, even when a more efficient solution to that problem exists. We must have some good examples in order to clarify our discussion in this course.

The Fibonacci numbers are defined by the recurrence relation Fib(N) = Fib(N - 1) + Fib(N - 2) for N ≥ 2, with Fib(0) = 0 and Fib(1) = 1. The problem with the obvious recursive algorithm is that it repeatedly computes the values of certain numbers. The figure below shows the call tree for computing the 5th Fibonacci number using the recursive algorithm. Note that Fib(0) is called 3 times, Fib(1) is called 5 times, Fib(2) is called 3 times, and Fib(3) is called twice.

    Fib(5)
    +-- Fib(4)
    |   +-- Fib(3)
    |   |   +-- Fib(2)
    |   |   |   +-- Fib(1)
    |   |   |   +-- Fib(0)
    |   |   +-- Fib(1)
    |   +-- Fib(2)
    |       +-- Fib(1)
    |       +-- Fib(0)
    +-- Fib(3)
        +-- Fib(2)
        |   +-- Fib(1)
        |   +-- Fib(0)
        +-- Fib(1)

The recursive algorithm discards these intermediate results, even when they can be reused in later computations.

Page 1

The dynamic programming solution to the computation of the Fibonacci number makes use of an array. Following the example of the textbook on page 82, we call the array F and assume that it is big enough; specifically, if we compute Fib(K), we assume that the array F has at least K + 1 elements, indexed from 0 to K. Here is the algorithm.

    Algorithm Fib (K)
        // Computes the Kth Fibonacci number.
        // Uses an integer array F to hold the intermediate results.
        Long F[N]                 // Assume N is a constant, N > K
        If K < 1 Return 0         // Skip the dull stuff
        F[0] = 0
        F[1] = 1
        For I = 2 to K Do
            F[I] = F[I - 1] + F[I - 2]
        End Do
        Return F[K]

As we noted earlier, there is another dynamic programming solution to this problem that is based on the fact that the algorithm must remember only the
previous two results. This algorithm is shown below.

    Algorithm Fib (K)
        // Computes the Kth Fibonacci number by retaining the
        // results of the previous two computations.
        If K < 1 Return 0
        F1 = 0
        F2 = 1
        For J = 2 to K Do
            F0 = F1               // F0 is Fib(J - 2)
            F1 = F2               // F1 is Fib(J - 1)
            F2 = F0 + F1          // F2 is Fib(J)
        End Loop
        Return F2

Each of the two algorithms on this page is better than the recursive algorithm. The first algorithm is preferable if one wants to compute each of the first K + 1 Fibonacci numbers. The second algorithm is preferable if one wants to compute only Fib(K).

Page 2

Warshall's Algorithm

Warshall's algorithm is the first of two graph algorithms based on dynamic programming that we shall study in this class. Warshall's algorithm computes the transitive closure of a directed graph and, by extension, of an undirected graph, since every undirected graph can be represented as a directed graph.

Background

We first need to recall the two basic Boolean functions, AND and OR. Each of these is a Boolean function of two Boolean variables, which can take the value 0 (False) or 1 (True). Each of these two functions can be defined by a truth table or by description. For Boolean variables X and Y:

    X AND Y = 1 if and only if X = 1 and Y = 1. Otherwise it is 0.
    X OR Y = 0 if and only if X = 0 and Y = 0. Otherwise it is 1.

The truth table definitions are given to be complete.

The adjacency matrix A = (A[I,J]) of a directed graph is the Boolean matrix that has A[I,J] = 1 if and only if there is a directed edge from vertex I to vertex J. If A[I,J] ≠ 1, then A[I,J] = 0. For a directed graph having N vertices, the adjacency matrix is an N-by-N matrix.

The transitive closure T = (T[I,J]) of a directed graph with N vertices is an N-by-N Boolean matrix in which T[I,J] = 1 if and only if there exists a non-trivial directed path (a directed path of positive length) from vertex I to vertex J. If T[I,J] ≠ 1, then T[I,J] = 0.

There are a number of ways to compute the transitive closure of a directed graph. We focus on one of the more efficient methods, called Warshall's algorithm. Warshall's algorithm computes the transitive closure matrix T through a sequence of Boolean matrices R(0), R(1), ..., R(K-1), R(K), ..., R(N). The definition of these matrices is that R(K)[I,J] = 1 if and only if there is a directed path from vertex I to vertex J with each intermediate vertex (if any) not numbered higher than K. By this definition, the matrix R(0) has elements R(0)[I,J] = 1 if and only if the intermediate path contains no vertices numbered above 0, thus no intermediate vertices at all; so R(0) = A, the adjacency matrix. The matrix R(N) shows all directed paths that can use any vertex from the vertex set V(G). Thus we conclude that R(N) = T, the transitive closure.

Page 3

To be specific, we give a few obvious definitions. R(1)[I,J] = 1 if and only if there is either a direct path from vertex I to J, or a path of length 2 that goes through only vertex 1. R(2)[I,J] = 1 if and only if there is either (1) a direct path from vertex I to J, or (2) a path of length 2 that goes through only vertex 1, or (3) a path of length 2 that goes through only vertex 2, or (4) a path of length 3 that goes through only vertices 1 and 2, in either order.

So far, all we have is a bunch of interesting definitions and observations. There is one fact that turns the above into a useable algorithm: the fact that matrix R(K) can be computed from only its predecessor matrix R(K-1). Consider the computation of element R(K)[I,J], which is 1 if and only if there is a directed path from vertex I to vertex J containing no vertex numbered higher than K. Now consider the matrix R(K-1), with its element R(K-1)[I,J] defined similarly. Suppose that R(K-1)[I,J] = 1. Then there is a path from vertex I to vertex J containing no intermediate vertex numbered higher than K - 1, certainly fulfilling the criterion that there be no intermediate vertex numbered higher than K. Thus we arrive at the first conclusion:

    If R(K-1)[I,J] = 1, then R(K)[I,J] = 1.

If R(K-1)[I,J] = 0, there is another way to create a directed path from vertex I to vertex J containing no vertex numbered higher than K: (1) there is a path from vertex I to vertex K involving no vertex numbered higher than K - 1, and (2) there is a path from vertex K to vertex J involving no vertex numbered higher than K - 1. Thus we have the second conclusion:

    If R(K-1)[I,K] = 1 and R(K-1)[K,J] = 1, then R(K)[I,J] = 1.

Based on these two conclusions, we have the fundamental Boolean equation of the method:

    R(K)[I,J] = R(K-1)[I,J] OR ( R(K-1)[I,K] AND R(K-1)[K,J] )

Note that, by definition of a path in a graph, neither the start vertex nor the end vertex is a part of the path. Thus, consider the paths from vertex I to vertex K containing no intermediate vertex numbered greater than K. Since vertex K is the end point of the path, it cannot appear as an intermediate vertex, and the path, if present, contains no intermediate vertex numbered greater than K - 1.

Page 4
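The two dynamic-programming Fibonacci algorithms given earlier on this page translate directly into Python. This is my own rendering, not the textbook's; the function names are mine.

```python
def fib_array(k):
    """First version: keep the whole table F[0..K] of intermediate results."""
    if k < 1:
        return 0
    f = [0] * (k + 1)
    f[1] = 1
    for i in range(2, k + 1):
        f[i] = f[i - 1] + f[i - 2]
    return f[k]

def fib_two_vars(k):
    """Second version: retain only the previous two results."""
    if k < 1:
        return 0
    f1, f2 = 0, 1          # f1 = Fib(J - 1), f2 = Fib(J) as the loop runs
    for _ in range(2, k + 1):
        f1, f2 = f2, f1 + f2
    return f2
```

Both run in Θ(K) time; the first uses Θ(K) space and exposes the whole table, while the second uses constant space, matching the trade-off described in the notes.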
matrix T through a sequence of Boolean matrices R R R K39I R K R0 The definition of these matrices is that R001 1 if and only if there is a directed path from vertex Ito vertex J with each intermediate vertex if any not numbered higher than K By this definition the matrix Rm has elements Rm 1 if and only the intermediate path contains no vertices numbered above 0 thus no vertices 7 Rm A the adjacency matrix The matrix R00 shows all directed paths that can use any vertex from the vertex set VG Thus we conclude that RN T the transitive closure Page 3 To be speci c we give a few obvious de nitions R0 1 if and only if there is either a direct path from vertex Ito J or if there is a path of length 2 that goes through only vertex 1 Ra 1 if and only if there is either 1 a direct path from vertex Ito J or 2 a path of length 2 that goes through only vertex 1 or 3 a path of length 2 that goes through only vertex 2 or 4 a path of length 3 that goes through only vertices l and 2 in either order So far all we have is a bunch of interesting de nitions and observations There is one fact that turns the above into a useable algorithm 7 the fact that matrix R00 can be computed from only its predecessor matrix R 391 Consider the computation of element R001 which is 1 if and only if there is a directed path from vertex Ito vertex J containin no vertex numbered higher than K Now consider the matrix Ra with its element RK391 U de ned similarly Suppose that ROG 1 Then there is a path from vertex Ito vertex J containing no intermediate vertex numbered higher than K 7 l certainly ful lling the criteria that there be no intermediate vertex numbered higher than K Thus we arrive at the rst conclusion If ROW 77 1 then R091 7 1 IfRK3911J 0 there is another way to create a directed path from vertex Ito vertex J containing no vertex numbered higher than K 1 there is a path from vertex Ito vertex K involving no vertex numbered higher than K 7 l and 2 there is a path from vertex K to vertex J 
Thus we have the second conclusion:
    If R(K-1)[I,K] = 1 and R(K-1)[K,J] = 1, then R(K)[I,J] = 1.

Based on these two conclusions, we have the fundamental Boolean equation of the method:
    R(K)[I,J] = R(K-1)[I,J] OR ( R(K-1)[I,K] AND R(K-1)[K,J] )

Note that, by the definition of a path in a graph, neither the start vertex nor the end vertex is an intermediate vertex of the path. Thus, consider the paths from vertex I to vertex K containing no intermediate vertex numbered greater than K. Since vertex K is the end point of the path, it cannot appear as an intermediate vertex, and the path, if present, contains no intermediate vertex numbered greater than K - 1.

The situation discussed above may be illustrated by the figure below.
[Figure: a path from I to K and a path from K to J, compared with a path from I to J; each path involves only intermediate vertices in the set {1, ..., K - 1}.]

Warshall's algorithm is repeated here. Note that it constructs a sequence of N + 1 matrices, R(0) through R(N).

Algorithm Warshall( A[1..N, 1..N] )
// Implements Warshall's algorithm for computing the transitive closure
// of a graph on N nodes with adjacency matrix A.
// Uses a sequence of N + 1 matrices: R(0), R(1), ..., R(N).
R(0) = A    // Set matrix R(0) equal to matrix A
For K = 1 to N Do
    For I = 1 to N Do
        For J = 1 to N Do
            R(K)[I,J] = R(K-1)[I,J] Or ( R(K-1)[I,K] And R(K-1)[K,J] )
        End Do // J
    End Do // I
End Do // K
Return R(N)

In fact, generating more than one matrix in this algorithm is not necessary, as we shall see shortly. In order to see this, we first rewrite the inner assignment in an equivalent, but apparently less efficient, form:
    R(K)[I,J] = R(K-1)[I,J]
    If R(K)[I,J] = 0 Then R(K)[I,J] = R(K-1)[I,K] And R(K-1)[K,J]
That is, only when R(K-1)[I,J] = 0 do we need to do the computation.

The key to simplifying the space complexity of this algorithm is based on the following:
    1) R(K)[I,K] = R(K-1)[I,K], as no path from vertex I to vertex K involves vertex K as an intermediate node; hence all such paths involving only vertices 1 through K must be limited to intermediate vertices 1 through K - 1.
    2) R(K)[K,J] = R(K-1)[K,J], for the same reason.

Based on this observation, we can create a simpler version of Warshall's algorithm.

Algorithm Warshall( A[1..N, 1..N], R[1..N, 1..N] )
// Implements Warshall's algorithm for computing the transitive closure of a graph on N nodes
// with adjacency matrix A. This implements the version of Warshall's algorithm
// that uses only two matrices: one for the input and one for the output.
// Input:  A, the N-by-N adjacency matrix of the graph.
// Output: R, the N-by-N transitive closure of the graph.
For I = 1 to N Do           // Set R = A
    For J = 1 to N Do
        R[I,J] = A[I,J]
    End Do // J
End Do // I
For K = 1 to N Do
    For I = 1 to N Do
        For J = 1 to N Do
            If R[I,J] = 0 Then R[I,J] = R[I,K] And R[K,J]
        End Do // J
    End Do // I
End Do // K
Return R

As an example, we compute the transitive closure of the graph with adjacency matrix

        0 1 0 0 1
        0 0 1 0 1
A  =    0 1 0 1 0
        0 0 0 0 1
        0 0 0 1 0

In order to make the example perfectly clear, we follow the first version of Warshall's algorithm, with the numbered matrices beginning with R(0) = A.

The start matrix is

         0 1 0 0 1
         0 0 1 0 1
R(0) =   0 1 0 1 0
         0 0 0 0 1
         0 0 0 1 0

K = 1: The loop for K = 1 generates R(1). The operative code is the following line:
    R(1)[I,J] = R(0)[I,J]; If R(1)[I,J] = 0 Then R(1)[I,J] = R(0)[I,1] And R(0)[1,J]
But note that R(0)[I,1] = 0 for all values of I: no edge enters vertex 1. The end result is that none of the zero values in the matrix R(0) are changed to 1; hence we have R(1) = R(0).

K = 2: The loop for K = 2 generates R(2). The operative code is the following line:
    R(2)[I,J] = R(1)[I,J]; If R(2)[I,J] = 0 Then R(2)[I,J] = R(1)[I,2] And R(1)[2,J]

I = 1, J = 1: R(1)[1,1] = 0, so R(2)[1,1] = R(1)[1,2] And R(1)[2,1] = 1 · 0 = 0
I = 1, J = 2: R(1)[1,2] = 1, so R(2)[1,2] = 1
I = 1, J = 3: R(1)[1,3] = 0, so R(2)[1,3] = R(1)[1,2] And R(1)[2,3] = 1 · 1 = 1
I = 1, J = 4: R(1)[1,4] = 0, so R(2)[1,4] = R(1)[1,2] And R(1)[2,4] = 1 · 0 = 0
I = 1, J = 5: R(1)[1,5] = 1, so R(2)[1,5] = 1
I = 2, J = 1: R(1)[2,1] = 0, so R(2)[2,1] = R(1)[2,2] And R(1)[2,1] = 0 · 0 = 0
I = 2, J = 2: R(1)[2,2] = 0, so R(2)[2,2] = R(1)[2,2] And R(1)[2,2] = 0 · 0 = 0
I = 2, J = 3: R(1)[2,3] = 1, so R(2)[2,3] = 1
I = 2, J = 4: R(1)[2,4] = 0, so R(2)[2,4] = R(1)[2,2] And R(1)[2,4] = 0 · 0 = 0
I = 2, J = 5: R(1)[2,5] = 1, so R(2)[2,5] = 1
I = 3, J = 1: R(1)[3,1] = 0, so R(2)[3,1] = R(1)[3,2] And R(1)[2,1] = 1 · 0 = 0
I = 3, J = 2: R(1)[3,2] = 1, so R(2)[3,2] = 1
I = 3, J = 3: R(1)[3,3] = 0, so R(2)[3,3] = R(1)[3,2] And R(1)[2,3] = 1 · 1 = 1
I = 3, J = 4: R(1)[3,4] = 1, so R(2)[3,4] = 1
I = 3, J = 5: R(1)[3,5] = 0, so R(2)[3,5] = R(1)[3,2] And R(1)[2,5] = 1 · 1 = 1
I = 4, J = 1: R(1)[4,1] = 0, so R(2)[4,1] = R(1)[4,2] And R(1)[2,1] = 0 · 0 = 0
I = 4, J = 2: R(1)[4,2] = 0, so R(2)[4,2] = R(1)[4,2] And R(1)[2,2] = 0 · 0 = 0
I = 4, J = 3: R(1)[4,3] = 0, so R(2)[4,3] = R(1)[4,2] And R(1)[2,3] = 0 · 1 = 0
I = 4, J = 4: R(1)[4,4] = 0, so R(2)[4,4] = R(1)[4,2] And R(1)[2,4] = 0 · 0 = 0
I = 4, J = 5: R(1)[4,5] = 1, so R(2)[4,5] = 1
I = 5, J = 1: R(1)[5,1] = 0, so R(2)[5,1] = R(1)[5,2] And R(1)[2,1] = 0 · 0 = 0
I = 5, J = 2: R(1)[5,2] = 0, so R(2)[5,2] = R(1)[5,2] And R(1)[2,2] = 0 · 0 = 0
I = 5, J = 3: R(1)[5,3] = 0, so R(2)[5,3] = R(1)[5,2] And R(1)[2,3] = 0 · 1 = 0
I = 5, J = 4: R(1)[5,4] = 1, so R(2)[5,4] = 1
I = 5, J = 5: R(1)[5,5] = 0, so R(2)[5,5] = R(1)[5,2] And R(1)[2,5] = 0 · 1 = 0

We now have

         0 1 1 0 1
         0 0 1 0 1
R(2) =   0 1 1 1 1
         0 0 0 0 1
         0 0 0 1 0

K = 3: The loop for K = 3 generates R(3). The operative code is the following line:
    R(3)[I,J] = R(2)[I,J]; If R(3)[I,J] = 0 Then R(3)[I,J] = R(2)[I,3] And R(2)[3,J]

I = 1, J = 1: R(2)[1,1] = 0, so R(3)[1,1] = R(2)[1,3] And R(2)[3,1] = 1 · 0 = 0
I = 1, J = 2: R(2)[1,2] = 1, so R(3)[1,2] = 1
I = 1, J = 3: R(2)[1,3] = 1, so R(3)[1,3] = 1
I = 1, J = 4: R(2)[1,4] = 0, so R(3)[1,4] = R(2)[1,3] And R(2)[3,4] = 1 · 1 = 1
I = 1, J = 5: R(2)[1,5] = 1, so R(3)[1,5] = 1
I = 2, J = 1: R(2)[2,1] = 0, so R(3)[2,1] = R(2)[2,3] And R(2)[3,1] = 1 · 0 = 0
I = 2, J = 2: R(2)[2,2] = 0, so R(3)[2,2] = R(2)[2,3] And R(2)[3,2] = 1 · 1 = 1
I = 2, J = 3: R(2)[2,3] = 1, so R(3)[2,3] = 1
I = 2, J = 4: R(2)[2,4] = 0, so R(3)[2,4] = R(2)[2,3] And R(2)[3,4] = 1 · 1 = 1
I = 2, J = 5: R(2)[2,5] = 1, so R(3)[2,5] = 1
I = 3, J = 1: R(2)[3,1] = 0, so R(3)[3,1] = R(2)[3,3] And R(2)[3,1] = 1 · 0 = 0
I = 3, J = 2: R(2)[3,2] = 1, so R(3)[3,2] = 1
I = 3, J = 3: R(2)[3,3] = 1, so R(3)[3,3] = 1
I = 3, J = 4: R(2)[3,4] = 1, so R(3)[3,4] = 1
I = 3, J = 5: R(2)[3,5] = 1, so R(3)[3,5] = 1
I = 4, J = 1: R(2)[4,1] = 0, so R(3)[4,1] = R(2)[4,3] And R(2)[3,1] = 0 · 0 = 0
I = 4, J = 2: R(2)[4,2] = 0, so R(3)[4,2] = R(2)[4,3] And R(2)[3,2] = 0 · 1 = 0
I = 4, J = 3: R(2)[4,3] = 0, so R(3)[4,3] = R(2)[4,3] And R(2)[3,3] = 0 · 1 = 0
I = 4, J = 4: R(2)[4,4] = 0, so R(3)[4,4] = R(2)[4,3] And R(2)[3,4] = 0 · 1 = 0
I = 4, J = 5: R(2)[4,5] = 1, so R(3)[4,5] = 1
I = 5, J = 1: R(2)[5,1] = 0, so R(3)[5,1] = R(2)[5,3] And R(2)[3,1] = 0 · 0 = 0
I = 5, J = 2: R(2)[5,2] = 0, so R(3)[5,2] = R(2)[5,3] And R(2)[3,2] = 0 · 1 = 0
I = 5, J = 3: R(2)[5,3] = 0, so R(3)[5,3] = R(2)[5,3] And R(2)[3,3] = 0 · 1 = 0
I = 5, J = 4: R(2)[5,4] = 1, so R(3)[5,4] = 1
I = 5, J = 5: R(2)[5,5] = 0, so R(3)[5,5] = R(2)[5,3] And R(2)[3,5] = 0 · 1 = 0

We now have

         0 1 1 1 1
         0 1 1 1 1
R(3) =   0 1 1 1 1
         0 0 0 0 1
         0 0 0 1 0

K = 4: The loop for K = 4 generates R(4). The operative code is the following line:
    R(4)[I,J] = R(3)[I,J]; If R(4)[I,J] = 0 Then R(4)[I,J] = R(3)[I,4] And R(3)[4,J]

I = 1, J = 1: R(3)[1,1] = 0, so R(4)[1,1] = R(3)[1,4] And R(3)[4,1] = 1 · 0 = 0
I = 1, J = 2: R(3)[1,2] = 1, so R(4)[1,2] = 1
I = 1, J = 3: R(3)[1,3] = 1, so R(4)[1,3] = 1
I = 1, J = 4: R(3)[1,4] = 1, so R(4)[1,4] = 1
I = 1, J = 5: R(3)[1,5] = 1, so R(4)[1,5] = 1
I = 2, J = 1: R(3)[2,1] = 0, so R(4)[2,1] = R(3)[2,4] And R(3)[4,1] = 1 · 0 = 0
I = 2, J = 2: R(3)[2,2] = 1, so R(4)[2,2] = 1
I = 2, J = 3: R(3)[2,3] = 1, so R(4)[2,3] = 1
I = 2, J = 4: R(3)[2,4] = 1, so R(4)[2,4] = 1
I = 2, J = 5: R(3)[2,5] = 1, so R(4)[2,5] = 1
I = 3, J = 1: R(3)[3,1] = 0, so R(4)[3,1] = R(3)[3,4] And R(3)[4,1] = 1 · 0 = 0
I = 3, J = 2: R(3)[3,2] = 1, so R(4)[3,2] = 1
I = 3, J = 3: R(3)[3,3] = 1, so R(4)[3,3] = 1
I = 3, J = 4: R(3)[3,4] = 1, so R(4)[3,4] = 1
I = 3, J = 5: R(3)[3,5] = 1, so R(4)[3,5] = 1
I = 4, J = 1: R(3)[4,1] = 0, so R(4)[4,1] = R(3)[4,4] And R(3)[4,1] = 0 · 0 = 0
I = 4, J = 2: R(3)[4,2] = 0, so R(4)[4,2] = R(3)[4,4] And R(3)[4,2] = 0 · 0 = 0
I = 4, J = 3: R(3)[4,3] = 0, so R(4)[4,3] = R(3)[4,4] And R(3)[4,3] = 0 · 0 = 0
I = 4, J = 4: R(3)[4,4] = 0, so R(4)[4,4] = R(3)[4,4] And R(3)[4,4] = 0 · 0 = 0
I = 4, J = 5: R(3)[4,5] = 1, so R(4)[4,5] = 1
I = 5, J = 1: R(3)[5,1] = 0, so R(4)[5,1] = R(3)[5,4] And R(3)[4,1] = 1 · 0 = 0
I = 5, J = 2: R(3)[5,2] = 0, so R(4)[5,2] = R(3)[5,4] And R(3)[4,2] = 1 · 0 = 0
I = 5, J = 3: R(3)[5,3] = 0, so R(4)[5,3] = R(3)[5,4] And R(3)[4,3] = 1 · 0 = 0
I = 5, J = 4: R(3)[5,4] = 1, so R(4)[5,4] = 1
I = 5, J = 5: R(3)[5,5] = 0, so R(4)[5,5] = R(3)[5,4] And R(3)[4,5] = 1 · 1 = 1

We now have

         0 1 1 1 1
         0 1 1 1 1
R(4) =   0 1 1 1 1
         0 0 0 0 1
         0 0 0 1 1

K = 5: The loop for K = 5 generates R(5). The operative code is the following line:
    R(5)[I,J] = R(4)[I,J]; If R(5)[I,J] = 0 Then R(5)[I,J] = R(4)[I,5] And R(4)[5,J]
does not Tquot l m d r oh edge rfrt enrsts Handhng nonrexlstent edges ls sornewhat ofaproblem the graph from a slmple graph to a slmple werghted graph Chznannngz conslderlssues that the book overlooks Conslderflgure 8 5 on page 284 The graph ls shown atlelt rts adjacency rnatnn just below 0 3 2 0 w Note the mlsslng entnes 7 0 l Page 12 generallzed adjacency matnx as 0 w 3 w 2 0 w w w 7 0 l 6 w w 0 The problem here ls how to represent ln nlty m the eomputer program Ifwe go wth the IEEE standard oahng pornt we ean representlnflmty as follows slngle Preelslon 0x7180 0000 Double Preelslon 0x7FFO 0000 0000 0000 the CPU hardware wlll support thls part ofthe standard Also can one say x 7pm belng usedto representthe werghts For the two more eommon data types we mrght say For l rbltlnte ers Inflnlty 2767 For slnglerpreclslon oatrng pornt Inflnlty 510 reasonable way to handle the lnflnlty problem ore we rnyeshgate the algonthm we shouldtake note o a werght Conslder the eyele A gt c gtB ah wrth total werght of2 3 e o 71 Ofcourse one shouldbe susplclous of any graph havlng edges wrth negatrye werght The approaeh usedln Floyd39s algonthm ls so slmllar to thatusedrn Warshall39s al gonthm that some textbooks teaeh the two algonthms under the name FloydWarshall The algonthm n N 1 mamces Dm D D quot wlth D 51 J represenhng the shortestpath from yertex Ito vertex 1 wth no yertex numbered hlgher than K belng rneluded m the path Page 13 As m Warshan39s a1gonthrn vve eonsrder tvvo possrb1e paths between verhees 1 andJ 1 a path from Ito J mcludmg no vertex numbered hrgher than K 71 and 2 a path from 1to K mcludmg no vertex numbered hrgher than K e 1 followed by vertex K followed by a srrnr1ar path from vertex K to vertex 1 1 D Kuch Daii h Each 113th invnlves nnly vem39ees in the set 1 Kr 1 v tn rnphtrn tn K pamal detanee Dm J mm D K39UH 1 Damn K D KquotK J for 1 s K s N vvrth DUDE J W1J Note also that vertex K eannotbe apart ofthe path endmg on rtself so that D 1 K D 1 K and D KK J 
Algorithm Floyd( W[1..N, 1..N] )
// Implements Floyd's algorithm for shortest paths.
D = W
For K = 1 to N Do
    For I = 1 to N Do
        For J = 1 to N Do
            X = D[I,K] + D[K,J]
            If X < D[I,J] Then D[I,J] = X
        End Do
    End Do
End Do
Return D

As an example, we shall investigate a graph that is isomorphic to the example above. Two graphs are considered to be isomorphic if they have the same basic structure, with only superficial differences. This graph is the same as the original example, with the sole exception that the vertices are labeled with numbers and not letters; this facilitates application of the algorithm.

We now consider the infinity problem as applied to this specific example: we make the non-existent edges finite but large. Here the sum of the edge weights is 1 + 2 + 3 + 6 + 7 = 19. I pick any number larger than twice the total edge weight to represent infinity; here I choose 40. In this example I plan to show the sequence D(0), D(1), D(2), D(3), and D(4) explicitly.

         0  40   3  40
D(0) =   2   0  40  40
        40   7   0   1
         6  40  40   0

Note first that all D[I,J] ≥ 0 and D[I,K] + D[K,J] ≥ 0, so if D[I,J] = 0 it is not possible to have D[I,K] + D[K,J] < D[I,J] = 0; the zeroes on the diagonal never change.

K = 1: This loop generates the matrix D(1). The operative code is the following:
    X = D(0)[I,1] + D(0)[1,J]; If X < D(0)[I,J] Then D(1)[I,J] = X Else D(1)[I,J] = D(0)[I,J]

I = 1, J = 1: D(0)[1,1] = 0, so no change.
I = 1, J = 2: D(0)[1,2] = 40; X = D(0)[1,1] + D(0)[1,2] = 0 + 40 = 40, so no change.
I = 1, J = 3: D(0)[1,3] = 3, so no change.
I = 1, J = 4: D(0)[1,4] = 40; X = D(0)[1,1] + D(0)[1,4] = 0 + 40 = 40, so no change.
Of course, we know that for I = 1 = K there can be no change on this row, so no surprise here.

I = 2, J = 1: D(0)[2,1] = 2; X = D(0)[2,1] + D(0)[1,1] = 2 + 0 = 2, so no change.
I = 2, J = 2: D(0)[2,2] = 0, so no change.
I = 2, J = 3: D(0)[2,3] = 40; X = D(0)[2,1] + D(0)[1,3] = 2 + 3 = 5, a new value.
I = 2, J = 4: D(0)[2,4] = 40; X = D(0)[2,1] + D(0)[1,4] = 2 + 40 = 42, no change.
Here we note one advantage of using very large finite numbers in this algorithm: the sum 2 + 40 has meaning, and is equal to 42, while the sum 2 + ∞ has no meaning.

I = 3, J = 1: D(0)[3,1] = 40; X = D(0)[3,1] + D(0)[1,1] = 40 + 0 = 40, so no change.
I = 3, J = 2: D(0)[3,2] = 7; X = D(0)[3,1] + D(0)[1,2] = 40 + 40 = 80, so no change.
I = 3, J = 3: D(0)[3,3] = 0, so no change.
I = 3, J = 4: D(0)[3,4] = 1; X = D(0)[3,1] + D(0)[1,4] = 40 + 40 = 80, so no change.
We could have examined the third row of the matrix and shown that there would be no change in any element, by noting that every computation of D(1)[3,J] involves adding a distance to D(0)[3,1], which we have set to our representation of infinity.
I = 4, J = 1: D(0)[4,1] = 6; X = D(0)[4,1] + D(0)[1,1] = 6 + 0 = 6, so no change.
I = 4, J = 2: D(0)[4,2] = 40; X = D(0)[4,1] + D(0)[1,2] = 6 + 40 = 46, so no change.
I = 4, J = 3: D(0)[4,3] = 40; X = D(0)[4,1] + D(0)[1,3] = 6 + 3 = 9, a new value.
I = 4, J = 4: D(0)[4,4] = 0, so no change.

At this point we have the following intermediate result:

         0  40   3  40
D(1) =   2   0   5  40
        40   7   0   1
         6  40   9   0

Before continuing the example, we present a revision of the algorithm that we shall adopt during the solution of this example.

Algorithm Floyd( W[1..N, 1..N] )
// Implements Floyd's algorithm for shortest paths.
// Input:  The weight matrix of the graph.
// Output: The distance matrix of the graph.
D = W    // Copy the W matrix, so we don't change it
For K = 1 to N Do
    For I = 1 to N Do
        If I ≠ K And D[I,K] ≠ ∞ Then
            For J = 1 to N Do
                X = D[I,K] + D[K,J]
                If X < D[I,J] Then D[I,J] = X
            End Do // J
        End If
    End Do // I
End Do // K

K = 2: This loop generates the matrix D(2). The operative code is the following:
    X = D(1)[I,2] + D(1)[2,J]; If X < D(1)[I,J] Then D(2)[I,J] = X Else D(2)[I,J] = D(1)[I,J]

Row I = 1: D(1)[I,K] = D(1)[1,2] = 40, so no change on this row.
Row I = 2: I = K = 2, so no change on this row.
I = 3, J = 1: D(1)[3,1] = 40; X = D(1)[3,2] + D(1)[2,1] = 7 + 2 = 9, a new value.
I = 3, J = 2: D(1)[3,2] = 7; X = D(1)[3,2] + D(1)[2,2] = 7 + 0 = 7, so no change.
I = 3, J = 3: D(1)[3,3] = 0, so no change.
I = 3, J = 4: D(1)[3,4] = 1; X = D(1)[3,2] + D(1)[2,4] = 7 + 40 = 47, so no change.
Row I = 4: D(1)[I,K] = D(1)[4,2] = 40, so no change on this row.

At this point we have the following intermediate result:

         0  40   3  40
D(2) =   2   0   5  40
         9   7   0   1
         6  40   9   0

K = 3: This loop generates the matrix D(3). The operative code is the following:
    X = D(2)[I,3] + D(2)[3,J]; If X < D(2)[I,J] Then D(3)[I,J] = X Else D(3)[I,J] = D(2)[I,J]

I = 1, J = 1: D(2)[1,1] = 0, so no change.
I = 1, J = 2: D(2)[1,2] = 40; X = D(2)[1,3] + D(2)[3,2] = 3 + 7 = 10, a new value.
I = 1, J = 3: D(2)[1,3] = 3; X = D(2)[1,3] + D(2)[3,3] = 3 + 0 = 3, so no change.
I = 1, J = 4: D(2)[1,4] = 40; X = D(2)[1,3] + D(2)[3,4] = 3 + 1 = 4, a new value.
I = 2, J = 1: D(2)[2,1] = 2; X = D(2)[2,3] + D(2)[3,1] = 5 + 9 = 14, so no change.
I = 2, J = 2: D(2)[2,2] = 0, so no change.
I = 2, J = 3: D(2)[2,3] = 5; X = D(2)[2,3] + D(2)[3,3] = 5 + 0 = 5, so no change.
I = 2, J = 4: D(2)[2,4] = 40; X = D(2)[2,3] + D(2)[3,4] = 5 + 1 = 6, a new value.
Row I = 3: I = K = 3, so no change on this row.
I = 4, J = 1: D(2)[4,1] = 6; X = D(2)[4,3] + D(2)[3,1] = 9 + 9 = 18, so no change.
I = 4, J = 2: D(2)[4,2] = 40; X = D(2)[4,3] + D(2)[3,2] = 9 + 7 = 16, a new value.
I = 4, J = 3: D(2)[4,3] = 9; X = D(2)[4,3] + D(2)[3,3] = 9 + 0 = 9, so no change.
I = 4, J = 4: D(2)[4,4] = 0, so no change.
At this point we have the following intermediate result:

         0  10   3   4
D(3) =   2   0   5   6
         9   7   0   1
         6  16   9   0

K = 4: This loop generates the matrix D(4). The operative code is the following:
    X = D(3)[I,4] + D(3)[4,J]; If X < D(3)[I,J] Then D(4)[I,J] = X Else D(4)[I,J] = D(3)[I,J]

I = 1, J = 1: D(3)[1,1] = 0, so no change.
I = 1, J = 2: D(3)[1,2] = 10; X = D(3)[1,4] + D(3)[4,2] = 4 + 16 = 20, so no change.
I = 1, J = 3: D(3)[1,3] = 3; X = D(3)[1,4] + D(3)[4,3] = 4 + 9 = 13, so no change.
I = 1, J = 4: D(3)[1,4] = 4; X = D(3)[1,4] + D(3)[4,4] = 4 + 0 = 4, so no change.
I = 2, J = 1: D(3)[2,1] = 2; X = D(3)[2,4] + D(3)[4,1] = 6 + 6 = 12, so no change.
I = 2, J = 2: D(3)[2,2] = 0, so no change.
I = 2, J = 3: D(3)[2,3] = 5; X = D(3)[2,4] + D(3)[4,3] = 6 + 9 = 15, so no change.
I = 2, J = 4: D(3)[2,4] = 6; X = D(3)[2,4] + D(3)[4,4] = 6 + 0 = 6, so no change.
I = 3, J = 1: D(3)[3,1] = 9; X = D(3)[3,4] + D(3)[4,1] = 1 + 6 = 7, a new value.
I = 3, J = 2: D(3)[3,2] = 7; X = D(3)[3,4] + D(3)[4,2] = 1 + 16 = 17, so no change.
I = 3, J = 3: D(3)[3,3] = 0, so no change.
I = 3, J = 4: D(3)[3,4] = 1; X = D(3)[3,4] + D(3)[4,4] = 1 + 0 = 1, so no change.
Row I = 4: I = K = 4, so no change.

The final result is the distance matrix:

         0  10   3   4
D(4) =   2   0   5   6
         7   7   0   1
         6  16   9   0

Some Standard Problems

Here we present the brute-force solution to a number of small instances of standard problems, and a few oddball ones:
    1) TSP (the Traveling Salesman Problem)
    2) Knapsack
    3) Job Shop Scheduling
    4) Dogs, Cats, and Mice (an oddball)
    5) Evaluating Standard Trigonometric Functions
As we undertake the brute-force solutions of the standard problems, we shall take notice of a number of short cuts that we shall later develop.

A Didactic Issue
There is an issue with the presentation of any of the standard problems in the class called NP-Hard. This has to do with the size of the problem instance. Rather than give a precise definition of instance size, we go with intuitive definitions specific to each problem: the number of cities, the item count, etc.

The standard didactic goal is to teach methods that can be used to speed up significantly the solution of a large instance of a standard problem. As an example, consider a tour of 50 cities, with each city visited exactly once. The difficulty with solving such monster problems is that a complete solution might take quite a few class periods to develop, and would not contribute much to the student's understanding of either the problem or the solution method. As a result, we elect to illustrate these industrial-strength solution strategies on toy-sized problems.
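Looking back at the Floyd example just completed: the revised algorithm, with its early exits for I = K and D[I,K] = infinity, is equally short in code. A minimal Python sketch (the names INF and floyd are mine; 40 is the stand-in for infinity chosen above):

```python
INF = 40   # the example's stand-in for infinity (larger than twice the edge-weight total)

def floyd(w):
    """All-pairs shortest paths; w is an N-by-N weight matrix (INF = no edge)."""
    n = len(w)
    d = [row[:] for row in w]                 # copy W so we don't change it
    for k in range(n):
        for i in range(n):
            if i != k and d[i][k] != INF:     # the early-exit tests of the revised algorithm
                for j in range(n):
                    x = d[i][k] + d[k][j]
                    if x < d[i][j]:
                        d[i][j] = x
    return d

# Weight matrix of the four-vertex example.
W = [[0, INF, 3, INF],
     [2, 0, INF, INF],
     [INF, 7, 0, 1],
     [6, INF, INF, 0]]
```

On this weight matrix, floyd(W) reproduces the final distance matrix D(4) shown above.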
Such toy-sized problems could more easily be solved by more primitive methods; the advantage is that the strategies are obvious.

Optimization Problems: Feasibility vs. Optimality
Almost all optimization problems involve either minimizing or maximizing some function, subject to constraints. The function is universally called the objective function. If the function is to be minimized, it is often also called a cost function; if the function is to be maximized, it is often also called a profit function.

Any solution that satisfies the constraints of the problem is called a feasible solution. An optimal solution is a feasible solution that optimizes the objective function. A bounding solution is one that shows the best values obtainable for the objective function if one or more of the constraints are ignored. A bounding solution is seldom a feasible solution; it just assists in the evaluation of trial feasible solutions.

A Small Example: TSP
Problem: air travel for five cities: Atlanta, Dallas, Mexico City, Miami, and Seattle. These are actual air fares for direct connections.
[Figure: the fare graph. The ten direct fares, which can also be read off the tour listings below, are: ATL-DFW 107, ATL-MIA 125, ATL-MEX 344, ATL-SEA 715, DFW-MEX 319, DFW-MIA 865, DFW-SEA 1020, MEX-MIA 439, MEX-SEA 1245, MIA-SEA 960.]
Note that the triangle inequality does not apply to air fares. Consider MIA → DFW at 865, vs. MIA → ATL → DFW at 125 + 107 = 232.

TSP: Is a Hamiltonian Cycle Possible?
Yes. One such cycle: SEA → DFW → MEX → MIA → ATL → SEA. The total cost of this Hamiltonian cycle is given by 1020 + 319 + 439 + 125 + 715 = 2618. One suspects that this is not the lowest-cost tour. However, we do know that any tour with higher cost is not optimal.

TSP: The Small Problem Modified
The existence problem is almost never considered. We formulate standard problems with direct connections between each pair of cities; those city pairs for which there is no real direct connection get a direct connection with an absurdly high cost, such as the sum of all of the costs for existing connections. Here is the modification of the problem above. Given five cities, and the fact that the start city can be chosen arbitrarily, we have 4! = 24 different solutions to check in order to find the optimal one.
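Checking all 24 tours is exactly the kind of work a short program does well. A minimal Python sketch (the fare table FARE and the function best_tour are mine; the fares are the ten direct fares of this example):

```python
from itertools import permutations

# Round-trip fares between city pairs; symmetric for this instance.
FARE = {
    ("ATL", "DFW"): 107, ("ATL", "MEX"): 344, ("ATL", "MIA"): 125, ("ATL", "SEA"): 715,
    ("DFW", "MEX"): 319, ("DFW", "MIA"): 865, ("DFW", "SEA"): 1020,
    ("MEX", "MIA"): 439, ("MEX", "SEA"): 1245,
    ("MIA", "SEA"): 960,
}

def fare(a, b):
    """Look up a fare in either direction."""
    return FARE.get((a, b)) or FARE[(b, a)]

def best_tour(start="SEA"):
    """Brute force: fix the start city and try all (N-1)! orderings of the rest."""
    others = [c for c in ["ATL", "DFW", "MEX", "MIA", "SEA"] if c != start]
    best = None
    for perm in permutations(others):
        route = (start,) + perm + (start,)
        cost = sum(fare(route[i], route[i + 1]) for i in range(len(route) - 1))
        if best is None or cost < best[0]:
            best = (cost, route)
    return best
```

With Seattle fixed as the start city, best_tour() finds the 2540 tour that the enumeration below arrives at by hand.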
Our solution will assume that Seattle is the start city.

TSP: Some Standard Observations on the Solution
While we shall solve this problem instance by a direct approach, it is important to note that there are a number of tricks that allow one to obtain a solution more quickly. None of these are of theoretical importance, but they are convenient.

We begin with the observation that any Hamiltonian cycle in a graph on N vertices requires N distinct edges. This gives a lower bound on the total cost. The five lowest-cost edges have costs 107, 125, 319, 344, and 439; the total cost for this collection of edges would be 1334. No cycle can have less cost.
[Figure: the five edges of smallest cost, highlighted in the fare graph.]
Note that they do not form a cycle, much less a Hamiltonian cycle. On the rare occasions in which the N edges of smallest cost do form a Hamiltonian cycle, the problem is quickly solved.

TSP: Start at Seattle
The basic solution method is to enumerate and evaluate all 24 tours. This solution is an example of the strategy that Levitin calls "brute force": obvious, not at all subtle, and guaranteed to produce a result. Here are the first twelve tours, presented in some sort of order:

SEA ATL DFW MEX MIA SEA    715  107  319  439   960  = 2540
SEA ATL DFW MIA MEX SEA    715  107  865  439  1245  = 3371
SEA ATL MEX DFW MIA SEA    715  344  319  865   960  = 3203
SEA ATL MEX MIA DFW SEA    715  344  439  865  1020  = 3383
SEA ATL MIA DFW MEX SEA    715  125  865  319  1245  = 3269
SEA ATL MIA MEX DFW SEA    715  125  439  319  1020  = 2618
SEA DFW ATL MEX MIA SEA   1020  107  344  439   960  = 2870
SEA DFW ATL MIA MEX SEA   1020  107  125  439  1245  = 2936
SEA DFW MEX ATL MIA SEA   1020  319  344  125   960  = 2768
SEA DFW MEX MIA ATL SEA   1020  319  439  125   715  = 2618
SEA DFW MIA ATL MEX SEA   1020  865  125  344  1245  = 3599
SEA DFW MIA MEX ATL SEA   1020  865  439  344   715  = 3383

TSP: Start at Seattle (continued)

SEA MEX ATL DFW MIA SEA   1245  344  107  865   960  = 3521
SEA MEX ATL MIA DFW SEA   1245  344  125  865  1020  = 3599
SEA MEX DFW ATL MIA SEA   1245  319  107  125   960  = 2756
SEA MEX DFW MIA ATL SEA   1245  319  865  125   715  = 3269
SEA MEX MIA ATL DFW SEA   1245  439  125  107  1020  = 2936
SEA MEX MIA DFW ATL SEA   1245  439  865  107   715  = 3371
SEA MIA ATL DFW MEX SEA    960  125  107  319  1245  = 2756
SEA MIA ATL MEX DFW SEA    960  125  344  319  1020  = 2768
SEA MIA DFW ATL MEX SEA    960  865  107  344  1245  = 3521
SEA MIA DFW MEX ATL SEA    960  865  319  344   715  = 3203
SEA MIA MEX ATL DFW SEA    960  439  344  107  1020  = 2870
SEA MIA MEX DFW ATL SEA    960  439  319  107   715  = 2540

The two minimum-cost solutions are SEA ATL DFW MEX MIA SEA and its reverse, SEA MIA MEX DFW ATL SEA. By coincidence, the shortest cycle is the first one chosen for evaluation.

TSP: Start at Seattle (conclusion)
Again, the answers found were SEA ATL DFW MEX MIA SEA and its reverse, SEA MIA MEX DFW ATL SEA.
[Figure: the optimal tour drawn on the fare graph. For this problem instance it is unique, up to direction.]

Knapsack Problem
Given a knapsack of capacity W and a list of N > 0 items with weights W1, W2, ..., WN and values V1, V2, ..., VN. There are a few implied assumptions to the problem:
    1) W1 + W2 + ... + WN > W. Otherwise, just put everything in.
    2) For each item we have both VJ > 0 and 0 < WJ ≤ W. Any single item too big for the knapsack can be disregarded.
Select the variables X1, X2, ..., XN, with 0.0 ≤ XJ ≤ 1.0, such that
    V = X1·V1 + X2·V2 + ... + XN·VN is maximized,
subject to
    X1·W1 + X2·W2 + ... + XN·WN ≤ W.
The variable XJ represents the amount of item J placed into the knapsack.

We shall develop a strategy for the knapsack problem that is occasionally useful and never gives rise to complications in the solution. We define an auxiliary variable ρJ = VJ / WJ, which is the value density.

Knapsack: Ordering by Value Density
Given a knapsack of capacity W and a list of N > 0 items with weights W1, ..., WN and values V1, ..., VN, define the value density as ρJ = VJ / WJ. Here the requirement that WJ > 0 immediately leads to the conclusion that each value density is well defined. Reorder the items in non-increasing order of value density:
    ρ1 ≥ ρ2 ≥ ... ≥ ρJ ≥ ... ≥ ρ(N-1) ≥ ρN

The textbook's example of the Knapsack Problem (knapsack capacity W = 10):
    Item            1     2     3     4
    Value          42    12    40    25
    Weight          7     3     4     5
    Value Density  6.0   4.0  10.0   5.0

Knapsack: Starting the Solution
Here is the problem with the new ordering of the items:
    Item            1     2     3     4
    Value          40    42    25    12
    Weight          4     7     5     3
    Value Density 10.0   6.0   5.0   4.0
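One direct attack on this instance is to enumerate every subset of the four items and keep the best feasible one; the notes develop exactly this power-set approach next. A minimal 0/1 Python sketch (the names ITEMS, CAPACITY, and best_subset are mine; the capacity of 10 is the textbook instance's):

```python
from itertools import combinations

# Textbook instance, items ordered by value density: (name, weight, value).
ITEMS = [("1", 4, 40), ("2", 7, 42), ("3", 5, 25), ("4", 3, 12)]
CAPACITY = 10

def best_subset():
    """Enumerate the power set; keep the feasible subset of greatest value."""
    best_value, best_names = 0, ()
    for r in range(len(ITEMS) + 1):
        for combo in combinations(ITEMS, r):
            weight = sum(w for _, w, _ in combo)
            value = sum(v for _, _, v in combo)
            if weight <= CAPACITY and value > best_value:
                best_value = value
                best_names = tuple(name for name, _, _ in combo)
    return best_value, best_names
```

This checks all 16 subsets and returns the optimum found in the table below: items {1, 3}, value 65.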
There are two common approaches to the solution of this problem:
    1) the power-set approach, and
    2) the decision-tree approach.
For a given set S, the power set of S, denoted 2^S, is the set of all subsets of S. For S = {1, 2, 3, 4}:
    2^S = { ∅, {1}, {2}, {3}, {4}, {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}, {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, {1,2,3,4} }

Knapsack: The Power-Set Solution

Row  Subset      Total Weight                Total Value
 0   ∅           0                           0
 1   {1}         4                           40
 2   {2}         7                           42
 3   {3}         5                           25
 4   {4}         3                           12
 5   {1,2}       4 + 7 = 11: too much        N/A
 6   {1,3}       4 + 5 = 9                   40 + 25 = 65
 7   {1,4}       4 + 3 = 7                   40 + 12 = 52
 8   {2,3}       7 + 5 = 12: too much        N/A
 9   {2,4}       7 + 3 = 10                  42 + 12 = 54
10   {3,4}       5 + 3 = 8                   25 + 12 = 37
11   {1,2,3}     4 + 7 + 5 = 16: too much    N/A
12   {1,2,4}     4 + 7 + 3 = 14: too much    N/A
13   {1,3,4}     4 + 5 + 3 = 12: too much    N/A
14   {2,3,4}     7 + 5 + 3 = 15: too much    N/A
15   {1,2,3,4}   4 + 7 + 5 + 3 = 19: too much  N/A

The optimal solution is {1, 3}, with a weight of 9 and a profit of 65.

Knapsack: One Observation on Brute Force
The brute-force approach calls for generation of all sets in the power set. But note that the subset {1, 2} yields a weight that is too large. This implies that any set of which {1, 2} is a subset will also be too large. Based on this simple observation, we could rule out the following trial solutions: {1,2}, {1,2,3}, {1,2,4}, and {1,2,3,4}. We also note that the set {2, 3} yields a weight that is too large; in addition to the above, this rules out {2,3,4}. More on this later.

Knapsack: First Two Decisions
[Figure: the first two levels of a decision tree. Deciding on item 1 and then on item 2 gives four partial solutions: value 0, weight 0; value 42, weight 7; value 40, weight 4; and value 82, weight 11: TOO MUCH.]
This slide illustrates a decision-tree approach that is often applied to such problems. Here we see that the partial solution {1, 2} should be abandoned; with it go one quarter of the sixteen possible solutions.

Job Shop Scheduling
This is also called the assignment problem. The simplest form of the problem calls for N workers to be assigned to an equal number of jobs, with N ≥ 2. We are given an N-by-N matrix of costs, with entry C[I,J] being the cost to assign worker I to job J. We want to minimize the total cost of the assignments.

The feasibility constraints are obvious:
    1) Each worker is to be given exactly one job.
    2) Each job is to be assigned to exactly one worker.
In terms of the cost matrix, the translation of the requirements is also obvious:
    1) Each row is to have exactly one element selected.
    2) Each column is to have exactly one element selected.
The answer is a total cost and either an array assigning jobs to workers or an array assigning workers to jobs; each will be a permutation of the set of integers {1, 2, ..., N}.

An Instance of the Assignment Problem
Here we use the textbook's example. Four workers: A, B, C, and D. Four jobs: 1, 2, 3, 4.

         Job 1   Job 2   Job 3   Job 4
    A      9       2       7       8
    B      6       4       3       7
    C      5       8       1       8
    D      7       6       9       4

The solution vectors are as follows.
Assigning jobs to workers: (2, 1, 3, 4), a total cost of 13. Worker A gets job 2, worker B gets job 1, worker C gets job 3, and worker D gets job 4.
Assigning workers to jobs: (B, A, C, D), again a total cost of 13. Job 1 goes to worker B, job 2 goes to worker A, job 3 to C, and job 4 to D.
It should be obvious that either representation of the solution conveys the same information; neither representation is better than the other.

Assignment Problem: A Brute-Force Solution
Here we consider a brute-force solution, assigning jobs to workers:

    (1,2,3,4)  9 + 4 + 1 + 4 = 18        (2,1,3,4)  2 + 6 + 1 + 4 = 13
    (1,2,4,3)  9 + 4 + 8 + 9 = 30        (2,1,4,3)  2 + 6 + 8 + 9 = 25
    (1,3,2,4)  9 + 3 + 8 + 4 = 24        (2,3,1,4)  2 + 3 + 5 + 4 = 14
    (1,3,4,2)  9 + 3 + 8 + 6 = 26        (2,3,4,1)  2 + 3 + 8 + 7 = 20
    (1,4,2,3)  9 + 7 + 8 + 9 = 33        (2,4,1,3)  2 + 7 + 5 + 9 = 23
    (1,4,3,2)  9 + 7 + 1 + 6 = 23        (2,4,3,1)  2 + 7 + 1 + 7 = 17

The Brute-Force Solution Continues

    (3,1,2,4)  7 + 6 + 8 + 4 = 25        (4,1,2,3)  8 + 6 + 8 + 9 = 31
    (3,1,4,2)  7 + 6 + 8 + 6 = 27        (4,1,3,2)  8 + 6 + 1 + 6 = 21
    (3,2,1,4)  7 + 4 + 5 + 4 = 20        (4,2,1,3)  8 + 4 + 5 + 9 = 26
    (3,2,4,1)  7 + 4 + 8 + 7 = 26        (4,2,3,1)  8 + 4 + 1 + 7 = 20
    (3,4,1,2)  7 + 7 + 5 + 6 = 25        (4,3,1,2)  8 + 3 + 5 + 6 = 22
    (3,4,2,1)  7 + 7 + 8 + 7 = 29        (4,3,2,1)  8 + 3 + 8 + 7 = 26

The winner is (2, 1, 3, 4), with a total cost of 13.

Assignment Problem: The Winner
The winner is (2, 1, 3, 4), with a total cost of 13: worker A gets job 2, worker B gets job 1, worker C gets job 3, and worker D gets job 4.
[Figure: the cost matrix with the four selected entries circled.]
Once again, we are done with a brute-force solution that, due to the small size of the problem instance, is the easiest way to solve the problem. We now consider a few tricks that will facilitate solution of larger and more realistic instances of this and similar problems.

Generating the Bounding Solutions
The bounding solution will determine a lower bound on the costs; it will be generated without any regard to the row or column constraints.
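As an aside, the 24-permutation enumeration just tabulated is also a natural job for a short program. A minimal Python sketch (the names COST and best_assignment are mine):

```python
from itertools import permutations

# Cost matrix: COST[worker][job], workers A..D as rows 0..3, jobs 1..4 as columns 0..3.
COST = [[9, 2, 7, 8],   # worker A
        [6, 4, 3, 7],   # worker B
        [5, 8, 1, 8],   # worker C
        [7, 6, 9, 4]]   # worker D

def best_assignment():
    """Try all N! ways of assigning jobs to workers; report jobs numbered 1..N."""
    n = len(COST)
    best = None
    for jobs in permutations(range(n)):          # jobs[i] = job index given to worker i
        total = sum(COST[i][jobs[i]] for i in range(n))
        if best is None or total < best[0]:
            best = (total, tuple(j + 1 for j in jobs))
    return best
```

On this instance it returns the winner found by hand: cost 13 for the assignment (2, 1, 3, 4).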
Row minimum: give each worker the cheapest job; forget the constraints.

         Job 1   Job 2   Job 3   Job 4    Row minimum
    A      9      (2)      7       8           2
    B      6       4      (3)      7           3
    C      5       8      (1)      8           1
    D      7       6       9      (4)          4
Total: 2 + 3 + 1 + 4 = 10.

Column minimum: assign to each job the cheapest worker; forget the constraints.

         Job 1   Job 2   Job 3   Job 4
    A      9      (2)      7       8
    B      6       4       3       7
    C     (5)      8      (1)      8
    D      7       6       9      (4)
Column minima: 5, 2, 1, 4. Total: 5 + 2 + 1 + 4 = 12.

Examination of the Bounding Solutions
First note that neither bounding solution is feasible; this is expected. Now we examine what each of the bounding solutions is telling us. The row-minimum solution says: no optimal solution can cost less than 10. The column-minimum solution says: no optimal solution can cost less than 12. Both of these statements must be true. From this we conclude that no optimal solution can cost less than 12.

NOTE: Frequently these lower bounds on costs (and upper bounds on profits) are not actually used in the solutions; your instructor just likes to calculate them. In our example above, had we known beforehand that the absolute minimum cost was 12, we might have stopped the solution at (2, 1, 3, 4), with a cost of 13: close enough.

Generating Trial Feasible Solutions
We want to generate a few feasible solutions very quickly, to aid in providing an upper bound to the optimal cost. Recall that the assignment problem for N workers has N! feasible solutions; we are trying to avoid generation of all, or a significant part, of these. Two obvious quick tries are (1, 2, ..., N) and (N, N-1, ..., 2, 1). For our four workers, the two quick tries are:
    (1, 2, 3, 4): cost 9 + 4 + 1 + 4 = 18
    (4, 3, 2, 1): cost 8 + 3 + 8 + 7 = 26
Here we have discovered a feasible solution with a total cost of 18. This solution is more costly than the bounding solution, as expected, so it might not be the optimal solution; it is the best solution so far.

Dogs, Cats, and Mice
And now for something completely different. You're given a hundred dollars. You are to spend it all purchasing exactly a hundred animals at the pet store. Dogs cost $15.00 each, cats cost $1.00 each, and mice are 25 cents each. You must buy at least one of each animal.
You must spend exactly $100.00. How many of each type of animal do you buy?

Formulation of the Problem
Let D = the number of dogs, C = the number of cats, and M = the number of mice. There are two equations:
    15.0·D + 1.0·C + 0.25·M = 100    (the cost equation)
    D + C + M = 100                  (the number-of-animals equation)
Standard algebra requires three equations for a solution involving three unknowns. It is also the fact that standard algebra cannot easily restrict solutions to positive integers. We can solve this problem by exhaustive search, supplemented by a few observations about the problem.

Dogs, Cats, and Mice: Observations 1 and 2
Observation 1: The counts of dogs, cats, and mice must be positive integers. Remember that you are required to buy at least one of each.
Observation 2: The number of mice bought must be a multiple of 4. The sum 15.0·D + C + 0.25·M must be an integer; more specifically, it is the integer 100. This implies that 0.25·M must be an integer; hence M is an integer multiple of 4. As a corollary, used in later work, we conclude that 0.75·M must be an integer multiple of 3.

Dogs, Cats, and Mice: Observation 3
Observation 3: The number of dogs cannot be less than one or more than six. We have already noted the constraint that D ≥ 1. We now show that D ≤ 6, by supposing that D ≥ 7. Suppose that D ≥ 7. Then 15.0·D ≥ 105, more than the money we can spend. (Actually, since we must buy at least one cat and at least four mice, a total cost of $2.00, the most we can spend on the dogs is $98.00.)

We solve this problem by examining the equations for the six possible values of the dog count: D ∈ {1, 2, 3, 4, 5, 6}. It is this limited number of options for the dog count that allows a solution.

Dogs, Cats, and Mice: Solution by Cases
Let D = 1. Then the equations become C + 0.25·M = 85 and C + M = 99, so 0.75·M = 14. But 14 is not a multiple of 3, so this solution is not allowed.
Let D = 2. Then the equations become C + 0.25·M = 70 and C + M = 98, so 0.75·M = 28: not a multiple of 3.
Let D = 3. Then the equations become C + 0.25·M = 55 and C + M = 97, so 0.75·M = 42 and M = 56. Thus we have one solution: D = 3, C = 41, and M = 56.
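The case analysis here, and the remaining cases below, can be checked by a tiny exhaustive search over the dog count. A minimal Python sketch (the function name pet_store is mine; amounts are kept in cents to avoid floating-point trouble):

```python
def pet_store():
    """Exhaustive search: 100 animals, exactly $100.00, at least one of each."""
    solutions = []
    for d in range(1, 7):                  # observation 3: 1 <= D <= 6
        for c in range(1, 100):
            m = 100 - d - c                # the head-count equation fixes M
            if m >= 1 and 1500 * d + 100 * c + 25 * m == 10000:
                solutions.append((d, c, m))
    return solutions
```

Working in cents makes the cost test an exact integer comparison, so no multiple-of-4 reasoning is needed; the search confirms that (3, 41, 56) is the only solution.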
Dogs, Cats, and Mice: More Cases
Let D = 4. Then the equations become C + 0.25·M = 40 and C + M = 96, so 0.75·M = 56: not a multiple of 3.
Let D = 5. Then the equations become C + 0.25·M = 25 and C + M = 95, so 0.75·M = 70: not a multiple of 3.
Let D = 6. Then the equations become C + 0.25·M = 10 and C + M = 94, so 0.75·M = 84, or M = 112: too many mice.
The unique solution is thus D = 3, C = 41, and M = 56: three dogs, forty-one cats, and fifty-six mice.

Algorithms for Evaluation of Trig Functions
We now consider the problem of evaluating trigonometric functions for specified angles. At this point we are not interested in:
    1) the exact structure of the algorithm, or
    2) the fact that angles have to be expressed in radians, not degrees (1 radian ≈ 57.2958 degrees).

One of the basic requirements for an algorithm is that it terminate. Given an arbitrary input, we must have a guarantee that, after a finite amount of time, the algorithm will stop and produce the correct answer. The mere fact that a complete algorithmic solution of a 50-city instance of TSP would require about 1.93·10^43 years on a 1-teraflop machine such as the Cray MT-5 does not negate the claim that we have an algorithm; here the algorithm is guaranteed to terminate.

Trig Functions: Terminating the Algorithm
The problem with algorithms for evaluating the trigonometric functions is that any algorithm that can be implemented on a digital computer is based on an infinite series. The details can be examined in any intermediate calculus course. Each term in the series can be computed by a finite application of basic operations, such as multiplication and division; there is no problem here. However, the series is infinite. There is no amount of time, however large, that will allow the complete evaluation of such a series. We must look for another termination criterion.

There are only a very few angles for which the sine and cosine can be exactly evaluated, and we all know these values. Our termination criterion thus becomes the accuracy required for the computation. Do we need 7-digit accuracy? 14-digit accuracy? The accuracy criterion is easily applied.
There are formal mathematical models that indicate the number of terms needed for a given accuracy.

Chapter 7: Finite State Machines and Sequential Circuits
(CPSC 5155, July 29, 2010)

Definition: Sequential Circuits

Combinational circuits:
    No memory: no flip-flops, only combinational gates.
    No feedback.
    The output for a given set of inputs is independent of the order in which these inputs were changed, after the output stabilizes.

Sequential circuits:
    Memory: flip-flops may be used; combinational gates may be used.
    Feedback is allowed.
    The order of input change is quite important and may produce significant differences in the output.

[Figure: sequential logic includes combinational logic (AND, OR, NOT gates and any MSI circuits) and memory. The input and the memory state feed the combinational logic, which produces the output. The memory usually introduces a time delay; the memory state does not change instantly.]

Circuit Analysis and Design

Circuit analysis attempts to identify a circuit for which we have a diagram:
    1) Identify the inputs and outputs of the circuit.
    2) Express each output as a Boolean function of the inputs and the present state Q(T).
    3) Create a finite state machine model of the circuit.
    4) Identify the circuit, if possible.

Circuit design produces a circuit to meet a given specification:
    1) Develop a finite state machine model of the circuit.
    2) Translate to a state table.
    3) Pick the flip-flops and use excitation tables.
    4) Define the inputs for each flip-flop.
    5) Design the circuit.

Sample Finite State Machine: A 11011 Sequence Detector
[Figure: the state diagram. Five states, each with two transitions; the FSM begins in state A.]

State Tables for the 11011 Sequence Detector
    Present State | Next State, Output
                  |  X = 0  |  X = 1
          A       |  A, 0   |  B, 0
          B       |  A, 0   |  C, 0
          C       |  D, 0   |  C, 0
          D       |  A, 0   |  E, 0
          E       |  A, 0   |  C, 1

With the state assignment A = 000, B = 001, C = 010, D = 011, E = 100:

    Present State | Next State, Output
                  |  X = 0  |  X = 1
        A (000)   | 000, 0  | 001, 0
        B (001)   | 000, 0  | 010, 0
        C (010)   | 011, 0  | 010, 0
        D (011)   | 000, 0  | 100, 0
        E (100)   | 000, 0  | 010, 1

Slide 5 of 11 slides.

Sample Circuit for Analysis

Step 1: Identify the inputs and outputs. There is one input, labeled X, and one output, labeled Z. My labeling convention: inputs are X, internal states are Y or Q, outputs are Z.

Slide 6 of 11 slides.

Circuit Analysis, Page 2

Step 2a: Determine the equations for the flip-flop inputs. Reading the diagram, we see that D = X + Y. This can be written as D = X + Q(T). Note that Q(T) is the flip-flop state now; the flip-flop will not react to the D input until time (T + 1).

Step 2b: Determine the equation for the output. Reading the diagram, we see that Z = X ⊕ Y.

Slide 7 of 11 slides.

Circuit Analysis, Page 3

Step 3a: Create the next state table. We have a D flip-flop here; when we compute D from X and Q(T), we automatically get Q(T + 1).

    X  Q(T) = Y | D = X + Y | Q(T+1)
    0     0     |     0     |   0
    0     1     |     1     |   1
    1     0     |     1     |   1
    1     1     |     1     |   1

Step 3b: Create the output table.

    X  Q(T) | Z
    0    0  | 0
    0    1  | 1
    1    0  | 1
    1    1  | 0

Slide 8 of 11 slides.

Circuit Analysis, Page 4

Step 4a: Combine the two tables to form one table.

    X  Q(T) = Y | D = X + Y | Q(T+1) | Z
    0     0     |     0     |   0    | 0
    0     1     |     1     |   1    | 1
    1     0     |     1     |   1    | 1
    1     1     |     1     |   1    | 0

Step 4b: Reformat the table into standard format.

    Present State | Next State, Output
                  |  X = 0  |  X = 1
          0       |  0, 0   |  1, 1
          1       |  1, 1   |  1, 0

Slide 9 of 11 slides.

Circuit Analysis, Page 5

Step 5: Draw the FSM model (state diagram). [Figure: two-state diagram with transitions labeled input/output.]

Step 6: If possible, identify the circuit. Note that for Q(T) = 0, Z = X; for Q(T) = 1, Z = NOT(X). This is a serial two's-complement circuit, LSB first.

Slide 10 of 11 slides.

Subroutine Linkages and Call Mechanisms

We trace the evolution of
the idea of a subroutine from its origins through the elaboration of run-time mechanisms to manage the flow of control and the passing of arguments.

Time line of topics:
1947: The Wheeler Jump, named for David Wheeler, who worked on the British computer called the EDSAC.
1960s: A variety of mechanisms used by the CDC 6600 for managing subroutines and functions. Though none would support recursion, we see a growing adaptation to the requirements of commercial software.
1970s: Stack structures and direct support for recursion.

We begin with a code sample that ought to be completely unintelligible. As is my favorite style, this is written in pseudocode, so that I don't follow any syntax.

The Code Sample

    XP = X1
    S = 1.0
    N = 1
    Y1 = XP
    XS = X1*X1
    XP = X1*X1*X1/6.0
    Do While ABS(XP) > 1.0E-10
        S = (-1.0)*S
        Y1 = Y1 + S*XP
        N = N + 2
        XP = XP*XS/((N + 1)*(N + 2))
    End Do

    XP = X2
    S = 1.0
    N = 1
    Y2 = XP
    XS = X2*X2
    XP = X2*X2*X2/6.0
    Do While ABS(XP) > 1.0E-10
        S = (-1.0)*S
        Y2 = Y2 + S*XP
        N = N + 2
        XP = XP*XS/((N + 1)*(N + 2))
    End Do

    Y = Y1 + Y2

The Code Sample: The Way We Want It

Rewrite the above as Y = SIN(X1) + SIN(X2). What we want to do is factor out the common code and put it in one place. But how can we do this? Writing the above line of code in a strange form will show the issue:

        X = X1
        Y1 = SIN(X)     -- return to L1
    L1: X = X2
        Y2 = SIN(X)     -- return to L2
    L2: Y = Y1 + Y2

Placing the code elsewhere and accessing it by a JUMP instruction is easy. The hard part is handling the different return addresses: the first time, the code returns to L1; the second time, the return is to L2. This return problem is what David Wheeler solved.

David Wheeler's Solution

This was the basis of all subroutine calls until recursion was considered. Suppose a subroutine at address Z. Here is the traditional memory map:

    Address   Contents
    Z         Place the return address here
    Z + 1     First executable instruction
              (more executable code goes here)
    Return    Indirect jump on Z (GOTO the address stored at Z)
    Data      Subroutine local storage is usually here
    End       End of the subroutine

This arrangement is very efficient and makes good use of limited memory. It is quite satisfactory for functions and subroutines such as traditional trigonometric and
algebraic functions (square root, etc.), and standard system functions such as I/O, sorting, etc.

Early FORTRAN on a CDC 6400/6600

The CDC 6000-series architecture provided eight 60-bit data registers, called X0, X1, ..., X7. The early FORTRAN IV compilers emitted code that allowed for a very efficient subroutine linkage: all arguments to and from a subroutine were passed in registers.

    Register   Use
    X0         Not used
    X1 - X5    Used to pass up to five arguments
    X6         Function return value, most significant 60 bits
               (not used for single-precision results)
    X7         Function return value, least significant 60 bits

Suppose one had more than five arguments to a subroutine. Too bad. An interesting side effect of this arrangement was that even subroutines would return values, albeit not very interesting ones: the result of the last assignment statement in the subroutine would remain in the register X7.

Later FORTRAN on a CDC 6400/6600

The restriction to five arguments quickly proved to be unacceptable to commercial programmers. The answer was to place a parameter block in memory. Consider a call such as W = F2(X, Y, Z). [Figure: the associated memory map, showing the parameter block and the registers A0, X6, and X7.]

More on the CDC 6000 Series

The eight 60-bit X registers were paired with eight 18-bit A registers. (Registers X0 and A0 were exceptions to this rule.) Registers 1 through 5 were paired for loading: putting an address into an A register would load the value at that address into the paired X register.

    SA4 W causes A4 to be loaded with the address W (A4 <- W),
    and X4 to be loaded with the value at W (X4 <- M[W]).

Registers 6 and 7 were paired for storing: putting an address into an A register would store the paired X register value at that address.

    SA6 W causes A6 to be loaded with the address W (A6 <- W),
    and X6 to be stored into address W (M[W] <- X6).

This is a diversion, but I find it interesting.

Recursion: Towers of Hanoi

This problem of moving disks from one peg to another, subject to constraints, is an excellent motivation for recursion. It is not an ancient problem: the puzzle was invented by the French mathematician Edouard Lucas in 1883. There is a
legend about an Indian temple which contains a large room with three time-worn posts in it, surrounded by 64 golden discs. The priests of Brahma, acting out the command of an ancient prophecy, have been moving these discs in accordance with the rules of the puzzle. According to the legend, when the last move of the puzzle is completed, the world will end. The puzzle is therefore also known as the Tower of Brahma puzzle. It is not clear whether Lucas invented this legend or was inspired by it.

If the legend were true, and if the priests were able to move discs at a rate of one per second, using the smallest number of moves, it would take them 2^64 - 1 seconds, or roughly 585 billion years. The universe is currently about 13.7 billion years old. (This is taken from the Wikipedia article on the Towers of Hanoi.)

There are two variants of the problem, each involving recursion:
1. Generate a sequence of moves for a problem with N disks.
2. Compute the number of moves required for a problem with N disks.
The first is an enumeration problem; the second is a counting problem.

Recursion: The Counting Version of Towers of Hanoi

Let's consider the counting version of the problem. Call the three pegs X, Y, and Z. To move N disks from peg X to peg Y:

    If N = 1, move the disk from peg X to peg Y.
    Else
        Move N - 1 disks from peg X to peg Z.
        Move 1 disk from peg X to peg Y.
        Move N - 1 disks from peg Z to peg Y.

This gives rise to the following function to compute the number of moves:

    Function Moves (N : Integer) : Long Integer
        If N <= 1 Then
            Return 1
        Else
            Y1 = Moves(N - 1)    -- here is the recursive call
            Return 2*Y1 + 1

Recursion: We Need Another Subroutine Linkage Mechanism

The static allocation of arguments, local variables, and return addresses used in early computers cannot support recursion. All recursive programming demands the use of a stack. Non-recursive languages can support recursion by explicit stack manipulation; recursive languages provide native Run-Time Library (RTL) support for recursion. Due to the flexibility provided by native support for recursion, the
compiled version of any function becomes more complex. For each procedure in a recursive language, the compiler must provide:
1. A procedure prolog, to set up the local variable storage on the stack.
2. A procedure epilog, to clean up the stack on exit from the procedure.
For each procedure invoked in a recursive language, the compiler must provide the code to load the arguments and argument count onto the stack prior to invoking the procedure.

Early architectures (PDP-11, etc.) used a single Stack Pointer (SP) to manage this. Experience quickly showed it to be desirable to have a separate Frame Pointer (FP) that points to the arguments on the stack.

Multiplexers and Demultiplexers

A multiplexer (MUX) associates one of many inputs with a single output. A demultiplexer (DEMUX) associates one input with one of many outputs.

    Circuit         Inputs   Control Signals   Outputs
    Multiplexer      2^N           N              1
    Demultiplexer     1            N             2^N

Sample: a 4-to-1 MUX and a 1-to-4 DEMUX. [Figure: MUX with inputs X0 - X3, controls C1 and C0, output Y; DEMUX with input X, controls C1 and C0, outputs Y0 - Y3.] My notation: X for input, C for control signals, Y for output.

The Multiplexer Equation, Illustrated for a 4-to-1 MUX

[Figure: truth table.] Denote the multiplexer output by M. In equation form (a prime denotes the complement):

    M = C1'·C0'·X0 + C1'·C0·X1 + C1·C0'·X2 + C1·C0·X3

Here is another form of the equation that is better when X is used as an input:

    M = C1'·C0'·I0 + C1'·C0·I1 + C1·C0'·I2 + C1·C0·I3

Build a 4-to-1 MUX. But what about an enable input for a multiplexer? What does it mean for the output of the MUX to be 0?

Multiplexer Attached to a Bus Line

To control a multiplexer's connection to a common bus, we use a tri-state buffer, not an enable input to the MUX. Here I use E as the tri-state control. When E = 1, the selected MUX input is placed on the bus; when E = 0, the MUX is detached from the bus, and another source feeds the bus.

A 1-to-4 DEMUX

    C1  C0   Selected Output
    0   0    Y0 = X, other outputs 0
    0   1    Y1 = X, other outputs 0
    1   0    Y2 = X, other outputs 0
    1   1    Y3 = X, other outputs 0

Build a 1-to-4 DEMUX with an enable. If Enable = 0, all outputs are 0.

Using a 2^N-to-1 MUX for a Boolean Function of N Boolean Variables

Theorem 1: Any Boolean function of N Boolean variables (N >
0) can be constructed by a multiplexer with 2^N inputs, usually labeled I(2^N - 1), ..., I1, I0, and N control lines, labeled C(N-1), ..., C0.

Method: Express the Boolean function of N Boolean variables in canonical Sum of Products, and then match the desired function to the multiplexer equation for a 2^N-to-1 MUX.

Example: F2(X, Y, Z) = X·Y + X·Z + Y·Z

Step 1: This is a function of three Boolean variables. We must use a 2^3-to-1 MUX, also called an 8-to-1 MUX.

Using a 2^N-to-1 MUX, Page 2

Step 2: Convert F2(X, Y, Z) = X·Y + X·Z + Y·Z to canonical SOP. Every product term must have a literal for each variable; a literal is either the variable or its complement.

    F2(X, Y, Z) = X·Y + X·Z + Y·Z
                = X·Y·(Z + Z') + X·(Y + Y')·Z + (X + X')·Y·Z
                = X·Y·Z + X·Y·Z' + X·Y·Z + X·Y'·Z + X·Y·Z + X'·Y·Z
                = X'·Y·Z + X·Y'·Z + X·Y·Z' + X·Y·Z

Note that all four terms have a literal for each of the three variables X, Y, and Z.

Using a 2^N-to-1 MUX, Page 3

Step 3: Convert the function to a form with all 2^N product terms. Here we convert F2 to have all eight possible product terms:

    F(X, Y, Z) = X'·Y'·Z'·0 + X'·Y'·Z·0 + X'·Y·Z'·0 + X'·Y·Z·1
               + X·Y'·Z'·0 + X·Y'·Z·1 + X·Y·Z'·1 + X·Y·Z·1

Using a 2^N-to-1 MUX, Page 4

Step 4: Write the multiplexer equation for an 8-to-1 MUX:

    M = C2'·C1'·C0'·I0 + C2'·C1'·C0·I1 + C2'·C1·C0'·I2 + C2'·C1·C0·I3
      + C2·C1'·C0'·I4 + C2·C1'·C0·I5 + C2·C1·C0'·I6 + C2·C1·C0·I7

Step 5: Rewrite the equation with C2 = X, C1 = Y, and C0 = Z:

    M = X'·Y'·Z'·I0 + X'·Y'·Z·I1 + X'·Y·Z'·I2 + X'·Y·Z·I3
      + X·Y'·Z'·I4 + X·Y'·Z·I5 + X·Y·Z'·I6 + X·Y·Z·I7

NOTE: Here I use I0, I1, ..., I7 as the MUX inputs, because I am using X to denote one of the Boolean variables.

Using a 2^N-to-1 MUX, Page 5

Step 6: Match the two expressions. With C2 = X, C1 = Y, and C0 = Z, the match gives:

    I0 = 0, I1 = 0, I2 = 0, I3 = 1, I4 = 0, I5 = 1, I6 = 1, I7 = 1

Basic Graph Theory

Graph theory provides a mathematical structure that is convenient for understanding a number of problems and writing algorithms to solve them. At
one level, graphs are very simple: they are just dots connected by lines. The dots, generally drawn as labeled circles, are called vertices (the singular is vertex) or nodes. The lines are usually called edges, but may be called links or arcs. The vertices are generally labeled by positive integers, but can be labeled otherwise; in a graph that is read as a map, many of the vertices are labeled with city names.

One special class of graphs is called trees, to be defined soon. For some reason, we tend to use the term node when discussing trees, and the term vertex when discussing general graphs.

Definition of a Graph

We present two definitions of the mathematical object called a graph.

Definition 1: A graph is a finite subset of the positive integers, denoted by V = {1, 2, ..., N} for N ≥ 1, and a possibly empty subset E ⊆ V × V. Wow! That is a lot of help.

Definition 2: A graph G is a finite nonempty set of objects called vertices (the word vertices is the plural of vertex), together with a possibly empty finite set of pairs of distinct vertices of G, called edges.

The vertex set is commonly denoted by V(G) = {1, 2, 3, ..., N}, where N is the number of vertices in the set; it is convenient to use integers to label vertices. The edge set, commonly denoted by E(G), is a subset of V(G) × V(G), the set of all pairs of elements from the vertex set V(G).

The cardinality of the vertex set of a graph G is called the order of G. The cardinality of the edge set is called the size of G; commonly |E(G)| = M. An (N, M) graph G is a graph with N vertices and M edges.

Directed and Undirected Graphs

Consider an (N, M) graph G with vertex set V = {1, 2, ..., N}, N ≥ 1. The edge set E is defined as E(G) ⊆ {(J, K) : J ∈ V(G) and K ∈ V(G)}. We immediately invoke the definition of a set: duplicate elements are not found. Thus a graph has no duplicate edges, that is, no two edges (J1, K1) and (J2, K2) with both J1 = J2 and K1 = K2.

A loop is an edge of the form (J, J) with J ∈ V(G); that is, its end points are equal. A simple graph is a graph without loops. A simple graph is thus defined as the pair (V(G), E(G)) with V(G) = {1, 2, ..., N}, N ≥ 1 (again,
we use integers as vertex labels), and E(G) = {(J, K) : J ∈ V(G), K ∈ V(G), and J ≠ K}.

The basic question is whether the pairs (J, K) are ordered or unordered. A directed graph (often called a digraph) is a simple graph in which the edge set is considered as a set of ordered pairs (J, K) with J ≠ K. An undirected graph is a simple graph in which the edge set is considered as a set of unordered pairs {J, K} with J ≠ K.

Example: V(G) = {1, 2, 3, 4}.

    V(G) × V(G) = { (1, 1), (1, 2), (1, 3), (1, 4),
                    (2, 1), (2, 2), (2, 3), (2, 4),
                    (3, 1), (3, 2), (3, 3), (3, 4),
                    (4, 1), (4, 2), (4, 3), (4, 4) }

This set has 16 elements. A directed graph with vertex set V(G) would have its edge set as a subset of the following subset of V(G) × V(G):

    { (1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4),
      (3, 1), (3, 2), (3, 4), (4, 1), (4, 2), (4, 3) }      |E(G)| ≤ 12

An undirected graph with vertex set V(G) would have its edge set as a subset of the following subset of V(G) × V(G):

    { {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4} }      |E(G)| ≤ 6

Representations of Graphs

So far, we have followed the strict mathematical definition of a graph. The problem is that most people (your instructor included) prefer a visual depiction of these mathematical objects. With the understanding that these depictions can occasionally be misleading, as we shall show later, we now begin with the simpler depictions. An (N, M) graph G is depicted with N circles, denoting the N vertices, and M lines connecting the circles and representing the edges.

Labeled Graphs

While we often use subsets of the set of integers to denote the vertices in a graph, we note that this is just a convenience. A four-vertex graph might have the vertex set {Atlanta, Memphis, Miami, Tampa}. As examples, let's look at representations of some graphs on four vertices: V(G) = {1, 2, 3, 4}.

The Empty Graph

Consider the four-vertex set V(G) = {1, 2, 3, 4}. One possible graph on this vertex set has no edges. [Figure: four isolated vertices.] Note that E(G) = ∅. This is a disconnected graph; we define the term later, but for now we shall just mention that most people see a single disconnected graph as a number of smaller graphs. One might see this as four single-vertex graphs; that view is valid. As a four-vertex empty graph, this is
a (4, 0) graph. It is only of theoretical interest.

Some Undirected Graphs

Let's examine three undirected graphs with four vertices and three edges. Two of the graphs are connected; the one at the right is disconnected. [Figure: three (4, 3) undirected graphs, each with edge sets such as {1, 2} and {1, 3}; the rightmost is disconnected.] Here are a number of undirected graphs on three vertices. [Figure.]

Some Directed (4, 3) Graphs

Here are two (4, 3) directed graphs. Note that they are different graphs.

    E(G1) = { (1, 2), (1, 4), (4, 3) }
    E(G2) = { (1, 4), (2, 1), (4, 3) }

Note specifically that (1, 2) ∈ E(G1) but (1, 2) ∉ E(G2); also, (2, 1) ∈ E(G2) but (2, 1) ∉ E(G1). In a directed graph, the edges are ordered pairs taken from the set V(G) × V(G); for this reason, (1, 2) ≠ (2, 1).

More on Directed and Undirected Graphs

Directed graphs correspond to edge sets that contain ordered pairs; undirected graphs correspond to edge sets that contain unordered pairs. Consider the figure below, which shows an undirected graph and a directed graph. [Figure: undirected graph (left) and directed graph (right).]

The two graphs appear similar, but are quite different. The graph at the left is said to be the undirected graph that underlies the directed graph. The undirected graph has E(G) = { {1, 4}, {2, 3}, {2, 4}, {3, 4} }; implicitly, (3, 2) ∈ E(G), (4, 1) ∈ E(G), (4, 2) ∈ E(G), and (4, 3) ∈ E(G). The directed graph has E(G) = { (1, 4), (2, 4), (3, 2), (4, 1), (4, 3) }, with no edges implicitly added.

Still More on Directed and Undirected Graphs

Consider the following two graphs, one directed and one undirected. In what sense are they similar? Each graph has V(G) = {1, 2, 3, 4}. The undirected graph has E(G) = { {1, 2}, {1, 4}, {2, 3}, {3, 4} }, more conventionally written as E(G) = { {1, 2}, {2, 3}, {3, 4}, {4, 1} }. The directed graph has E(G) = { (1, 2), (2, 3), (3, 4), (4, 1), (1, 4), (4, 3), (3, 2), (2, 1) }. The undirected graph can be seen as a simpler representation containing the same information as the directed graph. Nevertheless, they are not the same graph.

Equivalence of Undirected and Directed Graphs

For every directed graph, there is an undirected graph that has a similar structure. These graphs are not equivalent. In some cases, the structure of the underlying undirected graph may yield insights into the structure of the directed graph. The
main use of this concept arises when we discuss the idea of connectivity in directed graphs. While the idea of a connected graph is intuitive, we shall wait awhile before giving a formal definition.

Summary: Directed and Undirected Graphs

Both are defined over a non-empty set of vertices, conventionally denoted by V(G) = {1, 2, ..., N} for N ≥ 1. When useful, graph vertices may be given other labels as convenient. Neither directed nor undirected graphs have loops, that is, edges that begin and end on the same vertex.

For directed (N, M) graphs:
1. The edge set is a subset of ordered pairs from V(G) × V(G).
2. The maximum edge count is N·(N - 1).

For undirected (N, M) graphs:
1. The edge set is a subset of unordered pairs from V(G) × V(G).
2. The maximum edge count is C(N, 2) = N·(N - 1)/2.

For the next few lectures, we shall focus on simple undirected graphs.

Complement of a Graph

Let G be an (N, M) graph with vertex set V(G) = {1, 2, ..., N} and edge set E(G). The complement of graph G, denoted G^C, is defined to be the graph with vertex set V(G^C) = V(G) and edge set E(G^C) defined by: e ∈ E(G^C) if and only if e ∉ E(G).

Example: Define G by V(G) = {1, 2, 3, 4} and E(G) = { {1, 2}, {1, 3}, {1, 4} }. Then G^C is defined by V(G^C) = {1, 2, 3, 4} and E(G^C) = { {2, 3}, {2, 4}, {3, 4} }. [Figure: G and its complement G^C.]

Undirected Graphs: Basic Structure

We begin this section of the lecture by considering two questions related to the basic structure of undirected graphs. Similar considerations can be applied to directed graphs; it is just that the presentation is simpler in the undirected case.

Look at the following two graphs. How do they differ? How are they similar? [Figure: two four-vertex graphs.] Each of the two graphs contains a triangle and an isolated vertex. Obviously, the two graphs are not the same graph, in that the one on the left has vertex 3 isolated, while the one on the right has vertex 1 isolated. We shall develop a concept that describes the basic structural level at which these two graphs are essentially the same graph.

Another Example of Structure in Undirected Graphs

Consider the next two graphs. They look very different. But note that the two diagrams
represent exactly the same graph. In each diagram, we have two facts: each vertex with an odd number is connected to every even-numbered vertex, and each vertex with an even number is connected to every odd-numbered vertex. We need a way to focus on the structure of a graph independent of the way it is depicted or otherwise represented. One answer is graph isomorphism.

Graph Isomorphism

Two graphs G1 and G2 are said to be isomorphic, denoted by G1 ≅ G2, if there exists a one-to-one mapping F from V(G1) onto V(G2) such that (u, v) ∈ E(G1) if and only if (F(u), F(v)) ∈ E(G2).

Consider the following two graphs, with different vertex labels. [Figure: a four-vertex graph labeled 1 - 4 and another labeled A - D.] These two graphs are isomorphic under the following transformation: F(1) = A, F(2) = B, F(3) = C, and F(4) = D. The edge lists of the two graphs show this.

    Graph on left:  {1, 2}, {1, 4}, {2, 3}, and {3, 4}
    Graph on right: {A, B}, {A, D}, {B, C}, and {C, D}

In a sense, two isomorphic graphs represent the same graph.

More on Isomorphism

The idea of graph isomorphism allows us to focus on the basic structure. If two graphs are non-isomorphic, they have basically different structures. [Figure: three non-isomorphic graphs on three vertices.]

For any value of N, there are only a finite number of non-isomorphic graphs on N vertices. We can view these as a finite set. Define F(N) as the set of non-isomorphic graphs on N vertices, and F(N, M) as the set of non-isomorphic graphs on N vertices and M edges. For smaller values of N (N ≤ 4), we may easily display the entire sets F(N).

The Smaller Graphs: F(1), F(2), and F(3)

[Figure: F(1), the single one-vertex graph; F(2), the two-vertex graphs with no edges and with one edge; F(3), the three-vertex graphs with 0, 1, 2, and 3 edges, one of each: F(3, 0), F(3, 1), F(3, 2), F(3, 3).]

The Set F(4)

[Figure: the non-isomorphic graphs on four vertices, grouped by edge count: no edges, 1 edge, 2 edges, 3 edges, 4 edges, ...]

Standard Undirected Graphs: PN

Certain graph structures appear with sufficient regularity to be given names.

The Path: The path on N vertices is a connected graph with N - 1 edges in the form of a path. It has N - 2 vertices of degree 2 and two vertices of degree 1. The path on N vertices is denoted by PN. Here are depictions of the graphs P2, P3, and P4. [Figure.]

Standard
Undirected Graphs: CN

The Cycle: The cycle on N vertices is a connected graph with N edges and N vertices. Each vertex has degree 2. The cycle on N vertices is denoted CN. Note that we cannot define a cycle on two vertices. Here are C3 and C4. [Figure.]

Comment: It is the structure, not the labels, that makes the graph. The labels are for convenience only, and could be removed.

Standard Undirected Graphs: KN

The Complete Graph: The complete graph on N vertices, denoted by KN, is a graph in which every pair of vertices is adjacent. It has N vertices, each with degree N - 1, and it has C(N, 2) = N·(N - 1)/2 edges. Here are four complete graphs: K1, K2, K3, and K4, with 0, 1, 3, and 6 edges. [Figure.] Don't two of these graphs have other names? More later.

Standard Undirected Graphs: Bipartite Graphs

A set X ⊆ V(G) of vertices is said to be an independent set (stable set) if no two vertices in the set are adjacent. A set of vertices X ⊆ V(G) is said to be a maximal independent set if it is an independent set and not a subset of a larger independent set. [Figure: a seven-vertex example.] The set {1, 2} is independent, but not maximal. The set {1, 2, 3} is a maximal independent set, although it is not the largest independent set. The set {4, 5, 6, 7} is also a maximal independent set.

Bipartite and Complete Bipartite Graphs

A graph G = (V(G), E(G)) is called bipartite if the vertex set V(G) can be divided into two independent sets, called the partite sets of G. Let the two independent vertex sets be called X and Y; then V(G) = X ∪ Y and X ∩ Y = ∅.

A graph is called complete bipartite if each vertex in either partite set is adjacent to every vertex in the other partite set. By K(a, b) we mean the complete bipartite graph with partite sets of sizes a and b. Here is the complete bipartite graph K(1, 4). [Figure.]

The Star Graph

The complete bipartite graph K(1, 3) is often called either a star graph or a claw. This configuration is seen in network theory, where it is called a star topology, with the central vertex (node) being called a hub. Here are two depictions of the complete bipartite graph K(1, 3). [Figure.] An example of this usage is
seen in star-topology networks, as noted above.

Some Isomorphic Graphs

Here are two isomorphic graphs on two vertices: P2 and K2. [Figure.] Here are some isomorphic graphs on three vertices; these are triangles (C3). [Figure.] Here are some isomorphic graphs on four vertices. [Figure.]

The Vertices and Edges

Let G = (V, E) be a graph with vertex set V(G) and edge set E(G). Note: if {u, v} ∈ E(G), then necessarily u ∈ V(G) and v ∈ V(G).

Undirected graphs: If u ∈ V(G), v ∈ V(G), and {u, v} ∈ E(G), then:
1. Vertices u and v are said to be adjacent to each other.
2. Edge {u, v} is said to be incident on each of vertices u and v.
3. The degree of vertex v, denoted d(v), is the number of edges incident on v. Equivalently, it is the number of vertices adjacent to the vertex v.

The Vertices and Edges, Part 2

Let G = (V, E) be a graph with vertex set V(G) and edge set E(G). Note: if (u, v) ∈ E(G), then necessarily u ∈ V(G) and v ∈ V(G).

Directed graphs: If u ∈ V(G), v ∈ V(G), and (u, v) ∈ E(G), then:
1. Vertex u is adjacent to vertex v, and vertex v is adjacent from vertex u.
2. Edge (u, v) is said to be incident on each of vertices u and v.
3. The in-degree of a vertex v, d-(v), is the number of vertices adjacent to v: the number of vertices x such that (x, v) ∈ E(G). The out-degree of a vertex v, d+(v), is the number of vertices adjacent from v: the number of vertices y such that (v, y) ∈ E(G).

Neighborhoods in Undirected Graphs

Let G = (V, E) be an undirected graph with vertex set V(G) and edge set E(G). For v ∈ V(G), we define N(v), called the open neighborhood of the vertex v, as the set of vertices adjacent to v. The size of this open neighborhood is the degree of the vertex: |N(v)| = d(v). A vertex v is called isolated if d(v) = 0, equivalently N(v) = ∅.

For v ∈ V(G), we define N[v], called the closed neighborhood of the vertex v, as N[v] = N(v) ∪ {v}: the open neighborhood with the vertex itself added. Note the equation N[v] = N(v) ∪ {v}. This depicts the standard way of showing an element added to a set: as set union is defined only between sets, we take the element v and first create the singleton set {v}. This keeps the
notation simple and consistent.

Neighborhoods in Directed Graphs

Let G = (V, E) be a directed graph with vertex set V(G) and edge set E(G). For v ∈ V(G), we define N+(v), called the out-neighborhood of the vertex v, as the set of vertices adjacent from v: the set of vertices y with (v, y) ∈ E(G). Note that the size of the out-neighborhood is |N+(v)| = d+(v). A vertex v with d+(v) = 0 is sometimes called a sink vertex.

For v ∈ V(G), we define N-(v), called the in-neighborhood of the vertex v, as the set of vertices adjacent to v: the set of vertices x with (x, v) ∈ E(G). Note that the size of the in-neighborhood is |N-(v)| = d-(v). A vertex v with d-(v) = 0 is sometimes called a source vertex.

For a vertex v ∈ V(G), we define the following sets: the successor set of v is the out-neighborhood of v; the predecessor set of v is the in-neighborhood of v.

The Adjacency Matrix

Let G = (V, E) be a graph with vertex set V(G) = {1, 2, ..., N} and edge set E(G) ⊆ V(G) × V(G). We have used a pictorial representation of graphs to facilitate discussing them; we need a more formal method of representation if we want to develop algorithms that operate on graphs. We have two data structures that can be adapted to the depiction of graphs: the matrix and the linked list.

The adjacency matrix of a graph G with vertex set V(G) = {1, 2, ..., N} is defined as the square N-by-N matrix A, where A[J, K] = 1 if edge (J, K) ∈ E(G), and 0 otherwise. In the case of an undirected graph, we note that (J, K) ∈ E(G) if and only if (K, J) ∈ E(G). For this reason, the adjacency matrix of an undirected graph is symmetric: A[J, K] = A[K, J].

Example on Four Vertices

Consider G with V(G) = {1, 2, 3, 4} and E(G) = { {1, 2}, {1, 3}, {1, 4}, {2, 4} }. The 4-by-4 adjacency matrix of G is as follows:

    0 1 1 1
    1 0 0 1
    1 0 0 0
    1 1 0 0

NOTE: An undirected graph will always have a symmetric adjacency matrix. A directed graph might have a symmetric adjacency matrix; it is just not required to. If the adjacency matrix is not symmetric, the graph must be directed.

Another Example: A Directed Graph on Four Vertices

Here is a directed graph on four vertices: V(G) = {1, 2, 3, 4}, E(G) = { (1, 4), (2, 4), (3, 2), (4, 3) }. The 4-by-4 adjacency matrix of G is as follows:

    0 0 0 1
    0 0 0 1
    0 1 0 0
    0 0 1 0

Adjacency Lists

Let G be a graph with V(G) = {1, 2, ..., N} and edge set E(G). The adjacency list representation of the graph has a set of N linked lists. Each list has a header node identifying the vertex; each list contains the set of vertices adjacent from that vertex. For undirected graphs, this is the same as the set of vertices adjacent to that vertex. [Figure: adjacency lists for an undirected graph and for a directed graph on the vertices 1 - 4.]

Walks, Paths, and Connectivity

Let G = (V, E) be a graph with vertex set V(G) and edge set E(G). Let u ∈ V(G) and v ∈ V(G). A sequence of edges (v1, v2), (v2, v3), ..., (vK, vK+1) with u = v1, v = vK+1, and (vJ, vJ+1) ∈ E(G) for 1 ≤ J ≤ K is called a walk from u to v. The walk can be described by listing its vertices: u = v1, v2, ..., vK, vK+1 = v. If G is an undirected graph, the sequence is also a walk from v to u. The number of edges in the sequence is the length of the walk.

A walk with no edge repeated is called a path. A simple path u = v1, v2, ..., vK, vK+1 = v is a path with no repeated vertices. A cycle is similar to a simple path, except that the first vertex is also the last. Example of a cycle: u = v1, v2, v3, v4, v5, u. NOTE: This terminology has yet to settle down; various authors will give similar, but not identical, definitions.

Connected Graphs and Components

A graph G is said to be connected if for every distinct pair of vertices u ∈ V(G) and v ∈ V(G) there is a path from u to v; it is called a u-v path. A component is a maximal connected subgraph of G. A graph is connected if and only if it has one component; a disconnected graph has two or more components. [Figure: a graph with three components.] It is tempting to view this as three connected graphs G1, G2, and G3, each on its own vertex set.

Example: Paths in a Graph

Here is a graph on nine vertices, with two 1-9 paths shown. [Figure.] Vertex-disjoint paths are two paths with common end points but no other vertices in common. We have two vertex-disjoint 1-6 paths, followed by two vertex-disjoint 6-9 paths. But every 1-9 path must go through vertex 6. As a result of this fact, removal of vertex 6 will cut the graph in two.

Cycles and Tours in Graphs
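Before the formal definitions below, the walk/path/cycle distinctions can be made concrete in code. The sketch that follows is my own illustration, not from the notes (the names `classify` and `E` are hypothetical); it classifies a vertex sequence against an undirected edge set, using the five-vertex sample graph that appears in the examples:

```python
# Classify a vertex sequence against an undirected graph given as a
# set of frozenset edges: a walk (every consecutive pair is an edge),
# a path (a walk with no edge repeated), and/or a cycle (closed, with
# no repeated vertex other than the shared endpoints).
def classify(edges, verts):
    """Return a subset of {'walk', 'path', 'cycle'}; empty if not a walk."""
    steps = [frozenset(p) for p in zip(verts, verts[1:])]
    if not steps or any(s not in edges for s in steps):
        return set()                      # some consecutive pair is not an edge
    labels = {'walk'}
    if len(set(steps)) == len(steps):     # no edge repeated
        labels.add('path')
        if verts[0] == verts[-1] and len(set(verts[:-1])) == len(verts) - 1:
            labels.add('cycle')           # closed and internally simple
    return labels

# The sample graph from the notes: V = {1, ..., 5}.
E = {frozenset(e) for e in
     [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5)]}
```

For instance, classify(E, [1, 2, 3, 5, 2, 4]) reports a walk that is also a path (no edge repeats, even though vertex 2 repeats), while classify(E, [1, 2, 3, 1]) reports the triangle as a cycle.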
A u-v walk of G is a finite alternating sequence of vertices and edges, starting with u and ending with v. It can be denoted as a sequence of edges (v1, v2), (v2, v3), ..., (vK, vK+1) with u = v1, v = vK+1, and (vJ, vJ+1) ∈ E(G) for 1 ≤ J ≤ K. A walk with no edge repeated is called a path. A simple path u = v1, v2, ..., vK, vK+1 = v is a path with no repeated vertices. A closed walk is a walk in which the first vertex is the last vertex: u = v. A cycle is similar to a simple path, except that the first vertex is also the last.

An Euler tour (Euler circuit) of a graph is a closed walk through the graph that covers each edge exactly one time. Vertices may be covered multiple times. A Hamiltonian cycle of a graph is a cycle that contains each vertex exactly one time. It is likely that not all edges will be covered; no edge can be covered more than once. A graph is said to be Hamiltonian if it contains at least one Hamiltonian cycle.

Examples of Walks, Paths, and Cycles

Consider the following sample undirected graph. [Figure.] Here V(G) = {1, 2, 3, 4, 5} and E(G) = { {1, 2}, {1, 3}, {2, 3}, {2, 4}, {2, 5}, {3, 5}, {4, 5} }.

A 1-4 walk: {1, 2}, {2, 3}, {3, 5}, {5, 2}, {2, 4}, often written as 1, 2, 3, 5, 2, 4 (just the vertices). Note the repeated vertex.
One 1-4 path: {1, 2}, {2, 4}; this is the shortest 1-4 path.
Another 1-4 path: {1, 3}, {3, 5}, {5, 2}, and {2, 4}.
One cycle: {1, 2}, {2, 4}, {4, 5}, {5, 3}, {3, 1}.
Another cycle: {1, 2}, {2, 3}, {3, 1}; this is called a triangle.

Examples: Euler and Hamiltonian Circuits

Euler circuits and Hamiltonian circuits appear similar, but they are quite distinct. Consider two sets of graphs. We begin with a graph having an Euler circuit. [Figure: we have displayed the Euler circuit.] There is no Hamiltonian circuit. Theoretically, it is easy to determine whether or not a graph has an Euler circuit, and thus is to be called Eulerian. There is no easy way to determine the existence of a Hamiltonian circuit in a graph; it can be done only by a complete enumeration of all candidates (there are 24 in this graph).

Examples: Euler and Hamiltonian Circuits, Part 2

We now consider a Hamiltonian graph having no Euler circuit. [Figure.] There is no Euler circuit.
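The claim that testing for an Euler circuit is "theoretically easy" can be made concrete: check that the graph is connected and that every vertex has even degree (Euler's 1736 criterion, discussed in these notes). Here is a Python sketch of my own (the name `has_euler_circuit` is hypothetical), assuming the graph is given as a list of undirected edges with no isolated vertices:

```python
from collections import defaultdict

def has_euler_circuit(edges):
    """True if the undirected graph has an Euler circuit: it is
    connected and every vertex has even degree (Euler, 1736)."""
    deg = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        adj[u].add(v)
        adj[v].add(u)
    if not deg or any(d % 2 for d in deg.values()):
        return False                 # no edges, or an odd-degree vertex
    stack = [next(iter(adj))]        # depth-first search for connectivity
    seen = set(stack)
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(deg)     # one component reaches every vertex
```

The 4-cycle C4 passes; K4 fails, since every vertex has odd degree 3; two disjoint triangles fail the connectivity check even though every degree is even.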
Again, there is no easy way to determine if a general graph contains a Hamiltonian cycle; here I have just displayed one. The claim that the graph lacks an Euler circuit can be established by a theorem proved in 1736 by the Swiss mathematician Leonhard Euler (1707-1783).

Theorem: Let G = (V, E) be an undirected graph with no isolated vertices. Then G has an Euler circuit if and only if G is connected and every vertex of G has even degree.

This example graph has two vertices of odd degree, hence no Euler circuit.

Distances in Graphs

Let U and V be two vertices in a graph G. If there is a U-V path in G, then the distance from U to V is the number of edges in the shortest U-V path. If the vertices U and V are adjacent, then the distance from U to V is 1. If there is no U-V path in G, the distance from U to V is not defined. In the example: the distance from 1 to 2 is dist(1, 2) = 1; the distance from 1 to 4 is dist(1, 4) = 2.

Connectivity in Directed Graphs

A directed graph is said to be weakly connected if its underlying undirected graph is connected. It is said to be strongly connected if there is a u-v path for each ordered pair (u, v) of vertices. This directed (4, 3) graph is weakly connected, but not strongly connected: there is no path from vertex 3 to any other vertex. However, its underlying undirected graph, shown to the right, is connected.

Connectivity in Directed Graphs

The next example shows a directed (4, 4) graph that is strongly connected. Terminology: this graph meets the technical definition for being weakly connected, since its underlying graph is also connected. However, the term "weakly connected" is always used to imply "not strongly connected".

Subgraphs

Let G = (V(G), E(G)) be a graph. A subgraph H of the graph G is a graph H = (V(H), E(H)) with V(H) ⊆ V(G) and E(H) ⊆ E(G). Obviously, any graph G is a subgraph of itself. Here is a rather straightforward example of a subgraph; the subgraph on the left is a subgraph of the one on the right.

[Figure: a spanning tree (left) of a wheel graph (right)]

As we shall see later, the graph on the left is a spanning tree of the one at right. The graph at right is often called a wheel graph, with a
central vertex attached to a number of peripheral vertices otherwise arrayed as a cycle.

More on Subgraphs

When we say either "H is a subgraph of G" or "G contains H as a subgraph", we often mean that the graph G contains a graph isomorphic to H. Consider the following (4, 4) graph. We say that it contains a triangle (C3 or K3), a star graph (K1,3), several paths on three vertices (P3), and a number of other subgraphs.

C3: (1, 2), (2, 4), (4, 1)
K1,3: (1, 2), (1, 3), (1, 4)
P3: (2, 1), (1, 3)
P3: (2, 1), (1, 4)
P3: (3, 1), (1, 4)

Acyclic Graphs

A graph may contain one or more cycles CN as subgraphs. Here we see K4, the complete graph on four vertices. This graph contains a 4-cycle and four 3-cycles as subgraphs. A graph that does not contain a cycle as a subgraph is called acyclic.

Tree

Definition: A tree is a connected acyclic graph. Note: This differs from the book's definition on page 9, which is incorrect. This definition is simple and very powerful. It is also a bit puzzling. Here is a set of statements that can be proven to be logically equivalent:
1. G is a tree with n vertices and m edges.
2. Every two distinct vertices of G are connected by a unique path.
3. G is connected and m = n - 1.
4. G is acyclic and m = n - 1.

Theoretically, a single node K1 satisfies the above definitions. In order to avoid this nuisance fact in our theorems, we make a further definition. Definition: A nontrivial tree is a tree with at least two vertices. Unless specifically stated, when we say "tree" we mean "nontrivial tree".

Examples of Trees

All PN (paths on N vertices) are trees, although P2 is not much of a tree.

[Figure: the paths P2, P3, and P4]

All star graphs K1,N are trees. We shall give some examples that are more interesting, but first a few more definitions.

Spanning Subgraphs and Spanning Trees

Let G = (V(G), E(G)) be a graph with vertex set V(G) and edge set E(G). A spanning subgraph is a subgraph H with V(H) = V(G) and E(H) ⊆ E(G). The spanning subgraph is not required to be a connected graph. The only spanning subgraph of much practical use is the spanning tree. Definition: A spanning tree is a spanning subgraph that is a tree. Let T be a spanning
tree of an (N, M) graph G = (V(G), E(G)). What can we conclude?
1. T is a connected graph with N vertices and N - 1 edges.
2. T is a spanning subgraph of G, so V(T) = V(G) and E(T) ⊆ E(G).
3. G is a connected graph with at least N - 1 edges.

Remark: A connected (N, M) graph with M = N - 1 is a tree, thus it is its own spanning tree. A connected (N, M) graph with M ≥ N has many spanning trees.

Example: Spanning Trees for K4

The complete graph on four vertices has six edges. A spanning tree on four vertices will have three edges. We have C(6, 3) = 20 different ways to remove three edges from a set of six. Not all of these will yield a connected subgraph. Here is K4 and two non-isomorphic spanning trees.

Rooted Trees

Definition: A rooted tree is a tree in which one vertex has been chosen and designated as the root of the tree. Question: Is a rooted tree a directed graph? Answer: Nothing in the definition requires it to be a directed graph, though all common usage treats it as a directed graph. Here is the book's definition from page 9, as corrected: "A rooted tree is a directed graph that has no cycles and that has one distinct vertex, called the root, such that there is exactly one path from the root to every other vertex."

A vertex in a rooted tree is more commonly called a node. The distance of a vertex in a rooted tree from the root vertex is the number of edges in the unique path from the root to that vertex. Vertices in rooted trees are often ranked by level. The level of a vertex is the distance of the vertex from the root. The root is at level 0, by definition.

Rooted Trees, Part 2

Let v be a vertex in a rooted directed tree. We can prove a few facts about the vertex that lead to some common definitions.

Incoming Edges: Either the vertex v is the root of the tree, or there is exactly one vertex w adjacent to v; that is, (w, v) is an edge in the tree. This one node is the parent node of v. If x is a node on the unique path from the root node to a vertex v, then x is said to be an ancestor node of v. The root node is an ancestor of all other nodes.

Outgoing Edges: The vertex
v has zero or more vertices that are adjacent from it. In directed graph terminology, this is N(v), the open neighborhood of v. In tree terminology, this is the set of child nodes of v. A node v with no child nodes is called a leaf node. If node v is the ancestor node of a vertex y, then vertex y is said to be a descendant node of vertex v. All child nodes are descendant nodes.

Binary Trees

Definition: A binary tree is a rooted tree in which each node has at most two child nodes. Each child node is called a left child or right child, depending on how it is drawn in a typical depiction. By convention of the drawing style, a node has a right child only if it has a left child; there is no theory behind this convention. Each child node is the root of a subtree. The left subtree is the subtree rooted at the left child node; the right subtree is the subtree rooted at the right child node. Note that a single node, by definition, is a binary tree. It is the root of that tree.

Search Trees

The process of searching a graph is usually based on the generation of a spanning tree of the graph, rooted at a specific start node. There may be a number of ways to select the start node; the search results will vary. The figure at left shows a search tree that started at vertex 1 and moved to vertex 3. The figure at right shows a search tree that started at vertex 2. The figure at right might have been generated by Breadth First Search (more on this later), but the figure at left was generated to be a pretty picture.

Weighted Graphs

A weighted graph is a simple graph, directed or undirected, in which each edge has a weight (cost or capacity) associated with it. More formally: a weighted graph G is a triple (V, E, W) in which V is a non-empty set of vertices, E ⊆ V × V is a set of edges (the graph can be directed or undirected), and W is a function from the edge set E into R, the set of real numbers. For any edge e ∈ E, W(e) is the weight of e. For any two vertices u ∈ V(G) and v ∈ V(G), the distance between these two vertices is normally the smallest sum of
weights of edges in any path from u to v. Much of graph theory, and almost all of our examples, uses edge weights from the set of non-negative integers. An unweighted graph can easily be represented as a weighted graph in which every edge has weight of 1: W(e) = 1. Associated with each weighted graph G = (V, E, W) is an underlying unweighted graph that has the same structure, but which has no edge weights assigned.

Weighted Subtrees

Given a weighted graph G = (V, E, W), we often generate a number of specific spanning trees. There are two trees of interest:
1. A minimal spanning tree (MST), which is the spanning tree of G with smallest total edge weight. A MST need not be unique.
2. A shortest distance tree (nonstandard name), generated as a part of computing the shortest distance.

Here is a graph and two spanning subtrees. The tree at left is a MST. The one at right shows minimum distances from vertex 2.

The Adjacency Matrix of a Weighted Graph

Let G = (V, E, W) be a weighted graph with vertex set V(G) = {1, 2, …, N}, edge set E(G) ⊆ V(G) × V(G), and W as the set of weights. We have two data structures that can be adapted to depiction of graphs: the matrix and the linked list. The adjacency matrix of a weighted graph G with vertex set V(G) = {1, 2, …, N} is defined as the square N-by-N matrix A, where A[J, K] = W(J, K), the weight of the edge, if edge (J, K) ∈ E(G). If edge (J, K) is not in the graph, the assignment of a weight will depend on the algorithm used to examine the graph. Often we use "∞" for "no edge". Here is the adjacency matrix for the above graph, which was used to illustrate spanning trees.

[Figure: the adjacency matrix of the sample weighted graph]

The Adjacency List Representation of a Weighted Graph

In the adjacency list representation, the node in the linked list includes not only the vertex to which the edge is incident, but also the weight of that edge. Here is the graph and its adjacency list representation.

[Figure: a weighted graph and its adjacency list representation]

Greedy

This lecture focuses on the algorithm design strategy called "greedy". The greedy design strategy is usually applied
to optimization techniques. Topics for the lecture include:
1. Definition of optimization problems.
2. Definition of the greedy approach.
3. Example of greedy as applied to the problem of making change.
4. Spanning subgraphs, spanning trees, and minimum spanning trees.
5. Prim's Algorithm for generating a Minimum Spanning Tree (MST).

Optimization Problems

Optimization problems are those in which some function, called the value function, is either to be maximized or minimized. In general, we maximize profits or capacities, or we minimize costs or distances. Very often the problem solution is subject to constraints. Typical constraints are:
1. Weight limitations in a maximization problem.
2. Count limitations in either a maximization or minimization problem.
3. Structural constraints, such as visiting each node exactly once or assigning exactly one job to each worker.

A solution is called feasible if and only if it satisfies all constraints on the problem. A solution is called optimal if and only if both: 1) it is a feasible solution, and 2) it maximizes or minimizes the value function. There is no requirement for an optimal solution to be unique.

Three Design Strategies

We contrast algorithms following each of three design strategies as follows. Dynamic programming solutions build and extend partial solutions until a final solution is discovered. Backtracking solutions build a number of feasible solutions, retaining the best one discovered so far; at each step a tentative choice is made, which may be reconsidered later. Greedy solutions build exactly one solution, one step at a time.

An algorithm is called greedy if it can be described as follows:
1. The algorithm solves the problem by a sequence of steps.
2. Each choice is greedy, in that it is based on what appears best at the time. This might be called a local optimum.
3. Once a choice is made, it is never reconsidered.

Greedy solutions are characterized by the property that a sequence of locally optimum choices (what seems best at the time) will lead to a globally optimal
solution, one that is at least as good as any other feasible solution.

Example: Making Change

Consider the problem of giving change of seventeen cents using only U.S. coins currently in circulation. In this problem, we consider only dimes, nickels, and pennies. The value function to be minimized is the number of coins given as change. Here is the backtracking solution. One can give either 0 dimes or 1 dime.
1. If 0 dimes are given, one can give 0, 1, 2, or 3 nickels.
2. If 1 dime is given, one can give either 0 nickels or 1 nickel.

[Figure: the backtracking decision tree for 17 cents in change, branching on dimes, nickels, and pennies; the feasible solutions use 17, 13, 9, 5, 8, and 4 coins]

All solutions must be considered; the optimal solution is known only at the end.

Making Change Using the Greedy Algorithm

At each step, Greedy makes the "right" choice. Since there is only one choice at each level of the decision tree, the tree is quite simple.

[Figure: the greedy decision tree for 17 cents in change: 1 dime, 1 nickel, 2 pennies, for 4 coins, provably optimum]

Given the U.S. coin set (half dollar, quarter, dime, nickel, and penny), the algorithm for giving change amounting to less than $1.00 is simple:
1. Give as many half dollars as possible.
2. Give as many quarters as possible.
3. Give as many dimes as possible.
4. Give a nickel, if possible.
5. Give the rest in pennies.

Given the older U.S. coin set (50, 25, 10, 5, 3, 2, and 1), it is possible to give examples for which Greedy fails to produce the optimal answer.

Procedure MakeChange (Amt, C50, C25, C10, C05, C01)
-- The Greedy change algorithm.
-- Input:  Amt, the amount in cents, 0 ≤ Amt ≤ 99.
-- Output: C50, the number of half dollars; C25, the number of quarters; etc.
  C50 := 0; C25 := 0; C10 := 0; C05 := 0; C01 := 0
  If Amt ≥ 50 Then Amt := Amt - 50; C50 := 1 End If
  If Amt ≥ 25 Then Amt := Amt - 25; C25 := 1 End If
  While Amt ≥ 10 Do Amt := Amt - 10; C10 := C10 + 1 End While
  If Amt ≥ 5 Then Amt := Amt - 5; C05 := 1 End If
  C01 := Amt
End Procedure

Subgraphs, Spanning Subgraphs, and Spanning Trees

Let G be an (N, M) graph with vertex set V(G) and edge set E(G), N ≥ 1 and M ≥ 0. Let H be a graph with vertex set V(H) and edge set E(H). H is said to be a subgraph of G if and only if V(H)
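The greedy coin rules in the MakeChange procedure can be rendered in Python. This is an illustrative sketch (the function name is mine), using divmod to give as many of each coin as possible in descending order of value; for amounts below a dollar it produces the same counts as the If/While structure of the pseudocode:

```python
def make_change(amt):
    """Greedy change for the current US coin set; assumes 0 <= amt <= 99.
    Returns a dict mapping coin value (in cents) to the count given."""
    counts = {}
    for coin in (50, 25, 10, 5, 1):
        # divmod gives (how many of this coin, remaining amount).
        counts[coin], amt = divmod(amt, coin)
    return counts

# 17 cents: one dime, one nickel, two pennies -- four coins in all.
print(make_change(17))  # {50: 0, 25: 0, 10: 1, 5: 1, 1: 2}
```

For the current coin set this greedy procedure is provably optimal; for the older set with 3-cent and 2-cent pieces it is not.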
⊆ V(G) and E(H) ⊆ E(G). H is said to be a spanning subgraph of G if and only if V(H) = V(G) and E(H) ⊆ E(G). Note the difference here: the two vertex sets V(G) and V(H) are identical. H is said to be a spanning tree of G if and only if H is a spanning subgraph of G and H is a tree; specifically, H is a connected acyclic graph. If H is a spanning tree of the (N, M) graph G, then H is an (N, N - 1) graph: it has N vertices and N - 1 edges.

The weight of a tree is the sum of the weights of its edges. A MST (Minimum Spanning Tree) of a graph with edge weights is the spanning tree of minimum total edge weight. The MST need not be unique. For an unweighted graph, any spanning tree is a MST; all have weight N - 1.

Sample Graph and Three Spanning Trees

This example is adapted from the textbook.

[Figure: a weighted graph G and its three spanning trees, with total weights 6, 8, and 9]

The subtree on the left is the minimum spanning tree of G.

Two Algorithms for Minimum Spanning Trees

There are two algorithms commonly used to construct minimum spanning trees.
Prim: Select a starting vertex and grow a tree one edge at a time. The structure being generated is always connected and acyclic.
Kruskal: Build a number of acyclic sets of edges until you have a tree ("shake and bake").

Note that Kruskal's Algorithm, unlike Prim's, requires a method for managing sets of vertices. This includes an implementation of:
1. A method for creating singleton sets.
2. A method for creating unions of sets.
3. A Boolean test for set membership.

Prim's Algorithm

Algorithm Prim(G)
-- Prim's algorithm for constructing a minimum spanning tree.
-- Input:  A weighted connected graph G = (V, E) with vertex set V and edge set E.
-- Output: ET, the set of edges for the spanning tree (which, by definition, must
--         have the same vertex set as G), and WT, the total weight of that tree.
  VT := {v0}    -- Initialize the vertex set of T to any vertex.
  ET := ∅       -- Initialize its edge set to the empty set.
  WT := 0       -- Weight of this tree is 0.
  For I := 1 to |V| - 1 Do    -- |V| is the size of the vertex set
    Find a minimum weight edge e = (u, w) with w ∈ VT and u ∈ V, but u ∉ VT
    VT := VT ∪ {u}
    ET := ET ∪ {e}
    WT := WT + weight(e)
  End Do
  Return ET and WT

Start Prim's Algorithm

We now illustrate the execution of Prim's algorithm on this example.

[Figure: a weighted graph on the vertices A, B, C, D]

The graph G has four edges, each with weight as follows: Weight(A, B) = 1, Weight(A, C) = 5, Weight(A, D) = 2, Weight(C, D) = 3. Start the algorithm with VT = {A}, V - VT = {B, C, D}. Consider (A, B) with 1, (A, C) with 5, (A, D) with 2.

Place the First Tree Edge

The shortest edge connecting A to a vertex not in VT is the edge (A, B), with edge weight 1. Add this vertex to the tree. At this point we have the following:
VT = {A, B}    -- The tree has two vertices.
ET = {(A, B)}  -- The tree has one edge.
WT = 1
V - VT = {C, D}
Edges connecting a vertex in VT to a vertex in V - VT are (A, C) with 5 and (A, D) with 2.

Place the Second and Third Tree Edges

Now the shortest edge connecting the set {A, B} to a vertex in V - VT = {C, D} is the edge (A, D). We add this to the subtree, with the following results: the tree has three vertices, VT = {A, B, D}, and two edges, ET = {(A, B), (A, D)}. Now VT = {A, B, D} and V - VT = {C}. Finally, the shortest edge connecting vertex C to a vertex in the set {A, B, D} is the edge (C, D), with weight 3. The result:
VT = {A, B, C, D}
ET = {(A, B), (A, D), (C, D)}
This is our spanning tree.

Comparison to Single Source Minimum Distance Trees

In a later lecture we shall consider Dijkstra's Algorithm for determining the shortest path from one special vertex, the source vertex, to all other vertices in the graph. It is easily proven that the paths generated form a spanning tree of the graph, rooted at the source vertex. Occasionally this tree might be a minimum spanning tree. Here are two shortest path trees for our sample graph, with vertex A as the source vertex. Each tree shows the following distances: A to B, distance 1; A to C, distance 5; A to D, distance 2. The tree at left has total weight 6, and the tree on the right has total weight 8. Only the tree on the left is a MST.

Correctness Proof: Outline

The key step in building the MST, with edge set ET, by Prim's Algorithm is as follows: Find a minimum weight edge e = (u, w) with
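The hand trace of Prim's algorithm can be checked against a direct implementation. This Python sketch (all names are mine) scans every crossing edge on each round, an O(V·E) form that mirrors the trace rather than an optimized heap-based version; it assumes a connected undirected graph:

```python
def prim(vertices, weights, start):
    """Prim's MST: grow a tree from `start`, always adding the cheapest edge
    with exactly one endpoint in the tree.
    `weights` maps frozenset({u, v}) -> edge weight (undirected edges)."""
    in_tree = {start}
    tree_edges, total = [], 0
    while in_tree != set(vertices):
        # Scan every edge with exactly one endpoint in the tree
        # (assumes the graph is connected, so one always exists).
        best = min((e for e in weights if len(e & in_tree) == 1),
                   key=lambda e: weights[e])
        tree_edges.append(best)
        total += weights[best]
        in_tree |= best   # union adds the new endpoint to the tree
    return tree_edges, total

# The example graph: Weight(A,B)=1, Weight(A,C)=5, Weight(A,D)=2, Weight(C,D)=3.
w = {frozenset('AB'): 1, frozenset('AC'): 5,
     frozenset('AD'): 2, frozenset('CD'): 3}
edges, total = prim('ABCD', w, 'A')
print(total)  # 6 -- the tree uses edges (A,B), (A,D), (C,D), as in the trace
```

Running it on the example reproduces the three steps traced above: (A, B), then (A, D), then (C, D), for total weight 6.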
w ∈ VT and u ∈ V, but u ∉ VT; then VT := VT ∪ {u} and ET := ET ∪ {e}. The algorithm terminates when VT = V and V - VT = ∅, the empty set.

The proof of correctness is by contradiction. We assume that at some stage of the construction we have an edge e as described, but do not place it in the edge set ET. Specifically, we assume that the edge e is not at any time placed in the spanning tree. The outline of the proof is quite simple. We are constructing the spanning tree and at the moment have a tree TI with I vertices. Consider two edges e = (u, v), u ∈ VT, v ∈ V - VT, and e′ = (x, y), x ∈ VT, y ∈ V - VT, with W(e′) > W(e). We construct the tree with edge e′ = (x, y), omitting edge e = (u, v), and generate a spanning tree with a given total weight W. We then remove the edge e′ = (x, y) and add the edge e = (u, v). This creates a spanning tree of less total weight, a contradiction.

Correctness Proof, Part 1

We are constructing the spanning tree and at the moment have a tree TI with I vertices. We are considering addition of an edge to generate a tree TI+1 with I + 1 vertices. We are looking for edges of the form e = (x, y), with x ∈ VT and y ∈ V - VT.

The first thing that we note is that the graph TI+1 with I + 1 vertices is a tree, provided only that the graph TI with I vertices is also a tree.
1. TI is a connected acyclic graph with I vertices and I - 1 edges. We add one edge and one vertex to get a connected graph TI+1 with I + 1 vertices and I edges.
2. Because x ∈ VT and y ∈ V - VT, addition of edge e = (x, y) does not introduce a cycle. A cycle would imply another x-y path, thus that y ∈ VT already.

Chapter 7: Space and Time Tradeoffs

This chapter covers a few topics relating to the possibility of creating an algorithm that is more time-efficient by using some extra computer memory. The first topic to be covered illustrates input enhancement, in which preprocessing the input gives a faster algorithm.

String Matching

The first problem to be examined is string matching: finding the next occurrence of a string of M characters (informally called a "word") in a string of N characters, called the text. There are two obvious cases that can be
dismissed immediately: M > N (the pattern is longer than the text, and so cannot exist in the text) and M = N (the pattern and text are the same length; we test for equality). In this set of notes we assume that the pattern is shorter than the text, thus M < N. We assume the usual case of M << N, read "M is much less than N".

The brute-force algorithm for string matching was discussed in section 3.2 of the textbook, although we did not discuss it at the time. Part of the algorithm is shown below.

-- Input:  An array P[0 .. M-1] of M characters, called the pattern.
--         This is what we are searching for.
--         An array T[0 .. N-1] of N characters, which is the text being
--         searched for the pattern.
-- Output: The position of the first character in a section of the text
--         that matched the pattern, or -1 if no match.
For I := 0 to N - M Do    -- T[N - M] is the last character that can start the word.
  J := 0
  While J < M And P[J] = T[I + J] Do
    J := J + 1
  End While
  If J = M Return I
End For
Return -1    -- Failure.

The basic approach is quite easily stated. The first character in the pattern to be matched is P[0]. We try to match it with T[0], the first character in the text. If this matches, we try to match P[1]; otherwise, we try to match P[0] to another character. In other words: try to match P[0] to T[I]; if no match, set I := I + 1 and try again.

While this algorithm is demonstrably correct, it does a lot more work than is necessary. We can see this by trying to match the word "BAT" against a string of words taken from an earlier example.

Index    012345678901234
Text     BAG BAR BAT CAT DOG
Pattern  BAT

In the brute-force algorithm, we do the following match sequences: P[0] = T[0], P[1] = T[1], but P[2] ≠ T[2]; try again. P[0] ≠ T[1]; try again. P[0] ≠ T[2]; try again. P[0] ≠ T[3]; try again. P[0] = T[4], P[1] = T[5], but P[2] ≠ T[6]; try again. We finally get the answer with P[0] = T[8], P[1] = T[9], and P[2] = T[10].

The clue to an improved algorithm is to note that we do not need to match P[0] first, but could just as easily start with the last character in the pattern, here P[2]. Look at the above instance of the string match problem. We start by noting that the pattern to be matched has three
characters, represented in the array P[0 .. 2]. We begin the match by comparing P[2] to T[2]. We discover that P[2] ≠ T[2]. Now we come to the basic insight. We see that T[2] = 'G' and note that the pattern to be matched does not contain a 'G'. That means that we can slide the pattern by its entire length over the text, to get the following.

Index    012345678901234
Text     BAG BAR BAT CAT DOG
Pattern     BAT

We now compare P[2] to T[5]. We note that P[2] ≠ T[5], but that T[5] = 'A', a letter that is in the word. We slide the pattern by 1 character to align the 'A' and attempt a match.

Index    012345678901234
Text     BAG BAR BAT CAT DOG
Pattern      BAT

We now compare P[2] to T[6] and note that P[2] ≠ T[6]. T[6] = 'R', a character not in the pattern, so we can slide the pattern by its length again.

Index    012345678901234
Text     BAG BAR BAT CAT DOG
Pattern         BAT

Again, we compare P[2] to T[9]. We note that P[2] ≠ T[9], but that T[9] = 'A', a letter that is in the word. We slide the pattern by 1 character to align the 'A' and attempt a match.

Index    012345678901234
Text     BAG BAR BAT CAT DOG
Pattern          BAT

We now have P[2] = T[10], P[1] = T[9], and P[0] = T[8]. We have a match.

The key to applying methods such as this is to preprocess the input by creating a table of permissible shifts, indexed by characters that are possible in the text. All characters not in the pattern are assigned a shift equal to the length of the pattern. Characters in the pattern are assigned a shift depending on the distance from the rightmost occurrence of the character to the end of the pattern. For P = "BAT" we have Table['B'] = 2, Table['A'] = 1, Table['T'] = 0, and Table[c] = 3 for all other characters c.

Here is my version of the algorithm ShiftTable, found on page 253 of the text. My version differs from that of the text in that it assigns a shift of 0 to the last character in the pattern, thus Table['T'] = 0. This is not an issue, because the string matching algorithms will not check this entry in the array.

Algorithm ShiftTable(P[0 .. M-1])
-- Fills the shift table used by the pattern match algorithms.
-- Input:  P[0 .. M-1], the pattern of length M > 0.
--         CharSet[0 .. Size-1], the
alphabet of given size from which the pattern and text are drawn.
--         For 7-bit ASCII, Size = 128.
-- Output: Table[0 .. Size-1], a table of shift values, with one value for each
--         character in the alphabet. Here we use the C/C++ trick of making a
--         character equivalent to a small integer, say in the range [0, 127].
  For J := 0 to Size - 1 Do Table[J] := M
  For J := 0 to M - 1 Do Table[P[J]] := M - (J + 1)

Apply this variant of the algorithm to the pattern REORDER to get Table['R'] = 0, Table['E'] = 1, Table['D'] = 2, Table['O'] = 4, and 7 for all the others.

To show how this algorithm works, let's look at the processing of REORDER. Note that the value of some of the table entries is overwritten several times. Other approaches can be designed to avoid overwriting, but these approaches involve extra complexity. Apply the algorithm, assuming a 7-bit ASCII character set of size 128. The length of the pattern is 7, so we initialize the table to this length: For J := 0 to 127 Do Table[J] := 7. Now we execute the next loop found in the algorithm: For J := 0 to 6 Do Table[P[J]] := M - (J + 1).

J = 0: P[0] = 'R', Table['R'] := 6
J = 1: P[1] = 'E', Table['E'] := 5
J = 2: P[2] = 'O', Table['O'] := 4
J = 3: P[3] = 'R', Table['R'] := 3    (changing a value)
J = 4: P[4] = 'D', Table['D'] := 2
J = 5: P[5] = 'E', Table['E'] := 1    (another change)
J = 6: P[6] = 'R', Table['R'] := 0    (another change)

Here is Horspool's algorithm, based on the above constructor of the shift tables.

Algorithm Horspool(P[0 .. M-1], T[0 .. N-1])
-- Input:  P[0 .. M-1], the pattern to be matched, and T[0 .. N-1], the text
--         to be searched.
-- Output: The index of the left end of the first matching substring, or -1
--         if there are no matches.
  ShiftTable(P[0 .. M-1])    -- Create the shift table.
  I := M - 1                 -- Position of the pattern's last character.
  While I < N Do
    K := 0
    While K < M And P[M - K - 1] = T[I - K] Do K := K + 1 End While
    If K = M Then Return I - M + 1
    Else I := I + Table[T[I]]
  End While
  Return -1

Hashing

Hashing is an algorithm used for "dictionary" type applications, which maintain a list of keys. The ADT (Abstract Data Type) dictionary provides for searching by key, insertion of keys, and deletion of keys. We shall see later that deletion from a hashed list must be done carefully. Hashing is based on the idea of distributing a large number of keys into a
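The shift-table idea and the Horspool scan can be sketched in Python. Note one deliberate difference from the lecture's variant: this sketch builds the table from the first M - 1 pattern characters (the textbook's original rule), so a text character equal to the pattern's last character still yields a positive shift and the scan cannot stall on a near-miss:

```python
def shift_table(pattern):
    """Build the Horspool shift table from the first M-1 pattern characters.
    Characters absent from that prefix keep the default shift M, supplied
    via dict.get() in horspool() below."""
    m = len(pattern)
    # Later (more rightward) occurrences overwrite earlier ones.
    return {ch: m - 1 - j for j, ch in enumerate(pattern[:-1])}

def horspool(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    table = shift_table(pattern)
    i = m - 1                         # text index under the pattern's last character
    while i < n:
        k = 0
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1          # left end of the match
        i += table.get(text[i], m)    # slide the pattern right
    return -1

print(horspool("BAT", "BAG BAR BAT CAT DOG"))  # 8
```

On the BAT example this performs exactly the alignments traced above (shifts of 3, 1, 3, 1) and reports the match at index 8.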
one-dimensional array H[0 .. M-1], called a hash table. The index of the key in the hash table is computed by a hash function, which we denote by h(K) for key K. In the more common examples, the keys are words formed from an alphabet. If we use the function ord(c) to associate a position in the alphabet with every character, then we commonly hash a key K of length S by the following. Assume that K = K0 K1 … KS-1, so that KI is a character in the key K.

Algorithm Hash(K)
  h := 0
  For I := 0 to S - 1 Do
    h := (h · C + ord(KI)) mod M    -- The book uses ord(cI).
  End Do
  Return h

with C being a constant larger than the number of characters in the alphabet. We have two requirements on the hash function:
1. It should be easy to compute, as we are using this to save time.
2. It needs to distribute keys as evenly as possible among the cells of the table.

What we mean by "even distribution" is that each contiguous subarray of the array be filled to approximately the same fraction. Suppose that a hash table of size M is used to hash a list of N keys. We define the load factor as α = N / M. Some hashing methods require that α < 1.0, and others allow α to be greater than 1. The simplest and most intuitive approach requires α < 0.9 for reasonable performance.

One way to achieve even distribution of the hashed keys is to make the hash table have a size that is a prime number. It has been found to be useful to make the multiplier C relatively prime to the table size. For example, suppose that we want to have a hash table of size close to 150. We could select M = 151, a prime number, and C = 301.

As an example, we hash the word REORDER, using the ASCII codes for the characters as the ord function: ASCII 'D' = 0x44 = 68, ASCII 'E' = 0x45 = 69, ASCII 'O' = 0x4F = 79, ASCII 'R' = 0x52 = 82. Start the hash algorithm with h = 0. The word has length 7, so S - 1 = 6.

I = 0: c = 'R', h = (0·301 + 82) mod 151 = 82 mod 151 = 82
I = 1: c = 'E', h = (82·301 + 69) mod 151 = 24751 mod 151 = 138
I = 2: c = 'O', h = (138·301 + 79) mod 151 = 41617 mod 151 = 92
I = 3: c = 'R', h = (92·301 + 82) mod 151 = 27774 mod 151 = 141
I = 4: c = 'D', h = (141·301 + 68) mod 151 = 42509 mod 151 = 78
I = 5: c = 'E', h = (78·301 + 69) mod 151 = 23547 mod 151 = 142
I = 6:
c = 'R', h = (142·301 + 82) mod 151 = 42824 mod 151 = 91

By this calculation, we have h(REORDER) = 91. The hash table has size 151 and is indexed from 0 through 150 inclusive. The requirement that the hash function distribute the keys evenly over the table implies, for any other key K, that there is a 1 in 151 chance that h(K) = 91.

Collisions

We are considering the problem of hashing a key for insertion into an array of size M. We have seen that the hash function h(K) produces a number in the range [0, M-1], to be used as an index into the array. If two different keys K1 and K2 hash to the same index value, we say that we have a collision. It should be obvious that if we hash N keys with N > M, we must have a collision; in mathematics this is called the "pigeonhole principle". Handling collisions is one necessity of doing hashing. We shall consider two of the more common approaches to handling collisions: open hashing and closed hashing.

As an example, we use the phrase used in the book's example: A FOOL AND HIS MONEY ARE SOON PARTED. According to the hash function applied in the book, we have the following values.

K      A   FOOL   AND   HIS   MONEY   ARE   SOON   PARTED
h(K)   1     9     6     10      7     11     11       12

We note that two different words hash to the same index: ARE and SOON both hash to the index 11. Obviously, the two cannot be stored in the same location.

Open Hashing

The first solution is to have the array hashed into be a list of header nodes for a set of linked lists, and to store the words in the linked lists. The average number of entries in each linked list would be the load factor of the hash table, α = N / M. If one searches a linked list for an element, one has to search about half of the list for an element that is there, and all of the linked list for elements that are not there. Thus the number of probes is S ≈ 1 + α/2 for successful searches and U = α for unsuccessful searches.

[Figure: the hash table as an array of linked lists, with FOOL in the list at index 9, and ARE followed by SOON in the list at index 11]

Processing Hash Tables with Open Hashing

There are four basic operations to be considered in any hash table: initialization, key insertion, key search, and
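The arithmetic of the REORDER trace can be verified mechanically. A small Python sketch of the hash function, with M = 151 and C = 301 as in the example (the function name is mine):

```python
def hash_key(key, m=151, c=301):
    """Horner-style string hash: fold in one character at a time,
    computing h := (h*C + ord(ch)) mod M."""
    h = 0
    for ch in key:
        h = (h * c + ord(ch)) % m
    return h

print(hash_key("REORDER"))  # 91, matching the step-by-step trace
```

Any key, not just REORDER, lands somewhere in [0, 150]; an even spread over those slots is the design goal stated earlier.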
key deletion. Here we consider each operation. Open hashing stores the keys in linked lists attached to the hash table. Our assumptions about the linked lists:
1. Duplicate keys are not stored.
2. The linked lists are kept as ordered lists, although this is not necessary.

The hash table is initialized by setting each of its entries to the null pointer. A key K is searched for by computing h = h(K) and then searching List[h]. Note that a hash table with open hashing can have a load factor greater than one.

Closed Hashing

In closed hashing, all entries are placed in the array itself. Two keys K1 and K2 are said to collide if K1 ≠ K2 and h(K1) = h(K2). In open hashing, both keys K1 and K2 can be stored on the linked list associated with the hash address, so collisions do not present a problem. In closed hashing, each of K1 and K2 must be stored in an array element, and both hash to the same address; thus one must be placed elsewhere. There are two approaches to handling the collisions that can be expected to occur:
1. linear probing (not very efficient), and
2. double hashing.
We shall present the two methods within the context of operations on the hash table.

Processing Hash Tables with Closed Hashing

There are four basic operations to be considered in any hash table: initialization, key insertion, key search, and key deletion. As we shall see later, simple deletion of keys from a hash table with closed hashing presents some unintended consequences. The better approach is to allow only "lazy deletion", in which a key is marked as deleted, rather than actually deleted. This requires associating a Boolean flag with every key, perhaps storing a struct rather than a simple character string.

The hash table is initialized by setting each of its entries to all blanks, or some other entry that cannot be a word. Here we assume initialization to blanks. Consider insertion into a hash table of size M, denoted as an array H[0 .. M-1].
1. We produce the index h = h(K).
2. If H[h] is blank, we set H[h] := K.
3. If H[h] = K, we check the deleted flag and reset it to Not Deleted, if necessary.
4. If H[h] ≠ K
and H[h] is not blank, compute a new hash value h according to linear probing or double hashing (defined below) and go to step 2.

The reader will note that the above does not automatically insert into the first slot with a key marked as deleted, since this may introduce duplicate keys. As we shall see below, it is necessary to follow the hash chain to an empty slot in order to be sure that the key is not already in the table. When finding the key not present in the table, one might want to return to the first slot marked as deleted and insert the key there. Note that this insertion process will never terminate if the hash table is full. Any reasonable implementation of a hash table object would include a way of keeping track of the number of entries in the hash table, so that an insertion would not be attempted for a full table.

The following steps are followed in searching the table:
1. We produce the index h = h(K).
2. If H[h] is blank, we return "key not found".
3. If H[h] = K, we return "key found".
4. If H[h] ≠ K and H[h] is not blank, compute a new hash value h according to linear probing or double hashing (defined below) and go to step 2.

The following steps are followed in deleting a key from the table:
1. Produce the index h = h(K).
2. If H[h] is blank, do nothing.
3. If H[h] = K, mark the entry as deleted.
4. If H[h] ≠ K and H[h] is not blank, compute a new hash value h according to linear probing or double hashing (defined below) and go to step 2.

In linear probing, if H[h] is not empty we continually increase h, setting h := (h + 1) mod M, until we find an empty slot. In double hashing, if H[h] is not empty we hash again using another hash function s(K): next check H[(h(K) + s(K)) mod M], and if that is not empty, continually add s(K) modulo M until an empty slot is found. Double hashing is more complex, but it avoids the clustering that tends to occur when linear probing is applied.

In order to illustrate concepts associated with closed hashing, we consider the above phrase slightly modified: A FOOL AND HIS MONEY ARE SOON PARTED REALLY, under the very arbitrary assumption that
REALLY also hashes to 11. We also postulate a second hash function, which we arbitrarily declare has the following values: s(SOON) = 6 and s(REALLY) = 9.

Here is the status of the array (slots 0 through 12) after the word ARE is processed:

  [ -, A, -, -, -, -, AND, MONEY, -, FOOL, HIS, ARE, - ]

At this point we need to decide whether to use linear probing or double hashing. We first consider linear probing.

Insert SOON, with h(SOON) = 11. Slot 11 is taken, so we try slot 12. It is empty.

  [ -, A, -, -, -, -, AND, MONEY, -, FOOL, HIS, ARE, SOON ]

Insert PARTED, with h(PARTED) = 12. Slot 12 is taken, so we try (12 + 1) mod 13 = 0. Slot 0 is empty, so we insert PARTED at that location.

  [ PARTED, A, -, -, -, -, AND, MONEY, -, FOOL, HIS, ARE, SOON ]

Insert REALLY, with h(REALLY) = 11. There is a collision, so we try slots 12, 0, and 1 before finding that slot 2 is open. The state of the table after inserting REALLY is shown next.

  [ PARTED, A, REALLY, -, -, -, AND, MONEY, -, FOOL, HIS, ARE, SOON ]

Note the clustering that results from linear probing. We have a run of 7 words beginning at location 9, with the run of 5 words beginning at location 11 and continuing to location 2 being the direct result of linear probing.

Suppose we now delete the word ARE. If it is marked as deleted, the processing is OK. Suppose that we really delete the word, to obtain the following table.

  [ PARTED, A, REALLY, -, -, -, AND, MONEY, -, FOOL, HIS, -, SOON ]

An attempt to locate the words SOON (with h(SOON) = 11) and REALLY (with h(REALLY) = 11) will indicate that neither word is present, as the search logic immediately hits a blank entry. It is for this reason that deletion is done by marking the entry.

Now we consider what would happen for double hashing. Here again is the status of the array after the word ARE is processed:

  [ -, A, -, -, -, -, AND, MONEY, -, FOOL, HIS, ARE, - ]

We insert the word SOON, with h(SOON) = 11 and s(SOON) = 6. Note that slot 11 is already taken, so we compute (11 + 6) mod 13 = 17 mod 13 = 4. Slot 4 is open, so we have:

  [ -, A, -, -, SOON, -, AND, MONEY, -, FOOL, HIS, ARE, - ]

We now insert the word PARTED, with h(PARTED) = 12. Slot 12 is empty, so we get:

  [ -, A, -, -, SOON, -, AND, MONEY, -, FOOL, HIS, ARE, PARTED ]

We now insert the word REALLY, with h(REALLY) = 11 and s(REALLY) = 9. Slot 11 is taken, so compute h = (11 + 9) mod 13 = 20 mod 13 = 7. Slot 7 is taken, so compute h = (7 + 9) mod 13 = 16 mod 13 = 3. Note that one can get the same value from (11 + 2*9) mod 13 = 29 mod 13 = 3. Slot 3 is open, and the end result of this process is shown below.

  [ -, A, -, REALLY, SOON, -, AND, MONEY, -, FOOL, HIS, ARE, PARTED ]

It is often advantageous to adapt closed hashing to allow more than one key to be stored in a given location in the hash table. The use of buckets allows such an arrangement. A bucket is just an entry in the hash table that can hold more than one key. Consider a hash table of size M in which each entry is a bucket that can hold B entries. This can be considered as a two-dimensional array H[0 .. M - 1][0 .. B - 1]. As in simple hashing, when a bucket is filled one must use either linear probing or double hashing. Hashing with buckets is often applied to access of disk entries identified by keys.

Cryptographic Hashing

So far we have discussed the use of hashing to create an index into a hash table. With closed hashing we had to attend to the possibility of collisions, but that was a minor addition. We then covered some issues in the maintenance of a hash table. Hashing is a general technique useable in many arenas. One such arena is cryptographic hashing, which involves the use of a cryptographic-quality hash function. Theoretically, a cryptographic hash function could be used to create an index into a hash table; practically, this is the equivalent of using a sledgehammer to kill a roach.

A hash function is considered cryptographic quality if:
1. The function is one-way; thus it is not possible to retrieve K from its hash h(K). The hash function we considered above satisfied this requirement. In the example just considered, we had two keys hash to h(K) = 11. Given h(K) = 11, it is obviously
impossible to find what single key gave rise to the value.
2. Given a specific hash value h, it is computationally infeasible (very difficult) to construct a key K with the property that h(K) = h. Computational infeasibility should be viewed in the light of similar problems, such as opening a combination lock by trying all possible combinations.

Within the context of network communications, a cryptographic hash function is used to insure that a message M has not been altered in transit. One computes h(M) and transmits that value also. The recipient, on receipt of the message M, can easily compute h(M) and verify that it is the value that has been transmitted. The security of this method depends on the computational infeasibility of producing another message M2 with the same hash. Two commonly used hash functions are MD5, which produces a 128-bit hash result, and SHA, which produces a 160-bit hash result. Both of these have been examined for some time and found to be of cryptographic quality.

One interesting use of cryptographic hash functions is as a combination lock. I give you a 128-bit (32 hex digit) hash result h and say that I will grant you access to a resource when you compute a key K such that h(K) = h. For each possible key K it is easy to compute the value h(K), but the only practical way to find the correct K is to try all possibilities until one produces the correct value. This takes quite a bit of time.

B-Trees

One way to increase the speed of access to a collection of records, especially a collection stored on a disk drive, is to create an independent index structure. This structure is an access aid. A B-Tree is one structure devised to maintain such an index. Pure binary search trees can become seriously unbalanced if they are not maintained carefully; a B-Tree avoids the problems associated with unbalanced trees. Consider the following keys: 'A', 'B', 'C', 'D', 'E', and 'F', entered two ways. If entered as 'D', 'B', 'F', 'A', 'C', and 'E', one gets the balanced tree on the left. If entered as 'B', 'A', 'C', 'D', 'E', and 'F', one gets the unbalanced tree on
the right. (Figure: the balanced tree and the unbalanced tree.)

There are a number of tree structures that can be called self-balancing, in that they maintain balance as keys are inserted and deleted. In this context, balance allows at most a difference of 1 in the distance from root to leaf. The tree at left is balanced, because all leaf nodes are at distance 2 from the root. The tree at right has one leaf at distance 1 and one leaf at distance 4. As an ADT, a self-balancing tree adjusts itself on insertion or deletion so as to remain in balance; we may say a bit more of this later.

These notes will cover a variant of the B-Tree, called a B+Tree, which can be viewed as a special type of linked list with an index structure built on it for ease of access. In a B+Tree, all keys are stored in leaf nodes. A leaf node usually contains a number of keys, up to a pre-set limit. (Figure: the structure of a leaf node in a B+Tree, with keys and pointers to the records; the records themselves tend to be rather large. The pointer marked "Next node in linked list" is unique to B+Trees.)

The ordering of keys in a B-Tree is obvious within the leaf nodes: here K1 < K2 < K3 < K4. This applies to both B-Trees and B+Trees, but is easier to explain in B+Trees. No key in a "next" node is less than or equal to any key in the present node. Thus, if the leaf nodes are laid out "left to right", the key values will be seen to increase in an orderly fashion.

The structure of a typical internal node is shown in the figure below. (Figure: a node holding P0, K1, P1, K2, P2, K3, P3.) In general, a node in a B-Tree contains N - 1 keys and N pointers; this example has 3 keys and four pointers. Again the keys must be ordered, preferably K1 < K2 < K3, as having equal keys serves no purpose. Simply stated, each key in the subtree pointed to by P(i-1) is smaller than Ki; by implication, each key in the subtree pointed to by Pi is greater than or equal to Ki.

One point remains ambiguous in a discussion based only on a single node: we have not stated a lower limit on keys in the P0 subtree, or an upper limit on keys in the P3 subtree. Consider the node together with its parent node; the parent supplies the bound, so that K3 <= (keys in the P3 subtree) < (the next key in the parent node). A similar argument places a lower limit on keys in the P0 subtree of the right child node. We now come to the balance
requirements for a B-Tree. The order of a B-Tree is the maximum number of child nodes to which a single node can point. A B-Tree of order M >= 2 must satisfy the following requirements:
1. The root node is either a leaf node or has between 2 and M child nodes (this precludes a root node with one child node).
2. Each node, other than the root, has between ceiling(M/2) and M child nodes; a leaf node holds between ceiling(M/2) - 1 and M - 1 keys.
3. All leaf nodes appear at the same level.

The figure below is an example of a B+Tree of order 4, adapted from the figure on page 259 of the textbook. The major difference is the linking of the leaf nodes to make it a B+Tree.

We now operate on this tree. First we consider the easiest step. Suppose that key value K = 13 is inserted. The target leaf node has room for another key, so the insertion is direct. (Figures: Before Insertion of K = 13; After Insertion of K = 13.)

Suppose next that K = 17 is inserted. The target leaf node is full and must be split into two nodes, one holding 15 and 16 and the other holding 17 and 19. We need to select a key for the parent node that will admit navigating to the new leaf. Note that the parent node has only two keys, with a third key available, so we set K3 = 17 and link in the new leaf node. (Figure: Subtree after inserting K = 13 and K = 17.) We could then make an insertion of a key such as 18 fairly easily.

Suppose now that I want to insert a key value of K = 42. What are the issues? The subtree into which the key value K = 42 will be inserted is shown at left. Note that the leaf node into which it will be inserted is full and must be split. This causes problems. (Figures: Desired result; Before Insertion; After Insertion.) The set of keys obtained is as follows: K = 21, K = 34, K = 40, and K = 42. But there is no key slot available in this node.

Fixed Point Arithmetic and the Packed Decimal Format

This set of lectures discusses the idea of fixed point arithmetic and its implementation by the Packed Decimal Format on the IBM mainframes. The topics include:
1. A review of IBM S/370 floating point formats, focusing on the precision with which real numbers can be stored.
2. The difference between the precision requirements of business applications and those of scientific applications.
3. An overview of the Packed Decimal format as implemented on the IBM S/370 and its predecessors.

We begin with a review of IBM notation for denoting bit numbers in a
byte or word. From our viewpoint, it is a bit non-standard.

IBM S/370 Terminology and Notation

The IBM 370 is a byte-addressable machine; each byte has a unique address. The standard storage sizes on the IBM 370 are the byte, the halfword, and the fullword.
  Byte       8 binary bits
  Halfword  16 binary bits (2 bytes)
  Fullword  32 binary bits (4 bytes)

In IBM terminology, the leftmost bit is bit zero, so we have the following bit numberings.
  Byte      bits 0 - 7
  Halfword  bits 0 - 15
  Fullword  bits 0 - 31
Comment: The IBM 370 seems to be a big-endian machine.

S/370 Available Floating Point Formats

There are three available formats for representing floating point numbers.
  Single precision     4 bytes    32 bits, numbered 0 - 31
  Double precision     8 bytes    64 bits, numbered 0 - 63
  Extended precision  16 bytes   128 bits, numbered 0 - 127

The standard representation of the fields is as follows.
  Format    Sign bit  Exponent bits  Fraction bits
  Single       0         1 - 7          8 - 31
  Double       0         1 - 7          8 - 63
  Extended     0         1 - 7          8 - 127

NOTE: Unlike the IEEE-754 format, greater precision is not accompanied by a greater range of exponents. The precision of the format depends on the number of bits used for the fraction.
  Single precision  24-bit fraction  1 part in 2^24   about 7 digits precision
  Double precision  56-bit fraction  1 part in 2^56   about 16 digits precision
  2^56 = 10^(0.30103 * 56) = 10^16.86, or about 7 * 10^16.

Precision Example (Slightly Exaggerated)

Consider a banking problem. Banks lend each other money overnight. At 3% annual interest, the overnight interest on $1,000,000 is $40.49. Suppose my bank lends your bank $10,000,000 (ten million). You owe me $404.92 in interest, or $10,000,404.92 in total. With seven significant digits, the amount might be calculated as $10,000,400. My bank loses $4.92. I want my books to balance to the penny. I do not like floating point arithmetic.

TRUE STORY: When DEC (the Digital Equipment Corporation) was marketing their PDP-11 to a large New York bank, it supported integer and floating point arithmetic. At this time the PDP-11 did not support decimal arithmetic. The bank told DEC something like this: "Add decimal arithmetic and we shall
buy a few thousand. Without it, no sale." What do you think that DEC did?

Precision Example: Weather Modeling

Suppose a weather model that relies on three data to describe each point:
1. The temperature in kelvins. Possible values are 200 K to 350 K.
2. The pressure in millibars. Typical values are around 1000.
3. The percent humidity. Possible values range from 0.00 through 100.00.

Consider the errors associated with single precision floating point arithmetic. The values are precise to 1 part in 10^7. The maximum errors resulting at any step in the calculation would be:
  3.5 * 10^-5 kelvins,
  1.0 * 10^-4 millibars,
  1.0 * 10^-5 percent in humidity.
As cumulative errors tend to cancel each other out, it is hard to imagine the cumulated error in the computation of any of these values becoming significant to the result. Example: a weather prediction accurate to +/- 1 Kelvin is considered excellent.

Encoding Decimal Digits

There are ten decimal digits. As 2^3 < 10 <= 2^4, we require four binary bits in order to represent each of the digits. Here are the standard encodings. Note the resemblance to hexadecimal, except that the non-decimal hexadecimal digits are not shown.
  0  0000      5  0101
  1  0001      6  0110
  2  0010      7  0111
  3  0011      8  1000
  4  0100      9  1001
Note that the binary codes 1010, 1011, 1100, 1101, 1110, 1111 for the hexadecimal digits A, B, C, D, E, F are not used to represent decimal digits. Packed decimal representation will have other uses for these binary codes.

Zoned Decimal Data

Remember that all textual data in computing are input and output as character codes. The IBM S/370 uses 8-bit EBCDIC to represent characters. On input, these character codes are immediately converted to Zoned Decimal format, which uses 8-bit (one byte) codes to represent each digit. With a slight exception, it is identical to EBCDIC. Here are the EBCDIC representations of each digit, shown as two hex digits.
  Digit  Hex  Binary         Digit  Hex  Binary
  0      F0   11110000       5      F5   11110101
  1      F1   11110001       6      F6   11110110
  2      F2   11110010       7      F7   11110111
  3      F3   11110011       8      F8   11111000
  4      F4   11110100       9      F9
11111001

Packed Decimal Data

Packed decimal representation makes use of the fact that only four binary bits are required to represent any decimal digit. Numbers can be packed two digits to a byte. How do we represent signed numbers in this representation? The answer is that we must include a sign half-byte. The IBM format for packed decimal allows for an arbitrary number of digits in each number stored; the range is from 1 to 31, inclusive. After adding the sign half-byte, the number of hexadecimal digits used to represent any number ranges from 2 through 32. Numbers with an even digit count are converted to proper format by the addition of a leading zero; for example, 1234 becomes 01234. The system must allocate an integral number of bytes to store each datum, so each number stored in Packed Decimal format must have an odd number of digits.

Packed Decimal: The Sign Half-Byte

In the S/370 Packed Decimal format, the sign is stored to the right of the string of digits, as the least significant hexadecimal digit. The standard calls for use of all six non-decimal hexadecimal digits as sign digits, as follows.
  Binary  Hex  Sign  Comments
  1010    A     +
  1011    B     -
  1100    C     +    The standard plus sign
  1101    D     -    The standard minus sign
  1110    E     +
  1111    F     +    A common plus sign, resulting from a shortcut in translation from Zoned format

Zoned Decimal Data

The zoned decimal format is a modification of the EBCDIC format; it seems to be a modification to facilitate processing decimal strings of variable length. The length of zoned data may be from 1 to 16 digits, stored in 1 to 16 bytes. We have the address of the first byte of the decimal data, but need some tag to denote the last (rightmost) byte. The assembler places a sign zone in the rightmost byte of the zoned data. The common standard is X'C' for non-negative numbers and X'D' for negative numbers. The format is used for constants possibly containing a decimal point, but it does not store the decimal point. As an example, we consider the string "-123.45". Note that the
format requires one byte per digit stored.

Creating the Zoned Representation

Here is how the assembler generates the zoned decimal format. Consider the string "-123.45". The EBCDIC character representation is as follows.
  Character   -    1    2    3    .    4    5
  Code        6D   F1   F2   F3   4B   F4   F5
The decimal point code (4B) is not stored; a bit later we shall see the reason for this. The sign character is implicitly stored in the rightmost digit. The zoned data representation is as follows.
  -123.45:  F1  F2  F3  F4  D5
The string F1 F2 F3 F4 C5 would indicate a positive number with the digits 12345.

Packed Decimal Data

The packed decimal format is the one preferred by business for financial use. Zoned decimal format is not used for arithmetic, but just for conversions. As is suggested by the name, the packed format is more compact: zoned format uses one digit per byte; packed format uses (mostly) two digits per byte. In the packed format, the rightmost byte stores the sign in its rightmost half, so the rightmost byte of packed format data contains only one digit. All other bytes in the packed format contain two digits each, with each value in 0 - 9. This implies that each packed constant always has an odd number of digits; a leading 0 may be inserted as needed. The standard sign fields are X'D' for negative and X'C' for non-negative. The length may be from 1 to 16 bytes, or 1 to 31 decimal digits.
  Examples:  +7 is stored as 7C.  -13 is stored as 01 3D.

Example: Addition of Two Packed Decimal Values

Consider two integer values stored in packed decimal format. Note that 32-bit two's complement representation would limit the integer to a bit more than nine digits: -2,147,483,648 through +2,147,483,647. Integers stored as packed decimals can have 31 digits and vary between -9,999,999,999,999,999,999,999,999,999,999 and +9,999,999,999,999,999,999,999,999,999,999; that is, magnitudes up to about 10^31.

Consider the addition of the numbers -97 and 12,541. The value -97 would be expanded to -097 and stored as 097D. The value 12,541 would be stored as 12541C. The CPU does what we would do. It first pads the smaller number with 0's: 00097D and 12541C. It then
performs the operation denoted as 12541 - 97 and stores the result, 12444, as 12 444C in packed decimal format.

Sequential Circuits

Sequential circuits are those with memory, also called feedback. In this they differ from combinational circuits, which have no memory. The stable output of a combinational circuit does not depend on the order in which its inputs are changed. The stable output of a sequential circuit usually does depend on the order in which the inputs are changed. We usually focus on clocked sequential circuits, in which the circuit changes state at fixed times in a clock cycle. Clocked circuits are easier to design and understand.

All sequential circuits depend on a phenomenon called gate delay. This reflects the fact that the output of any logic gate implementing a Boolean function does not change immediately when the input changes, but only some time later. The gate delay for modern circuits is typically a few nanoseconds. (Figure: a NOT gate with input X and output Y; when X falls from 1 to 0, Y rises from 0 to 1 one gate delay later.)

Another Circuit Dependent on Gate Delay

The following is a statement of the Inverse Law of Boolean algebra: X·X̄ = 0. But consider the following circuit and its timing diagram. Note that for a short time, one gate delay, we have Z = 1.

Yet Another Circuit Dependent on Gate Delay

Consider the following circuit. When X = 0, we have Y = 0 and Z = 1; it is stable. The behavior becomes interesting when X = 1. Just after X becomes 1, we still have Y = 0 and Z = 1, due to gate delays.

Views of the System Clock

There are a number of ways to view the system clock; in general, the view depends on the use being made of it. Consider the figure below, which illustrates some of the terms commonly used for a clock: the rising edge, the falling edge, and the cycle time (period) T. The clock is a typical periodic function: there is a period T for which f(t + T) = f(t). (Figure: three views of the system clock, with the rising edge, falling edge, and cycle time T marked.) The top view is the real physical view; it is seldom used. The middle view reflects the fact that voltage levels do not change instantaneously.

Flip-Flops: First Definition

We consider a flip-flop as a device that stores a single binary value.
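This "bit holder" behavior can be sketched in simulation. The following Python model is my own illustrative sketch, not from the notes: the stored value Q(T) changes only when the clock ticks, no matter how the input wiggles in between.

```python
# Illustrative model of a clocked flip-flop as a bit holder (a sketch,
# not from the notes): the stored value changes only on a clock tick.
class FlipFlop:
    def __init__(self):
        self.q = 0          # present state Q(T)
        self.d = 0          # value currently presented at the input

    def set_input(self, value):
        # Changing the input does NOT change the stored bit...
        self.d = value & 1

    def tick(self):
        # ...only the clock tick moves the device from Q(T) to Q(T+1).
        self.q = self.d

ff = FlipFlop()
ff.set_input(1)
print(ff.q)   # still 0: the clock has not ticked yet
ff.tick()
print(ff.q)   # now 1: Q(T+1) took the input value
```

The same skeleton extends to the SR, JK, and T devices discussed below by changing what `tick` computes from the inputs and Q(T).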
We consider only clocked flip-flops: devices that change value only at times dictated by a system clock. Technically, an unclocked device is called a latch; only clocked devices are called flip-flops. Denote the present time by the symbol T, and the clock period by τ. Rather than directly discussing the clock period, we merely say that the current time is T; after the next clock tick, the time is T + 1. The present state of the flip-flop is often called Q(T); the next state of the flip-flop is often called Q(T + 1). The sequence is: the present state is Q(T); the clock ticks; the state is now Q(T + 1).

Details of a Latch

(Figure: at left, an SR latch with inputs S and R and outputs Q and Q̄; at right, the same latch with a Clock input added.)

The device on the left is an SR latch. It can be shown that when S = 0 and R = 0 the device does not change state: Q(T + 1) = Q(T). The device on the right is a level-triggered SR flip-flop, better called a clocked SR latch. It can be shown that two sets of conditions cause Q(T + 1) = Q(T):
1. Clock = 0, or
2. Clock = 1 and S = 0 and R = 0.
The clock input to a flip-flop is so called because it is usually connected to a signal generated, in part, by the system clock. Your instructor likes to call the signal "wake up" or some such. When the signal is asserted, the flip-flop responds to its input.

Details of a Flip-Flop

A flip-flop is a clocked latch that is edge-triggered: it responds to the input either on the rising edge of the clock input or on the trailing edge of the same. (Figure: an SR flip-flop with inputs S, R, and Clock, and output Q(T).) The edge-triggered nature of this device is due to the pulse generator shown below, which converts the clock pulse to a very short pulse that activates the latch.

Comment on Notation Used

All flip-flops have a number of inputs that your instructor does not indicate unless they are required for discussion of the circuit.
  Power: every flip-flop must be powered.
  Ground: every flip-flop must be grounded.
  Clock: all flip-flops are clocked devices.
  Asynchronous Clear: this allows the flip-flop to be cleared independently of the clock; in other words, make Q(T) = 0.
  Asynchronous Set: this allows the flip-flop to be set independently of the clock; in other words,
make Q(T) = 1.
Absent the explicit clock input, your instructor's circuits might resemble unclocked latches. Your instructor does not use such latches, but designs only with flip-flops.

The SR Flip-Flop

We now abandon the detailed view of flip-flops and adopt a functional view: how does the next state depend on the present state and the input? A flip-flop is a "bit holder". Here is the diagram for the SR flip-flop, followed by its state table. Note that setting both S = 1 and R = 1 causes the device to enter a logically inconsistent state, followed by an indeterministic state; for this reason, we label the output for S = 1 and R = 1 as an error.
  S R | Q(T + 1)
  0 0 | Q(T)
  0 1 | 0
  1 0 | 1
  1 1 | error

The JK Flip-Flop

A JK flip-flop generalizes the SR to allow for both inputs to be 1. Here is the characteristic table for a JK flip-flop. Note that the flip-flop can generate all four possible functions of a single variable: the two constants 0 and 1, and the variables Q and Q̄.
  J K | Q(T + 1)
  0 0 | Q(T)
  0 1 | 0
  1 0 | 1
  1 1 | Q̄(T)

The D Flip-Flop

The D flip-flop specializes either the SR or the JK to store a single bit. It is very useful for interfacing the CPU to external devices, where the CPU sends a brief pulse to set the value in the device, and the value remains set until the next CPU signal. The characteristic table for the D flip-flop is so simple that it is expressed better as the equation Q(T + 1) = D.

The T Flip-Flop

The toggle flip-flop allows one to change the value stored. It is often used in circuits in which the value of the bit alternates between 0 and 1, as in a modulo-4 counter in which the low-order bit goes 0, 1, 0, 1, 0, 1, etc. The characteristic table for the T flip-flop is so simple that it is expressed better as the equation Q(T + 1) = Q(T) ⊕ T. Here the symbol T on the right denotes the input; the T in Q(T) and Q(T + 1) denotes time.

Memory Technologies: SRAM and DRAM

One major classification of computer memory is into two technologies:
  SRAM  Static Random Access Memory
  DRAM  Dynamic Random Access Memory (and its variants)
SRAM is called static because it will keep its contents as long as it is powered. DRAM is called dynamic because it tends to lose its contents
even when powered; special refresh circuitry must be provided. Compared to DRAM, SRAM is faster, more expensive, and physically larger (fewer memory bits per square millimeter). SDRAM is Synchronous DRAM: DRAM designed to work with a synchronous bus, one with a clock signal. Consider a 2 GHz CPU with 100 MHz SDRAM. The CPU clock speed is 2 GHz = 2000 MHz; the memory bus speed is 100 MHz.

Chapter 7: Finite State Machines and Sequential Circuits
(CPSC 5155 slide deck, 11 slides, February.)

Definition: Sequential Circuits

  Combinational Circuits                  Sequential Circuits
  No memory                               Memory
  No flip-flops; only combinational       Flip-flops may be used;
    gates                                   combinational gates may be used
  No feedback                             Feedback is allowed
  Output for a given set of inputs is     The order of input change is quite
    independent of the order in which       important and may produce
    these inputs were changed, after        significant differences in the
    the output stabilizes                   output

Sequential logic includes combinational logic (AND, OR, NOT gates and any MSI circuits) and memory. The memory usually introduces a time delay; the memory state does not change instantly.

Circuit Analysis and Design

Circuit analysis: attempt to identify a circuit for which we have a diagram.
1. Identify the inputs and outputs of the circuit.
2. Express each output as a Boolean function of the inputs and the present state Q(T).
3. Create a Finite State Machine model of the circuit.
4. Identify the circuit, if possible.

Circuit design: design a circuit to meet a given specification.
1. Develop a Finite State Machine model of the circuit.
2. Translate to a state table.
3. Pick the flip-flops and use excitation tables.
4. Define the inputs for each flip-flop.
5. Design the circuit.

Sample Finite State Machine: A 11011
Sequence Detector, with five states, each having two transitions. The FSM begins in state A. (Figure: the state diagram and state tables for the 11011 sequence detector.)

Sample Circuit for Analysis

Step 1: Identify the inputs and outputs. There is one input, labeled X, and one output, labeled Z. My labeling convention: inputs are X; internal states are Y or Q; outputs are Z.

Step 2a: Determine the equations for the flip-flop inputs. Reading the figure, we see that D = X + Y. This can be written as D = X + Q(T). Note: Q(T) is the flip-flop state now; the flip-flop will not react to the D input until time T + 1.

Step 2b: Determine the equation for the output. Reading the diagram, we see that Z = X ⊕ Y.

Step 3a: Create the next state table. We have a D flip-flop here: when we compute D from X and Q(T), we automatically get Q(T + 1). The output remains Z = X ⊕ Y.

Step 4a: Combine the two tables to form one table.
Step 4b: Reformat the table into the standard form: present state, then next state/output for each input value.

Step 6: If possible, identify the circuit. Note that for Q(T) = 0, Z = X. This is a serial two's-complement circuit, LSB first.

Standard Boolean Forms

In this section we develop the idea of standard forms of Boolean expressions. In part, these forms are based on some standard Boolean simplification rules. Standard forms are either canonical forms or normal forms. The standard expressions are in either SOP (Sum of Products) form or POS (Product of Sums) form. This lecture will focus on the following: Canonical Sum of Products, Normal Sum
of Products, Canonical Product of Sums, and Normal Product of Sums. We shall also discuss a few more variants that have no standard names. IMPORTANT: These forms use only the 3 basic Boolean functions AND, OR, and NOT. Specifically, XOR is not used.

Variables and Literals

We start with the idea of a Boolean variable: a simple variable that can take one of only two values, 0 (False) or 1 (True). Following standard digital design practice, we use the values 0 and 1. Following standard teaching practice, we denote all Boolean variables by "A", "B", "C", "D", ..., "W", "X", "Y", "Z".

A literal is either a Boolean variable or its complement.
  Literals based on the variable X: X and X̄.
  Literals based on the variable Y: Y and Ȳ.
NOTE: X and X̄ represent the same variable, but they are not the same literal. X and Y represent different variables.

Product and Sum Terms

A product term is the logical AND of one or more literals, with no variable represented more than once. A sum term is the logical OR of one or more literals, with no variable represented more than once.

The following are all valid product terms over the two variables X and Y: X·Y, X̄·Y, X·Ȳ, and X̄·Ȳ. Forms such as X·X, X·X̄, and X·X·Y are not considered, as X·X = X and X·X̄ = 0; so X·X·Y = X·Y, and X·X̄·Y = 0·Y = 0.

The following are all valid sum terms over the two variables X and Y: X + Y, X̄ + Y, X + Ȳ, and X̄ + Ȳ.

Single literals: according to the strict definition, a single literal is either a sum term or a product term, depending on the context. This is necessary to avoid having to give a number of special cases in the following definitions.

Sum of Products and Product of Sums

A SOP (Sum of Products) expression is the logical OR of product terms. A POS (Product of Sums) expression is the logical AND of sum terms.

Sample SOP expressions:
  F1(X, Y) = X·Y + X̄
  G1(X, Y) = X̄·Y + X
  H1(X, Y, Z) = X̄ + Z
Note: if we did not allow single literals to be product terms, we would have trouble classifying H1(X, Y, Z), which is clearly SOP.

Sample POS expressions:
  F2(X, Y) = (X + Y)·X̄
  G2(X, Y) = (X̄ + Y)·X
  H2(X, Y, Z) = X̄·(Y + Z)
Note: POS expressions almost always have parentheses to indicate
the correct evaluation.

More on Ambiguous Forms

What is the form of the expression F(X, Y) = X + Y?
1. SOP: it is the logical OR of two product terms, each product term being a single literal.
2. POS: it is a single sum term, (X + Y).
Both statements are true. In general, questions such as this do not concern us; if you are asked a question like this on a test, either answer will be accepted. This ambiguity comes from the definitional necessity of mentioning the logical AND of "one or more" terms and the logical OR of "one or more" terms. With two equally good answers to an ambiguous form, pick the one you like.

Inclusion

A product term T1 is included in a product term T2 if every literal in T1 is also in T2. A sum term T1 is included in a sum term T2 if every literal in T1 is also in T2. Consider the following examples.
  F(A, B, C) = A·B + A·C + A·B·C. Each of A·B and A·C is included in A·B·C.
  G(A, B, C) = (A + B)·(A + C)·(A + B + C). Each of (A + B) and (A + C) is included in (A + B + C).
There is no inclusion in the next expression:
  F(A, B, C) = A·B + A·C + Ā·B·C.
The literal A does not appear in the third term; the inclusion rule is based on literals, not just variables.

More on Inclusion

Consider F1(A, B, C) = A·B + A·C + A·B·C and F2(A, B, C) = A·B + A·C. We claim that the two are equal for every value of A, B, and C.
Let A = 0. Clearly F1(A, B, C) = 0 = F2(A, B, C).
Let A = 1. Then F1(A, B, C) = B + C + B·C and F2(A, B, C) = B + C. Notice that we still have inclusion in F1, as each of B and C is included in B·C. We prove that these versions of F1(A, B, C) and F2(A, B, C) are equal using a truth table.
  B C | B·C | B + C | B + C + B·C
  0 0 |  0  |   0   |     0
  0 1 |  0  |   1   |     1
  1 0 |  0  |   1   |     1
  1 1 |  1  |   1   |     1

Last Word on Inclusion

If a SOP or POS expression has included terms, it can be simplified.
  F1(A, B, C) = A·B + A·C + A·B·C is identically equal to F2(A, B, C) = A·B + A·C.
  G1(A, B, C) = (A + B)·(A + C)·(A + B + C) is identically equal to G2(A, B, C) = (A + B)·(A + C).
The conclusion is that Boolean expressions with included terms are needlessly complicated; we can simplify them by the application of trivial rules. Note that duplication is a form of inclusion: the expression F3(A, B) = A·B + A·B has 2 terms, each included in the other.

Non-Standard Expressions
Not every useful Boolean expression is in a standard form. F(X, Y) = X ⊕ Y is not a standard form, due to the exclusive OR. G(X, Y) = X̄·Y + (X + Y)·(X̄ + Y) is not in a standard form: this has both a product term and sum terms. The fact that G(X, Y) can easily be converted to a standard form does not make it already in a standard form. Let us convert it to SOP. (I usually have difficulty in conversion to POS unless I am using a method I have yet to describe.) The term X̄·Y is already a product term, so we convert (X + Y)·(X̄ + Y) to SOP:
  (X + Y)·(X̄ + Y) = X·X̄ + X·Y + Y·X̄ + Y·Y
                   = 0 + X·Y + X̄·Y + Y
                   = Y·(X + X̄ + 1)
                   = Y
So G(X, Y) = X̄·Y + Y, and therefore G(X, Y) = Y.

Normal and Canonical Forms

A Normal SOP expression is a Sum of Products expression with no included product terms. A Normal POS expression is a Product of Sums expression with no included sum terms. A Canonical SOP expression over a set of Boolean variables is a Normal SOP expression in which each product term contains a literal for each of the Boolean variables. A Canonical POS expression over a set of Boolean variables is a Normal POS expression in which each sum term contains a literal for each of the Boolean variables.
Note: a canonical expression on N Boolean variables is made up of terms, each of which has exactly N literals.
Note: one can do digital design based on either normal or canonical forms; the choice usually depends on the technology used.

Equivalence of Canonical Forms and Truth Tables

We can directly translate between either of the canonical forms and a truth table, using a standard set of rules.
To produce the Sum of Products representation from a truth table:
a) Generate a product term for each row where the value of the function is 1.
b) The variable is complemented if its value in the row is 0; otherwise it is not.
To produce the Product of Sums representation from a truth table:
a) Generate a sum term for each row where the value of the function is 0.
b) The variable is complemented if its value in the row is 1; otherwise it is not.

  Row  X Y | X ⊕ Y
  0    0 0 |   0
  1    0 1 |   1
  2    1 0 |   1
  3    1 1 |   0

SOP terms, for rows 1 and 2:
Row 1 YOY Row 2 X0 F YOY X0 POS Terms for roin and 3 Row 0 X Y Row 3 Y Y FXY0X Y Examples Conversions between the Three Forms We have three equivalent ways to de ne a Boolean expression 1 The Truth Table 2 The Z and H lists 3 The Canonical Form In each of these depictions of the expression we need to know the number of Boolean variables and labels to be assigned these variables It is easier to consider the SOP and POS cases separately because the rules for conversion from the truth tables are so different for the two cases The gure below depicts the translations we shall consider for the SOP case Truth Canonical Table Form AL AL v v H List ZList 1 Normal 1 Form Example Truth Table Here is a truth table de nition of a Boolean function of three Boolean variables Row A B C F2 0 0 0 0 0 1 0 0 1 0 2 0 1 0 0 3 0 1 1 1 4 1 0 0 0 5 1 0 1 1 6 1 1 0 1 7 1 1 1 1 We shall discuss this function in both of its Sum of Product and Product of Sum representations We begin With SOP the form that most students nd easier SOP Example Truth Table to Z List Here is the truth table ROW A B C F2 0 0 0 0 0 1 0 0 1 0 2 0 1 0 0 3 0 1 1 1 4 1 0 0 0 5 1 0 1 1 6 1 1 0 1 7 1 1 1 1 Note that the function has value 1 in rows 3 5 6 and 7 The function is F2A B C 23 5 6 7 SOP Example Z List to Truth Table The function is F2A B C 23 5 6 7 Create a truth table and place 10 ic 1 s in rows 3 5 6 and 7 Row A B C F2 0 0 0 0 l 0 0 l 2 0 l 0 3 0 l l l 4 l 0 0 5 l 0 l l 6 l l 0 l 7 l l l 1 Place 0 s in the other rows Row A B C F2 0 0 0 0 0 l 0 0 l 0 2 0 l 0 0 3 0 l l l 4 l 0 0 0 5 l 0 l l 6 l l 0 l 7 l l l l SOP Example Between Z List and H List Any function of N Boolean variables is represented by a truth table having 2N rows numbered from 0 through 2N l The function is F2A B C 23 5 6 7 There will be 23 8 rows numbered 0 through 7 in the truth table for F2 To translate om one list form to the other list form just pick the numbers in the range 0 2N 1 that are not in the source list F2A B C 23 5 6 7 This is 
missing 0 l 2 and 4 So F2A B C H0 1 2 4 The translation from the H list to the Z list works in the same way SOP Example Truth Table to Canonical Form To produce the Sum of Products representation from a truth table a Generate a product term for each row Where the value of the function is 1 b The variable is complemented if its value in the row is 0 otherwise it is not Here again is the truth table Row A B C F2 0 0 0 0 0 1 0 0 1 0 2 0 1 0 0 3 0 1 1 1 The term is ZOBOC 4 1 0 0 0 5 1 0 1 1 The term is A E C 6 1 1 0 1 The term is AoBoE 7 1 1 1 1 The term is AOBOC F2A B C ZBC AEC AB AoBoC SOP Example Between Canonical Form and Z List Given F2A B C ZOBOC AOEOC AOBOC AOBOC Write a 0 beneath each complemented variable and a 1 below each that is not Convert the N bit numbers assuming unsigned binary These are the rows of the truth table that have a 1 F2 A B c ZoBoc AoEoc AOBOC AoBoC o 1 1 1 o 1 1 1 o 1 1 1 3 5 6 7 F2A B C 23 5 6 7 To convert the other way just reverse the steps SOP Example Between Canonical and Normal Forms Given F2A B C ZOEC AOEOC AOBOE AOBOC To convert this to a simpler normal form we must rst make it more complex F2A B C ZoBoc AoBoC AOBOC AOBOC AOBOE AOBOC But ZOEC AOBOC Z A0BOC loBoC BOC AEC AoBoC AE BoC A1C AoC 1436 AoBoC AoBoE C AoBol AoB So F2A B C BoC AoC AoB AoB AoC BoC The more standard notation POS Example Truth Table to H List Here is the truth table ROW A B C F2 0 0 0 0 0 1 0 0 1 0 2 0 1 0 0 3 0 1 1 1 4 1 0 0 0 5 1 0 1 1 6 1 1 0 1 7 1 1 1 1 Note that the function has value 0 in rows 0 1 2 and 4 The function is F2A B C HO 1 2 4 POS Example H List to Truth Table The function is F2A B C HO l 2 4 Create a truth table and place 10 ic 0 s in rows 0 1 2 and 4 Row A B C F2 0 0 0 0 0 l 0 0 l 0 2 0 l 0 0 3 0 l l 4 l 0 0 0 5 l 0 l 6 l l 0 7 l l 1 Place 1 s in the other rows Row A B C F2 0 0 0 0 0 l 0 0 l 0 2 0 l 0 0 3 0 l l l 4 l 0 0 0 5 l 0 l l 6 l l 0 l 7 l l l l POS Example Truth Table to Canonical Form To produce the Product of Sums representation om 
a truth table a Generate a product term for each row Where the value of the function is O b The variable is complemented if its value in the row is 1 otherwise it is not Here again is the truth table Row A B C F2 0 0 0 0 O The term is A B C 1 0 0 1 0 The term is A B C 2 O l O O The term is A E C 3 0 1 1 1 4 1 o o o The term is Z B C 5 1 0 1 1 6 1 1 0 1 7 1 1 1 1 F2A B CAB CoAB CoA B CoZ B C POS Example Between Canonical Form and H List F2A B C A B CoA B EMA B CoZ B C Write a 1 beneath each complemented variable and a 0 below each that is not Convert the N bit numbers assuming unsigned binary F2A B C A B CA B OA E CZ B C 0 0 0 0 0 l 0 l 0 l 0 0 0 l 2 4 F2A B C H0 1 2 4 To convert the other way just reverse the steps CPSC 5155 Chapter 7 Slide 1 Modulo 4 Up Down Counter This is a counter with input If X 0 the device counts up If X 1 the device counts down u 02 02 Ho 1 939 Note two transitions between the state pairs one is up and one is down Slide 1 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 2 Step 1b Derive the State Table Present State Next State X O X 1 0 1 3 1 2 0 2 3 1 3 0 2 This is just a restatement of the state diagram Note Two columns for the Next State Slide 2 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 3 Step 2 Count the States and Determine the Flip Flop Count Count the States There are four states for any modulo 4 counter N 4 The states are simple 0 1 2 and 3 Calculate the Number of Flip Flops Required Let P be the number of ip ops Solve 2P1 lt N s 2P So 2P1 lt 4 s 2P and P 2 We need two ip ops Slide 3 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 4 Step 3 Assign a unique Pbit binary number state vector to each state Here P 2 so we are assigning two bit binary numbers Vector is denoted by the binary number YlYO State 2bit Vector Y1 Y0 0 0 0 1 O 1 2 1 0 3 1 1 Each state has a unique 2 bit number assigned Any other assignment would be 
absurd Slide 4 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 5 Step 4 Derive the state transition table and the output table There is no computed output hence no output table The state transition table uses the 2 bit state vectors Present State Next State X O X 1 0 00 01 11 1 01 10 00 2 10 11 01 3 11 00 10 Slide 5 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 6 Step 5Separate the state transition table into P tables one for each ip op Flip Flop 1 FlipFlop 1 PS Next State Y1 Y1Y0 X Z O X Z 1 0 0 0 1 0 1 1 0 1 0 1 0 1 1 0 1 Flip Flop 0 FlipFlop 0 Y0 PS Next State Y1Y0 X Z O X Z 1 0 0 1 1 0 1 0 0 1 0 1 1 1 1 0 0 Slide 6 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 7 Step 6 Decide on the types of ip ops to use When in doubt use all JK s Here is the excitation table for 21 JK ip op QT QT1 J K 0 0 0 d 0 1 1 d 1 o d 1 1 1 d 0 Slide 7 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 8 Step 7 Derive the input table for each ip op Flip Flop 1 X 0 X l Y1Yo Y1 J1 K1 Y1 J1 K1 0 0 0 0 d 1 l d 0 l 1 l d 0 0 d 1 0 1 d 0 0 d l 1 l 0 d l 1 d 0 Flip Flop 0 X 0 X l Y1Yo Y0 Jo K0 Y1 Jo K0 0 0 1 l d 1 l d 0 1 0 d l 0 d l l 0 1 l d 1 l d l 1 0 d l 0 d l Question How do we produce equations for the J s and K s Slide 8 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 9 Step 8 Derive the input equations for each ip op The equations are based on the present state and the input The input X produces a complication The simplest match procedure will lead to two equations for each ip op input one for X O and one for X 1 Use the combine rule The rule for combining expressions derived separately forXOandX l is X0expression for X 0 XoeXpression for X l Rationale Let FX AOX BOX When X O FX A and when X 1 FX B Slide 9 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 10 Input Equations 
for Flip Flop 1 XO Xl YlYO Y1 J1 K1 Y1 J1 K1 0 O 0 0 d 1 l d 01 1 l d 0 0 d 1 O 1 d O 0 d l 1 l 0 d l 1 d 0 leYO J1Y0j KleO Ilt1ZYOj Apply the combine rule J1 X OYO l 0Y0j X 639 Y0 K1 X OYO Y3 Z X G39 Y0 Slide 10 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 11 Input Equations for Flip Flop 0 X 0 Xl Y1Y0 Y0 J0 K0 Y1 J0 K0 0 0 1 l d 1 l d O 1 0 d l 0 d l l 0 1 l d 1 l d l 1 0 d l 0 d 1 JO 1 JO 1 K0 1 K0 1 Apply the Combine Rule J0 X ol Xol 1 K0 X ol Xol 1 Neither J 0 nor K0 depend on X But Y0 does not depend on X Slide 11 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 12 Step 9 Summarize the equations by writing them in one place Here they are J1X YO K1X Y0 J01 K01 Slide 12 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 CPSC 5155 Chapter 7 Slide 13 Step 10 Draw the Circuit As designed it is ELK ELK 1 0 Cluck Slide 13 of 14 slides Design ofa Mod4 Up Down Counter July 13 2010 Combinational vs Sequential Circuits Basically sequential circuits have memory and combinational circuits do not Here is a basic depiction of a sequential circuit In ut p P P Cumbinatianal Logic Dunn AND UR NUT gates and any MEI circuits Memory Usually intruduces a time delay The memury state dues nut change instantly All sequential circuits contain combinational logic in addition to the memory elements We now consider the analysis and design of sequential circuits Finite State Machines Notation In this course we represent sequential circuits as nite state machines A Finite State Machine FSM is a circuit that can exist in a nite number of states usually a rather small number Finite State Machines with more than 32 states are rare The FSM has a memory that stores its state If the FSM has N states then its memory can be implemented with P ip ops where 21quot1 lt N s 2P Typical values 3 states 2 ip ops 4 states 2 ip ops 5 states 3 ip ops 8 states 3 ip ops Tools to describe nite states machines include 1 The state 
diagram 2 The state table 3 The transition table State Diagram for 3 Sequence Detector a if C a f 913 ur by 0 DID DID Fax I REV NOTE We have ve states labeled A B C D and E We have labeled edges connecting the states Each is labeled Input Output This is a directed graph with labeled edges and loops More Notes on the State Diagram The main function of the state diagram for the FSM is to indicate what the next state will be given the present state and input Here the input is labeled X Were the input two bits at a time the input would be labeled as X1 X0 with X1 the more signi cant bit The labeling of the arcs between the states indicates that there is output associated with each transition Not all Finite State Machines have output associated with the transition This one does This and all typical FSM represents a synchronous machine Transitions between states and production of output if any takes place at a xed phase of the clock depending on the ip ops used to implement the circuit Were we pressed to be more speci c we would associate the transitions with the rising edge of the clock This is usually an unnecessary detail State Diagram for a Modulo 4 Counter Here is the state diagram for amodulo4 counter There is no input but the clock It just counts clock pulses Note the direction ofthe axrows this is an uprcounter State Tables The state table is a tabular form of the state diagram It is easier to work with Here is the state table for the sequence detector Present State Next State Output XO X1 A AO BO B AO CO C DO CO D AO EO E AO Cl Here is the state table for the modulo 4 counter Present State Next State 0 owmi a 1 2 3 Transition Tables Transition tables are just state tables in which the labels have been replaced by binary numbers Often the labels are retained to facilitate translation to binary Here is the transition table for the sequence detector Present State Next State Output X0 X1 A000 0000 0010 B001 0000 0100 C010 0110 0100 D011 0000 1000 E100 0000 0101 Here 
is the transition table for the modulo-4 counter. There is no output table.

Present State | Next State
   0 (00)     |    01
   1 (01)     |    10
   2 (10)     |    11
   3 (11)     |    00

Sample Circuit for Analysis The analysis of such a circuit follows a fixed set of steps. 1. Determine the inputs and outputs of the circuit; assign variables to represent these. 2. Characterize the inputs and outputs of the flip-flops; show these as Boolean expressions. 3. Construct the Next State and Output Tables. 4. Construct the State Diagram. 5. If possible, identify the circuit. There are no good rules for this step.

Step 1: Determine the inputs and outputs of the circuit. The example is a sequential circuit built from a single D flip-flop, with one boolean input. The input is labeled X. The output is labeled Z. The internal line that is fed back into the flip-flop is labeled Y. NOTE: the circuit produces the output Z based on the input X and the present state Y.

Step 2: Show the inputs and outputs as Boolean expressions. Input: X. Output: Z = X ⊕ Y. Input to the flip-flop: D = X + Y. Output of the flip-flop: Y.

Step 3: Construct the Next State and Output Tables. Here is the next state table.

X Y(t) | D = X + Y
0  0   |    0
0  1   |    1
1  0   |    1
1  1   |    1

We know the present state of the flip-flop; call it Y. Given Y and the input X, we can compute D. This determines the next state. Here is the output table. It depends on the input and the present state.

X Y(t) | Z
0  0   | 0
0  1   | 1
1  0   | 1
1  1   | 0

Step 3A: Construct the Next State / Output Table. Just combine the two tables into one table.

X Y(t) | D = X + Y | Y(t+1) / Z
0  0   |    0      |   0 / 0
0  1   |    1      |   1 / 1
1  0   |    1      |   1 / 1
1  1   |    1      |   1 / 0

We then put the table into a standard form that will lead to the state diagram.

Present State | Next State / Output
              |  X = 0  |  X = 1
      0       |   0/0   |   1/1
      1       |   1/1   |   1/0

We use this to build a state diagram. The two states are Y = 0 and Y = 1. The outputs are associated with the transitions.

Step 4: Construct the State Diagram. Here again is the state table with output, and the state diagram drawn from it. This is the required answer.

Step 5: Identify the Circuit if Possible. This is often hard to do. The key here is that the circuit stays in state 0 until the first 1 is input. When the first 1 is input, it goes to state 1 and stays there for all input. We now characterize the
output as a function of the input for each of the two states Input QT Output 0 0 0 For Qt 0 the output is X For Qt 1 the output is Ql Al I 1 0 0 1 1 1 It can be shown that this is a serial generator for a two s complement The binary integer is read from Least Signi cant Bit to Most Signi cant Bit Up to and including the rst least signi cant 1 the input is copied After that it is complemented 00011100 becomes 1110 0100 0010 1101 becomes 1101 0011 Try this it works Design of a Sequential Circuit We begin with the rules for a simple procedure to do the design 1 2 3 4 5 6 7 8 9 10 Derive the state diagram and state table for the circuit Count the number of states in the state diagram call it N and calculate the number of ip ops needed call it P by solving the equation 2P 1 lt N S 2P This is best solved by guessing the value of P Assign a unique Pbit binary number state vector to each state Often the rst state O the next state 1 etc Derive the state transition table and the output table Separate the state transition table into P tables one for each ip op WARNING Things can get messy here neatness counts Decide on the types of ip ops to use When in doubt use all JK s Derive the input table for each ip op using the excitation tables for the type Derive the input equations for each ip op based as functions of the input and current state of all ip ops Summarize the equations by writing them in one place Draw the circuit diagram Most homework assignments will not go this far as the circuit diagrams are hard to draw neatly Design a Modulo 4 Counter Step 1 Derive the state diagram and state table for the circuit Here is the state diagram Note that it is quite simple and involves no input Here is the state table for the moduloi4 counter Step 2 Count the Number of States Obviously there are only four states numbered 0 through 3 Determine the number of ip ops needed Solve 2P71ltN S 2P IfN 4 we have P 2 and 21lt 4 g 22 We need two ip ops for this design Number them 1 and 0 Their 
states will be Q1 and Q0 or Y1 and Y0 depending on the context Remember 21 2 22 4 23 8 24 16 25 32 26 64 27 128 etc Step 3 Assign a unique Pbit binary number state vector to each state Here P 2 so we assign a unique 2 bit number to each state For a number of reasons the rst state state 0 must be assigned Y1 O and Y0 O For a counter there is only one assignment that is not complete nonsense State 2bit Vector O O O l O l 2 l O 3 l l The 2 bit vectors are just the unsigned binary equivalent of the decimal state numbers Step 4 Derive the state transition table Present State Next State 0 00 01 l 01 10 2 10 l l 3 l l 00 Strictly speaking we should have dropped the decimal labels in this step However this representation is often useful for giving the binary numbers The state transition table tells us what the required next state will be for each present state Step 5 Separate the state transition table into P tables one for each ip op Here P 2 so we need two tables FlipF103 1 FlipFlop 0 Present State Next State Present State Next State Y1 Y0 Y1 t1 Y1 Y0 Y0 t1 O O O O O 1 O 1 1 O 1 O 1 O 1 1 O 1 1 1 O 1 1 O Each ip op is represented with the complete present state and its own next state Step 6 Decide on the types of ip ops to use When in doubt use all JK s Our design will use JK ip ops For design work it is important that we remember the excitation table Here it is Q t Q t1 J K O O 0 d O 1 1 d 1 0 d 1 1 1 d O Step 7 Derive the input table for each ip op using the excitation tables for the type Here is the table for ip op 1 PS NS Input Y1 Y0 Y1 J1 K1 0 O O 0 d 0 1 1 1 d 1 O 1 d O 1 1 0 d 1 Here is the table for ip op O PS NS Input Y1 Y0 Y0 J0 K0 0 0 1 1 d O 1 0 d 1 1 0 1 1 d 1 1 0 d 1 Step 8 Derive the input equations for each ip op I use a set of intuitive rules based on observation and not on formal methods 1 If a column does not have a O in it match it to the constant value 1 If a column does not have a l in it match it to the constant value 0 2 If the column has both 0 s 
and 1 s in it try to match it to a single variable which must be part of the present state Only the 0 s and 1 s in a column must match the suggested function 3 If every 0 and l in the column is a mismatch match to the complement of a function or a variable in the present state 4 If all the above fails try for simple combinations of the present state NOTE The use of the complement of a state in step 3 is due to the fact that each ip op outputs both its state and the complement of its state Step 8 Derive the input equations for each ip op Here is the input table for Flip Flop 1 PS NS Input Y1 Y0 Y1 J1 K1 0 O O 0 d 0 1 1 1 d 1 O 1 d O 1 1 0 d 1 J1Y0 K1Y0 Here is the input table for Flip Flop O PS NS Input Y1 Y0 Y0 J0 K0 0 0 1 1 d O 1 0 d 1 1 0 1 1 d 1 1 0 d 1 J01 K01 Step 9 Summarize the equations by writing them in one place Here they are J1 Y0 K1 2 Y0 J0 1 K0 1 For homework and tests this is required so that I can easily nd the answers Step 10 Draw the circuit diagram J Y1 J Q I J Q Y0 K 6 39 Cluck i I But note that each ip op has input J K This suggests a simpli cation The System Bus The System Bus is one of the four major components of a computer Addvess Bus Control BUS This logical representation is taken from the textbook 39 uLiici umqu r data addresses instructions and control signals The CPU A High Level Description Just how does the CPU interact With the rest of the system 1 2 Chmboo It sends out and takes in data in units of 8 16 32 or 64 bits It sends out addresses indicating the source or destination of such data An address can indicate either memory or an inputoutput device It sends out signals to control the system bus and arbitrate its use It accepts interrupts from IO devices and acknowledges those interrupts It sends out a clock signal and various other signals It must have power and ground as input Addressing 4 Bus arbitration Data 44 lt Coprocessor gt Typical Micro Processor lt Bus control 4 Status gt 4 4 Miscellaneous its 1 5v Symbol for clock 
signal Interrupts L Symbol for electrical ground Power is 5volts The CPU Interacts Via the System Bus The System Bus allows the CPU to interact with the rest ofthe system Each ofthe r r 1 r u b Ground lines on the bus have two purposes 1 To complete the eleetn39eal elrou39 2 To lnlrnlzecrosset between the signal lines Her 39 mallbuswiththree datalines D2DDntwo address lines AAn asystem clock 4 and avoltage line V n 1 D1 n U 1 Au m A V In our considerations we generally ignore the multiple grounds and the power lines Notations Used for a Bus Here is the my that we would commonly represent the small bus shown above The big double arrow notation indicates abus ofanumber of different signals Our author calls this a fat axrow Lines with similar function are grouped together Their count is denoted with the diagonal slash notation From top to bottom we have 1 Three datalines D2D anan 2 Two address lines A andAn 3 The clock signal for the bus lt1 Power and ground lines usually are not shown in this diagram Computer Systems Have Multiple Busses Early computers had only a single bus but this could not handle the data rates Modern computers have at least four types of busses l A Video bus to the display unit 2 A memory bus to connect the CPU to memory which is often SDRAM 3 An lO bus to connect the CPU to InputOutput devices 4 Busses internal to the CPU which generally has at least three busses Often the proliferation of busses is for backward compatibility with older devices CPU chip Buses Registers Memow bus Bus A controller u I IO bus A y Memory Onchip bus Backward Compatibility in PC Busses Here is a gure that shows how the PC bus grew from a 20 bit address through a 24 bit address to a 32 bit address while retaining backward compatibility 8088 20Bit address Control 80286 20Bit address Control 4Bit address Control 20Bit address 80386 Ill Control 4Bit address It Control 8 Bit address Ill Control Backward Compatibility in PC Busses Part 2 Here is a picture of the PCAT bus 
showing how the original con guration was kept and augmented rather than totally revised Motheirboard connector PC bus Plugin Contact board New connector for PCAT Edge connector Note that the top slots can be used by the older 8088 cards which do not have the extra long edge connectors Notation for Bus Signal Levels The system 39 39 39 not change instantaneously Here is atypical depiction Others may be seen but this is what our authoruses 4 Clock Cycle v T Clock Clock Goes High Goes Low Single control signals are depicted in a similar fashion except ofcourse that they may not Vary in lock step with the bus clock Notation for Multiple Signals A single control signal is either low or high 0 volts or 5 volts agram For each ofaddress an data we have two impo dress or data is valid address or data is not valid l l Not Valid Valid Not Valid Clock mg rz T1 1 A collection such as 32 address lines or 16 data lines cannot be represented with such a simple di d rtant states For example consider the address lines on the bus Imagine a 327bit address At some time after T the CPU asserts an address on the address lines This means that each ofthe 32 address lines is given avalue When the CPU has asserted the address it is valid until the CPU ceases assertion Reading Bus Timing Diagrams Sometimes we need to depict signals on a typical bus Here we are looking at a synchronous bus ofthe type used for connecting memory This figure taken from the textbook shows the timings on a typical bus 3 x x Ready Request Address WiltsHead 5 gt53 so s VTe Data Note the form used for the Address Signals between to and t1 they change value According to the gure the address signals remain valid from t1 through the end of t7 Read Timing on a Synchronous Bus The bus protocol calls for certain timings to be met Read cycle with 1 wait state T1 T2 1 T ADDRESS address to be read DATA MREQ WAIT Time a a T AD the maximum allowed delay for asserting the address after the clock pulse TML the minimum time that 
the address is stable before the MREQ is asserted Read Sequences on an Asynchronous Bus Here the focus is on the protocol by which the two devices interact This is also called the handshake The bus master asserts MSYN and the bus slave responds with SSYN when done ADDRESS Memory address to be read MREQ B MSYN DATA SSYN Attaching an IO Device to a Bus This figure shows a DMA Controller for a disk attached to a bus It is only slightly more complex than a standard controller Data Address n cache 0 Cnnlroller Request ea y WriteRead Clock Bus Reset Error Each lO Controller has a range of addresses to which it will respond Specifically the device has a number of registers each at a unique address When the device recognizes is address it will respond to 10 commands sent on the comman US Bus Arbitration A number ofIO devices are usually connected to abus Each 10 device can generate an Interrupt called 1m when it needs service The CPU will reply with an acknowledgement called ACK The handling by the CPU is simple There are two signals only INT some device has raised an interru ACK the CPU is ready to handle that interrupt We need an arbitrator to take the ACK and pass it to the correct device The common architecture is to use a daisy chain in which the ACK is passed mmii Ltr Grammars and Languages Languages are associated With two mechanisms is formal language theory 1 The nite automata that recognize the language 2 The grammars that generate the language This lecture presents an introduction to the use of grammars to generate a formal language Note that we don t think of grammars generating natural languages In a natural language such as English or Spanish grammar is often considered a codi cation of correct usage Ungrammatical constructs are still recognized In a formal language the grammar generates the language Any string that has not been generated by the grammar Will not be recognized Alphabets We begin by de ning these basic terms which also appear in the nite 
automata approach to the subject. Fortunately, the definitions are synonymous. The most basic construct is a symbol, which can be considered the equivalent of an ASCII or UNICODE character. There is no better definition of the term. An alphabet is a finite, non-empty set of symbols. We denote this by Σ. Some common alphabets seen in introductory courses are: the binary digits Σ = {0, 1}, the small letter set Σ = {a, b}, and the small number set Σ = {0, 1, 2, 3}. Another alphabet that I like to use for illustration is the hexadecimal digit set Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}.

Strings From the individual symbols we construct strings. In other contexts these are called either words or tokens. Suppose s is an arbitrary string with length |s| = k ≥ 0. Then either: 1. k = 0 and s = λ, the empty string; 2. k = 1, s = a1, a single character, and |s| = 1; or 3. k > 1, s = a1a2…ak, a sequence of k elements of Σ, and |s| = k. We now consider the operation of string concatenation, defined as follows. 1. If s is an arbitrary string and λ the empty string, then sλ = λs = s. 2. Otherwise, let s1 = a1a2…am with |s1| = m ≥ 1, and s2 = b1b2…bn with |s2| = n ≥ 1. Then s1s2 = a1a2…amb1b2…bn and |s1s2| = |s1| + |s2| = m + n.

More on Strings and Concatenation Example: Let Σ = {0, 1}. This is an alphabet with only two symbols. Given this, we can define the following sets. Σ¹ = {0, 1}, the set of all strings of length 1 over Σ. Σ² = {00, 01, 10, 11}, the set of all strings of length 2 over Σ. Consider two strings s1 = 011 and s2 = 0110, with symbols from Σ = {0, 1}. 1. |s1| = 3 and |s2| = 4. 2. s1s2 = 0110110 and s2s1 = 0110011. 3. |s1s2| = 7 and |s2s1| = 7. NOTE: The string concatenation operator is not commutative. Specifically, s1s2 ≠ s2s1 in general.

Concatenation Part 2 String concatenation is associative. More precisely: let s1, s2, and s3 each be a string over a given alphabet Σ. Then (s1s2)s3 = s1(s2s3), so that the expression s1s2s3 is not ambiguous. For example, let s1 = 11, s2 = 2222, and s3 = 333333. Then s1s2 = 112222 and (s1s2)s3 = 112222333333, while s2s3 = 2222333333 and s1(s2s3) = 112222333333. Note again that s1s2 = 112222 while s2s1 = 222211. Yet again we have another reminder that the concatenation operator does not
commute A Lemma 0n String Concatenation Lemma For any two strings s1 and s2 s1s2 2 s1 with equality if and only if s2 9 Proof For any two strings s1 and s2 s1s2 s1 s2 Now s2 is a counting number so it is not negative From that remark we immediately infer that s1s2 s1 s2 2 s1 Now suppose that s1s2 s1 s2 s1 Then s2 0 and s2 7 the unique string of zero length Ifsz 7 then s2 0 and s1s2 s1 s2 s1 0 81 Strings and Languages Again we start With 2 an alphabet of symbols We de ne 2 as the set of all strings based on the alphabet 2 More properly it is the set of strings obtained by concatenating zero or more symbols from the alphabet 2 We de ne 2 as the set of all non empty strings based on the alphabet 2 We can also say 2 2 K A language L is de ned as a subset of 2 L g 2 Example 2 a b 2 7L a b aa ab ba bb aaa aab aba abb 2 a b aa ab ba bb aaa aab aba abb L a ab aba abb is a nite language over 2 Note that L g 2 Strings and Languages Part 2 Again consider 2 a b with 2 7 a b aa ab ba bb aaa aab aba abb Consider the set de ned over 2 by L anbn n 2 0 Notation The exponent refers here to concatenation Let w be a string Then w2 ww is the concatenation of w with itself wn1 wnw ww for n 2 1 an inductive de nition So we now have L 7 ab aabb aaabbb aaaabbbb Obviously L g 2 so that L is a language over 2 a b Note that L anbn n 2 0 is de ned as a string of the symbol a followed by a string of an equal number of the symbol b L2 ambn m 2 O n 2 0 is de ned as a string of the symbol a followed by a string of symbol b of the same or different size How Do Grammars Generate Languages Given an alphabet Z we have de ned L g 2 as a language over 2 It is now time to attend to the grammar of a language Grammars are based on two concepts terminals and non terminals Grammars use production rules to generate a speci c string of terminals called a sentence om a speci c terminal called the start symbol The best way to handle this is to work through an example We consider a language based on a small 
subset of English A Small Language Here is a grammar for a subset of English 1 2 00lOUllgt A sentence comprises a n0unphrase followed by a verbphrase A n0unphrase is either a an article followed by an adjective followed by a noun or b an article followed by a noun A verbphrase is either a a verb followed by an adverb or b a verb article a the this is a list of terminal symbols adjective silly clever noun frog cow lawyer verb writes argues eats adverb well convincingly Producing a Sentence in the Language In this language the token sentence is the start symbol sentence nounphrase verbphrase article adjective noun verb adverb At this point we start to see terminals introduced the adjective noun verb adverb the silly noun verb adverb the silly cow verb adverb the silly cow writes adverb the silly cow writes convincingly At the end we have a sequence of terminal symbols Another sentence in the language The clever lawyer argues convincingly Why Don t We Build a Tree This grammar seems to suggest building a tree We don t do that Here are the top four levels of the complete tree sentence nounphrase verbphrase article adjective noun verbphrase article noun Verbphrase article adjective noun verb article noun verb article adjective noun verb adverb the silly cow wn39tes convincingly We do not generate the whole tree just a path from the root to one leaf node article noun verb adverb Recognizing the Language For those who miss high school English class here I show the standard way taught then for recognizing a valid sentence in the language It was taught to me as diagramming a sentence lawyer argues cun ncingljr Notice that this does not indicate how to write a sentence but only how to show that it was constructed according to the rules of grammar De nitions Part 1 A vocabulary V is a nite nonempty set of elements called symbols A sentence over V is a sequence of a nite number of elements of V The empty sentence 7 also called the empty string is the sentence containing no 
symbols The set of all sentences over V is denoted by V A language over V is a subset of V De nition of 3 Grammar A grammar G is de ned as a collection of four sets G V T S P V is a nite non empty set of objects called variables Equivalently V is called the vocabulary T is a nite non empty set of objects called terminal symbols also called terminals S is a special element of the set V It is called the start symbol P is a nite set of rules called productions The standard assumption is that an object is either a terminal symbol or a non terminal symbol It cannot be both V n T Q V can also be called the set of non terminal symbols or non terminals 1 U Analysis of Our Sample Grammar A sentence comprises a n0unphrase followed by a verbphrase Here the symbol sentence is clearly a nonterminal as it is replaced by the symbol n0unphrase followed by the symbol verbphrase This is the rst rule in the grammar so the symbol sentence is the start symbol A n0unphrase is either a an article followed by an adjective followed by a noun or b an article followed by a noun Here the symbol n0unphrase is clearly a nonterminal as it can be replaced by either one or two other symbols A verbphrase is either a a verb followed by an adverb or b a verb The symbol verbphrase is also clearly a non terminal Analysis of the Grammar Part 2 We now move to a listing of the terminal symbols 4 article a the 5 adjective silly clever 6 noun frog cow lawyer 7 verb writes argues eats 8 adverb well convincingly The symbols article adjective noun verb adverb are clearly nonterminals What we have left over are the symbols for which no substitution rules are speci ed This is the set of terminal symbols denoted by T V sentence noun phrase verb phrase article adjective noun verb adverb T a the silly clever frog cow lawyer writes argues eats well convincingly Conventionally the start symbol is listed rst in the set V Notation for Productions Productions are a rules for replacing non terrninal symbols by another 
sequence of symbols. Here are two standard forms:
X → Z        The symbol X can be replaced by the symbol Z.
X → Y | Z    The symbol X can be replaced by either Y or Z.
The last rule is equivalent to the following set of two rules: X → Y and X → Z.
One can extend the alternative right-side notation as needed: X → S1 | S2 | … | SN.
Later we shall define a number of grammars, each distinguished by a unique set of restrictive rules for what can appear on the right-hand side of a production.

Our Silly Grammar as a Set of Productions
Here is the grammar written in one style of production rules:
1. sentence → noun-phrase verb-phrase
2. noun-phrase → article adjective noun | article noun
3. verb-phrase → verb adverb | verb
4. article → a | the
5. adjective → silly | clever
6. noun → frog | cow | lawyer
7. verb → writes | argues | eats
8. adverb → well | convincingly
But what is (V ∪ T)+? V ∪ T is the set of all objects, both terminals and non-terminals, used in these rules. (V ∪ T)+ is the set of all sequences of one or more of these symbols; for example, verb adverb ∈ (V ∪ T)+.

Just What is (V ∪ T)+?
We first define V ∪ T. We present the sets V and T, but in another ordering:
V = {adjective, adverb, article, noun, noun-phrase, sentence, verb, verb-phrase}
T = {a, argues, clever, convincingly, cow, eats, frog, lawyer, silly, the, well, writes}
V ∪ T = {a, adjective, adverb, argues, article, clever, convincingly, cow, eats, frog, lawyer, noun, noun-phrase, sentence, silly, the, verb, verb-phrase, well, writes}
This set has 20 symbols. We could then consider the set that might be called (V ∪ T)^2, the set of 400 strings formed with two of the symbols above. It might contain strings such as "a a", "a adjective", "article noun", and "verb adverb". (V ∪ T)+ is the set of all combinations of one or more of the symbols in V ∪ T.

Memory Problems
You are asked to implement a 128M-by-32 memory (1M = 2^20) using only 16M-by-8 memory chips.
(a) What is the minimum size of the MAR?
(b) What is the size of the MBR?
(c) How many 16M-by-8 chips are required for this design?
Answer:
(a) 128M = 2^7 · 2^20 = 2^27, so the minimum MAR size is 27 bits.
(b) The MBR size is 32 bits.
(c)
(128M · 32) / (16M · 8) = 8 · 4 = 32 chips.

A given memory is byte-addressable. The MAR size is 22 bits. What is the maximum size of addressable memory, in bytes?
Answer: The maximum size is 2^22 bytes = 4 · 2^20 bytes = 4 MB = 4,194,304 bytes.

Complete the following table. Note that 64K = 2^16, 32K = 2^15, and 2^13 < 10K < 2^14.

Memory System Capacity | MAR bits | MBR bits | Chips needed: 1K by 4 | 2K by 1 | 1K by 8
64K by 4               | 16       | 4        | 64   | 128 | 32
64K by 8               | 16       | 8        | 128  | 256 | 64
32K by 4               | 15       | 4        | 32   | 64  | 16
32K by 16              | 15       | 16       | 128  | 256 | 64
32K by 32              | 15       | 32       | 256  | 512 | 128
10K by 8               | 14       | 8        | 20   | 40  | 10

Memory Control Signals
A CPU outputs two control signals, Select and R/W, interpreted as follows:
If Select = 0, the memory does nothing.
If Select = 1, the memory is read if R/W = 1; otherwise it is written.
Draw a circuit to convert these two signals into the following control signals:
READ    if READ = 1, the CPU reads from memory
WRITE   if WRITE = 1, the CPU writes to memory
Insure that READ = 1 and WRITE = 1 cannot occur at the same time.
ANSWER: Note that if Select = 0, then both READ = 0 and WRITE = 0; if Select = 1 and R/W = 1, then READ = 1 and WRITE = 0; if Select = 1 and R/W = 0, then READ = 0 and WRITE = 1. The circuit follows from those observations: READ = Select · (R/W), and WRITE = Select · (R/W)'.

Cache Memory
A computer memory system uses a primary memory with 100-nanosecond access time, fronted by a cache memory with 8-nanosecond access time. What is the effective access time if:
(a) the hit ratio is 0.7?
(b) the hit ratio is 0.9?
(c) the hit ratio is 0.95?
(d) the hit ratio is 0.99?
ANSWER: There is a problem with names here. For the cache scenario we have Tp as the access time of the cache and Ts as the access time of the primary memory. The equation is TE = h · 8 + (1 − h) · 100.
(a) h = 0.70: TE = 0.70 · 8 + 0.30 · 100 = 5.6 + 30.0 = 35.6 nsec
(b) h = 0.90: TE = 0.90 · 8 + 0.10 · 100 = 7.2 + 10.0 = 17.2 nsec
(c) h = 0.95: TE = 0.95 · 8 + 0.05 · 100 = 7.6 + 5.0 = 12.6 nsec
(d) h = 0.99: TE = 0.99 · 8 + 0.01 · 100 = 7.92 + 1.0 = 8.92 nsec

More Cache Memory
A computer has a cache memory set up with the cache memory having an access time of 6 nanoseconds and the main memory having an access time of 80 nanoseconds.
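The effective-access-time formula TE = h·Tp + (1 − h)·Ts used in these cache problems is easy to check numerically. Here is a small sketch; the function and variable names are my own, not the book's:

```python
def effective_access_time(hit_ratio, t_cache, t_main):
    """Weighted average of cache and main-memory access times (ns)."""
    return hit_ratio * t_cache + (1.0 - hit_ratio) * t_main

# Reproduce the four answers for an 8 ns cache over a 100 ns main memory.
for h in (0.70, 0.90, 0.95, 0.99):
    print(f"h = {h:.2f}: TE = {effective_access_time(h, 8, 100):.2f} ns")
```

Note how quickly TE approaches the cache access time as the hit ratio nears 1; this is why even a small cache pays off.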
This question focuses on the effective access time of the cache.
(a) What is the minimum effective access time for this cache memory?
(b) If the effective access time of this memory is 13.4 nanoseconds, what is the hit ratio?
ANSWER:
(a) The memory cannot have an effective access time less than that of the cache itself, so 6 nsec.
(b) The equation of interest here is TE = h·Tp + (1 − h)·Ts. Using the numbers, we have 13.4 = h · 6 + (1 − h) · 80, or 13.4 = 80 − 74·h, or 74·h = 66.6, or h = 0.9.

Big-Endian, Little-Endian
Examine the following memory map of a byte-addressable memory. Let W refer to address 0x109.
Address:  107 | 108 | 109 | 10A | 10B
Contents: 00  | 11  | 23  | 17  | CA
(a) Assuming big-endian organization, what is the value of the 16-bit integer at W?
(b) Assuming little-endian organization, what is the value of the 16-bit integer at W?
Answer: The 16-bit entry at address 0x109 occupies bytes 0x109 and 0x10A, as shown above.
(a) In big-endian, the "big end" is stored first: 0x2317.
(b) In little-endian, the "little end" is stored first: 0x1723.

Memory Interleaving
This memory is byte-addressable and low-order interleaved, with its 22-bit address formatted as:
Bits 21-4: address to chips | Bits 3-0: bank select
(a) What is the size of each chip in bytes?
(b) How many banks does the memory have?
(a) 18 bits are sent to each chip, so each chip has 2^18 bytes = 256 KB = 262,144 bytes.
(b) Four bits are used to select the bank, so there must be 2^4 = 16 banks.

Cache Tags and Cache Block Sizes
Suppose a computer using direct-mapped cache has 2^32 words of main memory and a cache of 1024 blocks, where each cache block contains 32 words.
(a) How many blocks of main memory are there?
(b) What is the format of a memory address as seen by the cache; that is, what are the sizes of the tag, block, and word fields?
(c) To which cache block will the memory reference 0x000063FA map?
ANSWER: Recall that 1024 = 2^10 and 32 = 2^5, so the cache holds 1024 · 32 = 2^15 words.
(a) The number of blocks is 2^32 / 2^5 = 2^27 = 2^7 · 2^20 = 128M = 134,217,728.
(b) 32 words per cache line, so the offset field is 5 bits. 1024 blocks in the cache, so the block number is
10 bits. 2^32 words gives a 32-bit address, and thus a 32 − 5 − 10 = 17-bit tag.
Bits:     31-15 | 14-5         | 4-0
Contents: Tag   | Block Number | Offset within block

A Simple Computer: Hardware Design
We now focus on the detailed design of the CPU (Central Processing Unit) of the ASC. The CPU has two major components: the Control Unit and the ALU (Arithmetic Logic Unit). We first consider the control unit.

Program Execution
The program execution cycle is the basic Fetch-Execute cycle, in which the 16-bit instruction is fetched from the memory and executed. This cycle is based on two registers:
PC  the Program Counter, a 16-bit address register
IR  the Instruction Register, a 16-bit data register
At the beginning of the instruction fetch cycle, the PC contains the address of the instruction to be executed next. The fetch cycle begins by reading the memory at the address indicated by the PC and copying the memory contents into the IR. At this point the PC is incremented by 1 to point to the next instruction. This is done due to the high probability that the instruction to be executed next is the instruction in the address that follows immediately; program jumps (BRU, BIP, etc.) are somewhat unusual and are handled in program execution.
All instructions share a common beginning to the fetch sequence. The common fetch sequence is adapted to the relative speed of the CPU and memory. We assume that the access time of the memory unit is such that the memory contents are not available on the step following the memory read, but on the step after that. Here is the common fetch sequence:
MAR ← PC       send the address of the instruction to the memory
Read Memory    this causes MBR ← (MAR) = (PC)
PC ← PC + 1    cannot access memory, so might as well increment the PC
IR ← MBR       now the instruction is in the Instruction Register
When the instruction is in the IR, it is decoded and the common fetch sequence terminates. After this point, the execution sequence is specific to the instruction. This subsequent execution sequence includes calculation of the EA (Effective Address) for those
instructions that take an operand. The next step in the design of the CPU is to specify the microoperations corresponding to the steps that must be executed in order for each of the assembly language instructions to be executed. Before considering these microoperations, we study several topics: the structure of the bus (or busses) internal to the CPU, and the functional requirements on the ALU. We first consider the structure of the CPU internal busses.

Version of 7/30/2010 Page 1

CPU Internal Bus Structure
We now consider the bus structure in light of the common fetch sequence, in particular the microoperation PC ← PC + 1, which reflects the probability that the next instruction in memory will be the next to be executed. Note that this is one register-transfer operation. We shall depart from the book's notation and use the lowercase "add" to denote the ALU primitive, reserving uppercase ADD for the assembly language operation.
At this point we know that there must be at least one bus internal to the CPU, so that values such as the contents of the PC can be moved between the registers and the ALU. We consider a one-bus solution and immediately notice a problem. The ALU must have two inputs: one for the contents of the PC and one for the constant 1 used to increment the PC. If we use a single-bus solution, we must allow for the fact that the bus can carry only one value at a time. Under the single-bus assumption, one design would add the complexity of an increment instruction for the ALU, but we avoid that and base our solution on the add operation. We need a source of the constant 1, so we create a "1 register" to hold that number. We postulate a two-input ALU with a register Z to hold the output. Since the bus can have only one value at a time, we must also have a temporary register Y to hold one of the two inputs to the ALU. Here are the microoperations:
CP1: PC → BUS, BUS → Y
CP2: 1 → BUS, add, ALU → Z
CP3: Z → BUS, BUS → PC
Note that the single-bus solution is rather slow. We would like another way to do this, preferably a faster one. The solution proposed by the book is to have three busses in the CPU, named BUS1, BUS2, and BUS3. With three busses we can put one value on each of the two busses that serve as input to the ALU and copy the result onto the third bus, serving as input to the PC, as follows:
PC → BUS1, 1 → BUS2, add, BUS3 → PC

Version of 7/30/2010 Page 2

The
Three-Bus Structure
As mentioned above, the design of a CPU with three internal data busses allows a more efficient design. We name the busses BUS1, BUS2, and BUS3. The use of these busses is as follows: BUS1 and BUS2 are inputs to the ALU, and BUS3 is an output from the ALU. Put another way, BUS3 is the source for all data going to each register, and each register outputs data to either BUS1 or BUS2. We allocate registers to busses based partially on chance and partially on the requirement to avoid conflicts: if two data need to be sent to the ALU at the same time, they need to be assigned to different busses.
What does the ALU require? The only way to determine what must be placed on each input bus is to examine each assembly language instruction, break it into microoperations, and allocate the bus assignments based on the requirements of the microoperations.

Common Fetch Sequence
We repeat the common fetch sequence:
MAR ← PC       send the address of the instruction to the memory
Read Memory    this causes MBR ← (MAR) = (PC)
PC ← PC + 1    cannot access memory, so might as well increment the PC
IR ← MBR       now the instruction is in the Instruction Register
This sequence of four microoperations gives rise to a remarkable number of requirements for both the ALU and the bus assignments. We first examine the simple microoperation PC ← PC + 1. We have already noted the requirement that the ALU have an "add" control signal associated with the eponymous ALU primitive operation (use your dictionary). It may be a surprise to see that there are two more requirements associated with this microoperation. If the ALU is to execute the add primitive, it must have two inputs: one associated with BUS1 and one associated with BUS2. We make the bus allocations as follows. The PC is allocated to BUS1, in that it outputs an address to BUS1; at this moment the allocation is arbitrary. We allocate the constant 1 to BUS2 because of the requirement that the microoperation PC ← PC + 1 have inputs on each of the two busses. We create a 16-bit register containing the constant
1. As an aside, at this point we note that BUS3 is used to transfer an address into the PC. At this point we give the complete set of control signals associated with this microoperation:
PC → BUS1, 1 → BUS2, add, BUS3 → PC
Note the use of the → operator rather than the ← operation.

Version of 7/30/2010 Page 3

Consider now the microoperation MAR ← PC. The PC outputs to BUS1, and BUS3 is used to transfer data to all registers. The only way for a value to move from an input bus to BUS3 is through the ALU, so we define two ALU primitives for data transfer:
tra1  transfer the contents of BUS1 to BUS3
tra2  transfer the contents of BUS2 to BUS3
While we are at it, we assign the MAR to BUS1; again, at this point this is a bit arbitrary. The transfer IR ← MBR uses the MBR on BUS2, requiring the tra2 primitive already defined. At this point we review what we have:
MAR ← PC       PC → BUS1, tra1, BUS3 → MAR
Read Memory
PC ← PC + 1    PC → BUS1, 1 → BUS2, add, BUS3 → PC
IR ← MBR       MBR → BUS2, tra2, BUS3 → IR
We now look at the CPU design as it has evolved at this time in response to these requirements. This is not a complete description, but only a description of what we have discussed to this point. The ALU output bus BUS3 feeds all of the registers, with the exception of the constant register 1. We have a somewhat uneven division of the ALU input busses, but this will be fixed later.

Two Addressing Modes: Direct and Indexed
For now, consider the 8-bit address field. When an instruction is loaded into the IR (Instruction Register), the address part is IR7-0. In direct addressing this is the address to use. In indexed addressing the address to use is IR7-0 + (IX), where (IX) denotes the contents of the specified index register. These addresses go to the MAR; thus:
Direct Addressing:   MAR ← IR7-0
Indexed Addressing:  MAR ← IR7-0 + (IX)

Version of 7/30/2010 Page 4

The index registers IX1, IX2, and IX3 are connected to BUS2, so that the contents of the indicated index register can be added to the low-order 8 bits of the IR. At this point we mention the IX0 trick, actually a standard design practice. Consider the above description of the two addressing modes, slightly rewritten:
If IR9-8 = 01   MAR ← IR7-0 + (IX1)
If IR9-8 = 10   MAR ← IR7-0 + (IX2)
If IR9-8 = 11   MAR ← IR7-0 + (IX3)
We add a pseudo-register IX0, identically 0, selected when IR9-8 = 00. With this constraint, the microoperation MAR ← IR7-0 + (IX0) is
the same as MAR ← IR7-0 + 0, which is direct addressing.
Note that the top index register, IX0, is shown filled with 0s to indicate that it is really the constant 0. Since IX0 is a constant register, it has no connection to BUS3, the only function of which is to put data into registers. The index registers are connected to BUS2 through a 4-to-1 MUX; the control signal IX → BUS2 causes the output of this MUX to be placed on BUS2. BUS3 is connected to the index registers through a DEMUX, operating in the expected fashion when the control signal BUS3 → IX is active.

Version of 7/30/2010 Page 5 (Figure 5)

There are a number of equivalent designs; the important point is to follow the constraints. The two bits IR9-8 select the register. There are only two control signals, IX → BUS2 and BUS3 → IX, and when IR9-8 = 00 the signal IX → BUS2 causes a 0 to be placed on each line of BUS2. How does one think about the pseudo-register IX0, containing only 0? Consider an implementation of the IX → BUS2 logic: for each of the 16 bit lines on BUS2 we have a circuit similar to that at right. The other three index registers do have storage, each having a set of 16 flip-flops. In the end, it is a matter of semantics.

A Note on Bus Functionality
We have named the input busses BUS1 and BUS2, and the output bus (the output of the ALU) BUS3. Suppose we wanted to copy a specific index register to the MBR. Given the structure already defined, we could define a new control signal BUS2 → MBR and then assert both IX → BUS2 and BUS2 → MBR to effect the transfer. For our purposes, this design presents an unnecessary complication.

Survey of the Assembly Language Instructions
We now survey the assembly language instructions.
LDA  The key microoperation is MBR → ACC, implemented as MBR → BUS2, tra2, BUS3 → ACC.
STA  This copies the ACC to a bus; we shall soon show that it must be BUS1.
     Control signals required: ACC → BUS1. ALU primitives required: tra1.

Version of 7/30/2010 Page 6

ADD The key microoperation to be executed is ACC + MBR → ACC. This immediately requires that the ACC feed a bus different from the one used by the MBR. Since the MBR has been assigned to BUS2, the ACC must
be assigned to BUSl as done above ALU primitives add already speci ed TCA The key microoperation to be executed is 7 ACC gt ACC We need a new ALU primitive we have the choice tea or comp Implementation of the two scomplement as an ALU primitive tca would yield a speed boost but create problems in the ALU design so we opt to have the ALU primitive be comp 7 for one scomplement This will cause the TCA instruction to be executed as a twostep operation store the one scomplement in the ACC and then add 1 to it We have a 1 register attached to BUSZ and an add primitive de ned ALU primitives comp SHL This requires the ALU primitive shl SHR This requires the ALU primitive shr TIX As the index registers feed bus 2 this requires a 1 constant register to feed bus 1 TDX As the index registers feed bus 2 this requires a 7l constant register to feed bus 1 Indexed Addressing Revisited We now address the impact of the 1X0 design choice Recall the claim that this design has reduced the number of addressing modes to two Indexed and IndexedIndirect so that Direct Addressing becomes Indexed by 1X0 identically 0 Indirect Addressing becomes Indexed by 1X0 then indirect Indexed Addressing becomes Indexed by a real index register IndexedIndirect Addressing becomes Indexed by a real index register then indirect We must now face the problem that execution of opcodes C D E and F does not allow for indexed addressing So we need a new criterion for choosing when to index possibly with 1X0 and when not to index The answer is seen in the structure of the opcodes If IR1514 10 addressing is not used so there is no indexed addressing If IR1514 ll indexed addressing is not used The design choice is to inhibit indexing if IR15 1 If IR15 0 the design uses indexed addressing possibly with 1X0 7 leading to direct or indirect addressing If IR15 1 only direct or indirect addressing are possible This approach simpli es the design We now specify a design requirement for the common fetch cycle an address is 
to be deposited in the MAR This address will be the Effective Address unless indirect addressing is also used Thus we specify a common fetch sequence for all oneoperand instructions Version of 7302010 Page 7 Common Fetch Seguence for OneOperand Instructions CPl PC gt BUSl tral BUS3 gt MAR READ CP2 PC gt BUSl l gt BUSZ add BUS3 gt PC CP3 MBR gt BUSZ traZ BUS3 gt IR CP4 If IR15 0 Then IR70 gt BUSl IX gt BUSZ add BUS3 gt MAR Else IR70 gt BUSl tral BUS3 gt MAR The zerooperand instructions will replace the CP4 step with an appropriate action The Defer Cycle The defer cycle handles indirect addressing If indirect addressing is used then on entry to the defer cycle the value in the MAR is the address of a pointer to the argument On exit the argument address must be in the MAR Here is the complete defer sequence CPl READ The address is in the MAR CP2 Wait Cannot do anything here CP3 MBR gt BUSZ tra2 BUS3 gt MAR Now the argument address in the MAR CP4 Wait Major Cycles and Minor Cycles We now introduce the concept of major and minor cycles In the rst CPU design we focus on three major cycles Fetch Defer and Execute In this design each major cycle has four minor cycles In another design the Fetch Defer and Execute cycles are seen as phases in the execution of an instruction The minor cycle is tied to the CPU clock rate A minor cycle can contain only micro operations that can be executed at the same time and complete in one clock time Suppose that the ASC is a 10 MHz machine The clock time is then 01 microsecond this is the time duration for a minor cycle Our design calls for the ASC to wait for one clock cycle between asserting an address in the MAR and retrieving data from the MBR this corresponds to a memory access time of 100 nanoseconds 7 a reasonable value We now examine two implementations of the control unit one using only standard combinational circuits hardwired control unit and then a microprogrammed CPU The basis for each ofthese designs is the complete sequence 
of control signals for each assembly language instruction For convenience we give the control signals within the context of a hardwired control unit Note that the defer cycle is always listed for oneoperand instructions even when indirect addressing is not used If indirect addressing is not used the defer cycle is not entered No indirect addressing Fetch 7 Execute Indirect addressing used Fetch 7 Defer 7 Execute Version of 7302010 Page 8 0 HLT F CP1 F CP2 F CP3 F CP4 Halt PC gt BUSltra1 BUS3 gt MAR READ PC gt BUSl 1 gt BUSZ add BUS3 gt PC MBR gt BUSZ tra2 BUS3 gt IR 0 gt RUN Reset the RUN Flip op For this instruction there are no defer and execute phases The computer stops 1 LDA FCP1 FCP2 FCP3 FCP4 DCP1 DCP2 DCP3 DCP4 ECP1 ECP2 ECP3 ECP4 2 STA Load Accumulator PC gt BUSl tral BUS3 gt MAR READ PC gt BUSl 1 gt BUSZ add BUS3 gt PC MBR gt BUSZ tra2 BUS3 gt IR IR70 gt BUSI IX gt BUSZ add BUS3 gt MAR READ Wait MBR gt BUSZ tra2 BUS3 gt MAR Wait READ Wait MBR gt BUSZ tra2 BUS3 gt ACC Wait Store Accumulator Fetch 7 as in LDA Defer 7 if entered the same as LDA ECP1 ECP2 ECP3 ECP4 3 ADD ACC gt BUSI tral BUS3 gt MBR WRITE Wait Wait Wait Add to Accumulator Fetch 7 as in LDA Defer 7 if entered the same as LDA ECP1 ECP2 ECP3 ECP4 Version of 7302010 READ Wait ACC gt BUSIMBR gt BUSZ add BUS3 gt ACC Wait Page 9 4 TCA Negate the Accumulator using the Two s Complement FCP1 PC gt BUSl tra1 BUS3 gt MAR READ FCP2 PC gt BUSl 1 gt BUSZ add BUS3 gt PC FCP3 MBR gt BUSZ tra2 BUS3 gt IR FCP4 Wait Defer Cycle is not entered ECPl ACC gt BUSl comp BUS3 gt ACC ECP2 ACC gt BUSl 1 gt BUSZ add BUS3 gt ACC ECP3 Wait ECP4 Wait 5 BRU Unconditional Branch Fetch 7 as in LDA Defer 7 if entered the same as LDA ECPl MAR gt BUSl tra1 BUS3 gt PC ECP2 Wait ECP3 Wait ECP4 Wait 6 BIP Branch if the Accumulator is Positive Fetch 7 as in LDA Defer 7 if entered the same as LDA ECPl If ACC gt 0 Then MAR gt BUSl tra1 BUS3 gt PC ECP2 Wait ECP3 Wait ECP4 Wait 7 BIN Branch if the Accumulator is Negative Fetch 7 as in LDA 
Defer 7 if entered the same as LDA ECPl If ACC lt 0 Then MAR gt BUSl tra1 BUS3 gt PC ECP2 Wait ECP3 Wait ECP4 Wait Version of 7302010 Page 10 Before giving the control signals for the RWD and WWD assembly language instructions we first must de ne the handshake protocols used in the IO system The 10 system uses three ip ops DATA INPUT and OUTPUT The uses of these ip ops are as follows DATA indicates the presence of data to be transferred INPUT commands the input data to place data on the DIL data input lines OUTPUT commands the output deVice to take data from the DOL data output lines The RWD protocol is as follows 1 The CPU resets the DATA ip op 2 The CPU sets the INPUT ip op 3 The CPU waits for the DATA ip op to be set by the Input Unit 4 The Input Unit places data on the DIL and sets the DATA ip op 5 The CPU transfers data from the DIL to the ACC and resets the INPUT ip op The WWD protocol is as follows The CPU resets the DATA ip op 2 The CPU transfers data from the ACC to the DOL and sets the OUTPUT ip op 3 The CPU waits for the DATA ip op to be set by the Output Unit 4 The output unit copies data from the DOL and sets the DATA ip op 5 The CPU resets the OUTPUT ip op V 8 RWD Read Word from Input Unit into the Accumulator FCPl PC gt BUSl tral BUS3 gt MAR READ FCP2 PC gt BUSl l gt BUS2 add BUS3 gt PC FCP3 MBR gt BUS2 tra2 BUS3 gt IR FCP4 0 gt DATA 1 gt INPUT ECPl Wait Stay in Execute until the Input is complete ECP2 Wait ECP3 Wait ECP4 If DATA 1 Then DIL gt ACC 0 gt INPUT Else E gt Next State 9 WWD W rite Word to Output Unit from the Accumulator FCPl PC gt BUSl tral BUS3 gt MAR READ FCP2 PC gt BUSl l gt BUS2 add BUS3 gt PC FCP3 MBR gt BUS2 tra2 BUS3 gt IR FCP4 0 gt DATA 1 gt OUTPUT ACC gt DOL ECPl Wait Stay in Execute until the Output is complete ECP2 Wait ECP3 Wait ECP4 If DATA 1 Then 0 gt OUPUT Else E gt Next State Version of 7302010 Page 11 A SHL Shift Accumulator Left FCP1 PC gt BUSl tral BUS3 gt MAR READ FCP2 PC gt BUSl 1 gt BUSZ add BUS3 gt PC FCP3 MBR gt 
BUSZ tra2 BUS3 gt IR FCP4 ACC gt BUSl shl BUS3 gt ACC There is no defer or execute cycle B SHR Shift Accumulator Right FCP1 PC gt BUSl tral BUS3 gt MAR READ FCP2 PC gt BUSl 1 gt BUSZ add BUS3 gt PC FCP3 MBR gt BUSZ tra2 BUS3 gt IR FCP4 ACC gt BUSl shr BUS3 gt ACC There is no defer or execute cycle C LDX Load the Index Register FCP1 PC gt BUSl tral BUS3 gt MAR READ FCP2 PC gt BUSl 1 gt BUSZ add BUS3 gt PC FCP3 MBR gt BUSZ tra2 BUS3 gt IR FCP4 IR70 gt BUSl tral BUS3 gt MAR DCP1 READ DCP2 Wait DCP3 MBR gt BUSZ tra2 BUS3 gt MAR DCP4 Wait ECP1 READ ECP2 Wait ECP3 MBR gt BUSZ tra2 BUS3 gt IX ECP4 Wait D STX Store the Index Register Fetch 7 same as LDX Defer 7 if entered same as LDX ECP1 IX gt BUSZ tra2 BUS3 gt MBR WRITE ECP2 Wait ECP3 Wait ECP4 Wait Version of 7302010 Page 12 eonneetedto BUS2 We needto tnerement and deerement the regtsters usmg the ALU so we need two addmonal regtsters 1 and 71 both eonneetedto BUS1 E TD Increment and Test the Accnmnlztnr Branch Cnndi nnally Feteh 7 same as LDX Defer same as LDX ECP1 1gtBUS1IX ABUSZ add BU53 gtIX ECP2 IfIX 0 Then MAR ABUSI tral BU53 gtPc F TDX Decrement and Testthe Accnmnlztnr Branch Cnnditinnally Feteh 7 same as LDX Defer same as LDX Ec1gt1 e1 gtBUSL IX gtBU52 add BU53 alx ECP2 IfIX 0 Then MAR gtBUSL tra1 BU53 gtPC E op Watt ECP4 Watt Desrgn of a Hardwrred Contro1Unrt We sha11 study the followmg rssues 1 Decodmg the m 1nstruetaon Regrster 2 The ma or state regrster 3 The eontro1srgna1 generatron tree The hardwtred eontrol umthas avery regular strueture It 15 based on two e1oeks 7 one generating mm r t t F t h 1 F wdF m tr quott r d c1gt2 0P3 and CPA The mmor e1oek pu1ses are generated by amodulorlt eounter see text page 102 Cl The system reset n n n n 12 the eo s started 5 CP3 When the eomputerts Cl runnrngthrsgenerates 5 mm 4 aeontrnuous stream of eset kpu1ses e e1 e aeh de mng a mmor state Verston of7302010 Page 13 The lnstruetlon decodertakes the four hlgh HT oroler brts he a struetaon Regrster LDA The output ofthe 
Artorl decodens one of IRIS STA loeo rolsl als hld n edbythe D name othe assembly language rnstruetron wth whlch the eontrol slgnal ls assoerateol 44046 33 R or slmpllelty we do not conslder the 1quot moodquot Bl extenoleolrnstruetron set as oloes the gure on BIN page 238 RWD m WWD For example conslder the sTXlnstruetaon 13 5111 The Opeoole forthls rnstruetaon ls SHR mm 1101 Thls ls rnput to the deeooler LDX causlng the output 13 to beeome aetaye anol STX assert the slgnal STX whlch eauses the logre Ru TIX for the STX assembly language rnstruetron to be e t d TDX The Induect eontrol slgnal ls the output of mm blt 10 ofthe Instruetlon Reglster We have already dlscussed the use ofmes fonndex reglster seleetaon logl Deslgn of the Ma or State Reglster conespondlng to the exeeutaon ofthe assembly language lnstruetaons We have seenthat there are three posslble sequences Feteh only Feteh eExeeute Feteh e Defer e Exeeute states For example as most eannot the questaon ls whether ornot to enterthe Defer state The next eonslderatzon anses from the two IOlnstxueuons RWD andWWD The strategy Wlll be to holdthese lnstruetlons m Exeeute untll the IO txansaetzon ls eomplete To de ne three slgnals I lf ln Feteh stay ln Feteh and fetch the next lnstruetaon Ia lf ln Feteh go to Defer and handle mdu39ect addressmg 13 lfln Exeeute stay ln Exeeute untal the 10 ls eomplete Verslon of7302010 Page 14 We revlew the rnstruetaon set to deterrnrne when to assert the Ir eontrol slgnal What assembly language rnstruetaons are eornpleteolrn the Feteh state7 The answerls HLT SHL and 5le The eonsroleratron of BLT may be aeaolernre as the startup sequence forces the eontrol unrtto Fetehc1gtl orFetehc1gtl Thus I HLT SHL 51m Remember that these are eontrol slgnals Ir0 The book slams than mlu the lndArectbltln the Instruction Regrster We note that the TCAorRWDor D m A Ln r tr d handled so 1 mn TcA39 RWD only two rnstruetaons freeze m the Exeeute state untal eornpletron RWD andW39WI Eaeh of these stay m 
exeeute lf DATA 0 r e the DATA lpr op has not been set Thus Is RWI WWD DATA39 Here ls the state dlagram for aken unclear121 the rst Iquot belng aololeol The ma or state regrster has state transrtaons oee rj atter CPA ofthe present state We use two D lpr ops D and Dn to eonstruet the major state e encodmg m the table to the lelt we glye the ua forDl anan andthus deterrnrne the nextmajor state as a funetaon ofthe present major state F D or E and the three denved eontrol slgnalsll12 and 13 D Full 12 D1 F11239D EI13 Verslon of7302010 Page 15 Signal Generation Tree We now have the three major parts of circuits required to generate the control signals 1 the major state register F D and E 2 the minor state register CPI CP2 CP3 CP4 3 the instruction decoder In hardwired control units these and some other condition signals are used as input to combinational circuits for generation of control signals As an example we consider the generation of the control signals for the first three steps of the fetch phase Fetch When the major cycle is Fetch then PC To BUS1 the discrete signal Fetch 1 True mm The discrete signals CPI CP2 and cm 7 BUS3 To MAR CP3 each are I at the appropriate READ clock pulse Thus in CPI ofthe Fetch cycle we have Fetch 1 PC T BUSI CPI ICP20 and CP3 0 1 TO BUS2 Only the top AND gate has both CF07 add inputs 1 so its output is I and as a BUS3 To PC result the signals PC to BUS I traI BUS3 to MAR and READ are all MBR To BUSz asserted set to I In Fetch CP2 CPSDEE traz only the second AND gate has output BUS3 To IR of I and the second set of signals is asserted F CP3 is similar There is one obVious remark about the above drawing Notice that the each of the top two AND gates generates a signal labeled PC To BUSI At some point in the design these and any other identical signals are all input into an OR gate used to effect the actual transfer The book contains the complete set of signal generation trees for the ASC Although they may seem complicated they are really 
quite simple Figure 515 on page 253 shows the tree for the Defer cycle which has the advantage of also appearing simple The simplest way to handle the other trees in to consider five trees one each for FetchCP4 ExecuteCPI ExecuteCP2 ExecuteCP3 and ExecuteCP4 The figures in the book on pages 252 to 256 illustrate these We now look at the signal tree for Fetch CP4 This we do in two stages First we consider what the signal generation tree would look like were the generic microinstruction for F CP4 followed Then we shall look at the actual signal generation tree Version of 7302010 Page 16 The mlm omsh uchon for the genene Feteh CPA is as follows F CPA K1315 0 Then man gtBUSL IX gtBUS2 add BU53 gtMAR El man gtBUSLuaLBU53 gtMAR u THEN ELSE condmons tn a stgna1 control a e After thts has been done we dsplay the stgna1 generation tree for Eeteh CPA as aetuany found on the Asc Here is the stgna1 generation tree for the nntenotnstmetton as eoded above CPA 1f Eeteh 1 and CPA 1 as they are based on output from an AND gate wtth mputFetch and CPA m ter mm Ann the Eeteh CPA stgnals as they do not depend on the state ofIRu 3 K13 we tssue the stgnals IX ABUSZ andadd othemxsejust the stgna1 p11 o fANI the stgnals conespondmg to the u THENELSE condmon We now begtn our examtnaaon ofthe real stgna1 generation tree forFetch CPA Verston of7302010 Page 17 Fnstwe nnust examtne a eonnponent ofthat cxrcmt The cxrcmt at1ett is a NOR gate Recall that the m output ofaNOR gate is 1 1fand on1y 1fboth oftts 5 tnputs are 0 othenynse the outputts 0 TCA The output is 1m and on1y if both m1 0 and TCA 0 ms 0 forthe rst etght opeodes the rst mw T The nextfxgure shows the tme stgns1 generation tree forFetch c1gt4 of the Asc 1ae1nng on1y To h h n t Fetch c124 ovRuN IELT RH mum m tBusz IRIS add TCA BESSOMAR RH mum mus er BESSOMAR RN m s 1 Acet BUSI Buss Acc s1111 Sh TT 39T quotW wd39T T h TR 1 and ran 1 As these are the on1y tnstmehons wtththat property 1 substttuted forthe OR gate In my edmon of the book 
the tree for SHR and SHL tneoneet1y enntts p11 not eonststent wtth 1Lhethe control stgna1s or the ALU destgn Version of7302010 Page 18 cyclc causc control slgnals to be cmrttcol Hcrc ls thc deslgn ofthe defer cyclc Dcrcr READ c121 MBRtBusz CP3 BussvMAR cvcrytlurrg m these uotcs so the studcrrt ls lnstxuctedto examlne the tcxtbook Tcstm the State othe Accumulator We have two mstrucuorls B are con nal Sign Bit and BJN that dluo 12 z lfthe accumulator ls notrrcgatrvc or zcro rtmust be posruvc c ag lsJust a copy of the slgn brt ofthe accumulator 1t ls casrcrto gcucratc z by rst gcrlcraurlg Z wluch ls 1 only lfthe accumulatorrs not zcro Then wc gcucratc z Vcrslon of7302010 Page 19 the spenfxedmdex regrsterrs zero orhot zero Desrgh of the ALU We now eover the desrgh ofthe Arithmetic Lngie Unit ALU ofthe ASC The book the ALU rhto three fuheuohsu uruts e the TRA urut for ter and ma the Adder and the thftUmt for shl shr and camp The gure below shows the desrgh of the TRA Transfer Urut othe ALU The desrgh rs eeds Bus 3 erferthertral 1 or tra2 1theh the urut outputs on Bus 3 otherwrse rt does not and possrbly anotherurut places output on the bus Bus1K Buss K BUSZK ter tral The Transfer Unit Versroh of7302010 Page 20 The adder unit is best presented as a 16bit unit using a 16bit full adder In addition to these connections the circuit outputs C Carry Out and O Over ow to the PSR The shi unit handles shl shr and comp because it is easier to put it here The only input to the shi unit comes from bus 1 There is special handling ofbits 15 and 0 for the shi instructions Shi le is a logical shi and shi right is an arithmetic shi Shi Right BUSl 15 is copied to both BUS315 and BUS314 to preserve the sign bit BUSlu is lost Shi Le BUS3U gets a 0 BUSl 15 is lost The design for the shift unit is shown below Note that we must show separate designs for the units that feed bits 15 and 0 of bus 3 Version of 7302010 Page 21 Mteroprogrammed Control Unlt l and th ther eontrols slgnals to 0 For 
example, in Fetch-CP1 the control signals PC → BUS1, tra1, BUS3 → MAR, and READ are set to 1, and all other control signals are set to 0. One way to generate the signals is to use combinational logic, mostly AND and NOT gates. Another way is to use a Read-Only Memory: as each word of the ROM is read, it sets bits in a micro-memory buffer register (the micro-MBR). Each bit in the micro-MBR corresponds to one control signal.

As a very simple example, consider the control signals emitted for Fetch CP1 to CP3:

F-CP1: PC → BUS1, tra1, BUS3 → MAR, READ.
F-CP2: PC → BUS1, 1 → BUS2, add, BUS3 → PC.
F-CP3: MBR → BUS2, tra2, BUS3 → IR.

These three steps use only ten control signals, so a micro-MBR of only ten bits would suffice, connected to the control signals: PC to BUS1, READ, 1 to BUS2, add, MBR to BUS2, tra2, tra1, BUS3 to MAR, BUS3 to PC, and BUS3 to IR. With the bit assignments implied by this simple micro-memory buffer register, taking the signals in the order just listed, the control code for the three fetch operations becomes the following:

F-CP1: 1 1 0 0 0 0 1 1 0 0
F-CP2: 1 0 1 1 0 0 0 0 1 0
F-CP3: 0 0 0 0 1 1 0 0 0 1

The control program for the computer is then just a bunch of 1's and 0's indicating what signals each step emits. The program so represented is called binary microcode. As before, we can use a microinstruction as a shorthand to represent the binary microcode.

The microprogrammed control unit comprises:

1. The CROM (Control Read-Only Memory), with the micro-MBR to contain the output of the CROM and the micro-MAR to indicate the control word to be read.
2. A sequencer to specify the next control word to be read.

The execution cycle for the controller is really only a fetch cycle: fetch the control word, emit its signals, and determine the address of the next control word to be accessed. The task of the µCU (Micro-Control Unit) is to select the next word from the CROM to be read into the micro-MBR.

[Figure: the CROM, with status signals in and control signals out]

The number of words in the CROM is determined by the structure of the machine language. In our example we need 119 words of CROM and would probably provide 128 words, just to leave the size as a power of 2. The number of bits in each control word depends on the number of signals that must be generated and on the approach to generating the control signals. There are two types of microcode: horizontal microcode and vertical microcode, which is somewhat slower. We define these
terms by illustration. Consider BUS1. The discussion of microinstruction formats presents a list of the registers that can drive BUS1. For each of these registers we must generate a control signal. Thus we need the following control signals for BUS1: ACC to BUS1, PC to BUS1, MAR to BUS1, IR to BUS1, MBR to BUS1, and 1 to BUS1. Thus, six control signals for BUS1.

By the term "Field 1" we denote the bits in the control word used to specify which register is placed on BUS1. There are seven possibilities; remember that nothing on BUS1 is OK. In horizontal microcode, Field 1 would have one bit for each of the six signals. Note that if two of the bits in Field 1 are set, we have a real problem.

[Figure: the 6-bit Field 1 for BUS1 control in horizontal microcode]

Field 1 selects the BUS1 source, so we have seven possibilities for which to code. Now 2^2 < 7 ≤ 2^3, so we need three bits to code the possibilities. In vertical microcode, Field 1 therefore has three bits. The codes shown in Figure 5.20(a) indicate the signal source; thus 100 is the code for PC to BUS1, etc.

[Figure: the 3-bit Field 1 for BUS1 control in vertical microcode]
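The vertical encoding just described can be made concrete with a short sketch. Python is used here purely for illustration (the notes contain no code), and apart from the code 100 for "PC to BUS1", which is quoted from Figure 5.20(a), the 3-bit assignments below are invented for the example.

```python
# Illustrative 3-bit vertical encoding of Field 1 (the BUS1 source).
# Only the code 100 for "PC to BUS1" is given in the notes; the other
# six assignments here are assumptions made for this sketch.
BUS1_CODES = {
    0b000: "nothing drives BUS1",
    0b001: "ACC to BUS1",
    0b010: "MAR to BUS1",
    0b011: "IR to BUS1",
    0b100: "PC to BUS1",      # the one code quoted from Figure 5.20(a)
    0b101: "MBR to BUS1",
    0b110: "1 to BUS1",
}

def decode_field1(bits: int) -> str:
    """Recover the asserted BUS1 signal from a 3-bit Field 1 value."""
    if bits not in BUS1_CODES:
        raise ValueError(f"unused code {bits:03b}")
    return BUS1_CODES[bits]

# Seven possibilities fit in three bits, since 2**2 < 7 <= 2**3;
# horizontal microcode would instead spend one bit per signal.
assert len(BUS1_CODES) == 7
assert decode_field1(0b100) == "PC to BUS1"
```

Note the trade-off the text describes: the vertical form needs a decoder such as this (and is therefore somewhat slower), while the horizontal form needs more bits but no decoding logic.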
One might note a possible design aw in the Field 1 and Field 3 instructions It probably would be better to swap 1 gt OUTPUT and 0 gt INPUT to avoid signal con icts Here is a typical type 0 instruction lqulBUSl BUs2 BUS3 ALU F1 F2 F3 lMem 0 001 000 I101 001 00000010 This corresponds to ACC gt BUSl tral BUS3 gt MBR WRITE For the record the preferred way to write this on an answer is 0 l 0 5 l 0 0 0 2 7 each field being represented by a single hexadecimal digit or octal digit if you want to be a purist This is easier to read Version of 7302010 Page 25 The format of the type 1 microinstruction depends on two criteria 1 How many different jump conditions exist and 2 How many bits must be allocated for the jump address The current design calls for 119 words of micromemory numbered from 0 to 118 Recalling that 26 lt 119 S 27 we conclude that we need seven bits in the microaddress As we shall soon see there is a need for three bits to specify the jump condition With this convention the format of the type 1 microinstruction is I20I19I18I17I16I15I14I13I12I11I10I9 I8 7 I6 5 l4 3 2 1 I 1 I Condition I MicroAddress I Not Used I The best way to enumerate the condition codes for the jumps is to write the microcode and create a new condition whenever one is needed The following table shows the results of doing that Note that we always have an unconditional jump common BIP to ACC not BIN to ACC not RWD execute TIX to TDX to 0 was necessary to addition to the status of IR An example below will show why the test on IR10 is all that is necessary for the microcode The motivation behind the last five conditions is the same 7 determine the condition to be met if the jump is to be taken and branch back to the common fetch sequence if the condition is not met Thus consider the following execute code for BIP written as microinstructions 72 If ACC S 0 Then Go To Fetch 73 Go To BRU Branch ifNot ACC S 0 or ACC gt 0 A true software engineer such as the author of these notes will find this style 
of code archaic and truly appalling. However, remember that microcode is quite primitive. The only way to sequence the code is to use the GOTO statement, which can lead to problems in writing any code, be it microcode or 4GL. It is just that higher-level languages support better constructs.

Binary Microcode Example

We close this chapter with an example to show how to write binary microcode. This is taken from page 261 of the textbook. We begin with the microinstructions, convert these to control signals, and then generate the binary microcode.

65 BRU: MAR ← IR(7-0) + INDEX (compute the indexed address)
66 If IR(10) = 0 Then Go To M3
67 READ (the next three lines are the Defer cycle)
68 WAIT
69 MAR ← MBR
70 M3: PC ← MAR (do the jump by giving the PC a new value)
71 Go To Fetch
72 BIP: If ACC ≤ 0 Then Go To Fetch (fetch next instruction if ACC is not positive)
73 Go To BRU
74 BIN: If ACC ≥ 0 Then Go To Fetch (fetch next instruction if ACC is not negative)
75 Go To BRU

Notes on the code:

65: This is the standard computation of an indexed address. Recall that the bits IR(9-8) specify the index register to be used, so the control signals do not need to name a particular index register.

66: In the hardwired control unit, we specified that indirect addressing was to be used only if IR(10) = 1 and the instruction was not one of a list including RWD and WWD. Here we know that the instruction is one of BRU, BIP, or BIN, all of which allow indirect addressing.

67: The next three lines represent the Defer cycle. Note that this code is duplicated for the execute part of each assembly-language instruction that allows a defer cycle. One could also have a micro-subroutine to handle Defer, leading to a slightly more complicated design; here we have decided not to use micro-subroutines.

70: The way to do a jump or GOTO in any computer is to cram a new address into the program counter. That says where to find the next instruction.

72: Here we have decided to test for the condition that we do not want, and to fetch the next instruction unless we have what we want. The code could
have been written as

72 BIP: If ACC > 0 Go To BRU
73 Go To FETCH

with the same effect, but it was not. There is no real advantage one way or the other. Remember that microinstructions should be viewed as shorthand for control signals, which translate directly into binary microcode.

We now generate the control signals, first noting that the type 1 microinstructions cannot be translated into control signals. What we have is then a mix:

65 BRU: IR(7-0) → BUS1, INDEX → BUS2, add, BUS3 → MAR
66 If IR(10) = 0 Then Go To M3
67 READ
68 WAIT
69 MBR → BUS2, tra2, BUS3 → MAR
70 M3: MAR → BUS1, tra1, BUS3 → PC
71 Go To Fetch
72 BIP: If ACC ≤ 0 Then Go To Fetch (fetch next instruction if ACC is not positive)
73 Go To BRU
74 BIN: If ACC ≥ 0 Then Go To Fetch (fetch next instruction if ACC is not negative)
75 Go To BRU

Now for a little binary arithmetic to show the representation of the three addresses:

FETCH = 32 = 0100000
BRU = 65 = 64 + 1 = 1000001
M3 = 70 = 64 + 4 + 2 = 1000110

I find it helpful to generate the type 0 microcode separately.

[Table: the type 0 microcode, with columns BUS1, BUS2, BUS3, ALU, F1, F2, F3, and Mem]

Models of Computation

What is a computer? What can be computed? Theoreticians want to develop formal mathematical models of computation so that they can answer questions such as these with some precision. The theoretical models of computation devised in this study are often called either finite state machines or finite automata. The word "automata" is the plural of "automaton". Those studying computer architecture and digital design tend to use the term finite state machine, abbreviated as FSM; those studying formal language theory tend to use the term finite automaton. Your instructor will probably use the two terms indifferently.

Sample FSM: The 1101 Sequence Detector

This is an example your instructor often uses in courses on computer organization and architecture. This is a FSM to detect a sequence of binary bits: first a 1, then another 1, followed by a 0, and then a 1. We specify reading this left to right. We shall present the
complete design of this FSM, using a number of terms that will be defined only after we first see them and present them intuitively.

The idea of string prefixes will be somewhat important in this discussion. The set of prefixes for the string 1101 is the following:

P = {λ, 1, 11, 110, 1101}

The FSM will have five states, labeled q0, q1, q2, q3, and q4. The state q0 is a special state: the start state, or initial state. The state q4 is a special state: the final state, or accept state.

Initial FSM Design: Follow the Target Sequence

Here is the first step in the FSM design. Assume the desired sequence is received in the proper order. A four-bit sequence requires five states: the start state and four more.

[Figure: the linear chain q0 -1-> q1 -1-> q2 -0-> q3 -1-> q4]

Begin in state q0. Input of a 1 takes the FSM to state q1. In state q1, input of a 1 takes the FSM to state q2. In state q2, input of a 0 takes the FSM to state q3. In state q3, input of a 1 takes the FSM to state q4, and the sequence is accepted.

FSM Design: Consider Other Inputs

Suppose that the FSM is in state q0 and receives a 0 as input. The FSM cannot leave state q0 until it gets a 1 as input, so it just stays in state q0 and waits. Here is the partial design of the FSM that allows for this case.

NOTE: This looks much like a directed graph in which the edges are labeled. One big difference is the loop connecting vertex q0 to itself. We might use some graph theory in the analysis of FSM, but not much.

FSM Design: Other Unexpected Inputs

State q1: Input of a 1 takes the FSM to state q2. What about input of a 0? If the FSM is in state q1, then the string prefix 1 has been received, and after a 0 the last two symbols in the string are 10. But 10 is not one of the prefixes of the string 1101, so we go back to q0 and start over. Before continuing construction of this FSM, it is important to state the above observation in more precise language.

Prefixes and Suffixes

The string that we want to match is 1101. Its set of prefixes is P = {λ, 1, 11, 110, 1101}. If state q1 has the symbol 0 as input, the last two symbols in the input string at that point are 10.
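The prefix-and-suffix reasoning applied from here on can be mechanized. The sketch below (Python, used purely for illustration; the notes contain no code) computes the longest suffix of the input history that is also a prefix of the target 1101; the length of that suffix is exactly the index k of the state qk the FSM should occupy, so the whole transition function falls out of one rule.

```python
TARGET = "1101"

def longest_suffix_that_is_prefix(x: str) -> str:
    """Longest suffix of x (possibly the empty string, the lambda of
    the notes) that is also a prefix of TARGET."""
    for k in range(len(x), -1, -1):
        suffix = x[len(x) - k:]        # k = 0 gives the empty suffix
        if TARGET.startswith(suffix):
            return suffix
    return ""                          # unreachable: lambda always matches

def delta(k: int, symbol: str) -> int:
    """Next-state index: being in state q_k means the last k symbols
    were TARGET[:k]; append the new symbol and reapply the rule."""
    return len(longest_suffix_that_is_prefix(TARGET[:k] + symbol))

def accepts(bits: str) -> bool:
    """Run the detector; accept iff it halts in state q4."""
    k = 0                              # start state q0
    for s in bits:
        k = delta(k, s)
    return k == len(TARGET)

# The cases worked by hand in the text:
assert longest_suffix_that_is_prefix("10") == ""     # q1 on 0 -> q0
assert longest_suffix_that_is_prefix("111") == "11"  # q2 on 1 -> q2
assert longest_suffix_that_is_prefix("1100") == ""   # q3 on 0 -> q0
assert accepts("1101") and accepts("0011011101")
assert not accepts("1100")
```

This "longest suffix that is a prefix" rule is the same idea as the failure function of classical string matching.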
Call these last two symbols the string x. The set of suffixes of this string is S = {λ, 0, 10}. What is the longest suffix of x that is a prefix of the target string? 10 is not a prefix; 0 is not a prefix; λ is a prefix. We go back to the start state q0.

IMPORTANT NOTE: When we get to state qk for k ≥ 1, all we can say about the input string is what the last k symbols were. In other words, we know a k-symbol suffix of the input string.

FSM Design: Unexpected Inputs to q2

State q2: Input of a 0 takes the FSM to state q3. What about input of a 1? If the FSM is in state q2, the last two symbols input were 11. If another 1 is input, the last three symbols input were x = 111. The set of suffixes of this string is S = {λ, 1, 11, 111}. The set of prefixes of the target string is P = {λ, 1, 11, 110, 1101}. The longest suffix of the string x that is a prefix of the target string is 11. But having a two-symbol suffix of 11 is what puts the FSM in state q2. So the FSM in state q2, on receiving an input of 1, stays in that state.

FSM Design: Unexpected Inputs to q3

State q3: Input of a 1 takes the FSM to state q4. What about input of a 0? If the FSM is in state q3 and has input 0, the last four symbols input are 1100. The set of suffixes of this input string is S = {λ, 0, 00, 100, 1100}. The set of prefixes of the target string is P = {λ, 1, 11, 110, 1101}. 1100 is not a prefix of the target string; 100 is not a prefix; 00 is not a prefix; 0 is not a prefix; λ is a prefix. If the FSM is in state q3, an input of 0 takes it to state q0.

The Almost Complete FSM Design

Here is the FSM design up to this point. Note that it has been redrawn slightly to avoid a confusion of lines. What about the accept state q4?

Handling Input to the Accept State

In state q4, the last four symbols input were 1101. This is the target string. The set of prefixes of the target string is P = {λ, 1, 11, 110, 1101}. If a 0 is input next, then the last five symbols were x = 11010. The set of suffixes of this string is S = {λ, 0, 10, 010, 1010, 11010}. Only λ is a prefix of
the target string, so a 0 takes the FSM back to q0. If a 1 is input next, then the last five symbols were x = 11011. The set of suffixes of this string is S = {λ, 1, 11, 011, 1011, 11011}. The longest suffix of x that is a prefix of the target string is 11. Go to q2.

The Complete FSM Design for the 1101 Acceptor

Here is the complete finite automaton to accept a 1101 binary string. Why should one be interested in this FSM?

1. It is simple, yet sufficient to illustrate the concepts of importance to finite automata, which underlie tools (such as compilers) that are useful and used.
2. Sequence detectors are occasionally used. For example, an input line might be monitored for 11010011, which is the odd-parity 8-bit ASCII code for the character "S", input to start a session.

Comments on the FSM Design

1. The states are labeled as q0, q1, q2, q3, and q4. Theoreticians who work with finite automata prefer to label the states with the lowercase q.
2. The state q4 has a double circle around it. This marks it as an accept state, also called a final state. Finite automata may have any number of accept states, including zero (for no accept states).
3. The FA has a start state q0, indicated by an arrow pointing at it from nowhere. The FA starts here.
4. There is no concept of a real clock, such as seen in courses on computer organization. These FA are mathematical models.
5. This finite automaton is said to accept any string terminating in 1101. We now call this FSM a recognizer for any such string.

Terminology for Finite Automata

This example of a finite automaton is classified as a deterministic finite accepter, or DFA. Formally, such a finite automaton is defined as a quintuple (5-tuple):

M = (Q, Σ, δ, q0, F)

We use the 1101 sequence detector to illustrate this definition.

Q denotes a finite set of states. In the 1101 acceptor as defined above, we see that Q = {q0, q1, q2, q3, q4}.

Σ denotes a finite set called the alphabet of symbols. For this FA we have Σ = {0, 1}, as the strings presented to the FA will consist only of those symbols.

δ denotes the transition
function. Here is a state-table representation of the function δ:

State | on 0 | on 1
q0    | q0   | q1
q1    | q0   | q2
q2    | q3   | q2
q3    | q0   | q4
q4    | q0   | q2

This can be written as δ(q0, 0) = q0, δ(q0, 1) = q1, δ(q1, 0) = q0, δ(q1, 1) = q2, δ(q2, 0) = q3, δ(q2, 1) = q2, δ(q3, 0) = q0, δ(q3, 1) = q4, δ(q4, 0) = q0, δ(q4, 1) = q2.

Mathematicians would say that δ: Q × Σ → Q. Mathematicians often describe a function as F: D → R, where D, the set of all possible inputs to F, is called the domain, and R, the set of all possible outputs of F, is called the range. We say that δ: Q × Σ → Q to imply that:

1. The input to the function is a 2-tuple of the form (q, x), with q ∈ Q and x ∈ Σ.
2. The output of the function is a state q′ ∈ Q.

The function takes the 2-tuple (q, x) as input and maps it to a state q′ ∈ Q. The function is called a total function because it produces an output q′ ∈ Q for every possible 2-tuple (q, x) ∈ Q × Σ.

Definition of Q × Σ: For this example we have Q = {q0, q1, q2, q3, q4} and Σ = {0, 1}, so

Q × Σ = {(q0, 0), (q1, 0), (q2, 0), (q3, 0), (q4, 0), (q0, 1), (q1, 1), (q2, 1), (q3, 1), (q4, 1)}

q0 ∈ Q denotes the start state. Not much more to say here; in our example the start state is q0.

F ⊆ Q is the set of accept states, or final states. For our example we have F = {q4}, a set with one element.

Example for Analysis

Here is one example, called M1 (for "Machine 1").

1. How many states does M1 have, and what are they? The finite automaton M1 has three states, labeled q1, q2, and q3, since these are the labels found in the three circles. Thus Q = {q1, q2, q3}.

2. What is the alphabet (the set of input symbols) for M1? The input symbols for M1 are 0 and 1, as these are the only labels found on the arrows that represent the transitions. Thus Σ = {0, 1}. Really, all we know is that {q1, q2, q3} ⊆ Q and that {0, 1} ⊆ Σ.

Example for Analysis, Part 2

3. What is the start state for M1? The start state is q1, the state shown with an unlabeled input arrow coming from nowhere. The reader will note that state q1 is drawn leftmost on the diagram, as is often the case, but this is just conventional and does not imply anything about q1.

4. What is the set of accepting (final) states
for M1? The only state marked with the double circle is q3, so F = {q3}.

NOTE: This is an old example from a class in which I used q1 as the start state. I did not want to redraw all my figures, so I stayed with the original.

Privileged Instructions

Computer instructions are usually divided into two classes: user instructions and privileged instructions. User instructions are those that are not privileged. Instructions can be labeled as privileged for a number of reasons.

Confusion: Instructions such as input/output instructions can cause difficulties if executed directly by the user. Consider output to a shared print device.

Security: Instructions such as memory management can cause severe security problems if executed by the user: I can directly read and corrupt your program memory.

Rings of Protection

The simple security models in a computer call for rings of protection. The protection rings offered by the Pentium 4 architecture (IA-32) are fairly typical. At any level, attempts to read data at higher (less protected) rings are permitted; attempts to read data at lower (more protected) rings are not permitted and cause traps to the operating system.

Implementing Rings of Protection

There are two options for these rings of protection:

1. Implementation in software.
2. Direct implementation in hardware, with sufficient hardware support.

Early experience with the MULTICS operating system showed that direct hardware implementation is necessary. The reason for this is the efficiency of cross-ring procedure calls: with protection rings emulated in software, these calls take too much time. Experience with MULTICS with protection rings implemented in software showed that system programmers placed non-security-critical software into the kernel just to avoid the time delays associated with calling kernel code from the outside. Experience with MULTICS running on a computer with hardware support for protection rings showed that cross-ring calls were not noticeably less efficient than calls within the
same protection ring.

The PSW (Program Status Word)

This is often called the PSL (Program Status Longword), as with the VAX-11/780. It is not really a word or longword, but a collection of bits associated with the program being executed. Some bits reflect the status of the program under execution:

N: the last arithmetic result was negative.
Z: the last arithmetic result was zero.
V: the last arithmetic result caused an overflow.
C: the last arithmetic result had a carry-out.

The security-relevant parts of the PSW relate to the protection ring that is appropriate for the program execution. The VAX-11/780 and the Pentium 4 each offered four protection rings; the ring number was encoded in a two-bit field in the PSW. The VAX stored both the ring for the current program and that for the previous program in the PSW. This allowed a program running at kernel level (level 00) to determine the privilege level of the program that issued the trap for its services.

Commercialization of the Protection Rings

Early computers had operating systems tailored to the specific architecture. Examples of this are the IBM 360 with OS/360, and the VAX-11/780 with VMS. More modern operating systems, such as UNIX, are designed to run on many hardware platforms, and so use the lowest common denominator of protection rings. This figure shows the IA-32 protection rings as intended and as normally implemented.

As intended: Ring 3, Application; Ring 2, OS Services; Ring 1, Device Drivers; Ring 0, OS Kernel.
As implemented: Ring 3, Application; Ring 0, the entire OS.

The problem here is that any program that executes with more than user privileges must have access to all system resources. This is a security vulnerability.

SPOOLING: A Context for Discussing OS Services

The term SPOOL stands for System Peripheral Operation On-Line. Direct access to a shared output device can cause chaos. The spooling approach calls for all output to be written to temporary files. The spool manager has sole control of the shared printer and prints the files in the order in which they were closed.

[Figure: several processes writing temporary spool files, with the spool manager alone writing to the shared printer]

Privileges: User vs. SPOOL

A User Program
must be able to create a file and write data to that file. It can read files in that user's directory, but usually not in another user's directory. It cannot access the shared printer directly.

The Print Manager must be able to read a temporary file in any user's directory and delete that file when done, and to access a printer directly and output directly to that device. It should not be able to create new files in any directory.

The Print Manager cannot be run with Level 3 (Application) privilege, as that would disallow direct access to the printer and read access to the users' temporary files. Under current designs, the Print Manager must be run with Superuser privileges, which include the ability to create and delete user accounts, manage memory, etc. This violates the principle of least privilege, which states that an executing program should be given no more privileges than necessary to do its job. We need at least four fully implemented rings of privilege, as well as specific role restrictions within a privilege level.

Memory Segmentation

Memory paging divides the address space into a number of equal-sized blocks called pages. The page sizes are fixed for convenience of addressing. Memory segmentation divides the program's address space into logical segments, into which logically related units are placed. As examples, we conventionally have code segments, data segments, stack segments, constant-pool segments, etc. Each segment has a unique logical name. All accesses to data in a segment must be through a <name, offset> pair that explicitly references the segment name. For addressing convenience, segments are usually constrained to contain an integral number of memory pages, so that the more efficient paging can be used.

Memory segmentation facilitates the use of security techniques for protection. All data requiring a given level of protection can be grouped into a single segment, with protection flags specific to giving that exact level of protection. All code requiring protection can be placed into a code segment and
also protected. It is not likely that a given segment will contain both code and data; for this reason, we may have a number of distinct segments with identical protection.

Segmentation and Its Support for Security

The segmentation scheme used for the MULTICS operating system is typical. The two-part MULTICS address comprises an 18-bit segment number, a 6-bit page number, and a 10-bit offset within the page. The segment number selects a segment descriptor; that descriptor's page table maps the page number to a page frame, to which the offset is applied.

[Figure: the two-part MULTICS address, showing the segment descriptor, the page table, and the page frame]

Each segment has a number of pages, as indicated by the page table associated with the segment. The segment can have a number of associated security descriptors. Modern operating systems treat a segment as one of a number of general objects, each with its Access Control List (ACL) that specifies which processes can access it.

More on Memory Protection

There are two general protections that must be provided for memory. Protection against unauthorized access by software can be provided by a segmentation scheme similar to that described above: each memory access must go through both the segment table and its associated page table in order to generate the physical memory address.

Direct Memory Access (DMA) provides another threat to memory. In DMA, an input device such as a disk controller can access memory directly, without the intervention of the CPU. It can issue physical addresses to the MAR and write directly to the MBR. This allows for efficient input/output operations. Unfortunately, a corrupted device controller can write directly to memory not associated with its application. We must protect memory against unauthorized DMA. Some recent proposals for secure systems provide for a No-DMA Table that can be used to limit DMA access to specific areas of physical memory.

Securing Input and Output

Suppose that we have secured computing. How can we insure that our input and output are secure against attacks such as key logging and screen scraping?

Input Sequence: Suppose that we want to input an "A". We press the Shift key and then
the A key. The keyboard sends four scan codes to the keyboard handler, operating in user mode. This might be 0x36, 0x1E, 0x9E, 0xB6: pressing the Shift key, then pressing the A key, then releasing the A key, then releasing the Shift key. This sequence is then translated to the ASCII code 0x41, which is sent to the OS. Either the scan codes or the ASCII code can be intercepted by a key logger.

Output Sequence: When the program outputs data, it is sent to the display buffer. The display buffer represents the bit maps to be displayed on the screen. While it does not directly contain ASCII data, just its pictures, it does contain an image that can be copied and interpreted. This is called screen scraping.

Protecting the Code under Execution

We can wrap the CPU in many layers of security so that it correctly executes the code. How do we assure ourselves that the code being executed is the code that we want? More specifically, how do we insure that the code being executed is what we think it is and has not been maliciously altered?

One method to validate the code prior to execution is called a cryptographic hash. One common hash algorithm is called SHA-1 (Secure Hash Algorithm 1). This takes the code to be executed, represented as a string of 8-bit bytes, and produces a 20-byte (160-bit) output associated with the input. The hardware can have another mechanism that stores what the 20-byte hash should be. The hardware loads the object code, computes its SHA-1 hash, and then compares it to the stored value. If the two values match, the code is accepted as valid.

What Is a Cryptographic Hash?

First we begin with the definition of a hash function. It is a many-to-one function that produces a short binary number that characterizes a longer string of bytes. Consider the two characters "A" and "C", with ASCII codes 0100 0001 and 0100 0011. One hash function would be the parity of each 8-bit number: 0 and 1 (even and odd). Another would be the exclusive OR of the sequence of 8-bit bytes:

A   0100 0001
C   0100 0011
XOR 0000 0010

The hash
function must be easy to compute for any given input. A cryptographic hash function has a number of additional properties:

1. A change of any single bit in the input being processed changes the output in a very noticeable way. For the 160-bit SHA-1, it changes about 80 of the 160 bits.

2. While it is easy to compute the SHA-1 hash for a given input, it is computationally infeasible to produce another input with the identical 20-byte hash. Thus, if a code image has the correct hash output, it is extremely probable that it is the correct code image and not some counterfeit.
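Both hashes in this section can be checked with a few lines of Python, using the standard-library hashlib module (the input strings below are made up for the demonstration). The first assertion reproduces the XOR hash of "A" and "C"; the rest flips one bit of an input and counts how many of the 160 SHA-1 output bits change.

```python
import hashlib

# The exclusive-OR hash from the text: 'A' XOR 'C' = 0000 0010.
assert ord("A") ^ ord("C") == 0b00000010

def sha1_bits(data: bytes) -> str:
    """The 160-bit SHA-1 digest of data, written out as a bit string."""
    return format(int(hashlib.sha1(data).hexdigest(), 16), "0160b")

# Flip a single bit of the input (ASCII 'e' and 'd' differ in one bit)
# and count the output bits that change; the avalanche property says
# roughly half of the 160 bits, about 80, should differ.
h1 = sha1_bits(b"some object code")
h2 = sha1_bits(b"some object codd")
changed = sum(a != b for a, b in zip(h1, h2))
assert 40 < changed < 120      # loose band around the expected ~80
```

A hardware loader performing the validation described above would compute such a digest over the object code and compare it, bit for bit, with the stored reference value.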
