
Intro Computer Architecture (CSE 141)

by: Jacey Olson (GPA 3.69)



About this Document

Class Notes

This 55-page set of class notes was uploaded by Jacey Olson on Thursday, October 22, 2015. The notes belong to CSE 141 at the University of California - San Diego, taught by Staff in Fall 2015. Since its upload, it has received 14 views. For similar materials, see /class/226792/cse-141-university-of-california-san-diego in Computer Science and Engineering at the University of California - San Diego.


Date Created: 10/22/15

Pipelining

- Quiz
- Introduction to pipelining

[Figure: a block of combinational logic (10 ns) split by latches into pipeline stages; timeline marks at 10 ns, 20 ns, 30 ns]

- What's the latency for one unit of work? What's the throughput?

Pipelining

1. Break up the logic with latches into "pipeline stages."
2. Each stage can act on different data.
3. Latches hold the inputs to their stage.
4. Every clock cycle, data transfers from one pipe stage to the next.

[Figure: Logic (10 ns) divided into stages of Logic (2 ns) each]

Critical path review

- The critical path is the longest possible delay between two registers in a design.
- The critical path sets the cycle time, since the cycle time must be long enough for a signal to traverse the critical path.
- Lengthening or shortening non-critical paths does not change performance.
- Ideally, all paths are about the same length.

Pipelining and Logic

[Figure: a chain of logic blocks cut into three by pipeline registers]

- Hopefully the critical path is reduced by 3x.

Limits on Pipelining

- You cannot pipeline forever.
- Some logic cannot be pipelined arbitrarily (e.g., memories).
- Some logic is inconvenient to pipeline: how do you insert a register in the middle of an adder?
- Registers have a cost:
  - They cost area (choose narrow points in the logic).
  - They cost time: extra logic delay, plus setup and hold times.

Pipelining Overhead

- Logic delay (LD): how long the logic takes, i.e., the useful part.
- Set-up time (ST): how long before the clock edge the inputs to a register need to be ready.
- Register delay (RD): the delay through the internals of the register.
- BaseCT, the cycle time before pipelining:
      BaseCT = LD + ST + RD
      Total delay = BaseCT
- PipeCT, the cycle time after pipelining N times:
      PipeCT = ST + RD + LD/N
      Total time = N*ST + N*RD + LD

Pipelining Difficulties

- You cannot put registers just anywhere.
- You may not have access to the internals of some blocks (e.g., memories).
- Balancing the path lengths is challenging.
- There are many more potential critical paths in a pipelined design.

[Figure: alternating fast and slow logic blocks; after inserting registers, the critical path only went down a bit]

How to pipeline a processor

- Break each instruction into pieces; remember the basic algorithm for execution: fetch, decode, collect arguments, execute, write back results, compute the next PC.

The classic 5-stage MIPS pipeline

- Fetch: read the instruction.
- Decode: decode and read from the register file.
- Execute: perform arithmetic ops and address calculations.
- Memory: access data memory.
- Write back: store results in the register file.

Impact of Pipelining

- Break the processor into P pipe stages. What happens to latency?
      L = Inst * CPI * CycleTime
- The cycle time becomes CT/P.
- CPI is an average (cycles/instructions). When the number of instructions is large, CPI approaches 1; for just one instruction, CPI = P.

Pipelined Datapath

[Figure, built up over several repeated slides: PC with "add 4" logic, instruction memory, register file (two read ports and a write port), sign extend (16 to 32 bits), shift-left-2 and branch-target adder, ALU, and data memory, separated by pipeline registers]
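The overhead formulas above can be sanity-checked numerically. A minimal sketch in Python; the delay values (LD = 10 ns, ST = RD = 0.5 ns) are illustrative assumptions, not numbers from the notes:

```python
# Cycle time and total latency before/after pipelining, per the
# formulas above: BaseCT = LD + ST + RD, PipeCT = ST + RD + LD/N.

def base_cycle_time(LD, ST, RD):
    # Unpipelined: one stage holds all the logic.
    return LD + ST + RD

def pipelined_cycle_time(LD, ST, RD, N):
    # Pipelined N ways: the logic delay is split, but every stage
    # still pays the full register overhead (ST + RD).
    return ST + RD + LD / N

LD, ST, RD = 10.0, 0.5, 0.5
for N in (1, 2, 5, 10):
    ct = pipelined_cycle_time(LD, ST, RD, N)
    total = N * ct  # latency of one unit of work = N*ST + N*RD + LD
    print(N, ct, total)
```

Note how the total latency grows with N even as the cycle time shrinks: register overhead is why you cannot pipeline forever.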
Simple Pipelining Control

- Compute all the control bits in decode, then pass them from stage to stage.
- It won't stay this simple.

Pipelining is Tricky

- If all the data flows in one direction, pipelining is relatively easy. Not so for processors:
  - Decode and write back both access the register file.
  - Branch instructions affect the next PC.
  - Instructions need values computed by previous instructions.

Not just tricky: Hazardous

- Hazards are situations where pipelining does not work as elegantly as we would like.
- They are caused by backward-flowing signals, or by a lack of available hardware.
- Three kinds:
  - Data hazards: an input is not available on the cycle it is needed.
  - Control hazards: the next instruction is not known.
  - Structural hazards: we have run out of a hardware resource.
- Detecting, avoiding, and recovering from these hazards is what makes processor design hard. (That, and the Xilinx tools.)


Procedures, Assembly Alternatives, and Linking/Loading
(Chien, CSE 141, October 15, 2002)

Last Time

- j, jr, jal: long-distance transfers and procedure linkage.
- Stacks (grow to lower addresses) and optimizing stack arithmetic.
- Symbolic names for registers.

This Time

- Quiz 3.
- Detailed procedure call example: recursion, stack discipline, calling conventions.
- Alternative assembly/machine languages: RISC vs. CISC.
Announcements

- Reminder: Homework 2 due Tuesday, October 22, 2002, in lecture.

Basics of Procedure Call (review)

- jal target: jump and link; $31 <- PC + 4.
- jr $31: jump back to the caller of the procedure.
- Recursive procedures must push return addresses on the stack, and pop them before returning.

Recursive Procedure (simple)

    # $a0 passed through; return address pushed on the stack,
    # popped before return; $31 at call = $31 at return
            movi  $a0, 10
            jal   rec_proc
            addi  $a3, $a3, 1
    rec_proc:
            sw    $31, 0($sp)       # push rtn address
            addi  $sp, $sp, -4
            beq   $a0, $0, return   # is arg0 == 0?
            subi  $a0, $a0, 1
            jal   rec_proc          # recursive call
    return:
            lw    $31, 4($sp)       # pop rtn address
            addi  $sp, $sp, 4
            jr    $31

Full MIPS Calling Conventions

    $0                         constant 0
    $1 ($at)                   reserved for assembler and OS
    $2-$3 ($v0-$v1)            return values
    $4-$7 ($a0-$a3)            arguments
    $8-$15, $24-$25 ($t0-$t9)  caller-saved registers
    $16-$23 ($s0-$s7)          callee-saved registers
    $26-$27 ($k0-$k1)          reserved for assembler and OS
    $28 ($gp)                  global pointer
    $29 ($sp)                  stack pointer
    $30 ($fp)                  frame pointer (base of stack frame)
    $31 ($ra)                  return address (from jal)

- We're down to 8 $s regs, 10 temps, 4 arg registers, and 2 return values.

Calling Conventions

- Caller-saved ($t0-$t9): the caller must save and restore these if it wants to preserve their values; the callee may use them indiscriminately.
- Callee-saved ($s0-$s7): the caller can assume these are preserved across calls; the callee must save and restore them if it wants to use the registers.

Procedure Call (symbolic regs)

            addi  $a0, $0, 23       # call arguments
            sub   $a1, $a0, $t0
            and   $a2, $a1, $v0
            jal   proc
            add   $t0, $a0, $a1     # trashes $t0: caller saves
            add   $s0, $s0, $v0
            or    $t1, $a2, $a1     # trashes $t1

    proc:   sw    $ra, 0($sp)       # how many local temps?
            addi  $sp, $sp, -20
            sw    $s0, 16($sp)      # save the callee-save registers
            sw    $s1, 12($sp)
            sw    $s2, 8($sp)
            sw    $s3, 4($sp)
            ...                     # use $s0-$s3 in here
            move  $v0, $t1          # set up return value
            lw    $s0, 16($sp)      # restore the callee saves
            lw    $s1, 12($sp)
            lw    $s2, 8($sp)
            lw    $s3, 4($sp)
            lw    $ra, 20($sp)      # restore return address
            addi  $sp, $sp, 20      # stack management
            jr    $ra

Procedure Call summary

- jal and jr for linkage; push $ra if the procedure makes calls.
- $sp ($29) and optimized stack arithmetic.
- Caller saves don't persist over calls; callee saves must be saved before trashing.
- Return values in $v0-$v1.

Miscellaneous Registers

- $at, $k0, $k1: used by the assembler, OS, macros, etc.
- $gp ($28): global pointer; points to global/static space (static C variables, constants, etc.).
- $fp ($30): frame pointer; points to the base of the current procedure frame, and can simplify argument addressing.
- Approximately 18 "real" usable registers; all the others have some special usage.

A few more details: stack frames

[Figure: a stack frame holding args 5, 6, 7, saved registers, and local variables, bracketed by $fp and $sp]

- More than 4 args: pass the rest on the stack; registers are saved atop the stack.
- How do you know how many args? Convention.
- $fp and $sp; registers and locals are under compiler control.
- The stack grows to LOWER addresses.

Detailed Example: Factorial

    int fact(int n) {
        if (n < 1) return 1;
        else return n * fact(n - 1);
    }

    fact:   subu  $sp, $sp, 16
            sw    $ra, 16($sp)
            beq   $a0, $0, no_recurse
            sw    $a0, 12($sp)      # save argument n
            subi  $a0, $a0, 1       # n - 1
            jal   fact              # recursive call
            lw    $t0, 12($sp)
            mul   $v0, $t0, $v0     # n * fact(n - 1)
            j     return
    no_recurse:
            addi  $v0, $0, 1        # set return val
    return: lw    $ra, 16($sp)
            addu  $sp, $sp, 16
            jr    $ra

[Figure, built up over three slides: the stack frame as the call proceeds, from the base: return address, saved $a0, "locals"]

Stack After Recursive Calls

- The stack grows to lower addresses; calls allocate and deallocate according to stack discipline ("clean up your mess").
- Where are the caller-saved regs for fact(n) in the stack? Where are the callee saves? The return links?

Summary: Procedure Call

- Procedure linkage; calling conventions and register usage; stack structure and organization.
- Now you should be able to write assembly programs that interoperate with C and make recursive calls.

Perspective on the MIPS Instruction Set

- Definitely a RISC (Reduced Instruction Set Computer):
  - Instructions are simple and regular.
  - There are only a modest number of operation codes.
  - There are VERY few addressing modes (displacement, some PC-relative).
  - Instructions are all the same size, with a few fixed formats.
- One of the original academic RISC projects that went commercial (the other is the SPARC, the Sun architecture).

Assembly Overview for the MIPS architecture

- Approximately 40 add-class opcodes, 20 control transfers (jump and branch), 25 memory instructions, 25 floating-point instructions, and 5 miscellaneous: still nearly 150 instructions.
- 3 formats, all 32 bits; only 2 addressing modes; 32 general-purpose registers.

MIPS Overview (cont.)

- Most of these features aid the implementation: fast and simple.
- Simple instruction sets are good for exposition of the minimal functionality required.
- Not all instruction sets are this way. Why? Evolution, and learning how to do things better; previously, lots of assembly code was written by programmers, so the focus was on providing higher-level interfaces.
- What about the others: VAX, PowerPC, and the 80x86 family (the most common computer processor)?

A high-end design point: DEC VAX

- VAX = Virtual Address Extension: models 11/730 through 11/780.
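The factorial example's stack discipline can be re-traced in a high-level language. A sketch in Python mirroring the MIPS code above; the explicit list plays the role of the stack, and the frame layout is simplified for illustration:

```python
# Recursive factorial with an explicit stack, mirroring the MIPS
# version: save the argument before the recursive call, recurse,
# then pop it back and multiply.

def fact(n, stack=None):
    if stack is None:
        stack = []
    if n < 1:
        return 1                 # no_recurse: set return val to 1
    stack.append(n)              # sw $a0,12($sp): save argument n
    result = fact(n - 1, stack)  # subi $a0,$a0,1 ; jal fact
    n_saved = stack.pop()        # lw $t0,12($sp): restore n
    return n_saved * result      # mul $v0,$t0,$v0: n * fact(n-1)

print(fact(5))  # 120
```

Python's call stack already saves locals across calls, of course; the explicit list just makes the push/pop pairing of the MIPS frames visible.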
- Produced 1978-1987; the 8xxx series from 1988 to present.
- 16 data registers, plus a few special ones.
- Orthogonal architecture: any opcode with any addressing mode.
- Many operations (~250) and many addressing modes (8), in all possible combinations; the hardware must deal with all of them.
- Many formats; the alignment of fields depends on the addressing modes.
- Complex operations like polynomial evaluate and block move; special call/return instructions.
- Execution times from 5 to 5000 cycles; extremely complex implementations.
- Subsetting argument: compilers.

A Modern Alternative: the IBM/Motorola PowerPC Architecture

- Designed 1989-1993; leverages RISC ideas.
- 32 general-purpose registers; segment registers for a larger address space.
- A simple instruction set, with a few compound operations added for critical inner-loop operations: auto-increment/decrement modes, multiply-accumulate.
- Branch architecture: multiple sets of condition codes; separate test and use of the information; a distinct namespace.
- Designed for multiple issue: a modern RISC.
- All of these machines are getting quite complex.

Why don't computer instruction sets always reflect the state of the art in understanding?

- Differences of workload and opinion.
- Binary compatibility requirements, with periodic opportunities for technology inclusion (ISA revision).
- Binary compatibility (BC) vs. assembly language programming (ALP):
  - BC: old programs and executables should run, and run well.
  - ALP: assembly language should be understandable and easy to use.
  - These are distinct constraints. BC admits compatible extensions to add new features (complex interfaces, and translation for execution), with few of the constraints of lower-level programming.

80x86 Architecture: Computer Evolution

Intel Microprocessor History

- 4004: the first single-chip microprocessor (1971), a 4-bit machine in the era of rack-sized machines. Its utility was unclear; the target was the pocket calculator market.
- 8008, 8080, 8085: the first 8-bit micros (the "bug book"); the birth of hobbyist microcomputers (Z80s).
- 8086/8088: 8/16-bit micro, 1979. The basis of the IBM PCs; lots of microcontroller usage.
- 80286: 24-bit addresses (PC/AT); little native code (real mode).
- 80386: 32-bit addressing, the modern PC; still lots of 16-bit code (Win16 vs. Win32).
- 486, Pentium, PIII, PIV: minor ISA changes.
- Merced / EPIC IA-64 (Itanium): major changes, including 64-bit addressing, new instruction formats, 256 registers, and EPIC, with significant hardware for compatibility.

Implications of Historical ISAs

- Popular microprocessors mean a significant software base.
- Competitive advantage: preserve the software base, prevent defections.
- Costs/benefits favor compatibility at each step, though exponential growth means features are eventually phased out.
- The "evolution of an architecture" looks like layers of revisions.
- The resource limits of early microprocessors were similar to the first computers of the '50s, so Intel's 8080 was a single-accumulator architecture.

Initial Architecture: 8086

- 8/16-bit architecture (8008 compatibility): AX (AH/AL), BX, CX (CH/CL), DX, plus SP, BP, SI, DI, and IP (the program counter, not shown).
- Accumulator and special registers vs. 32 GPRs: fewer interconnects, and operations wired into registers.
- 8-bit and 16-bit register names.
- Why special regs? Instructions with fewer operands.
- A 2-address architecture, but the registers are not general purpose, so addressing modes are linked to registers (e.g., stack offset, base offset, etc.): you MUST get values into the right registers.

Addressing modes and the single-accumulator architecture

- The 8080 was a true single-accumulator machine.
- The 8086 extended it with many registers, but most were special-use:
  - instructions used special registers (DH, DL, etc.);
  - addressing modes use special registers (SP, BX, etc.);
  - procedure call linkage instructions use special registers.
- An asymmetric tangle of operations, addressing modes, and registers: lots of fun hacking.

Asymmetry and Orthogonality

- Asymmetric architectures couple the use of registers and the selection of operations, which makes it harder to generate good code.
- There are 17 ways to clear accumulator A!

How do you clear accumulator A?

- Load constant 0; subtract AX from AX; XOR AX with itself; shift left 16 positions; ...
- Which is fastest, and why? Compilers needed to choose; few programmers knew this well.

24-bit and 32-bit extensions

- 80286: 24-bit external addresses. Segment register usage changed (info in the segment register is used to produce 24-bit addresses from 16-bit registers, a 24-bit segment base), with little change to the basic programming model.
- 80386: 32-bit external addresses and 32-bit registers; updated register files; operations added to make more orthogonal sets of operations; opcode modifiers to extend the functionality.

Updated Register Files

- 16-bit data registers became 32-bit; segment registers stayed the same size; IP and Flags were extended (EFlags).
- Modes: addressing and operations, datatype sizes, and segment registers/modes.
- Mode bits and modifier bytes are exploited to condition the operations.

Extending an Instruction Set: Encoding

- Basic instruction: the old encoding. Extended operation: a modifier (32-bit operations). The trick works because modifier bytes were previously supported.
- Complex decoding and instruction formats, 1 to 17 bytes: prefix, opcode, address specifiers, displacements, immediates.
- How does this compare to the flexibility in DLX?

Extending the usability of Registers

- Operations x registers: how do you make special-purpose registers more general purpose?
- Fill in the gaps with new instruction encodings: prefixes, modes, mod bits.

Encoding Results

- A complex and varied instruction set: it breaks all the rules, but is not as complex as the VAX.
- Intel was able to make it fast and competitive with other architectures (within 2x). How?
- P6 (PPro/PII/PIII): aggressive prefetching and instruction translation into RISC-like micro-ops, with a simple, fast pipeline to support these micro-operations, similar to a RISC processor.
- On-the-fly translation: pros and cons?

Fast 80x86 Implementation: micro-ops (uops)

- RISC-like basic structure.
- Pros: code density, compatibility. Cons: design effort, complexity, latency.
- A similar strategy is pursued by other CISC-ish architectures.

80x86 Statistics and Usage (integer codes; counts in millions)

                          compress   eqntott   espresso
    Data opns (x86)          61.5      23.0       75.1
    Ratio to DLX             1.03      1.25       1.58
    Total insts (x86)       222.6     120.3      221.6
    Inst ratio to DLX        0.61      1.74       0.85

- Instruction set design affects usage.
- Instruction counts are comparable but varied: 1.03x overall.
- Data accesses are greater for the 80x86, 1.23x overall: the lack of registers stresses the memory system.
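The 1-to-17-byte instruction format is a large part of why x86 decode (and the uop translation) is expensive: instruction boundaries must be discovered sequentially. A toy sketch of length decoding; the format here (one optional prefix byte, an opcode byte, and an opcode-dependent operand count) is entirely made up for illustration and is not real x86 encoding:

```python
# Variable-length decode: each instruction's length depends on its
# own bytes, so boundaries are found one instruction at a time.
# Hypothetical opcodes mapped to their operand-byte counts:
OPERAND_BYTES = {0x01: 2, 0x02: 1, 0x03: 0}

def instruction_lengths(code):
    """Return the byte length of each instruction in the stream."""
    lengths, i = [], 0
    while i < len(code):
        n = 0
        if code[i] == 0x66:                  # optional prefix byte
            n += 1
        n += 1 + OPERAND_BYTES[code[i + n]]  # opcode + operands
        lengths.append(n)
        i += n
    return lengths

print(instruction_lengths(bytes([0x03, 0x66, 0x01, 0xAA, 0xBB, 0x02, 0xCC])))
# [1, 4, 2]
```

A fixed-length format like MIPS or DLX knows every boundary up front (the stream is just 4-byte chunks), which is exactly the decode simplicity the notes credit to RISC.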
Output CS 141 Chien 2 March 2 2000 Page 1 Component Speeds 0 Processor Speeds 7 Intel Pentium III 800 Mhz 7 Compaq Alpha 21264 1000 Mhz 7 Others 0 Memory Speeds 7 MacPCWorkstation DRAM 50 ns 7 Disks are even slower 0 How can we span this access time gap 7 1 instruction fetch per instruction 7 15100 instructions also do a data read or Write load or store CS 141 Chien 3 March 2 2000 Locality For some period most of the references are in the shaded re 39ons 9 Property of memory references in typical programs 9 Tendency to favor a portion of their address space at any given time 9 A key elements in computer organization for high performance CS 141 Chien 4 March 2 2000 Page 2 Aspects of Locality o Temporal 7 Tendency to reference locations recently referencted xx t spatial likelyturefaencex r Tendency to reference locations near those recently referenced hkely reference zune CS 141 Chien 5 March 2 2000 Memory References execute a program look at what crosses here Q Memory reference streams traces o Memory requests addresses data o Focus on the ADDRESSES CS 141 Chien 6 March 2 2000 Page 3 Temporal Locality in time 0 Program Reference Stream instruction and memory 110 264 Sources of temporal locality 111 112 Instruction fetch loops 272 muse t 113 Locals loops repeated mvocations 110 Data Structures 264 111 1 xxx Instruction fetches IF 292 113 xxxDataMemory Access CS 141 Chien 7 March 2 2000 Spatial Locality in space 110 264 111 0 Sources of spatial 112 locality 272 113 7 Instruction sequence 110 7 Indexing Arrays 7 Locals in a stackframe 112 7 Contiguous allocation 292 113 XXX Instruction fetch memory access CS 141 Chien 8 March 2 2000 Page 4 Importance of Locality 0 Why does locality matter 7 An opportunity for the computer designer make the computer cheaper faster etc process needs 256MB M 1 may processor can run at 500MHz 2ns memory ex ense feasibility What does locality buy us CS 141 Chien March 2 2000 Memory Locality 0 Memory hierarchies take advantage of memory 
locality 0 Memory locality is the principle that future memory accesses are near past accesses 0 Memories take advantage of two types of locality 7 Temporal locality near in time gt we Will often access the same data again very soon 7 Spatial locality near in spacedistance gt our next access is ft in very close to our last access or recent accesses CS 141 Chien March 2 2000 Page 5 Locality and caching 0 Memory hierarchies exploit locality by cacheing keeping close to the processor data likely to be used again 0 This is done because we can build large slow memories and small fast memories but we can t build large fast memories 0 If it works we get the illusion of SRAM access time with disk capacity SRAM access times are 2 25ns at cost 0f500 to 125 per Mbyte DRAM access times are 4080ns at cost 0fl to 4 per Mbyte Disk access times are 10 to 20 million ns at cost of lt01 per Mbyte CS 141 Chien 11 March 2 2000 A typical memory hierarchy small r CPU 4 onchip cache O Chip came memory lt main memory big memory lt disk cheap Sbit CS 141 Chien 39so then where is my rggmm and dam MarCh 2 2000 Page 6 How to exploit locality Q How do we use a memory hierarchy to exploit locality Er put things likely to use here remainder of things here How do we decide explicit control Registers implicit COHtl Ol Caches what we ll talk about CS 141 Chien 13 March 2 2000 Caches The Idea 0 Processor accesses Memory hierarchy O Accesses rst check small fast cache 0 If miss in cache retrieve data from main memory 0 Cache exploits locality to provide high performance cs 141 Chien 14 March 2 2000 Page 7 Implicit Control 2 0 interface View Hardware controlled or OS data movement Generic MemoryAddress operation interface Examples 00 r Caches W implicitly managed memory hierarchies 7 design HW to exploit locality 7 last concurrent movement 7 no need for addressing changes 7 performance visible only CS 141 Chien 15 March 2 2000 Amortized Access Cost Avg 0 Measure set of access costs 0 Weight by 
frequency Slow Me Assumptions ii 90 accesses to 10 of addresses maglc 10 1n fast memo remainder in slow memory Latency 1 cycle 10 cycles Avg Access 09 1 01 10 09 10 19 COSt frac offast mem refs fast mem access cost frac of slow mem refs slow mem access cost CS 141 Chien March 2 2000 Page 8 Terminology implicitly managed Hit nd desired data at desired level Miss don39t nd data at desired level typically search from processor outward speed Hit time time to access ifdata found Wrt a particular level Miss Penalty time to bring desired data to this level following a miss and deliver the data to the CPU Access time transfer time em latency multiple transfers for Hit Miss hrgxt level larger data sizes 0 Hit Time Miss penalty time 17 CS 141 Chien March 2 2000 Exam ple Me A Latency 1 cycle 10 cycles 1 wordcycle 100 references gt 85 cache hits 15 misses Total time hits hit time misses miss penalty MM access time transfer time 85l1510385l95 280 gt Average access time 28 cycles CS 141 Chien 18 March 2 2000 Page 9 Cache Design Issues lt Cache Main Memory 0 Data is copied into the cache 0 Where to put the data which location 0 Which data to replace later CS 141 Chien 19 March 2 2000 Cache Organization Basic lt Cache Main Memory 0 Map each region of address space to a single cache location 7 Direct Mapped Cache 7 Mapping Discard some address bits 7 Discard lowest shown above 7 Discard highest is more typical interleaved CS 141 Chien 20 March 2 2000 Page 10 Finding Cache Data Memory Address Access Lookup Data Tags Cache Cache Tag Index Cache O For fast access must be able to nd data quickly 0 Lookup discard a few address bits access the fast memory 0 Tags indicate the data that s really there 0 Why does this work 7 Memory location tag gt complete address CS 141 Chien 21 March 2 2000 Cache Access Data Tags Memory gt E g address lines E a Hit 0 Part of Memory Address applied to cache 0 Tags checked against remaining address bits 0 If match gt Hitl use data 0 No match gt Miss 
retrieve data from memory 0 This works pretty well but there are some complications CS 141 Chien 22 March 2 2000 Page 11

