Computer System Architecture
Computer System Architecture CPSC 440
Cal State Fullerton
These 10 pages of class notes were uploaded by Mrs. Lue Goyette on Wednesday, September 30, 2015. The notes belong to CPSC 440 at California State University - Fullerton, taught by Staff in Fall.
Date Created: 09/30/15
CPSC 440 Lecture 4: The Processor: Datapath and Control

What will be implemented:
1. Data transfer: lw, sw
2. R-type: add, sub, and, or, slt
3. Branch (beq) and jump (j)

An abstract view (excluding branch): see Fig. 4.1 on page 302.
Adding multiplexers and control lines: see Fig. 4.2 on page 304.
In more detail: see Figs. 4.5-4.9.
R-type: see Fig. 4.10 on page 314.
Data transfer (lw, sw) and branch instructions: see Fig. 4.11 on page 315.
Adding multiplexers and control lines: see Fig. 4.15 on page 320.
A simple schema: see Fig. 4.17 on page 322.
Note: you need to know which types of instructions go through which parts of the circuit.

Designing the Main Control Unit
See Fig. 4.18 on page 323.
Note: you need to know how to come up with the table by looking at the datapath (except for the ALUOp entries).

Implementing jumps: see Fig. 4.24 on page 329.

Breaking the Instruction Execution into Five Steps
1. IF: fetch the instruction from memory
2. ID: read registers while decoding the instruction
3. EX: execute the operation or calculate an address
4. MEM: access an operand in data memory
5. WB: write the result into a register
See Fig. 4.33 on page 345.

An Overview of Pipelining
Performance for N instructions:
    No pipelining: 5 x N cycles
    With pipelining: N + 4 cycles
More hardware is needed. See Fig. 4.35 on page 347.

Data hazard:
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
Without forwarding: Fig. 4.52 on page 364.
With forwarding: Fig. 4.53 on page 367. Also see Fig. 4.54 on page 368.

Example:
          addi $s0, $zero, 400
    Loop: lw   $t1, 0($s0)
          addi $t2, $t1, 100
          sw   $t2, 0($s0)
          addi $s0, $s0, -4
          bne  $s0, $zero, Loop

Question: can forwarding solve all data hazards?
Answer: all except the load-use hazard, which still needs a stall.
Load-use hazard:
    lw  $2, 20($1)
    and $4, $2, $5

Reordering code to avoid pipeline stalls:
    a = b + e;
    c = b + f;

    lw  $t1, 0($t0)
    lw  $t2, 4($t0)
    add $t3, $t1, $t2
    sw  $t3, 12($t0)
    lw  $t4, 8($t0)
    add $t5, $t1, $t4
    sw  $t5, 16($t0)

Control hazard: see Fig. 4.61 on page 376.
Move decision making to the ID cycle: see Fig. 4.62 on page 379.
Branch prediction: taken vs. not taken.
Dynamic prediction: see Fig. 4.63 on page 381.
Delayed branch: see Fig. 4.64 on page 382.

Pipeline Hazard Summary
1. Structural hazards, e.g., memory access.
   Solution: more read/write buses.
2. Data hazards, caused by data dependencies.
   Solution: forwarding.
3. Control hazards, e.g., branch.
   Solutions:
   a. move decision making earlier
   b. predict taken or not taken
   c. delayed branch
   d. dynamic prediction

Superscalar and Dynamic Pipelining
Idea: separate data transfer and ALU/branch instructions, and add more functional units, so that a pair of instructions may be issued and executed in the same clock cycle.
1. Code rescheduling
2. Loop unrolling

Note: These class notes are brief. They only give you the scope of what we have covered. What you got from the classes is complete.

CPSC 440 Lecture 3: Supporting Procedures in Computer Hardware

When calling a procedure, the program must:
1. Place the parameters for the procedure
2. Transfer control to the procedure
3. Acquire the storage resources needed
4. Perform the desired task
5. Place the results for the calling program
6. Return control to the point of origin
Which of these are done by the caller? Which of them are done by the callee?

Special registers for procedure calling:
1. $a0-$a3: parameters
2. $v0-$v1: return values
3. $ra: return address (current PC + 4)

i. Using more registers - the stack
Special register $sp for the stack pointer.

Example:
    int leaf(int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }

In MIPS:
    leaf: addi $sp, $sp, -12
          sw   $t1, 8($sp)
          sw   $t0, 4($sp)
          sw   $s0, 0($sp)
          add  $t0, $a0, $a1
          add  $t1, $a2, $a3
          sub  $s0, $t0, $t1
          add  $v0, $s0, $zero
          lw   $s0, 0($sp)
          lw   $t0, 4($sp)
          lw   $t1, 8($sp)
          addi $sp, $sp, 12
          jr   $ra

Want to save some time from storing and recovering registers?
Temporary and saved register convention:
1. $t0-$t9 are temporary and not preserved
2. $s0-$s7 are saved and preserved

ii. Nested Procedures - preserve the return address and parameters

Example:
    n! = 1             if n = 0
    n! = n x (n - 1)!  if n > 0

In C:
    int fact(int n)
    {
        if (n == 0)
            return 1;
        else
            return n * fact(n - 1);
    }

In MIPS:
    fact: addi $sp, $sp, -8
          sw   $ra, 4($sp)
          sw   $a0, 0($sp)
          slti $t0, $a0, 1
          beq  $t0, $zero, L1
          addi $v0, $zero, 1
          addi $sp, $sp, 8
          jr   $ra
    L1:   addi $a0, $a0, -1
          jal  fact
          lw   $a0, 0($sp)
          lw   $ra, 4($sp)
          addi $sp, $sp, 8
          mul  $v0, $a0, $v0
          jr   $ra

iii. Allocating Space for New Data - the frame pointer $fp and the global pointer $gp

iv. Beyond Numbers - processing strings
    lb $t0, 0($sp)
    sb $t0, 0($gp)

Example:
    void strcpy(char x[], char y[])
    {
        int i;
        i = 0;
        while ((x[i] = y[i]) != '\0')
            i += 1;
    }
(In C it really is char *strcpy(char *x, const char *y).)

In MIPS:
    strcpy: addi $sp, $sp, -4
            sw   $s0, 0($sp)
            add  $s0, $zero, $zero
    L1:     add  $t1, $s0, $a1
            lbu  $t2, 0($t1)
            add  $t3, $s0, $a0
            sb   $t2, 0($t3)
            beq  $t2, $zero, L2
            addi $s0, $s0, 1
            j    L1
    L2:     lw   $s0, 0($sp)
            addi $sp, $sp, 4
            jr   $ra

Other Styles of MIPS Addressing

i. More I-type instructions
    addi $sp, $sp, 4
    slti $t0, $s2, 100
    lui  $t0, 255

ii. Addressing in Branches and Jumps
J-type instructions:
    | op (6 bits) | pseudodirect address (26 bits) |
Branching:
    | op | rs | rt | relative address |

MIPS Addressing Mode Summary
1. Register (R-type)
2. Base or displacement (e.g., lw, sw)
3. Immediate (e.g., addi, slti, lui)
4. PC-relative (e.g., branching)
5. Pseudodirect (e.g., J-type)

Arrays vs. Pointers

Example:
    clear1(int array[], int size)
    {
        int i;
        for (i = 0; i < size; i += 1)
            array[i] = 0;
    }

In MIPS:
           move $t0, $zero
    Loop1: sll  $t1, $t0, 2
           add  $t2, $a0, $t1
           sw   $zero, 0($t2)
           addi $t0, $t0, 1
           slt  $t3, $t0, $a1
           bne  $t3, $zero, Loop1

Another example:
    clear2(int *array, int size)
    {
        int *p;
        for (p = &array[0]; p < &array[size]; p = p + 1)
            *p = 0;
    }

In MIPS:
           move $t0, $a0
    Loop2: sw   $zero, 0($t0)
           addi $t0, $t0, 4
           sll  $t1, $a1, 2
           add  $t2, $a0, $t1
           slt  $t3, $t0, $t2
           bne  $t3, $zero, Loop2

An improved version:
           move $t0, $a0
           sll  $t1, $a1, 2
           add  $t2, $a0, $t1
    Loop2: sw   $zero, 0($t0)
           addi $t0, $t0, 4
           slt  $t3, $t0, $t2
           bne  $t3, $zero, Loop2

Notice the difference between the two versions.

CPSC 440 Lecture 5: The Memory Hierarchy

1. Memory technology: faster is more expensive.
2. [illegible] => memory hierarchy
3. [Figure: levels of the memory hierarchy; labels illegible]

Hit: when the requested data is present.
Miss: when the requested data is absent.
Hit time: the time to look for the item and send it to the processor.
Miss time: the time to look for the item + miss penalty.
Hit rate = number of hits / total accesses
Miss rate = number of misses / total accesses = 1 - hit rate

The Basics of Caches
[items 1-2 illegible]
Direct mapping: cache block = (block address) modulo (number of blocks in the cache).
Valid bit (V): tells whether the current cache block holds valid data.
Tags: distinguish the different memory blocks that can get mapped to the same cache block.

[Figure: address fields are Tag, Index (n bits), and Byte offset (2 bits); each of the 2^n cache entries holds V, Tag, and 32 bits of Data.]

The total number of bits needed to implement the cache:
    Total number of bits = number of entries x (V + Tag + Data)
    Tag bits = address bits - index bits - 2 (byte offset) = 32 - n - 2 = 30 - n
Assume we have 2^n entries and a 1-word block:
    Total number of bits = 2^n x (1 + (30 - n) + 32) = 2^n x (63 - n)
Assume a 4 KB cache => 1 K words => n = 10:
    Total number of bits = 2^10 x 53 = 53 Kbits

Handling Cache Misses

On read:
1. Send PC - 4 to the memory.
2. Activate memory read.
3. Data from memory => data portion; upper bits of address => tag field; valid bit => on.
4. Restart memory fetch.

On write:
1. Write-through: write to the memory at the same time as writing to the cache.
2. Write-back: postpone writing to the memory until the block is replaced.

Taking Advantage of Spatial Locality
Larger blocks.

[Figure: address fields are Tag (18 bits), Index (10 bits), Block offset (2 bits), and Byte offset (2 bits); each cache entry holds V, Tag, and 4 x 32 bits of Data.]

What is the total number of bits needed to implement the cache?
Assume a 16 KB cache with 4-word blocks => 4 K words => 2^10 blocks => n = 10:
    Total number of bits = 2^10 x (1 + (32 - 10 - 2 - 2) + 4 x 32) = 2^10 x 147 = 147 Kbits

Write hit: write the word into the block.
Write miss: bring the block from the memory and write the word into the block.
Notice the difference between one word per block and multiple words per block.

Advantages:
Write-through is good for small blocks and is easier to implement. But the delay is too long. One solution is to use write buffers.
Write-back is good for larger blocks and burst writes. But it is more difficult to implement.

    Program   Block size   Instruction   Data        Combined
              (words)      miss rate     miss rate   miss rate
    gcc       1            6.1%          2.1%        5.4%
    gcc       4            2.0%          1.7%        1.9%
    spice     1            1.2%          1.3%        1.2%
    spice     4            0.3%          0.6%        0.4%
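The cache-size arithmetic in the Lecture 5 notes (53 Kbits for the 4 KB cache, 147 Kbits for the 16 KB cache) can be checked with a small C helper. This is only a sketch under the same assumptions the notes make: 32-bit byte addresses, 32-bit words, a direct-mapped cache, and one valid bit per entry. The function name cache_total_bits and its parameter names are hypothetical and do not come from the lecture.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Total storage bits (valid + tag + data) of a direct-mapped cache.
 * Assumes 32-bit byte addresses and 32-bit words, as in the notes.
 *
 * index_bits        n, so the cache has 2^n entries
 * block_offset_bits m, so each block holds 2^m words
 */
uint64_t cache_total_bits(unsigned index_bits, unsigned block_offset_bits)
{
    const unsigned address_bits = 32;
    const unsigned byte_offset_bits = 2;   /* 4 bytes per word */

    /* At least one address bit must be left over for the tag. */
    assert(index_bits + block_offset_bits + byte_offset_bits < address_bits);

    unsigned tag_bits = address_bits - index_bits
                      - block_offset_bits - byte_offset_bits;
    uint64_t data_bits = (uint64_t)32 << block_offset_bits; /* 2^m words x 32 bits */
    uint64_t bits_per_entry = 1 + tag_bits + data_bits;     /* V + Tag + Data */

    return bits_per_entry << index_bits;                    /* x 2^n entries */
}
```

Calling cache_total_bits(10, 0) models the 4 KB cache with one-word blocks, and cache_total_bits(10, 2) models the 16 KB cache with four-word blocks; dividing either result by 1024 gives the Kbits figure.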
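The direct-mapping rule and the read-miss steps from the Lecture 5 notes can also be tried out on a short address trace. The sketch below assumes a cache of 2^10 one-word blocks, models only the valid bit and the tag (no data is stored), and starts from an empty cache on every call. The names cache_entry, count_misses, and miss_rate are hypothetical illustrations, not part of the lecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INDEX_BITS 10u                  /* assumed: 2^10 one-word blocks */
#define NUM_BLOCKS (1u << INDEX_BITS)

struct cache_entry {
    bool     valid;                     /* V bit */
    uint32_t tag;                       /* upper bits of the block address */
};

/* Replays a trace of byte addresses through an initially empty
 * direct-mapped cache and returns the number of misses. */
size_t count_misses(const uint32_t *addresses, size_t count)
{
    struct cache_entry cache[NUM_BLOCKS] = {0};   /* all valid bits off */
    size_t misses = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t block_address = addresses[i] >> 2;        /* drop byte offset */
        uint32_t index = block_address % NUM_BLOCKS;       /* modulo mapping */
        uint32_t tag = block_address >> INDEX_BITS;

        if (!cache[index].valid || cache[index].tag != tag) {
            misses++;
            cache[index].valid = true;  /* valid bit => on */
            cache[index].tag = tag;     /* upper bits of address => tag field */
        }
    }
    return misses;
}

/* Miss rate = number of misses / total accesses. */
double miss_rate(size_t misses, size_t accesses)
{
    assert(accesses > 0);
    return (double)misses / (double)accesses;
}
```

For the trace 0, 4, 0, 4096, 0, addresses 0 and 4096 both map to cache block 0 with different tags, so the second access to 0 hits but the final one misses again: 4 misses in 5 accesses, a miss rate of 0.8 and a hit rate of 0.2.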