Adv Comput Architecture
Adv Comput Architecture ECE 6100
Popular in Course
verified elite notetaker
Popular in ELECTRICAL AND COMPUTER ENGINEERING
This 0 page Class Notes was uploaded by Cassidy Effertz on Monday November 2, 2015. The Class Notes belongs to ECE 6100 at Georgia Institute of Technology - Main Campus taught by Staff in Fall. Since its upload, it has received 6 views. For similar materials see /class/233902/ece-6100-georgia-institute-of-technology-main-campus in ELECTRICAL AND COMPUTER ENGINEERING at Georgia Institute of Technology - Main Campus.
Reviews for Adv Comput Architecture
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 11/02/15
Branch Prediction and Speculation Sample Problems Branch Prediction and Speculation Branch Prediction and Speculation 1 Consider the following two designs for a alleviating the effect of branches a The first design defines a branch with two delay slots and does not use branch prediction Rather the solution is to use compiletime scheduling to fill the delay slots with useful instructions where possible Suppose that for 30 of the branch instructions the compiler can fill both branch delay and for 60 of the instructions the compiler can fill only one delay slot The second design employs branch prediction and does not use delay slots The misprediction penalty is 3 cycles The branch always costs one cycle and if mispredicted it will cost an additional three cycles Fquot What prediction accuracy is required in the second design to achieve the same performance as the first design From a we know that 10 of the branch instructions result in two pipeline bubbles while 60 result in a one cycle bubble We can compute the increase in CPI for each case we can ignore the probability an instruction is a branch instruction since it is the same in both cases we have The increase in CPI due to a is probiinstriisiaibranch k06 1 01 2 08 The increase in CPI due to b is probiinstriisiaibranchgtxlt 1 p 3 3p Where p is the probability of misprediction Equating both provides the critical value of p The prediction accuracy 1p Branch Prediction and Speculation 2 Consider the following code sequence Assume that each instruction is encoded in one 32bit word We have a k entry branch prediction buffer Address Instruction 0x40000000 DSUBUI R3 R1 2 0x40000004 BNEZ R3 L1 0x40000008 DADD R1 R0 R0 0x4000000c L1 DSUBUI R3 R3 2 0x40000010 BNEZ R3 L2 0x40000014 DADD R2 R0 R0 0x40000018 L2 DSUBU R3 R1 R2 0x4000001c BEQZ R3 L3 0x80008004 L3 a What is the minimum value of kto maximize prediction accuracy and why In general it is 32 In this example we need 5 bits to address the prediction buffer without aliasing the branch addresses since the addresses of the branch instructions differ in the least significant 5 bits In this example even 4 bits will suffice Can you tell why Alternatively consider the use of a branch target buffer Figure 319 Show the possible contents of the following 4element branch target buffer after one execution of the prior code assuming it is initially empty 0x40000004 0x4000000c 0x40000010 0x40000018 0x4000001c 0x80008004 Branch Prediction and Speculation 3 Consider the use of a branch prediction buffer using nbit saturating counters for the code sequence shown below The memory addresses for the instructions are shown in hexadecimal notation Assume the following loop code has been executed 12 times The branch at location 0x0044 has been taken 50 of the time and the branch at location 0x0050 has been taken 50 of the time Consider the point in time of the start of the execution of the l3Lh iteration Address 0X0038 L3 0x0030 0X0040 DSUBUI R3 R1 2 0X0044 BNEZ R3 L1 0X0048 DADD R1 R0 R0 0x004C L1 DSUBUI R3 R2 2 0X0050 BNEZ R3 L2 0X0054 DADD R2 R0 R0 0X0058 L2 DSUBUI R3 R1 R2 0x0050 BEQZ R3 L3 a Considering only the preceding code how many entries should the branch prediction buffer have to avoid the possibility of aliasing of branch addresses The minimum number of least signi cant bits to ensure no aliasing for these addresses is 4 hence the branch prediction buffer would need 24 16 entries b If all prediction buffer entries were initialized to 0 what can be the value of the counters in the prediction buffer corresponding to these two branch instructions OXOO44 BNEX R3 L1 0 S value S 6 OX0050 BNEZ R3 L2 0 S value S 6 Branch Prediction and Speculation c Now consider the case where we use a global branch predictor with 3 bit global history Execution of the 13Lh iteration is about to start Provide an example of i the value of feasible 3bit global branch history and ii the value of an infeasible global branch history Ensure you clearly identify the entries in the branch history with the branch instructions in the code sequence The first two branches test for equality between two numbers N1 amp N2 with the number 2 The last branch tests if N1 N2 If the first branch is taken N1 not equal to 2 and the second branch is taken N2 2 then the last branch cannot be taken Hence a feasible global history is 111 the first branch on the program corresponds to the most significant bit An infeasible history is 000 Branch Prediction and Speculation 4 Consider the case where the execution pipeline has a single cycle branch delay slot Static scheduling can ll 30 of the delay slots We can ll 60 of the remaining slots if we use cancelling branches instructions are cancelled if the branch is mispredicted These slots are lled with instructions assuming the branch is not taken If 18 of all instructions are branches and they are taken 62 of the time what is the net increase in CPI Number of stall cyclesinstruction are 018 07 04 06062 70 of the branch instruction slots cannot be successfully lled with computer instructions Of these 40 are left empty and contribute a cycle Of the remaining a cycle is contributed only when the branch is cancelled which is 62 of the time Branch Prediction and Speculation 5 Consider the 5 stage integer pipeline with forwarding Assume the branch penalty is 1 cycles branch condition computed in ID Now assume we have pipelined the memory system to three stages rather than 1 stage for both instruction fetch and data fetch Branches are resolved at the end of the EX stage We use a static branch not taken prediction strategy ie if branches are taken we incur the branch penalty Assume conditional branches occur with a frequency of 14 a If branches are taken 62 of the time what is the increase in CPI due to this prediction strategy The branch penalty is 3 cycles and incurred only when the branch is taken Increase in CPI 062 014 3 02604 b Alternatively if we modify the pipeline and implement a delayed branch with a single delay slot and we are able to successfully fill 65 of the slots what is the increase in CPI 35 of the time we are unable to fill these slots with a penalty of 1 cycle Hence increase in CPI I s 035 14 c Now consider the occurrence of load delay slots where loads occur with a probability of 24 and 40 of these fetch data used by the immediately following instruction If we perform no instruction scheduling to fill delay slots what is the increase in CPI compared to the original pipeline ie without pipelining the memory system The load stalls are now three cycles rather than 1 With no instruction scheduling we have 024 4 3 0288 Branch Prediction and Speculation 6 Consider the dynamically scheduled execution of the following code sequence where a ROB buffer is used Assume register F10 is initialized with the value 10 and memory locations 0Rl and 0R2 are initialized with 6 and 7 respectively All other registers are initialized to 0 Consider the first iteration through the loop 1 LOOP LD F2 0R1 2 LD F4 0R2 3 MULD F6 F2 F4 4 SUBD F4 F6 F10 5 DIVD F6 F12 F2 6 SD F6 0R1 7 ADDD F8 F8 F4 8 DADDIU R1 R1 8 9 DADDIU R2 R2 8 1 BNE R1 R4 LOOP O a Show a valid state of a 4 entry ROB when instruction 7 issued Identify the head and tail of the ROB NO VALUE PENDING TAIL HEAD b Register remapping is employed where architecture registers are remapped to physical registers PR F6 in instruction 3 is remapped on issue to PR 9 When the DIV instruction reaches the head of the ROB can PR 9 be freed Justify your answer Yes This means that all instructions prior to the DIVD have committed and all instructions that used the mapped register for F6 have completed Therefore it can be freed Branch Prediction and Speculation 7 Consider the following code sequence for a dynamically scheduled machine Assume register F10 is initialized with the value 10 and memory locations 0Rl and 0R2 are initialized with 6 and 7 respectively All other FP registers are initialized to 0 Consider the first iteration through the loop 1 LD F2 0Rl 2 LD F4 0R2 3 DIVD F6 F12 F2 4 MULD F6 F2 F4 5 SUBD F4 F6 F10 6 SD F6 0Rl a Assuming an exception occurs on instruction 3 and instructions 4 and 5 have completed execution What are the contents of F4 and F6 in the following cases F4 F6 Using a History Buffer only 7417 7427 Using a ROB only 777 707 Using a Future Register File with an ROB 777 707 b What is a precise exception An exception is precise if its occurrence is as if it occurred after instruction i and before instruction i 1 All instructions upto and including i complete execution and all instructions owing and including i 1 have not executed Branch Prediction and Speculation 8 Consider the dynamically scheduled execution of the following code sequence The first time through the loop an exception occurs on the DIV instruction Distinguish between how a precise exception will be handled using a ROB and a history buffer How does register renaming affect or not affect the handling of exceptions 1 LOOP LD F2 0R1 2 LD F4 0R2 3 MULD F6 F2 F4 4 SUBD F4 F6 F10 5 DIVD F6 F12 F2 6 SD F6 0R1 7 ADDD F8 F8 F4 8 DADDIU R1 R1 8 9 DADDIU R2 R2 8 1 ENE R1 R4 LOOP IO V th an ROB Exceptions for an executing instruction are flagged in its ROB entry but not raised The processor raises an exception associated with an instruction when that instruction reaches the head of the ROB Since instructions in the ROB are allotted entries in program order they are committed in program order Instructions fetched speculatively on a mispredicted branch are never committed Therefore all exceptions are precise V th an ROB register renaming does not affect handling of precise exceptions This is because register renaming does not affect how the instructions commit always in program order and exceptions can be raised only at commit time In the case of a history buffer instructions are allocated history buffer entries that contain the old value history of the register being written If an exception occurs the corresponding history buffer entry is labeled When exception instruction reaches the head ofthe history buffer the history buffer is scanned from head to tail and all old values replaced using those in the history buffer This is needed because instructions write directly to the register le
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'