Adv Computer Architecure
Adv Computer Architecure ECE 4100
Popular in Course
Popular in ELECTRICAL AND COMPUTER ENGINEERING
This 0 page Class Notes was uploaded by Cassidy Effertz on Monday November 2, 2015. The Class Notes belongs to ECE 4100 at Georgia Institute of Technology - Main Campus taught by Hsien-Hsin Lee in Fall. Since its upload, it has received 16 views. For similar materials see /class/233865/ece-4100-georgia-institute-of-technology-main-campus in ELECTRICAL AND COMPUTER ENGINEERING at Georgia Institute of Technology - Main Campus.
Reviews for Adv Computer Architecure
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 11/02/15
ECE 41006100 Advanced Computer Architecture Lecture 5 Branch Prediction Prof HsienHsin Sean Lee School of Electrical and Computer Engineering Georgia Institute of Technology i Georgia nnssitf immica l i rTechm 193HL39IDQQS 7 IQ II I Reading for this Module 0 Branch Prediction Appendix A2 pg A 21 A 26 Section 23 0 Branch Target Buffers and Return Address Predictors Section 29 0 Reading assignments Papers on class website IE I Control Dependencies Dependencies Structural Data Name Arm Output Control dependencies determine execution order of instructions Instructions may be control dependent on a branch DADD R5 R6 R7 BNER4 R2 CONTINUE DMULR4 R2 R5 DSUB R4 R9 R5 Geo ia39 i 39 73w IE I Predict What 0 Direction 1bit Single direction for unconditionaljumps and callsreturns Binaryfor conditional branches 0 Target 32bit or 64bit addresses Some are easy 0 One Unidirectionaljumps 0 Two Fall througi Not Takenvs Taken Many Function Pointer or IndirectJump egjr r31 Categorizing Branches Condiu39onal Branch 0 20 40 60 80 100 Frequency of branch instructions Source HampP using Alpha Geo ia39 r 39 ran I i II Branch Misprediction 1234567891011121314151617181920 PC ext IPC FFtch IDriveIAllocI Ranlama I nznl slhadqla Disdatchl RegIFila Exa4FIagq Br 4mm Single Issue Geo ia39 r r39Tgchquot 3 II Branch Misprediction i234567891011121314151617181920 PC 11m IPC FLtch IDriveIAllocI Ranlama I uwl slhadqla Disdatchl RegIFila Exa4FIa94 Br R solv Single Issue DEEDDDDDDEDDDDDDDDDD Mispredict Geo ia39i r Tah I III Branch Misprediction 12 3 4 5 6 7 8 91011121314151617181920 PC ext IPC FFtch IDriveIAllocI Ranlama I nznl slhadqla Disdatchl RegIFila Exa4FIagq Br 4mm Single Issue flush entailed instructions and refetch Mispredict IE I Branch Mispredlctlon 1234567891011121314151617181920 PC 11m IPC FLtch IDriveIAllocI Ranlama I uwl slhadqla Disdatchl RegIFila Exa4FIa94 Br R solvl Singlelssue Fetch the correct path Branch Misprediction 1234567891011121314151617181920 PC hm IPC ich IDriveIAllocI Ranlama I nznl slhadqla Disdatchl RegIFila Exaquagq Br 4mm Single Issue Mispredict 8issue Superscalar Processor Worst case E II Why Branch is Predictable if aa2 aa for i 0 ilt100 i if bb2 b if aabb addi r2 r02 bne r10 r2 Libb xor r10 r10 r10 j Liexit Libb bne r11 r2 Lixx xor r11 r11 r11 j Liexit addi r10 r0 100 addi r1 r0 r0 Ada r1r1 1 bne r1r10L1 Lixx beq r10 r11 Liexit Lexit Geo I II I Control Speculation 0 Execute instruction beyond a branch before the branch is resolved 9 Performance 0 Speculative execution Difference between speculation and prediction 0 What if misspeculated need Recovery mechanism Squash instructions on the incorrect path 0 Branch prediction Dynamic vs Static 0 What to predict Geo ia39 r39Tgchquot I II Static prediction 0 Static prediction is used to guide code scheduling strategies Simple strategy for all branches in the code Based on opcode or direction of branches Profile based 0 individual branches tend to be strongly bimodal set a bit in the opcode 0 Provide mechanisms for compilers or programmers to provide hints Bit in the instruction encoding lfetch is steered accordingly Geo ia39H 39 i x r 2 l IE I I Profile Guided Static Prediction Target address is known here Geo iaquot nggcw IE I Profile Guided Static Prediction Branch condition is known here A II I Dynamic Branch Prediction Strategies smn register i e taken ornotlaken How do we capture this history How do we predict From Re quotModern Processor Design Fundamenias 0 Superscaar Processors 1 Shen and M Lipasti Use past behavior to predict the future Local vs global behaviors Branches show surprisingly good correlation with one another and their history 0 They are not totally random events Simplest Dynamic Branch Predictor 0 Prediction based on latest outcome 0 Index by some bits in the branch PC Aliasing 1bit Branch History for i0 K100 i Table 0x40010100 addi r10 r0 100 0x40010104 addi r1 r1 r0 0x40010108 Ll 6540010A04 r1 r1 1 0x40010 bne r1 r10 L1 How accurgtggrgia EC Typical Table Organization PC 32 39 2N entries table update Actual outcome Simplest Dynamic Branch Predictor for i0 ilt100 i if ai 0 0x40010100 addi r10 r0 100 0x40010104 addi r1 r1 r0 Ll 0x40010108 add r21r20 0x4001010c Iw 0x40010W 0x40010 jquot 39 L2 L5 0x40010 di r1 r1 1 0x40010 bne r1 r10 L1 Georgiaquot 1 39 Tech 1 1bit Branch History Table FSM of the Simplest Predictor A 2 state machine Change mind fast 1 9 A If branch taken If branch not taken Predict not taken Predict taken Geo ia39H r39ngchquot Example using 1bit branch history t abie addi r10 r0 4 for i0 ilt4 i Lalid39 rl rl ro addi r1r1 1 bne r1r10Ll girrggirrge MOQQQQOQGQQOQ ActualTTTTNT T 60 accuracy Geo iaquot 7 Ta Ira ll 2bit Saturating Up Down Counter Predictor iquoti ii g up Taken gt Not Taken ST Strongly Taken WT Weakly Taken Predict Not taken wu Weakly Not Taken SN Strongly Not Taken Q Predict taken Georgi i39Techquot la I 2blt Counter Predictor Another Scheme up Taken gt Not Taken ST Strongly Taken WT Weakly Taken 9 Predict Not taken wu Weakly Not Taken SN Strongly Not Taken Q Predict taken Georgia 7 L 7 Tech 7 E II I Example using 2bit updown counter addi r10 r04 addi r1 rlr0 for i0 ilt4 i Eda r1 r1 1 bne r1 r10 L1 gvrv vvvv 4 Pred 39T39T ActualTTTTN Capturing Global Behavior Ashift register captures the local path through the program 0 For each unique path a predictor is maintained 0 Prediction is based on the behavior history of each local path Shift register length determines program region size Geo iaHr 97 yaw 7 If i Ill Branch Correlation Code Snippet if aa2 b1 aa 0 if bb2 b2 if aabb b3 BI10 0101 D0 0 aa0aa 2 aaxZ bb12 bb0 bb 2 0 Branch direction Notindependent Correlated to the path taken 0 Example Path 11 of b3 can be surely known beforehand Track path using a 2bit register Georgia W1ch t I Correlated Branch Predictorlmnsoanmm1 2bit shift register Subse uent branc branch history direction Branch PO BranCh PC Prediction 27m counter counter counter noumsr 22 Correlation Scheme COUH 2bit Sat Couhter Scheme MN correlation scheme M shift register size bits N Nbit counter Geo ia39H Virrngch TwoLevel Branch Predictor Vehpansgnmc Pattern History Table PHT 2N entries Branch History Register BH R Shift left when update Rck Rc1 N Prediction Current State PHT update Rc Actual Branch Outcom o Generalized correlated branch predictor o 151 level keeps branch history in Branch History Register BHR o 2 d level segregates pattern history in Pattern History Table PHT Gaga 39 IE I Branch History Register 0 An Nbit Shift Register 2N patterns in PHT 0 Shiftin branch outcomes 1 2 taken O 2 not taken 0 Firstin FirstOut 0 BHR can be Global Perset Local Peraddress Geo ia i 97 Tan IE I l Pattern History Table 0 2N entries addressed by Nbit BHR Each entry keeps a counter 2bit or more for prediction Counter update the same as 2bit counter Can be initialized in alternate patterns 01 10 01 10 Alias or interference problem Geo ia i 97 nah i v 3 II Key Idea 0 Separate all of the histories of a branch 9 subhistories 0 For each subhistory employ a separate predictor Each history maps to a FSM Geo iain Wu 2 v I i III Global History Schemes GAg GAs GAp Perset Peraddr SetPB PHT AddrB PHT Global PHT Global Global Global BHR BHR BHR Set can be determined by branch opcode compiler classi cation 39 or branch PC address E95 bi bi PanSoRahmeh QZ similar to GAnzorgia it 7 iquotTech39 139 GAs TwoLevel Branch Prediction The 2 LSBs are insigni cant for 32bit instruction 00000000 PC 0X4001000C 00110110 gt 00110110 00110111 BHR 11111101 11111110 11111111 MS Georgia 1 Eredict Taken Tech 7 3 lr l Predictor Update Actually Not Taken PHT 00000000 PC 0X4001000C 9 xgg w 00111100 00110110 00110111 m 00111100 BHR 11111101 11111110 11111111 decremented Update Predictor after branch is resolved 9013115 PerAddress History Schemes PAg PAs PAp SetPB Eat AddrB Em BHT PBHT Global BHT PBHT PHT BHT PBHT EX Alpha 21264 5 l H local predictor EX P6 Itanlum a g i PAs TwoLevel Branch Predictor PC 1110 0000100110010010110011101011 00000000 00000001 00000010 11010110 000 001 010 m 11010101 011 11010110 m 100 101 110 111 BHT 11111111 Geo ia1 1 Tgch39 1 gill PerSet History Schemes 8A9 8A5 SAP SetPB Eat AddrB Em Global BHT SBHT PHT BHT SBHT BHT SBHT Geo ia i 97 2 v IE I l PHT Indexing Global Gselect history 44 00000000 00000001 00000001 00000000 00000000 00000000 11111111 00000000 11110000 11111111 10000000 11110000 Insufficient History Tradeoff between more history bits and address bits Branch addr Too many bits needed in Gselect sparse table entries Geo ia w39Tgch39 3 III Gshare Branch Predictor McFarling93 Global Gselect Gshare history 44 33 00000000 00000001 00000001 00000001 00000000 00000000 00000000 00000000 11111111 00000000 11110000 1 11111 11111111 10000000 11110000 1011111111 Gselect 44 Index PHT by concatenate low order 4 bits Gshare 88 Index PHT by Branch address Global history Branch addr Tradeoff between more history bits and address bits Too many bits needed in Gselect 5 sparse table entries Gshare 5 Not to lose global history bits Ex AMD Athlon MIPS R12000 Sun MAJC Broadcom SiByte sSBl Geo iain C gr Tah I i Ill Gshare Branch Predictor PC Address Global BHR Aliasing Example GAp PHT inc BHR 1101 PC 0110 11 1001 BHR 1001 PC 1010 11 1001 exed by 10 0000 3 Gshare 1101 0110 Geo ia l Tah I i II Hybrid Branch Predictor Meaninng Choice or Meta Predictor Final Prediction Some branches correlated to global history some correlated to local history Only update the metapredictor when 2 predictors disagree Geo ial 1129m Alpha 21264 EV6 Hybrid Predictor A tournament branch predictorquot Multirpredictor scheme W Local predictorPAg Selircorrelation Global predictor lnterrcorrelation Choice predictor as the decision maker a 27bit sat counterto credit either local or global predictors Die size impact History info tables 2 Next BTB 2 7 associated LINEset For SingleCW9 r 7 Prediction With i 3 on a per line basis predlcnon 2 cycle latency We Wlii discuss more later Alpha EV8 Branch Predictor predictor Real silicon never sees the daylight prediction Use a 2Bcgskew predictor one form of enhanced gskew Bimodal predictor used as 1 static biased predictorahd 2 part of eygskew predictor Global predictors GO and Gi are part of ergskew predictor Table sizes 352Kbits in total 208Kbits for prediction table 144Kbits for hysteresis table Georgia39H iquotTechquot IE II Branch Target Prediction 0 Try the easy ones first Directjumps CallReturn Conditional branch bidirectional 0 Branch Target Buffer BTB 0 Return Address Stack RAS Geo iain C gr saw v Ira II I Branch Target Buffer A cache that contains three pieces of information 0 The address of branch instructions The BTB is managed like a cache and the addresses of branch instructions are kept for lookup purpose 0 Branch target address To avoid recomputation of branch target address where possible 0 Prediction statistics Different strategies are possible to maintain this portion of the BTB Geo ia aw I i II Branch Target Buffers PC WW Dramas PC nfmrrespmdmg large Predldlu WU llll Hm use mrrespunmng large address MlSS rm adlm Access in parallel with instruction cache Hit produces the branch target address Geo ia l Tah Branch Target Buffer Operation lnstructlon Fetch lnstructlon Decode Execute Nurmal Updale Updale 575 Cunllnue mm execullnn 575 new Msmdmn execullnn branch reame Geo ial J39ngch39 Branch Target Buffers Operation Couple speculative generation of the branch target address with branch prediction Continue to fetch and resolve branch condition 0 Take appropriate action if wrong Any of the preceding history based techniques can be used for branch condition speculation Store prediction information eg nbit predictors along with BTB entry Branch folding optimization store target instruction rather than target address Geo ia i J 2 v Branch Target Buffer BTB Branch PO Predicted Branch Branch Target TechW IE I Return Address Stack RAS 0 Different ca sites make return address hard to predict Printf being called by many callers The target of return instruction in printf is a moving target 0 A hardware stack LIFO Call will push return address on the stack Return uses the prediction off of TOS 339 quot If II I Return Address Stack V P h R e fum BTB BTB Address 0 Does it always work 0 May not know it is a return instruction Candepth priorto decoding Rely on BTB for speculation Fix once recogggrglegurn xTech39 SetjmpLongjmp Speculative call 3 II Indirect Jump 0 Need Target Prediction Many potentially 230 for 32bit machine In reality not so many Similar to predicting values Think about case statements 0 Is the target influenced by history 0 Tagless Target Prediction 0 Tagged Target Prediction Geo ia39X 39 i L r IE I Tagless Target Prediction ChangHaoPatt 97 N PC 8 BHR Pattern Target Cache 2 entries Branch PO Predicted Target Addr Branch History Register 111 BHR Modify the PHT to be a Target Cachequot indirectjump from target cache from BTB Alias Multipletargetsforthesamejump How does this improve accuracy over the BTB 7 Georgia wTechquot 3 II Tagged Target Prediction ChangHaoPatt 97 Target Cache 2quot entries per way To reduce aliasing with setassociative target cache Use branch PC andor history for tags Geo ia39H 39 i x 7 Tet I II I Multiple Branch Prediction 0 For a really wide machine Across several basic blocks Need to predict multiple branches per cycle 0 How to fetch noncontiguous instructions in one cycle Geo iaquot iquotTgchquot
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'