Compilers & Interpreters
Compilers & Interpreters CS 4240
Popular in Course
verified elite notetaker
verified elite notetaker
verified elite notetaker
Ms. Bryana Hagenes
verified elite notetaker
verified elite notetaker
verified elite notetaker
Popular in ComputerScienence
This 0 page Class Notes was uploaded by Alayna Veum on Monday November 2, 2015. The Class Notes belongs to CS 4240 at Georgia Institute of Technology - Main Campus taught by Staff in Fall. Since its upload, it has received 6 views. For similar materials see /class/234111/cs-4240-georgia-institute-of-technology-main-campus in ComputerScienence at Georgia Institute of Technology - Main Campus.
Reviews for Compilers & Interpreters
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 11/02/15
Optimizations intermediate code 7 pros machine independent optimization pportunities which will not need to be retargeted for another platform 7 cons yet another language 7 possible representations 39 trees threeaddress code y x op z or quadruples triple polish notation utilization of some resour e 7 Execution time most o en 7 Code size 7 Network messages sent 7 Battery power used etc Optimization should not alter what the program computes 7 The answer must still be the same What to optimize 7 costperformance argument opts hard to implement increase compilation time 7 code to consider based on number of occurrences in code repetitions of execution uses of compiled code Optimization seeks to improve a program s c Optimization options Apply local optimizations in basic blocks 7 single entryexit point 7 preserve outcome of computation 7 select different representation for sequence in block Global optimization 7 across CFG trace of BBS 7 harder to prove 7 use ow analysis Intraprocedural optimizations Local optimizations Algebraic simplification 7 Some statements can be deleted xx0 xxl 7 Some statements can be simpli ed xx0 2 x0 yy2 2 yiyy xx8 2 xxltlt3 xx15 2 txltlt4xtX Constant Folding Operations on constants can be computed at compi e time 7 algebraic simpli cations In general if there is a statement x y op z 7 And y and z are constants 7 Then y op 2 can be computed at compile time Examplex22 2X4 Example if2 lt 0 jump L can be deleted In general 7 7 values of expressions involving compiler time constants 7 replace unarybinary tree with constant node with corr value a 7 operations involving constants as operand 7 replace binary operation subtree with its non constant subtree Constants within blocks traverse values from constant assignments within a basic block when an assignment involves a constant rhs put the target in a known vars list symbol table whenever a var is accessed check to see if its in known vars list and if constant then replace node in AST remove variables from the list when a non constant assignment is encountered or at the end ofa basic block can use known list to track accesses and delete useless variables Constants across blocks why does a block end 7 entrance to selectionlooping statement 7 exit from same 7 calls to procsfuncs when can we retain value of variable Flow of Control Optimizations Eliminating unreachable code 7 Code that is unreachable in the control ow graph 7 Basic blocks that are not the target of anyjump or fall through from a conditional 7 Such basic blocks can be eliminated Removing unreachable code makes the program smaller 7 And sometimes also faster Due to memory cache effects increased spatial locality Single Assignment Form Some optimizations are simpli ed if each register occurs only once on the lefthand side of an assignment Intermediate code can be rewritten to be in single assignment form X b z y a 2 a x 2 x x 7 2 b b is a fresh register More complicated in general due to loops Common Subexpression Elimination Assume 7 Basic block is in single assignment form 7 A de nition x is the rst use ofx in a block All assignments with same rhs compute the same value Example XiyZ XiyZ gt Wx the values ofx y and 2 do not change in the code Copy Propagation If w X appears in a block all subsequent uses of w can be replaced with uses ofx Example bzy bzy ab 2 ab x2a x2b This does not make the program smaller or faster but might enable other optimizations 7 Constant folding 7 Dead code elimination Copy Propagation and Constant Folding Example a5 a5 x2a 5 x10 yx6 y16 txy txltlt4 Copy Propagation and Dead Code Elimination If W rhs appears in a basic block W does not appear anywhere else in the program Then the statement W rhs is dead and can be eliminated r Dead does not contribute to the program s result Example a is not used anywhere else xzy bzy bzy 2 a b 2 x 2 b x 2 a x 2 b Applying Local Optimizations Each local optimization does very little by itself Typically optimizations interact 7 Performing one optimizations enables other opt Typical optimizing compilers repeatedly perform optimizations until no improvement is possible r The optimizer can also be stopped at any time to limit the compilation time An Example Initial code An Example Algebraic optimization An Example Algebraic optimization An Example Copy propagation a X X An Example Copy propagation a X X b 3 c X d c c e b ltlt 1 f a d g e f An Example Constant folding a X X b 3 c X d X X e 3 ltlt 1 f a d g e f An Example Common subexpression elimination a X X b c X d X X e f 39 a d b 3 c X 1 X X e 3 ltlt 1 f a d g e f An Example Constant folding a X X b c X d X X e 39 f 39 a d g e f An Example Common subexpression elimination a X X b c X d a e f 39 a d g e f An Example Copy propagation a X X An Example Copy propagation a 39 X X b c X d a e 39 f a d g e f An Example Dead code elimination a X b c X d a e 39 f 39 a a g 6 f An Example Dead code elimination faa g6f This is the nal form If statements CJUlVlP ifcond then el else el gt el ifO then el else e2 gt e2 reorder ele2 to fall through on branch follow by false lable do cond if necessary While 2lt3 7 dead code elimination last class slide Arrays and loops subscript expressions 7 occur frequently 7 good code important 7 precompute Whatever possible 7 occur o en Within loops gt unroll loops modulo unrolling r amortize overhead of loop instructions over several loop executions branches to branches use of registers 7 e g loop indeX in reg I A grammar speci es the legal syntax ofa language The kind ofgrammar most commonly used in computer language processing 39s a contextfree grammar I A grammar speci es a set of 7 terminal symbols tokens from alphabet of strings 7 nonterminal rymbo 39 phrase names or para of language 7 productiom each production speci es how a nonterminal symbol may be replaced by a string ofterminal or non terminal symbols39 7 start rymbol one of the nonterminals Grammar is a fourtuple G T N S P Where I T is the set ofterminalrymbolx or wordx ofthe language I N is a set of nonterminal symbols or phraxe names that are used in specifying rammar We say V TNis the vocabulary of the grammar I S is a distinguished element ofN called the start symbol I P is a set of productions P VNV V We Write productions in the form a gt b Where a is a string of symbols from V containing at least one nonterminal and b is any string ofsymbols from V I Regular 7 A gt x AgtxB 7ABareNxisT I ContextFree 7 A gt a 7 A is N a is Vquot I ContextSensitive 7 a gt b 7 a is VNV b is Vquot Sample CFGrammar productions terminals T Varnumyl E 7gt E E nonterminals E 7gt E 7 E T 7gt var E Ti T gt hum start symbol T 7gt TT E T 7gt E productions are recursive and mutually recursive I derivation 7 start with starting sentence 7 rep eat until no nonterminals remain I chose a nonterminal symbol I chose a production to use I replace the nonterminal I le rnost always chose le rnost nonterminal to expand next I rightmost always chose rightmost nonterminal to expand next I example 3xy7 I 3 x y 7 7 semantic values I lefthand side is a single N gt derivation forms a parse tree I structure of parse tree important I parse tree gt abstract syntax tree AST I ambiguity 7 ame sentence with two different parse trees will have different meaning 7xyz I transform grammar 7 operand precedence new nonterminals morelater 7eg EigtETlEiTlT T gt T F i F F7gtnumlvarl E I associates to left I parser must read EOF marker so augment grammar with S7gtE E gt E T l E m T l T T gt T E l E F7gtnumlvarl E Parsing 7 takes token as input and performs syntactical also semantic analysis 7 topdown rootgtleaves in tree le gtright in rules abstract to concrete performs guessing handwritten or automatically gen 7 bottomup leaves to root in tree rightgtle concrete to abstract performs matching automatically gen LL Parsing Lefttoright Leftmost derivation recursive funtlon expand start symbol by choosing continue for a Ns fun so Case ltok of IE gt eatIE E0 eatTHEN 50 eatELSE 50 BEGIN gt eatBEGIN s L0 PRINT gt eatPRINT E SEMI gt eatSEMI 50 L0 and E0 eatNUM eatEQ eatNUM s7gtIfEthenSelses L7gtend sigtAc B7gtbB s7gtbeg1nsL L7gtSL A7gt5Bcd eps s7gt rlntE E7gt numnum p l B Q Q 7gt 01 c 7gt c eps datatype tokenIE THEN ELSE val tok ref getToken0 l eps fun advanceo tokgetToken fun eatt 1r ltokt then advanceo else erroro Parser 1r peek a then eata parseBo ParseCO eatd else parseBO parser 7ifanotherrule A 7gt a Q a con ict cannot predict with only 1 token lookahead topdown predicative LLk 7 k tokens lookahead recursive descent 7 turn each rammar production into a clause of a 0 start at top 7 from start rule and try to appropriate rule based on input token ll to choose production rule FIRST 7 give a String a from Vquot FIRSTa is set ofall terminals that can begin any stn g derived home E7gtETiE7TiT n 7gt n s i s n n 7gtnumlvarl 13 FIRSTTF num var FIRSTAB a by q a1 a X 7gt a2 and x 15 in FIRSTal and FIRSTa2 7gt conflict nullableX 7true ifX can derive the empty string FOLLOWX 7 set of terminals that can immediately follow X FIRST and FOLLOW 7 empty nullable 7 false for each terminal t FIRSTt lt t ep eat for each production XgtY1Yk i are all nullable or kIO then nullableXtrue for each i from 1 to k eachj from i1 to k ile Yil are all nullable or i1 then FIRSTX is FIRSTX U FIRSTYi iin1 k are all nullable or i k then FOLLOWYi is FOLLOWYi U FOLLOWX iin1 jl are all nullable or i1j then FOLLOWYi is FOLLOWYi U FIRSTYj until FIRST FOLLOW and nullable stop changing preorder inorder postorder tree traversals idea oftopdown vs bottomup parsers LLl topdown ltoken lookahead 0 we start in starting state and based on input token need to determine which production rule to choose for the next node in the tree 7 FIRST sew idid id inputgtE EgtTR TgtidlE RgtEleps First Set T id Input id id id FIRSTX 7 for X terminals nonterrninals all E 3 id production alternatives and alternative tails E FIRSTeps eps 53 Id init E 1 gt 7 forT token E hi i l 7 for N and noneps alternatives and tails T R id 7 for eps alternatives and tails s R Jr eps for each Ngta FIRSTN contains FIRSTa R taps E m for each alternative or tail of alternative a of form Ab E id FIRSTa contains FIRSTA excluding eps ifFIRSTA contains eps FIRSTa contains FIRSTb eps 5135 including eps Follow sets I if N is nullable and We see a token L hoW do We know Which rule to choose nOn39term FOllOwset I need information on Follow sets Input E 39 I 1n1t all FOLLOWN to T I for each MgtaNb FOLLOWN contains FIRSTb Without eps R I if FIRSTb includes eps FOLLOWN includes FOLLOWM Predictive table I For each N enter production Ngta for all T in FIRSTa I Ifa nullable enter Ngta for all T in FOLLOW N I can be modeled with a pushdown X Predicate parsing ta a from FoLLowX ble c automaton id g I for each input Input E E 7 pop N from stack E TR TR 7 check prediction table T id E 7 ifT gt match R E eps eps 7 else 7 push entry from pred table on stack Z7gtd Y7gt x7gtv x7gtx Y z Y7gtc X7gta Con icts nullable FIRST FOLLOW x yes a c a c d Y yes c a c d 2 no a c d I cannot determine Which rule to take on a character 7 multiple entries in prediction tables 7 grammar not LL1 7 can have additional lookahead I FIRSTFIRST con ict 7 FIRST sew all 01 ternatives for N must be disjunct X X X I FIRSTFOLLOW con ict7 for ch N With a X nullable alternative 7 FOLLOWN and FIRSTN y y7gt y7gt y7gt must be disjunct Y gt I No multiple nullable alternatives Z ZigtXYZ ZigtXYZ Zigtd ZigtXYZ 0 left factoring T gt id id E Tgt id A er id A eriid gt E eps substitution T gt id Indexed id E Indexediid gt id E T gt id i id E i E T gt id A er id i E leftrecurs ion removal E gtET o EgtT 0 FIRST ET contains FIRSTE contains FIRSTT W 0 make it into right recursive production N gtNa b gt b ba baa baaa baaaaaaaaaaa N gtbN N gtaN eps EgtET T E gt TE E gtTE eps Code Generation for Control Structures If Statements Code Template 11f condition is false go to 4 2 Execute statements in thenPart 3 Go to 5 4 Execute statements in elsePart 5 Statements following the if statement AST Node Structure HNode condition thenPart elsePart Stmts Stmts Code Generation fNodeCodeGen EIseLabel e CreateUniqueLabel oonditionResult e AddressNodeJumplfFalse EIseLabel conditionCodeGen thenPartCodeGen OutLabel e CreateUniqueLabel GenGoToOutLabel GenLabelEseLabel elsePartCodeGen GenLabelOutLabel 9 W gt39 W W N 9 Example Congder ifb al else a2 Assume a and b are local variables with indices of l and 2 We know b is a boolean limited to either 1 true or 0 false The code we generate is L1 L2 iload ifeq iconstl istore goto iconst2 istore 2 I L1 1 I I L2 1 I Store stack top into local 1 onto the stack false Push local 2 b Goto Ll if b is 0 Push 1 Store stack top into local 1 a Skip around else statements Push 2 a Special case no else part if b al will result in code with an extra jump iload 2 Push local 2 B onto the stack ifeq Ll Goto Ll if B is 0 false iconstl Push l istore l Store stack top into local 1 a goto L2 Skip around else statements L1 L2 Improved Code Generation fNodeCodeGen ElseLabel e CreateUniqueLabelO conditionResult e AddressNodeJumplfFalse ElseLabeI conditionCodeGen thenPartCodeGen if elsePart null then GenLabelElseLabeI else OutLabel e CreateUniqueLabelO GenGoToOutLabel GenLabelElseLabeI elsePartCodeGen GenLabelOutLabeI P 7 quot39quotP N O While Loops Code Template 11f condition is false go to 4 2 Execute statements in loopBody 3 Go to 1 4 Statements following the while loop Better version 1Go to 3 2 Execute statements in loopBody 31f condition is true go to 2 AST Node Structure whiIeNode condition loopBody Code Generation VW bNodeCodeGen CondMonLabele ge Don nueLabeK GenGoToConditionLabe TopLabeI e CreateUniqueLabelO GenLabeKprLabeD loopBodyCodeGenO GenLabeKCondMonLabeD conditionResult e AddressNodeJumplfTrue TopLabeI conditionCodeGen N999N4 Example In this While loop I is an integer local with an index of 2 and L1 is chosen as the ConditionLabel while I gt 0 I TheJVhdcodegeneHnedis goto Ll Skip around loop body L2 iload 2 Push local 2 I onto the stack iconstl Push l isub Compute I l istore 2 Store I l into local 2 I Ll iload 2 Push local 2 I onto the stack ifge L2 Goto L2 if I is gt 0 C style For Loops Code Template 1Execute initializer code 2Go to 5 3Execute statements in loopBody 4Execute increment code 5If condition is true go to 3 AST Node Structure forNode initializer condition increment loopBody Code Generation Fo P 7 quot39quotP N 13 14 rNodeCodeGen initializerCodeGen SkipLabel e CreateUniqueLabelO GenGoToSkipLabel TopLabel e CreateUniqueLabelO GenLabelTopLabel oopBodyCodeGen ContinueLabel e getContinueLabelO GenLabelContinueLabel incrementCodeGen GenLabelSkipLabel if condition null then GenGoToTopLabel else conditionResult e AddressNodeJumplfTrue TopLabel conditionCodeGen Special Case Example The special case of for generates L1 L2 U goto L1 L3 goto L2 Instruction Selection 0 A list of simpli ed IR trees gt need to map these to assembly instructions 0 tree node 61gt one operation 7 BINOP 0 machine instruction 61gt multiple operations 7 eg load involves fetch add Instruction trees 0 Represent instructions as tree fragments 7 tree patterns tiles 0 Tile the IR trees with instruction tiles so that 7 no IR tree nodes are uncovered 7 no tiles overlap M M 0 The tiling corresponds to selected BINOP instructions PLUS eCOfTST O timal amp O timum Tilin Instruction tiles p p g tile rOOt HOde 0 other instructions a value in a regi er 7 no result in register 0 e and c 7 side effect only 7 unresolved subtrees leaves MOVE MZEM Mll ZM i CONST COINST e C c el el load1w tile storesw tile 0 An instruction may have alternate patterns 0 An IR can be tiled in different ways 0 tiling is optimum ifthe set of tiles have lowest cost best 7cost 7 eg number oftiles used sum of cycles for instructions selected 0 A tiling is optimal ifno two adjacent tiles can be combined into a single tile of lower cost locally best optimum gt optimal is it really optimum what about other optimizations 7 I constant folding code movemen How do we find optimumoptimal tiling 7 maximal munch 7 dynamic programming 7 tree grammars Maximal Munch 0 choose largest tile that matches at the root of the IR tree 7 munch 7 this is instructionl 7 you still have subtrees T1 T2 Recursively apply maxmunch for T1 Generate instruction I with source destination registers produced from subtrees T1 0 good to have large initial exp trees 7 can take large munches 0 but maybe an alternate smaller munch at the beginning can leave subtrees that can be tiled at lower costbetter 7 greedy algorithm uses local info only Dynamlc Pro grammmg find optimum for entire problems based on optimums for each subproblem 7 bottomup assume 7 each instruction has a cost value c at node n each possible tiling leaves zero or more residual subtrees leaves 7 costs of leaves are already computed 7 for a speci c tiling t the cost ofn Will be the sum of costs of t and subtrees choose the tiling t with minimal cost When you get to root node 7 emit instructions for each leaf 7 emit instruction for selected tile for root Tree Grammars What if we have to consider additional conditions when selecting a tile 7 eg different registers for address and data 7 ok 7 can label nodes in tiles with conditions I can return result in a d or none statement 5 7 can represent them as grammars I a d r are nonterminals 7 grammar ambiguous 7 but ok we want to consider different possibilities to find minimum cost I dynamic programming algorithm still works I at each node find minimum cost to produce a and minimum cost to produce d and any other nonterminal 7 eg different sized immediates I programming can be hairy 7 tools exits BURG 7 accepts tree gmrnmars and produces a tree parser 7 ML BURG overkill for Tiger parser For Tiger compiler 7 Maximal Munch in ML 7 ML great for pattern patching 7 in C we ll have to encode fast matching algorithm I consider type of each node existing patters for that type then children Implementing Maximal Munch fun munchstmMOVEMEMBINOPPLUse1co ST ie2 p e1 munchE e2 emit STORE munchstmMOVEMEMBINOPPLUscoNsT ie1 e2 munchExp e1 munchEx e2 emit STORE munchstmMOVEMEM e1FETCHMEM e2 munchExp e1 munchExp e2 emit STORE fun munchExp BINOP PLUS e 1coNsT 1 7mun D chExp e1 emit AD 1 munchExpBINOPPLUSCONST ie1 munchExp e1 emit ADDI munchExpBINOPPLUSe1eZ munchExp e1 munchExp e2 emit ADD 0 Register allocation 7 not yet 0 use virtual registers Representlng IHSUUCUOHS 7 no need to tie reg alloc to machine specific 39 39 39 d t t t instruction selection a a Yge ins r of I aside from numberkind ofregisters allocation is a S S em 5 t ring maclunemdependent so Want to be able to reuse dst temp list I 7 in tree 7 register root oftile src temp list 7 before IS no information whicth nodes will jump label hSt Option LABEL of assem string lab label registers MOVE of I some may be internal to aninst tile assem string 7 we ll do reg alloc after instruction selection d5 temPr src temp Sample instructions tructure A Assem 5w ta 4tb OPERlasse 5 5039 ism fun munchstmTSEQab munchstm a munchstm b t Hr munchstm src 7 tempatempb jump r NONE TMOVETMEMTBINOPTPLUSe1TCONST ie2 em1tAOPERassem sw 50 A lntTOStrlng i A 51 srcmunchEgtltp e2 munchExp e1 dst LABELlasse L0 39 lab label 3 mp7N0NE if i within no of bits munchstmTMovETMEMTBINOPTPLUsTcoNsT ie1e2 move 339 stb em1tAOPERassem sw quot lntTOStrlng l A MOVEassem 7 move dO 50quot dst mpa munchStmTLABEL lab src tempb 7 emitALABELassemSymbolname lab A lablab and munchEXpTBINOPTPLUSTCONST 1 e2 7 assem7 addlu d0 50 A intToStrlng 1 srcmunchEgtltp e2 dst7r jump7NONEl but if 1 gtgt then multiple inst munchExpTMEMTBINoPTPLUse1TcoNsT 1 resultltfn rem1tAOPER assem7 iw dO A intTOStrlng 1 A 50 srcmunchEgtltp e1 dst7r jumpNONE munchExp Treeexp 7gt Temp temp munchstm Treestm 7gt unit in codegen emit gt instruction list 0 results gt newtemp in as s em 0 format m i gt string 7 m 7 function that translates temp registers to appropriate strings 7 l 7 the instruction l munchStmAJUMPTNAME lab labllSt em1tAOPERassem 7 quot3 30quot src 17 dst 17 jump 7 SOME labllstl l munchstmAcJUMPrelope1e2tf 7 case relop of TEQ 7gt em1tAOPERassem 7 quotbeq 50 51 30quot src 7 munchExp e1 munchExp e2 dst 17 jump 7 soMEt mm EXPCALLfarglist or MOVETEMP t CALLf arglist l munchstmTEXPTCALLe1argllst em1tassem jalr 50quot srcmunchEgtltp munchArgs or something dst7caiidefs jumpNONE munchArgs 7 generates code for movement of argument to correct position reg memory results in list of source registers for the CALL instruction list in not really an arg but needed for liveness analysis alldefs 7 all registers trashed by the call the callersave registers Sra we have instruction sequence 7 almost there 7 just need to gure out the registers instruction sequence may use hundreds or temporaries 7 but we only have n registers need register allocation algorithm before 7 need to make some analysis In order to perform liveness analysis we need control ow graph 7 CFG elements 7 statements instructions basic blocks Live range 7 segment of the graph where variable is live 1f a lt N goto L1 return C Liveness analysis key to register allocation determine which temporaries can be bound to same register 7 variables which are not in use at the same time variable a is live ifthe value it holds isWill be used register which holds a live variable cannot be allocated to another var 7 if all live vars don t t in registers the excess can be kept in memory var v is live at sthinstblock I in graph G if 7 there is a path from I to ause ofvwhich does not include assignments o v 1m range ofv starts halfway through assignment 1m range ofv ends halfway into the assignment an assignmmt de nes a Variabletemporary def v 7 set ofgraph nodes Where v is de ned def I 7 set ofvariables de ned at I use I 7 set ofvariables used at I rhs other exp use v 7 set ofnodes at which v is used liveness a variable is live at edge e in G ifthe 39s a directed path from a use of the variable that does not go through any defs Calculating liveness since live ranges startend halfway need to look at edges in and out of stmtsinstsblocks in the ow graph live7in I variables live at the top ofI live7out I variables live on some edges going out of I 7 var may not be used a er block I 7 instruction following I may rely on sideeffects 1nn u5en U outnedefn outn U 1ns where s 15 11 succn foreach I do 1nn lte outn lt I repeat foreach n out nlt outn U outne defn l where s 11 succn 1nnlte u5en outnlte U 11 until in n1nn out noutn foreach I do 1nn lte outn lt I WL lt WL U 1 while WL ltgt 0 remove n from WL ll 1 lt u5en U outn defn out nlt7 U 1n5 where s 11 succn if 1nn changed then WL lt7 WL U predn 39 could have algorithms that start pessimistically and then improve estimate by eliminating registers Which are not going to be necess if in oppos1te direction of contro 7 do loops backwards consider basic blocks at a time not individual s computation of algorithm converges more quickly ow 7 need to know What really needs to be live at end of block many variables shortlived 7 algorithm Will converge quickly for those so perform it one variable at a time representing the sets in out 7 bit arrays N variables in NK Words union OR good for dense sets 7 linked lists union merge 7 duplicates good for sparse sets ltlt NK iterations will stop 7 each iteration may only increase sets or not change sets and stop 7 set sizes are limited by N 7 worstcase cost ONM in practice On 7 OnAZ Conservative approximation for liveness cannot be sure that a variable will be used so it s safe to be conservative and keep it in live sets dynamic liveness for some execution we go from I to usev Without defs static liveness there is some path from I to usev Without defs in general cannot determine reachability of usev 7 or label or end of program halting problem Halting problem there is no program H that takes any program P and input X and HPX returns true if FX halts and HPX returns false if FX in niteloops Suppose H exists de ne FY ifHYY then While true do 0 else true if FF halts then HFF is true but by def FF loops if FF loops then HFF is false but by defFF halts and returns true ame no program H can tell for any program X that a label L Within the program can be reached Interference graph Liveness analysis used for register allocation 7 ifvars a and b are live at the same point 1 they cannot be allocated to same register a and b interfere if a used by instructioni Which cannot use register r gt cannot allocate a to r and a and r interfere Construct undirected graph nodes are variables and edges connect variables that interfere each pair of adjacent nodes must be allocated in different registers register allocation translates to graph coloring problem at any nonmove instruction that defines a With liveout vars bl bj add interference edges a bl a bj at a move instruction a lt c With liveout vars bl bj add interference edges a bi for any bi ltgt c 0 move is reg move x lt y add edge later if assign appears While other var still live Data Flow Analysis Liveness analysis 7 DFA needed to perform register allocation DFA can be performed for other reasons 7 analyzetraverse ow graph and keep track of a set of property information 7 use info gathered to modify the program in some consistent Way 7 Chapter 17 Pair analysis and transformation Liveness amp register allocation Liveness amp dead code elimination Reaching definitions amp constant propagation Available expressions amp common subexpression elimination The objective Perform certain optimizations Current representation of program is correct but we d like to 7 reduce temporaries variables 7 eliminate unnecessary operations Data ow Equations Liveness inn genn U outn 7 killn outn U ins for s in succn gen 7 set of generated live vars use kill 7 set of vars for which liveness property no longer holds newly def 7 starts With empty sets 7 builds sets for Which liveness holds at n gt cannot terminate earl 7 the ow is analyzed backwards Everything covered in class since the midterm in detail The assumption is you know everything coVered before that Questions about how is suchandsuch done in Tiger compiler are legit 7 I won t ask you to write ML code though List on following slides IS NOT complete 7 everything covered in class is fair game new topics Intermediate Representation Instruction S election Basic Blocks and Traces Optimizations Liveness Data ow analysis Register allocation Instruction scheduling old topics Compilers Lexical analysis Parsing Abstract Syntax Tree Symbol Tables Type checking and Semantic Analysis Stack Frames Procedure Calling Conventions Intermediate Representation what and why where is the line between source codeIRmachine mm expressions statements and conditionals simple Variables to function calls 7 what s interesting 7 e g give intermediate tree representation for array creation conditionals special cases amp and operations while for loops case statements cti s external functions procedure prologueepilogue Basic Blocks getting rid pulling of SEQESEQ CALL why what is the objective of steps in canon 39 BINOPopeZ ESEQse2 gt ESEQMOVE TEMP tnew e1 ESEQs BINOPop TEMP tnew e2 true labelfalse label generating optimal traces Optimizations what to optimize scope optimize a piece of code steps to ensure constant propagation or other optimization pe ephole optimization Instruction Selection why tiles optimal vs optimum tiling maximal munch dynamic programming tree grammars given a IR tree find optimuma1 tiling RISC vs CISC inst sets 7 pros and cons of having CISC options Liveness what info are we gathering outline algorithm static vs dynamic info halting problem graph construction and move instruction example 7 construct graph and do analysis f SPIM Memory Layout 0x7 f f f f f Stack Segment l T Data Segment Text Segment Layout of Memory f Layout of a Stack Frame Parameter 5 Parameter 6 Parameter 7 fP gt Saved Registers Local Variables Dynamic Variables SP gt Procedures No Parameters No Recursion Calling a Procedure Execute a jal instruction At the beginning of the called Procedure Space is allocated statically for the local variables so nothing need be done to allocate a stack frame Register ra needs to be saved in a designated storage location for this procedure in order to allow the routine itself to make calls Any of the registers 5057 that are used by the callee need to be saved To return from a call Restore any calleesaved registers that were saved upon entry Return byjumping to the return address saved from register ra using the jr instruction Procedures No Parameters with Recursion Calling a Procedure Execute a jal instruction At the beginning of the called Procedure Register ra needs to be saved to the location referenced by the stack pointer The appropriate display register must be saved into the following word of the stack frame Save any other calleesaved registers in the frame Establish the new display pointer by copying sp into the appropriate display register Establish the stack frame by subtracting the frame size from the stack pointer To return from a call a Pop the stack frame by adding the frame size to sp Restore calleesaved registers including the appropriate display register a Return by jumping to the address in register ra it is also at the location now referenced by sp Procedures Parameters and Recursion with Chains Calling a Procedure Pass the parameters By convention the first four parameters are passed in registers a0a3 The remaining arguments are pushed on the stack a Set up the static chain register necessary if the procedure is nested within another dynamically allocated scope Save the callersaved registers t0t9 if they contain live values at the call site Execute a jal instruction At the beginning of the called Procedure Save the calleesaved registers in the frame Register fp must be saved since it is being used to reference variables in the calling frame Register ra needs to be saved in order to allow the routine itself to make calls Any of the registers 5057 that are used by the callee need to be saved Establish the stack frame by subtracting the frame size from the stack pointer Establish the frame pointer fp by adding the stack frame size to sp Assemblers Linkers and the SPIM Simulator James R Larus Computer Sciences Department University of W isconsiniMadison Fear of serious injury cannot alone justify suppression offree speech and assembly Louis Brandeis Whitney V California 1927 Copyright 1998 Morgan Kaufmann Publishers No portion of this material may be reproduced Without permission of the publsiher Introduction A73 Assemblers A710 Linkers A717 Loading A719 Memory Usage A720 Procedure Call Convention A722 Exceptions and Interrupts A732 Input and Output A736 SPIM A738 MIPS R2000 Assembly Language A749 Concluding Remarks A775 Key Terms A776 Exercises A776 Introduction Encoding instructions as binary numbers is natural and ef cient for compute ers Humans however have a great deal of dif culty understanding and manipulating these numbers People read and write symbols words much better than long sequences of digits Chapter3 showed that we need not choose between numbers and words because computer instructions can be represented in many ways Humans can write and read symbols and come puters can execute the equivalent binary numbers This appendix describes the process by which a humanereadable program is translated into a form that a computer can execute provides a few hints about writing assembly pro grams and explains how to run these programs on SPIM a simulator that executes MIPS programs Unix Windows and DOS versions of the SPIM sime ulator are available through WWWmkpcomcod2ehtm Assembly language is the symbolic representation of a computer s binary en codingimachine language Assembly language is more readable than machine language because it uses symbols instead of bits The symbols in assembly lan guage name commonly occurring bit patterns such as opcodes and register speci ers so people can read and remember them In addition assembly lan guage permits programmers to use labels to identify and name particular meme ory words that hold instructions or data A4 Appendix A Assemblers Linkers and the SPIM Simulator Object file Object file Llnker Object Program file libraw FIGURE A1 The process that produces an executable le An assembler translates a le of assembly language into an object le which is linked with other les and libraries into an executable le Executable le A tool called an assembler translates assembly language into binary instruc7 tions Assemblers provide a friendlier representation than a computer s 0s and ls that simpli es writing and reading programs Symbolic names for opera tions and locations are one facet of this representation Another facet is pro gramming facilities that increase a program s clarity For example macros discussed in section A2 enable a programmer to extend the assembly lan guage by de ning new operations An assembler reads a single assembly language source le and produces an object le containing machine instructions and bookkeeping information that helps combine several object les into a program Figure Al illustrates how a program is built Most programs consist of several lesialso called modulesi that are written compiled and assembled independently A program may also use prewritten routines supplied in a program library A module typically con tains references to subroutines and data de ned in other modules and in librari ies The code in a module cannot be executed when it contains unresolved references to labels in other object les or libraries Another tool called a linker combines a collection of object and library les into an executable le which a computer can run To see the advantage of assembly language consider the following se quence of gures all of which contain a short subroutine that computes and prints the sum of the squares of integers from 0 to 100 Figure A2 shows the machine language that a MIPS computer executes With considerable effort you could use the opcode and instruction format tables in Chapters 3 and 4 to translate the instructions into a symbolic program similar to Figure A3 This form of the routine is much easier to read because operations and operands are written with symbols rather than with bit patterns However this assembly A1 Introduction OOlOOllllOllllOllllllllllllOOOOO lOlOlilllOllllllOOOOOOOOOOOlOlOO lOlOlilllOlOOlOOOOOOOOOOOOlOOOOO lOlOlilllOlOOlOlOOOOOOOOOOlOOlOO lOlOlilllOlOOOOOOOOOOOOOOOOllOOO lOlOlilllOlOOOOOOOOOOOOOOOOlllOO lOOOlilllOlOlllOOOOOOOOOOOOlllOO lOOOlilllOlllOOOOOOOOOOOOOOllOOO OOOOOOOTllOOlllOOOOOOOOOOOOllOOl OOlOOlOlllOOlOOOOOOOOOOOOOOOOOOl OOlOlOOlOOOOOOOlOOOOOOOOOllOOlOl lOlOlilllOlOlOOOOOOOOOOOOOOlllOO OOOOOOOOOOOOOOOOOllllOOOOOOlOOlO OOOOOOllOOOOllllllOOlOOOOOlOOOOl OOOlOlOOOOlOOOOOllllllllllllOlll lOlOlilllOlllOOlOOOOOOOOOOOllOOO OOllllOOOOOOOlOOOOOTOOOOOOOOOOOO lOOOlilllOlOOlOlOOOOOOOOOOOllOOO OOOOllOOOOOlOOOOOOOOOOOOlllOllOO OOlOOlOOlOOOOlOOOOOOOTOOOOWlOOOO lOOOlilllOllllllOOOOOOOOOOOlOlOO OOlOOllllOllllOlOOOOOOOOOOlOOOOO OOOOOOlllllOOOOOOOOOOOOOOOOOlOOO OOOOOOOOOOOOOOOOOOOTOOOOOOTOOOOT FIGURE AZ MIPS machine language code for a routine to compute and print the sum of the squares of integers between 0 and 1 0 language is still dif cult to follow because memory locations are named by their address rather than by a symbolic label Figure 4 shows assembly language that labels memory addresses with mnemonic names Most programmers prefer to read and write this form Names that begin with a period for example data and g l ob l are assembler directives that tell the assembler how to translate a program but do not produce machine instructions Names followed by a colon such as Str or ma l n are la bels that name the next memory location This program is as readable as most assembl language programs except for a glaring lack of comments but it is still dif cult to follow because many simple operations are required to accome plish simple tasks and because assembly language s lack of control flow con structs provides few hints about the program s operation By contrast the C routine in Figure A5 is both shorter and clearer since variables have mnemonic names and the loop is explicit rather than construct ed with branches If you are unfamiliar with C you may wish to look at Web Extension 11 at WWWmkpcomcod2ehtm In fact the C routine is the only one that we wrote The other forms ofthe program were produced by a C compiler and assembler Appendix A Assemblers Linkers and the SPIM Simulator addw 29 29 732 SW 31 2029 SW 4 3229 SW 5 3629 SW O 2429 SW O 2829 1W 14 2829 1W 24 2429 mu1tu 14 14 addw 8 14 1 S1t1 1 8 101 m o 15 addu 25 24 15 one 1 7 SW 25 2429 1w 4 96 1W 5 2429 Jal 1048 81 addw 4 4 1072 1W 31 2029 addw 29 29 32 J1 31 move 2 0 FIGURE A3 The same routine written in assembly language However the code for the routine does not label registers or memory locations nor include comments In general assembly language plays two roles see Figure A6 The rst role is the output language of compilers A compiler translates a program written in a highilevel language such as C or Pascal into an equivalent program in ma chine or assembly language The highilevel language is called the source language and the compiler s output is its target language Assembly languages other role is as a language in which to write programs This role used to be the dominant one Today however because of larger main memories and better compilers most programmers write in a highilevel lan guage and rarely if ever see the instructions that a computer executes Never theless assembly language is still important to write programs in which speed or size are critical or to exploit hardware features that have no analogues in highilevel languages Although this appendix focuses on MIPS assembly language assembly pro gramming on most other machines is very similar The additional instructions and address modes in CISC machines such as the VAX see Web Extension 111 at wwwmkpcomCodZehtm can make assembly programs shorter but do not change the process of assembling a program or provide assembly language with the advantages of highilevel languages such as typeichecking and struci tured control flow A1 Introduction A7 text al gn 2 globl mam mam SUbU Sp Sp 32 SW a 20Sp Sd aO 32Sp SW 0 24Sp SW 0 28Sp loop W t6 28Sp mul t7 t6 t6 W t8 24Sp addu t9 t8 t7 SW t9 24Sp addu tO t6 1 SW tO 28Sp ble tO WOO loop la aO W a1 24Sp Jal prm move vO 0 W ra 20Sp addu Sp Sp 32 J a data al gn 0 Str aSCl lZ The Sum from O 100 5 dn FIGURE AA The same routine written in assembly language with labels but no com ments The commands that start with periods are assembler directives see pages A7517A753 text indicates that succeeding lines contain instructions data indicates that they contain data a g n indicates that the items on the succeeding lines should be aligned on a 2 byte boundary Hence a g 2 means the next item should be on a word bounda g Obi ma l n declares that ma l n is a global symbol that should be visible to code stored in other les Finally aSC l l Z stores a nullrterminated string in memory When to Use Assembly Language The rimary reason to program in assembly language as opposed to an avail able highelevel language is that the speed or size of a program is critically important For example consider a computer that controls a piece of machine ery such as a car s brakes A computer that is incorporated in another device such as a car is called an embedded computer This type of computer needs to respond rapidly and predictably to events in the outside world Because a A3 Appendix A Assemblers Linkers and the SPIM Simulator lnclude ltStdl0 hgt mt mam mt argc char argv mt l mt sum O forlO lltlOO ll1sumsuml l prlntf The sum from O 100 lS dn sum FIGURE A5 The routine written in the c programming language Highrlevel language program 1 Compiler Assembler Linker FIGURE A6 Assembly language either is written by a programmer or is the output of a com piler A Assembly language program compiler introduces uncertainty about the time cost of operations program mers may nd it dif cult to ensure that a highilevel language program responds within a de nite time intervalisay l millisecond after a sensor detects that a tire is skidding An assembly language programmer on the other hand has tight control over which instructions execute In addition in embedded applications reducing a program s size so that it ts in fewer memory chips reduces the cost of the embedded computer A hybrid approach in which most of a program is written in a highilevel language and timeicritical sections are written in assembly language builds on the strengths of both languages Programs typically spend most of their time executing a small fraction of the program s source code This observation is just the principle of locality that underlies caches see section 72 in Chapter 7 Program pro ling measures where a program spends its time and can nd the timeicritical parts of a program In many cases this portion of the program can be made faster with better data structures or algorithms Sometimes how ever signi cant performance improvements only come from recoding a criti7 cal portion ofa program in assembly language A1 Introduction A9 This improvement is not necessarily an indication that the highilevel language s compiler has failed Compilers typically are better than program mers at producing uniformly highiquality machine code across an entire pro gram Programmers however understand a program s algorithms and behavior at a deeper level than a compiler and can expend considerable effort and ingenuity improving small sections of the program In particular prof grammers often consider several procedures simultaneously while writing their code Compilers typically compile each procedure in isolation and must follow strict conventions governing the use of registers at procedure bound aries By retaining commonly used values in registers even across procedure boundaries programmers can make a program run faster Another major advantage of assembly language is the ability to exploit spei cialized instructions for example string copy or patternimatching instruc7 tions Compilers in most cases cannot determine that a program loop can be replaced by a single instruction However the programmer who wrote the loop can replace it easily with a single instruction In the future a programmer s advantage over a compiler is likely to become increasingly dif cult to maintain as compilation techniques improve and machines pipelines increase in complexity Chapter The nal reason to use assembly language is that no highilevel language is available on a particular computer Many older or specialized computers do not have a compiler so a programmer s only alternative is assembly language Drawbacks of Assembly Language Assembly language has many disadvantages that strongly argue against its widespread use Perhaps its major disadvantage is that programs written in assembly language are inherently machineispeci c and must be totally rewritten to run on another computer architecture The rapid evolution of computers discussed in Chapter 1 means that architectures become obsolete An assembly language program remains tightly bound to its original archi7 tecture even after the computer is eclipsed by new faster and more cost effective machines Another disadvantage is that assembly language programs are longer than the equivalent programs written in a highilevel language For example the C program in Figure AS is ll lines long while the assembly program in Figure A4 is 31 lines long In more complex programs the ratio of assembly to highilevel language its expansion factor can be much larger than the factor of three in this example Unfortunately empirical studies have shown that pro grammers write roughly the same number oflines of code per day in assembly as in highilevel languages This means that programmers are roughly Xtimes more productive in a highilevel language where X is the assembly language expansion factor A10 Appendix A Assemblers Linkers and the SPIM Simulator To compound the problem longer programs are more dif cult to read and understand and they contain more bugs Assembly language exacerbates the problem because of its complete lack of structure Common programming id ioms such as i then statements and loops must be built from branches and jumps The resulting programs are hard to read because the reader must recon struct every higherilevel construct from its pieces and each instance of a state ment may be slightly different For example look at Figure A4 and answer these questions What type of loop is used What are its lower and upper bounds Elaboration Compilers can produce machine language directly instead of relying on an assembler These compilers typically execute much faster than those that invoke an assembler as part of compilation However a compiler that generates machine lan guage must perform many tasks that an assembler normally handles such as resolving addresses and encoding instructions as binary numbers The traderoff is between comr pilation speed and compiler simplicity Elaboration P r39 39 39 39 h A d quot 39 are written in a highrlevel language Many of these applications are large and complex programs that must be extremely reliable Assembly language programs are longer and more diffi cult to write and read than highrlevel language programs This greatly increases the cost of writing an assembly language program and makes it extremely difficult to verify the correctness ofthis type of program In fact these considerations led the Department of Defense which pays for many complex embedded systems to develop Ada a new high level language for writing embedded systems Assemblers An assembler translates a le of assembly language statements into a le of binary machine instructions and binary data The translation process has two major parts The rst step is to nd memory locations with labels so the rela7 tionship between symbolic names and addresses is known when instructions are translated The second step is to translate each assembly statement by combining the numeric equivalents of opcodes register speci ers and labels into a legal instruction As shown in Figure Al the assembler produces an output le called an object le which contains the machine instructions data and bookkeeping information AZ Assemblers An object le typically cannot be executed because it references procedures or data in other les A label is external also called global if the labeled object can be referenced from les other than the one in which it is de ned A label is local if the object can be used only within the le in which it is de ned In most assemblers labels are local by default and must be explicitly declared global Subroutines and global variables require external labels since they are refer enced from many les in a program Local labels hide names that should not be visible to other modulesifor example static functions in C which can only be called by other functions in the same le In addition compileregenerated namesifor example a name for the instruction at the beginning of a loopi are local so the compiler need not produce unique names in every le Example Answer Local and Global Labels Consider the program in Figure A4 on page A77 The subroutine has an external global label ma l m It also contains two local labelsi l 00p and Strithat are only visible with this assembly language file Finally the routine also contains an unresolved reference to an external label pr l mt which is the library routine that prints values Which labels in Figure A4 could be referenced from another le Only global labels are visible outside of a file so the only label that could be referenced from another le is mal m Since the assembler processes each le in a program individually and in iso lation it only knows the addresses of local labels The assembler depends on another tool the linker to combine a collection of object les and libraries into an executable le by resolving external labels The assembler assists the linker by providing lists of labels and unresolved references However even local labels present an interesting challenge to an assembler Unlike names in most highelevel languages assembly labels may be used be fore they are de ned In the example in Figure A4 the label Str is used by the l a instruction before it is defined The possibility of a forward reference like this one forces an assembler to translate a program in two steps rst nd all labels and then produce instructions In the example when the assembler sees the l a instruction it does not know where the word labeled Str is located or even whether Str labels an instruction or datum Appendix A Assemblers Linkers and the SPIM Simulator An assembler s rst pass reads each line of an assembly le and breaks it into its component pieces These pieces Which are called lexemes are individ7 ual words numbers and punctuation characters For example the line bie t0 WOO oop contains 6 lexemes the opcode bi e the register speci er tO a comma the number 100 a comma and the symbol i 00p lfa line begins With a label the assembler records in its symbol table the name of the label and the address of the memory word that the instruction occupies The assembler then calculates hoW many words of memory the instruction on the current line Will occupy By keeping track of the instructions sizes the as sembler can determine Where the next instruction goes To compute the size of avariableilength instruction like those on the VAX an assembler has to exam ine it in detail Fixedilength instructions like those on MIPS on the other hand require only a cursory examination The assembler performs a similar calcula7 tion to compute the space required for data statements When the assembler reaches the end of an assembly le the symbol table records the location of each label de ned in the le The assembler uses the information in the symbol table during a second pass over the le Which actually produces machine code The assembler again examines each line in the le If the line contains an instruction the assembler combines the binary representations of its opcode and operands register spec i ers or memory address into a legal instruction The process is similar to the one used in section 34 in Chapter 3 Instructions and data words that reference an external symbol de ned in another le cannot be completely assembled they are unresolved since the symbols address is not in the symbol table An assembler does not complain about unresolved references since the corre7 sponding label is likely to be de ned in another le Assembly language is a programming language lts The Big principal difference from highilevel languages such as BASIC Java and C is that assembly language prof Picture vides only a feW simple types of data and control ow Assembly language programs do not specify the type of value held in a variable Instead a programi mer must apply the appropriate operations eg integer or oating point addition to a value In addition in assembly language pro grams must implement all control ow with go as Both factors make assembly language programming for any machineiMlPS or 80x867 more dif cult and erroriprone than writing in a highilevel language AZ Assemblers A13 Elaboration If an assembler s speed is important this tworstep process can be done in one pass over the assembly file with a technique known as backpatching In its pass over the file the assembler builds a possibly incomplete binary representation of every instruction lfthe instruction references a label that has not yet been defined the assembler records the label and instruction in a table When a label is defined the assembler consults this table to find all instructions that contain a forward reference to the label The assembler goes back and corrects their binary representation to incorpor rate the address of the label Backpatching speeds assembly because the assembler only reads its input once However it requires an assembler to hold the entire binary representation of a program in memory so instructions can be backpatched This requirement can limit the size of programs that can be assembled Object File Format Assemblers produce object les An object le on Unix contains six distinct sections see Figure A7 I The object le header describes the size and position of the other pieces of the le I The teXt segmentcontains the machine language code for routines in the source le These routines may be unexecutable because of unresolved references I The data segment contains a binary representation of the data in the source le The data also may be incomplete because of unresolved ref erences to labels in other les I The relocation information identi es instructions and data words that de pend on absolute addresses These references must change ifportions of the program are moved in memory I The symbol table associates addresses With external labels in the source le and lists unresolved references I The debugging information contains a concise description of the way in Which the program was compiled so a debugger can nd Which in struction addresses correspond to lines in a source le and print the data structures in readable form Text segment Relocation information Symbol table Debugging Object le Da a header segment information FIGURE A39l Object le A Unix assembler produces an object le with six distinct sections A14 Appendix A Assemblers Linkers and the SPIM Simulator The assembler produces an object le that contains a binary representation of the program and data and additional information to help link pieces of a program This relocation information is necessary because the assembler does not know which memory locations a procedure or piece of data will occupy af7 ter it is linked with the rest ofthe program Procedures and data from a le are stored in a contiguous piece of memory but the assembler does not know where this memory will be located The assembler also passes some symbol ta ble entries to the linker In particular the assembler must record which external symbols are de ned in a le and what unresolved references occur in a le Elaboration For convenience assemblers assume each file starts at the same address for example location 0 with the expectation that the linker will relocate the code and data when they are assigned locations in memory The assembler produces relocation information which contains an entry describing each instruction or data word in the file that references an absolute address On MIPS only the subroutine call load and store instructions reference absolute addresses Instructions that use PCrrelative addressing such as branches need not be relocated Additional Facilities Assemblers provide a variety of convenience features that help make assemi bler programs short and easier to write but do not fundamentally change assembly language For example data layout directives allow a programmer to describe data in a more concise and natural manner than its binary represen7 tation In Figure A4 the directive ascl lZ quotThe sum from 0 100 is dn stores characters from the string in memory Contrast this line with the alter native of writing each character as its ASCII value Figure 315 in Chapter 3 describes the ASCII encoding for characters byte 84 104 101 32 115 117 109 32 byte 102 114 111 109 32 48 32 46 byte 46 32 49 48 48 32 105 115 byte 32 37 100 10 0 The asc l l Z directive is easier to read because it represents characters as let ters not binary numbers An assembler can translate characters to their binary representation much faster and more accurately than a human Data layout directives specify data in a humanireadable form that the assembler translates to binary Other layout directives are described in section AlO on pages A7517 A753 AZ Assemblers A15 String Directive Example Define the sequence of bytes produced by this directive asc1 1Z quotThe duxck brown fox Jumps over the lazy dog Answer byte 84 104 101 32 113 117 105 99 byte 107 32 98 114 111 119 110 32 byte 102 111 120 32 106117109112 byte 115 32 111118101114 32 116 byte 104 101 32 108 97 122 121 32 byte 100 111 103 0 Macros are a patternimatching and replacement facility that provide a sime ple mechanism to name a frequently used sequence of instructions Instead of repeatedly typing the same instructions every time they are used a program mer invokes the macro and the assembler replaces the macro call with the cor responding sequence of instructions Macros like subroutines permit a programmer to create and name a new abstraction for a common operation Unlike subroutines however macros do not cause a subroutine call and return when the program runs since a macro call is replaced by the macro s body when the program is assembled After this replacement the resulting assembly is indistinguishable from the equivalent program written without macros Macros As an example suppose that a programmer needs to print many numbers The library routine pr 1 ritf accepts a format string and one or more values to print as its arguments A programmer could print the integer in register 7 with the following instructions data mtistr asm izquotd text 1a a0 lotistr Load string address mto first arg mov a1 7 Load value mto second arg Jal prmtf Call the prmtf routine A16 Appendix A Assemblers Linkers and the SPIM Simulator The data directive tells the assembler to store the string in the program s data segment and the text directive tells the assembler to store the in structions in its text segment However printing many numbers in this fashion is tedious and pro duces a verbose program that is difficult to understand An alternative is to introduce a macro pr tii rit to print an integer data intistr ascxiz d text macro printntarg a aO intistr Load string address tO first arg mov al arg Load macro s parameter arg tO second arg Jai printf Caii the printf routine endimacro prmtg nt7 The macro has a formal parameter arg that names the argument to the macro When the macro is expanded the argument from a call is substitute ed for the formal parameter throughout the macro s body Then the assemi bler replaces the call with the macro s newly expanded body In the rst call on pr tii rit the argument is 7 so the macro expands to the code i a a0 intistr mov al 7 J a pr ntf In a second call on pr mtg mt say pri mtg ntt0 the argument is tO so the macro expands to i a a0 intistr mov al tO J a pr ntf What does the call pr mtg mt a0 expand to mov al aO J a pr ntf This example illustrates a drawback of macros A programmer who uses this macro must be aware that pr tii rit uses register a0 and so cannot correctly print the value in that register A3 Linkers A17 Some assemblers also implement pseudoinstructions which Hardware are instructions provided by an assembler but not implei software mented in hardware Chapter 3 contains many examples of how the MIPS assembler synthesizes pseudoinstructions Interface and addressing modes from the spartan MIPS hardware instruction set For example section 35 in Chapter 3 describes how the assembler synthesizes the b l t instruction from two other instructions 8 l t and brie By extending the instruction set the MIPS assembler makes assembly language programming easier without complicating the hardware Many pseudoinstructions could also be simulated with macros but the MIPS assembler can generate better code for these instructions because it can use a dedicated register at and is able to optii mize the generated code Elaboration Assemblers conditionally assemble pieces of code which permits a programmer to include or exclude groups of instructions when a program is assembled This feature is particularly useful when several versions of a program differ by a small amount Rather than keep these programs in separate filesiwhich greatly complicates fixing bugs in the common codeiprogrammers typically merge the versions into a sin gle file Code particular to one version is conditionally assembled so it can be excluded when other versions of the program are assembled lf macros and conditional assembly are useful why do assemblers for Unix systems rarely if ever provide them One reason is that most programmers on these systems write programs in higherrlevel languages like C Most ofthe assembly code is produced by compilers which find it more convenient to repeat code rather than define macros Another reason is that other tools on Unixisuch as cpp the C preprocessor or m4 a general macro processorican provide macros and conditional assembly for assembly language programs Linkers Separate compilation permits a program to be split into pieces that are stored in different les Each le contains a logically related collection of subroutines and data structures that form a module in a larger program A le can be come piled and assembled independently of other les so changes to one module do not require recompiling the entire program As we discussed above sepa7 rate compilation necessitates the additional step of linking to combine object les from separate modules and x their unresolved references A13 Appendix A Assemblers Linkers and the SPIM Simulator The tool that merges these les is the linker see Figure AB It performs three tasks I Searches the program libraries to nd library routines used by the pro gram l Determines the memory locations that code from each module will oci cupy and relocates its instructions by adjusting absolute references I Resolves references among les A linker s rst task is to ensure that a program contains no unde ned labels The linker matches the external symbols and unresolved references from a pro grams les An external symbol in one le resolves a reference from another le if both refer to a label with the same name Unmatched references mean a symbol was used but not de ned anywhere in the program Unresolved references at this stage in the linking process do not necessarily mean a programmer made a mistake The program could have referenced a lie brary routine whose code was not in the object les passed to the linker After matching symbols in the program the linker searches the system s program lie braries to nd prede ned subroutines and data structures that the program Object lile sub Object lile Executable lile Instructions ma l n ma l n Jal 777 Jal printf Jal 777 Jal sub printf call sub I nker Relocation ca I prmtf records sub Clibrary print FIGURE L8 The linker searches a collection of object les and program libraries to nd non local routines used in a program combines them into a single executable le and resolves references between routines in different les AA Loading A19 references The basic libraries contain routines that read and write data alloi cate and deallocate memory and perform numeric operations Other libraries contain routines to access a database or manipulate terminal windows A pro gram that references an unresolved symbol that is not in any library is erronei ous and cannot be linked When the program uses a library routine the linker extracts the routine s code from the library and incorporates it into the pro gram text segment This new routine in turn may depend on other library routines so the linker continues to fetch other library routines until no external references are unresolved or a routine cannot be found If all external references are resolved the linker next determines the memo ry locations that each module will occupy Since the les were assembled in isolation the assembler could not know where a module s instructions or data will be placed relative to other modules When the linker places a module in memory all absolute references must be relocated to re ect its true location Since the linker has relocation information that identi es all relocatable refer ences it can ef ciently nd and backpatch these references The linker produces an executable le that can run on a computer Typically this le has the same format as an object le except that it contains no unrei solved references or relocation information Loading A program that links without an error can be run Before being run the pro gram resides in a le on secondary storage such as a disk On Unix systems the operating system kernel brings a program into memory and starts it run ning To start a program the operating system performs the following steps 1 Reads the executable le s header to determine the size of the text and data segments 2 Creates a new address space for the program This address space is large enough to hold the text and data segments along with a stack seg7 ment see section A5 3 Copies instructions and data from the executable le into the new address space 4 Copies arguments passed to the program onto the stack 5 lnitializes the machine registers In general most registers are cleared but the stack pointer must be assigned the address of the rst free stack location see section A5 AZO Appendix A Assemblers Linkers and the SPIM Simulator 6 Jumps to a startup routine that copies the program s arguments from the stack to registers and calls the program s mat n routine If the ma I n routine returns the startup routine terminates the program with the exit system ca Memory Usage The next few sections elaborate the description of the MIPS architecture pre sented earlier in the book Earlier chapters focused primarily on hardware and its relationship with lowilevel software These sections focus primarily on how assembly language programmers use MIPS hardware These sections describe a set of conventions followed on many MIPS systems For the most part the hardware does not impose these conventions Instead they represent an agreement among programmers to follow the same set ofrules so that soft ware written by different people can work together and make effective use of MIPS hardware Systems based on MIPS processors typically divide memory into three parts see Figure A 9 The rst part near the bottom of the address space starting at address 400000hex is the teXt segment which holds the program s instruc7 tions The second part above the text segment is the data segment which is further divided into two parts Static data starting at address lOOOOOOOheX contains ob jects whose size is known to the compiler and whose lifetimeithe interval during which a program can access themiis the program s entire execution For example in C global variables are statically allocated since they can be ref erenced anytime during a program s execution The linker both assigns static objects to locations in the data segment and resolves references to these objects 7fffffffm Stack segment Data segment 10000000m Text segment 400000 hex FIGURE A9 Layout of memory A5 Memory Usage A21 Immediately above static data is dynamic data This data as its name implies is allocated by the program as it executes In C programs the ma l lOC library routine nds and returns a new block of memory Since a compiler cannot pref dict how much memory a program will allocate the operating system expands the dynamic data area to meet demand As the upward arrow in the gure in dicates ma l l OC expands the dynamic area with the sbrk system call which causes the operating system to add more pages to the program s virtual ad dress space see section 73 in Chapter 7 immediately above the dynamic data segment The third part the program stacksegment resides at the top of the virtual ad dress space starting at address 7fffffffhex Like dynamic data the maximum size ofa program s stack is not known in advance As the program pushes val ues on the stack the operating system expands the stack segment down to wards the data segment This threeipart division of memory is not the only possible one However it has two important characteristics the two dynamically expandable seg7 ments are as far apart as possible and they can grow to use a program s entire address space Because the data segment begins far above the program at Hardware address lOOOOOOOheX load and store instructions cannot software directly reference data objects with their l6ibit offset elds see section 34 in Chapter 3 For example to load the word Interface in the data segment at address lOOOBOOOheX into register v0 requires two instructions lux s0 0x1000 0x1000 means 1000 base 16 or 4096 base 10 lvv v0 0x8000lts0gt 0x10000000 0x8000 0x10008000 The 0X before a number means that it is a hexadecimal value For example 0x8000 is 8000hex or 32768ten To avoid repeating the l u l instruction at every load and store MIPS sys7 tems typically dedicate a register gp as a global pointer to the static data seg7 ment This register contains address 10008000119 so load and store instructions can use their signed l6ibit offset elds to access the rst 64 KB of the static data segment With this global pointer we can rewrite the example as a single instruction lvv v0 0gp Of course a global pointer register makes addressing locations lOOOOOOOhexilOOlOOOO18X faster than other heap locations The MIPS compiler usually stores global variables in this area because these variables have xed locations and t better than other global data such as arrays Appendix A Assemblers Linkers and the SPIM Simulator Procedure Call Convention Conventions governing the use of registers are necessary when procedures in a program are compiled separately To compile a particular procedure a come piler must know which registers it may use and which registers are reserved for other procedures Rules for using registers are called register use or proce dure call conventions As the name implies these rules are for the most part conventions followed by software rather than rules enforced by hardware However most compilers and programmers try very hard to follow these conventions because violating them causes insidious bugs The calling convention described in this section is the one used by the gcc compiler The native MIPS compiler uses a more complex convention that is slightly faster The MIPS CPU contains 32 generalipurpose registers that are numbered 07 31 Register 0 always contains the hardwired value 0 l Registers at l KO 26 and K l 27 are reserved for the assembler and operating system and should not be used by user programs or come pilers Registers a07a3 477 are used to pass the rst four arguments to roui tines remaining arguments are passed on the stack Registers v0 and v l 2 3 are used to return values from functions I Registers t07t9 8715 24 25 are callerisaved registers that are used to hold temporary quantities that need not be preserved across calls see section 36 in Chapter 3 l Registers 804587 16723 are calleeisaved registers that hold long lived values that should be preserved across calls I Register gp 28 is a global pointer that points to the middle of a 64K block of memory in the static data segment Register Sp 29 is the stack pointer which points to the rst free loca7 tion on the stack Register fp 30 is the frame pointer TheJ a i instruc7 tion writes register ra 31 the return address from a procedure call These two registers are explained in the next section The twoiletter abbreviations and names for these registersifor example Sp for the stack pointerireflect the registers intended uses in the procedure call convention In describing this convention we will use the names instead of register numbers The table in Figure AlO lists the registers and describes their intended uses A6 Procedure Call Convention AZ3 constant across across across across across across return FIGURE A10 MIPS registers and usage convention Procedure Calls This section describes the steps that occur When one procedure the caller invokes another procedure the 631199 Programmers Who write in a highilevel language like C or Pascal never see the details of how one procedure calls another because the compiler takes care of this lowilevel bookkeeping How ever assembly language programmers must explicitly implement every pro cedure call and return Appendix A Assemblers Linkers and the SPIM Simulator Most of the bookkeeping associated with a call is centered around a block of memory called a procedure call frame This memory is used for a variety of purposes I To hold values passed to a procedure as arguments I To save registers that a procedure may modify but which the proce dure s caller does not want changed I To provide space for variables local to a procedure In most programming languages procedure calls and returns follow a strict lastein rsteout LIFO order so this memory can be allocated and deallocated on a stack which is why these blocks of memory are sometimes called stack frames Figure All shows a typical stack frame The frame consists of the memory between the frame pointer fp which points to the rst word of the frame and the stack pointer lisp which points to the last word the frame The stack grows down from higher memory addresses so the frame pointer points above the stack pointer The executing procedure uses the frame pointer to quickly access values in its stack frame For example an argument in the stack frame can be loaded into register v0 with the instruction Ivv v0 Ofp A stack frame may be built in many different ways however the caller and callee must agree on the sequence ofsteps The steps below describe the calling convention used on most MIPS machines This convention comes into play at three points during a procedure call immediately before the caller invokes the calleejust as the callee starts executing and immediately before the callee ref turns to the caller In the rst part the caller puts the procedure call arguments in standard places and invokes the callee to do the following 1 Pass arguments By convention the rst four arguments are passed in registers 55510755213 Any remaining arguments are pushed on the stack and appear at the beginning of the called procedure s stack frame N Save calleresaved registers The called procedure can use these registers 55210755513 and t07t9 without rst saving their value If the caller expects to use one of these registers after a call it must save its value before the call 9 Execute aJ aI instruction see section 36 of Chapter 3 whichjumps to the callee s rst instruction and saves the return address in register lira Before a called routine starts running it must take the following steps to set up its stack frame 1 Allocate memory for the frame by subtracting the frames size from the stack pointer A a Procedure Call Convention AZS Higher memory addresses fp gt Saved registers Stack grows Local variables 1 Sp gt Lower memow addresses FIGURE A11 Layout of a stack frame The frame pointer fp points to the rst word in the currently executing procedure s stack frame The stack pointer sp points to the last word of frame The rst four arguments are passed in registers so the fth argument is the rst one stored on the stack 2 Save calleeisaved registers in the frame A callee must save the values in these registers 804587 lifp and lira before altering them since the caller expects to nd these registers unchanged after the call Register fp is saved by every procedure that allocates a new stack frame How ever register lira only needs to be saved if the callee itself makes a call The other calleeisaved registers that are used also must be save 9 Establish the frame pointer by adding the stack frame s size minus four to lisp and storing the sum in register lifp The MIPS register use convention provides calleei and Hardware callerisaved registers because both types of registers are software advantageous in different circumstances Calleeisaved reg isters are better used to hold longilived values such as varii Interface ables from a user s program These registers are only saved during a procedure call if the callee expects to use the regisi ter On the other hand callerisaved registers are better used to hold shortilived quantities that do not persist across a call such as immedi7 ate values in an address calculation During a call the callee can also use these registers for shortilived temporaries AZG Appendix A Assemblers Linkers and the SPIM Simulator Finally the callee returns to the caller by executing the following steps 1 If the callee is a function that returns a value place the returned value in register v0 2 Restore all calleeisaved registers that were saved upon procedure entry 3 Pop the stack frame by subtracting the frame size from lisp 4 Return byjumping to the address in register lira Elaboration A programming language that does not permit recursive proceduresi procedures that call themselves either directly or indirectly through a chain of callsi need not allocate frames on a stack In a nonrecursive language each procedure s frame may be statically allocated since only one invocation of a procedure can be active at a time Older versions of Fortran prohibited recursion because statically allocated frames produced faster code on some older machines However on loadrstore architecr tures like MIPS stack frames may bejust as fast because a frame pointer register points directlyto the active stack frame which permits a single load or store instruction to access values in the frame In addition recursion is a valuable programming tech nique Procedure Call Example As an example consider the C routine main 0 printf The factorial of lo is dn fact 00 int fact int n i if n lt l return i else return n fact n 7 U l which computes and prints 10 the factorial of 10 10 10 X 9 X X l fact is a recursive routine that computes n by multiplying n times n 7 l The assembly code for this routine illustrates how programs manipulate stack frames Upon entry the routine mal l l creates its stack frame and saves the two calleeisaved registers it will modify fp and lira The frame is larger than re quired for these two registers because the calling convention requires the min imum size ofa stack frame to be 24 bytes This minimum frame can hold four A6 Procedure Call Convention AZ7 argument registers a07a3 and the return address ra padded to a double word boundary 24 bytes Since ma i n also needs to save fp its stack frame must be two words larger remember the stack pointer is kept doubleword aligned text globl main main subu spsp32 Stack frame is 32 bytes long svv ra20sp Save return address svv fpl6sp Save old frame pOinter addu fpsp28 Set up frame pOinter The routine ma i n then calls the factorial routine and passes it the single argu ment 10 After fact returns ma i n calls the library routine pr i ntf and passes it both a format string and the result returned from fact li a0l0 Rut argument l0 in aO Jal fact Call factorial function la a0LC Rut format string in aO move Sal v0 Move fact result to Sal Jal printf Call tne print function Finally after printing the factorial ma i n returns But rst it must restore the registers it saved and pop its stack frame lvv ra20sp Restore return address lvv fpl6sp Restore frame pOinter addu spsp32 Pop stack frame Jl ra Return to caller rdata LC ascii quotTne factorial of lo is dnOOO The factorial routine is similar in structure to main First it creates a stack frame and saves the calleeisaved registers it Will use In addition to saving ra and fp fact also saves its argument a0 Which it Will use for the recursive call text fact subu spsp32 Stack frame is 32 bytes long svv ra20sp Save return address svv fpl6sp Save frame pOinter addu fpsp28 Set up frame pOinter svv a00fp Save argument n Appendix A Assemblers Linkers and the SPIM Simulator The heart of the fact routine performs the computation from the C pro gram lt tests if the argument is greater than 0 If not the routine returns the value 1 If the argument is greater than 0 the routine recursively calls itself to compute fact ni l ancl multiplies that value times 1 lvv v00fp bgtz v0L2 l l v0l J L l L2 lvv vl Ofp subu v0vll move a0v0 J al fact lvv vl Ofp mul vO vO v l Load n Brancn lf n gt 0 Return l Jump to code to return Load n Compute n 7 l Move value to aO Call factorlal functlon Load n Compute factnil n Finally the factorial routine restores the calleeisaved registers and returns the value in register vO Ll lvv ra 20sp lvv fp l6ltspgt addu sp sp 32 J ra Result ls ln vO Restore ra Restore fp Pop stack Return to caller Slack in Recursive Procedure Figure AlZ shows the stack at the call fact7 ma l n runs first so its frame is deepest on the stack ma l n calls fact l O Whose stack frame is next on the stack Each invocation recursively invokes f act to compute the nextilowest factorial The stack frames parallel the LIFO order of these calls What does the stack look like When the call to fact l O returns A6 Procedure Call Convention AZ9 Stack Old ra Old fp main Old ra Old fp fact 10 Old a0 Old ra Old fp fact 9 Old a0 Old ra Old fp fact 8 Old a0 Old ra Stack grows Old fp fact 7 Old a0 FIGURE A1Z Stack frames during the call of fact 7 Stack Old ra Old fp main Stack grows Elaboration The difference between the MIPS compiler and the gcc compiler is that the MIPS compiler usually does not use a frame pointer so this register is available as another calleersaved register 58 This change saves a couple of instructions in the procedure call and return sequence However it complicates code generation because a procedure must access its stack frame with sp whose value can change during a procedure s execution if values are pushed on the stack Another Procedure Call Example As another example consider the following routine that computes the tak function Which is a Widely used benchmark created by lkuo Takeuchi This function does not compute anything useful but is a heavily recursive pro gram that illustrates the MIPS calling convention Appendix A Assemblers Linkers and the SPIM Simulator mt tak mt x mt y mt Z if y lt x return 1 tak tak x 7 l y Z tak y 7 i Z X tak z 71gtltgt eiSe return Z mt mam tak18126 The assembly code for this program is below The tak function rst saves its return address in its stack frame and its arguments in calleeisaved registers since the routine may make calls that need to use registers 5521045512 and lira The function uses calleeisaved registers since they hold values that persist over the lifetime of the function Which includes several calls that could potentially modify registers text giobi tak tak Subu Sp Sp 40 SW ra 32Sp SW SO l6lt8p x move SO 210 SW Sl 20Sp y move Sl 511 SW S2 24Sp Z move S2 212 SW S3 28Sp temporary The routine then begins execution by testing ify lt x If not it branches to label Ll Which is below bge S i SO Ll if y lt x lfy lt x then it executes the body of the routine Which contains four recursive calls The rst call uses almost the same arguments as its parent addu a0 SO 4 move 511 Sl move 212 S2 Jai tak takwil y Z move S3 vO A6 Procedure Call Convention A31 Note that the result from the rst recursive call is saved in register 83 so that it can be used latter The function noW prepares arguments for the second recursive call addu a0 Sl 71 move a l 82 move a2 80 Jal tak tak 371 Z x In the instructions below the result from this recursive call is saved in register 80 But first we need to read for the last time the saved value of the first argument from this register addu a0 82 71 move a l 80 move a2 8 l move 80 v0 Jal tak tak 271 x 3 After the three inner recursive calls we are ready for the nal recursive call After the call the function s result is in v0 and control jumps to the func7 tion s epilogue move a0 83 move a l 80 move a2 v0 Jal tak tak tak taklt taklt J L2 This code at label U is the consequent of the i thenielse statement It just moves the value of argument Z into the return register and falls into the func7 tion epilogue L l move v0 82 The code beloW is the function epilogue Which restores the saved registers and returns the function s result to its caller L2 lvv ra 328p lvv 80 l6lt8pgt lvv Sl 208p lvv 82 248p lvv 83 288p addu 8p 8p 40 J ra Appendix A Assemblers Linkers and the SPIM Simulator The ma 1 n routine calls the tak function with its initial arguments then takes the computed result 7 and prints it using SPIM s system call for printing integers gIobI main man subu sp sp 24 SW ra 16sp 1 1 a0 18 1 1 a1 12 1 1 a2 6 JaI tak tak18 12 6 move a0 vO 11 v0 1 p ti t syscaH syscaI 1 1w ra 16sp addu sp sp 24 J ra Exceptions and Interrupts Section 56 of Chapter 5 describes the MIPS exception facility which responds both to exceptions caused by errors during an instruction s execution and to external interrupts caused by IO devices This section describes exception and interrupt handling in more detail In MIPS processors a part of the CPU called coprocessor 0 records the information the software needs to handle exceptions and interrupts The MIPS simulator SPIM does not implement all of coprocessor 0 s registers since many are not useful in a simulator or are part of the memory system which SPIM does not implement However SPIM does provide the following coprocessor 0 registers Register Register name number BadVAddr 8 Iegl tel Status 12 Interrupt mask and enable blts Cause 13 exceptlon type and pendlng Interrupt blts EPC 14 reglster contalnlng address of Instructlon that caused exception These four registers are part of coprocessor 0 s register set and are accessed by the IWCO mch mtcO and SWCO instructions After an exception register EPC A1 Exceptions and Interrupts A33 contains the address of the instruction that was executing when the exception occurred If the instruction made a memory access that caused the exception register BadVAddr contains the referenced memory location s address The two other registers contain many elds and are described below Figure A13 shows the Status register elds implemented by the MIPS simi ulator SPIM The l nterrupt mask eld contains a bit for each ofthe ve hard ware and three software possible interrupt levels A bit that is 1 allows interrupts at that level A bit that is 0 disables interrupts at that level The low 6 bits of the Status register implement a threeideep stack for the Kerne l user and interrupt enable bits The Kerne l user bit is 0 ifaprogram was in the kernel when an exception occurred and 1 if it was running in user mode If the interrupt enable bit is l interrupts are allowed lfit is 0 they are disabled When an interrupt occurs these 6 bits are shifted left by 2 bits so the current bits become the previous bits and the previous bits become the old bits the old bits are discarded The current bits are both set to 0 so the interrupt handler runs in the kernel with interrupts disabled Figure Al4 shows the Cause register elds implemented by SPIM The ve pend l ng l nterrupt bits correspond to the five interrupt levels Abit becomes 1 when an interrupt at its level has occurred but has not been serviced The Ex ception code register describes the cause of an exception with the following codes error error error on error on or Store 15 8 5 4 3 2 1 O lllllll Interrupt maSk Old Previous Current V c b owgv FIGURE L13 The Status register Appendix A Assemblers Linkers and the SPIM Simulator Exception code Pending interrupts FIGURE L14 The Cause register In actual MIPS processors this register contains additional elds that report whether the instruction that caused the exception executed in a branch s delay slot which coprocessor caused the exception or that a software interrupt is pending Exceptions and interrupts cause a MIPS processor tojump to a piece ofcode at address 8000008011 in the kernel not user address space called an inter mpt handlen This code examines the exceptions cause andjumps to an approi priate point in the operating system The operating system responds to an exception either by terminating the process that caused the exception or by performing some action A process that causes an error such as executing an unimplemented instruction is killed by the operating system On the other hand exceptions such as page faults are requests from a process to the operate ing system to perform a service such as bringing in a page from disk The op erating system processes these requests and resumes the process The nal type of exceptions are interrupts from external devices These generally cause the operating system to move data to or from an lO device and resume the interrupted process The code in the example below is a simple interrupt hani dler which invokes a routine to print a message at each exception but not in terrupts This code is similar to the interrupt handler used by the SPIM simulator except that it does not print an error message to report an exception Interrupt Handler The interrupt handler first saves registers 510 and a l which it later uses to pass arguments The interrupt handler cannot store the old values from these registers on the stack as would an ordinary routine because the cause of the interrupt might have been a memory reference that used a bad value such as 0 in the stack pointer Instead the interrupt handler stores these registers in two memory locations saveO and save l If the inter rupt routine itself could be interrupted two locations would not be enough since the second interrupt would overwrite values saved during the rst interrupt However this simple interrupt handler nishes running before it enables interrupts so the problem does not arise A1 Exceptions and Interrupts A35 Ktext 0x80000080 svv a0 saveO Handler is not reeentrant and can t use svv a i savei stack to save a0 ai Don t need to save KOKi The interrupt handler then moves the Cause and EPC registers into CPU registers The Cause and EPC registers are not part of the CPU regisi ter set Instead they are registers in coprocessor 0 which is the part of the CPU that handles interrupts The instruction thO ilt0 1 3 moves coproi cessor 0 s register 13 the Cause register into CPU register KO Note that the interrupt handler need not save registers ilt0 and K l because user programs are not supposed to use these registers The interrupt handler uses the value from the Cause register to test if the exception was caused by an interrupt see the preceding table If so the exception is ignored If the exception was not an interrupt the handler calls pr i ntiexcp to print a warning message mtcO KO 13 Move Cause into KO mtcO K i 14 Move EPC into 55 sgt v0 KO 0x44 ignore interrupts bgtz v0 done mov a0 KO Move Cause into aO mov a i 55H Move EPC into ai Jai printiexcp Print exception error message Before returning the interrupt handler restores registers a0 and a l It then executes the rte return from exception instruction which restores the previous interrupt mask and kernel user bits in the Status register This switches the processor state back to what it was before the exception and prepares to resume program execution The interrupt handler then re turns to the program byjumping to the instruction following the one that caused the exception done i W a0 saveO i W ai savei add i u K i K i 4 Do not reexecute tau it i ng instruction rte Restore interrupt state J r 55 Kdata saveO word 0 savei word 0 A36 Appendix A Assemblers Linkers and the SPIM Simulator Elaboration On real MIPS processors the return from an interrupt handler is more complex The rfe instruction must execute in the delay slot of they r instruction see elaboration on page 444 of Chapter 6 that returns to the user program so that no interr ruptrhandler instruction executes with the user program s interrupt mask and kerr neIuser bits In addition the interrupt handler cannot alwaysjump to the instruction following EPC For example ifthe instruction that caused the exception was in a branch instruction s delay slot see Chapter 6 the next instruction may not be the following instruction in memory Input and Output SPIM simulates one lO device a memoryimapped terminal When a pro gram is running SPIM connects its own terminal or a separate console win dow in the Xiwindow version xsp r m to the processor A MIPS program running on SPIM can read the characters that you type In addition if the MIPS program writes characters to the terminal they appear on SPIM s termi7 nal or console window One exception to this rule is controliC this character is not passed to the program but instead causes SPIM to stop and return to command mode When the program stops running for example because you typed controliC or because the program hit a breakpoint the termi7 nal is reconnected to sp r m so you can type SPIM commands To use memoryimapped lO see below sp r m or xsp r m must be started with the imappedir 0 ag The terminal device consists of two independent units a receiver and a transmitten The receiver reads characters from the keyboard The transmitter writes characters to the display The two units are completely independent This means for example that characters typed at the keyboard are not auto matically echoed on the display Instead a program must explicitly echo a character by reading it from the receiver and writing it to the transmitter A program controls the terminal with four memoryimapped device regisi ters as shown in Figure A15 quotMemoryimappedquot means that each register appears as a special memory location The Receiver Control registeris at location fffiUOOOheX Only two of its bits are actually used Bit 0 is called ready if it is 1 it means that a character has arrived from the keyboard but has not yet been read from the Receiver Data register The ready bit is readionly writes to it are ignored The ready bit changes from 0 to 1 when a character is typed at the key board and it changes from 1 to 0 when the character is read from the Receiver Data register Bit 1 of the Receiver Control register is the keyboard interrupt enable This bit may be both read and written by a program The interrupt enable is initially 0 If it is set to l by a program the terminal requests an interrupt at level 0 whenever the ready bit is 1 However for the interrupt to affect the processor A8 Input and Output A37 Unused 1 1 Receiver control OxffffOOOO Interrupt Ready nable Unused 8 Receiver data Oxffff0004 Received byte Unused 1 1 Transmitter control Oxffffooos Interrupt Ready nable Unused 8 Transmitter data K 1 OxffffOOOc Transmitted byte FIGURE A15 The terminal is controlled by four device registers each of which appears as a memory location at the given address Only a few bits of these registers are actually used The others always read as US and are ignored on writes interrupts must also be enabled in the Status register see section A7 All oth er bits of the Receiver Control register are unuse The second terminal device register is the Receiver Data register at address foOO4hex The loweorder 8 bits of this register contain the last character typed at the keyboard All other bits contain Us This register is readeonly and changes onl when a new character is typed at the keyboard Reading the Receiver Data register resets the ready bit in the Receiver Control register to 0 e third terminal device register is the Transmitter Control register at ad dress fffoOOBheX Only the loweorder 2 bits of this register are used They be have much like the corresponding bits of the Receiver Control register Bit 0 is called ready and is readeonly If this bit is l the transmitter is ready to accept a new character for output If it is 0 the transmitter is still busy writing the pre vious character Bit 1 is interrupt enable and is readable and writable If this bit is set to 1 then the terminal requests an interrupt on level 1 whenever the ready bit is l The nal device register is the Transmitter Data register at address foOOcheX When a value is written into this location its loweorder 8 bits ie A33 Appendix A Assemblers Linkers and the SPIM Simulator an ASCII character as in Figure 315 in Chapter 3 are sent to the console When the Transmitter Data register is written the ready bit in the Transmitter Control register is reset to 0 This bit stays 0 until enough time has elapsed to transmit the character to the terminal then the ready bit becomes 1 again The Trans mitter Data register should only be written when the ready bit of the Transmit ter Control register is 1 If the transmitter is not ready writes to the Transmitter Data register are ignored the write appears to succeed but the character is not output Real computers require time to send characters over the serial lines that con nect terminals to computers These time lags are simulated by SPIM For exam ple after the transmitter starts to write a character the transmitter s ready bit becomes 0 for a while SPIM measures time in instructions executed not in real clock time This means that the transmitter does not become ready again until the processor executes a certain number of instructions If you stop the ma chine and look at the ready bit it will not change However if you let the ma chine run the bit eventually changes back to l SPIM SPIM is a software simulator that runs programs written for MIPS R2000R3000 processors SPIM s name isjust MIPS spelled backwards SPIM can read and immediately execute assembly language les or on some sys7 tems MIPS executable les SPIM is a selficontained system for running MIPS programs It contains a debugger and provides a few operating systemilike services SPIM is much slower than a real computer 100 or more times However its low cost and wide availability cannot be matched by real hard ware An obvious question is Why use a simulator when many people have workstations that contain MIPS chips that are signi cantly faster than SPIM One reason is that these workstations are not universally available Another reason is rapid progress toward new and faster computers may render these machines obsolete see Chapter I The current trend is to make computers faster by executing several instructions concurrently This trend makes archii tectures more dif cult to understand and program The MIPS architecture may be the epitome ofa simple clean RISC machine In addition simulators can provide a better environment for programming than an actual machine because they can detect more errors and provide more features than an actual computer For example SPIM has an Xiwindow inter face that works better than most debuggers on the actual machines Finally simulators are a useful tool in studying computers and the pro grams that run on them Because they are implemented in software not silicon simulators can be easily modi ed to add new instructions build new systems such as multiprocessors or simply to collect data A9 SPIM A39 Simulation of a Virtual Machine The MIPS architecture like that of many RISC computers is dif cult to pro gram directly because of delayed branches delayed loads and restricted address modes This dif culty is tolerable since these computers were designed to be programmed in highelevel languages and present an interface appropriate for compilers rather than assembly language programmers A good part of the programming complexity results from delayed instructions A delayed branch requires two cycles to execute see elaborations on pages 444 and 502 of Chapter 6 In the second cycle the instruction immediately fol lowing the branch executes This instruction can perform useful work that normally would have been done before the branch It can also be a mop no operation Similarly delayed loads require two cycles so the instruction imme diately following a load cannot use the value loaded from memory see section 62 of Chapter 6 MIPS wisely chose to hide this complexity by having its assembler imple ment a Virtual machine This virtual computer appears to have nondelayed branches and loads and a richer instruction set than the actual hardware The assembler reorganizes rearranges instructions to ll the delay slots The virtual computer also provides pseudoinstructions which appear as real instructions in assembly language programs The hardware however knows nothing about pseudoinstructions so the assembler must translate them into equivalent se quences of actual machine instructions For example the MIPS hardware only provides instructions to branch when a register is equal to or not equal to 0 Other conditional branches such as when one register is greater than another are synthesized by comparing the two registers and branching when the result of the comparison is true nonzero By default SPIM simulates the richer virtual machine However it can also simulate the bare hardware Below we describe the virtual machine and only mention in passing features that do not belong to the actual hardware In doing so we follow the convention of MIPS assembly language programmers and compilers who routinely use the extended machine For a description of the real machines see Gerry Kane and Joe Heinrich MIPS RISC Architecture Prene tice Hall Englewood Cliff N 1992 Getting Started with SPIM The rest of this appendix contains a complete and rather detailed description of SPIM Many details should never concern you however the sheer volume of information can obscure the fact that SPIM is a simple easyetoeuse pro gram This section contains a quick tutorial on SPIM that should enable you to load debug and run simple MIPS programs Appendix A Assemblers Linkers and the SPIM Simulator 1 SPIM comes in multiple versions One version called Sp m is a command lineidriven program and requires only an alphanumeric terminal to display it It operates like most programs of this type you type a line of text hit the re 7 turn key and apt to executes your command A fancier version called xsp m runs in the Xiwindows environment of the Unix system and therefore requires a bitimapped display to run it xsp m however is a much easier program to learn and use because its commands are always visible on the screen and because it continually displays the machine s registers Anotherversion PCSpi m is compatible with Windows 31 Windows 95 and Windows NT The Unix Windows and DOS versions of SPIM are available through WWkapCOmCod26htm Since many people use and prefer xspi m this section only discusses that program If you plan to use any version ofsp m do not skip this section Read it rst and then look at the quotSPIM CommandiLine Optionsquot section starting on page A744 to see how to accomplish the same thing with Sp m commands Check WWkapCOmCod26htm for more information on using PCsp m To start xsp m type xsp m in response to your system s prompt XSpi m On your system xsp to may be kept in an unusual place and you may need to execute a command rst to add that place to your search path Your instruci tor should tell you how to do this When xsp m starts up it pops up a large window on your screen see Figure Al6 The window is divided into ve panes l The top pane is called the register display It shows the values of all reg isters in the MIPS CPU and FPU This display is updated whenever your program stops running I The pane below contains the control buttons to operate xsp m These but tons are discussed below so we can skip the details for now I The next pane called the teXt segments displays instructions both from your program and the system code that is loaded automatically when xsp m starts running Each instruction is displayed on a line that looks like 0x00400000 OX8fa4000O iW 4 O29 89 TW 510 OSp The rst number on the line in square brackets is the hexadecimal memory address of the instruction The second number is the instruci tion s numerical encoding again displayed as a hexadecimal number The third item is the instructions mnemonic description Everything following the semicolon is the actual line from your assembly le that produced the instruction The number 89 is the line number in that le Sometimes nothing is on the line after the semicolon This means that the instruction was produced by SPIM as part of translating a pseudo instruction A9 SPIM A41 xsplm C 00000000 EPC 00000000 Cause 00000000 BadVaddr 00000000 StaEUS 00000000 H 00000000 LO 7 00000000 General registers R0 r0 00000000 R8 to 0000000 R76 50 00000000 R24 t8 00000000 R4 at OOOOOOOO R9 t4 OOOOOOOO R47 OOOOOOOO R25 S9 OOOOOOOO R2 VO OOOOOOOO RWO t2 OOOOOOOO R48 OOOOOOOO R26 KO OOOOOOOO R3 V4 OOOOOOO RM t3 OOOOOOOO R49 OOOOOOOO R27 KW OOOOOOOO Re ister R4 aO OOOOOOOO R42 t4 OOOOOOOO R2O OOOOOOOO R28 gp OOOOOOOO 9 R5 a4 OOOOOOOO R43 t5 OOOOOOOO R24 OOOOOOOO R29 Sp OOOOOOOO dISPIay R6 a2 OOOOOOOO R44 6 OOOOOOOO R22 OOOOOOOO R3O S8 OOOOOOOO R7 a3 00000000 R75 t7 7 00000000 R23 57 00000000 R37 ra 00000000 Double oatingrpoint registers FRO 0 000000 R8 000 O FRWG 0 000000 FR24 0 000000 FR2 0 000000 FRWO 0 OOO FRWS 0 000000 FR26 0 000000 FR4 0 000000 FRW2 0 000000 FR2O 0000 O FR28 0 000000 FRG 0 000000 FRW4 0 000000 FR22 0 000000 FR3O 0 000000 Control buttons 0x8fa40000 w 4 0 89 w aO OSp Ox 4 4 Ox27aSOOO4 addru 5 29 4 9O addru a Sp 4 Ox 4 8 Ox24aGOOO4 addru 6 9W addru a2 a 4 T9 LC OXOOO4WOSO SH 2 3 2 92 SH VO aO segments Z OXOOC2302W addu 6 6 2 93 addu a2 a2 VO Ox 4 4 OxOcOOOOOO Ja OXOOOOOOOO mam 94 Ja m n Ox 4 8 Ox34OZOOOa 0m WO 9 H O 40 C OxOOOOOOOc sysca 96 sysca Data segments OXWOOWOOOO OXOOOOOOOO OX74706563 Ox206e6f69 OX636f2000 OX72727563 OX64206465 0x6920646e Ox726f6e67 Data and OxOOOa6465 OX495D2020 OX7265746e 0x74707572 stack OXOOOO2OSd OX202000OO 0X64 6e555b Ox6e676960 segments 0x67 206465 0x65726464 0x69 207373 OX6e69 206e 0X64 2f7473 OX2OGW 7464 OX63746566 OXOO2OSd68 OX555D202 OX696064 6e 0x64656e67 OX646464 2O OX73736572 Ox206e6920 OX726f7473 OXOO2OSd65 SP M Versron 5 9 of January 47 4997 Copyrrgnt c 499LW997 by James R Larus aruscs wrsc edu SPIM A R nts Reserved messages See the W e README for a t copyrrgnt notrce FIGURE L16 SPIM39s Xwindow interface xsp i m Appendix A Assemblers Linkers and the SPIM Simulator l The next pane called the data and stack segments displays the data load ed into your program s memory and the data on the program s stack l The bottom pane is the SPIM messages that xsp i m uses to write mes sages This is where error messages appear Let s see how to load and run a program The rst thing to do is to click on the oad button the second one in the rst row ofbuttons with the left mouse key Your click tells xsp m to pop up a small prompt window that contains a box and two or three buttons Move your mouse so the cursor is over the box and type the name of your le of assembly code Then click on the button labeled assemb y 6 within that prompt window Ifyou change your mind click on the button labeled abort command and XSpi m gets rid of the prompt window When you click on assembly l e xsp i m gets rid ofthe prompt win dow then loads your program and redraws the screen to display its instruce tions and data Now move the mouse to put the cursor over the scrollbar to the left of the text segments and click the left mouse button on the white part of this scrollbar A click scrolls the text pane down so you can nd all the instruce tions in your program To run your program click on the rum button in xsp m s control button pane It pops up a prompt window with two boxes and two buttons Most of the time these boxes contain the correct values to run your program so you can ignore them andjust click on OK This button tells xsp m to run your pro gram Notice that when your program is running xsp m blanks out the regise ter display pane because the registers are continually changing You can always tell whether xsp m is running by looking at this pane If you want to stop your program make sure the mouse cursor is somewhere over XSpi m s window and type controleC This causes XSpi m to pop up a prompt window with two buttons Before doing anything with this prompt window you can look at registers and memory to nd outwhat your program was doing When you understand what happened you can either continue the program by click ing on comt i we or stop your program by clicking on abort command If your program reads or writes from the terminal xsp m pops up another window called the console All characters that your program writes appear on the console and everything that you type as input to your program should be typed in this window Suppose your program does not do what you expect What can you do SPIM has two features that help debug your program The rst and perhaps the most useful is singleestepping which allows you to run your program an instruction at a time Click on the button labeled Step and another prompt window pops up This prompt window contains two boxes and three buttons The rst box asks for the number of instructions to step every time you click the mouse Most of the time the default value of l is a good choice The other box asks for arguments to pass to the program when it starts running Again A9 SPIM A43 most of the time you can ignore this box because it contains an appropriate value The button labeled Step runs your program for the number of instruc7 tions in the top box If that number is l xsp l m executes the next instruction in your program updates the display and returns control to you The button la beled comt l mue stops singleistepping and continues running your program Finally abort command stops singleistepping and leaves your program stopped What do you do if your program runs for a long time before the bug arises You could singleistep until you get to the bug but that can take a long time and it is easy to get so bored and inattentive that you step past the problem A better alternative is to use a breakpoint which tells xsp l m to stop your program immediately before it executes a particular instruction Click on the button in the second row of buttons marked breakpo l mts The xsp l m program pops up a prompt window with one box and many buttons Type in this box the ad dress of the instruction at which you want to stop Or if the instruction has a global label you canjust type the name of the label Labeled breakpoints are a particularly convenient way to stop at the rst instruction of a procedure To actually set the breakpoint click on add You can then run your program When SPIM is about to execute the breakpointed instruction xsp l m pops up a prompt with the instructions address and two buttons The CO tl mue but ton continues running your program and abort command stops your program If you want to delete a breakpoint type in its address and click on del ete Finally l l at tells xsp l m to print in the bottom pane a list of all breakpoints that are set Singleistepping and setting breakpoints will probably help you nd a bug in your program quickly How do you x it Go back to the editor that you used to create your program and change it To run the program again you need a fresh copy of SPIM which you get in two ways Eitheryou can exit from xsp l m by clicking on the qu lt button or you can clear XSpl m and reload your program lfyou reload your program you mustclear the memory so remnants ofyour previous program do not interfere with your new program To do this click on the button labeled cl ear Hold the left mouse key down and a two item menu will pop up Move the mouse so the cursor is over the item labeled memory amp reg l sters and release the key This causes xsp l m to clear its meme ory and registers and return the processor to the state it was in when xsp l m rst started You can now load and run your new program The other buttons in xsp l m perform functions that are occasionally useful When you are more comfortable with xsp l m you should look at the descrip7 tion below to see what they do and how they can save you time and effort Appendix A Assemblers Linkers and the SPIM Simulator SPIM CommandLine Options Both Unix versions of SPlMiSp l m the terminal version and xsp l m the X versioniaccept the following commandiline options abare 788m ipseudo inopseudo inotrap itrap inquet eqUet anomapped imappedii o i ie iexecute O Simulate a bare MIPS machine Without pseudoinstructions or the additional addressing modes provided by the assembler lmplies iqu T et Simulate the virtual MIPS machine provided by the assembler This is the default Allow the input assembly code to contain pseudoinstruci tions This is the default Do not allow pseudoinstructions in the input assembly code Do not load the standard exception handler and startup code This exception handler handles exceptions When an exception occurs SPIM jumps to location BOOOOOBOW Which must contain code to service the exception In addi7 tion this le contains startiup code that invokes the roui tine mai m Without the startup routine SPIM begins execution at the instruction labeled fistart Load the standard exception handler and startup code This is the default Print a message When an exception occurs This is the default Do not print a message at exceptions Disable the memoryimapped lO facility see section All This is the default Enable the memoryimapped lO facility see section All Programs that use SPIM sysca i i 8 see section on quotSystem Callsquot page A748 to read from the terminal cannot also use memoryimapped lO Load and execute the assembly code in the le Load and execute the code in the MIPS executable le aout This command is only available When SPIM runs on a system containing a MIPS processor A9 SPIM 8 ltSeggt Sl Z6 fl ltSeggt Sl ze Sets the initial size of memory segment seg to be size bytes The memory segments are named text data Stack Ktext and Kd ata The text segment contains instruc7 tions from a program The data segment holds the pro grams data The Stack segment holds its runtime stack In addition to running a program SPIM also executes sys7 tem code that handles interrupts and exceptions This code resides in a separate part of the address space called the kernel The Ktext segment holds this code s instructions and Kdata holds its data There is no KStacK segment since the system code uses the same stack as the program For example the pair of arguments 7Sdata 2000000 starts the user data segment at 2000000 bytes Sets the limit on how large memory segment seg can grow to be size bytes The memory segments that can grow are data Stack and Kdata Terminal Interface Sp i m The simpler Unix version of SPIM is called Sp l m It does not require a bite mapped display and can be run from any terminal Although Sp l m may be more dif cult to learn it operatesjust like xSp l m and provides the same funci tionality The Sp l m terminal interface provides the following commands eXlt read quotfilequot load quotfilequot execute quota out run ltaddrgt Exit the simulator Read le of assembly language into SPIM If the le has already been read into SPIM the system must be cleared see re l m l tl al l ze below or global labels will be multii ply de ned Synonym for read Read the MIPS executable le aout into SPIM This command is only available when SPIM runs on a system containing a MIPS processor Start running a program If the optional address addr is provided the program starts at that address Otherwise the program starts at the global label iiStart which is usually the default startiup code that calls the routine at the global label ma l n Appendix A Assemblers Linkers and the SPIM Simulator step ltNgt continue print N print t N print addr printisym Step the program for N default 1 instructions Print instructions as they execute Continue program execution without stepping Print register N Print oating point register N Print the contents of memory at address addn Print the names and addresses of the global labels known to SPIM Labels are local by default and become global only when declared in a g i Obi assembler directive see quotAssember Syntaxquot section on page A751 re l n l t l a i l ze Clear the memory and registers breakpo l mt addr Set a breakpoint at address addr addr can be either a deiete addr Mst ltnigt 7 memory address or symbolic label Delete all breakpoints at address addn List all breakpoints Rest of line is an assembly instruction that is stored in memory A newline reexecutes previous command Print a help message Most commands can be abbreviated to their unique pre x eg ex re i m S p More dangerous commands such as re m l t l ai l ze require a longer pre x XWindow Interface xsp i m The tutorial Getting Started with SPIM page A739 explains the most common XSpl m commands However xsp l W has other commands that are occasionally useful This section provides a complete list of the commands The X version of SPIM xsp l m looks different but operates in the same man ner as Sp l m The Xiwindow has ve panes see Figure Al6 The top pane dis plays the registers These values are continually updated except while a program is running The next pane contains buttons that control the simulator quxt oad Exit from the simulator Read a source or executable le into SPIM A47 A9 SPIM rum Start the program running Step Singleistep a program c l ear Reinitialize registers or memory set va l ue Set the value in a register or memory location pm mt Print the value in a register or memory location breakpow mt Set or delete a breakpoint or list all breakpoints hel p Print a help message terml rial Raise or hide the console window mode Set SPlM operating modes The next two panes display the memory The top one shows instructions from the user and kernel text segments These instructions are realinot pseudoiMlPS instructions SPlM translates assembler pseudoinstructions into one to three MIPS instructions Each source instruction appears as a come ment on the rst instruction into which it is translated The rst few instruc7 tions in the text segment are the default startiup code fistart that loads argc and argv into registers and invokes the ma l m routine The lower of these two panes displays the data and stack segments Both panes are updated as a program executes The bottom pane is used to display SPlM messages It does not display out put from a program When a program reads or writes its lO appears in a sep7 arate window called the console which pops up when needed Surprising Features Although SPlM faithfully simulates the MIPS computer SPlM is a simulator and certain things are not identical to an actual computer The most obvious differences are that instruction timing and the memory systems are not identi7 cal SPlM does not simulate caches or memory latency nor does it accurately re ect oatingipoint operation or multiply and divide instruction delays Another surprise which occurs on the real machine as well is that a pseudoinstruction expands to several machine instructions When you single step or examine memory the instructions that you see are different from the source program The correspondence between the two sets of instructions is fairly simple since SPlM does not reorganize instructions to ll delay slots Byte Order Processors can number bytes within a word so the byte with the lowest num7 ber is either the leftmost or rightmost one The convention used by a machine Appendix A Assemblers Linkers and the SPIM Simulator is called its byte order MIPS processors can operate with either bigiendian or littleiendian byte order For example in a bigiendian machine the directive byte 0 i 2 3 would result in a memory word containing H I while in a littleiendian machine the word would contain I H SPIM operates with both byte orders SPIM s byte order is the same as the byte order of the underlying machine that runs the simulator For example on a DECstation 3100 or Intel 80x86 SPIM is littleiendian while on a Macintosh or Sun SPARC SPIM is bigiendian System Calls SPIM provides a small set of operatingisystemilike services through the sys7 tem call syscai i instruction To request a service a program loads the sys7 tem call code see Figure Al7 into register v0 and arguments into registers a07a3 or t i 2 for oatingipoint values System calls that return values put their results in register v0 or t0 for floatingipoint results For exam ple the following code prints tne answer 5 data str aSCiiZ tne answer text ii v0 4 system caii code for printistr ia a0 str address of string to print syscaii print tne string ii v0 i system caii code for pri tii t ii a0 5 integer to print syscaii print it The pr i nt i nt system call is passed an integer and prints it on the console pr i nti oat prints a single oatingipoint number pr i ntidoubi e prints a double precision number and pr i ntistr i rig is passed a pointer to a nulliteri minated string which it writes to the console A10 MIPS R2000 Assembly Language A49 FIGURE A11 System services The system calls readii mt readi oat and readidoubl e read an entire line of input up to and including the newline Characters following the num7 ber are ignored readistr l mg has the same semantics as the Unix library roui tine fgets It reads up to n 7 1 characters into a buffer and terminates the string with a null byte If fewer than n 7 1 characters are on the current line readistr l mg reads up to and including the newline and again nulliterminates the string Warning Programs that use these syscalls to read from the terminal should not use memoryimapped IO see section AB Finally Sbrk returns a pointer to a block of memory containing n additional bytes and ex l t stops a program from running MIPS R2000 Assembly Language A MIPS processor consists of an integer processing unit the CPU and a col lection of coprocessors that perform ancillary tasks or operate on other types of data such as oatingipoint numbers see Figure AIS SPIM simulates two 1 F J 0 handles 1 39 interrupts and the virtual memory system SPIM simulates most of the rst two and entirely omits details of the memory system Coprocessor l is the oatingipoint unit SPIM simulates most aspects of this unit Addressing Modes MIPS is a loadistore architecture which means that only load and store instructions access memory Computation instructions operate only on values in registers The bare machine provides only one memoryiaddressing mode C rx which uses the sum of the immediate C and register rx as the address The virtual machine provides the following addressing modes for load and store instructions A50 Appendix A Assemblers Linkers and the SPIM Simulator Memory CPU Coprocessor 1 FPU Reglslers Regl 51ers Arllhmeuc unit Coprocessor 0 traps and memory Reglslers FIGURE L18 MIPS R2000 CPU and FPU Contents Contents Contents Most load and store instructions operate only on aligned data A quantity is aligned if its memory address is a multiple ofits size in bytes Therefore a half word object must be stored at even addresses and a full word object must be stored at addresses that are a multiple of four However MIPS provides some instructions to manipulate unaligned data W l lvvr SWl and swr A10 MIPS R2000 Assembly Language A51 Elaboration The MIPS assembler and SPIM synthesizes the more complex addressing modes by producing one or more instructions before the load or store to compute a complex address For example suppose that the label tab l e referred to memory location 0x10000004 and a program contained the instruction id a0 table Masai The assembler would translate this instruction into the instructions l u l at 4096 addu at at al lw a0 8at The first instruction loads the upper bits of the label s address into register at which the register that the assemble reserves for its own use The second instruction adds the contents of register a l to the label s partial address Finally the load instruction uses the hardware address mode to add the sum of the lower bits of the label s address and the offset from the original instruction to the value in register at Assembler Syntax Comments in assembler les begin With a sharp sign Everything from the sharp sign to the end of the line is ignored Identi ers are a sequence of alphanumeric characters underbars 7 and dots that do not begin With a number Instruction opcodes are reserved words that cannot be used as identi ers Labels are declared by putting them at the beginning of a line followed by a colon for example data item word l text globl mam Must be global mam lw t0 item Numbers are base 10 by default If they are preceded by UK they are inter preted as hexadecimal Hence 256 and 0x100 denote the same value Strings are enclosed in doublequotes Special characters in strings follow the C convention I newlinen l tab t I quotequot SPIM supports a subset of the MIPS assembler directives a l l gm m Align the next datum on a 2quot byte boundary For example al l gm 2 aligns the next value on a word boundary al l gm 0 turns off automatic alignment of ha l f word float and double directives until the next data or Kdata directive asc l l str Store the string str in memory but do not nulliterminate it ASZ Appendix A Assemblers Linkers and the SPIM Simulator asc i Z str Store the string str in memory and nulliterminate it byte bl bm Store the n values in successive bytes of memory data ltaddrgt Subsequent items are stored in the data segment If the optional argument addr is present subsequent items are stored starting at address addn doub e d l do Store the n floatingipoint double precision num7 bers in successive memory locations extem sym s ze Declare that the datum stored at sym is size bytes large and is a global label This directive enables the assembler to store the datum in a portion of the data seg7 ment that is ef ciently accessed via register gp oat tn Store the n floatingipoint single precision numbers in successive memory locations 9 Ob sym Declare that label sym is global and can be referenced from other les hai t m hm Store the n l6ibit quantities in successive memory halfwords Kdata ltaddrgt Subsequent data items are stored in the kernel data seg7 ment If the optional argument addris present subsequent items are stored starting at address addn Ktext ltaddrgt Subsequent items are put in the kernel text segment ln lM these items may only be instructions or words see the word directive below If the optional argument addr is present subsequent items are stored starting at address r set oat and set at The first directive prevents SPIM from com plaining about subsequent instructions that use register at The second directive reenables the warning Since pseudoinstructions expand into code that uses register at programmers must be very careful about leaving val ues in this register space H Allocate n bytes of space in the current segment which must be the data segment in SP text ltaddrgt Subsequent items are put in the user text segment ln SPIM these items may only be instructions or words see the word directive below If the optional argument addr is present subsequent items are stored starting at address addr A10 MIPS R2000 Assembly Language A53 word W I W Store the n 327bit quantities in successive memory words SPIM does not distinguish various parts of the data segment data rd ata and sdata Encoding MIPS Instructions Figure A19 explains how a MIPS instruction is encoded in a binary number Each column contains instruction encodings for a eld a contiguous group of bits from an instruction The numbers at the left margin are values for a eld For example theJ opcode has a value of 2 in the opcode eld The text at the top of a column names a eld and speci es which bits it occupies in an instruction For example the Op eld is contained in bits 26731 of an instruc7 tion This eld encodes most instructions However some groups of instruc7 tions use additional elds to distinguish related instructions For example the different oatingipoint instructions are speci ed by bits 075 The arrows from the rst column show which opcodes use these additional elds Instruction Format The rest of this appendix describes both the instructions implemented by actual MIPS hardware and the pseudoinstructions provided by the MIPS assembler The two types of instructions are easily distinguished Actual instructions depict the elds in their binary representation For example in Addition with over ow add rd rt n n 6 5 5 5 5 6 the add instruction consists of six elds Each eld s size in bits is the small number below the eld This instruction begins with 6 bits of Us Register speci ers begin with an r so the next eld is a 57bit register speci er called r8 This is the same register that is the second argument in the symbolic assembly at the left of this line Another common eld is I mmqe which is a 16bit imme7 diate number Pseudoinstructions follow roughly the same conventions but omit instruc7 tion encoding information For example Multiply without over ow muI rdest rsrci src2 pseudoinstruction In pseudoinstructions rdest and rsrc are registers and src2 is either a reg ister or an immediate value In general the assembler and SPIM translate a more general form of an instruction eg add EEV I 210 0x55 to a special ized form eg addi v I a0 0x55 10 16 0M3 126 7a funcz539 7a funcz539 0 00 SH 0 add 1 01 1 1 Sub 2 02 J 2 Srl 2 muIJr 3 03 Jal 3 Sra 3 dlvf 4 04 beq 4 4 5 05 brle 5 5 abs 6 06 blez 6 Srlv 6 movf 7 07 bgu 7 Srav 7 8 08 3 1 8 Jr 8 9 09 addlu 9 Jalr 9 10 03 Slu 10 10 1 1 0b Slllu 11 1 1 12 0c andl 12 SyScaII 12 13 0d 0 13 break 13 14 Oe xorl 14 14 1 5 0f lul 15 1 5 1 6 10 l 0 16 I 1 6 17 1 1 z 1 17 mlhl 17 18 12 l 2 18 m o 18 19 13 I 3 19 mtlo 19 20 14 20 20 21 15 21 21 22 16 22 22 23 17 23 23 24 18 24 mull 24 25 19 25 multu 25 26 1a 26 dIV 26 27 1b 27 dlvu 27 28 1c 28 28 29 1d 29 29 30 1e funcl rt 30 30 31 1f 402 2016 31 31 32 20 0 0 0 blll 32 add 32 cvtsf 33 21 lb 1 1 tlbr 1 bgez 33 addu 33 cvtdf 34 22 Ih 2 2 tlbWI 2 34 ub 34 35 23 MI 3 3 3 35 Subu 35 36 24 IW 4 mtcz 4 4 36 and 36 cvtWf 37 25 Ibu 5 5 5 37 or 37 38 26 Ihu 6 ctcz 6 thWr 6 38 xor 39 27 Mr 7 7 1le 7 39 nor 40 28 8 If 8 8 40 41 29 Sb 9 9 9 41 42 2a Sh 10 10 10 42 It 43 2b SW 11 If 11 11 43 Sltu 44 20 SW 12 12 12 44 45 2d 13 13 13 45 46 2e 14 If Z 0 1 4 14 46 47 2 f SWr 15 15 rte 15 47 48 3 16 copl 16 16 bltzal 48 49 31 Wc0 17 copl 7 17 17 bgezal 49 50 32 WC1 18 Ilz If1 18 18 50 51 33 IWCZ 19 fd 19 19 51 52 34 WC3 20 20 20 52 53 5 21 21 21 53 54 36 22 22 22 54 55 3 23 23 23 55 56 3 24 24 24 56 57 SWcO 25 25 25 57 58 3a SWc1 26 26 26 58 59 3b SW02 27 27 27 59 60 3c SW03 28 28 28 60 61 29 29 29 61 62 3e 30 30 30 62 63 3 31 31 31 63 FIGURE L1 9 MIPS opcode map The values of each eld are shown to its left The rst column shows the values in base 10 and the second shows base 16 for the op eld bits 31 to 26 in the third column This op eld completely speci es the MIPS operation except for 6 op values 0 1 16 17 18 and 19 These operations are determined by other elds identi ed by point ers The last eld funct uses 1 to mean s ifrs 16 and op 17 or d if rs 17 and op 17 The second eld rs uses 2quot o n 0quot 1quot or 3quot ifo 16 17 18 or 19 respectively lfrs 16 the operation is speci ed elsewhere ifz 0 the operations are speci ed in the fourth eld bits 4 to 0 ifz 1 then the operations are in the last eld with f s If rs 17 and z 1 then the operations are in the last eld with f d page A754 A10 MIPS R2000 Assembly Language A55 Arithmetic and Logical Instructions Absolute value abs rdest rsrc pseudoinstructi on Put the absolute value of register rare in register rdest Addition with over ow add rd rt nn 6 5 5 5 5 6 Addition without over ow addu rd rt nn 6 5 5 5 5 6 Put the sum of registers rs and rt into register rd Addition immediate with over ow 5 5 l6 6 Addition immediate without over ow 5 5 6 16 Put the sum of register rs and the signiextended immediate into register rt AND and my mun 6 5 5 5 5 6 Put the logical AND of registers rs and rt into register rd AND immediate 6 5 5 16 Put the logical AND of register rs and the zeroiextended immediate into register rt A56 Appendix A Assemblers Linkers and the SPIM Simulator Divide with over ow 5 5 6 6 10 Divide without over ow 5 5 6 6 10 Divide register r8 by register rt Leave the quotient in register i O and the re mainder in register h i Note that if an operand is negative the remainder is uni specified by the MIPS architecture and depends on the convention of the machine on Which SPIM is run Divide with over ow d V rdest rsrci src2 pseudoinstmcn39on Divide without over ow divu rdest rsrci src2 pseudoinstmcn39on Put the quotient of register rsrci and scm into register rdest Multiply 6 5 5 10 6 Unsigned multiply 6 5 5 6 10 Multiply registers rs and rt Leave the lowiorder word of the product in reg ister i O and the highiorder word in register h i Multiply without over ow mui rdest rsrci src2 pseudoinstrucn39on Multiply with over ow muio rdest rsrci src2 pseudoinstmcn39on A10 MIPS R2000 Assembly Language A57 Unsigned multiply with over ow mu i ou rdest rs rci src2 pseudoinstrucn39on Put the product of register rsrci and scm into register rdest Negate value with over ow neg rdest rsrc pseudoinstrucn39on Negate value without over ow negu rdest rs rc pseudoinstrucn39on Put the negative of register rsrc into register rdest Put the logical NOR of registers rs and rt into register rd NOT not rdest rsrc pseudoinstrucn39on Put the bitWise logical negation of register rsrc into register rdest 0R 6 5 5 5 5 6 Put the logical OR of registers rs and rd into register rt 0R immediate 6 5 5 16 Put the logical OR of register rs and the zeroiextended immediate into register rt Remainder rem rdest rsrci rscm pseudoinstrucn39on A53 Appendix A Assemblers Linkers and the SPIM Simulator Unsigned remainder remu rdest rsrci rscm pseudoinstmcn39on Put the remainder of register rsrc i divided by register rsrCZ into register rdest Note that if an operand is negative the remainder is unspecified by the MIPS architecture and depends on the convention of the machine on Which SPIM is run Shift left logical w my sham quot 5 rt rd 6 5 5 5 6 Shift left logical variable s w rd rs nn 6 5 5 5 5 6 Shift right arithmetic 3 6 6 5 rd mm mm 5 5 5 5 Shift right arithmetic variable 6 5 5 5 5 6 Shift right logical sw rd mm m 5 5 5 6 6 5 Shift right logical variable w my rs nun 6 5 5 5 5 6 Shift register rt left right by the distance indicated by immediate 8a or the register rs and put the result in register rd Rotate left roi rdest rsrci rscm pseudoinstmcn39on A10 MIPS R2000 Assembly Language A59 Rotate right ror rdest rsrci rs rc2 pseudoinstmcn39on Rotate register rsrc l left right by the distance indicated by rsrCZ and put the result in register rdest Subtract with over ow sub rd mun 6 5 5 5 5 6 Subtract without over ow sum rd mun 6 5 5 5 5 6 Put the difference of registers rs and rt into register rd Exclusive 0R 6 5 5 5 5 6 Put the logical XOR of registers rs and rt into register rd XOR immediate 6 5 5 16 Put the logical XOR of register rs and the zeroiextended immediate into reg ister rt ConstantManipulating Instructions Load upper immediate 6 5 5 16 Load the lower halfword of the immediate mm into the upper halfword of reg ister rt The lower bits of the register are set to 0 Load immediate l rdest mm pseudoinstmcn39on Move the immediate mm into register rdest AGO Appendix A Assemblers Linkers and the SPIM Simulator Comparison Instructions Set less than sit rd mun 6 5 5 5 5 6 Set less than unsigned Sm rd mun 6 5 5 5 5 6 Set register rd to 1 if register r8 is less than rt and to 0 otherwise Set less than immediate 6 5 5 16 Set less than unsigned immediate 6 5 5 16 Set register rd to 1 if register r8 is less than the signiextended immediate and to 0 otherwise Set equal seq rdest rsrcl rscm pseudoinstmcn39on Set register rdest to 1 if register rsrci equals rsrCZ and to 0 otherwise Set greater than equal Sge rdest rsrcl rscm pseudoinstmcn39on Set greater than equal unsigned Sgeu rdest rs rcl rscm pseudoinstmcn39on Set register rdest to 1 if register rsrc l is greater than or equal to rsrCZ and to 0 otherwise A10 MIPS R2000 Assembly Language A61 Set greater than Sgt rdest rsrcl rscm pseudoinstmction Set greater than unsigned Sgtu rdest rs rel rscm pseudoinstmction Set register rdest to 1 if register rsrc l is greater than rsrCZ and to 0 other wise Set less than equal Sle rdest rsrcl rscm pseudoinstmction Set less than equal unsigned S l eu rdest rs rel rscm pseudoinstmction Set register rdest to l ifregister rsrc l is less than or equal to rsrCZ and to 0 otherwise Set not equal S e rdest rsrcl rscm pseudoinstmction Set register rdest to 1 if register rsrc l is not equal to rsrCZ and to 0 other wise Branch Instructions Branch instructions use a signed l6ibit instruction offset eld hence they can jump 2 7 l instructions not bytes forward or 215 instructions backwards Thejump instruction contains a 26bit address eld In the descriptions below the offsets are not speci ed Instead the instruc7 tions branch to a label This is the form used in most assembly language pro grams because the distance between instructions is dif cult to calculate when pseudoinstructions expand into several real instructions Branch instruction b l abe l pseudoinstm cn on Unconditionally branch to the instruction at the label Appendix A Assemblers Linkers and the SPIM Simulator Branch coprocessor 1 true beze label a 6 5 5 16 Branch coprocessor 1 false beze label an 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if Z s condition flag is true false Zis 0 l 2 or 3 The floatingepoint unit is Z 1 Branch on equal bee be be label 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 equals rt Branch on greater than equal lero bbez label 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is greater than or equal to 0 Branch on greater than equal lero and link bbezal label 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is greater than or equal to 0 Save the address of the next instruction in register 31 Branch on greater than lero bbez label I 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is greater than A10 MIPS R2000 Assembly Language Hi3 Branch on less than equal lero blez label an 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is less than or equal to 0 Branch on less than and link blezal label 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is less than 0 Save the address of the next instruction in register 31 Branch on less than lero blez label I 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is less than 0 Branch on not equal bbe label 6 5 5 l6 Conditionally branch the number of instructions specified by the offset if register r8 is not equal to rt Branch on equal lero beqz rsrc label pseudoinstmcn39on Conditionally branch to the instruction at the label if rsrc l equals 0 Branch on greater than equal bge rsrcl rscm label pseudoinstmcn39on Branch on greater than equal unsigned bgeu rsrcl rs rc2 label pseudoinstmcn39on Conditionally branch to the instruction at the label if register rsrc l is greater than or equal to rscm A64 Appendix A Assemblers Linkers and the SPIM Simulator Branch on greater than bgt rsrcl src2 label pseudoinstmcn39on Branch on greater than unsigned bgtu rsrcl rcm label pseudoinstmcn39on Conditionally branch to the instruction at the label if register rsrc l is greater t an scm Branch on less than equal ble rsrcl src2 label pseudoinstmcn39on Branch on less than equal unsigned b l eu rsrcl src2 label pseudoinstmcn39on Conditionally branch to the instruction at the label if register r8 rc l is less than or equal to scm Branch on less than blt rsrcl rscm label pseudoinstmcn39on Branch on less than unsigned b l tu rsrcl rs rc2 label pseudoinstmcn39on Conditionally branch to the instruction at the label if register r8 rc l is less than rs r c 2 Branch on not equal lero bnez rsrc label pseudoinstmcn39on Conditionally branch to the instruction at the label if register rsrc is not equal to 0 A10 MIPS R2000 Assembly Language AGS Jump Instructions Jump i 6 26 Unconditionallyjump to the instruction at target Jump and link at 6 26 Unconditionally jump to the instruction at target Save the address of the next instruction in register rd Jump and link register Jaw rs rd unnu 6 5 5 5 5 6 Unconditionallyjump to the instruction Whose address is in register r8 Save the address of the next instruction in register rd Which defaults to 31 Jump register Ms 6 5 5 16 Unconditionallyjump to the instruction Whose address is in register r8 Load Instructions Load address a rdest address pseudoinstmcn39on Load computed addressinot the contents of the locationiinto register rdest Load byte m address 16 6 5 5 Appendix A Assemblers Linkers and the SPIM Simulator Load unsigned byte lbu rt address Offset 6 5 5 16 Load the byte at address into register rt The byte is signeextended by 1 b but not by 1 bu Load halfword w address 6 5 5 16 Load unsigned halfword lhu rt address Offset 6 5 5 16 Load the 16bit quantity halfword at address into register tt The halfword is signeextended by 1 h but not by 1 bu Load word tw address 6 5 5 16 Load the 32bit quantity word at address into register rt Load word coprocessor lvvcz rt address Offset 6 5 5 16 Load the word at address into register rt of coprocessor Z 073 The oating point unit is Z 1 Load word left 1W1 rt address Offset 6 5 5 16 Load word right lvvr Rt address Offset 6 5 5 16 Load the left right bytes from the word at the possibly unaligned address into register rt A10 MIPS R2000 Assembly Language A67 Load doubleword id rdest address pseudoinstmcn39on Load the 64bit quantity at address into registers rdest and rdest l Unaligned load halfword uih rdest address pseudoinstmcn39on Unaligned load halfword unsigned u i m rdest address pseudoinstmcn39on Load the 16bit quantity halfword at the possibly unaligned address into register rdest The halfword is signeextended by u l h but not u l m Unaligned load word u i W rdest add ress pseudoinstmcn39on Load the 32bit quantity word at the possibly unaligned address into register rd est Store Instructions Store byte se address 6 5 5 16 Store the loW byte from register rt at address Store halfword sr address 6 5 5 16 Store the loW halfword from register rt at address Store word sd address 6 5 5 16 Store the word from register rt at address A63 Appendix A Assemblers Linkers and the SPIM Simulator Store word coprocessor address own 6 5 16 5 Store the word from register rt of coprocessor Z at address The oating point unitisZ 1 Store word left sdi address 6 5 5 16 Store word right 6 5 5 16 Store the left right bytes from register rt at the possibly unaligned address Store doubleword sd rsrc address pseudoinstmcn39on Store the 64bit quantity in registers rsrc and rsrc 1 at address Unaligned store halfword ush rsrc add ress pseudoinstmcn39on Store the loW halfword from register rsrc at the possibly unaligned address Unaligned store word usvv rsrc address pseudoinstmcn39on Store the word from register rsrc at the possibly unaligned address Data Movement Instructions Move move rdest rsrc pseudoinstmcn39on Move register rsrc to rdest Move from hi mfhi rd A10 MIPS R2000 Assembly Language A69 Move from lo m o rd 6 10 5 5 6 The multiply and divide unit produces its result in two additional registers h l and i 0 These instructions move values to and from these registers The mule tiply divide and remainder pseudoinstructions that make this unit appear to operate on the general registers move the result after the computation finishes Move the h l i O register to register rd Move to hi mm quot 6 5 15 6 Move to lo m quot 6 5 15 6 Move register R8 to the h l i O register Move from coprocessor z m rd nu 6 5 5 5 11 Coprocessors have their own register sets These instructions move values be tween these registers and the CPU s registers Move coprocessor Z s register rd to CPU register rt The oatingipoint unit is coprocessor Z 1 Move double from coprocessor 1 mfcl d rdest frs rcl pseudoinstmcn39on Move floatingipoint registers frsrcl and frsrcl l to CPU registers rdest and rdest l Move to coprocessor z mtcz rd rt 0x1 6 5 5 5 ll Move CPU register rt to coprocessor Z s register rd A70 Appendix A Assemblers Linkers and the SPIM Simulator FloatingPoint Instructions The MIPS has a floatingipoint coprocessor numbered 1 that operates on sin gle precision 32bit and double precision 647bit floatingipoint numbers This coprocessor has its own registers which are numbered f07f31 Because these registers are only 32 bits wide two of them are required to hold doubles so only floatingipoint registers with even numbers can hold double precision values Values are moved in or out ofthese registers one word 32 bits at a time by lWC l SWC l mtc l and mfc l instructions described above or by the l S l d S S and S d pseudoinstructions described below The ag set by oating point comparison operations is read by the CPU with its belt and bc l f in structions In the actual instructions below bits 21726 are 0 for single precision and l for double precision In the pseudoinstructions below fd est is a floatingipoint register eg f2 Floatingpoint absolute value double abs a m f5 n 6 5 5 5 5 6 Floatingpoint absolute value single abs 5 m f5 an 5 5 6 6 5 5 Compute the absolute value of the oatingipoint double single in register f8 and put it in register fd Floatingpoint addition double add d rd rs ft m 5 5 5 5 6 6 Floatingpoint addition single add 5 rd ts ft I 5 5 6 6 5 5 Compute the sum of the oatingipoint doubles singles in registers f8 and ft and put it in register fd A10 MIPS R2000 Assembly Language A71 Compare equal double ceqdfsftl0x11l1lnlfslfleclzl 6 5 4 Compare equal single ceqsfsftl0x11lolnlfslfleclzl 6 5 5 5 5 Z 4 Compare the floatingipoint double in register f8 against the one in ft and set the floatingipoint condition flag true if they are equal Use the lad t or bc l f instructions to test the value of this ag Compare less than equal double ciedfsftl0x11l1lnlrslochlzl 6 5 Compare less than equal single ciesfsftl0x11lolnlrslochlzl 6 5 5 5 5 Z 4 Compare the floatingipoint double in register f8 against the one in ft and set the oatingipoint condition ag true if the first is less than or equal to the sec ond Use the lad t or bc l finstructions to test the value of this ag Compare less than double c it d ts ft I 6 5 5 5 5 Z 4 Compare less than single c it s m ft an 6 5 5 5 5 Z 4 Compare the floatingipoint double in register f8 against the one in ft and set the condition flag true if the first is less than the second Use the belt or bc l f instructions to test the value of this ag Convert single to double a s m f5 n 6 5 5 5 5 6 A72 Appendix A Assemblers Linkers and the SPIM Simulator Convert integer to double a w m f5 an 6 5 5 5 5 6 Convert the single precision oatingipoint number or integer in register f8 to a double precision number and put it in register fd Convert double to single s a m f5 n 6 5 5 5 5 6 Convert integer to single s w m f5 an 6 5 5 5 5 6 Convert the double precision oatingipoint number or integer in register f8 to a single precision number and put it in register fd Convert double to integer w a m f5 n 6 5 5 5 5 6 Convert single to integer cvt w s m fs an 6 5 5 5 5 6 Convert the double or single precision oatingipoint number in register f8 to an integer and put it in register fd Floatingpoint divide double w a m m ft 6 5 5 5 5 6 Floatingpoint divide single w s m m ft 6 5 5 5 5 6 Compute the quotient of the floatingipoint cloubles singles in registers f8 and ft and put it in register fd A10 MIPS R2000 Assembly Language A73 Load oatingpoint double d fdest address pseudoinstmcn39on Load oatingpoint single s fdest add ress pseudoinstmcn39on Load the oatingipoint double single at add ress into register fd est Move oatingpoint double a m f5 an 6 5 5 5 5 6 Move oatingpoint single s m s nun 6 5 5 5 5 6 Move the floatingipoint double single from register fs to register fd Floatingpoint multiply double mm a m m ft 6 5 5 5 5 6 Floatingpoint multiply single mm s m m an 6 5 5 5 5 6 Compute the product of the oatingipoint doubles singles in registers fs and ft and put it in register fd Negate double neg a m f5 6 5 5 5 5 6 Negate single neg s m s n 6 5 5 5 5 6 Negate the floatingipoint double single in register fs and put it in register fd Appendix A Assemblers Linkers and the SPIM Simulator Store floatingpoint double s d fdest address pseudoinstrucn39on Store floatingpoint single s s fdest address pseudoinstrucn39on Store the oatingipoint double single in register dest at add ress Floatingpoint subtract double sub d rd rs ft 6 5 5 5 5 6 Floatingpoint subtract single sub s m s ft m 6 5 5 5 5 6 Compute the difference of the oatingipoint doubles singles in registers fs and ft and put it in register fd Exception and Interrupt Instructions Return from exception We 6 6 1 19 Restore the Status register System call syscaw n 6 20 6 Register v0 contains the number ofthe system call see Figure A 1 7 provided by SPIM A11 Concluding Remarks A75 Break break n 6 6 20 Cause exception code Exception l is reserved for the debugger No operation mop IIIIII 6 5 5 5 5 6 Do nothing Concluding Remarks Programming in assembly language requires a programmer to trade off help ful features of highilevel languagesisuch as data structures type checking and control constructsifor complete control over the instructions that a com puter executes External constraints on some applications such as response time or program size require a programmer to pay close attention to eve instruction However the cost of this level of attention is assembly language programs that are longer more timeiconsuming to write and more dif cult to maintain than highilevel language programs Moreover three trends are reducing the need to write programs in assembly language The rst trend is toward the improvement of compilers Modern compilers produce code that is typically comparable to the best handwritten code and is sometimes better The second trend is the introduction of new prof cessors that are not only faster but in the case of processors that execute mule tiple instructions simultaneously also more dif cult to program by hand In addition the rapid evolution of the modern computer favors highilevel lan guage programs that are not tied to a single architecture Finally we witness a trend towar 39 39 0 com ex 1 quot 39 mu t ized by complex graphic interfaces and many more features than their predecessors Large ap plications are written by teams of programmers and require the modularity and semantic checking features provided by highilevel languages To Probe Further Kane Q and Heinrich 1992 MIPS RISC Architecture Prentice Hall Englewood Cliffs NJ The last Word on the MIPS instruction set and assembly language programming on these machines Aho A R Sethi and J Ullman 1985 Compilers Principles Techniques and Tools Addison Wesley Reading MA Slightly dated and lacking in coverage ofmodem architectures butsn39ll the standard reference on compilers Appendix A Assemblers Linkers and the SPIM Simulator Key Terms A number of key terms have been introduced in this appendix Check the Glossary for de nitions of terms you are uncertain of absolute address interrupt handler separate compilation assembler directive local label source language backpatchin machine language stack segment calleersaved register macros static data callerrsaved register procedure call or stack frame symbol table data se ment recursive procedures text segment external or global label registerruse or procedurercall unresolved reference formal parame er c vention virtual mac 1ne forward reference relocation information Exerclses A1 5 lt A5gt Section A5 described how memory is partitioned on most MIPS systems Propose another way of dividing memory that meets the same goals A2 20 lt A6gt Rewrite the code for fact to use fewer instructions A3 5 lt A7gt Is it ever safe for a user program to use registers KO or K 7 AA 25 lt A7gt Section A7 contains code for a very simple exception hani dler One serious problem with this handler is that it disables interrupts for a long time This means that interrupts from a fast IO device may be lost Write a better exception handler that is interruptable and enables interrupts as quick 1y as possible A5 l 5 lt A7gt The simple exception handler always jumps back to the in struction following the exception This works fine unless the instruction that causes the exception is in the delay slot of a branch In that case the next in struction is the target of the branch Write a better handler that uses the EPC register to determine which instruction should be executed after the exception A6 5 lt A9gt Using SPIM write and test an adding machine program that repeatedly reads in integers and adds them into a running sum The program should stop when it gets an input that is 0 printing out the sum at that point Use the SPIM system calls described on pages A748 and A749 L13 Exercises A77 A7 5 lt A9gt Using SPIM write and test a program that reads in three intei gers and prints out the sum ofthe largest two of the three Use the SPIM system calls described on pages A748 and A749 You can break ties arbitrarily A8 5 lt A9gt Using SPIM write and test a program that reads in a positive integer using the SPIM system calls If the integer is not positive the program should terminate with the message Invalid Entry otherwise the program should print out the names of the digits of the integers delimited by exactly one space For example if the user entered 728 the output would be Seven Two Eight A9 25 lt A9gt Write and test a MIPS assembly language program to com pute and print the first 100 prime numbers A number n is prime if no numbers except 1 and n divide it evenly You should implement two routines l testipr l me n Return 1 ifn is prime and 0 ifn is not prime I ma l m 0 lterate over the integers testing if each is prime Print the rst 100 numbers that are prime Test your programs by running them on SPIM A10 AlO lO lt A6 A9gt Using SPIM write and test a recursive program for solving the classic mathematical recreation the Towers of Hanoi puzzle This will require the use of stack frames to support recursion The puzzle consists of three pegs l 2 and 3 and n disks the number n can vary typical values might be in the range from 1 to 8 Disk 1 is smaller than disk 2 which is in turn smaller than disk 3 and so forth with disk n being the largest Initial ly all the disks are on peg 1 starting with disk n on the bottom disk n 7 l on top of that and so forth up to disk 1 on the top The goal is to move all the disks to peg 2 You may only move one disk at a time that is the top disk from any of the three pegs onto the top of either of the other two pegs Moreover there is a constraint You must not place a larger disk on top of a smaller disk The C program on the next page can be used to help write your assembly lan guage program Appendix A Assemblers Linkers and the SPIM Simulator move m sma est disks from start to msh usmg extra voxd hano mt n mt start mt msh mt extra Wm O hanoxmei start extra msh prmtistrmgCMove disk prmtimtm prmtistrmg from peg prmtimtbtart prmtistrmg to peg prmtimt mxsh prmtistrmg hanoxmei extra msh start mam mt m prmtistrmgCEnter number of disksgt m readimto hamoxm i 2 3 return 0 Symbol table has names 7 what do these mean names can be identi ed during syntactical analysis but to determine meaning 7 semantic ysis 7 what kind of value is in x number how many bytes representation procedure parameter and result types how long should x be preserved 7 syntax can force correct declaration 7 but have we actually seen x before its use Semantic analysis goes beyond contextfree syntax Type System collection of types in programming language purpose 7 runtime safety expressiveness eg overloading better code eg more information to select most efficient instructions typechecking 7 assigning or inferring of types and checking for type related errors base types 7 INT STRING int char bool rules for constructing new types 7 structural or name equivalence 7 generative datatype in ML rules for interring type for each exp 7 for each operator mapping between operand type and result rules for conversion 7 vars 7 have declared type constants 7 implied 7 inc 7 look at decl amp compare actual and formal params 7 o en bottom up 7 can perform with parsing in g eral 7 need to establish relationships between type properties that may be in uenced by other nodes children parents ihlin elf and iterate in Tiger 7 different name spaces for types and vars U funcs create environments 7 type symbol gt Typesty creates Env 7 baseitenv INT STRING 7 enventry holds attributes for vars incs 7 var inc symbol gt enventry 7 venv prede ned mctions ord chr size need to interpret meaning of vars exps decls and type declarations inference rules correspond to recursion in tree walking code for type checker 7 can have a single procedure with specialized code for each node variant How does it work 7 build symbol table with attributes 7 resolve references to names and nd errors 39 same name in a contour need type name have a var name 7 walk tree again to assign types to expression nodes 7 check for errors need to establish binding ltname attributesgt and ui d environments for Tiger 2 environments types and values variables functions use these attributes to perform semantic checks 7 type checking operand overload break from nested loops check if all identi ers are de ned Tiger Environments signature TCiENV e si type ty Typesty bindings table entries for variable environments M datatype enventry vARentry of Cy ruNentry of formals ty list result ty type tenv ty symboltable type environment giving types associated with ype namesM type venv enventry symboltable varlable environments giving types for declared variables and functionsM type env tenv venv val baseienv env contalns predefined types tenv and functions venvM end signature TCiENV M Typechecker rnust translate all source level type specs into Typesty representation rnust assign type to all vars and exps unique used for NAME 7 for recursive es need to establish rules and operations for Tiger type system project 5 Types typessml M structure Types struct type unique unit ref N values used to provide unique type identities for declared record and array types datatype ty NIL null record matches any record type M UNIT represents no value M INT primitive integer type M STRING primitive string type M RECORD of symbolsymbol ty list unique l l NAME of symbolsymbol ty option ref for forward references M RROR for error recovery M end structure Types M unique is used to provide unique identities for declared record an array types These identities Will be tested to al39 determine type equ ity NIL and UNIT are used internally and are not directly expressible in the source code 39 All type declarations cause type ids to be bound to MIME types initially of form NAMEid ref NONE with the ref assigned the appropriate SOME ty value in a second pass Typechecking Exp type tenv Typesty symboltable type env enventry symboltable transexp env it tenv gt exp gt ty M fun transexp envtenv let fun gDpExpleftoperAplusOpr1ghtpos checklntg left pos checklntg rignt pos in g end TypeChecking Declarations transdec env tenv gt dec gt env tenv fun transdec envtenv let fun gVaDecvartypNONE1n1t let val ty transexp envtenv init val b VARentrylaccess tyty in enterenvvarb tenv end i gFunctionnedltnameparamsbodyposresultin let val b FUNentryl val env enterltenvnameb val env enterparamsparamsenv in transexp env tenv b env tenv end l 9 in g LR Parsers we still want to construct parse tree 7 nd the le most node that has not yet been constructed but all of Whose children have been constructed we have a sequence of children need to determine 7 appropriate sequence that represents the correct righthandside and 7 Which nonterminal does this sequence correspond to LRk parsers have 7 stack maintain states and constructedseen children an 7 input7 With k tokens lookahead 7 actions xl uji push to top of stack tokens from input reduce pop contents from top of the stack 7 children push to the top ofthe stack the parent 7 grammar rules give hypothesis Which set of children are We looking for and Which action should We apply 7 LR0 7 no lookahead just What s on the stack LR item 7 ab is possible and should be reduced toN iffound and 7 we have just recognized a Ngt ab 7 shi item Ngt ab 7 reduce item if is in front of a nonterminal We have to make sure to consider all possible alternatives I example With input id id 3 LR0 S gtE EgtTlET TgtidlE I initialitem setSO S gt E Egt T Egt ET Tgt id Tgt E DFA applied to the stack to tell us actions to perform shi reducegoto state edges are grammar symbols terminals and non terminals states are represented With LR item sew actions 7 shift push token on stack push new state on stack 7 reduce pop k symbols for righthandside offthe stack and push le handside non terminal on stack 7 goto advance past newly recognized nonterminal S4 1d 23 E T S0 s5 s7 g1 g6 S1 S3 s2 S2 r0 S gtE S3 s5 s7 g4 S4 12 EgtET S5 r3 Tgtid S6 r1 EgtT S7 s5 s7 g8 g6 S8 s3 s9 S9 r4 TgtE I hoW do We build DFA and parsing table I starting rule forms the initial item set 7IS gt 7 compute item all items sets compute edges bw item sets closureI repeat for each A egt axb in r for each X7gtg IIUX7gtg until 1 doesn t change GotoIX set J to empty each A egt axb in r JJUA7gtaXb return ClosureJ States Closurels 7gtEl Edges Empty repe Edges Edges U I7gtJ on x until States and Edges don t change states with outgoing edges have corresponding shi entries states with onequot reduce item are reduce states for corresponding rule otherwise 7 con ict shiftreduce con ict T gt id E then S5 Tgt id Tgt id E reducereduce conflict 5 gtV E Vgt id then S5 Tgt id V gtid con icm are frequent always with eps s LR0 7 reasonable size parsing table but lots of con icm need something more powerful SLR1 I Simple LR1 Use FOLLOW Sets and lookahead Determine which rule to applyaction to take has FOLLOW set and next token E gt T E shift E gt T reduce 7 in parsing table if lookahead matches a token from LOWE then reduce action is Valid may still have con icts a state includes multiple production rules action should depend on lookahead for thatquot rule in thatquot state FOLLOW Sew contain info for all possible alternatives of N from all possible States in the PDA LR1 0 keep lookahead set with each separate item 0 parser more powerful but increase in state 0 LR1 item 7 A gt ab x 7 production rule with position a has been parsed and is on top of stack ahead is a string derivable from b x closure 1 c for each A egt agtltb 2 in I for each Xrgtg 1 X egt 9 WM w1n F1r5tbZ until I doesn t change em for each A 7gt aXb Z in I J JU Aagt aXb 2 return ClosureJ LR1 example gt A 7gt Xb agt aAb gt B gt X wd d mm FOLLOWA FOLLOWB b S6 b S8 s5 SLR1 S11 ArgtaA b b b 8 2 shi on b or reduc e 131 LALR1 I LR1 orders of magnitude size increase compared to LR0S LR1 with LR1 may have many states with same cores but different lookahead sets I LookAhead LR 7 LALR1 core contents are determined only by results of shi s from other states LR more powerful than LALR LALR almost as powerful but smaller tables SML simplestweakest S1111 b 3812 ArgtaAb m in 2 shift on b in S7 reduce on b LALRU LALR1 Ambiguous Grammars dangling else ifa then ifb then s1 else s2 7 ifa then ifb then s1 else s2 7ifathen ifb then s1 else s2 7 can rewrite grammar or 7 resolve by shi ing I precedence 7 E gt E E l E r E 7 1 2 3 7 canrewrite grammar or 7 add precedence and associativity directives 7 Shi for rightassoc A or right operant higher precedence 7 Reduce for le assoc or left operand has higher priority MLYacc I LALR parser generator user declarations eghelp functions parser declarations eg term Danonterm p05 start prec eop error recovery m0 m0 grammar rules eg program exp exp Tiger grammar rules declaration Daverbo s e generates output 7 states and actions 0 Programming Assingment 1 is out 0 read Chapters 1 and 2 0 may want to write a warmup problem eg read in integers into binary tree traverse tree and output ins in increasing order 0 work in pairs alternating with assignments is not a good idea Lexical Analysis I lexical analysis breaking up the input into individual Words tokens begin print HelloWorld end output I token sequence of characters treated as aunit in the grammar of the language 7 begin print end identi er HelloWorld 7 some tokens have semantic values e g ldmtl a s and literals 7 not tokens commenm white spaces blank tab newline preprocessor directives Regular Expressions most appropriate for describing tokens nite descriptions for specifying possibly in nite languages core operations ho a 7 alternationM l N eg a l b a b 7 concatenation M39N e g a l b39a aa ba 7 epsilon E e g a bl E ab 7 repetition Mquot eg alb39a aa ha aaaa baaa abbreviations for convenience abcd a l b l c l d akMQOQ a b a any char except newline M Mlvlquot M Mla string 7 mm gum m mimmgw rm mu m mimmmmw mgrpm m ane Autnmata d lsbhekdwnhm Wm M marmmw kwmm39h m m bdadwnhh m mm mammal mm m mv mm wmwmmwmm W mu Wqymgnmgunmsm mmdmmmmmm M ammunmnm my hymn I with NDFA 7 need to guess correctly in order to recognize correct token longest priority 7 but NDFAs are still well suited for regexps I instead of guessing 7 explore all possibilities at once I make a DFA out ofNDFA 7 sew ofpossible states which can be reached along the way in the NDFA will correspond to a single state in the new DFA sClosure of set of states I aClosureS 7 set of states that can be reached from a state s in S without consuming any of the input just going through a edges 7 starting state in gure 7 14914 eClosureS all states already in S for each state in S 7 all states that can be reached through a edges from s DFAedgedc I d 7 initial set of states from NDFA I c 7 input char I DFAedgedc 7 set of states that can be reached from all the states in d through c or a edges DFAedgedc closureU edgesc for each s in d I NDFA gt DFA by making each set of states in NDFA correspond to a state in DFA I In the new DFA 7 start state d1 is closure of SI 7 for each symbol in alphabet DFAedge I Final state in new DFA 7 any ofNDFA states nal 7 must be marked by token type 7 if more than one token recognized 7 mark with one that occurs rst in spec 0 number of states can be minimized MLLex 7 lexical analyzer generator 1quot m L i lhun l w v um mm 7 Ie iczl quot39 quot iregexps for Structure Tokens Struct type pos int datatype token EOF of pospos IF of pospos ID of stringpospos NU39M of 1ntpospos expressions may depend upon state as COMMENT o0 o0 ltINITIALgt1f gtTokensIFyyposyypos2 ltINITIALgt W ltCOMMENTgt quot ltCOMMENTgt gt YYBEGIN COMMENT Continue 39 gt YYBEGIN INITIAL Continue 39 gt continue Basic Blocks and Traces procedure fragments gt frame and body Translate AST 7 Tree language 7 replace laguage speci c complex constructs With something more general and closer to machine 7 expressions including assignments 7 procedure calls returns 7 conditions The nal code is a sequence of instructions labels jump procedure calls and assignments Next step gt linearize the tree linearize tree into sequence of statements not the same Treestrn canonize define uninterrupted subsets 7 blocks of statements which will always execute sequentially 7 basic blocks organize blocks so as to form traces of possible executions traces slgnature cANoN val llnearlze Treestm 7gt Treestm 11 Val baschlocks Treestm 115 7gt Treestm 115 115 Treelabel val traceschedule Treestm 115 115 Treelabel 7gt eestm 115 end slgnature CANON M in general Tree exp and stm don t correspond exactly to machine languages 7 conditional jumps on real machines have one target 7 unlimited number of registers 7 high level constructs like ESEQ and CALL have sideeffects gt problems tree Statements and Express1ons Here is the detail of itree statements stm and itree expressions exp datatype stm SEQ of stm stm of label of exp label llst exp exp label label EXP of exp exp BIND of bmop exp exp MEM of ex TEMP of Temptemp cous r of m CALL of exp exp 115 111 w 444444 144444 SideEffects Sideeffects means updating the contents of a memory cell or a temporary regis er 7 ESEQse Where s is a list of statements that may contain MOVE statement BINOPopt1t2 instructions to compute t1 into ri instructions to compute t2 into rj rk lt ri o r39 But it Won t Work for this BINOPPLUSTEMP aESEQMOVETEMP auv 7 CALLeel by default puts the result in the retumresult re ister BINOPQLUS CALL CALL Canonical trees want pure expressions only 7 Why pull stmt out of exp ESEQ 7 to top pull calls up to top level 7 sideeffects register use 7 CALL ok in assignment MOVE return value or as a sideeffect only call at the top exp 7 no ESEQ no CALL 7 parent to CALL cannot be CALL strn 7 linearized into lists of stm 7 stm no SEQ EXP is exp or call 7 ok to have top call statements or to return call in move statements so 7 pull ESEQ SEQ to the top and front gt SEQ ESEQ chain can be simpli ed with lists of trees expressions CALL gt 7 ESEQMOVETEMP t CALL TEMP t ESEQSZESEQ52 e gt ESEQSEQSZ 52 e BINOPop ESEQS e1 e2 gt ESEQS BINOPop e1 e2 BINOPop e1 ESEQS e2 gt ESEQMOVETEMP tnew e1 ESEQS BINOPop TEMP tnew e2 BINOPop e1 ESEQs e2 77gt SEQS BINOP op e1 e2 if s and e1 commute ie are noninterfering the effects performed by s will not change the value computed by e1 have a commute function for each stmexp 7 extract expression list and provide build functions explist arg check if commute true when reordering cannonsml implements this you ll need to call it Rearranging itree statements Goal rearrange the list of canonical trees so that every CJUMP is immediately followed by its false branch LABELlf A basic block is a sequence of statements that is always entered at the beginning and exited at t e end 7 the rst statement is a LABEL 7 the last statement is a JUMP or CJUMP 7 there are no other LABELs JUMPs or CJUMPs in een o en used to analyze a program s control ow 1 take a list of canonical trees and form them into basic blocks 2 reorder the list of basic blocks into traces Canonical Trees gt Basic Blocks 39 Input a sequence of statements ie canonical trees the body of a mction Output a set ofbasic blocks Algorithm 39 if anew LABEL is found end the current block and start a new block 39 if a JUMP or CJUMP is found end the current block 39 if it results ablock not ending with a JUMP or CJUMP then a to the next block s label is appended to the block 39 if it results ablock Without a LABEL at the beginning invent a new LABEL and stuck it there 39 invent a new label done for the beginning ofthe epilogue 39 put JUMPNAME done at the end of the last basic block Simple Code Generation quotTranslationquot ConditionalEXprNode Exp1l Exp2 l Exp3 lEvaluate EXp l 21f Exp l is zero go to step 5 3Evaluate ans 9 Exp2 4Go to step 6 5Evaluate ans 9 Exp3 6Use the value computed in ans as the result of the conditional expression Variable Declarations VarDecINode ldent l Type lnitValue ldentifierNode TypeNode LiteralNOde AttributesRef VarDecINodeCodeGen 1 if ldentAddressAccessMode Global 2 then ldentAddressLabe e CreateUniqueLabel 3 GenGlobalDeclldentAddressLabel Type lnitVaIue 4 else ldentAddressOffset e GenLocalDeclType if lnitVaIue 2 null 5 then 0 Generate code to store lnitVaue in ldent Code Generation Support Routines GetTempType FreeTempAddress1 Adress2 GenLoadGlobalResult Type Label GenLoadLocalResult Type Offset GenLoadLiteralResult Type Value ldentifierNodeCodeGen if AddressAccessMode Global then Result 9 GenLoadGlobalResult Type AddressLabel elsif AddressAccessMode Local then Result 9 GenLoadLocalResult Type AddressOffset elsif AddressAccessMode Literal then if Result null then Result 9 GenLoadLiteralResult Type AddressValue else Result 9 AddressNodeLiteral AddressValue QONFDFNJgtFI N9 CodeGen Example PIusNodeCodeGen 1 LeftOperandCodeGen N RightOperandCodeGen 3 Result 9 GenAddResult Type LeftOperandResult RightOperandResult 4 FreeTempLeftOperandResult RightOperandResult Following is JVM code for AB3 where A is a global integer variable with a label of L1 in class C and B is a local integer variable assigned a local index an offset within the frame of 4 JVM Code Comments Generated By getstatic CLl I iload 4 iadd icons t3 iadd Push static integer eld onto stack Push integer local 4 onto stack Add top two stack locations Push integer literal 3 onto stack Add top two stack locations GenLoadGlobal GenLoadLocal GenAdd GenLoadLiteral GenAdd CodeGen for Conditionals Code for A lt B where A and B are local integer variables with frame offsets 2 and 3 iload 2 Push local 2 A onto the stack iload 3 Push local 3 B onto the stack ificmplt Ll Goto Ll if A lt B iconstO Push 0 false onto the stack goto L2 Skip around next instruction Ll iconstl Push 1 true onto the stack L2 Assignment AsgNode larget Source SX0r A ldentifierNode AsgNodeCodeGen 1 WN SourceCodeGen if SourceResultAccessMode e JumplfTrue JumplfFalse then Result 9 ConvertFromJumpCode Result SourceResultAccessMode SourceResultLabel else Result 9 SourceResult if TargetAddressAccessMode Global then GenStoreGlobalResult Type TargetAddressLabel lsExpr else GenStoreLocalResult Type TargetAddressOffset lsExpr Intermediate Code Generation 0 Translating the abstract syntax into the intermediate representation n all lexical and syntactic ems report all semantic errors compilers With 0 What should Intermediate Representation IR be like 7 not too lowlevel machine independent but also not too high 39 ns 0 M source languages to N machines withouth 7 MN M level so that We can do optimizatio 0 How to convert abstract syntax into IR Intermediate Representations IR I What makes agood IR 7 easy to convert from the absyn easy to convert into the machine code must be clear and simple must support various machineindependent optimizing transformations Some modern compilers use seveml IRs eg k3 in SMLNJ each IR in later phase is a little closer to the machine code than the previous phase Absyn gt 1R1 gt 1R2 gt IRk gt machine code 7 pros make the compiler cleaner simpler and easier to maintain 7 cons multiple passes of codetraversal compilation ay be slow The Tiger compil r uses one IR only the Intermediate Tree itree Absyn gt itree frags gt assem ly gt machine co e How to design itree stay in the middle of absyn and assembly Case Study itree 0 a new language 7 should describe simple things fetch store move jump but not at asslevel 0 tree structured convenient for 2target conditional branches easy for tree walk 0 datatype word wordSize will be specified in machinedependent modules eg Frame structure Tree TREE struct datatype stm SEQ of stm stm l and ex BINOP of blnop exp ex l and binop FPLUS l ernus l FDIV l FMUL l PLUS l MINUS l MUL l DIV D l on I LSHIFT l RSHIFT l ARSHIFT l xon andrelopEQlNElLTlGTlLElGE itree Statements and EXpressrons 0 Here is the detail of itree statements stm and itree expressions exp e stm seq of stm stm l LABEL of labe JUMP of exp label list came of relop exp exp label label X1D cons r of in CALL of exp exp list w 444444 14444 m x 39u u m H o M o rw gt7 m U H x o 0 m gtlt gt0 m gtlt gt0 itree Expressions itree expressions stand for the computation of some value possiblly with sideeffecm CONSTi the integer constant i NAMEn the symbolic constant n ie the assembly lang label TEMPt content of temporary t like registers unlimited number BTNOPoel e2 apply binary operator 0 to operands el and e2 here el must be evaluated before e2 itree Expressions cont d MEMe the contenm of wordSize bytes of memory starting at address 8 7 if used as the le child ofa MOVE it means store 7 otherwise it means fetch CALL l a procedure call the application of function f the expression f is evaluated firsL then the expression list for argumenm l are evaluated from left to right ESEQSe the statement S is evaluated for side effects then 8 is evaluated for a result itree Statements itree statements performs sideeffects and control flow no return va ue SEQsls2 statement sl followed by s2 EXPe evaluate expression e and discard the result LABELn de ne n as the current code address just like a label de nition in the assembly language MOVETEMP t e evaluate e and move it into tempomry t MOVEMEMe e2 evaluate el to address adr then evaluate e2 and store its result into Madr JUMPe labels jump to the address e the common case is 39 39 known label 1 JUMPNAME1 labels 7 possible jump addresses based on value of expression e eg switch CJUMPoele2tf conditional jump rst evaluate el and then e2 do comparison 0 ifthe result is true jump to label t otherwise jump the label f a a atype frag PROC itree Fragments How to represent Tiger function declarations inside the itree 7 can only represent the body representing it as a itree PROC fr ent of name Tree iabei mmon name body Tree stm unctmn body rame Frame frame framz layout l DATA of string each itree PROC fragment will be translated into a function de nition in the nal assembly code The itree DATA fragment is used to denote Tiger string literal It will be placed as string constants in the final assembly code Our job is to convert Absyn into a list of itree Fragmenm function entryexit added latertarget dependent Absyn gt itree Frags 0 Each absyn function declaration is translated into an itree PROC frag 7 mctions are no longer nested must gure out the stack for local and nonlocal variables 7 must convert function body Absynexp into itree stm 7 calling conventions for Ti r mc ge mctions which uses standard conventr tions and external C 39on 0 Each string literal or real constant is translated into an itree DATA frag associated with a assembly code label 0 translate itreeFrags into the assembly code of your favourite machine eg MIPS 0 More later frame layout information and the runtime access information Kind of expressions I compute a value Ex gt Treeexp I return no value eg while Nx gt Treestm I conditional gt Treestm destination labels I generic type of expression in the Translate module datatype exp 7 Ex of Treeex l Nx of Treestm l Cx of Templabel Templabel 7gt Treestm I Util Mapping Absyn Exp into itree I Each Absynexp that computes a value is tmnslated into Treeexp I Each Absynexp that returns no value is translated into Treestm I Each condititional Absynexp which computes a boolean value is translated into a rm 39on Treelabel Treelabel gt Treestm Tiger Expression agtb l cltd would be translated into Cxfn tf 7gt SEQCJUMPGTabtZ SEQLABEL z cJUMPLTcdtf one kind of expression to ano I but ag agtb l cltd 7this is Ex need ability to convert from r MOVE flag unEx e ity mctions for convertion among Ex Nx and Cx expressions unEx gexp exp uan gexp 7gt Treeiabei Treeiabei 7gt Treestm fun seq 1 7 error l seq a 7 a l seqa r 7 SEQ a seq r 7 TESEQ5 TCONST 0 l unExCx genstm let val r val t P em T newlabe 7 l and f 7 Tnewlabel 1n TESEQseqTMOVETTEMP r TCONST 1 genstmtf TLABEL f TMOVETTEMP r TCONST 0 TLABEL t TTEMP r Compiling Modules amp Packages What do we find in a module or package declaration A name and components like in a record Components declared in a new name space Components may include procedures and functions The modulepackage declares an instance not a template Compilation challenges Incomplete declarations Procedure headers Opaque and private types Specificationbody separation 10f5 Semantic Processing for ModulesPackages The AST node for a package will include o The name A pointer to a component list In Ada optionally a pointer to the private part which is just another component list Semantic processing for a package AST node Put the name in the symbol table as a package name Create a new symbol table Declare all of the components in that symbol table What kind of names are the components Components in the private part must not be externally visible If there is a separate package body AST node Make sure the body is not already defined Add body components to the symbol table making sure they are not visible externally Match proceduresfunctions in the body with headers declared as part of the interface 20f5 Code Generation for ModulesPackages Compiling an instance makes it easier but not trivial The variables among the components make up a single statically allocated data area much like a file in C Declarations possibly and a body if one is present generate executable initialization code How can you collect this possibly dispersed code 30f5 Compiling References to Package Components Syntax is identical to record components generating the same AST structure Lookup mechanism works just like for records Minor extensions are needed to handle packages as an alternative wherever records are expected 40f5 Simple Variables Define the frame and level type for each function definition t ype frame name Templabel formals access list locals lnt ref 1n Pframesml datatype level LEVEL of frame Frameframe parent level unlt ref l TOP type access level Erameaccess ln translatesml val trSlmpleVar access level 7gt exp I The access information for a variable v is a pair lk where l is the level in which v is defined and k is the frame access 7 in regs or frame I The frame offset lt allocLocal function in the Frame structure which is architecturedependant The access information will be put in the env in the typechecking Dha e Nemzm r all thi I To access a local variable v at offset k assuming the frame pointer is fp just MEMBINOPPLUS TEMP fp CONST k I To access a nonlocal variable v inside function f at level lf assuming v s access is lgk we do the following imam a Knl KJTEMP fp I Strip levels from If we use the static link offsem kl k2 from these levels to construct the tree When we reach lg we 5 op datatype level LEVEL of frame f rame rent level unlt ref l TOP I use unit ref to test if two levels are the same one lvalues and rvalues I lexp denote locations where values can be stored also known as l values 7 MEM ofexp location at address e 7 TEMP temp temporary a virtual register These can be used as targets of a MOVE stm representing either storing at a memory location or loading a regis er To use an lexp in an expression it must be transformed into an rvalue by applying the MEM operator 7 MEM means fetch or store depending on Which side Tl ofMOVE is it o Array and Record Variables I In Pascal an array variable stands for the contents of the array the assignment will do the copying var a b array 1 12 of integer begin a end I In Ti er and ML an army or record variable just stands for the pointer to the object not the actual object The assignment just assigns t e pointer l t type lntArray array of lnt var a lntArrayrlzi of 0 var b lntArrayrlzi of 7 1n a b end I In C assignment on army variables are illegal lnt a12b12c a s lllegall c a ls legall Array Subscription If a is an array variable represented as MEMe then array subscription ai would be ws is the word size MEINOP PLUS MEMe BINOP MUL l CONST WS if the array bounds are LH and the subscript is i then report runtime errors when either i lt L or i gt H happens Array subscription can be either lvalues or rvalues use it properly Record field selection can be translated in the same way We calculate the offset of each field at compile time type point lntllst 7 hd lnt tl list the offset for hd and t1 is 0 and 4 To ensure safety we must do the arrayboundschecking Record and Array Creation Tiger record creation varz foo fl el fn en 7 We can irn lement this by calling the C malloc function and then move each ei to the corresponding eld of foo I Tiger array creation varz foo n ofinitv 7 by calling a C initArraysizeinitv function which allocates an array of size size with initial value initv 7 to support arrayboundschecking We can put the array length inthe 0th eld zi is accessed at offset ilWord7sz Requirement a Way to call external C rnctions inside Tiger 7 CALLNAMETempnamedlabel initArray ab Integer and String Integer absyn lntEXpi gt itree CONSTi Arithmetic absyn OpEXpi gt itree BlNOPi Strings every string literal in Tiger or C is the c ns address of a segment of memory initialized to the proper characters During translation from Absyn to itree we associate a label 1 for each string literal s 7 to refer to s just use NAME 1 Later we ll generate assembly instructions that define and initialize this label 1 and string literal s String representations 7 a Word containing the length followed by chamcters in Tiger 7 a pointer to a sequence of chamcters followed by 000 in C 7 a xed length array of characters in Pascal Conditionals Each comparison expression a lt b will be translated to a Cx generic expression fn tf 7gt LTabtf Given a conditional expression in absyn felteneZesee3 g w translate e1 e2 e3 into itree generic expressions e1 e2 e3 apply uan to el and unEx to e2 and e3 make three labels then case t and else case fandjoin j allocate a temporary r a er label t move e2 to r thenjurnp toj a er label f move e3 to r thenjurnp toj apply uaned version of el to label t and f Need to recognize certain special case x lt 5 amp a gt b 9354 V39 it is converted to ifx lt 5 then a gt b else 0 in absyn too many labels if using the above algorithm inef cient Loops Translating while loops test if not condition goto done the loop body goto test done 0 each round executes one conditional branch plus one jump goto test the loop body 1f condition goto top done 0 each round executes one conditional branch only Translating break statements 7 just JUMP to done 7 need to pass down the label done of closest enclosing loop when translating the loop body 0 Translating For loops lo to hl do body in while 1 lt llmlt do body 1ll end Function Calls 0 Inside a function g the function call fele2 en is translated into CALL NAME lf 51 el 62 en 7 s1 is the static link it is just a pointer to fs parent level 7 striping the level of g one by one genemte the code that follow g s chains of static links until We reach fs parent level 7 If is label for f 0 When calling external C functions what kind of static link do we pass 0 In the future we need to decide what is the calling convention where the callee is expecting the formal parameters and the static link Declarations Variable declaration need to figure out the offset in the frame then move the expression on the r hs to the proper slot in the frame Type declaration no need to generate any itree code Function declaration build the PROC itree fragment 7 Later We tmnslate PROCname label body stm frame to assembly iglobal name name prologue assembly code for body epilogue The prologue and epilogue captures the calling sequence and can be figured out from the frame layout information in frame Prologue and epilogue are o en machine dependant Function Declarations Generating prologue r 9 gt V psuedoinstructions to announce the beginning of a function a label definiton fo the function name an instruction to adjust the stack pointer allocating a new frame store instructions to save calleesave registers and dr store instructions to save arguments and static links Generating epilogue an instruction to move the return result to a special register 2 load instructions to restore calleesave registers 3 an instruction to reset the stack pointer pop the frame 4 a return instruction jump to the return address 5 psuedo instructions to announce the end of a function More Arrays in Tiger scalars wordsize addresses ok for arrays ai gt simple math ok if dense layout 7 consider mte at Which index varies cache behavior parallel processing vectors of vectors 7 char mons129 January February 7 char mons9 January February linked lists realloc extensible arrays in general 7 rank element type dimensions 7 if possible bounds for dimensions would like to do runtime checking ifpossi le 7 could have all these be dynamic A ArrayDl D2 D3 of ElemType SO sizeofElemType Sl DBquot SO 7 row size 2D2Sl 7 plane size 3Dl SZ 7 array size offsetijk 7 1 2 jSl ksO 3 anle alO in frame 7 array base or pointer to array base AST Nodes for Arrays and Records K arrayrec2doc ComponentList RecordTypeNode RecordTypeRef Fields Component Component ArrayTypeNode ArrayTypeRef First Last Elements J Processing Record Declarations RecordTypeNodeTypeSemantics Create a RecordTypeDescriptor for the new record type with Size initialized to 0 Fields initilaized with a newly created symbol table Set RecordTypeRef to point to this RecordTypeDescriptor For each of the components referenced by Fields Declare each of the Ids as fields of this record by putting it in the symbol table RecordTypeRef Fields Associate a FieldAttributes record with each ld with it indicating ldType the type included in the component declaration Offset RecordTypeRefSize Allocate space for the field by adding ldTypeSize to RecordTypeRefSize Note 1 The pseudocode above allocates offsets for the elds within the record as they are declared This could be deferred until the code generation pass Note 2 The result of this processing is a type descriptor referenced from the tree NOT an attributes record for a type name since there is not necessarily one The Attributes records created are in the symbol table that describes the record type itself K arrayrec2doc 2 Processing Array Declarations ArrayTypeNodeTypeSemantics If First and Last describe a valid range then Create an ArrayTypeDescriptor for the new type with Lower value of First Upper valueof Last EIementType EIementsTypeSemantics Size EIementTypeSize Upper Lower1 Set ArrayTypeRef to point to this TypeDescriptor else Produce any appropriate error message Set ArrayTypeRef to point to an ErrorType descriptor Note The pseudocode above computes the size of the array type as the type descrip tor is created This step could be deferred until the code generation pass K arrayrec2doc 3 AST for Record References IdRefNode Node Type Prefix Suffix IdentifierNode IdRefNode AttributesRef Node Type Name Prefix K arrayrec2doc Suffix IdentifierNode AttributesRef Name AST for Array References IdRefNode NodeType Prefix Suffix IdentifierNode dSuffixNode A T N d T AttributesRef may we 0 e me Name SufflexprLst Rest ExprListNode NodeType Exprs gt K arrayrec2doc Semantics for Record References IdRefNodeExprSemantics PrefixSemantics If Suffix NULL then Set NodeType from PrefixNodeType else SuffixIdSemanticsPrefixNodeType Set NodeType from SuffixNodeType IdRefNodedSemanticsLeftType if LeftType is not a record type then error Name before a is does not denote a record Set NodeType to ErrorType else PrefixFieldSemanticsLeftTypeFields If Suffix NULL then Set NodeType from PrefixNodeType else SuffixIdSemanticsPrefixNodeType Set NodeType from SuffixNodeType Identifier Node FieldSemantcs is left for you to supply K arrayrec2doc Semantics for Array References ldSuffixNodeldSemantiosLeftType Set ArrayType to the value of LeftType SuffixExprListArraySemanticsLeftType If Rest NULL then Set NodeType from SuffixExprListNodeType else Rest ldSemanticsSuffixExprList NodeType Set NodeType from RestNodeType ExprListNodeArraySemanticsLeftType Local variable LocalType Initialize LocalType to LeftType For each ExprNode E on the list If LocalType does not denotes an array type then error Name before an index expression does not denote an array Set LocalType to ErrorType exit for loop else EExprSemantics Check that ENodeType is lntegerType Set LocalType to LocalTypeElementType Set NodeType to LocalType K arrayrec2doc Code Generation for Record References ldRefNodeExprCodeGen Set Addr to the address returned by PrefixExprCodeGen lf Suffix NULL then Result 9 Addr else Result 9 SuffixldCodeGenAddr ldRefNodeldCodeGenLeftAddr Set Addr to the address returned by PrefixldCodeGenLeftAddr lf Suffix NULL then Result 9 Addr else Result 9 SuffixldCodeGenAddr ldentifierNodeldCodeGenLeftAddr Let Offset be the offset for the field specified by this node If LeftAddr is a variable address then Result 9 LeftAddr with Offset added to its VarOffset field else LeftAddr must be a register address Generate code to add Offset to the value in the register denoted by LeftAddr Result 9 LeftAddr K arrayrec2doc 8 Register Allocation Build Build interference graph 7 look at live ranges 7 for each a lt b1bm put an edge for all live ranges in liveout 7 for move instructions a lt c a and c are move related but do not add an edge in the graph llve 1n llverout gmem12 k g j k hk1 g j k h g fgh h g f j e mem8 f 3 e f 3 m meml6 e f 3 m e f b memf m e f b e m c e8 b e m c b m d c c b m d b m k 4 d b m k d b 3 b k d b 3 d k Simplify we want to Kcolor graph 7 ifa node v has lt K neighbors and all of its neighbors have assigned colors then we can color no e v as well 7 nodes with lt K edges 7 insigni cant nodes 7 nodes with gt K edges 7 signi cant nodes Simplify 7 remove all insigni cant nodes from G 7 color the gra h G 7 add the insigni cant nodes back and assign colors in the process Note each time you remove an insigni cant node v you may decrease the degree on remaining nodes in the graph gt possibilities for further simpli cation maintain a stack of nodes removed from graph then start to pop nodes of the stack 7 you can always color insigni cant nodes placed on the stack you can stop removing nodes from graph ie simplifying when there are lt K nodes remaining 7 just give each node a different color Spill What if we cannot simplify graph to G with ltK nodes because all remaining nodes are signi cant 7 we probably don t have enough registers gt have to put something in memory Pick a nodev and spill it Actual spill if you really need to spill a node need to rewrite code to include loadstore of value to memory location Potential spill you had to choose a victim to spill 7 better Delay spilling a rewriting code color OVSI maybe you will still be able to color the graph even if Pick a vertex to spill but place it on stack and hope that when you pop it back you ll be able to assign it a If not gt have to spill have to rewrite code and start I Coalesce move a lt c if a and c don t interfere move can be eliminated and a and c assigned to the same register 7 in general any two noninterfering nodes can be coalesced r edges ofnew node are union of edges of original nodes gt 39 better to be conservative and not apply this always as may end up with graph not colorable with K colors 39 conservative gt moves are better than spills nd conservative coalescing strategy 7 stick to moverelated nodes 7 pick strategy so that resulting graph is still colorable with K colors Coalescing strategies George coalesce a and b if for every neighbor t ofa eithert already interferes with b or t is insigni cant proof original G gt G1 a er insig nodes deleted a er coalescing and removal of nodes gt G2 a s insignigicant neighbors t will be removed in both cases plus some t s neighbors ofboth a and b may be removed from G2 G2 will be subset of G1 gt at least as easily colorable as G1 gt strategy is ok Steps 0 Build Simplify Choose Spill Select 7 Really Spill 7 if no ReBuild Done Coalescing strategies Briggs coalesce a and b if ab has lt K neighbors of significant degree proof if all insigni cant nodes removed and ab removed remaining graph will have less than K neighbors and will be colorable Step s Build 7 Mark moverelated or nonmove related Simplify 7 Remove insigni cant nonmove related nodes Coalesce 7 Remove constrained move edges interfermcexy movexz moveyz Freeze moverelated node of low degree 39s will be separate register don tneedto keep track ofmove for it Choose Spill Select Really Spill 7if no ReBuild Done Spilling cont d Rewrite when actual spill detected 7 rewrite the code to incorporate the spilling of t 39 replace twith a new temp t and store nfp t where n is a new frame slot allocated using allocLocal 39 at each node using I replace 1 with a new temp t and load tquot nfp 39 Note that t and I will be liveout for only one instruction 7 redo liveness analysis construction of interference graph and coloring using the rewritten code Improvements Split the live range if possible to avoid spilling Consider choice of color based on moverelated nodes if movexy pick 7 colorix coloriy if y colored or 7 colorix ltgt coloriy sineighbors if y not colored 7 coalescing does this Spilling adds registers to stack frame introduces new temporaries which can further cause spilling 7 can we limitreuse the stack frames Can have stackslot coloring algorithm based on liveness info gt can reuse stack slots Candidates for spilling What to spill 39 Spill the least used temp 7 statically least used fewest occurrences in the code 7 dynamically least used weight occurrences in loops higher 7 this minimizes runtime cost of spills number ofloads and stores 39 Spill the temp with the most interferences largest number ofadjaeent nodes in interference EMF 7 this removes the most edges decreasing likelihood of mher spills Apple uses combination 7 spill priority usesampdefs degree 7 usesampdefs within loops have larger weight Precolored Nodes Register Allocator should consider interference between temporaries and machine registers 7 special registers which may not be available 7 FP SP 7 also instruction requires use of speci c register Assign colors to machine registers Use precolored nodes in the interference graph 7 cannot simplify spill coalesce 7 in nite degree machine registersprecolored registers interfere among each other 7 can have one of each color only Caller amp CalleeSave Reg Callersave are dead at CALL 7 interfere With anything live across call Calleesave are live across entire call 7 CALL de nes them 7 are used by return of CALL 7 interfere With everything in procedure example 1n fun a m b in Ft r1 r2 7 caiier7save 1n 657 r3 calleeisave do d d b e e 7 l whlle e gt 0 return d l RegAlloc for Trees concern is registers at tile roots for sourcedestination allocation 7 r3 rl op r2 or loadstore 7 assume stackmemory for other vars 7 can have some bottomup algorithm jointly With instruction selectionMaxMunch 7 but 39 abcd emitlhsvsrhs rst on Proc entry C100 r7 r7 15 available here r7 100 re 7 now precolored register r7 has short liverange 39 time 7ifnecessaryt100 Will be spilled 7 ifpossible Will be coalesced and moves removed RegAlloc Implementation Need ef cient 7 get nodes adjacent to X list per node use 39 select don t need it for precolored nodes 7 are XY adjacent 7 matrix or hash tab e 7 lists of candidates for simplify spill coalesce e entation for frequent insertdelete which list does Xbelongt 7 moverelatednonmoverelated 7 degree 7 color assigned 7 aliases 7 stack numRege ife is trivial gt 1 ife is unop gt max 1 numRegearg if e is binop gt if numRegseleftOp numRegse rightOp e numRegs 1numRegsele Op else e numRegs maxl regs e Op regsrightOp Activation Records functionprocedure calls involve 7 supply environment for temporary memory local vars 7 pass information to the new environment 7 the parameters 7 return information from the routine 7 the PC need a data structure to maintain this and a calling protocol structure dynamically constructed when a function is called and destroyed upon return this lecture stack layout calling protocol and the static link Collection oflocals params PC temps etc maintained in an activation record or stack frame Typically functions are entered and exited in a lastin rstout order 7 gt stack is a suitable structure 7 stack opeartions not plain pushpop batch oflocals on enter exit not necessarily of xed size 7 gt keep track of pointers of caller and callee record What should go in frame incoming parameters 7 actually this is from previous frame local variables 7 some may be in registers temporary results 7 working stack return address when making a call to another function 7 and other administrative info outgoing parameters Layout of frame almost Frame Pointer points to record of caller function Stack Pointer points to top of current record access to elds on stack is through xed offset from FF 7 locals etc have negative offset stack grows towards lower addresses 7 parameters have positive offsets xed offset from SP 7 problem size of saved regs and temps not know until later where do argumentsresults go what if there are many what if they are big 7 put in memory and pass a reference 7 eg array an 7 want to be able to access ai 7 solves issue of dangling reference programming language semantics parameter passing by value need to put in memory by reference compiler should generate code to pass address and appropriate pointer dereferences Registers some local vars amp params in regs 7 which ones and howwhen is that safe we want to keep local vars in registers whenever possible who will guarantee that they are not destroyed 7 callersave 7 calleesave 7 not necessarily hardware enforced 7 more a convention Parameter passing keep parameters lk in registers others on stack eg F4 but iffcalls g and g calls h g will have to copy rlrk as well again 7 no copy on leaf proc dead vars intraproc allocation convention allocate pararns ln in outgoing params but don t copy lk copy pararns whose address is taken or if needed done by callee in Frame too many locals and temps vars too large var passed by reference need address arithmetic var needed in nested function so nested func would know where to loo reg needed for speci c purpose eg parameter passing var escapes ifneeds to go to frame when can we 39s Typical Calling Sequence 1 Call sequence done by the caller fbefore entering g assume FP and SP in regs r f puts arguments a1an onto the stack or in registers r f puts mction g s static link onto the stack or in a register 7 fputs the return address ofthis call to g onto the stack or in a register 7 f puts current FP onto the stack ie control link optional 7 Jump to g s code 2 Entry sequence thefirst thing done after entring the callee 7 move SP to FP r decrement SP by the frame size ofg stack grows downwards 7 optional save calleesave registers if any Return address on CALL put automatically in a register on nonleaf proc have to save it Calling protocol on call set up parameters save caller saved regs on enter allocate new frame save old FP set new FP save calleesaved regs eager vs lazy on ex1t restore calleesaved regs reset SP FP jump to PC free up frame 3 Return sequence the callee g exits and returns back to f 7 put the return result into a designated register 7 optional restore calleesave registers if any 7 fetch the return address to a register if in register do nothing 7 fetch the saved FP offback to the FP register 7 increment SP by the frame size of g pop off the activation of g 7 Return back to f Tiger Specifics also true for many other modern compilers 7 return address is put in a designated register 7 only maintain SP at runtime FP is a virtual reg SP framesize when implementing Tiger framesize of each mction is a compiletime constant 39 Must maintain a separate FF and SP 1 the ame size of a function may vary 2 the ames are not always contiguous e g linked list Compilers vs Architecture Instruction Scheduling Four way addition example I In a dyadic instruction set performing an operation on more than two sources requires temporary values Instruction scheduling compiler gt instruction list machine logic gt 7 fetch decode read args execute write result 7 but not necessarily single instruction at a time ILP degree of parallelism depends on 7 datadependencies 7 resource constrains Compiler approach 0 consider resource requirements for each instruction and make sure another instruction is not scheduled while resources are unavailable consider datadependencies and make sure instructions are not scheduled before data becomes available 0 easier said than done Scheduling Without resource bounds 0 we look at loops here 7 find groups of instructions which can be coscheduled and rewrite loop so that these are body of loop 7 this is important Example lt7 7 ea W171 A l r w t H a a and g V1 lt7 b 11 M1 lt7 d J lt7 Xl I We Want to nd data dependencies here 0 need to nd largest pattern to represent loop loop length depends on longest data dependency graph 0 may include instructions executing different iterations in higherlevel loop 7 obviously will need move instr i loop bounds check etc 7 can include cycle count in model Resourcebounded planning 0 pattern if existing depends on 7 datadependency graph 7 number ofinstruction requesting certain resource 7 number of copies We have of that resource 7 cyclecount time spent at resource 0 any angorithm should 7 consider instruction priorities 7 keep track of resource assignment 7 iterate through list of possible instructions by moving them back and forth schedule and option list
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'