### Create a StudySoup account

#### Be part of our community, it's free to join!

Already have a StudySoup account? Login here

# Compiler Dsgn&implem CS 5810

WMU

GPA 3.83

### View Full Document

## 52

## 0

## Popular in Course

## Popular in ComputerScienence

This 114 page Class Notes was uploaded by Lisette Hodkiewicz on Wednesday September 30, 2015. The Class Notes belongs to CS 5810 at Western Michigan University taught by Zijiang Yang in Fall. Since its upload, it has received 52 views. For similar materials see /class/216881/cs-5810-western-michigan-university in ComputerScienence at Western Michigan University.

## Reviews for Compiler Dsgn&implem

### What is Karma?

#### Karma is the currency of StudySoup.

#### You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 09/30/15

SyntaxDirected Translation Chapter 5 Chapter 5 Syntax directed Translation C55810 Spring 2009 Bad Translation C5581 is a very boring class Use babelfish translator English to Chinese then Chinese t 0 English C5581 is the extremely tasteless kind Chapter 5 Syntax directed l h Translation 28580 bprlng 2009 Motivation parser as a translator syntax directed translation stream of ASTs or tokens l assembly code syntax translation rules typically hardcoded in the parser Chapter 5 Syntax directed Translation 85810 Spring 2009 SyntaxDirected Definitions A syntax ir definition SDD is a contex free grammar together with attributes and rules Attributes are associated with grammar symbols Rules are associated with productions A synthesized attribute for A at a parse tree node N is defined only in terms of attributes at the children of N and at N itself An inherited attribute for B at a parse tree node N is defined only in terms of attributes at the N s parent N itself and siblings Chapter 5 Syntax directed C85810 Spring 2009 Iranslatlori SDD Example 1 Production Semantic Rules E 9 E T E1val E2va Tva E 9 T Eva Tva T 9 T F T1va T2val Fva T 9 F Tva Fva F 9 int Fva intexva F9l Ww An SDD that involves only synthesized attributes is called 8 attributed Chapter 5 Syntax directed 1 Translation 3353 0 Sprlng 2009 SDD Example 2 Production Semantic Rules T 9 F T T inh Fva Tva T syn T 9 FT 1 T 1inh T inh Fva T syn T 1syn T 9 T syn T inh F 9 int Fva intexva Annotated parse tree for 35 Chapter 5 Syntax directed 1 Translation 53 0 Sprlng 2009 TEST YOURSELF 1 A CFG for the language of binary numbers B O 9 1 9 B O 9 B 1 0 De ne a syntax directed translation so that the translation of a binary number is its base 10 value 0 Draw the parse tree for 1001 and annotate each nonterminal with its translation Chapter 5 Syntax directed C85810 Spring 2009 Iranslatlori Evaluation Orders for SDDs S Attributed Definitions p0st0rderN for each child C of N from the left postorder C evaluate the attributes associated with N Chapter 5 Syntax directed l Translation C5581 Sprmg 2009 LAttributed Definitions Each ribute must be ihr Synthesized or Inherited with limited rules For A9X1X2Xn and there is an inherited attribute Xia computed by a rule The rule must use only Inherited attributes of head A Either inherited or synthesized attributes of X1X2Xi1 Either inherited or synthesized attributes of Xi but only if there are no cycles A 9 BC with semantic rules AsBb BifCc As is not Lattributed Chapter 5 Syntax directed n h Transla o 23580 Spring 2009 AST vs Parse Tree AST is a better str r for later com il r stages omits details having to do with the source language only contains information about the essential structure of the program eg parentheses commas semicolons AST is condensed form of a parse tree operators appear at internal nodes not at leaves quotChainsquot of single productions are collapsed Lists are quotflattenedquot Syntactic details are ommitted Chapter 5 Syntax directed C85810 Spring 2009 10 Iranslatlori Example 2 4 5 parse tree vs AST E l T l A F E 4 5 int 2 IA E T 7 T F F int 5 i Chapter 5 Syntax directed 39 39I 1 39 Translation Int C8584 Sprmg 2009 39 Abstract Syntax Tree AST Derivation sequence of applied 5 productions 9E591591E 912 E a derivation Parse tree graph representation of U i E Doesn t capture the order of m l 5 applying the productions E jk 5 AST discards unnecessary 1 E S M 1 information from the parse tree 2 2 E 3 l F 5 3 E 4 255810 Spring 2009 Chapter 5 Syntax directed Translation 12 AST Construction We want to explicitly construct the AST during the parsing phase Chapter 5 Syntax directed l i Translation C5581 Sprlng 2009 3 Constructing Syntax Tree Production E9ET E T T9TF T F Faint F9 Chapter 5 Syntax directed Translation Semantic Rules E1node new Node E2nodeTnode Enode Tnode T1node new Node T2node Fnode Tnode Fnode Fnode new eafint intva Fnode Enode 55810 Spring 2009 14 TEST YOURSELF 2 Illustrate the syntax directed translation defined above by drawing the parse tree for 2 3 4 and annotating the parse tree with its translation e each nonterminal X in the parse tree will have a pointer to the root of the AST subtree that is the translation of X Chapter 5 Syntax directed C85810 Spring 2009 15 Iranslatlori Constructing Syntax Tree Production Semantic Rules T 9 F T Tnode T syn T inh Fnode T 9 FT 1 T 1inh new node T inh Fnode T syn T 1syn T 9 T syn T inh F 9 int Fnode new Ieafint intval Annotated parse tree for 35 Chapter 5 Syntax directed 1 39 1 Translation 38580 Sprlng 2009 6 Example 2 Compute the type of an EgtEE EgtEandE EgtE E gt true E gt false E gt int E gt E Chapter 5 Syntax directed Translation expression if E2trans INT and E3trans INT then E1trans INT else E1trans ERROR if E2trans BOOL and E3trans BOOL then E1trans BOOL else E1trans ERROR if E2trans E3trans and E2trans I ERROR then E1trans BOOL else E1trans ERROR Etrans BOOL Etrans BOOL Etrans INT E1trans E2trans C55810 Spring 2009 17 SyntaxDirected Translation Schemes SDT SDT is a CFG with program fragments embedded within production bodies Any SDT can be implemented by first building a parse tree but not efficient Typically SDT s are implemented during parsing LRgrammar and the SDD is Sattributed LLVrammar and the SDD is Lattributed Chapter 5 Syntax directed C85810 Spring 2009 Iranslatlori AST Construction LL LL parsing extend procedures S 9 ES for nonterminals S 9 e 5 E 9 num S Expr parseS switch token case num case Expr left parseE void parseS switch token case num case parseE I parseS EgtltI0r Flght pa rseS 0 return If right NULL return left default else return new Nodeeftrght ParseError defaUlt ParseError Chapter39S Syntax dlrected 35810 Spring 2009 19 Translatlon AST Construction LR We again need to add code for explicit AST construction AST construction mechanism Store parts of the tree on the stack For each nonterminal symbol X on stack also store the sub tree rooted at X on stack Whenever the parser performs a reduce operation for a production X 9 y create an AST node for X Chapter 5 Syntax directed n h Transla o 23580 Spring 2009 0 Implementing AST Production Actions E 9 E T stacktop2 new Node stacktop2nodestacktopnode top top 2 E 9 T T 9 T F stacktop2 new Node stacktop2nodestacktopnode top top 2 T 9 F F 9 int F 9 E stacktop2stacktopnode top top 2 Chapter 5 Syntax directed 1 Translation 325580 Spring 2009 21 AST Construction for LR Example S 59 SE l E input String quot123 SIE E 9 num S A T E 3 E 2 Num3 I l S 39 Node Node Numl Num2 Num3 x g Node 77 Numl Num2 Before reduction S 9 E S After reduction S 9 E S Chapter 5 Syntax directed 55810 5 ring 2009 Translation p D Lexical Analysis Part I Chapter 3 Jayne Dewar 032 mm 1 1122009 The Structure of a Compiler Today we start myxm cmpmrz cssxmsnriris m Interactions Between the Lexer amp Parser source prugrarn 22219 CMplerS 5 5x10 w lus Lexical Analysis What do we want to do Example if i 39 The input is just a sequence of characters tif i jlnttz 0ntelsenttz 1 22219 mmquot 5 5x10 we Am Goal of Lexical Analysis Goal Partition input string into substrings And classify them according to their role gt token Reduce length of program representation remove spaces name cum c5ltx1opnng7ue What s a Token Output of lexical analysis is a stream of useful tokens A token is a syntactic category In English noun verb adjective In a programming language Identifier Integer IFTHEN Whilespace Parser relies on the token distinctions Eg identi ers are treated differently than keywords my new cssnowrni m Tokens Token represents a category It consists of a token name optiona va ue Identifier strings of letters or digits starting with a letter Integer a nanempty string of digits Keyword quotelsequot or quotI ar or or IF string quoti Whitespace a nanempty sequence of blanks newlines and tubs 11110 1122009 Lexical Analysis Process read char by char l uab Lexical Analysis m scanner 1mm claw csx1nsunm m Lexical Analyzer Implementation An implementation must do two things 1 Recognize substrings correspondingto tokens 2 Return the value or lexeme ofthe token r The lexeme is he substrng Hanna may mm mm Example jnquott tz 0ntelsenttz 1 Tokenlexeme pairs returned bythe lexer Whitespace l tquot IF OpenPan quotquotl Identifier iquot Relation Identifier jquot mum mmquot mm mg m Lexical Analyzer Implementation The Iexer usually discards quotuninterestingquot tokens that don t contribute to parsing Examples Whitespace Comments mm mm qulb lmgzvc Lookahead Two important points 1 The goal is to partition the string This is implemented by reading lefttoright recognizing one taken at a time N quotLookaheadquot may be required to decide where one token ends and the next token begins Even oursimple example has lookahead issues ivsi V5 mm mnvc5ltx1omn m I 1122009 Next Regular Language We need There are several formalisms for specifying A language to describe the lexemes of each token t0llten5 A language to resolve amblgUltles Regular languages are the most popular a Is lllwo varlables l and f Simple and useful theory ls two equalslgns Easy to understand Efficient implementations 11110 masnlgm rs mjxu claw crsxmswllns m 1 Languages Examples of Languages Alphabet English Alphabet ASCII chamclers Language c programs Language English Def Let 2 be a set ofcharacters A language semems overZ is a set of strings ofcharacters drawn from 2 39 N eVe39V quotquot3 quot 3395quot NoteASCll character set is chamclers is an Englis different from mm 2 is called the alphabet Sememe charaaerset rm may rlt5xm angm r mam Cram rlt5xm ang rxs n Regular Expressions and Regular Notation Languages Languages are sets of strings Each regular expression is a notation for a Need some notation for specifying which sets regUlaquot language 3 SEt 0f words we want For lexical analysis we care about regular IfA is a regular expression then we write LA languages which can be described using to refer to the language denoted by A regular expressions a regular language is used to describe each token in a programming language MHZIN cheml c5 r 1le mimics 1122009 Atomic Regular Expressions Compound Regular Expressions Union A B is a reg expression where A B are reg expression Therefore the language is LA l B s l sELlA or sELB 1 Examples Single character 39c39 L c39 c for any c E 2 Concatenation AB Iwhere A and B are reg expl LlABl lab l a E LA and be LlBl 39if39 I 39then39 I 39else if quotthenquot quotelsequot Example i39 and f39 are reg expressions so is if39 0 I 1 I I 19ruoJ 1n quot9 Themfore the language 395 note the are just an abbreviation Ll39 fl l 39f l 2 Another example Iwe wiabbreviate 39i39 f as if I I10 I 11 to I 11 I 00 01 10 all More Compound Regular Expressions Example Keyword So far we do not have a notation for infinite Keyword else or if or quotbeginquot or languages Iteration A is a reg expression ifA is a reg expression Therefore the language is LA l l LA LAA LAAA Examples 390quot quot quot0quot quot00quot quot000quot 39139 390quot strings starting with 1 and followed by 0 s Epsilon 8 LIE I I else I if I begin I Recall 39else39 abbreviates 39e39 39I 39s 39e39 mm CMp erS 5 Wm M m 2 mm mm 5 5x10 mg m 1 Example Integers Example Identifier Integer a nonempty string of digits Identifier strings ofletters or digits starting with a letter digit 0 I 1 I 42 I 3 I 4 I 5 I 6 I 7 I 8 I 9 number digit digit letter 39A 39Z 39a39 39z identifier letter letter I digit Abbreviation N AA Is letterletter digit the same 112143 mm C 310er m m my new c5 310er m m u 1122009 Example Whitespace Whitespace a nonempty sequence of blanks newlines and tabs l quot l 39n39 Example Phone Numbers Regular expressions are all around youl Consider 510 6431481 E lo 1 7 3 9 r l l area digit3 exchange dlglt3 phone dlgll number area quotl exchange phone Wm 2 name lkums Consider zi39lanchswmicnedu address name name name Example Email Addresses letters U l l etter Applications of regular expression ln Windows in w ndows you can use REto search for les or texts in a file ln unix there are many RE relevant tools such as Grep Stands for Global Regular Expressions and Print or Global Regular f nd patterns of chamcters in a text file Practice regular expression using grep 7 Pre ell g Usegre 2a we grep larzll r llarzl l r llarzll r l les1 p to Search for certaln pattern in html files ch luv Canadlan zlp nude lrl alex lllE sh Type lcsh pale lexl le say les1 lhal cunslsls all sample puslal nude lelH larzll allarzl Dallarzll r l lea Practice the following grep commands Summary Regular expressions describe many useful languages Next Given a strings and a rexp R is s eLK But a yesno answer is not enough Instead partition the input into lexemes We will adapt regular expressions to this goal 11110 CH2 mesz n 1122009 Procedure of Lexical Analysis Specifying lexical structure using regular expressions 0 Finite autom Deterministic Finite Automata DFAs Nondeterministic Finite Automata NFAs Implementation of regular expressions RegExp gt NFA gt DFA gt Tables 1mm clapw csx1nsunm5 m Regular Expressions gt Lexical Spec 1 1 Select a set of tokens 0 Number Keyword Identifier 2 Write a RE for the lexemes of each token Number diglt39 Keyword if39 I else I Identifier letter letter I digit OpenPar mym emu mm wen Regular Expressions gt Lexical Spec 2 3 Construct R matching a lexemes for all tokens RRi IR2 IR3 I Facts If s in LR then sis a lexeme Furthermore s in LR for some i This i determines the token that is reported mama tramquot mm mg m Regular Expressions gt Lexical Spec 3 4 Let the input be gtlt1gtltn x1 xH are characters in the language alphabet 0 Forlgigncheck xx e LIRI S It must be that xx e LlRlforsome iandj 6 Remove x1x from input and go to 4 11214 chmv c5 Lexing Example R Whitespace Integer Identifier 39 Scan f 3 g f39l matches R more precisely Identifier quotquot matches R more precisely 3939 The tokenlexeme pairs are lldentilier m w quotI llnleger 3H lWhilemace l l quotl lldenlilier g We would like to drop the Whitespace tokens after matching Whitespace continue matching my mime c Ambiguities 1 There are ambiguities in the algorithm Example R Whitespace Integer Identifier 39 Parse foo3quot f39l matches R more precisely Identifier But also fo l matches R and fooquot but not foo How much input is used What if xx a LlRIand alsoxxx e LIRI quotMaximal munch l rule Pickthe longest possible substringthat matc es R 11110 1122009 More Ambiguities R Whitespace 39new Integer Identifier Parse quotnew fooquot new matches R more precisely new39 but also Identifier which one do we pick In general if x1x e LR and x1x E LRk Rule use rule listed first j ifj lt k We must list 39new before Identifier 1mm clnpmr crsxmsmds m Error Handling R Whitespace Integer Identifier 39 Parse 56 No prefix matches R not nor quot5quot nor quot56quot Problem Can tjust get stuck Solution Add a rule matching all bad l strings and put it last Lexertools allow the writing of RR1 R Error Token Error matches if nothing else matches mm my mm min Summary Regular expressions provide a concise notation for string patterns Use in lexical analysis requires small extensions To resolve ambiguities To handle errors Good algorithms known next Require only single pass overthe input Few operations per character table lookup mm mmquot mm mg m Lexical Analysis partll Chapter 3 1192009 Chapter 3 CS 5810 Spring 2009 Finite Automata Regular expressions specification Finite automata implementation 0 A finite automaton consists of An input alphabet Z A set of states S A start state n A set of accepting states F g S A set of transitions state gt F t state 1192009 Chapter 3 CS 5810 Spring 2009 Finite Automata Transition 51 gt3 52 0 Is read H I In state 51 on input a go to state 52 0 If end of input or no transition possible If in accepting state gt accept Otherwise gt reject 1192009 Chapter 3 CS 5810 Spring 2009 Finite Automata State Graphs A state 0 o The start state 0 39 An accepTIng STaTe a A transition GAO 1192009 Chapter 3 CS 5810 Spring 2009 A Simple Example A finite automaton that accepts only quot1 A finite automaton accepts a string if we can follow transitions labeled with the characters in the string from the start to some accepting state 1 m 1192009 Chapter 3 CS 5810 Spring 2009 Another Simple Example A finite automaton acce tin any number of 1 s followed by a single 0 Alphabet 01 1192009 Chapter 3 CS 5810 Spring 2009 And Another Example Alphabet O1 What language does this recognize 1192009 Chapter 3 CS 5810 Spring 2009 And Another Example o Alhstill01 1 The operation of the automaton is not completely defined by the input On input quot11 the automaton could be in either state 1192009 Chapter 3 CS 5810 Spring 2009 Epsilon Moves Another kind of transition 8 moves 8 Machine can move from state A to state B 92009 without reading input Chapter 3 CS 5810 Spring 2009 Deterministic and Nondeterministic Automata Deterministic Finite Automata DFA One transition per input per state No 8 moves Nondeterministic Finite Automata NFA Can have multiple transitions for one input in a given state Can have 8 moves 1192009 Chapter 3 CS 5810 Spring 2009 32009 Execution of Finite Automata A DFA can take only one path through the state graph Completely determined by input NFAs can choose Whether to make 8 moves Which of multiple transitions for a single input to take Chapter 3 CS 5810 Spring 2009 11 Acceptance of N FAs 0 An NFA can get into multiple states 1 O 0 input 1 0 1 Rule NFA accepts if it w get in a final state 1 192009 Chapter 3 CS 5810 Spring 2009 12 NFA vs DFA 1 NFAs and DFAs recognize the same set of languages regular languages 0 DFAs are easier to implement There are no choices to consider 1192009 Chapter 3 cs 5810 Spring 2009 13 NFA vs DFA 2 Forauivn Ln U h NFAcan be imr hn theDFA NFA DFA DFA can be exponen rially larger Than NFA 1192009 Chapter 3 CS 5810 Spring 2009 Regular Expressions to Finite Automata High level sketch NFA Regular expressions DFA Lexi cal Tabledriven Specn lccmon ImplemenTaTion of DFA 1192009 Chapter 3 C5 5810 Spring 2009 15 Regular Expressions to NFA 1 For each kind of rexp define an NFA Notation NFA for rexp A Decomposite rexp into sub rexp O3 For39 inpu r o 04o For39e 1192009 Chapter 3 CS 5810 Spring 2009 Regular Expressions to NFA 2 ForAB 8 For39AlB 1192009 Chapter 3 CS 5810 Spring 2009 Regular Expressions to NFA 3 For A 0 For A same as A 1192009 Chapter 3 CS 5810 Spring 2009 Example of RegExp gt NFA conversion Consider the regular expression 1 01 The NFA is 8 39192009 Chapter 3 CS 5810 Spring 2009 8 m 9 N ext NFA Regular expressions DFA Lexj cal Tabledriven Specn lccmon ImplemenTaTion of DFA lflgZOOQ Chapter 3 CS 5810 Spring 2009 20 Basic ideas of remove non determinism Two cases of non a determinism O Epsilon transition Remove the edge by merging the two states Exiting from one state there a are multiple edges with same labels 8 Merge the states that can be reached from the same a symbol lt D 1192009 Chapter 3 CS Sm Spring 2009 NFA to DFA The Trick Simulate the NFA Each state of DFA a non empty subset of states of the NFA Start state the set of NFA states reachable through 8 moves from NFA start state Add a transition S gt3 S to DFA iff S is the set of NFA states reachable from the states in S after seeing the input a 1192009 considering 8WES938Wl39lg2009 22 Formalize the ideas Two key functions a closure T is set of states reachable by afrom sin T Move7jg is set of states reachable by a from sin T The algorithm Start state derived from S0 or the NFA Take its sclosure vvorK outward trying each or e Z and taking its sclosure Each state in DFA corresponds to a subset of states of the NFA That is why it is called subset construction Iterative algorithm that halts when the states wrap back on themselves 1192009 Chapter 3 CS 523310 Spring 2009 eclosu re Definition eclosureT T all NFA states reachable from any state in T using only 8 transitions Example 8 closure125 125 8 closure4 14 8 closure3 mg e closure35 1345 lil fZGO ii Chapter 3 CS m0 Spring 200i NFA to DFA Algorithm 0 A transition table Dtran for D is constructed as follows initially 8cosureso is the only state in Dstates and it s unmarked while there is an unmarked state in T in Dstates do begin mark T for each input symbol a do begin U 8cosuremove39l39a if U is not in Dstates then add U as an unmarked state to Dstates DtranTa U end i Chapter 3 CS 5810 Spring 2009 NFA to DFA Algorithml Computation of 8 ClosureT push all states in T onto stack initialize 8ClosureT to T while stack is not empty do begin pop t the top element off of stack for each state u with an edge from t to u labelled 8 do begin if u is not in 8ClosureT then add u to 8cosureT push u onto stack end end 1192009 Chapter 3 CS 5810 Spring 2009 NFA gt DFA Example F GAB CDHI EJGABCDHI 0 39192009 Chapter 3 CS 5810 Spring 2009 Implementation 0 A DFA can be implemented by a 2D table T One dimension is quotstatesquot Other dimension is quotinput symbols For every transition Si gta Sk define Tia k 0 DFA quotexecutionquot If in state Si and input a read Tia k and skip to state Sk Very efficient 1192009 Chapter 3 CS 5810 Spring 2009 28 92009 Implementation Cont NFA gt DFA conversion is at the heart of tools such as lex flex orjlex But DFAs can be huge In practice lex Iike tools trade off speed for space in the choice of NFA and DFA representations Chapter 3 CS 5810 Spring 2009 Table Implementation of a DFA 39192009 Chapter 3 CS 5810 Spring 2009 30 hUUNH 1192009 Converting DFAs to REs Combine serial links by concatenation Combine parallel links by alternation Remove self loops by Kleene closure Select a node other than initial or final for removal Replace it with a set of equivalent links whose path expressions correspond to the in and out links Repeat steps 1 4 until the graph conSIsts of a single link between the entry and exit nodes Chapter 3 CS EMU Spring 2009 Example dlt Ibl d d 0 a C 9 9 bbcd 1 192009 Chapter 3 CS EEO Spring 2009 Example cont dbld d 0 H a 9 9 bbcda dabcd a bbcdad dabcdabbcdad 5 1192009 Chapter 3 CS 3310 Spring 2009 Scanner generator history LEX Original for UNIX it now exists for many operating systems LEX produces a scanner which is a C program LEX accepts regular expressions and allows actions ie code to executed to be associated with each regular expression JLex Lex that generates a scanner written in Java Itself is also implemented in Java 1192009 Chapter 3 CS 530 Spring 2009 include ltstdiohgt int numines O numchars O n numinesnumchars numchars main vvlex printf quot of lines d of chars d nquotnumines numchars 192009 Chapter 3 CS 5810 Spring 2009 35 include ltstdiohgt 09 yytext is a string containing the matched text printfquotw an integer squot yytext Ignore all other characters int mainvoid vvlex return 0 big2009 Chapter 3 CS 5810 Spring 2009 36 3312008 Code Generation Chapter a II A Simple Code Generator Generate code for a single basic block How to use registers in most machine architectures some or all orthe operands must be in registers Registers make good temporaries Hold values that are computed in one basic block and used in other blocks orten used with runetime storage management Register and Address Descriptors For each available register a register descriptor RDl keepstrack oftne vars wnose current value intnat register lnitially empty For each var an address descriptor ADI keeps track oftne locations where the current value oftne varcan be found tocation can be a register a memory address etc Codegeneration Algorithm For a threeeaddress instruction eg xyz e Use petReplxeyrz to select registers lg lg R1 farxy z e lry is not in lg issue an instruction LD y39 e lssuethe instruction ADD Rx lg R1 Copy statementxy e lry is not already in register generate LD lg y39 e Adyust RD for lg so it includes x e changeAD rorx so its only location is lg Ending the basic block e in is used atother blacks issue ST x Rx Managing Register and Address Descriptors For LD R x e change RD furR so it holds only K e change AD for K by adding R as an additional location For ST x R 7 change AD for Km include its own memory location For ADD R R e change RD forRx so it holds only e change AD for K so its only location is RX 7 Remove R from the AD oi any var other than R Example t uv are ternp vars while a b c d are global Assurne registers are enoug Reuse registers whenever possible 3312008 Example R1 R1 R3 a b d l I R1 R1 R3 a 7 c d l u v unn ADD R3 R2 R1 R1 R1 R3 3 b c d l u v Inquot R1 R1 R3 ADD R1 R3 R1 R1RZR3abcd n1 R1 R1 R3 a 7 c d l u v n m Function getReg Consider picking RV forv in x v z lfy in a register do nothing lfy not in a register and there is a empty one choose it as R Letv be one ofthe varin R We are OK in is somewhere besides R We areDK inisx We are OK in is not used later Spill stv R Peephole Optimization Exam a sliding window and replace instruction sequence with a shorter or aster sequence Redundamrlnstmction elimination wrofrcontrol optimization I o Algebraic simpli cations Use a machine Idloms Eliminating Redundant Loads and Stores LD a R0 ST R0 a 3312008 Eliminating Unreachable Code Elow of Control Optimizations ifdebug1goto L1 goto L2 L1 print debugging info L2 ifdebugl1goto L2 pa lt b goto L2 L1 print debugging info l2 L1 goto L2 th annldmefmk Murillo cuvwrxtmmnm Lissrnmngtoos rumscmwmvv Register Allocation and Assignment Example 39 Usage Counts An approximate formula for the bene tirom allocating a register iorx Z melxB2livelxd hloolnBivL uselxB is the number of timesxis used in Bpriorto any de nition of x ivelxB is 1 iix is live on exit from B and is assigneda value in B 0 otherwise clovng tapeworm csssinsungrnm LS tram mowmvu csssinsvmgrnm A Instruction Selection by Tree Rewriting TreeTranslation Schemes 39 Instruction selection can bea large combinational task Even the evaluation order is given and register allocation has been done ST R0 R1 Cluvl r ute wntrm csssrnsnngmm L7 turrer mammal Lissinrvmgrnm s 3312008 Optimal Code Generation for Expressions We can choose registers optimally if a basic block consists ora single expression or it is surricien tto generate code for a blcok one expression at a time Ershov Numbers Assign the nodes of an expression tree a Label leaf 1 The label ofa label of its ch The label oran interior node the larger one llthe labels are l One plus the label llthe labels are the same number that tells how many registers needed n interior node with one child is the ild with two children d rent Ershov Numbers Generating Code From Labeled Expression Tree 39 Recursive algorithm starting at the root 7 Label k rneans k registerswill be used 7 RD Rm Rh where bgt1 is a base To generat mach39 and two children with equal lab 7 Recursiver generate code forthe right child using base b1 Rm Rm Result in Rh 7 Recursiver generate code forthe right child using base b Rh Rm Result in RMZ 7 Generate instmction OP Rh Rh RINH lne code for a node with label k els Generating Code From Labeled Expression Tree To generate machine code for a node with la elk and two children with unequal labels Recursiver generate code for the child with label k using base b Rb RM Result in Rb e child with label R RM Result in RM Generate instruction OP RM Rbm1Rbk1 Recursiver generate code for th m using base b For a leaf x if base is b generate LD Rb x Ershov Numbers LD R2 e lVlUl R3 R2 R3 3312008 Insufficient Supply of Registers Input a labeled tree and a number r of registers For a node N with at least one child labeled r or greater 7 recursiveiy generate code for the big child with b1 The resuitwiii appear in R 7 Generate rnachineinstructibn ST t R a if the little child has label r brgreater b1 if the label is iltr then b The resuitin R 7 Generate the instruction LD RH tk 7 Generate OP R R RH or OP R R ml Rr Insufficient Supply of Registers T R mm mum Gamma Code Generation Chapter 8 Chapter 8 Code Generation C55810 Spring 2009 Code Generation Front Intermediat Intermediat Code Target End Code Code Generator program Severe requirements Target program preserve the semantics Effective use of available resources Itself must be efficient Primary tasks Instruction selection Register allocation and assignment Instruction ordering Chapter 8 Code Generation CSSSlO Spring 2009 Input to the Code Generator IR produced by the front end Three address code quadruples triples Virtual machine bytecodes stack machine code Linear representations postfix notation brapnlcal representations DAG syntax tree Symbol table Determine the run time addresses of the names in IR Chapter 8 Code Generation 235810 Spring 2009 3 The Target Program RISC reduced instr in set mputer Many registers threeaddress instructions simple addressing modes simple instruction set arch CISC complex instruction set computer few registers twoaddress instructions various addressing modes variablelength instruction set Stackbased Machine Pushing operands onto a stack Almost disappeared then revived with Java Virtual Machine JVM Chapter 8 Code Generation 235810 Spring 2009 The Target Program Absolute machine language programs Can be placed in a fixed location in memory and immediately executed Relocatable machine language programs Subprograms can be complied separately then linked together and loaded for execution by linking loader Assembly language program Code generation easier need assembly step Chapter 8 Code Generation 235810 Spring 2009 Instruction Selection Map the IR into a code sequence that can be executed by the target machine ADD R0 R0 C ST a R0 LD R0 a ADD R0 R0 e ST d R0 ADD R0 R0 2 ST x R0 Chapter 8 Code Generation 355810 Spring 2009 Register Allocation Register allocation select the set of vars that will reside in registers Register assignment pick the specific register that a var will reside in Chapter 8 Code Generation 355810 Spring 2009 Evaluation Order 0 The order can affect the efficiency Some orders require fewer registers Picking best order is NP complete Chapter 8 Code Generation C85810 Spring 2009 The Target Language 0 Assume following kinds of instructions Load LD dst addr LD r x Store ST x r Computation OP dst srcl src2 unary operators do not have src2 Unconditional jumps BR L Conditional jumps Bcond r L BLTZ r L Chapter 8 Code Generation 85810 Spring 2009 The Target Language Addressing mode Variable name x Indexed address ar LD R1 aR2 is R1 contentsacontentsR2 An integer indexed by a register LD R1 100R2 is R1 contents100contentsR2 Indirect addressing modes LD R1 100R2 is R1contentscontents100contentsR2 Immediate constant address mode LD R1 100 Chapter 8 Code Generation C85810 Spring 2009 10 The Target Language Example L 9 irF lly UWUHL Fl gtlt lt N 44 r a H41 3 g f z j SiUJIBB a1 a1 LL15 J a a 39 ST xp 7 ifxltygotoL T 255810 Spi Program and Instruction Costs 0 Determining the actual cost of compiling and running a program is complex Cost of an instruction is one plus the costs associated with the addressing modes LD R0 R1 cost 1 LD R0 M cost 2 LD R1 100R2 cost 3 Chapter 8 Code Generation 235810 Spring 2009 12 TEST YOURSELF 1 Generate code for the following three address sequences Chapter 8 Code Generation 55810 Spring 2009 13 Basic Blocks and Flow Graphs Priin IRinto basic k which r mximl sequences of consecutive threeaddress instructions that The flow of control can only enter the basic block through the first instruction Control leave the block without haltingbranching except at the last instruction The basic blocks become the nodes of a flow graph whose edges indicate which blocks can follow which other blocks Chapter 8 Code Generation 235810 Spring 2009 14 Create Basic Blocks 0 An instruction is a leader if The first instruction in the IR The target of a jump Immediately follows a jump For each leader its basic block consists of itself and all the instructions upto but not including the next leader or the end of the IR Chapter 8 Code Generation 235810 Spring 2009 15 TEST YOURSELF 2 Construct basic blocks of the following IR for i from 1 to 10 do for from 1 to 10 do 1152 Han 1 for i from 1 to 10 do 533 a10 1413 15 Chapter 8 Code Generation 355810 Spring J 3 NextUse Information 0 Knowing when the value of a var will be used next is essential for good code generation j uses the value ofx computed at i if statement i assigns to x Ifj has x as an operand and control can reachj from i without an intervening assignment to x Chapter 8 Code Generation 235810 Spring 2009 17 NextUse Algorithm Input a basic block B Output attach to i xyz the next use info Method start at the last statement B Attach to i the information currently found in symbol table set x to quotnot live and quotno next use Set yz to live and the next uses of yz to Note the steps of 2 and 3 may not be interchanged Chapter 8 Code Generation 235810 Spring 2009 18 Optimization of Basic Blocks Substanilimrvmn can be gain by performing local optimization DAG Representation of basic blocks A node for each initial values of the vars Node for each statement Children are the statements of the last definitions Labeled by operators and vars for which it is the last definition in the block Some nodes are quotoutput nodes whose vars are live on exit Part of global flow analysis discussed later Chapter 8 Code Generation 235810 Spring 2009 19 Chapter 8 Code Generation DAGs for Basic Blocks 55810 Spring 2009 20 Finding Local Common Subexpressions Chapter 8 Code Generation C55810 Spring 2009 21 Dead Code Elimination Delete any root that has no live variables and repeat the procedure Chapter 8 Code Generation 55810 Spring 2009 22 Use of Algebraic Laws Identities XOOxx x11xx xOx x1x Reduction in strenth x2xx 2x xx x2XO5 Commutativity Xvvx Associativity a bc ecdb a bc tcd e tb abcead Chapter 8 Code Generation C85810 Spring 2009 23 A Simple Code Generator Generate code for a single basic block How to use registers In most machine architectures some or all of the operands must be in registers Registers make good temporaries Hold values that are computed in one basic block and used in other blocks Often used with run time storage management Chapter 8 Code Generation 235810 Spring 2008 24 Register and Address Descriptors For each available register a register descriptor RD keeps track of the vars whose current value in that register Initially empty For each var an address descriptor AD keeps track of the locations where the current value ofthe var can be found Location can be a register a memory address etc Chapter 8 Code Generation 235810 Spring 2008 25 Codegeneration Algorithm For a threeaddress instruction eg xyz Use getRegxyz to select registers RX Ry RZ for x y z Ify is not in Ry issue an instruction LD Ry y Similarly forx Issue the instruction ADD RX Ry RZ Copy statement Ify is not already in register generate LD Ry y Adjust RD for RV so it includes x Change AD for x so its only location is Ry Ending the basic block Ifx is used at other blocks issue ST x RX Chapter 8 Code Generation 235810 Spring 2008 Managing Register and Address Descriptors For LD R x Change RD for R so it holds only x Change AD for x by adding R as an additional location For ST x R Change AD for x to include its own memory location For ADD RX Ry RZ Change RD for RX so it holds only x Change AD for x so its only location is RX Remove RX from the AD of any var other than x Chapter 8 Code Generation 235810 Spring 2008 27 Example t u v are temp vars while a b c d are global Assume registers are enough Reuse registers whenever possible Chapter 8 Code Generation 355810 Spring 2008 Example R1R2R3 abc dt u V utcabcR3dR2R1 Chapter 8 Code Generation 355810 Spring 2008 29 R1R2 R3 a b c d t u V utvabcdR2R1R3 E R1R2 R3 a b c d t u V uadvR2bcdR2 R1R3 Chapter 8 Code Generation CSSBlO Spring 2008 30 R1R2 R3 a b c d t u V davR2bcR1R3 R1R2 R3 a b c d t u V davaR2bcdm R3 Chapter 8 Code Generation CSSBlO Spring 2008 31 Function getReg Consider picking Ry for y in x y z Ify in a register do nothing If y not in a register and there is a empty one choose it as Ry Let v be one of the var in R We are OK ifv is somewhere besides R We r OKifvisx We are OK ifv is not used later Spill ST v R Chapter 8 Code Generation 235810 Spring 2008 32 Peephole Optimization 0 Exam a sliding window and replace instruction sequence with a shorter or faster sequence Redundant instruction elimination Flow of control optimization Algebraic simplifications Use a machine idioms Chapter 8 Code Generation 235810 Spring 2008 33 Eliminating Redundant Loads and Stores LD a R0 ST R0 a Chapter 8 Code Generation 55810 Spring 2008 34 Eliminating Unreachable Code if debug1 goto L1 goto L2 L1 print debugging info L2 if debug1 goto L2 L1 print debugging info L2 Chapter 8 Code Generation 55810 Spring 2008 35 FIowofControl Optimizations oto L1 goto L2 L1 goto L2 L1 goto L2 quotaltbgoto L1 i1calt0 30t0 L2 in oto L2 39 139 goto L2 Chapter 8 Code Generation CSSBlO Sp gi ggggmermemate COde 36 Optimal Code Generation for Expressions We can choose registers optimally If a basic block consists of a single expression or It is sufficient to generate code for a block one expression at a time Chapter 8 Code Generation C85810 Spring 2008 37 Ershov Numbers 0 Assign the nodes of an expression tree a number that tells how many registers needed Label leaf 1 The label of an interior node with one child is the label of its child The label of an interior node with two children the larger one if the labels are different One plus the label if the labels are the same Chapter 8 Code Generation 235810 Spring 2008 38 Ershov Numbers t1a b t2cd a becd t3et2 t4t1t3 Chapter 8 Code Generation C85810 Spring 2008 39 Generating Code From Labeled Expression Tree Recursive IU rihm staring at h root Label k means k registers will be used Rb Rb1 Rbk1 where bgt1 is a base To generate machine code for a node with label k and two children with equal labels Recursively generate code for the right child using base bl Rm Rm Result in Rbk1 Recursively generate code for the right child using base b Rb Rm Result in Rbk2 Generate Instruction OP Rbk1 Rbk2 Rbk1 Chapter 8 Code Generation 235810 Spring 2008 40 Generating Code From Labeled Expression Tree To generate machine code for a node with label k and two children with unequal labels Recursively generate code for the child with label k using base b Rb Rm Result in Rbkl Recursively generate code for the child with label m using base b Rb Rb1 Result in Rbml Generate Instruction OP Rbkl Rbml Rbk1 For a leaf x if base is b generate LD Rb x Chapter 8 Code Generation 235810 Spring 2008 41 Ershov Numbers LD R3 d LD R2 C LD R2 b ADD R3 R2 R3 LD R1 a LD R2 e SUB R2 R1 R2 MUL R3 R2 R3 ADD R3 R2 R3 Chapter 8 Code Generation C85810 Spring 2008 42 Insufficient Supply of Registers nputalabeled r n anmrrofregirs For a node N with at least one child labeled r or greater recursively generate code for the big child with b1 The result will appear in Rr Generate machine instruction ST tk Rr If the little child has label r or greater b1 If the label is Jlt then j The result in Rr Generate the instruction LD RH tk Generate OP R R Rr1 or OP Rr RH Rr Chapter 8 Code Generation 235810 Spring 2008 43 Insufficient Supply of Registers LD R2 d LD R1 C LD R2 b ADD R2 R1 R2 LD R1 a LD R1 9 SUB R2 R1 R2 MUL R2 R1 R2 LD R1 t3 ST t3 R2 ADD R2 R2 R1 Chapter 8 Code Generation C85810 Spring 2008 44

### BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.

### You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

## Why people love StudySoup

#### "There's no way I would have passed my Organic Chemistry class this semester without the notes and study guides I got from StudySoup."

#### "I made $350 in just two days after posting my first study guide."

#### "I was shooting for a perfect 4.0 GPA this semester. Having StudySoup as a study aid was critical to helping me achieve my goal...and I nailed it!"

#### "Their 'Elite Notetakers' are making over $1,200/month in sales by creating high quality content that helps their classmates in a time of need."

### Refund Policy

#### STUDYSOUP CANCELLATION POLICY

All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email support@studysoup.com

#### STUDYSOUP REFUND POLICY

StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here: support@studysoup.com

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to support@studysoup.com

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.