Computer Architecture Sections 2.8 - 2.14

by: Aaron Maynard


This 17 page Class Notes was uploaded by Aaron Maynard on Saturday September 17, 2016. The Class Notes belongs to CS 3340 at University of Texas at Dallas taught by in Fall 2016. Since its upload, it has received 23 views. For similar materials see Computer Architecture in Computer Science and Engineering at University of Texas at Dallas.



Date Created: 09/17/16
COMPUTER ARCHITECTURE
FALL SEMESTER 2016
INSTRUCTOR: DR. KAREN MAZIDI
16 September 2016

These notes cover material from CS 3340.003 (and other sections). This packet covers topics from Chapter 2 of Computer Organization and Design, Fifth Edition: The Hardware/Software Interface by Patterson and Hennessy. The material on these pages includes, but is not limited to, the presentation slides provided by the professor. Each set may end with concluding remarks.

Calling Procedures

Calling a procedure takes six steps:

1. Place parameters in registers
2. Transfer control to the procedure
3. Acquire storage for the procedure
4. Perform the procedure's operations
5. Place the result in a register for the caller
6. Return to the place of call

MIPS Register Convention

It is very important to know the MIPS register convention. The MIPS microprocessor has 32 general-purpose registers.

● Registers are 32 bits wide in the MIPS I and MIPS II instruction set architectures (ISAs).
● MIPS III and higher ISAs have 32-bit registers when running in 32-bit mode, and 64-bit registers when running in 64-bit mode.
● Registers $1 ($at), $26 and $27 ($k0, $k1), and $29 ($sp) are reserved for special purposes by the assembler, compiler and operating system.
● Register $0 is hardwired to the value zero, and $31 is the link register for jump and link instructions; it can be used with other instructions, but only with caution.

The following table summarizes the usage convention for these registers.

    Name       Number   Usage
    $zero      0        constant value 0
    $at        1        assembler temporary
    $v0-$v1    2-3      values for results and expression evaluation
    $a0-$a3    4-7      arguments
    $t0-$t7    8-15     temporaries
    $s0-$s7    16-23    saved registers
    $t8-$t9    24-25    more temporaries
    $k0-$k1    26-27    reserved for the operating system kernel
    $gp        28       global pointer
    $sp        29       stack pointer
    $fp        30       frame pointer
    $ra        31       return address

To call a procedure, use the jump-and-link instruction, "jal ProcedureLabel". It puts the address of the following instruction in $ra and then jumps to the target address. To return from a procedure, use the jump-register instruction, "jr $ra". It copies $ra to the program counter; jr can also be used for computed jumps.

Leaf and Non-Leaf

A leaf procedure is a procedure (subroutine) that does not call any other procedure.
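
The effect of jal and jr on the program counter and $ra can be sketched in C++. This is only a minimal model under stated assumptions: the names ControlState, jal, and jr_ra are hypothetical, and branch delay slots are ignored, so the return address is simply PC + 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-register model of the state that jal and jr touch.
struct ControlState {
    std::uint32_t pc;  // address of the instruction being executed
    std::uint32_t ra;  // $ra, the return-address (link) register
};

// jal target: save the address of the following instruction, then jump.
void jal(ControlState& s, std::uint32_t target) {
    assert(target % 4 == 0);  // instruction addresses are word-aligned
    s.ra = s.pc + 4;
    s.pc = target;
}

// jr $ra: copy $ra into the program counter.
void jr_ra(ControlState& s) {
    s.pc = s.ra;
}
```

Calling jal from address 0x00400010 leaves 0x00400014 in ra, and jr_ra then resumes execution there. A second jal before the return would overwrite ra, which is why a procedure that calls another procedure has to save $ra first.
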
A non-leaf procedure is one that does make calls to other procedures. A procedure that calls another procedure overwrites the value in $ra and so loses its own return address; an easy solution is to save $ra on the stack before the call and restore it afterward.

The Stack

Many of you might be asking the question: "What is the stack? We keep talking about using it but I don't know what it is!" In computer science, a stack is an abstract data type that serves as a collection of elements, with two principal operations: push, which adds an element to the collection, and pop, which removes the most recently added element that has not yet been removed.

The stack pointer register ($sp) points to the current top of the stack. When the stack is initialized, $sp points to the bottom of the space available for the stack, which is its highest address. The MIPS stack grows toward lower addresses, so a push decrements $sp and a pop increments it.

Example: Push register 3 on the stack

    addi $sp, $sp, -4    # decr sp by 4
    sw   $3, 0($sp)      # save r3 to the stack

Example: Get register 3 off the stack

    lw   $3, 0($sp)      # copy from stack to r3
    addi $sp, $sp, 4     # incr sp by 4

Character Data

● Byte-encoded character sets
  ○ ASCII: 128 characters
    ■ 95 graphic, 33 control
  ○ Latin-1: 256 characters
    ■ ASCII, +96 more graphic characters
● Unicode: 32-bit character set
  ○ Used in Java, C++ wide characters, …
  ○ Most of the world's alphabets, plus symbols
  ○ UTF-8, UTF-16: variable-length encodings

Byte/Halfword Operations

A MIPS halfword is two bytes. This is also a frequently used length of data: in ANSI C, a short integer is usually two bytes. So MIPS has instructions to load and store halfwords.

There are two load halfword instructions. One extends the sign bit of the halfword in memory into the upper two bytes of the register. The other extends with zeros.

    lh  t,off(b)    # $t <- sign-extended halfword
                    # starting at memory address b+off.
                    # b is a base register.
                    # off is 16-bit two's complement.

    lhu t,off(b)    # $t <- zero-extended halfword
                    # starting at memory address b+off.
                    # b is a base register.
                    # off is 16-bit two's complement.

Halfword addresses must be halfword aligned (a multiple of 2). Attempting to load a halfword from an unaligned address will cause a trap.

String Examples

By convention, the last character in a character string is always null (ASCII value 0). So, for example, if the string is "CAT" and it is stored beginning at address 2000, then memory would look like this:

    Address   Contents
    2003      0
    2002      84
    2001      65
    2000      67

because 67 is the ASCII value for 'C', 65 is the ASCII value for 'A', etc.

The strcpy procedure below copies characters from array y to array x up to and including a null (0) byte.

C++ version

    // copies contents of character array y into character array x
    // we're assuming that the two arrays are the same size
    void strcpy(char x[], char y[]){
        int i;
        i = 0;
        while (y[i] != 0){   // this tests to see whether y[i] marks the
                             // end of the character array (null byte)
            x[i] = y[i];     // copy y[i] to x[i]
            i = i + 1;       // increment i so we move to next character
        }
        x[i] = 0;            // y[i] = null byte so set x[i] to null byte
    }

MIPS version 1 - using arrays

    # assumptions:
    # 1) base address of x is in $a0
    # 2) base address of y is in $a1
    # 3) $t0 will hold value of i
    # remember that we do not have to multiply i by 4 in this case because
    # characters only take 1 byte to store, not 1 word
    strcpy: add  $t0, $zero, $zero   # i = 0
    L1:     add  $t1, $a1, $t0       # $t1 holds address of y[i]
            lb   $t2, 0($t1)         # $t2 holds value of y[i]
            beq  $t2, $zero, out     # if y[i]==0 (null character), leave loop
            add  $t3, $a0, $t0       # $t3 holds address of x[i]
            sb   $t2, 0($t3)         # x[i] = y[i]
            addi $t0, $t0, 1         # i = i + 1
            j    L1                  # go to top of loop
    out:    add  $t3, $a0, $t0       # $t3 holds address of x[i]
            sb   $zero, 0($t3)       # x[i] = 0 (to terminate string)
            jr   $ra                 # return from procedure

MIPS version 2 - using pointers

    # assumptions:
    # 1) base address of x is in $a0
    # 2) base address of y is in $a1
    # 3) $t3 will hold pointer to x[i] (address of x[i])
    # 4) $t1 will hold pointer to y[i] (address of y[i])
    strcpy: add  $t1, $zero, $a1     # $t1 holds address of y[0]
            add  $t3, $zero, $a0     # $t3 holds address of x[0]
    L1:     lb   $t2, 0($t1)         # $t2 holds value of y[i]
            beq  $t2, $zero, out     # if y[i]==0, leave loop
            sb   $t2, 0($t3)         # x[i] = y[i]
            addi $t1, $t1, 1         # advance $t1 to the next character of y
            addi $t3, $t3, 1         # advance $t3 to the next character of x
            j    L1                  # go to top of loop
    out:    sb   $zero, 0($t3)       # x[i] = 0 (to terminate string)
            jr   $ra                 # return from procedure

MIPS Addressing

Every MIPS instruction is exactly 32 bits long. This keeps the hardware simple, but it also limits constants to 16 bits in load/store, immediate, and branch instructions; jump instructions carry a 26-bit address field. Most constants used in MIPS programs are small enough to fit in 16 bits, but for the occasional 32-bit constant, MIPS has a "load upper immediate" (lui) instruction.

lui rt, constant: copies the 16-bit constant into the left (upper) 16 bits of rt and clears the right (lower) 16 bits of rt to zero.

Let's say we want to load the constant 0000 0000 0011 1101 0000 1001 0000 0000, which is hex 003D0900. The upper half is 0x003D = 61 (decimal) and the lower half is 0x0900 = 2304 (decimal). So "lui $s0, 61" sets the upper half, and a following "ori $s0, $s0, 2304" fills in the lower half.

li (load immediate) is a pseudo-instruction that loads a constant into a register. lui (load upper immediate) is a real instruction that loads a 16-bit constant into the upper half of a register. When the constant fits in 16 bits, the pseudo-instruction li is assembled into a single addiu (add immediate unsigned) instruction; for a larger constant the assembler emits a lui followed by an ori.

Branches and Jumps

A MIPS branch instruction has only a 16-bit offset with which to determine the next instruction. A register must be added to this 16-bit value, and that register is implied by the architecture: it is the PC. The PC is updated to PC + 4 during the fetch cycle, so it already holds the address of the instruction after the branch. The branch distance is therefore limited to -2^15 to +2^15 - 1 instructions from the instruction after the branch. This is not a real issue, since most branches are local anyway.

Step by step:

● Sign extend the 16-bit offset to 32 bits to preserve its value.
● Multiply the resulting value by 4 (shift left by 2). Instructions are word-aligned, so the low two bits of every instruction address are always 00. Storing those two bits in the immediate would waste them, so the immediate holds a count of words rather than bytes.
● Now we have a 32-bit byte offset. Add this value to PC + 4 and that is your branch address.

For the jump instruction, MIPS has only 26 bits to determine the jump location. Unlike a branch, a jump target is not an offset from the PC; only its upper four bits are taken from the PC. Like a branch offset, the jump field is a word address, so the 26-bit value is multiplied by four.

Step by step:

● Multiply the 26-bit value by 4, giving a 28-bit value.
● Concatenate the upper four bits of the PC to the left of this 28-bit value.
● The resulting 32-bit address is the jump target.

In other words, replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits.

32-bit & 64-bit

Nearly all microprocessors now have 64-bit extensions in response to the needs of software for larger programs.

Decoding Machine Language

How do we convert 1s and 0s to assembly language and to C code? Machine language ⇒ assembly ⇒ C?

For each 32 bits:

i) Look at the opcode to distinguish between R-Format, J-Format, and I-Format
ii) Use the instruction format to determine which fields exist
iii) Write out MIPS assembly code, converting each field to name, register number/name, or decimal/hex number
iv) Logically convert this MIPS code into valid C code. Always possible? Unique?
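
Decoding steps i) and ii), together with the branch and jump address rules above, can be sketched in C++. This is an illustrative sketch rather than a full disassembler: the names Format, Fields, decode, branch_target, and jump_target are hypothetical, and the jump rule is assumed to take its upper four bits from PC + 4.

```cpp
#include <cassert>
#include <cstdint>

enum class Format { R, I, J };

struct Fields {
    Format format;
    std::uint32_t opcode;
    std::uint32_t rs, rt;            // R- and I-format
    std::uint32_t rd, shamt, funct;  // R-format only
    std::int32_t immediate;          // I-format only, sign-extended
    std::uint32_t target;            // J-format only, 26-bit word address
};

// Step i: classify by opcode. Step ii: extract the fields of that format.
Fields decode(std::uint32_t word) {
    Fields f{};
    f.opcode = word >> 26;
    if (f.opcode == 0) f.format = Format::R;
    else if (f.opcode == 2 || f.opcode == 3) f.format = Format::J;
    else f.format = Format::I;

    if (f.format == Format::J) {
        f.target = word & 0x03FFFFFFu;
        return f;
    }
    f.rs = (word >> 21) & 0x1Fu;
    f.rt = (word >> 16) & 0x1Fu;
    if (f.format == Format::R) {
        f.rd = (word >> 11) & 0x1Fu;
        f.shamt = (word >> 6) & 0x1Fu;
        f.funct = word & 0x3Fu;
    } else {
        // Sign extend the 16-bit immediate to 32 bits.
        std::int32_t imm = static_cast<std::int32_t>(word & 0xFFFFu);
        if (imm & 0x8000) imm -= 0x10000;
        f.immediate = imm;
    }
    return f;
}

// Branch: (PC + 4) + sign-extended offset * 4.
std::uint32_t branch_target(std::uint32_t pc, std::int32_t immediate) {
    assert(pc % 4 == 0);  // instruction addresses are word-aligned
    return pc + 4 + (static_cast<std::uint32_t>(immediate) << 2);
}

// Jump: upper 4 bits of (PC + 4) joined with the 26-bit field * 4.
std::uint32_t jump_target(std::uint32_t pc, std::uint32_t target) {
    assert(pc % 4 == 0);
    return ((pc + 4) & 0xF0000000u) | (target << 2);
}
```

For example, decode(0x11000003) reports an I-format instruction with opcode 4, rs 8, rt 0 and immediate 3, and branch_target(0x00400008, 3) gives 0x00400018.
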
Decoding (1/7)

Here are six machine language instructions in hexadecimal:

    00001025hex
    0005402Ahex
    11000003hex
    00441020hex
    20A5FFFFhex
    08100001hex

Let the first instruction be at address 4,194,304ten (0x00400000hex).

Next step: convert hex to binary

Decoding (2/7)

The six machine language instructions in binary:

    00000000000000000001000000100101
    00000000000001010100000000101010
    00010001000000000000000000000011
    00000000010001000001000000100000
    00100000101001011111111111111111
    00001000000100000000000000000001

Next step: identify opcode and format

    Format   Opcode     Remaining fields
    R        0          rs  rt  rd  shamt  funct
    I        1, 4-62    rs  rt  immediate
    J        2 or 3     target address

Decoding (3/7)

Select the opcode (first 6 bits) to determine the format:

    Format   Instruction
    R        00000000000000000001000000100101
    R        00000000000001010100000000101010
    I        00010001000000000000000000000011
    R        00000000010001000001000000100000
    I        00100000101001011111111111111111
    J        00001000000100000000000000000001

Look at opcode: 0 means R-Format, 2 or 3 mean J-Format, otherwise I-Format.

Next step: separation of fields

Decoding (4/7)

Fields separated based on format/opcode:

    Format   opcode  rs  rt  rd  shamt  funct
    R        0       0   0   2   0      37
    R        0       0   5   8   0      42
    R        0       2   4   2   0      32

    Format   opcode  rs  rt  immediate
    I        4       8   0   +3
    I        8       5   5   -1

    Format   opcode  target address
    J        2       1,048,577

In program order the six instructions are R, R, I, R, I, J.

Next step: translate ("disassemble") to MIPS assembly instructions

Decoding (5/7)

MIPS Assembly (Part 1):

    Address       Assembly instruction
    0x00400000    or    $2,$0,$0
    0x00400004    slt   $8,$0,$5
    0x00400008    beq   $8,$0,3
    0x0040000c    add   $2,$2,$4
    0x00400010    addi  $5,$5,-1
    0x00400014    j     0x100001

Better solution: translate to more meaningful MIPS instructions (fix the branch/jump and add labels, registers)

Decoding (6/7)

MIPS Assembly (Part 2):

            or    $v0,$0,$0
    Loop:   slt   $t0,$0,$a1
            beq   $t0,$0,Exit
            add   $v0,$v0,$a0
            addi  $a1,$a1,-1
            j     Loop
    Exit:

Next step: translate to C code (must be creative!)

Decoding (7/7)

C code, using the mapping $v0: var1, $a0: var2, $a1: var3:

    var1 = 0;
    while (var3 > 0) {
        var1 += var2;
        var3 -= 1;
    }

Fallacies

● Powerful instruction ⇒ higher performance
  ○ Fewer instructions required
  ○ But complex instructions are hard to implement
  ○ May slow down all instructions, including simple ones
  ○ Compilers are good at making fast code from simple instructions
● Use assembly code for high performance
  ○ But modern compilers are better at dealing with modern processors
  ○ More lines of code ⇒ more errors and less productivity

Pitfalls

● Sequential words are not at sequential addresses
  ○ Increment by 4, not by 1!
● Keeping a pointer to an automatic variable after procedure returns
  ○ e.g., passing pointer back via an argument
  ○ Pointer becomes invalid when stack popped

Parallelism and Instructions: Synchronization

Parallel computing is a type of computation in which many calculations, or the execution of many processes, are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has been employed for many years, mainly in high-performance computing, but interest in it has grown lately due to the physical constraints preventing frequency scaling. As power consumption (and consequently heat generation) by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

Thread synchronization is a mechanism which ensures that two or more concurrent processes or threads do not simultaneously execute a particular program segment known as a critical section.
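
One common way to protect a critical section is a lock acquired with an atomic exchange. The C++ sketch below is only an illustration under stated assumptions: SpinLock and guarded_add are hypothetical names, and std::atomic from the C++ standard library stands in for whatever atomic primitive the hardware provides.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical spin lock: state_ is 0 when free and 1 when held.
class SpinLock {
public:
    // Atomically swap in 1; the lock was free only if the old value was 0.
    bool try_lock() {
        return state_.exchange(1, std::memory_order_acquire) == 0;
    }

    // Keep retrying until the exchange observes a free lock.
    void lock() {
        while (!try_lock()) {
            // spin
        }
    }

    void unlock() {
        assert(state_.load() == 1);  // releasing a free lock is a bug
        state_.store(0, std::memory_order_release);
    }

private:
    std::atomic<int> state_{0};
};

// Critical section: only the thread holding the lock updates total.
void guarded_add(SpinLock& lock, long& total, long amount) {
    lock.lock();
    total += amount;
    lock.unlock();
}
```

Any thread that calls guarded_add while another thread holds the lock spins inside lock() until unlock() runs, so updates to total never interleave.
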
When one thread starts executing the critical section (a serialized segment of the program), any other thread must wait until the first thread finishes. If proper synchronization techniques are not applied, the result may be a race condition, where the values of variables are unpredictable and vary depending on the timing of context switches between the processes or threads.

Since doing the read and the write in one instruction is challenging to implement in hardware, a workaround is to use a pair of instructions in which the second returns a value that indicates whether the pair executed atomically, that is, without interference.

● In MIPS this is implemented with:
  ○ ll - load linked
  ○ sc - store conditional
● Load linked: ll rt, offset(rs)
● Store conditional: sc rt, offset(rs)
  ○ Succeeds if the location has not changed since the ll
    ■ Returns 1 in rt
  ○ Fails if the location has changed
    ■ Returns 0 in rt

Example: atomic swap (to test/set lock variable)

    try: add $t0,$zero,$s4    # copy exchange value
         ll  $t1,0($s1)       # load linked
         sc  $t0,0($s1)       # store conditional
         beq $t0,$zero,try    # branch if store fails
         add $s4,$zero,$t1    # put load value in $s4

Addressing Modes in MIPS

The different ways of specifying a register or memory location as an operand are called addressing modes.

● Immediate addressing. The operand is a constant in the instruction. e.g. addi, lui, slti, andi, ori, sll, srl
● Register addressing. The operand is in a register. Simple; it addresses a location inside the processor. e.g. add, sub, and, or, nor, jr
● Base addressing. The operand is at memory location = (16-bit constant in the instruction) + (base address held in a register). Used for addressing elements of an array. e.g. lw, sw, lh, sh, lb, sb
● PC-relative addressing. The address of the operand = PC + (16-bit constant shifted left by 2). e.g. the branching instructions beq, bne
● Pseudo-direct addressing. Used for jump instructions. Address = (26 bits shifted left by 2 = 28 bits) concatenated with the upper 4 bits of the PC. e.g. j, jal

Translating and Starting a Program

What happens when you compile a C program? When you run one? There are 4 steps for transforming C source code into a running program in memory: compiling, assembling, linking, and loading (accomplished by systems programs). Of course in an IDE, these steps are hidden from the user. The steps taken are:

● Preprocessing. Processes included header files, conditional compilation (ifdefs), and macros.
● Compiler. Produces an assembly language program, a symbolic form of machine (binary) language. It has many more lines than the source code. Low-level code (operating systems, assemblers) used to be written in assembly language.
● Assembler. Translates the assembly program into an object file: machine code + (global) data + information for placing instructions in memory properly.
  ○ header. Size and position of the sections in the .o file.
  ○ text. Contains the machine code.
  ○ static data. Data that will be available for the lifetime of the program.
    ■ .bss uninitialized global data
    ■ .data initialized global data
    ■ .rodata read-only global data: string literals and constants
  ○ relocation information. Identifies instructions and data that depend on absolute addresses when the program runs, e.g. j Label1. The linker uses this information to adjust section contents; e.g. the linker tracks the address of a procedure so other procedures may call it.

Arrays & Pointers

Array indexing involves multiplying the index by the element size and adding the result to the array base address. Pointers correspond directly to memory addresses, which can avoid the indexing complexity.

Strength reduction is a compiler optimization technique in which expensive operations are replaced with equivalent but less expensive operations. Examples:

● replace a multiply inside a loop with a "weaker" add
● replace exponentiation inside a loop with multiplication
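
The difference between array indexing and pointers can be seen in two ways of clearing an array. This is a minimal C++ sketch with the hypothetical names clear_indexed and clear_pointer; the second form is roughly what strength reduction produces from the first, and an optimizing compiler will often perform this transformation on its own.

```cpp
#include <cassert>
#include <cstddef>

// Indexing version: each access conceptually computes
// base address + i * sizeof(int).
void clear_indexed(int array[], std::size_t size) {
    assert(array != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i) {
        array[i] = 0;
    }
}

// Pointer version: the per-iteration multiply is replaced by
// advancing the pointer one element at a time (an add).
void clear_pointer(int* array, std::size_t size) {
    assert(array != nullptr || size == 0);
    for (int* p = array; p < array + size; ++p) {
        *p = 0;
    }
}
```

Both functions leave every element equal to zero; they differ only in how the address of each element is formed.
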

