Popular in Course
Popular in ComputerScienence
This 21 page Class Notes was uploaded by Sterling Schmeler on Monday October 26, 2015. The Class Notes belongs to CS0447 at University of Pittsburgh taught by Staff in Fall. Since its upload, it has received 50 views. For similar materials see /class/229391/cs0447-university-of-pittsburgh in ComputerScienence at University of Pittsburgh.
Reviews for COMPUTRORGZTN&ASSMBLYLANG
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/26/15
CSCOE0447 Computer Organization and Assembly Language Chapter 4 Sa ngyeu n Cho De t of Computer Science University of Pittsburgh Performance evaluation Often we face an o timization roblem When cost is fixed lVlalelZe performance and Teatures When performance target is achieved 0 Minimize cost Give me the hig hest performance computer Don39t care about cost 0 Supercomputers are of this class CSCOEO447 Computer Organization and Assemb y Language University at Pittsburgh Performance evaluation It39s an everyday process When you buy food 0 Same uantit then ou look at cost 0 Same cost then you look at quantity 0 Assuming same quality When lou bu a notebook 0 You give weights to different features LCD size CPU speed Main memory size Batterylife o How do you quantify these and come up with a single score that will guide your decision CSCOEO447 Computer Organization and Assemby Language University umesburgh Computer performance Have you cared about performance seriously when buying a computer 0 What were the most important factors when you bought a new PC Defining computer performance may not be that simple 0 What do you mean when you say a computer has better performance than another We need a Common metric 0 Remember however than a single metric will not say everything about a computer 0 We may need multiple metrics Response time vs throughput CSCOEO447 Computer Organization and Assemby Language University afpinsbuigh Response time vs throughput Conventions Plane DC m Paris Top Speed Passengers Thg xft Throu h ut Boeing 747 65 hours 610 mph 470 286700 0 Eg of transactions processed per second SADSUd 3 hours 1350 mph 132 178200 Larger 395 better onoorde which has higher performance If we are primarily concerned with response time Time to deliver passenger performance can be defined as 1T where T is the response 0 Time to deliver 400 passengers time u for l called SmallerT is better 0 This is called quotresponse timequot or quotexecution timequot or quotlatencyquot That is larger 1T is better TimeJob Machine A is N times faster than B 39 JObS Per day called 0 N performanceAperformanceB timeBtimeA Thls 395 called quotthI OUghPUtquot 0quot quotbandWIdthquot 0 When we present this statement we want to have Ngt1 to eliminate o Jobstime comusnon CSC0E0447 Computer Organization and Assemby Language Unwers yummbmgh CSC0E0447 Computer Organization and Assemby Language Unmarsnyufpmsbugh Response time vs throughput Time in a computer Time Of Concorde VS BOEing 747 When 39ou time a ro39 ram what do 1 cu measure concord 395 faStequot 0 Total time to complete a task 0 65 hours3 hours or 22 times faster CPU time 39 ThroughPUt Of Being 747 V5 concorde o Overheads include disk accesses memor accesses IO activities 05 0 286700 pmph 176200 pmph overheads overheads from other programs o 16 times higher faster quotBoeing 747 is 16 times or 60 faster than Concorde in terms of throughputquot quotConcorde is 22 times or 120 faster than Boeing 747 in terms of flying time response timequot quotReal time quotresponse time quotelapsed timequot We are interested in time spent by CPU on your program only there are multiple programs running at the same time 0 quotCPU execution timequot or quotCPU timequot o Often divided into system CPU time in OS and user CPU time in user We Will focus primarily on execution time of a Single Job for program the remaining discussions 031447 Computer Organization and Assemb y Language Unmmty Bf Msbmgh CSC0E0447 computer Organization and Assemby Language University umesburgh Time in a computer Measuring time In terms of seconds cpu time computeis aie constiucted using digital ciicuitiy synclnonized at common quotcloclcquotticlcs aptaln hewheh events take place Clock is an endless ie etitive u amp down I a em 39me peiiod of a cloclc cycle up and down 1dtxkrateordokfrequenty in if 16Hz iotk is use ns ilfSGHz cloclc is used vnxyem Measuring time in terms of clocks Measuring me n terms of clocks CPU execution time an be represented with of locks 01 of lock licks lock licks E g it takes i000000 cloclcticlcsto Tihish programA has to do With how many lnitrumoni are executed in the program to do as With how many cloclcticlcs aie needed to finish a single instruction m av erage If we know the cloclc cycle time equivalently cloclc iate we can easily calculate the actual time Time t clock ticks instructions also called quotinstiuction ounlquotgtltloks pei instruction oi CPI o iotktltksgtltdolltydetlme Time otcloclcticlcs dock iatei Time of cloclc ticlcsxcloclc cycle time Time ICXCPIXCCT lc may not change iftwo macliines use the same instiuction em mammal ama emu sm v yrtgtn l1 What impacts time Workload Time CgtltCPgtltCCT A set of ro39 rams run on a com uter would form a workload lC of instructions Actual collection of applications 0 Instruction set architecture ISA o S nthetic rorams o Compiler used 0 Kernel loops CPI average clock per instruction o Microarchitecture ea 0 Compiler ISA I i elininw cache size To compare two computer systems a user would typically CO e uivalentl Clock rate compare the execution time of the workload on the two 39 39 39 machines 0 Technology amp Circwt optimization 0 Microarchitecture esp pipelining ISA CSCOEO447 Computer Organization and Assemb y Language unweer amnsbmgh CSCOEO447 Computer Organization and Assemby Language University umesburgh Benchmark suites Synthetic benchmarks A benchmark suite is a set of applications relevant for performance I A S nthesized r0 ram intended to mimic real r0 rams evaluation 39 39 39 39 39 o A BgtltCDEF SPEC standard performance evaluation corporation 39 WWW5 ec39 39 One may write a synthetic benchmark to exercise a specific CPU benchmarks f Server benchmarks Seq uence O aCtlons Graphics benchmarks o A program that maximally exercises the multiplication unit in CPU EEMBC em bedded microprocessor benchmark consortium Automotive Office CSCOEO447 Computer Organization and Assemb y Language unwas y uf nshmgh CSCOEO447 Computer Organization and Assemby Language University umesburgh Kernel programs or loops An im ortant ortion ofa real ro39 ram 0 Quicksort 0 Matrix multiplication Real programs usually consist of many loops and functions and it may be quite difficult to understand and interpret any performance evaluation results if we are only given summary information In the case kernel programs we immediately understand what39s going on in the programs CSCOEO447 Computer Organization and Assemb y Language univarmy uf ansbmgh Arithmetic mean Assume that we have N ro39 rams Arithmetic mean AM Z TimeiN Weighted arithmetic mean Z TimeiXWi Z Wi 1 AM is a special case of weighted AM where W 1N CSCOEO447 Computer Organization and Assemb y Language Unw my uf ansbmgh Summarizing performance Computer A cam puter B Program 1 sec 1 10 Program 2 sec 1000 100 Total time sec 1001 110 present a confusing picture CSCOEO447 Computer Organization and Assemby Language Normalized time Com uter A is 10 times faster than B for I rogram 1 Computer B is 10 times faster than A for program 2 Although the above statements are correct individually they Umverstty umesburgh camputerA Computer B Ratio Program 1 1 01 101 01 Program 2 1 10 011 10 Sum 2 101 101 2 different CSCOEO447 Computer Organization and Assemby Language Normalized time maintains the ratio of the original times However depending on the reference the result can be Summing eg arithmetic mean the normalized times may not produce the same comparison result University umesburgh Dealing w ratios CamputerA Computer B Ratla Program 1 1 10 01 Program 2 1 1000 01 100 10 GM 1 When performance ratios are given geometric mean GM can summarize the result more consistently GM H Ratioi N GMXiGMYi GMXYi 0 Taking either the ratio of GM or GM of the ratios yields the same result CSC0E0447 Computer Organization and Assemb y Language unwarmy uf Hasburgh Case study SPEC benchmark SPEC CPUZOOO benchmark o 12 integer programs 0 14 floatingpoint computationintensive programs To obtain a quotSPECmarkquot 0 Run each program on the target machine 0 Get each program39s performance ratio by dividing the pre provided execution time of an old SUN workstation with the execution time obtaine 0 Get GM of the ratios from all the programs CSC0E0447 Computer Organization and Assemb y Language Unw my uf Malawi Summarizing performance wme mrt immunity 1 9e SPEDN39JDIEVDJH ZDW z u r team on as in v 5 Zn 25 no in in Sewer and HPC Perfarmance Performance Fer watt Camparison From Intel Xeon 5100 brochure CsCoE0447 Computer Organization and Assemby Language unwarmy ufpmsburgh Amdahl39s law Your 0 timization techni ue is usuall a licable to onl a limited portion of program execution o Eg larger cache improved CPU frequency improved FSB frequency TImeimproved TImeuna ededTmea ededImprovement factor Make the common case fast CsCoE0447 Computer Organization and Assemby Language gummy ufpmsburgh Amdahl39s law Fallacies and pitfalls 7 Pitfall 1 Ex ectin39 the im rovement of one as ect of a Timbem computer to increase performance by an amount proportional to the size of the improvement nquot Pitfall 2 Using a subset of the performance equation as a performance metric If we extend this basic form we have 0 Timea eT1N1TzN2 TMNM where Ti is a portion of Timebefoe and Ni is the improvement ratio or speedup applied to the program execution portion CSC0E0447 Computer Organization and Assemb y Language Unwers y ummmgh CSC0E0447 Computer Organization and Assemby Language UnmmCy ufpmsbugh 2 Summary Summary Performance evaluation is an im ortant sta39e of an39 desi39n Res onse time latenc39 time to finish a Hiven 10b and engineering process Throughput ofjobs done in a second We are interested in measuring computer performance where the goal is 0 Software improvement Time of clock cycles gtlt clock cycle time H of clock cycles of instructions gtlt CPI o ardware Improvement 0 Or both We need relevant metrics and a method to summarize measurements Latency vs throug hput CSC0E0447 Computer Organization and Assemb y Language Unmmty Bf Msbmgh CSC0E0447 computer Organization and Assemby Language University umesburgh 2 Summary Workloads used in performance evaluation are typically drawn from real applications Synthetic or kernel benchmarks are often useful A benchmark suite is a set of applications designed to aid performance evaluation tasks 0 Easier experiments fair comparison consistent reporting Summarizing results 0 Use arithmetic mean AM for measured times 0 Use eometric mean GM for ratios Amdahl39s law 0 Dictates the actual performance improvement due to a speedup factor applied to a limited portion of a program 031447 Computer Organization and Assemb y Language Un versxty uf Hasburgh CSCOE0447 Computer Organization and HDDCIIIIJIy Language A Little n vyl v 39 Dept of Computer Science University of Pittsburgh Number representation What is the natural quotbasequot for numbers 0 2 10 13 27 We are accustomed to decimal or base10 39 X X X XXXXX UNARY 39 X X Unary is inefficient in space All other bases are a compact way of representing numbers CSCoE0447 Computer Organization and Assembly Language University ofpmsburgh Example b S mbols in a di39it 0 1 2 3 4 5 6 7 8 9 10 of them Example 3217 3x103 2x102 1x101 7x100 CSCOE0447 Computer Organization and Assembly Language University ofpmsburgh Numbers and base Number with base B 2 B symbols Base 10 O123456789 Base 2 01 Base 16 O123456789abcdef Unsigned number representation d31d30d1d0 is a 32digit nonnegative number Value d31xB31 d30xB3 d1xB1 doxB0 Ndigit base B 2 BN unique values CSCOE0447 Computer Organization and Assembly Language University ofpmsburgh Example b S mbols in a di39it 01 2 of them quotBinary digitquot quotbitquot Example 1101 otwo 1x24 1x23 0x22 1x21 0x20 16 8 o 2 O 26 ten Binary numbers go along well with Boolean logic 1 on O off CSCoE0447 Computer Organization and Assembly Language University ofpmsburgh Base conversion Let s do decimaltobinary conversion first A dn1dn2d1d te n Otwo Given a base10 number A come up with an ndigit binary number that has the same value CSCoE0447 Computer Organization and Assembly Language University ofpmsburgh Base conve From binary to decimal From decimal to binary From binary to hexadecimal From hexadecimal to binary From decimal to hexadecimal CSCoE0447 Computer Organization and Assembly Language University ofpmsburgh Base conversion Base 10 0 1 2 3 4 S 6 7 8 9 CSCoE0447 Computer Organization and Assembly Language University ofpmsburgh Base conve Binar base 2 to hex base 16 or hex to binar base conversion is rather straightforward 0 Take 4 bits in binary and convert them into one hex digit and vice versa Since binary notation tends to be long hex notation is frequently used Ill abbClllUly language allu Ill x programs More on binary number representation will be discussed when we study arithmetic CSCoE0447 Computer Organization and Assembly Language University ofpmsburgh CSCOE0447 Computer Organization and Assembly Language Chapter 3 Sa ngyeu n Cho De t of Computer Science University of Pittsburgh Binary arithmetic CSCoEO447 Computer Organization and Assemb y Language Sounds scary So far we studied 0 Instruction set architecture basic 0 MIPS architecture amp assembly language We will review binary arithmetic algorithms and their implementations Binary arithmetic will form the basis for CPU39s datapath design University at Pittsburgh Five classic components lam like a pack of file folders lam like a control tower iiipvi 41 exchange information H V with outside world I am like a conveyor belt service stations University umesburgh CSCoEO447 Computer Organization and Assemby Language Binary number representations We looked at how to represent a number in fact the value represented by a number in binary o Unsigned numbersa everything is positive We will deal with more complicated cases 0 Negative numbers 0 Real numbers aka tloatingpomt numbers The first question How do we re resent a ne ative number in decimal CSCoEO447 Computer Organization and Assemby Language gummy ufF nshurgh Method 1 signmagnitude Method 2 one39s complement This is the same method we use for decimal numbers 2N 1 number N 1 Sign bit absolute value magnitude A A 2 l S bt 0 Given a number A it39s negation is done by11111111A lgn l o In fact simple bitbybit inversion will give the samemagnitude 39 0 WSW9391 neg t39ve number with a different sign Examples assume 439blt representatlon 0 Examples assume 4bit representation 0000 0000 0 11 0011 v 1001 e 1001 1111 1111 1000 7 1000 Properties 39 Propert39es 0 Two Os a positive 0 and a negative 0 o There are two 05 0 There are equal of positive and negative numbers There are equal Of posmve and negatlve numbers CSC0E0447 Computer Organization andAssemby Language Umvers yufpmsbwh CSC0E0447 Computer Organization andAssemby Language Umv myumusbmgh S I I Method 3 two le Summary 2N number Code SignnMagn tude 1 s Complement 2 s Complement A A 2N 000 0 0 0 0 Given a number A it39s negation is done by 11111111 A 1 001 1 1 1 0 In fact simple bitby bit inversion followed by adding 1 will give the 010 2 2 2 samemagnitude number with a different sign 0 Examples assume 4bit representation 011 3 3 3 0 0000 100 0 3 4 39 0011 101 1 2 3 1001 1111 110 2 1 2 o 1000 111 3 0 1 Properties I Issues 0 There is a Single 0 of zeros 0 There are unequal of positive and negative numbers Baiance Arithmetic algorithm implementation CSC0E0447 Computer Organization andAssemby Language Umvermy Dfmsbmgh CSC0E0447 Computer Organization and Assemby tanguage Unwerm Dfmsbmgh 7 When N 32 If we use the two39s complement method 0000 0000 0000 0000 0000 0000 0000 0000 0 0000 0000 0000 0000 0000 0000 0000 0001 1 0000 0000 0000 0000 0000 0000 0000 0010 2 0 01111111111111111111111111111110 2147483646 01111111111111111111111111111111 2147483647 1000 0000 0000 0000 0000 0000 0000 0000 2147483648 1000 0000 0000 0000 0000 0000 0000 0001 2147483647 0 1000 0000 0000 0000 0000 0000 0000 0010 2147483646 111111111111111111111111111111013 111111111111111111111111111111102 111111111111111111111111111111111 CSC0E0447 Computer Organization and Assemb y Language Urnvers ty uf ansbmgh Addition We are uite familiar with addin39 two numbers in decimal o What about adding two binary numbers If we use the two39s complement method to represent binary numbers addition can be done in a straightforward way Examples 4 bit representation 00102 0011 3 o 1111 1 0100 4 o 0111 7 0010 2 Let39s talk about overflow now CSC0E0447 Computer Organization and Assemb y Lan guage Unw my uf Malawi CSC0E0447 Computer Organization and Assemby Language CSC0E0447 Computer Organization and Assemby Language Two39s complement operations Nequot atin39 o Invert all bits and add 1 o This operation is equivalent to subtracting the number from 2 why Converting an N bit number into an M bit number MgtN o Eg MIPS 16bit immediate it s a signed number converted into a 32bit number 0 This operation is called signextension o Eg from 4bit encoding to 8bit encoding 0010 gt 1010 gt Umva slty of Pittsburgh Overflow Because we use a limited number of di39 its to re resent a number the result of an operation may not fit No overflow when 0 We add two numbers with different signs 0 We subtract a number from another number having the same sign Overflow detection 0 Addin two ositive numbers ields a neative number 0 Adding two negative numbers yields a positive number 0 How about subtraction University umesburgh What happens on overflow The CPU can 0 Generate an exception what is an exception 0 Set a flag in the status register what is the status register 0 Do nothing Languages may have different notions about overflow Do we have overflows in the case of unsigned always positive numbers 0 Example addu addiu subu CSCOEO447 Computer Organization and Asserno y Language C language example University at Pittsburgh CSCOEO447 Computer Organization and Asserno y Language University at Pmslzlurgh MIPS example CSCOEO447 Computer Organization and Assemby Language I looked at the MIPS32 instruction set manual ADD ADDI instructions generate an exception on overflow ADDU ADDIU are silent University umesburgh Subtraction CSCOEO447 Computer Organization and Assemby Language We know how to add We know how to negate a number We will use the above two known operations to perform subtraction A B A B The hardware used for addition can be extended to nandie subtraction University umesburgh 1bit adder Nbit adder Wewm ook at a ngxem adder Nrb adder an be onitrutted Wm Wunmadmwe gnazzrm m vth N smwerbn adders A am am game m a We Drapaqaxedm me quotan 39npp emr39y 2mm sidequot A um a 2mm imp n ammum A N mm p a Nrmm39mpm 20mm gquot WW I 3an 39 WW Nbit rippleca rry adder Descr ng a s gIebit adder tmth tabewm teH uiabouttheo erauon ofeimjebwt adds quotmm 39 Multiplication More complicated than addition 0 A straightforward implementation will involve addition and shifting A quotmore complex operationquot implies o More area on silicon andor 0 More time more clock cycles or longer clock cycle time Let39s begin from a simple straightforward method 0 For now consider onlr unsigned numbers CSC0E0447 Computer Organization and Assemb y Language univarmy uf ansbmgh Hardware design 1 Mumpr cm 5m in at alts 64nit shift register mumpram r Mumvrem u 3 pr Mull pilir srrn r y is 2 Sm1 me Mul plrrard egrsler rem ml 2 arm me Murnprer rewrsla vgirt rm to lt 32 repel Ham htforward but naive desi nquot 21 repatrzmr res 32 rrpsmrcns University at Pittsburgh CSC0E0447 Computer Organization and Assemb y Language Straightforward algorithm 01010010 multiplicand x 01101101 multiplier CSC0E0447 Computer Organization and Assemby Language Hardware design 2 27bit shift register 64gb shift register viruanrcaro 32bit ALU instead of 64bit ALU CSC0E0447 Computer Organization and Assemby Language University umesburgh J r sniuuie MH auninign W 3 sniuiriewnniei rwrslw rng i ii 32rd recelmam s V39vewr An 12 rcrml liarr University umesburgh Hardware design 3 u a 9 Removed a 32bit shift register cycacwv Computa Ofgmlzamn andAxmby nguage Ummmmnsbmgh Booth39s encoding Three symbols to represent a binary number 104 Examples 8bit encoding 11111111 two39scomplement 000000071 Booth39s encoding 1 00001110two395 complement 000100710 Booth39s encoding Bit transitions in number in two39s complement encoding show how Booth39s encoding wor s 0 to 0 from right to left 0 0to 1 1 to 10 1 to 0 1 cycacnw Computa Ofgmixamn andAxmby nguage Ummmmnsbmgh Example Let39s do 0010x0110 2 X6 unsigned Iteratmn Multlpl gang Pruduct umu Dunn mm Dunn mm uu1u Dunn nun mum nun mum mum nun mum mum mun cscacow Computer otgamxatmn ang Agaamg y Language Unlvexsxtyame uurgh Booth39s encoding Ke oint A quot1quot in he multiplier implies an addition operation If you have many quot1quots it at means many addition operations Booth39s encoding is useful because it can reduce the number of addition operations you have to perform With Booth39s encoding partial results are obtained by Addin multi licand Adding 0 Subtracting multiplicand cscacow Computer otgamxatmn ang Agaamg y Language Ummyarpmgmgh Booth39s algorithm in action Let39s do OO10gtlt11012gtlt 3 CSC0E0447 Computer ergamzauon and Assemb y Language Booth s algor mm Iteration Multiplicand smp Product 0 0010 mma va ues 0000 1101 0 10 7gt product 7 1110 1101 0 1 0010 product mumphcand smrt ght 1111 0110 1 01 7gt product 0001 0110 1 2 0010 product mumphcand smrt ght 0000 1011 0 1 W 1110 1011 0 3 0010 product 7 mumphcand smrt ght 1111 0101 1 11 7gt no op 1111 0101 1 4 0010 Sort ght 1111 1010 1 Umversxty uf nasburgh Booth39s algorithm In action Imra on Mul pl cand 0 CSC0E0447 Computer Orgamzauon and Assemby Language University umesburgh
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'