Aside from the smart cell phones used by a billion people, list and describe four other types of computers.
Computer Science / Computer Organization and Design 5 / Chapter 1 / Problem 1.12
Textbook Solutions for Computer Organization and Design
Question
Section 1.10 cites as a pitfall the utilization of a subset of the performance equation as a performance metric. To illustrate this, consider the following two processors. P1 has a clock rate of 4 GHz, average CPI of 0.9, and requires the execution of 5.0E9 instructions. P2 has a clock rate of 3 GHz, an average CPI of 0.75, and requires the execution of 1.0E9 instructions.
Solution
Step 1 of 5
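There are two processors, P1 and P2, with different clock rates, CPIs (clock cycles per instruction), and instruction counts. The details are specified in the following table:
Processor | Clock rate | CPI | Instruction count
P1 | 4 GHz | 0.9 | 5.0E9
P2 | 3 GHz | 0.75 | 1.0E9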
The formula to calculate the CPU time (execution time) is given below:
\(\text{CPU time} = \dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}\)
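As a minimal sketch, the standard performance equation (CPU time = instruction count × CPI / clock rate) can be applied to the two processors from the question. The function and variable names here are illustrative, not part of the textbook solution.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Return CPU time in seconds: instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz

# Values taken from the question: P1 at 4 GHz, P2 at 3 GHz.
p1_time = cpu_time(5.0e9, 0.9, 4.0e9)
p2_time = cpu_time(1.0e9, 0.75, 3.0e9)

print(f"P1: {p1_time:.3f} s, P2: {p2_time:.3f} s")
```

With these inputs P1 needs 1.125 s and P2 needs 0.25 s, so the processor with the higher clock rate is the slower one for its workload.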
Chapter 1 textbook questions
Chapter 1: Problem 1 Computer Organization and Design 5
The eight great ideas in computer architecture are similar to ideas from other fields. Match the eight ideas from computer architecture, “Design for Moore’s Law”, “Use Abstraction to Simplify Design”, “Make the Common Case Fast”, “Performance via Parallelism”, “Performance via Pipelining”, “Performance via Prediction”, “Hierarchy of Memories”, and “Dependability via Redundancy” to the following ideas from other fields:
a. Assembly lines in automobile manufacturing
b. Suspension bridge cables
c. Aircraft and marine navigation systems that incorporate wind information
d. Express elevators in buildings
e. Library reserve desk
f. Increasing the gate area on a CMOS transistor to decrease its switching time
g. Adding electromagnetic aircraft catapults (which are electrically-powered as opposed to current steam-powered models), allowed by the increased power generation offered by the new reactor technology
h. Building self-driving cars whose control systems partially rely on existing sensor systems already installed into the base vehicle, such as lane departure systems and smart cruise control systems
Describe the steps that transform a program written in a high-level language such as C into a representation that is directly executed by a computer processor.
Assume a color display using 8 bits for each of the primary colors (red, green, blue) per pixel and a frame size of \(1280 \times 1024\). a. What is the minimum size in bytes of the frame buffer to store a frame? b. How long would it take, at a minimum, for the frame to be sent over a 100 Mbit/s network?
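The arithmetic for the frame buffer problem can be sketched as below. It assumes 1 Mbit/s means 10^6 bits per second and ignores any protocol overhead; the constant names are illustrative.

```python
WIDTH, HEIGHT = 1280, 1024
BYTES_PER_PIXEL = 3  # 8 bits each for red, green, and blue

# Part a: minimum frame buffer size in bytes.
frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

# Part b: minimum transfer time, assuming 1 Mbit = 10**6 bits and no overhead.
NETWORK_BITS_PER_SECOND = 100e6
transfer_seconds = frame_bytes * 8 / NETWORK_BITS_PER_SECOND

print(frame_bytes, "bytes;", round(transfer_seconds, 4), "s")
```

Under these assumptions the frame needs 3,932,160 bytes and takes roughly 0.31 s to send.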
Consider three different processors P1, P2, and P3 executing the same instruction set. P1 has a 3 GHz clock rate and a CPI of 1.5. P2 has a 2.5 GHz clock rate and a CPI of 1.0. P3 has a 4.0 GHz clock rate and has a CPI of 2.2. a. Which processor has the highest performance expressed in instructions per second? b. If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions. c. We are trying to reduce the execution time by 30% but this leads to an increase of 20% in the CPI. What clock rate should we have to get this time reduction?
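One way to work through the three parts is sketched below. It relies on instructions per second = clock rate / CPI and on execution time = instructions × CPI / clock rate; the dictionary layout and names are illustrative choices.

```python
# (clock rate in Hz, CPI) for each processor, taken from the problem statement.
processors = {"P1": (3.0e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4.0e9, 2.2)}

# Part a: instructions per second = clock rate / CPI.
ips = {name: clock / cpi for name, (clock, cpi) in processors.items()}
fastest = max(ips, key=ips.get)

# Part b: cycles and instructions for a 10-second run.
RUN_SECONDS = 10
cycles = {name: clock * RUN_SECONDS for name, (clock, _) in processors.items()}
instructions = {name: cycles[name] / cpi for name, (_, cpi) in processors.items()}

# Part c: time = IC * CPI / f. With time scaled by 0.7 and CPI by 1.2,
# the clock rate must scale by 1.2 / 0.7.
clock_scale = 1.2 / 0.7
new_clocks = {name: clock * clock_scale for name, (clock, _) in processors.items()}

print(fastest, ips, cycles, instructions, new_clocks)
```

P2 comes out fastest at 2.5E9 instructions per second, and each clock rate must rise by a factor of about 1.71 to reach the 30% time reduction.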
Consider two different implementations of the same instruction set architecture. The instructions can be divided into four classes according to their CPI (class A, B, C, and D). P1 with a clock rate of 2.5 GHz and CPIs of 1, 2, 3, and 3, and P2 with a clock rate of 3 GHz and CPIs of 2, 2, 2, and 2. Given a program with a dynamic instruction count of 1.0E6 instructions divided into classes as follows: 10% class A, 20% class B, 50% class C, and 20% class D, which implementation is faster? a. What is the global CPI for each implementation? b. Find the clock cycles required in both cases.
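The global CPI is the mix-weighted average of the per-class CPIs. A short sketch of that calculation, with illustrative names, follows.

```python
INSTRUCTION_COUNT = 1.0e6
MIX = [0.10, 0.20, 0.50, 0.20]  # fractions of classes A, B, C, D

def global_cpi(class_cpis):
    """Weighted average CPI over the instruction mix."""
    return sum(fraction * cpi for fraction, cpi in zip(MIX, class_cpis))

cpi_p1 = global_cpi([1, 2, 3, 3])
cpi_p2 = global_cpi([2, 2, 2, 2])

cycles_p1 = cpi_p1 * INSTRUCTION_COUNT
cycles_p2 = cpi_p2 * INSTRUCTION_COUNT

time_p1 = cycles_p1 / 2.5e9  # P1 runs at 2.5 GHz
time_p2 = cycles_p2 / 3.0e9  # P2 runs at 3 GHz

print(cpi_p1, cpi_p2, cycles_p1, cycles_p2, time_p1, time_p2)
```

This gives a global CPI of 2.6 for P1 and 2.0 for P2, so P2 finishes sooner (about 0.67 ms versus 1.04 ms).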
Compilers can have a profound impact on the performance of an application. Assume that for a program, compiler A results in a dynamic instruction count of 1.0E9 and has an execution time of 1.1 s, while compiler B results in a dynamic instruction count of 1.2E9 and an execution time of 1.5 s. a. Find the average CPI for each program given that the processor has a clock cycle time of 1 ns. b. Assume the compiled programs run on two different processors. If the execution times on the two processors are the same, how much faster is the clock of the processor running compiler A’s code versus the clock of the processor running compiler B’s code? c. A new compiler is developed that uses only 6.0E8 instructions and has an average CPI of 1.1. What is the speedup of using this new compiler versus using compiler A or B on the original processor?
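A possible way to check the compiler problem numerically is sketched here. It rearranges execution time = instructions × CPI × cycle time; the variable names are illustrative.

```python
CYCLE_TIME = 1e-9  # seconds, from the problem statement

ic_a, time_a = 1.0e9, 1.1
ic_b, time_b = 1.2e9, 1.5

# Part a: CPI = execution time / (instruction count * cycle time).
cpi_a = time_a / (ic_a * CYCLE_TIME)
cpi_b = time_b / (ic_b * CYCLE_TIME)

# Part b: equal execution times imply f_A / f_B = (IC_A * CPI_A) / (IC_B * CPI_B).
clock_ratio_a_to_b = (ic_a * cpi_a) / (ic_b * cpi_b)

# Part c: new compiler with 6.0E8 instructions and CPI 1.1 on the original processor.
time_new = 6.0e8 * 1.1 * CYCLE_TIME
speedup_vs_a = time_a / time_new
speedup_vs_b = time_b / time_new

print(cpi_a, cpi_b, clock_ratio_a_to_b, speedup_vs_a, speedup_vs_b)
```

The ratio in part b is about 0.73, meaning the processor running compiler A's code can use a clock roughly 27% slower and still match the other processor.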
The Pentium 4 Prescott processor, released in 2004, had a clock rate of 3.6 GHz and voltage of 1.25 V. Assume that, on average, it consumed 10 W of static power and 90 W of dynamic power. The Core i5 Ivy Bridge, released in 2012, had a clock rate of 3.4 GHz and voltage of 0.9 V. Assume that, on average, it consumed 30 W of static power and 40 W of dynamic power.
For each processor find the average capacitive loads.
Find the percentage of the total dissipated power comprised by static power and the ratio of static power to dynamic power for each technology.
If the total dissipated power is to be reduced by 10%, how much should the voltage be reduced to maintain the same leakage current? Note: power is defined as the product of voltage and current.
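The first two parts of the power problem can be sketched as below, assuming the textbook's dynamic power model, dynamic power = 1/2 × capacitive load × voltage² × frequency. The helper names are illustrative, and the voltage-reduction part is not covered here.

```python
def capacitive_load(dynamic_power_w, voltage_v, clock_hz):
    """Solve P_dynamic = 0.5 * C * V**2 * f for C (in farads)."""
    return 2 * dynamic_power_w / (voltage_v ** 2 * clock_hz)

def static_share(static_w, dynamic_w):
    """Return (static fraction of total power, static-to-dynamic ratio)."""
    return static_w / (static_w + dynamic_w), static_w / dynamic_w

# Pentium 4 Prescott: 3.6 GHz, 1.25 V, 10 W static, 90 W dynamic.
c_p4 = capacitive_load(90, 1.25, 3.6e9)
share_p4 = static_share(10, 90)

# Core i5 Ivy Bridge: 3.4 GHz, 0.9 V, 30 W static, 40 W dynamic.
c_i5 = capacitive_load(40, 0.9, 3.4e9)
share_i5 = static_share(30, 40)

print(c_p4, share_p4, c_i5, share_i5)
```

Under this model the loads come out near 3.2E-8 F and 2.9E-8 F, and static power makes up 10% of the total for the Pentium 4 versus about 43% for the Core i5.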
Assume for arithmetic, load/store, and branch instructions, a processor has CPIs of 1, 12, and 5, respectively. Also assume that on a single processor a program requires the execution of 2.56E9 arithmetic instructions, 1.28E9 load/store instructions, and 256 million branch instructions. Assume that each processor has a 2 GHz clock frequency. Assume that, as the program is parallelized to run over multiple cores, the number of arithmetic and load/store instructions per processor is divided by 0.7 x p (where p is the number of processors) but the number of branch instructions per processor remains the same.
Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the relative speedup of the 2, 4, and 8 processor result relative to the single processor result.
If the CPI of the arithmetic instructions was doubled, what would the impact be on the execution time of the program on 1, 2, 4, or 8 processors?
To what should the CPI of load/store instructions be reduced in order for a single processor to match the performance of four processors using the original CPI values?
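The multiprocessor timing questions can be explored with a sketch like the one below. It assumes the single-processor case uses the undivided instruction counts (the 0.7 × p divisor applies only when p > 1), which is one reading of the problem statement; all names are illustrative.

```python
CLOCK_HZ = 2.0e9
ARITH, LOAD_STORE, BRANCH = 2.56e9, 1.28e9, 256e6

def execution_time(p, cpi_arith=1, cpi_ls=12, cpi_branch=5):
    """Per-processor execution time in seconds on p processors."""
    # Assumption: with one processor the counts are not divided at all.
    scale = 1.0 if p == 1 else 0.7 * p
    cycles = (ARITH * cpi_arith + LOAD_STORE * cpi_ls) / scale + BRANCH * cpi_branch
    return cycles / CLOCK_HZ

times = {p: execution_time(p) for p in (1, 2, 4, 8)}
speedups = {p: times[1] / t for p, t in times.items()}

# Effect of doubling the arithmetic CPI.
doubled = {p: execution_time(p, cpi_arith=2) for p in (1, 2, 4, 8)}

# Load/store CPI that lets one processor match the original 4-processor time.
target_cycles = times[4] * CLOCK_HZ
cpi_ls_needed = (target_cycles - ARITH * 1 - BRANCH * 5) / LOAD_STORE

print(times, speedups, doubled, cpi_ls_needed)
```

Under that assumption the times are 9.6 s, 7.04 s, 3.84 s, and 2.24 s, and a load/store CPI of 3 would let a single processor match the four-processor result.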
Assume a 15 cm diameter wafer has a cost of 12, contains 84 dies, and has \(0.020 \ \mathrm{defects/cm}^2\) . Assume a 20 cm diameter wafer has a cost of 15, contains 100 dies, and has \(0.031 \ \mathrm{defects/cm}^2\).
Find the cost per die for both wafers.
If the number of dies per wafer is increased by 10% and the defects per area unit increases by 15%, find the die area and yield.
Assume a fabrication process improves the yield from 0.92 to 0.95. Find the defects per area unit for each version of the technology given a die area of \(200 \ \mathrm{mm}^2\)
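A sketch of the wafer calculations follows. It assumes the textbook's yield model, yield = 1 / (1 + defects per area × die area / 2)², and approximates die area as wafer area divided by the number of dies; the function names are illustrative.

```python
import math

def die_area_cm2(wafer_diameter_cm, dies_per_wafer):
    """Approximate die area as wafer area / number of dies."""
    return math.pi * (wafer_diameter_cm / 2) ** 2 / dies_per_wafer

def die_yield(defects_per_cm2, area_cm2):
    """Assumed yield model: 1 / (1 + defects * area / 2)**2."""
    return 1 / (1 + defects_per_cm2 * area_cm2 / 2) ** 2

def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    return wafer_cost / (dies_per_wafer * yield_fraction)

def defects_for_yield(yield_fraction, area_cm2):
    """Invert the yield model to get defects per cm^2."""
    return 2 * (1 / math.sqrt(yield_fraction) - 1) / area_cm2

yield_15 = die_yield(0.020, die_area_cm2(15, 84))
yield_20 = die_yield(0.031, die_area_cm2(20, 100))
cost_15 = cost_per_die(12, 84, yield_15)
cost_20 = cost_per_die(15, 100, yield_20)

# A 200 mm^2 die is 2 cm^2.
defects_before = defects_for_yield(0.92, 2.0)
defects_after = defects_for_yield(0.95, 2.0)

print(yield_15, cost_15, yield_20, cost_20, defects_before, defects_after)
```

With this model the yields are about 0.959 and 0.909, giving costs per die of roughly 0.149 and 0.165.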
The results of the SPEC CPU2006 bzip2 benchmark running on an AMD Barcelona show an instruction count of 2.389E12, an execution time of 750 s, and a reference time of 9650 s.
Find the CPI if the clock cycle time is 0.333 ns.
Find the increase in CPU time if the number of instructions of the benchmark is increased by 10% without affecting the CPI.
Find the increase in CPU time if the number of instructions of the benchmark is increased by 10% and the CPI is increased by 5%.
Find the change in the SPECratio for this change.
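The first four bzip2 questions can be checked with a short sketch. It uses CPU time = instructions × CPI × cycle time and SPECratio = reference time / execution time; the variable names are illustrative.

```python
INSTRUCTIONS = 2.389e12
EXEC_TIME = 750.0        # seconds
REFERENCE_TIME = 9650.0  # seconds
CYCLE_TIME = 0.333e-9    # seconds

# CPI from CPU time = instructions * CPI * cycle time.
cpi = EXEC_TIME / (INSTRUCTIONS * CYCLE_TIME)

# CPU time scales linearly with instruction count and with CPI.
time_more_instructions = EXEC_TIME * 1.10
time_more_instructions_and_cpi = EXEC_TIME * 1.10 * 1.05

spec_ratio_before = REFERENCE_TIME / EXEC_TIME
spec_ratio_after = REFERENCE_TIME / time_more_instructions_and_cpi

print(cpi, time_more_instructions, time_more_instructions_and_cpi,
      spec_ratio_before, spec_ratio_after)
```

The CPI works out to about 0.94, CPU time grows by 10% and then by 15.5%, and the SPECratio falls from about 12.87 to about 11.14.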
Suppose that we are developing a new version of the AMD Barcelona processor with a 4 GHz clock rate. We have added some additional instructions to the instruction set in such a way that the number of instructions has been reduced by 15%. The execution time is reduced to 700 s and the new SPECratio is 13.7. Find the new CPI.
This CPI value is larger than obtained in 1.11.1 as the clock rate was increased from 3 GHz to 4 GHz. Determine whether the increase in the CPI is similar to that of the clock rate. If they are dissimilar, why?
By how much has the CPU time been reduced?
For a second benchmark, libquantum, assume an execution time of 960 ns, CPI of 1.61, and clock rate of 3 GHz. If the execution time is reduced by an additional 10% without affecting the CPI and with a clock rate of 4 GHz, determine the number of instructions.
Determine the clock rate required to give a further 10% reduction in CPU time while maintaining the number of instructions and with the CPI unchanged.
Determine the clock rate if the CPI is reduced by 15% and the CPU time by 20% while the number of instructions is unchanged.
Section 1.10 cites as a pitfall the utilization of a subset of the performance equation as a performance metric. To illustrate this, consider the following two processors. P1 has a clock rate of 4 GHz, average CPI of 0.9, and requires the execution of 5.0E9 instructions. P2 has a clock rate of 3 GHz, an average CPI of 0.75, and requires the execution of 1.0E9 instructions.
One usual fallacy is to consider the computer with the largest clock rate as having the largest performance. Check if this is true for P1 and P2.
Another fallacy is to consider that the processor executing the largest number of instructions will need a larger CPU time. Considering that processor P1 is executing a sequence of 1.0E9 instructions and that the CPI of processors P1 and P2 do not change, determine the number of instructions that P2 can execute in the same time that P1 needs to execute 1.0E9 instructions.
A common fallacy is to use MIPS (millions of instructions per second) to compare the performance of two different processors, and consider that the processor with the largest MIPS has the largest performance. Check if this is true for P1 and P2.
Another common performance figure is MFLOPS (millions of floating-point operations per second), defined as MFLOPS = No. FP operations / (execution time \(\times\) 1E6) but this figure has the same problems as MIPS. Assume that 40% of the instructions executed on both P1 and P2 are floating-point instructions. Find the MFLOPS figures for the programs.
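The MIPS and MFLOPS comparisons for P1 and P2 can be sketched as follows. MIPS is taken as clock rate / (CPI × 10^6), which follows from its definition; the dictionary layout and names are illustrative.

```python
# (clock rate in Hz, CPI, instruction count) from the problem statement.
procs = {"P1": (4.0e9, 0.9, 5.0e9), "P2": (3.0e9, 0.75, 1.0e9)}
FP_FRACTION = 0.40

results = {}
for name, (clock, cpi, count) in procs.items():
    exec_time = count * cpi / clock
    results[name] = {
        "time": exec_time,
        "mips": clock / (cpi * 1e6),
        "mflops": FP_FRACTION * count / (exec_time * 1e6),
    }

# Instructions P2 can execute in the time P1 needs for 1.0E9 instructions.
p1_time_for_1e9 = 1.0e9 * procs["P1"][1] / procs["P1"][0]
p2_instructions = p1_time_for_1e9 * procs["P2"][0] / procs["P2"][1]

print(results, p2_instructions)
```

P1 shows the higher MIPS and MFLOPS figures yet takes longer to finish its program, which is the point of the fallacy.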
Another pitfall cited in Section 1.10 is expecting to improve the overall performance of a computer by improving only one aspect of the computer. Consider a computer running a program that requires 250 s, with 70 s spent executing FP instructions, 85 s spent executing L/S instructions, and 40 s spent executing branch instructions.
By how much is the total time reduced if the time for FP operations is reduced by 20%?
By how much is the time for INT operations reduced if the total time is reduced by 20%?
Can the total time be reduced by 20% by reducing only the time for branch instructions?
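The three questions about the 250 s program can be sketched as below. The sketch assumes that the time not attributed to FP, L/S, or branch instructions (55 s) is spent on INT instructions; the names are illustrative.

```python
TOTAL, FP, LS, BRANCH = 250.0, 70.0, 85.0, 40.0
INT = TOTAL - FP - LS - BRANCH  # assumption: the remainder is INT time

# FP time cut by 20%: how much does the total shrink?
new_total = TOTAL - 0.20 * FP
total_reduction_pct = (TOTAL - new_total) / TOTAL * 100

# Total cut by 20% through INT alone: required INT reduction.
required_saving = 0.20 * TOTAL
int_reduction_pct = required_saving / INT * 100

# Branch time alone can cover the saving only if it is at least that large.
branch_alone_is_enough = BRANCH >= required_saving

print(INT, total_reduction_pct, int_reduction_pct, branch_alone_is_enough)
```

Cutting FP time by 20% trims the total by only 5.6%, INT time would have to drop by about 91%, and branch time (40 s) is smaller than the 50 s that would need to be saved.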
Assume a program requires the execution of \(50 \times 10^6 \ \mathrm{FP}\) instructions, \(110 \times 10^6 \ \mathrm{INT}\) instructions, \(80 \times 10^6 \ \mathrm{L/S}\) instructions, and \(16 \times 10^6\) branch instructions. The CPI for each type of instruction is 1, 1, 4, and 2, respectively. Assume that the processor has a 2 GHz clock rate.
By how much must we improve the CPI of FP instructions if we want the program to run two times faster?
By how much must we improve the CPI of L/S instructions if we want the program to run two times faster?
By how much is the execution time of the program improved if the CPI of INT and FP instructions is reduced by 40% and the CPI of L/S and Branch is reduced by 30%?
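A sketch of the CPI-improvement questions follows, reading the instruction counts as multiples of 10^6 (50, 110, 80, and 16 million). The helper names are illustrative.

```python
CLOCK_HZ = 2.0e9
counts = {"FP": 50e6, "INT": 110e6, "LS": 80e6, "BRANCH": 16e6}
cpi = {"FP": 1.0, "INT": 1.0, "LS": 4.0, "BRANCH": 2.0}

def total_cycles(cpi_table):
    return sum(counts[kind] * cpi_table[kind] for kind in counts)

def required_cpi(kind, speedup):
    """CPI that `kind` alone would need for the given overall speedup."""
    target = total_cycles(cpi) / speedup
    others = total_cycles(cpi) - counts[kind] * cpi[kind]
    return (target - others) / counts[kind]

base_time = total_cycles(cpi) / CLOCK_HZ
fp_cpi_needed = required_cpi("FP", 2)  # negative means it cannot be done
ls_cpi_needed = required_cpi("LS", 2)

# INT and FP CPI reduced by 40%, L/S and branch CPI reduced by 30%.
improved = {"FP": 0.6, "INT": 0.6, "LS": 2.8, "BRANCH": 1.4}
improved_time = total_cycles(improved) / CLOCK_HZ

print(base_time, fp_cpi_needed, ls_cpi_needed, improved_time)
```

The negative FP result shows that improving FP alone cannot double performance, while an L/S CPI of 0.8 would; the combined reductions bring the time from 0.256 s to about 0.171 s.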
When a program is adapted to run on multiple processors in a multiprocessor system, the execution time on each processor is comprised of computing time and the overhead time required for locked critical sections and/or to send data from one processor to another. Assume a program requires t = 100 s of execution time on one processor. When run on p processors, each processor requires t/p s, as well as an additional 4 s of overhead, irrespective of the number of processors. Compute the per-processor execution time for 2, 4, 8, 16, 32, 64, and 128 processors. For each case, list the corresponding speedup relative to a single processor and the ratio between actual speedup versus ideal speedup (speedup if there was no overhead).
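The per-processor time, speedup, and efficiency ratio can be tabulated with a sketch like this one. It takes the ideal speedup on p processors to be p (no overhead); the names are illustrative.

```python
SINGLE_TIME = 100.0  # seconds on one processor
OVERHEAD = 4.0       # seconds, independent of processor count

def per_processor_time(p):
    return SINGLE_TIME / p + OVERHEAD

table = {}
for p in (2, 4, 8, 16, 32, 64, 128):
    time_p = per_processor_time(p)
    speedup = SINGLE_TIME / time_p
    # Ideal speedup without overhead is p, so the ratio is speedup / p.
    table[p] = (time_p, speedup, speedup / p)

for p, (time_p, speedup, ratio) in table.items():
    print(f"p={p:3d}  time={time_p:7.3f} s  speedup={speedup:6.2f}  ratio={ratio:.3f}")
```

The ratio drops steadily as p grows because the fixed 4 s overhead comes to dominate the shrinking compute time.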