Computer Architecture: A Quantitative Approach, 4th Edition

John L. Hennessy, David A. Patterson

Language: English

Pages: 704

ISBN: 0123704901

Format: PDF / Kindle (mobi) / ePub


The era of seemingly unlimited growth in processor performance is over: single-chip architectures can no longer overcome the performance limitations imposed by the power they consume and the heat they generate. Today, Intel and other semiconductor firms are abandoning the single fast processor model in favor of multi-core microprocessors, chips that combine two or more processors in a single package. In the fourth edition of Computer Architecture, the authors focus on this historic shift, increasing their coverage of multiprocessors and exploring the most effective ways of achieving parallelism as the key to unlocking the power of multiple-processor architectures. Additionally, the new edition has expanded and updated coverage of design topics beyond processor performance, including power, reliability, availability, and dependability.

CD System Requirements
PDF Viewer
The CD material includes PDF documents that you can read with a PDF viewer such as Adobe Acrobat or Adobe Reader. Recent versions of Adobe Reader for some platforms are included on the CD.

HTML Browser
The navigation framework on this CD is delivered in HTML and JavaScript. It is recommended that you install the latest version of your favorite HTML browser to view this CD. The content has been verified under Windows XP with the following browsers: Internet Explorer 6.0, Firefox 1.5; under Mac OS X (Panther) with the following browsers: Internet Explorer 5.2, Firefox 1.0.6, Safari 1.3; and under Mandriva Linux 2006 with the following browsers: Firefox 1.0.6, Konqueror 3.4.2, Mozilla 1.7.11.
The content is designed to be viewed in a browser window that is at least 720 pixels wide. You may find the content does not display well if your display is not set to at least 1024x768 pixel resolution.

Operating System
This CD can be used under any operating system that includes an HTML browser and a PDF viewer. This includes Windows, Mac OS, and most Linux and Unix systems.

Increased coverage of achieving parallelism with multiprocessors.

Case studies of the latest technology from industry, including the Sun Niagara multiprocessor, AMD Opteron, and Pentium 4.

Three review appendices, included in the printed volume, review the basic and intermediate principles the main text relies upon.

Eight reference appendices, collected on the CD, cover a range of topics including specific architectures, embedded systems, and application-specific processors; some are guest-authored by subject experts.

is attributable to more advanced architectural and organizational ideas. By 2002, this growth led to a difference in performance of about a factor of seven. Performance for floating-point-oriented calculations has increased even faster. Since 2002, the limits of power, available instruction-level parallelism, and long memory latency have slowed uniprocessor performance recently, to about 20% per year. Since SPEC has changed over the years, performance of newer machines is estimated by a scaling

requirement in an embedded application is real-time execution. A real-time performance requirement is when a segment of the application has an absolute maximum execution time. For example, in a digital set-top box, the time to process each video frame is limited, since the processor must accept and process the next frame shortly. In some applications, a more nuanced requirement exists: the average time for a particular task is constrained as well as the number of instances when some maximum time

execution time was in a single line (see SPEC [1989]). When an IBM compiler optimized this inner loop (using an idea called blocking, discussed in Chapter 5), performance improved by a factor of 9 over a prior version of the compiler! This benchmark tested compiler tuning and was not, of course, a good indication of overall performance, nor of the typical value of this particular optimization. Even after the elimination of this benchmark, vendors found methods to tune the performance of others by

MUL.D. The Value column indicates the value being held; the format #X is used to refer to a value field of ROB entry X. Reorder buffers 1 and 2 are actually completed, but are shown for informational purposes. We do not show the entries for the load-store queue, but these entries are kept in order. the ROB for both the effective address operand (R1 in this example) and the value (F4 in this example). Since we are only considering the floating-point pipeline, assume the effective address for the

3 Limits on Instruction-Level Parallelism

Processors are being produced with the potential for very many parallel operations on the instruction level. . . . Far greater extremes in instruction-level parallelism are on the horizon.
J. Fisher (1981), in the paper that inaugurated the term “instruction-level parallelism”

One of the surprises about IA-64 is that we hear no claims of high frequency, despite claims that an EPIC processor is less complex than a
