CSC/ECE 506 Fall 2007/wiki1 4 la

Update section 1.1.3: Architectural Trends

Microprocessor Design Trends

The textbook discusses how, up to 1986, advancements in microprocessors were dominated by bit-level parallelism. Designs started with 4-bit datapaths, followed by 8-bit, 16-bit, and 32-bit wide datapaths. In server design, 64-bit datapaths have been the norm since the start of the millennium. A 128-bit datapath is rarely used in general-purpose microprocessors. However, graphics processors have been using 128-bit and 256-bit wide datapaths, and an increase to 512-bit wide datapaths may come soon, especially with the advancements in computer graphics, animation, and gaming.
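
To make the benefit of a wider datapath concrete, here is a minimal C sketch (an illustration of my own, not taken from the textbook) that adds two 64-bit integers using only 32-bit operations, the way a 32-bit datapath must, and compares the result with a native 64-bit add that a 64-bit datapath completes in one operation.

<pre>
#include <stdint.h>
#include <stdio.h>

/* Add two 64-bit values using only 32-bit operations, as a 32-bit
 * datapath must: add the low halves first, then propagate the carry
 * into the high halves. */
static uint64_t add64_via_32bit(uint64_t a, uint64_t b)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

    uint32_t lo    = alo + blo;          /* first 32-bit add           */
    uint32_t carry = (lo < alo);         /* did the low add overflow?  */
    uint32_t hi    = ahi + bhi + carry;  /* second add, plus the carry */

    return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
    uint64_t a = 0x00000001FFFFFFFFULL, b = 3ULL;

    uint64_t wide   = a + b;                 /* one add on a 64-bit datapath */
    uint64_t narrow = add64_via_32bit(a, b); /* two adds plus carry handling */

    printf("wide:   %llx\n", (unsigned long long)wide);
    printf("narrow: %llx\n", (unsigned long long)narrow);
    return 0;
}
</pre>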

Instruction-level parallelism took off as advancements in bit-level parallelism receded. After all, the benefits of a wider datapath are limited to the ability to address more storage and the ability to do more in a single cycle. The latter benefit has largely been limited to more precise floating-point calculation, although some microprocessors have the ability to bundle a couple of instructions into one.
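
As a rough illustration of instruction-level parallelism (a sketch of my own, not the textbook's), the two C routines below compute the same array sum. The first forms a single dependence chain, so a superscalar core cannot overlap the additions; the second uses four independent accumulators, exposing work that an out-of-order, multi-issue core can keep in flight simultaneously. (Regrouping floating-point additions can change rounding slightly; the point here is only the dependence structure.)

<pre>
#include <stddef.h>

/* Every addition depends on the previous one: one long dependence
 * chain, so the adds cannot be overlapped. */
double sum_serial(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Four independent accumulators break the chain; a multi-issue,
 * out-of-order core can execute several of these adds in parallel. */
double sum_ilp(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}
</pre>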

The period spanning the 1980s and 1990s set the stage for the modern microprocessor. Superscalar microprocessors were created, encompassing branch predictors, out-of-order execution, deeper and larger levels of on-chip cache, cache-coherence protocols, and the ability to communicate with other microprocessors on the chip. Research done in the 1990s and early 2000s set the stage for the next level of parallelism to be exploited: thread-level parallelism.

Two technologies appeared in the 2000s that altered the microprocessor performance race. The first is multiple cores on a chip and the second is Simultaneous Multi-Threading (SMT), also known as Hyper-Threading. At this point the industry refrained from using clock speed as the performance metric, since a microprocessor now encompassed many more intertwined technologies than merely a faster clock cycle. The industry first saw two cores on a single chip, and then cores taking advantage of SMT; the number of cores and the number of hardware threads exploited in a microprocessor are ever increasing. Dual-core processors and dual-thread processors already exist, with the promise of merging both technologies so that each core can support two threads. Microprocessors with four and eight cores exist today, with sixteen cores on a single chip promised within a matter of months.
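
A minimal sketch of thread-level parallelism, assuming a POSIX system with pthreads (the thread count of four and the file name tlp_sum.c are arbitrary choices for illustration): each thread sums its own independent slice of an array, so the work can spread across cores and SMT hardware threads. Build with, for example, cc tlp_sum.c -pthread.

<pre>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4         /* e.g. one thread per core or SMT context */
#define N        1000000

static double data[N];

struct slice { size_t begin, end; double partial; };

/* Each thread sums its own slice; the slices are independent, so the
 * threads never need to coordinate until the final join. */
static void *worker(void *arg)
{
    struct slice *s = arg;
    double sum = 0.0;
    for (size_t i = s->begin; i < s->end; i++)
        sum += data[i];
    s->partial = sum;
    return NULL;
}

int main(void)
{
    for (size_t i = 0; i < N; i++)
        data[i] = 1.0;

    pthread_t    tid[NTHREADS];
    struct slice s[NTHREADS];

    for (int t = 0; t < NTHREADS; t++) {
        s[t].begin = (size_t)t * N / NTHREADS;
        s[t].end   = (size_t)(t + 1) * N / NTHREADS;
        pthread_create(&tid[t], NULL, worker, &s[t]);
    }

    double total = 0.0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += s[t].partial;
    }
    printf("total = %.0f\n", total);   /* expect 1000000 */
    return 0;
}
</pre>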

Clock Speed and Parallelism

In the PC world, throughout the 1990s and early 2000s, increasing chip clock speed was the standard way to increase system performance. Desktop processors topped 1 GHz in 2000, 2 GHz in 2001, and 3 GHz in 2002. But due to power demands and heat concerns, this trend has since been discontinued. Design obstacles, especially in laptop computers, meant that other methods had to be pursued in order to increase processing power without losing efficiency. The multi-core era was then introduced to the PC world: in the spring of 2005, dual-core chips were introduced by Intel and then by AMD. Quad-core processors have reached the market, and eight-core chips may hit the market by 2009.

In 2001, Intel released the Itanium microprocessor, which takes advantage of explicit instruction-level parallelism: the compiler decides which instructions to execute in parallel, allowing the processor to execute up to six instructions per clock cycle. Although the original (and several subsequent) Itanium processors contained a single core, in 2006 Intel released a dual-core Itanium microprocessor. The future of the Itanium family will follow the trend of most other microprocessors, in that thread-level parallelism will be exploited via multi-core chips.

Silicon Technologies

System Design Trends

Desktop Direction

The number of processors supported in a computer is ever increasing. Since the mid-2000s, supporting more than one processor in a desktop computer has increasingly become the norm (with laptops following closely behind). Intel and AMD are in a constant race to provide a stronger chip that delivers higher performance (through multiple cores) and higher bandwidth (through faster electrical signaling, wider datapaths, pipelined protocols, multiple paths, and software support).

Server Direction

Figure 1 shows the number of processors that have been supported on a shared bus this decade. A commonality between the technology appearing this decade and the last is that servers in both periods supported either single-core or dual-core microprocessors. The industry has been inching toward supporting 100 microprocessors on a single shared bus. Such an approach is bound to reach a dead end unless new levels of indirection are exploited. Indeed, new technologies have made supporting more microprocessors on a shared bus more feasible, among them multiple cores per chip, deeper levels of caching, and better addressing schemes. If a multi-core microprocessor is treated as a single node, the nodes communicate over the bus while each microprocessor arbitrates among its own cores, relieving the shared bus of this addressing strain (a toy model of this two-level arbitration follows Figure 1). With the constant improvements in multi-core support within a chip, it is possible that servers will reach over two hundred cores as soon as this decade.

http://upload.wikimedia.org/wikipedia/commons/3/32/Procs.JPG
Figure 1. Number of processors in fully configured commercial bus-based shared-memory multiprocessors.
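
The toy C model below (my own sketch of round-robin arbitration, not any actual bus protocol) illustrates the two-level scheme described above: the shared bus arbitrates only among a handful of nodes, and each multi-core chip privately decides which of its cores uses the grant, so the bus never has to track every individual core.

<pre>
#include <stdio.h>

#define NODES          4   /* multi-core chips sharing the bus         */
#define CORES_PER_NODE 4   /* cores behind each chip's private arbiter */

int main(void)
{
    int next_core[NODES] = {0};   /* per-chip round-robin pointer */

    for (int cycle = 0; cycle < 8; cycle++) {
        int node = cycle % NODES;        /* bus-level grant: pick a node  */
        int core = next_core[node];      /* chip-level grant: pick a core */
        next_core[node] = (core + 1) % CORES_PER_NODE;

        printf("cycle %d: bus -> node %d, node %d -> core %d\n",
               cycle, node, node, core);
    }
    return 0;
}
</pre>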

A different class of servers is emerging that is neither an SMP nor a cluster: ccNUMA, or Cache-Coherent Non-Uniform Memory Access. Such servers provide better access time to local memory, while the different copies of the same data are kept up to date through cache-coherence protocols. This technology is being supported by Intel and AMD; another server manufacturer supporting it is SGI, whose Origin 350 server supports up to 32 microprocessors.
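
As a hedged example of exploiting locality on a ccNUMA machine, the sketch below assumes a Linux system with the libnuma library installed (build with cc numa_local.c -lnuma; the file name, buffer size, and node number are arbitrary illustrations). It simply allocates a buffer on NUMA node 0 so that a thread running on that node gets local, faster accesses, while threads on other nodes would pay the remote-access penalty that ccNUMA makes visible.

<pre>
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    size_t size = 64 * 1024 * 1024;      /* 64 MiB, arbitrary */

    /* Place the buffer on node 0; the cache-coherence protocol keeps
     * any cached copies consistent, but physical placement decides
     * whether accesses are local or remote. */
    char *buf = numa_alloc_onnode(size, 0);
    if (buf == NULL) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    memset(buf, 0, size);                /* touch the pages to commit them */
    printf("64 MiB placed on NUMA node 0 (%d nodes visible)\n",
           numa_max_node() + 1);

    numa_free(buf, size);
    return 0;
}
</pre>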

Figures

http://upload.wikimedia.org/wikipedia/commons/5/5e/Bandwidth.JPG
Figure 2. Bandwidth of the shared memory bus in commercial multiprocessors.

References

http://www.endian.net/details.aspx?ItemNo=655
http://www-05.ibm.com/se/news/sv/2007/05/power-timeline.html
http://www-03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power/ppc_arch.html
http://www.theinquirer.net/?article=9235
http://www.sun.com/processors/
http://www.sgi.com/products/remarketed/offering.html