CSC 456 Fall 2013/1d vb: Difference between revisions
Latest revision as of 16:24, 17 September 2013
Trends in Pipelining
Introduction
Computing architectures have changed greatly over the relatively short span of a few decades. In pursuit of higher performance at economical cost, processor architectures have taken many forms. Over the years there have been many trends in specific processor characteristics, such as pipeline length. Some changes were driven by the technological limitations of the time, while others were based on theoretical and real-world performance research.
Factors Favoring Longer Pipeline Length
With each tick of the clock, the pipeline advances by one stage. A longer pipeline divides the same work into more stages, so each individual stage does less work and takes less time. Because the clock period only has to be as long as the slowest stage, shorter stages allow a much higher clock speed.
There is a secondary effect of longer pipelines as well. The resulting higher clock speed can also be used as a marketing point. The average user does not understand the metrics of raw processor power, but comparing two numbers such as 2.9 GHz vs. 3.4 GHz is a simple way in which many attempt to understand different processors.
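The relationship between stage count and clock speed can be sketched with a toy model. It assumes the total logic delay per instruction is fixed and divides evenly across the stages, and it ignores latch overhead; the function names and the 10 ns figure are hypothetical and chosen only for illustration.

```python
def cycle_time_ns(total_logic_ns: float, stages: int) -> float:
    """Clock period if the total logic delay is split evenly across the stages."""
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return total_logic_ns / stages


def clock_ghz(total_logic_ns: float, stages: int) -> float:
    """Clock frequency in GHz, which is 1 / (period in ns)."""
    return 1.0 / cycle_time_ns(total_logic_ns, stages)


# Hypothetical design with 10 ns of logic per instruction.
for stages in (5, 10, 20):
    print(stages, "stages ->", clock_ghz(10.0, stages), "GHz")
```

Under these assumptions the clock frequency scales linearly with the number of stages, which is the effect described above. Real designs cannot split logic perfectly evenly, so the gain is smaller in practice.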
Factors Favoring Shorter Pipeline Length
The main issue with increased pipeline length is the cost of incorrect branch predictions. The longer a pipeline is, the more stages of processing are wasted when a branch is mispredicted and the pipeline must be flushed. Decreasing the pipeline length results in lower clock frequencies but equal or better IPC (instructions per cycle). A shorter pipeline loses less work on every bad prediction, which can improve overall performance. As with all processor properties, there is no simple "best" pipeline; there is always a bell curve pointing to the most effective pipeline length for a given setup.
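The effect of mispredictions on IPC can be illustrated with a simple cycles-per-instruction estimate. This is a sketch under stated assumptions: a base CPI of 1, a branch that resolves in the last stage so that a misprediction costs `stages - 1` cycles, and hypothetical branch and misprediction rates.

```python
def effective_cpi(stages: int, branch_fraction: float,
                  mispredict_rate: float, base_cpi: float = 1.0) -> float:
    """Average cycles per instruction including misprediction flushes.

    Assumes a mispredicted branch wastes (stages - 1) cycles.
    """
    penalty_cycles = stages - 1
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def effective_ipc(stages: int, branch_fraction: float,
                  mispredict_rate: float, base_cpi: float = 1.0) -> float:
    """Instructions per cycle, the reciprocal of the effective CPI."""
    return 1.0 / effective_cpi(stages, branch_fraction, mispredict_rate, base_cpi)


# Hypothetical workload: 20% branches, 10% of them mispredicted.
for stages in (10, 20, 30):
    print(stages, "stages -> IPC", round(effective_ipc(stages, 0.2, 0.1), 3))
```

With these numbers the IPC drops as the pipeline gets deeper, because each bad prediction throws away more in-flight work.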
Latch delays also play a role when pipeline length is increased. Problems occur when the number of stages is increased beyond the point at which about 90% of the optimum performance has been reached. Performance does not increase past this point unless the latch overhead time is addressed, which requires that the increasable latch overhead time be less than the total overhead time. [5]
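The trade-off between a faster clock, latch overhead, and flush penalties can be sketched by combining them into a time-per-instruction estimate and scanning for the best depth. This is a simplified model for illustration, not the formula from the cited papers; `hazard_rate` (the fraction of instructions that cause a full flush) and all numeric values are assumptions.

```python
def time_per_instruction_ns(stages: int, total_logic_ns: float,
                            latch_overhead_ns: float, hazard_rate: float) -> float:
    """Average time per instruction for a pipeline with the given depth.

    cycle time = logic delay per stage + fixed latch overhead per stage
    CPI        = 1 + hazard_rate * (stages - 1)   (flush cost grows with depth)
    """
    cycle_ns = total_logic_ns / stages + latch_overhead_ns
    cpi = 1.0 + hazard_rate * (stages - 1)
    return cycle_ns * cpi


def best_depth(total_logic_ns: float, latch_overhead_ns: float,
               hazard_rate: float, max_stages: int = 100) -> int:
    """Depth in 1..max_stages that minimizes the time per instruction."""
    return min(
        range(1, max_stages + 1),
        key=lambda s: time_per_instruction_ns(
            s, total_logic_ns, latch_overhead_ns, hazard_rate),
    )


# Hypothetical design: 10 ns of logic, 10% of instructions cause a flush.
print("optimum with small latch overhead:", best_depth(10.0, 0.225, 0.1))
print("optimum with large latch overhead:", best_depth(10.0, 0.9, 0.1))
```

In this model, performance first improves and then degrades as stages are added, and a larger latch overhead moves the optimum toward a shorter pipeline.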
Power Consumption vs. Performance
Along with performance, power is another concern in microarchitectural design. There have been multiple studies on the optimal pipeline depth when considering both performance and power. Srinivasan stated that the majority of the power used is related to latches, including clocking and the leakage of power per latch. According to Srinivasan, the number of latches grows superlinearly with the number of pipeline stages. The overall power/performance is improved with the number of pipeline stages, as illustrated in the graph.
[Figure: Pipeline_stage_vs_metric.png]
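Superlinear latch growth can be illustrated with a power-law sketch. The exponent of 1.1 and the per-stage latch count are assumptions chosen for illustration and are not taken from the text above; any exponent greater than 1 gives the same qualitative behavior.

```python
def latch_count(stages: int, latches_per_stage: float = 1000.0,
                growth_exponent: float = 1.1) -> float:
    """Estimated number of latches for a pipeline of the given depth.

    A growth_exponent above 1.0 models superlinear growth;
    exactly 1.0 would be plain linear growth.
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return latches_per_stage * stages ** growth_exponent


# Doubling the depth more than doubles the latch count when the exponent is > 1.
ratio = latch_count(20) / latch_count(10)
print("latch growth when doubling depth:", round(ratio, 3))
```

Since latch power also scales with clock frequency, a deeper and faster pipeline pays for its latches twice: there are more of them, and each is clocked more often.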
Examples of Pipeline Changes in Different Processors
Year | Name | Pipeline Length | Number of Pipelines
---|---|---|---
1976 | Cray 1 | 3 [2] | 12
2006 | IBM Cell BE | 23 [6] | 16
2012 | Cray XK7 | 12 for scalar, 17 for vector [3] | 6
A visual example of the Cell pipeline: http://www.ibm.com/developerworks/power/library/pa-cellperf/figure2.gif
Sources
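1. The AMD Opteron Processor for Multiprocessor Servers. Chetana N. Keltcher, Kevin J. McGrath, Ardsher Ahmed, Pat Conway. http://classes.soe.ucsc.edu/cmpe202/Fall04/papers/opteron.pdf
2. Cray 1. Wikipedia. http://en.wikipedia.org/wiki/Cray-1
3. Cray XK7. Wikipedia. http://en.wikipedia.org/wiki/XK7
4. The Optimum Pipeline Depth for a Microprocessor. A. Hartstein, Thomas R. Puzak. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.4333&rep=rep1&type=pdf
5. An Analysis about Increasable Latch Overhead Time for Processor Pipeline Depth Increase. Magoshi & Murakami. http://www.readcube.com/articles/10.1002/ecjc.20127?locale=en
6. Cell Broadband Engine Architecture. Dr. Thomas Chen, Dr. Ram Raghavan, Jason Dale, Eiji Iwata. http://www.ibm.com/developerworks/power/library/pa-cellperf/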