CSC 456 Fall 2013/1c wa: Difference between revisions

From Expertiza_Wiki

Revision as of 20:00, 3 October 2013

Trends in cache size and organization


Introduction

Cache size has grown over the years alongside the evolution of the microprocessor. Intuitively, one would expect cache sizes to keep growing, following some law similar to Moore’s Law. In practice, however, L1 cache sizes have all but maxed out for an individual processor. The trend in cache growth shows that some processor lines stopped growing from one generation to the next and in some cases even shrank. Cache associativity has likewise varied over the years. While no cache organization is optimal for every situation, certain organizations perform better for most tasks on certain systems. This wiki analyzes data on cache sizes and associativities to gain some insight into the trends and the reasoning behind vendors’ choices of cache size and organization, specifically from the late 80’s / early 90’s to the early 2000’s.


Cache Associativity

This table shows cache associativities found in some mainstream processors from the late 80’s to the early 2000’s, with one processor from 1968 for reference. As the data show, processors of the late 80’s and early 90’s tended towards set-associative caches of around four ways. In the mid-90’s the trend moved towards lower associativity and direct mapping. Then in the late 90’s and early 2000’s it swung back towards higher associativity, with larger set sizes again.


L1, L2, L3 Associativity

System | Year | L1 Associativity | L2 Associativity | L3 Associativity | Notes
IBM 360/85 | 1968 | Sector | N/A | N/A | First processor with a cache, clock speed 12.5 MHz
Intel 80486 | 1989 | 4-way | N/A | N/A |
SuperSPARC | 1992 | 4-way & 5-way | N/A | N/A | Used to render Toy Story, core @ 40 MHz
Alpha 21064 (DEC) | 1992 | Direct | Direct | N/A |
UltraSPARC | 1995 | 2-way & direct | Direct | N/A | 64-bit, core @ 200 MHz
Alpha 21164 (DEC) | 1995 | Direct | 3-way | N/A |
Pentium Pro | 1995 | 2-way & 4-way | ? | N/A | First on-die L2
K6-III | 1999 | 2-way | 4-way | N/A |
Pentium 4 | 10/2000 | 4-way | 8-way | N/A |
UltraSPARC III | 2001 | 4-way | N/A | N/A |
Itanium 2 | 2002 | 4-way | 8-way | 12-way |
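
To make the associativity column more concrete, the following is a minimal sketch of how cache size, associativity and line size together determine the number of sets and how an address is split into tag, index and offset. The function names are hypothetical, and the 16-byte line size is an assumption chosen for illustration because the table above does not list line sizes. The sketch also assumes that the line size and the number of sets are powers of two.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes line_bytes and the resulting number of sets are powers of two.
    A direct-mapped cache is simply the case ways == 1.
    """
    lines = size_bytes // line_bytes       # total cache lines
    sets = lines // ways                   # each set holds `ways` lines
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, set index, byte offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# 8 KB, 4-way cache (the size and associativity listed for the Intel 80486),
# with an assumed 16-byte line.
four_way = cache_geometry(8 * 1024, 4, 16)
# The same capacity organized as a direct-mapped cache.
direct = cache_geometry(8 * 1024, 1, 16)
example = split_address(0x12345, four_way[1], four_way[2])
```

For the same capacity, raising the associativity lowers the number of sets and so shortens the index, while each lookup has to compare more tags in parallel.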





Cache Size

Looking at this table, we can see a general progression towards larger caches across all manufacturers. However, within a single manufacturer’s line, cache size sometimes decreases from one generation to the next. For instance, in 1992 the SuperSPARC had a 16+20 KB L1 cache, while in 1995 the UltraSPARC had only a 16+16 KB L1 cache; its L2 cache capacity increased, however. More recently, from 2008 to 2011 Intel decreased the L1 cache size of its Core line of processors from 64 KB in the Nehalem architecture to 32 KB in the Sandy Bridge line. In this instance the L2 cache stayed the same at 256 KB, but the maximum L3 capacity increased.



L1, L2, L3 Size by Year

Processor | System Type | Year | L1 size | L2 size | L3 size
IBM 360/85 | Mainframe | 1968 | 16 to 32 KB | |
PDP-11/70 | Minicomputer | 1975 | 1 KB | |
VAX 11/780 | Minicomputer | 1978 | 16 KB | |
IBM 3033 | Mainframe | 1978 | 64 KB | |
IBM 3090 | Mainframe | 1985 | 128 to 256 KB | |
Intel 80486 | PC | 1989 | 8 KB | |
SuperSPARC | PC | 1992 | 16 KB/20 KB | 0 to 2 MB |
Pentium | PC | 1993 | 8 KB/8 KB | 256 to 512 KB |
PowerPC 601 | PC | 1993 | 32 KB | |
UltraSPARC | PC | 1995 | 16 KB/16 KB | 512 KB to 4 MB |
Pentium Pro | PC | 1995 | 8 KB/8 KB | 256 KB to 1 MB |
PowerPC 620 | PC | 1996 | 32 KB/32 KB | |
PowerPC G4 | PC/server | 1999 | 32 KB/32 KB | 256 KB to 1 MB | 2 MB
IBM S/390 G4 | Mainframe | 1997 | 32 KB | 256 KB | 2 MB
IBM S/390 G6 | Mainframe | 1999 | 256 KB | 8 MB |
Pentium 4 | PC/server | 2000 | 8 KB/8 KB | 256 KB |
IBM SP | High-end server | 2000 | 64 KB/32 KB | 8 MB |
CRAY MTAb | Supercomputer | 2000 | 8 KB | 2 MB |
UltraSPARC III | PC | 2001 | 32 KB/64 KB | 2 to 8 MB |
Itanium | PC/server | 2001 | 16 KB/16 KB | 96 KB | 4 MB
SGI Origin 2001 | High-end server | 2001 | 32 KB/32 KB | 4 MB |
Itanium 2 | PC/server | 2002 | 32 KB | 256 KB | 6 MB
IBM POWER5 | High-end server | 2003 | 64 KB | 1.9 MB | 36 MB
CRAY XD-1 | Supercomputer | 2004 | 64 KB/64 KB | 1 MB |
Nehalem (i5, i7, Xeon) | PC, server | 2008 | 64 KB per core | 256 KB per core | 4 MB to 12 MB total
Sandy Bridge (i3 to i7, Pentium) | PC | 2011 | 32 KB per core | 256 KB per core | 1 MB to 20 MB total




Main Memory Specs

DRAM Memory Standards:

Standard | Mem Clock | Cycle Time | I/O Bus Clock | Module Name | Peak Transfer Rate | Prefetch | Latency | Year
DDR-333 | 166 MHz | 6 ns | | | | | |
DDR2-400 | 100 MHz | 10 ns | 200 MHz | PC2-3200 | 3200 MB/s | 4n | 4-6 bus clock cycles | 2003
DDR2-533 | 133 MHz | 7.5 ns | 266 MHz | PC2-4200 | 4266 MB/s | 4n | |
DDR2-667 | 166 MHz | 6 ns | 333 MHz | PC2-5300 | 5333 MB/s | 4n | |
DDR2-800 | 200 MHz | 5 ns | 400 MHz | PC2-6400 | 6400 MB/s | 4n | |
DDR2-1066 | 266 MHz | 3.75 ns | 533 MHz | PC2-8500 | 8533 MB/s | 4n | |
DDR3-800 | 100 MHz | 10 ns | 400 MHz | PC3-6400 | 6400 MB/s | 8n | 5-9 ns (7 avg.) | 2007
DDR3-1066 | 133 MHz | 7.5 ns | 533 MHz | PC3-8500 | 8533 MB/s | 8n | |
DDR3-1333 | 166 MHz | 6 ns | 667 MHz | PC3-10600 | 10667 MB/s | 8n | |
DDR3-1600 | 200 MHz | 5 ns | 800 MHz | PC3-12800 | 12800 MB/s | 8n | |
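
The columns of the DRAM table are related by simple arithmetic, which the sketch below tries to reproduce. It assumes a 64-bit (8-byte) module data bus and two transfers per I/O bus clock, and it derives the I/O bus clock from the memory clock and the prefetch depth. The function name and parameter names are hypothetical and chosen only for this illustration.

```python
def ddr_timing(mem_clock_mhz, prefetch_n, bus_width_bytes=8):
    """Return (cycle_time_ns, io_bus_clock_mhz, peak_mb_per_s).

    Assumptions: double data rate (2 transfers per I/O clock) and a
    64-bit module, i.e. 8 bytes moved per transfer.
    """
    cycle_time_ns = 1000.0 / mem_clock_mhz         # period of the memory clock
    io_clock_mhz = mem_clock_mhz * prefetch_n / 2  # 4n -> 2x, 8n -> 4x
    transfers_mt_s = io_clock_mhz * 2              # two transfers per I/O clock
    peak_mb_s = transfers_mt_s * bus_width_bytes
    return cycle_time_ns, io_clock_mhz, peak_mb_s


ddr2_400 = ddr_timing(100, 4)    # DDR2-400 row
ddr3_1600 = ddr_timing(200, 8)   # DDR3-1600 row
```

Rows listed with rounded clocks such as 133 MHz or 166 MHz correspond to 133.33 MHz and 166.67 MHz, so the computed peak rate matches the table only when the unrounded clock is used.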


Conclusion

Theory: Cache associativity decreased as caches became larger because searching every way of a large cache on each access became too expensive. Also, the bigger the cache is as a percentage of main memory, the less need there is for associativity. But while caches and main memory have both grown, main memory grew faster in the 2000’s. When the ratio of cache size to main memory size goes down, associativity needs to increase.
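
The interaction between size and associativity in this theory can be illustrated with a toy simulation. The following is a minimal sketch, not a model of any processor in the tables: it assumes LRU replacement and a 16-byte line, and the function name and the access trace are hypothetical.

```python
from collections import OrderedDict


def count_misses(addresses, size_bytes, ways, line_bytes=16):
    """Count misses for an address trace in a set-associative LRU cache."""
    sets = size_bytes // line_bytes // ways
    # One ordered dict per set; key order tracks recency (oldest first).
    cache = [OrderedDict() for _ in range(sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        index = line % sets
        tag = line // sets
        lines_in_set = cache[index]
        if tag in lines_in_set:
            lines_in_set.move_to_end(tag)         # hit: mark most recently used
        else:
            misses += 1
            if len(lines_in_set) >= ways:
                lines_in_set.popitem(last=False)  # evict least recently used
            lines_in_set[tag] = True
    return misses


# Two addresses exactly 8 KB apart, accessed alternately ten times each.
trace = [0, 8192] * 10
small_direct = count_misses(trace, 8 * 1024, ways=1)   # both map to one line
small_two_way = count_misses(trace, 8 * 1024, ways=2)  # both fit in one set
large_direct = count_misses(trace, 16 * 1024, ways=1)  # no longer conflict
```

In this trace the small direct-mapped cache misses on every access, while either doubling the associativity or doubling the capacity leaves only the two compulsory misses. That is the sense in which a larger cache has less need for associativity.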


The Pentium Pro (1995) was the first processor to have the L2 cache on the processor chip. Before this, the L2 cache was an optional add-on to the motherboard. [1]

Systems to consider in table

Pentium
AMD
MIPS
Sun Microsystems: SPARC
IBM: PowerPC
DEC: Alpha

Penalty to get to main memory: < 100 before 2000; after 2000 it started to increase
< 20: 1 level is fine
<= 100: 2 levels
>= 200: 3 levels

Miss rate reported, SPEC benchmarks
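
The penalty thresholds in these notes can be written as a small lookup. This is only a sketch of the rule of thumb as noted: the function name is hypothetical, the notes do not state a unit for the penalty, and they give no rule for penalties between 100 and 200, so the sketch returns None in that range rather than guessing.

```python
def suggested_cache_levels(miss_penalty):
    """Map a main-memory miss penalty to the number of cache levels.

    Thresholds are taken from the notes; the unit is whatever the notes
    intend (not stated). Returns None where the notes give no rule.
    """
    if miss_penalty < 20:
        return 1
    if miss_penalty <= 100:
        return 2
    if miss_penalty >= 200:
        return 3
    return None  # 100 < penalty < 200 is not covered by the notes


levels = [suggested_cache_levels(p) for p in (10, 100, 150, 250)]
```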



References

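Itanium Specs (p. 20): http://download.intel.com/design/itanium2/manuals/25111003.pdf
Cache Evolution: http://www.chips.5u.com/idxhst.html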
Intel Processors
First on-board L1
Cache Trend Table
Sector Caches
DDR2/3 Speeds
Memory Wall
Pentium Pro Specs