CSC/ECE 506 Spring 2011/ch6a jp: Difference between revisions
Revision as of 23:05, 26 February 2011
Cache Hierarchy
In a simple computer model, the processor reads data and instructions from memory and operates on the data. CPU operating frequencies have increased faster than the speed of memory and memory interconnects. For example, cores in Intel's first-generation i7 processors run at 3.2 GHz, while the memory runs at only 1.3 GHz. Multi-core architectures have also put more demand on memory bandwidth. Together these raise the effective latency of a memory access as seen by the core, so the CPU can spend much of its time idle, waiting for data. As a result, memory became a performance bottleneck.
To solve this problem, the "cache" was invented. A cache is temporary, volatile storage like primary memory, but it runs at a speed close to the core frequency. The CPU can access data and instructions in the cache in a few clock cycles, whereas an access to main memory can take more than 50 cycles. In the early days of computing, the cache was implemented as a standalone chip outside the processor. In today's processors, the cache is implemented on the same die as the core.
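The benefit of those cycle counts can be estimated with the usual average-access-time formula (hit time plus miss rate times miss penalty). The sketch below is only illustrative: the 3-cycle hit time and the 95% hit rate are assumed values, not figures from this article; only the 50-cycle memory penalty comes from the text above.

```python
def average_access_cycles(hit_cycles, miss_penalty_cycles, hit_rate):
    """Average cycles per access: hit time + miss rate * miss penalty."""
    miss_rate = 1.0 - hit_rate
    return hit_cycles + miss_rate * miss_penalty_cycles


# Assumed values: 3-cycle cache hit, 95% hit rate (both hypothetical).
# The 50-cycle main-memory penalty is the figure quoted in the text.
with_cache = average_access_cycles(hit_cycles=3, miss_penalty_cycles=50, hit_rate=0.95)
without_cache = 50  # every access goes to main memory

print(f"with cache:    about {with_cache:.1f} cycles per access")
print(f"without cache: {without_cache} cycles per access")
```

Under these assumptions the average access costs roughly 5.5 cycles instead of 50, which is why even a small cache with a high hit rate hides most of the memory latency.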
There can be multiple levels of cache, each level farther from the core and larger than the one before it. L1 is closest to the CPU and is therefore the fastest to access. Beyond L1 is the L2 cache, and then L3. The L1 cache is divided into an instruction cache and a data cache. This is preferable to a single larger combined cache because the instruction cache, being read-only, is simple to implement, while the data cache must support both reads and writes.
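The lookup order through such a hierarchy can be sketched as a small simulation. This is a toy model under stated assumptions, not a description of any real processor: the `CacheLevel` class, the `access` function, the capacities, the per-level latencies and the LRU replacement are all hypothetical choices made for illustration, and the split between instruction and data caches is ignored.

```python
from collections import OrderedDict


class CacheLevel:
    """One cache level holding a fixed number of lines, with LRU replacement."""

    def __init__(self, name, capacity, latency):
        self.name = name
        self.capacity = capacity      # number of lines (hypothetical)
        self.latency = latency        # cycles to probe this level (hypothetical)
        self.lines = OrderedDict()    # address -> True, oldest entry first

    def lookup(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            return True
        return False

    def fill(self, addr):
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line


def access(levels, addr, memory_latency=50):
    """Probe L1, then L2, then L3, then memory; return (where found, cycles)."""
    cycles = 0
    for i, level in enumerate(levels):
        cycles += level.latency
        if level.lookup(addr):
            for closer in levels[:i]:   # copy the line into the closer levels
                closer.fill(addr)
            return level.name, cycles
    cycles += memory_latency
    for level in levels:                # miss everywhere: fill every level
        level.fill(addr)
    return "memory", cycles


levels = [CacheLevel("L1", 2, 3), CacheLevel("L2", 4, 10), CacheLevel("L3", 8, 30)]
print(access(levels, 0x100))  # cold miss, served by memory
print(access(levels, 0x100))  # now served by L1
```

Each level closer to the core is smaller and cheaper to probe, so a line that has been evicted from L1 may still be found in L2 at a moderate cost rather than requiring a full trip to memory.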
Cache Write Policies
Write hit policies
Write miss policies
Generic Policies
Prefetching
Advantages
Disadvantages
Effectiveness
Stream Buffer Prefetching
Prefetching in Parallel Computing
References