CSC/ECE 506 Spring 2011/ch6a jp

From Expertiza_Wiki
Revision as of 23:09, 26 February 2011 by Pmpatel (talk | contribs)

Cache Hierarchy

In a simple computer model, the processor reads data and instructions from memory and operates on the data. CPU operating frequencies have increased faster than the speed of memory and memory interconnects. For example, cores in Intel's first-generation i7 processors run at 3.2 GHz, while the memory runs at only 1.3 GHz. Multi-core architectures also place more demand on memory bandwidth. Together, these increase memory access latency relative to the core clock, so the CPU would sit idle for much of the time while waiting for memory. As a result, memory became a performance bottleneck.

To solve this problem, caches were introduced. A cache is temporary, volatile storage like primary memory, but it runs at a speed close to the core frequency. The CPU can access data and instructions in the cache in a few clock cycles, while accessing main memory can take more than 50 cycles. In the early days of computing, the cache was implemented as a standalone chip outside the processor. In today’s processors, the cache is implemented on the same die as the core.
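The effect of these latencies can be estimated with a simple average access time calculation. The sketch below is only an illustration: the function name, the 3-cycle cache latency, and the hit rates are assumptions chosen for the example, and the 50-cycle memory latency is taken as the lower bound mentioned above.

```python
def average_access_cycles(hit_rate, cache_cycles, memory_cycles):
    """Average cycles per access for one cache level in front of main memory.

    Assumes every access pays the cache lookup and every miss additionally
    pays the full main-memory latency.
    """
    miss_rate = 1.0 - hit_rate
    return cache_cycles + miss_rate * memory_cycles


if __name__ == "__main__":
    # Hypothetical numbers: 3-cycle cache, 50-cycle main memory.
    for hit_rate in (0.50, 0.90, 0.99):
        cycles = average_access_cycles(hit_rate, cache_cycles=3, memory_cycles=50)
        print(f"hit rate {hit_rate:.2f}: {cycles:.1f} cycles per access")
```

Under these assumed numbers, the average cost moves toward the cache latency as the hit rate rises, which is why even a small cache can hide most of the memory latency.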

There can be multiple levels of cache, each one farther from the core and larger than the level before it. L1 is closest to the CPU and, as a result, the fastest to access. Next to L1 is the L2 cache, and then L3. The L1 cache is divided into an instruction cache and a data cache. This is preferable to a single larger combined cache because the instruction cache is read-only and therefore simpler to implement, while the data cache must support both reads and writes.

Cache Write Policies

Write hit policies

Write-through

Also known as store-through, this policy writes to main memory every time a write is performed to the cache.

Write-back

Also known as store-in or copy-back, this policy writes only to the cache on a write hit; main memory is updated only when a modified block is evicted from the cache.
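The difference between the two write hit policies shows up in how many writes reach main memory. The following is a minimal sketch, not a model of any real processor: `TinyCache` and `count_memory_writes` are hypothetical names, blocks are one word wide, the cache is direct-mapped, and a line is assumed to be allocated on every write miss so that only the hit policy differs.

```python
class TinyCache:
    """Direct-mapped cache of one-word blocks that counts main-memory writes."""

    def __init__(self, num_lines, write_back):
        self.num_lines = num_lines
        self.write_back = write_back
        self.lines = {}            # line index -> (address, dirty flag)
        self.memory_writes = 0

    def write(self, address):
        index = address % self.num_lines
        resident = self.lines.get(index)
        if resident is not None and resident[0] != address:
            self._evict(index)
        if self.write_back:
            # Write-back: update only the cache and mark the block dirty.
            self.lines[index] = (address, True)
        else:
            # Write-through: update the cache and main memory together.
            self.lines[index] = (address, False)
            self.memory_writes += 1

    def _evict(self, index):
        _, dirty = self.lines.pop(index)
        if dirty:
            # A dirty block is copied to memory only when it leaves the cache.
            self.memory_writes += 1


def count_memory_writes(addresses, write_back):
    """Replay a list of write addresses and return the memory write count."""
    cache = TinyCache(num_lines=4, write_back=write_back)
    for address in addresses:
        cache.write(address)
    return cache.memory_writes


if __name__ == "__main__":
    trace = [0] * 10 + [4]   # ten writes to one block, then a conflicting block
    print("write-through:", count_memory_writes(trace, write_back=False))
    print("write-back:   ", count_memory_writes(trace, write_back=True))
```

In this trace, repeated writes to the same block are absorbed by the write-back cache until the conflicting write to address 4 forces the eviction, while the write-through cache sends every write to memory.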

Write miss policies

Write-allocate vs no-write-allocate

When a write misses in the cache, a cache line may or may not be allocated for the block being written. With write-allocate, a line is allocated for the written data; this policy is typically associated with write-back caches. With no-write-allocate, no line is allocated, and the write goes to the next lower level of the memory hierarchy.
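One way to see the effect of the allocation decision is to count hits on a short access trace. This is a simplified sketch under stated assumptions: `run_trace` is a hypothetical helper, the cache is modelled as an unbounded set of resident block addresses (no evictions), and read misses are assumed to always allocate.

```python
def run_trace(trace, write_allocate):
    """Return the number of hits for a trace of ('r' or 'w', address) accesses."""
    resident = set()   # block addresses currently held in the cache
    hits = 0
    for op, address in trace:
        if address in resident:
            hits += 1
        elif op == "r" or write_allocate:
            # Read misses always allocate in this sketch; write misses
            # allocate only under write-allocate.
            resident.add(address)
    return hits


if __name__ == "__main__":
    trace = [("w", 0x10), ("r", 0x10), ("w", 0x20), ("w", 0x20)]
    print("write-allocate hits:   ", run_trace(trace, write_allocate=True))
    print("no-write-allocate hits:", run_trace(trace, write_allocate=False))
```

With write-allocate, the read that follows the first write finds the block in the cache; with no-write-allocate, written blocks stay out of the cache until they are read.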

Fetch-on-write vs no-fetch-on-write

With fetch-on-write, every write miss causes the block to be fetched from the next lower level of the memory hierarchy. With no-fetch-on-write, the block is not fetched on a write miss.

Write-before-hit vs no-write-before-hit

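With write-before-hit, data is written to the cache before the tag check has determined whether the access is a hit. On a miss, the write therefore overwrites data belonging to the block already in that line. With no-write-before-hit, the cache is written only after the tag check completes.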

Combination Policies

Write-validate

Write-validate is a combination of no-fetch-on-write and write-allocate. On a miss, it allows a partial line to be written to the cache without fetching the rest of the block, which improves performance. It also works with machines that have various line sizes and adds no instruction execution overhead to the running program.
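Writing a partial line implies that the cache can tell which parts of the line hold valid data. The sketch below assumes one valid bit per word for that purpose; the names `Line`, `write_validate`, `read_word`, and the 4-word line size are illustrative assumptions rather than details given above.

```python
WORDS_PER_LINE = 4   # assumed line size for the example


class Line:
    """One cache line with a valid bit per word (an assumption of this sketch)."""

    def __init__(self, tag):
        self.tag = tag
        self.data = [None] * WORDS_PER_LINE
        self.valid = [False] * WORDS_PER_LINE


def write_validate(line, tag, word, value):
    """Write one word and return the line that now holds it."""
    if line is None or line.tag != tag:
        # Write miss: allocate the line (write-allocate) without fetching the
        # old block (no-fetch-on-write), so every word starts out invalid.
        line = Line(tag)
    line.data[word] = value
    line.valid[word] = True
    return line


def read_word(line, tag, word):
    """Return the word if it is valid in the cache, else None (a miss)."""
    if line is not None and line.tag == tag and line.valid[word]:
        return line.data[word]
    return None


if __name__ == "__main__":
    line = write_validate(None, tag=7, word=2, value=99)
    print(read_word(line, 7, 2))   # the word just written is readable
    print(read_word(line, 7, 0))   # other words in the line are still invalid
```

Only the written word is marked valid, so a later read of a different word in the same line is still treated as a miss.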

Write-invalidate

Write-invalidate is a combination of write-before-hit, no-fetch-on-write, and no-write-allocate. On a write miss, the line that was written is invalidated.

Write-around

Write-around is a combination of no-fetch-on-write, no-write-allocate, and no-write-before-hit. It uses a non-blocking write scheme: on a write miss, the data is written to the next lower level of the hierarchy without modifying the contents of the cache line.
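The contrast between write-invalidate and write-around on a write miss can be sketched with a single cache line. This is an illustrative model only: `handle_write` is a hypothetical function, the line is a plain dictionary, and the cache is assumed to be write-through, so the data is always passed to the next level.

```python
def handle_write(line, tag, value, policy, next_level):
    """Write `value` for block `tag` through one direct-mapped line.

    `line` is a dict with 'tag', 'valid' and 'data'; `next_level` is a dict
    standing in for the next lower level. Returns True on a hit.
    """
    hit = line["valid"] and line["tag"] == tag
    if policy == "write-invalidate":
        # Write-before-hit: the data is written before the tag check completes.
        line["data"] = value
        if not hit:
            # The line now holds data that does not match its tag, so drop it.
            line["valid"] = False
    elif policy == "write-around":
        # No-write-before-hit: the tag is checked first, and a miss leaves
        # the resident line untouched.
        if hit:
            line["data"] = value
    else:
        raise ValueError(f"unknown policy: {policy}")
    # Assumed write-through behaviour: the write always reaches the next level.
    next_level[tag] = value
    return hit


if __name__ == "__main__":
    for policy in ("write-invalidate", "write-around"):
        line = {"tag": 1, "valid": True, "data": "old"}
        next_level = {}
        handle_write(line, tag=2, value="new", policy=policy, next_level=next_level)
        print(policy, line, next_level)
```

After a miss, write-around leaves the old block usable in the cache, whereas write-invalidate loses it because the line was overwritten before the miss was detected.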

Prefetching

Advantages

Disadvantages

Effectiveness

Stream Buffer Prefetching

Prefetching in Parallel Computing

References
