CSC/ECE 506 Fall 2007/wiki3 1 satkar: Difference between revisions
Revision as of 20:41, 17 October 2007
Introduction
Cache organization plays a key role in modern computers, especially in multiprocessors. Cache misses are broadly categorized into the “Three Cs”: compulsory misses, capacity misses, and conflict misses. Cache-coherent multiprocessors introduce a fourth category, called coherence misses. These occur when blocks of data are shared among multiple caches, and are of two types:
• True sharing: a data word produced by one processor is used by another processor.
• False sharing: independent data words used by different processors are placed in the same block.
Increasing the cache line size helps reduce the miss rate, because each miss brings more neighboring words into the cache (spatial locality). However, long cache lines may cause false sharing, in which different processors access different words in the same cache line. In essence, they share the line without truly sharing the accessed data.
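The effect of line size on which words end up together can be sketched with a few lines of Python. This is only an illustration: the function names and the byte addresses 0 and 32 are hypothetical, chosen to show two words that land in separate lines when lines are short and in the same line when lines are long.

```python
def line_index(byte_address, line_size):
    """Return the index of the cache line (block) that holds byte_address."""
    return byte_address // line_size


def same_line(addr_a, addr_b, line_size):
    """True if both byte addresses fall in the same cache line."""
    return line_index(addr_a, line_size) == line_index(addr_b, line_size)


# Hypothetical addresses: processor P0 uses the word at byte 0,
# processor P1 uses the word at byte 32.
ADDR_P0, ADDR_P1 = 0, 32

for size in (16, 32, 64):
    print(size, same_line(ADDR_P0, ADDR_P1, size))
# prints:
# 16 False
# 32 False
# 64 True
```

With 16-byte or 32-byte lines the two words sit in different lines, so the processors do not interfere. With 64-byte lines both words map to line 0, which is the precondition for false sharing.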
Problem with False Sharing
In a multiprocessor, accessing a memory location causes the slice of memory containing that location (a cache line) to be copied into the cache. Subsequent references to the same location, or to locations near it, can usually be satisfied from the cache until the system has to restore coherence between the cache and memory and write the cache line back to memory.
However, when multiple processors update individual elements in the same cache line, the entire line is invalidated, even though the updates are independent of each other. Each update of an individual element invalidates the copies of the line held in other caches. Hence, another processor accessing a different element in the same line finds its copy marked invalid and is forced to fetch a fresh copy of the line, even though the element it accesses has not been modified. This is because cache coherency is maintained on a cache-line basis, not for individual elements. As a result, interconnect traffic and overhead increase. Also, while the cache-line update is in progress, access to the elements in the line is inhibited.
This situation is called false sharing, and it can become a bottleneck for performance and scalability.
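The invalidation behavior described above can be illustrated with a small, deterministic simulation. This is a minimal sketch under strong assumptions, not a model of any real machine: it assumes a write-invalidate protocol, tracks only whether a line is valid in each cache (no capacity limits, no intermediate coherence states), and lets the processors take turns strictly. The names `count_misses`, `LINE_SIZE`, and `ROUNDS`, as well as the addresses, are hypothetical.

```python
def count_misses(addresses, line_size, rounds):
    """Count misses when processors take turns writing to their own address.

    addresses[p] is the byte address that processor p updates.
    Simplified write-invalidate model: a write to a line removes that
    line from every other processor's cache.
    """
    caches = [set() for _ in addresses]  # valid line indices per processor
    misses = 0
    for _ in range(rounds):
        for p, addr in enumerate(addresses):
            line = addr // line_size
            if line not in caches[p]:
                misses += 1              # line absent or invalidated: fetch it
                caches[p].add(line)
            for q, other in enumerate(caches):
                if q != p:
                    other.discard(line)  # the write invalidates other copies
    return misses


LINE_SIZE = 64
ROUNDS = 1000

same_line = count_misses([0, 8], LINE_SIZE, ROUNDS)        # both words in line 0
separate_lines = count_misses([0, 64], LINE_SIZE, ROUNDS)  # one word per line

print(same_line, separate_lines)  # prints: 2000 2
```

In this model, two processors that write to different words of the same line miss on every access, because each write invalidates the other processor's copy. When the words are placed in separate lines, only the two initial (compulsory) misses remain.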