CSC/ECE 506 Fall 2007/wiki3 7 qaz

From Expertiza_Wiki

Wiki: True and false sharing. In Lectures 9 and 10, we covered performance results for true- and false-sharing misses. The results showed that some applications experienced degradation due to false sharing, and that this problem was greater with larger cache lines. But these data are at least 9 years old, and for multiprocessors that are smaller than those in use today. Comb the ACM Digital Library, IEEE Xplore, and the Web for more up-to-date results. What strategies have proven successful in combating false sharing? Is there any research into ways of diminishing true-sharing misses, e.g., by locating communicating processes on the same processor? Wouldn't this diminish parallelism and thus hurt performance?

False Sharing Miss

   In a multiprocessor system, it is important to keep data coherent across all processors. Vendors such as Intel use the MESI protocol to ensure cache coherence. When a block of data is loaded into a processor's cache for the first time and no other cache holds it, MESI places the line in the Exclusive state. The line moves to the Shared state when another processor requests the same block. Any subsequent store by one processor changes the state of its copy from Shared to Modified and invalidates the corresponding line in the other processors' caches. Figure 1 demonstrates how two distinct variables that are placed next to each other in system memory can be loaded into the same cache line. Because coherence is tracked per line rather than per variable, the whole line is marked Shared, and each store by one processor invalidates the line in the other processor's cache. That processor then misses on its next load or store, even though the two processors never access the same variable.

[Figure 1: fig1.jpg]
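   The state transitions described above can be traced with a small simulation. What follows is a minimal sketch, not a model of any real processor: it assumes two processors, a simplified MESI protocol with no evictions, and hypothetical names (struct line, load, store, run) chosen only for illustration. Each processor first loads its own variable and then stores to it repeatedly; the run function counts misses and invalidations when the two variables share one cache line and when each has its own line.

```c
#include <assert.h>

#define NCPU 2  /* assumption: exactly two processors */

enum mesi { INVALID, SHARED, EXCLUSIVE, MODIFIED };

/* One cache line as seen by each of the two caches. */
struct line {
    enum mesi state[NCPU];
};

struct stats {
    int misses;         /* accesses that found the line INVALID locally */
    int invalidations;  /* copies killed in the other cache */
};

/* Processor `cpu` loads any word of the line. */
static void load(struct line *l, int cpu, struct stats *st)
{
    int other = 1 - cpu;
    if (l->state[cpu] != INVALID)
        return;                      /* hit in M, E, or S */
    st->misses++;
    if (l->state[other] == INVALID) {
        l->state[cpu] = EXCLUSIVE;   /* no other cache holds the line */
    } else {
        l->state[other] = SHARED;    /* other copy drops from M/E to S */
        l->state[cpu] = SHARED;
    }
}

/* Processor `cpu` stores to any word of the line. */
static void store(struct line *l, int cpu, struct stats *st)
{
    int other = 1 - cpu;
    if (l->state[cpu] == INVALID)
        st->misses++;                /* line must be fetched first */
    if (l->state[other] != INVALID) {
        l->state[other] = INVALID;   /* invalidate the other copy */
        st->invalidations++;
    }
    l->state[cpu] = MODIFIED;
}

/* Each processor loads its own variable once, then the two processors
 * alternate stores for `rounds` rounds.
 * same_line != 0: both variables live in one line (false sharing).
 * same_line == 0: each variable has a line of its own. */
static struct stats run(int same_line, int rounds)
{
    struct line lines[2] = { { { INVALID, INVALID } },
                             { { INVALID, INVALID } } };
    struct stats st = { 0, 0 };

    for (int cpu = 0; cpu < NCPU; cpu++)
        load(same_line ? &lines[0] : &lines[cpu], cpu, &st);

    for (int i = 0; i < rounds; i++)
        for (int cpu = 0; cpu < NCPU; cpu++)
            store(same_line ? &lines[0] : &lines[cpu], cpu, &st);

    return st;
}
```

   Under these assumptions, calling run(1, 100) reports 201 misses and 200 invalidations, because every store kills the other processor's copy of the line. Calling run(0, 100) reports only the 2 initial misses and no invalidations, because each line stays Exclusive and then Modified in a single cache.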

Block Size and False Sharing Miss

Strategies to reduce False Sharing Miss

True Sharing Miss

Strategies to reduce True Sharing Miss