CSC/ECE 506 Spring 2012/8a cj

From Expertiza_Wiki
Revision as of 14:42, 20 March 2012 by Cjkirkpa (talk | contribs)
Jump to navigation Jump to search

Introduction to bus-based cache coherence in real machines

Most parallel software in the commercial market relies on the shared-memory programming model in which all processors access the same physical address space. The most common multiprocessors today use SMP architecture which use a common bus as the interconnect. In the case of multicore processors ("chip multiprocessors," or CMP) the SMP architecture applies to the cores treating them as separate processors. The key problem of shared-memory multiprocessors is providing a consistent view of memory with various cache hierarchies. This is called cache coherence problem. It is critical to achieve correctness and performance-sensitive design point for supporting the shared-memory model. The cache coherence mechanisms not only govern communication in a shared-memory multiprocessor, but also typically determine how the memory system transfers data between processors, caches, and memory.

Figure 1: Typical Bus-Based Processor Model



MSI, MESI, MESIF, and MOESI protocols on real architectures

MSI Protocol

Figure 1: MSI State Diagram

MESI Protocol

Figure 2: MESI State Diagram

Five State Protocols

MOESI

Figure 3: MOESI State Diagram

MESIF

Figure 4: Reduced Traffic with MESIF Protocol [1]

Protocol Performance

Protocol performance is strongly linked to the program that is running on the parallel system. This complicates protocol performance evaluation and prevents the designer from definitively choosing the best protocol. This binding of program and protocol makes "Making design decisions in real systems ... part art and part science. The art is the past experience, intuition, and aesthetics of the designers, and the science is workload-driven evaluation." [4]

While making definitive design decisions for machine designed to run a variety of programs is challenging, the performance of various cache parameters for specific programs can be simulated allowing for the prediction of general trends. Figure 5 shows the performance of the MESI protocol (III) and the standard MSI 3 state protocol (3st). The MSI protocol when BusRdX instead of BusUpgr is used for S -> M transitions is also shown (3st-RdX)

Figure 5: Traffic for various protocols [2]
Figure 6: Type of Miss [3]

References

  1. The research of the inclusive cache used in multi-core processor
  2. Parallel Computer Architecture: A Hardware/Software Approach