CSC/ECE 506 Spring 2011/ch11 sw

Revision as of 02:15, 19 April 2011

This article discusses the design of scalable shared memory multiprocessors. Bus-based multiprocessors have the disadvantage of not being scalable. Distributed Shared Memory (DSM) was introduced to address this: memory is distributed among the nodes, so accessing different parts of memory takes different amounts of time. A DSM is therefore more scalable than a bus-based multiprocessor, but as the size of the DSM increases, so does the cost of the hardware support it needs.

This wiki chapter introduces the approaches used to scale multiprocessors, describes the cache coherence protocols for a basic DSM, and explains how race conditions are handled.

Approaches to Large-Scale Multiprocessors

The two main factors limiting the scalability of a bus-based multiprocessor are:

  • Physical scalability
  • Protocol scalability

As mentioned in Section 11.1 of Solihin's book, a directory protocol using a point-to-point interconnection is the best option. In a directory-based protocol, a structure called the directory holds the information about which caches have a copy of each block. A requester contacts the directory to get the list of caches that must be asked for a copy of the block, which avoids broadcasting the request. The directory-based protocol saves traffic in cases where the shared data is read-only.
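The difference between consulting a directory and broadcasting can be sketched as follows. This is a minimal illustration, not a full protocol: the `Directory` class, its method names, and `broadcast_targets` are hypothetical, and the sketch tracks only the sharer list, ignoring block states such as a dirty copy held by another cache.

```python
from collections import defaultdict


class Directory:
    """Hypothetical directory: maps a block address to the set of caches with a copy."""

    def __init__(self):
        self.sharers = defaultdict(set)

    def read(self, block, requester):
        # Record the requester as a sharer. In this simplified sketch a read
        # never needs to contact other caches.
        self.sharers[block].add(requester)
        return []

    def write(self, block, requester):
        # Only the caches listed as sharers (other than the writer) are
        # sent invalidations; the writer becomes the sole holder.
        targets = sorted(self.sharers[block] - {requester})
        self.sharers[block] = {requester}
        return targets


def broadcast_targets(num_caches, requester):
    # Bus-style alternative: every other cache sees the request.
    return [c for c in range(num_caches) if c != requester]
```

With eight caches, of which only caches 1 and 3 hold a block, a write by cache 1 sends a single message to cache 3 under the directory scheme, whereas a broadcast would reach all seven other caches.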

Design considerations

The following design decisions are made when implementing a DSM:

  • A straightforward physical-address-to-memory mapping function is used.
  • Memory is considered to be linear and interleaving is avoided.
  • Page allocation policies such as Least Recently Used or Round Robin can be used.
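One way to picture a straightforward, non-interleaved mapping is to give each node one contiguous slice of the linear physical address space. The sketch below assumes every node contributes the same amount of memory; the function names and the `mem_per_node` parameter are illustrative and do not come from the source.

```python
def home_node(phys_addr, mem_per_node):
    """Return the node whose local memory holds phys_addr.

    Assumes linear, non-interleaved memory with equal-sized slices per node.
    """
    if phys_addr < 0 or mem_per_node <= 0:
        raise ValueError("address must be non-negative and slice size positive")
    return phys_addr // mem_per_node


def local_offset(phys_addr, mem_per_node):
    """Return the offset of phys_addr within its home node's memory."""
    if phys_addr < 0 or mem_per_node <= 0:
        raise ValueError("address must be non-negative and slice size positive")
    return phys_addr % mem_per_node
```

Under these assumptions, with 1 GiB per node, address `(1 << 30) + 5` maps to node 1 at offset 5. Because consecutive addresses stay on the same node, a whole page has a single home, which is what lets a page allocation policy decide where data lives.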

Also, to implement the directory structure, we have the following options:

  • Cache-based or memory-based directory.
  • Centralized or distributed directory.

A distributed directory approach is more scalable. To implement such a protocol, a directory can keep track of which caches hold a copy of a block in the following ways [Solihin, 11.2.1]:

  • Full-bit vector format
  • Coarse-bit vector format
  • Limited pointer format
  • Sparse directory format
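The storage cost of the first three formats can be compared with a rough per-block estimate. The sketch below counts only sharer-tracking bits and leaves out state bits; the function names and the parameters `group_size` and `num_pointers` are assumptions made for illustration. A sparse directory is not modeled here, since it reduces the number of directory entries rather than changing the size of each entry.

```python
def full_bit_vector_bits(num_caches):
    # One presence bit per cache.
    return num_caches


def coarse_bit_vector_bits(num_caches, group_size):
    # One presence bit per group of caches, rounded up.
    return -(-num_caches // group_size)


def limited_pointer_bits(num_caches, num_pointers):
    # Each pointer must be wide enough to name any one cache.
    pointer_width = (num_caches - 1).bit_length()
    return num_pointers * pointer_width
```

For example, with 1024 caches a full bit vector needs 1024 bits per block, a coarse vector with groups of 4 needs 256 bits, and 8 limited pointers need 80 bits. The savings come at a price: a coarse vector may send invalidations to caches that hold no copy, and a limited pointer scheme needs a fallback when the number of sharers exceeds the number of pointers.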

These formats can be used either exclusively or in combination. The last design issue is where the directory information is physically located. The following choices exist:

  • Allocating a part of main memory
  • Allocating separate memory
  • On the same chip as the processor

Either SRAM or DRAM can be used for storage of the directory information. Based on these design options, the next section discusses the DSM cache coherence protocols.

DSM Cache Coherence Protocol

There are variants of three major cache coherence protocols that have been implemented in commercial DSM machines, and other protocols have been proposed in the research community. The protocols differ in terms of directory organization (and therefore directory memory overhead), the number and types of messages exchanged between nodes, the direct protocol processing overhead, and inherent scalability features.

See Also

[1] http://ntrg.cs.tcd.ie/undergrad/4ba2.05/group12/index.html

[2] http://www.csl.cornell.edu/~heinrich/dissertation/ChapterTwo.pdf

[3] Simoni, Richard. Implementing a Directory-Based Cache Consistency Protocol. Technical Report: CSL-TR-90-423 March 1990.