CSC/ECE 506 Fall 2007/wiki3 2 tl
Directory Based Cache Coherence
Scalable distributed memory machines are made up of various nodes connected by a network. Each of these nodes is comprised of a processor with cache, a memory unit and a communication assist unit, which acts as the interface between the processor and the network. To obtain cache coherence on a physically distributed system built from numerous nodes without a interconnect which can be snooped, we can use a flat directory based scheme. A flat directory based scheme can be either memory-based or cache-based and exhibit the following characteristics:
- In a flat directory scheme, directory information is located in a fixed place, typically at the home node where the memory is located.
- To locate copies of the data
- Memory-based: all the information is held in the directory at the home node.
- Cache-based: home node has pointer to first element of a linked list
- To communicate with those copies
- Memory-based: uses point-to-point messages which can be multicast or overlapped.
- Cache-based: uses a point-to-point linked list traversal to find communicate.
Simple Scalable Coherent Interface (SSCI)
In a cache-based directory scheme, every shareable block in memory is associated with a list of processors sharing that block. The home node maintains a pointer to first sharer plus state bits, which is the head pointer for the block. Each node with a cached copy maintains two additional pointers for each cache line to next and previous sharer, which are the forward and backward pointers.
The SSCI protocol is an example of a flat cache-based directory scheme. The SSCI protocol is a simplified version of the SCI protocol which we will examine briefly. The directory entries can be distributed along with the memory and the high order bits of an address can be used to identify the location of the memory and directory entries for that portion of memory. The SSCI protocol scheme makes use of a full bit-vector approach by maintaining a vector which keeps track of the states in the cache, the states in the memory directory and a pointer to the processors that share a particular block.
Cache states in the SSCI can be labeled:
- M(modified)
- E(exclusive)
- S(Shared)
- I(Invalid)
States in the memory directory may have the value of:
- U(Uncached)
- S(Shared)
- EM(Exclusive Modified)
In addition, there is a local node or requestor node, where the request originates, a home node where the memory and directory live and a remote node that has a copy of the block (exclusive or shared).
In handling a read miss, a transaction takes place from the requestor node to the home node. Home either replies with the data or with head of the list of sharer nodes. The head pointer is changed to point to the requestor node and the requestor sends a transaction to first sharer to be inserted into the list. In handling a write miss, the requestor node obtains the head information from the home node and the requestor is inserted as new head into list. If the requestor is on the list it is deleted and reinserted as the head of the list. The list is then traversed and copies are invalidated.
The Scalable Coherent Interface (SCI)
The Scalable Coherent Interface (SCI) is a cache coherent memory model that can be used in systems up to 64K nodes. SCI's versatility stems from the fact that it combines the best of both message-based and shared-memory programming models. SCI uses point-to-point links so the concurrent data transfers can occur. The above SSCI protocol is a simplified version of the SCI protocol and therefore acts in much the same manner. For each block of memroy, there is a centralized “directory” that maintains the state of the block in the different caches. The directory is co-located with the corresponding memory and requests and replies on the interconnect are no longer seen by everyone but are limited to the specific nodes involved in the request.
Memory states in SCI are defined as:
- Home: no cached copy exists
- Fresh: read-only copies may exist and the copy in the memory is valid.
- Gone: a writable (either exclusive or dirty) copy exists and no valid copy exists on the local node.
The cached states differs significantly from the SSCI protocol in that SCI has 29 stable states and many pending states. The stable states have two parts:
- The position in the sharing list
- Head, Tail, Mid, Only
- The actual state
- Dirty: modified or writable
- Clean: writable but it's the same value that's in memory
- Copy: readable
References and Links
- James, D.V.; Laundrie, A.T.; Gjessing, S.; Sohi, G.S., "Distributed-directory scheme: scalable coherent interface," Computer , vol.23, no.6, pp.74-77, Jun 1990 URL: http://www.lib.ncsu.edu:2178/iel5/2/2005/00055503.pdf?isnumber=2005∏=JNL&arnumber=55503&arnumber=55503&arSt=74&ared=77&arAuthor=James%2C+D.V.%3B+Laundrie%2C+A.T.%3B+Gjessing%2C+S.%3B+Sohi%2C+G.S.
- Gustavson, D. B. 1992. "The Scalable Coherent Interface and Related Standards Projects". IEEE Micro 12, 1 (Jan. 1992), 10-22. DOI= http://dx.doi.org/10.1109
- Gustavson, D.B.; Qiang Li, "The Scalable Coherent Interface (SCI)," Communications Magazine, IEEE , vol.34, no.8, pp.52-63, Aug 1996 URL: http://www.lib.ncsu.edu:2178/iel1/35/11187/00533919.pdf?isnumber=11187∏=STD&arnumber=533919&arnumber=533919&arSt=52&ared=63&arAuthor=Gustavson%2C+D.B.%3B+Qiang+Li/40.124376
- Alnaes, K.; Kristiansen, E.H.; Gustavson, D.B.; James, D.V., "Scalable Coherent Interface," CompEuro '90. Proceedings of the 1990 IEEE International Conference on Computer Systems and Software Engineering , vol., no., pp.446-453, 8-10 May 1990 URL: http://www.lib.ncsu.edu:2178/iel5/285/3365/00113656.pdf?isnumber=3365∏=STD&arnumber=113656&arnumber=113656&arSt=446&ared=453&arAuthor=Alnaes%2C+K.%3B+Kristiansen%2C+E.H.%3B+Gustavson%2C+D.B.%3B+James%2C+D.V.