CSC/ECE 506 Fall 2007/wiki3 2 tl
Revision as of 20:37, 14 October 2007
Simple Scalable Coherent Interface (SSCI)
Scalable distributed-memory machines are built from nodes connected by a network. Each node comprises a processor with its cache, a memory unit, and a communication assist, which acts as the interface between the processor and the network. To keep caches coherent on a physically distributed system whose interconnect cannot be snooped, we can use a flat directory-based scheme. In a flat directory scheme, the directory information for a block is found at a fixed place, typically the home node where the block's memory is located. In the cache-based variant of this scheme, the home node holds only state bits and a pointer to the first sharer (the head pointer for the block). Each node with a cached copy keeps two additional pointers per cache line, one to the next sharer and one to the previous sharer (the forward and backward pointers), so the sharers of a block form a doubly linked list distributed across the caches.
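The pointer structure described above can be sketched as a small simulation. This is only an illustrative model, not the SCI protocol itself: the class names, the state names `UNCACHED` and `SHARED`, and the choice to insert each new sharer at the head of the list are assumptions made for the example, and a single Python dictionary stands in for the caches that would really live on separate nodes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CacheLine:
    """Per-node pointers for one cached block (illustrative names)."""
    forward: Optional[int] = None   # node id of the next sharer
    backward: Optional[int] = None  # node id of the previous sharer; None = head


class SharingList:
    """Toy model of one block: home-node entry plus the sharers' cache lines."""

    def __init__(self) -> None:
        self.state = "UNCACHED"              # state bits at the home node (assumed names)
        self.head: Optional[int] = None      # head pointer at the home node
        self.caches: Dict[int, CacheLine] = {}  # stands in for the remote caches

    def add_sharer(self, node_id: int) -> None:
        """Insert a new sharer at the head of the list (assumed policy)."""
        if node_id in self.caches:
            raise ValueError(f"node {node_id} already shares the block")
        line = CacheLine(forward=self.head)
        if self.head is not None:
            self.caches[self.head].backward = node_id  # old head points back
        self.caches[node_id] = line
        self.head = node_id
        self.state = "SHARED"

    def remove_sharer(self, node_id: int) -> None:
        """Unlink a sharer by patching its neighbours' pointers."""
        line = self.caches.pop(node_id)
        if line.backward is None:
            self.head = line.forward                   # removed node was the head
        else:
            self.caches[line.backward].forward = line.forward
        if line.forward is not None:
            self.caches[line.forward].backward = line.backward
        if self.head is None:
            self.state = "UNCACHED"

    def sharers(self) -> List[int]:
        """Walk the forward pointers starting from the home node's head pointer."""
        result, current = [], self.head
        while current is not None:
            result.append(current)
            current = self.caches[current].forward
        return result


block = SharingList()
for node in (3, 7, 5):
    block.add_sharer(node)
print(block.sharers())   # [5, 7, 3]
block.remove_sharer(7)
print(block.sharers())   # [5, 3]
```

Note that the home node stores only `state` and `head`; finding every sharer requires walking the list from cache to cache, which is the trade-off of keeping the directory storage in the caches rather than in memory.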
The Scalable Coherent Interface (SCI)
The Scalable Coherent Interface (SCI) is a cache-coherent memory model that can be used in systems of up to 64K nodes. SCI's flexibility stems mainly from its communication protocol: in contrast to many earlier systems, it is not restricted to either a message-based or a shared-memory programming model but supports both. Because it also provides a distributed directory-based cache coherence protocol, the computer architect can choose from a broad range of execution models, including efficient message-passing architectures and shared-memory models in either NUMA or CC-NUMA form.
The core feature of SCI-based networks is the ability to perform remote memory operations through direct hardware support for distributed shared memory (DSM). The figure below gives a general overview of how this capability can be applied. The basis is the SCI physical address space, which allows any physical memory location on any connected node to be addressed through a 64-bit identifier (16 bits to specify the node, 48 bits to specify the physical address on that node). From this global address space, each node can import pieces of remote memory into a designated address window within the PCI address space, using special address translation tables on the SCI adapter cards. After the PCI address space is mapped into the virtual memory of a process, the remote memory can be accessed directly with standard user-level read and write operations. The SCI hardware forwards these operations transparently to the remote node and, for a read operation, returns the result. Because this path is implemented purely in hardware and avoids any software overhead, one-way latencies of about 1.8 µs can be achieved.
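The 16/48-bit split of the 64-bit identifier can be illustrated with a short sketch. The function names are hypothetical, and placing the node ID in the upper 16 bits is an assumption made for the example; the text above only states the field widths.

```python
NODE_BITS = 16     # bits that select the node
OFFSET_BITS = 48   # bits that select the physical address on that node
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def pack_sci_address(node_id: int, offset: int) -> int:
    """Combine a node id and a per-node physical address into one 64-bit value.

    Assumes the node id occupies the upper 16 bits (illustrative layout).
    """
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node id must fit in 16 bits")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset must fit in 48 bits")
    return (node_id << OFFSET_BITS) | offset


def unpack_sci_address(address: int) -> tuple:
    """Split a 64-bit value back into (node_id, offset)."""
    return address >> OFFSET_BITS, address & OFFSET_MASK


addr = pack_sci_address(0x0012, 0x1000)
print(hex(addr))                  # 0x12000000001000
print(unpack_sci_address(addr))   # (18, 4096)
print(1 << NODE_BITS)             # 65536 addressable nodes, i.e. 64K
```

A 16-bit node field gives 65,536 distinct node IDs, which matches the 64K-node limit mentioned earlier.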
References and Links
- James, D. V., Laundrie, A. T., Gjessing, S., and Sohi, G. S. 1990. Distributed-directory scheme: Scalable Coherent Interface. Computer 23, 6 (Jun. 1990), 74-77. URL: http://www.lib.ncsu.edu:2178/iel5/2/2005/00055503.pdf?isnumber=2005&prod=JNL&arnumber=55503&arnumber=55503&arSt=74&ared=77&arAuthor=James%2C+D.V.%3B+Laundrie%2C+A.T.%3B+Gjessing%2C+S.%3B+Sohi%2C+G.S.
- Gustavson, D. B. 1992. The Scalable Coherent Interface and Related Standards Projects. IEEE Micro 12, 1 (Jan. 1992), 10-22. DOI: http://dx.doi.org/10.1109/40.124376