ECE506 CSC/ECE 506 Spring 2012/11a az

From Expertiza_Wiki
Revision as of 03:54, 15 April 2012 by Afzafar (talk | contribs)
Jump to navigation Jump to search

11a.  Performance of DSM systems.  Distributed shared memory systems combine the programming models of shared memory systems, and the scalability of distributed systems.  However, since DSM systems need extra coordination between software layer and underlying hardware, achieving good performance could be a big challenge. The factors that harm the performance could be the overhead to maintain cache coherence, memory consistency, and the latency of interconnections. Please further explore the factors that can affect the performance of DSM systems, and the improvements that have been made on the existing systems.

Introduction

Cache coherence

DSM systems must maintain cache coherence just as it required by bus-based multiprocessor systems. Cache coherence problems arise when it is undefined how a change of a value in a specific processor's cache is propagated to the other caches [1, p. 183]. If multiple processors access and modify a shared location in memory and produce outputs based on that shared variable, it is possible to calculate incorrect values if cache coherence is not maintained.

Ensuring that a value changed in one cache is sent to another cache is called write propagation. [1, p. 183] Write propagation is one of the requirements that must be addressed to be provide cache coherency. Without write propagation, one processor could modify a cached value and not notify the other processors that have the same value cached. The other caches may believe they have the latest data, thus on subsequent reads, their caches will provide it to their respective processors, leading to incoherent results.

FIXME: EXAMPLE?

Another requirement for cache coherence is write serialization, which Solihin [1 p. 183] defines as a requirement that "multiple changes to a single memory location are seen in the same order by all processors". If two processors perform writes to a single memory location in a certain order, then all other processors in the system should see the writes (by reading that memory location and subsequently caching the values) in the order in which they were written. If other processors observe the writes by reading the variable, but see the writes in different orders, this can lead to incoherent copies of the same variable in multiple caches while each think they have the latest copy.

FIXME: EXAMPLE?

Thus, for correctness purposes, it is required that write propagation and write serialization are provided by the cache coherence implementation.

In order to maintain cache coherence, a cache coherence protocol is implemented in hardware (or in specific cases, in software). In DSM systems, the cache coherence controller interfaces with the processor and it's cache, but also has a communication link to the other nodes through an interconnect network. It receives and acts upon requests from the local processor, as well as receives and acts on requests or messages sent across the interconnects from other nodes.

Memory consistency

Interconnections

Performance Concerns

Maintaining cache coherence

Maintaining memory consistency

Latency of interconnections

Performance Improvements

Maintaining cache coherence

Maintaining memory consistency

Relaxed memory models with fine granularity coherence

Latency of interconnections

Definitions

DSM
Distributed shared memory, a parallel computer architecture which consists of a set of nodes that maintain their own local memory, but all nodes are connected together, making their memories one shared addressable space.
granularity
FIXME
node
A compute unit that makes up one components of a DSM system. A node consists of one or more sets of processors, cache, and memory. A node is connected to the larger DSM system through an interconnect.

References

<references/>

Quiz