CSC/ECE 506 Spring 2012/10b sr: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
Line 25: Line 25:
     (i) there must exist a total order of all loads and stores to physical addresses that respects the program order of each thread and  
     (i) there must exist a total order of all loads and stores to physical addresses that respects the program order of each thread and  
     (ii) the value of each load is equal to the value of the most recent store to that physical address in the total order.
     (ii) the value of each load is equal to the value of the most recent store to that physical address in the total order.
SC for PAMC. Loads and stores are to physical addresses. An “X” denotes an enforced ordering.
            Op 2
          Load Store
Op 1 Load    X    X
    Store  X    X


==== Virtual address memory consistency (VAMC)====
==== Virtual address memory consistency (VAMC)====

Revision as of 08:59, 4 April 2012

Use of consistency models in current multiprocessors

The memory consistency model of a shared memory system determines the order in which memory operations will appear to execute to the programmer. This article describes how consistency is used in multiprocessors today and later digs into the details of popular consistency models in use today. The impact of these models on the multiprocessor performance is also discussed. The article finishes off with a discussion about how the consistency models perform with larger multiprocessors.

Introduction

Many modern computer systems and most multicore chips support shared memory in hardware. In a shared memory system, each of the processor cores may read and write to a single shared address space. These designs seek various goodness properties, such as high performance, low power, and low cost. Of course, it is not valuable to provide these goodness properties without first providing correctness. Correct shared memory seems intuitive at a hand-wave level, but, there are subtle issues in even defining what it means for a shared memory system to be correct, as well as many subtle corner cases in designing a correct shared memory implementation. Moreover, these subtleties must be mastered in hardware implementations where bug fixes are expensive.

It is the job of consistency to define shared memory correctness. Consistency definitions provide rules about loads and stores (or memory reads and writes) and how they act upon memory. Ideally, consistency definitions would be simple and easy to understand. However, defining what it means for shared memory to behave correctly is more subtle than defining the correct behavior of, for example, a single-threaded processor core. The correctness criterion for a single processor core partitions behavior between one correct result and many incorrect alternatives. This is because the processor’s architecture mandates that the execution of a thread transforms a given input state into a single well-defined output state, even on an out-of-order core. Shared memory consistency models, however, concern the loads and stores of multiple threads and usually allow many correct executions while disallowing many incorrect ones. The possibility of multiple correct executions is due to the ISA allowing multiple threads to execute concurrently, often with many possible legal interleavings of instructions from different threads. The multitude of correct executions complicates the erstwhile simple challenge of determining whether an execution is correct. Nevertheless, consistency must be mastered to implement shared memory and, in some cases, to write correct programs that use it.

Consistency in current-day multiprocessors

Consistency models used

Address translation aware memory consistency

These memory consistency models define the behavior of operations (loads, stores, memory barriers, etc.) on physical addresses and virtual addresses. The two important levels of memory consistency that can be classified as address translation aware are described below:

Physical address memory consistency (PAMC)

It is necessary to have correct PAMC for unmapped code to work correctly. Unmapped software, including the boot code and part of the system software that manages AT, relies upon PAMC. It is the responsibility of the hardware to implement PAMC and this is specified precisely in the architectural manual. It is not too difficult to adapt an AT-oblivious consistency model as the specification of PAMC.

Example:

   The PAMC model could be SC. In such a case the interface would specify that 
   (i) there must exist a total order of all loads and stores to physical addresses that respects the program order of each thread and 
   (ii) the value of each load is equal to the value of the most recent store to that physical address in the total order.

Virtual address memory consistency (VAMC)

Correct VAMC is required for mapped code to work correctly.

Although adapting an AT-oblivious consistency model for PAMC is straightforward, there are a few challenges when adapting an AT-oblivious consistency model for VAMC:

  • synonyms - Multiple virtual addresses may map to the same physical address. Suppose two virtual addresses VA1 and VA2 map to the same physical address PA. SC requires that the value of a load is equal to the value of the most recent store to the same address. It is possible to have a naive definition of VAMC that does not consider the level of indirection introduced by AT. To overcome this challenge, we re-formulate AT-oblivious consistency models for VAMC by applying the model to synonym sets of virtual addresses rather than individual addresses. For instance, we can define SC for VAMC as follows: there must exist a total order of all loads and stores to virtual addresses that respects program order and in which each load gets the value of the most recent store to any virtual address in the same virtual address synonym set. Incorporating synonyms explicitly in the consistency model allows programmers to reason about the ordering of accesses to virtual addresses.
  • mapping and permission changes - Another challenge is that the set of memory operations at the VAMC level is richer than at the PAMC level.
  • load/store side effects - Yet another challenge in specifying VAMC is that loads and stores to virtual addresses have certain side effects. The AT system includes status bits such as accessed and dirty bits for each page table entry. These status bits are part of the architectural state, and the ordering of updates to those bits must thus be specified in VAMC. To achieve this we add two new operations to the specification tables: Ld-sb (load’s impact on status bits) and St-sb (store’s impact on status bits).

Causal consistency

Delta consistency

Entry consistency

Eventual consistency

Fork consistency

Linearizability

(also known as strict or atomic consistency)

One-copy serializability

PRAM consistency

(also known as FIFO consistency)

Release consistency

Sequential consistency

Serializability

Vector-field consistency

Weak consistency

Strong consistency

Practical performance impact

Conclusion