CSC/ECE 506 Spring 2011/ch10 MC: Difference between revisions
Revision as of 01:03, 28 March 2011
Supplement to Chapter 10: Memory Consistency Models
Introduction
Memory consistency deals with the order in which memory accesses to shared data become visible across processors (or cores) in a multiprocessor environment. Memory consistency models define frameworks that outline what programmers can expect when writing multi-threaded programs using shared memory. We briefly discuss the guarantees provided by the hardware memory model. We then look at the motivation for defining the memory model at the programming language level. Java was designed from the ground up with a memory model in mind; this has resulted in language features that allow it to enforce the Java Memory Model (JMM) in a manner that is transparent to the Java programmer. We look at some of the building blocks the Java programming language uses to enforce the JMM, and follow that with a detailed discussion of the JMM and some related issues.
Motivation
Chapter 10 of Solihin (2009) looks at what is specified by the memory model at the hardware level and why it is required. At the processor level, a memory model defines necessary and sufficient conditions for knowing that writes to memory by other processors are visible to the current processor, and writes by the current processor are visible to other processors. Some processors exhibit a strong memory model, where all processors see exactly the same value for any given memory location at all times. Other processors exhibit a weaker memory model, where special instructions, called memory barriers, are required to flush or invalidate the local processor cache in order to see writes made by other processors or make writes by this processor visible to others. These memory barriers are usually performed when lock and unlock actions are taken. Even on some of the strongest memory models, memory barriers are often necessary; quite frequently their placement is counterintuitive. Recent trends in processor design have encouraged weaker memory models, because the relaxations they make for cache consistency allow for greater scalability across multiple processors and larger amounts of memory.
The issue of when a write becomes visible to another thread is compounded by the compiler's reordering of code. For example, the compiler might decide that it is more efficient to move a write operation later in the program; as long as this code motion does not change the program's semantics, it is free to do so. If a compiler defers an operation, another thread will not see it until it is performed; this mirrors the effect of caching. Most languages provide keywords such as 'volatile' to get around compiler reordering of memory accesses.
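As a minimal sketch of the visibility problem, consider a worker thread that spins until another thread sets a flag. The class `StopFlagDemo` and its members are hypothetical names chosen for illustration. If the flag were a plain field, a compiler would be permitted to read it once and hoist the read out of the loop, so the worker might never stop; declaring it `volatile` rules that out in Java.

```java
// Illustrative sketch only; StopFlagDemo is a hypothetical name.
class StopFlagDemo {
    // volatile: the read in the loop may not be hoisted or cached,
    // and the write below must become visible to the worker thread.
    private static volatile boolean stopRequested = false;

    // Starts a spinning worker, requests a stop, and reports whether it stopped.
    static boolean runAndStop() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // busy-wait until the flag is observed
            }
        });
        worker.start();
        stopRequested = true;
        try {
            worker.join(5000); // wait up to five seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```

Removing the `volatile` modifier does not guarantee a hang; it only removes the guarantee that the worker ever sees the write.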
All of this flexibility is by design -- by giving the compiler, runtime, or hardware the flexibility to execute operations in the optimal order, within the bounds of the memory model, we can achieve higher performance.
The Java programming language differs from older languages such as C and C++ in a few ways. Java is platform independent and follows the philosophy of 'write once, run anywhere'. It achieves this by abstracting away the specifics of the underlying platform from the Java programmer. This abstraction is provided by the Java Virtual Machine (JVM), which is the layer between the bare metal and a Java program. Unlike C/C++ programs, Java programs do not run directly on bare metal; they run atop a virtual machine. The JVM is cognizant of the underlying platform, i.e., the instruction set architecture, the memory model, the operating system, etc. Java programs are first compiled into a universal binary, referred to as byte-code; the JVM then translates the byte-code into instructions specific to the underlying architecture. The JVM has interpreters and compilers built into it for this purpose.
Providing platform independence is one of the key goals of the Java programming language; thus multi-threaded Java programs are expected to run 'safely' on platforms with different memory models. The JMM shields the Java developer from the differences between memory models across architectures, and the JVM deals with the differences between the JMM and the underlying platform's memory model by inserting memory barriers at the appropriate places.
The Java Memory Model describes what behaviors are legal in multithreaded code, and how threads may interact through memory. It describes the relationship between variables in a program and the low-level details of storing and retrieving them to and from memory or registers in a real computer system. It does this in a way that can be implemented correctly using a wide variety of hardware and a wide variety of compiler optimizations.
The JMM specifies the minimal guarantees the JVM must make about when writes to variables become visible to other threads. It was designed to balance the need for predictability and ease of program development with the realities of implementing high-performance JVMs on a wide range of popular processor architectures. It preserves existing safety guarantees, like type-safety. It also defines the semantics of incompletely or incorrectly synchronized programs so that potential security hazards are minimized.
Java includes several language constructs, including volatile, final, and synchronized, which are intended to help the programmer describe a program's concurrency requirements to the compiler. The Java Memory Model defines the behavior of volatile and synchronized, and, more importantly, ensures that a correctly synchronized Java program runs correctly on all processor architectures.
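The three constructs can be seen together in a small sketch. The class `BoundedCounter` and its members are hypothetical and exist only for illustration: a `final` field holds a limit fixed at construction, `synchronized` methods guard the mutable count, and a `volatile` flag lets threads check for completion without taking the lock.

```java
// Illustrative sketch only; BoundedCounter and its members are hypothetical names.
class BoundedCounter {
    private final int limit;               // final: assigned once in the constructor
    private int count = 0;                 // guarded by this object's intrinsic lock
    private volatile boolean full = false; // volatile: may be read without the lock

    BoundedCounter(int limit) {
        this.limit = limit;
    }

    // synchronized: one thread at a time runs this on a given instance.
    synchronized boolean tryIncrement() {
        if (count >= limit) {
            return false;
        }
        count++;
        if (count == limit) {
            full = true;
        }
        return true;
    }

    boolean isFull() {
        return full;
    }

    synchronized int get() {
        return count;
    }

    // Runs `threads` workers that each make up to `attempts` increments.
    static int demo(int limit, int threads, int attempts) {
        BoundedCounter counter = new BoundedCounter(limit);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < attempts && !counter.isFull(); j++) {
                    counter.tryIncrement();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(1000, 4, 1000)); // prints 1000
    }
}
```

Because every increment happens under the same monitor, no update is lost and the count never exceeds the limit, regardless of how the threads interleave.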
The Java Memory Model
The Java memory model is specified in terms of 'actions', which include reads and writes to variables, locks and unlocks of monitors, and starting and joining with threads. The JMM defines a partial ordering called happens-before on all actions within the program. To guarantee that the thread executing action B can see the results of action A (whether or not A and B occur in different threads), there must be a happens-before relationship between A and B. In the absence of a happens-before ordering between two operations, the JVM is free to reorder them as it pleases.
The rules for happens-before are:
- Program order rule: Each action in a thread happens-before every action in that thread that comes later in the program order.
- Monitor lock rule: An unlock on a monitor lock happens-before every subsequent lock on that same monitor lock.
- Volatile variable rule: A write to a volatile field happens-before every subsequent read of that same field.
- Thread start rule: A call to Thread.start on a thread happens-before every action in the started thread.
- Thread termination rule: Any action in a thread happens-before any other thread detects that thread has terminated, either by successfully returning from Thread.join or by Thread.isAlive returning false.
- Interruption rule: A thread calling interrupt on another thread happens-before the interrupted thread detects the interrupt (either by having InterruptedException thrown, or invoking isInterrupted or interrupted).
- Finalizer rule: The end of a constructor for an object happens-before the start of the finalizer for that object.
- Transitivity: If A happens-before B, and B happens-before C, then A happens-before C.
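The rules above can be combined to reason about a concrete program. The following is a minimal sketch, with the hypothetical class `HappensBeforeDemo`, in which a plain field is published through a volatile flag. The program order rule, the volatile variable rule, and transitivity together guarantee that a reader who sees the flag set also sees the earlier write to the plain field.

```java
// Illustrative sketch only; HappensBeforeDemo is a hypothetical name.
class HappensBeforeDemo {
    private static int data = 0;                   // plain, non-volatile field
    private static volatile boolean ready = false; // publication flag

    static int publishAndRead() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile write is observed
            }
            seen[0] = data; // ordered after the write of 42 by transitivity
        });
        reader.start();  // thread start rule: the resets above are visible to reader
        data = 42;       // (1) program order: happens-before (2)
        ready = true;    // (2) volatile write: happens-before the read that sees true
        try {
            reader.join(); // thread termination rule: seen[0] is visible after join
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

If `ready` were not volatile, no happens-before edge would connect the two threads, and the reader could legally observe a stale value of `data` or never leave the loop.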
Locks and unlocks on explicit Lock objects have the same memory semantics as intrinsic locks. Reads and writes of atomic variables have the same memory semantics as volatile variables. For example, if two threads synchronize on the same lock, then all memory writes made by the thread that releases the lock are visible to the thread that subsequently acquires it. This is analogous to the release consistency model covered in section 10.3.4 of Solihin (2009). It is important to note that if two threads synchronize on different locks, then we can't say anything about the ordering of actions between them, since there is no happens-before relation between the actions in the two threads.
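As a hedged illustration of explicit locks and atomic variables, the sketch below increments one counter under a `java.util.concurrent.locks.ReentrantLock` and another through a `java.util.concurrent.atomic.AtomicInteger`. The class `LockAndAtomicDemo` and its fields are hypothetical names; only the library types are real.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only; LockAndAtomicDemo is a hypothetical name.
class LockAndAtomicDemo {
    private static final Lock lock = new ReentrantLock();
    private static int lockedCount = 0; // plain field, guarded by `lock`
    private static final AtomicInteger atomicCount = new AtomicInteger();

    // Returns {lockedCount, atomicCount} after all workers finish.
    static int[] run(int threads, int perThread) {
        lockedCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock(); // acquire: sees writes made before the previous unlock
                    try {
                        lockedCount++;
                    } finally {
                        lock.unlock(); // release: publishes the increment
                    }
                    atomicCount.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { lockedCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] r = run(4, 10000);
        System.out.println(r[0] + " " + r[1]); // prints 40000 40000
    }
}
```

Both counters end at the same total because every increment is ordered by happens-before with the next one. If each worker used its own private lock, that ordering would disappear and updates to `lockedCount` could be lost.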
Publication
Initialization Safety
Double-Checked Locking
References
- Yan Solihin, Fundamentals of Parallel Computer Architecture: Multichip and Multicore Systems, Solihin Books, August 2009.
- David E. Culler, Jaswinder Pal Singh, and Anoop Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Gulf Professional Publishing, August 1998.
- Jeremy Manson, William Pugh and Sarita Adve, "The Java Memory Model" http://rsim.cs.illinois.edu/Pubs/popl05.pdf
- Bill Pugh JMM Page http://www.cs.umd.edu/~pugh/java/memoryModel/
- Brian Goetz, Java Concurrency in Practice,