Chp8 my: Difference between revisions

From Expertiza_Wiki

Revision as of 01:58, 26 March 2010

In computing, cache coherence (also cache coherency) refers to the consistency of data stored in local caches of a shared resource. Cache coherence is a special case of memory coherence.

Cache Coherence

Definition

Coherency protocol

MSI protocol

Intel

AMD

MESI protocol

In the MESI protocol, a cache block can be in one of four states:

  • 1. Modified (M): the cache block is valid in only one cache, and its value is likely different from the value in main memory.
  • 2. Exclusive (E): the cache block is valid and clean, but only resides in one cache.
  • 3. Shared (S): the cache block is valid and clean, but may exist in multiple caches.
  • 4. Invalid (I): the cache block is invalid.


This figure shows the state changes that occur when bus transactions are generated. The requests involved are:

  • PrRd: processor request to read a cache block.
  • PrWr: processor request to write a cache block.
  • BusRd: snooped request indicating a read request to a cache block made by another processor.
  • BusRdX: snooped request indicating a read-exclusive (write) request to a cache block made by another processor that does not already have the block. In short, a read with intent to write.
  • BusUpgr: snooped request indicating a write request to a cache block that the requesting processor already has in its cache.
  • Flush: snooped request indicating that an entire cache block is written back to main memory by another processor. In short, cache-to-memory.
  • FlushOpt: snooped request indicating that an entire cache block is posted on the bus in order to supply it to another processor. In short, cache-to-cache.
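The requests above can be tied to the four states with a small transition function. The following Python code is only a sketch of one common textbook formulation of MESI, not a description of any particular Intel or AMD implementation. The names `State`, `on_processor_request`, `on_snooped_request` and the `others_have_copy` flag are assumptions made for this illustration.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_processor_request(state, request, others_have_copy=False):
    """Return (next_state, bus_transaction) for a local PrRd or PrWr.

    bus_transaction is None when the request is satisfied without bus traffic.
    others_have_copy is a hypothetical flag telling a read miss whether
    another cache already holds the block.
    """
    if request == "PrRd":
        if state is State.I:
            # Read miss: fetch the block, E if nobody else has it, S otherwise.
            return (State.S if others_have_copy else State.E), "BusRd"
        return state, None  # read hit in M, E or S
    if request == "PrWr":
        if state is State.I:
            return State.M, "BusRdX"   # write miss: fetch with intent to write
        if state is State.S:
            return State.M, "BusUpgr"  # already cached: only invalidate others
        return State.M, None           # E or M: no bus transaction needed
    raise ValueError(f"unknown processor request: {request}")


def on_snooped_request(state, request):
    """Return (next_state, response) for a request snooped on the bus."""
    if request not in ("BusRd", "BusRdX", "BusUpgr"):
        raise ValueError(f"unknown bus request: {request}")
    if state is State.I:
        return State.I, None           # nothing cached, nothing to do
    if request == "BusUpgr":
        return State.I, None           # another cache is about to write
    # BusRd or BusRdX: a dirty block is flushed, a clean one may be supplied.
    response = "Flush" if state is State.M else "FlushOpt"
    next_state = State.S if request == "BusRd" else State.I
    return next_state, response
```

For example, `on_processor_request(State.S, "PrWr")` yields the Modified state together with a BusUpgr, while a cache that snoops that BusUpgr in the Shared state drops to Invalid.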

Intel

AMD

MOESI protocol (AMD)

AMD

The AMD Opteron uses the MOESI (modified, owned, exclusive, shared, invalid) protocol for cache coherence. In addition to the four states of MESI, which Intel adopted for its Xeon processors, a fifth state, "Owned", represents data that is both modified and shared. With MOESI, modified data does not have to be written back to main memory before it is shared, which saves bandwidth and gives other processors much faster access to the cached data.

The states of the MOESI protocol are:

  • Invalid—A cache line in the invalid state does not hold a valid copy of the data. Valid copies of the data can be either in main memory or another processor cache.
  • Exclusive—A cache line in the exclusive state holds the most recent, correct copy of the data. The copy in main memory is also the most recent, correct copy of the data. No other processor holds a copy of the data.
  • Shared—A cache line in the shared state holds the most recent, correct copy of the data. Other processors in the system may hold copies of the data in the shared state, as well. If no other processor holds it in the owned state, then the copy in main memory is also the most recent.
  • Modified—A cache line in the modified state holds the most recent, correct copy of the data. The copy in main memory is stale (incorrect), and no other processor holds a copy.
  • Owned—A cache line in the owned state holds the most recent, correct copy of the data. The owned state is similar to the shared state in that other processors can hold a copy of the most recent, correct data. Unlike the shared state, however, the copy in main memory can be stale (incorrect). Only one processor can hold the data in the owned state—all other processors must hold the data in the shared state.
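The state definitions above imply which combinations of states may coexist for one cache line. The Python sketch below encodes only the rules stated in the list. The function names `is_legal` and `memory_may_be_stale` and the one-letter state codes are assumptions made for this illustration.

```python
VALID_CODES = {"M", "O", "E", "S", "I"}


def is_legal(states):
    """Check one cache line's state in every cache against the MOESI rules.

    states is a list of one-letter codes, one entry per processor cache.
    """
    if any(s not in VALID_CODES for s in states):
        raise ValueError("states must be drawn from M, O, E, S, I")
    holders = [s for s in states if s != "I"]
    # Modified and Exclusive both require that no other cache holds a copy.
    if any(s in ("M", "E") for s in holders):
        return len(holders) == 1
    # Otherwise only Owned and Shared remain: at most one cache may be the owner.
    return holders.count("O") <= 1


def memory_may_be_stale(states):
    """Main memory can be out of date only if some cache is Modified or Owned."""
    return any(s in ("M", "O") for s in states)
```

For example, `is_legal(["O", "S", "S", "I"])` holds, because one owner may coexist with several sharers, whereas `["M", "S"]` and `["O", "O"]` are rejected.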


The figure below shows the five different states of the MOESI protocol.

There are four valid states: M(odified) and E(xclusive) lines are not shared with other caches, while O(wned) and S(hared) lines may be shared with other caches. The state transitions of the MOESI protocol are shown below:

Special Coherency Considerations

In some cases, data can be modified in a manner that is impossible for the memory-coherency protocol to handle due to the effects of instruction prefetching. In such situations software must use serializing instructions and/or cache-invalidation instructions to guarantee subsequent data accesses are coherent. An example of this type of situation is a page-table update followed by accesses to the physical pages referenced by the updated page tables. The following sequence of events shows what can happen when software changes the translation of virtual-page A from physical-page M to physical-page N:

  1. Software invalidates the TLB entry. The tables that translate virtual-page A to physical-page M are now held only in main memory. They are not cached by the TLB.
  2. Software changes the page-table entry for virtual-page A in main memory to point to physical-page N rather than physical-page M.
  3. Software accesses data in virtual-page A.

During Step 3, software expects the processor to access the data from physical-page N. However, it is possible for the processor to prefetch the data from physical-page M before the page table for virtual-page A is updated in Step 2. This is because the physical-memory references for the page tables are different from the physical-memory references for the data. Because the physical-memory references are different, the processor does not recognize them as requiring coherency checking and believes it is safe to prefetch the data from virtual-page A, which is translated into a read from physical-page M. Similar behavior can occur when instructions are prefetched from beyond the page-table update instruction.
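The hazard in the sequence above can be illustrated with a toy model. The Python code below is only a sketch: real processors do not expose a prefetch buffer like this, and the class `ToyCore`, its methods and the function `remap_and_read` are hypothetical names invented for this illustration. The `serialize` method stands in for the serializing or cache-invalidation instruction that software is expected to issue.

```python
class ToyCore:
    """Toy model of a core that may read data ahead of program order."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual page -> physical page
        self.memory = memory          # physical page -> contents
        self.prefetched = {}          # virtual page -> contents read early

    def prefetch(self, vpage):
        # Early read that uses whatever translation is visible right now.
        self.prefetched[vpage] = self.memory[self.page_table[vpage]]

    def serialize(self):
        # Stand-in for a serializing instruction: drop all early reads.
        self.prefetched.clear()

    def load(self, vpage):
        # A load consumes prefetched data if any, else translates and reads.
        if vpage in self.prefetched:
            return self.prefetched.pop(vpage)
        return self.memory[self.page_table[vpage]]


def remap_and_read(use_serializing_step):
    core = ToyCore({"A": "M"}, {"M": "old data", "N": "new data"})
    # Step 1 (TLB invalidation) is not modelled: this toy core has no TLB.
    core.prefetch("A")            # the prefetch runs ahead of Step 2
    core.page_table["A"] = "N"    # Step 2: remap virtual-page A to page N
    if use_serializing_step:
        core.serialize()          # discard data fetched via the old mapping
    return core.load("A")         # Step 3: access data in virtual-page A
```

Without the serializing step the load returns the contents of physical-page M. With it, the stale prefetch is discarded and the load is translated through the updated page table to physical-page N.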

Dragon protocol

Intel

AMD

References

  1. Cache Coherence
  2. Cache consistency & MESI Intel
  3. A closer look at AMD's dual-core architecture
  4. CSE 400 – Related Work: Instructions & Example
  5. Trace-Driven Simulation of the MSI, MESI and Dragon Cache Coherence Protocols
  6. Understanding the Detailed Architecture of AMD's 64 bit Core
  7. MSI,MESI,MOESI sheet
  8. AMD64 Architecture Programmer’s Manual