|
|
| MOESI:
| |
| http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html
| |
|
| |
|
| MOESI is a popular snoop-based cache coherency protocol supported by the AMD64 architecture. Using this protocol, Opteron can maintain cache coherency in systems of up to 8 processors.
| |
| It has 5 states:
| |
| •Modified (M) : The cache line holds the most recent, correct copy of the data, and no other processor's cache holds a copy. The copy in main memory is stale.
| |
| •Owned (O) : The cache line holds the most recent, correct copy of the data, which other processors may share. The processor holding the line in this state is responsible for writing the correct value back to main memory before the line is evicted.
| |
| •Exclusive (E) : The cache line holds the most recent, correct copy of the data. It is present only in this processor's cache, and main memory also holds an up-to-date copy.
| |
| •Shared (S) : A cache line in the shared state holds the most recent, correct copy of the data, which may be shared by other processors.
| |
| •Invalid (I) : The cache line does not hold a valid copy of the data.
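The five states above can be summarized as a small lookup, shown here as a rough Python sketch. The names `State`, `must_write_back_on_eviction`, `may_be_shared` and `holds_valid_data` are made up for illustration and are not part of any AMD interface.

```python
from enum import Enum


class State(Enum):
    """The five MOESI states of a cache line (illustrative names)."""
    MODIFIED = "M"
    OWNED = "O"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


# States in which this cache holds data that main memory may not have yet.
DIRTY = {State.MODIFIED, State.OWNED}

# States in which other caches may hold a valid copy of the same line.
SHAREABLE = {State.OWNED, State.SHARED}


def must_write_back_on_eviction(state):
    """True if evicting the line requires updating main memory first."""
    return state in DIRTY


def may_be_shared(state):
    """True if other processors may also hold a valid copy of the line."""
    return state in SHAREABLE


def holds_valid_data(state):
    """True for every state except Invalid."""
    return state is not State.INVALID
```

Only Modified and Owned carry the write-back responsibility; Owned differs from Modified in that other caches may hold copies at the same time.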
| |
|
| |
|
| |
| Reference : http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html
| |
|
| |
|
| |
| MOESI addresses a bandwidth problem in the MESI protocol that arises when a processor holding an invalid copy of a cache line wants to read or modify data that another processor has modified. Under MESI, the requesting processor has to wait for the processor that modified the data to write it back to main memory, which costs time and bandwidth. MOESI removes this drawback by allowing dirty sharing: a processor holding the line in the new Owned (O) state can supply the modified data to other processors without writing it to main memory, or before doing so.
| |
| The owner remains responsible for updating main memory before the cache line is evicted.
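A toy simulation makes the saving visible. The sketch below models a single cache line as one value and follows the description above: under MESI a dirty line is written back to memory before another processor can read it, while under MOESI the holder moves to Owned and supplies the data directly. The class `Line` and the function `ping_pong` are hypothetical names, and the model ignores bus timing, partial-line writes and everything else a real implementation has to handle.

```python
class Line:
    """One cache line shared by several CPUs (simplified, illustrative model)."""

    def __init__(self, n_cpus, protocol):
        self.protocol = protocol          # "MESI" or "MOESI"
        self.state = ["I"] * n_cpus       # per-CPU state of this line
        self.data = [None] * n_cpus       # per-CPU cached value
        self.memory = 0                   # value held in main memory
        self.writebacks = 0               # number of writes to main memory

    def _write_back(self, cpu):
        self.memory = self.data[cpu]
        self.writebacks += 1

    def write(self, cpu, value):
        # The writer invalidates every other copy and ends up in Modified.
        # The whole line is overwritten, so no old data has to be fetched.
        for other in range(len(self.state)):
            if other != cpu:
                self.state[other] = "I"
                self.data[other] = None
        self.state[cpu] = "M"
        self.data[cpu] = value

    def read(self, cpu):
        if self.state[cpu] != "I":
            return self.data[cpu]         # cache hit
        dirty = next((c for c, s in enumerate(self.state) if s in ("M", "O")), None)
        if dirty is None:
            # No dirty copy anywhere: memory is up to date.
            value = self.memory
            shared = any(s != "I" for s in self.state)
            self.state = ["S" if s == "E" else s for s in self.state]
            self.state[cpu] = "S" if shared else "E"
        elif self.protocol == "MOESI":
            # Dirty sharing: the holder becomes the owner, memory stays stale.
            value = self.data[dirty]
            self.state[dirty] = "O"
            self.state[cpu] = "S"
        else:
            # MESI: the dirty line goes to memory before it can be shared.
            self._write_back(dirty)
            value = self.memory
            self.state[dirty] = "S"
            self.state[cpu] = "S"
        self.data[cpu] = value
        return value

    def evict(self, cpu):
        # A dirty line (M or O) must reach memory before it is dropped.
        if self.state[cpu] in ("M", "O"):
            self._write_back(cpu)
        self.state[cpu] = "I"
        self.data[cpu] = None


def ping_pong(protocol, rounds):
    """CPU 0 writes a new value and CPU 1 reads it, repeated `rounds` times."""
    line = Line(2, protocol)
    for value in range(1, rounds + 1):
        line.write(0, value)
        assert line.read(1) == value
    return line
```

With `ping_pong("MESI", 3)` the model performs three write-backs, one per read. With `ping_pong("MOESI", 3)` it performs none until CPU 0 evicts the line, at which point a single write-back brings main memory up to date.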
| |
| http://usqcd.jlab.org/usqcd-docs/qmt/multicoretalk.pdf
| |
| AMD Opteron memory Architecture
| |
|
| |
| The AMD Athlon processor’s high-performance cache architecture includes an integrated, 64-bit, dual-ported 128-Kbyte split-L1 cache with separate snoop port, multi-level translation lookaside buffers (TLBs), a scalable L2 cache controller with a 72-bit (64-bit data + 8-bit ECC) interface to as much as 8-Mbyte of industry-standard SDR or DDR SRAMs, and an integrated tag for the most cost-effective 512-Kbyte L2 configurations.
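The 72-bit figure (64 data bits + 8 ECC bits) matches what a single-error-correcting, double-error-detecting (SECDED) Hamming code needs for a 64-bit word. The text above does not say which code the L2 interface uses, so SECDED is an assumption here, and the function names are made up for illustration. A quick check of the arithmetic:

```python
def hamming_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """Hamming check bits plus one overall parity bit for double-error detection."""
    return hamming_check_bits(data_bits) + 1


data_bits = 64
check_bits = secded_check_bits(data_bits)   # assumption: the ECC is SECDED
interface_width = data_bits + check_bits
overhead = check_bits / data_bits           # fraction of extra bits per word
```

For a 64-bit word this gives 8 check bits, a 72-bit interface, and an overhead of 12.5%.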
| |
| The AMD Athlon processor’s integrated L1 cache comprises two separate 64-Kbyte, two-way set-associative data and instruction caches. The data cache has eight banks to support concurrent access by two 64-bit loads or stores. The
| |