CSC/ECE 506 Spring 2010/chapter 8

From Expertiza_Wiki
Revision as of 00:51, 26 March 2010 by Shebbur (talk | contribs)

MOESI: http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html

It is one of the popular snoop-based protocols supported in the AMD64 architecture. Using this protocol, the Opteron can maintain cache coherency in systems of up to 8 processors. It has 5 states:

•Modified (M) : The cache line holds the most recent, correct copy of the data. The copy in main memory is stale, and no other processor cache holds a copy.

•Owned (O) : The cache line holds the most recent, correct copy of the data. Other processors may hold shared copies of it. The cache that holds the line in this state is responsible for writing the correct value back to main memory before the line is evicted.

•Exclusive (E) : The cache line holds the most recent, correct copy of the data. The copy is present only in this processor's cache, and main memory also holds a copy.

•Shared (S) : The cache line holds the most recent, correct copy of the data, which may also be held by other processors.

•Invalid (I) : The cache line does not hold a valid copy of the data.
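
The five states and the reaction of a cache to snooped requests can be sketched in a few lines of Python. This is a minimal illustration, not AMD's implementation: the names `State`, `needs_writeback`, `on_snoop_read` and `on_snoop_write` are hypothetical, and the sketch assumes that only a cache holding the line in M or O supplies data to the requester, while in all other cases main memory does.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    O = "Owned"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


# States in which this cache must update main memory before evicting the line.
DIRTY = {State.M, State.O}


def needs_writeback(state):
    """True if evicting a line in this state requires a write to main memory."""
    return state in DIRTY


def on_snoop_read(state):
    """Another processor reads the line.

    Returns (next_state, supplies_data) for the local copy.
    """
    if state in DIRTY:
        # The local cache hands over the modified data directly and
        # stays responsible for the later write-back (Owned).
        return State.O, True
    if state is State.E:
        # Assumption: memory is up to date, so memory supplies the data.
        return State.S, False
    # S and I are unaffected by a remote read.
    return state, False


def on_snoop_write(state):
    """Another processor wants to modify the line.

    The local copy becomes Invalid; a dirty holder supplies its data first.
    """
    return State.I, state in DIRTY
```

For example, `on_snoop_read(State.M)` returns `(State.O, True)`: the cache supplies the line itself and moves to Owned without writing to main memory, while `needs_writeback(State.O)` is still true for the eventual eviction.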


Reference : http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html


MOESI addresses a bandwidth problem of the MESI protocol. In MESI, when a processor that holds an invalid copy of a cache line wants to modify data that another processor has already modified, it has to wait for that processor to write the data back to main memory, which costs time and bandwidth. MOESI removes this drawback by allowing dirty sharing: a processor that holds the data in the new Owned state can provide the modified data to other processors without, or even before, writing it to main memory. The owner remains responsible for updating main memory before the cache line is evicted.

Reference : http://usqcd.jlab.org/usqcd-docs/qmt/multicoretalk.pdf

AMD Opteron memory Architecture

The AMD Athlon processor’s high-performance cache architecture includes an integrated, 64-bit, dual-ported 128-Kbyte split-L1 cache with separate snoop port, multi-level translation lookaside buffers (TLBs), a scalable L2 cache controller with a 72-bit (64-bit data + 8-bit ECC) interface to as much as 8-Mbyte of industry-standard SDR or DDR SRAMs, and an integrated tag for the most cost-effective 512-Kbyte L2 configurations. The AMD Athlon processor’s integrated L1 cache comprises two separate 64-Kbyte, two-way set-associative data and instruction caches. The data cache has eight banks to support concurrent access by two 64-bit loads or stores. The