0321-2670057/wiki2 5 089321: Difference between revisions
Revision as of 15:52, 27 September 2007
Multi-core processor
A multi-core processor has two or more execution cores packaged on a single die, that is, in a package built around a single integrated circuit (IC). A dual-core processor contains two cores plugged into one socket, whereas a quad-core processor contains four. Cores in the same socket communicate much faster than single-core processors in a multi-processor system. Single-core multi-processors are also reaching their physical limits of speed and complexity because of heat production and data synchronization problems.
Communication between multiple cores on the same die is much faster than between single-core CPUs on different dies, which benefits cache coherence in multi-processing. Packaging multiple cores on a single die allows the cache coherency circuitry to operate at a much higher clock rate than is possible when communicating with off-chip processors.
Cache organization in multicore
A cache is a small, high-speed memory, usually static RAM (SRAM), that holds the most recently accessed pieces of main memory. Its basic purpose is to minimize the latency of access to frequently used data. Cache organization deals with the number of levels in the cache hierarchy and with the size, associativity, latency, and bandwidth of each level. Cache policies determine accessibility, allocation, and eviction so that the on-chip cache resource is used effectively.
In a single-core processor hierarchy, latency is minimized by moving cache blocks closer and closer to the core through the levels of the cache hierarchy. The same is done in multi-core cache hierarchies, but the design has more to consider. Some levels of cache can be private to a core and others shared, so we have to take into account whether cores share a given level of the hierarchy, and whether a level is implemented as a single physical block or as multiple physically distributed banks with non-uniform access latency to each bank.
In recent multi-core processors, the first one or two levels of cache are private to each core. Whether a level should be private to a core or shared among cores can depend on the kind of parallelism being targeted. As the following figure depicts, for Data Level Parallelism both the 1st and 2nd level caches are private to each core, whereas for Thread Level Parallelism the 2nd level is shared among cores.
Cache Table
The following table shows the caches used in recent multi-core architectures, together with the parameters by which cache performance can be measured: the cache size, line size, latency, and associativity of each level, whether each level is private or shared, and the coherence protocol used. Latency is given in cycles, and the latency of each level includes the latency cycles of the previous levels.
| Processor | Level | Cache size | Shared | Line size | Latency (cycles) | Associativity | Coherence protocol |
| Quad-Core AMD Opteron™ 2000 Series Processors | L1 | 64 KB data + 64 KB instruction | No | 64 bytes | N/A | 2-way | MOESI |
| Quad-Core AMD Opteron™ 2000 Series Processors | L2 | 512 KB per core | No | 64 bytes | N/A | 16-way | MOESI |
| Quad-Core AMD Opteron™ 2000 Series Processors | L3 | 2 MB | Yes | 64 bytes | N/A | 32-way | MOESI |
| Intel Core 2 Duo | L1 | 32 KB data + 32 KB instruction | No | 64 bytes | 3 | 8-way | MESI |
| Intel Core 2 Duo | L2 | 4 MB | Yes | 64 bytes | 14 | 16-way | MESI |
| Intel Core Duo | L1 | 32 KB | No | 64 bytes | 3 | 8-way | MESI |
| Intel Core Duo | L2 | 2 MB | Yes | 64 bytes | 14 | 8-way | MESI |
| AMD Athlon 64 | L1 | 64 KB | No | 64 bytes | 3 | 2-way | MOESI |
| AMD Athlon 64 | L2 | 512 KB x 2 | Yes | 64 bytes | 20 | 16-way | MOESI |
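The size, line size, and associativity figures in the table together determine how many sets a cache has and how an address is split into offset and index bits. The following is a minimal sketch of that arithmetic, assuming power-of-two sizes and a conventional set-associative layout. The function name `cache_geometry` is hypothetical, and the two calls use the Intel Core 2 Duo rows of the table (L1 data cache only).

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return (num_lines, num_sets, offset_bits, index_bits).

    Assumes a set-associative cache whose size, line size and
    associativity are all powers of two.
    """
    num_lines = size_bytes // line_bytes      # total cache lines
    num_sets = num_lines // ways              # lines are grouped into sets of `ways`
    offset_bits = line_bytes.bit_length() - 1  # log2(line size): byte within a line
    index_bits = num_sets.bit_length() - 1     # log2(sets): which set an address maps to
    return num_lines, num_sets, offset_bits, index_bits


KB = 1024
MB = 1024 * KB

# Intel Core 2 Duo rows from the table: 32 KB 8-way L1 data cache, 4 MB 16-way L2
core2_l1d = cache_geometry(32 * KB, 64, 8)
core2_l2 = cache_geometry(4 * MB, 64, 16)

print("L1 data:", core2_l1d)
print("L2:", core2_l2)
```

With 64-byte lines, the 32 KB 8-way L1 data cache works out to 64 sets, and the 4 MB 16-way L2 to 4096 sets. Higher associativity at a fixed size means fewer sets, with more candidate locations for each line.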
MOESI
"Modified Owned Exclusive Shared Invalid", or MOESI, is a full cache coherency protocol that encompasses all of the possible states commonly used in other protocols.
MESI
"Modified Exclusive Shared Invalid", or MESI, is a widely used cache coherency and memory coherency protocol.
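To make the four MESI states concrete, the following simplified sketch tracks the state of one cache line in each core's private cache as cores read and write it. It is an illustration under stated assumptions, not a model of any particular processor: the class `MesiLine` and its methods are hypothetical names, bus messages and write-backs are not modelled, and a modified copy is assumed to drop to Shared when another core reads the line.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class MesiLine:
    """State of a single cache line in each core's private cache (simplified)."""

    def __init__(self, num_cores):
        self.states = [State.INVALID] * num_cores

    def read(self, core):
        if self.states[core] is not State.INVALID:
            return  # read hit: no state change
        holders = [i for i, s in enumerate(self.states)
                   if i != core and s is not State.INVALID]
        for i in holders:
            # Any other copy (M, E or S) ends up Shared; a Modified copy
            # would be written back first (not modelled here).
            self.states[i] = State.SHARED
        # Exclusive if no other cache holds the line, otherwise Shared.
        self.states[core] = State.SHARED if holders else State.EXCLUSIVE

    def write(self, core):
        # A write invalidates every other copy; the writer holds the line Modified.
        for i in range(len(self.states)):
            if i != core:
                self.states[i] = State.INVALID
        self.states[core] = State.MODIFIED

    def snapshot(self):
        return "".join(s.value for s in self.states)


line = MesiLine(2)
trace = []
line.read(0);  trace.append(line.snapshot())   # core 0 reads alone
line.read(1);  trace.append(line.snapshot())   # core 1 reads the same line
line.write(1); trace.append(line.snapshot())   # core 1 writes
line.read(0);  trace.append(line.snapshot())   # core 0 reads again
print(trace)
```

MOESI adds an Owned state to this scheme, which lets a cache keep supplying a modified line to other readers without first writing it back to memory.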
Single Core MP vs Multi-Core
To compare the performance of these two MP architectures, we can look at the following selection of benchmarks. First, we can see the results of an AMD dual-core system versus an AMD dual-processor system.
An Intel performance comparison can also be seen.
Both AMD and Intel launched many single-core multi-processors in the past and spent time improving their performance and reducing the size of individual gates, but they found that the physical limits of semiconductor-based microelectronics had become a major design concern. Multiple cores on a single die gave them a new way to improve processing power and go beyond the physical limits of single-core processors.
References:
http://www.intel.com/technology/itj/2007/v11i3/1-integration/5-cache-heirarchy.htm
http://en.wikipedia.org/wiki/Multi-core_(computing)
http://www.intel.com/performance/desktop/digoffice/index.htm