CSC 456 Spring 2012/11b AB

From Expertiza_Wiki

Revision as of 07:57, 17 April 2012

Large-Scale Multiprocessors

Manufacturers

To build a large-scale multiprocessor (LSM), you need to choose the right processors as well as the most appropriate cabinet(s) to house them. Several manufacturers offer processors and cabinets that can be used in LSM configurations. For example, Fujitsu's K computer (ranked number one on TOP500's November 2011 list) uses 88,128 SPARC64 VIIIfx processors. At 8 cores per processor, this gives it a total of 705,024 cores<ref name="k computer"/>. Additional examples of processors used in LSMs can be found in Table 1.

{| class="wikitable"
|+ Table 1: Processor Manufacturers
! Manufacturer !! Processor !! Cores !! Clock Rate !! Architecture
|-
| Fujitsu || SPARC64 VIIIfx<ref name="fujitsu proc"/> || 8 || 2.0 GHz || SPARC
|-
| Intel || Xeon 7500<ref name="intel proc"/> || 8 || 1.733-2.667 GHz || Nehalem
|-
| IBM || POWER7<ref name="ibm proc"/> || 8 || 2.4-4.25 GHz || Power ISA v.2.06
|-
| AMD || Opteron 6100<ref name="amd proc"/> || 12 || 1.7-2.4 GHz || Direct Connect 2.0
|}
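
The K computer's core count quoted earlier follows directly from its processor count and the 8 cores per SPARC64 VIIIfx listed in Table 1. The short check below reproduces that arithmetic; the constant and function names are illustrative only.

```python
# Figures taken from the text above: 88,128 SPARC64 VIIIfx processors,
# each with 8 cores (Table 1).
K_COMPUTER_PROCESSORS = 88_128
CORES_PER_SPARC64_VIIIFX = 8


def total_cores(processors, cores_per_processor):
    """Total core count of a machine built from identical processors."""
    return processors * cores_per_processor


if __name__ == "__main__":
    print(total_cores(K_COMPUTER_PROCESSORS, CORES_PER_SPARC64_VIIIFX))
```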

As with processors, manufacturers offer a variety of cabinet/server types, such as IBM's BladeCenter HT. This particular model uses IBM's CoolBlue technology, a set of tools that gives the user greater control over cooling and power use. There are also standard cabinet frames, such as 19-inch racks, which get their name from the 19-inch panels used in their design. Typically, these racks allow for easy processor/server installation and removal.<ref name="19 inch rack"/>

{| class="wikitable"
|+ Table 2: Cabinet Manufacturers
! Manufacturer !! Cabinet !! Blade Count
|-
| SuperMicro || MP SuperServer 8046B-TRLF<ref name="supermicro chassis"/> || 4
|-
| HP || Integrity Superdome 2<ref name="hp chassis"/> || 32
|-
| IBM || BladeCenter HT<ref name="ibm chassis"/> || 12
|}

Assembling

Network Topology

There are many different ways to connect the processors in a network. Each network topology is characterized by its diameter, bisection bandwidth, and degree. The diameter of a network is the largest number of hops on the shortest path between any pair of nodes. Bisection bandwidth refers to the minimum number of links that must be cut to divide the network into two equal halves. The degree of a network refers to the number of in/out links on each node. In Table 3, p denotes the number of nodes in the network. The following figure displays some examples.

An example of possible network structures.<ref name = "topology"/>
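{| class="wikitable"
|+ Table 3: Network Properties
! Topology !! Diameter !! Bisection Bandwidth !! Degree
|-
| Ring || p/2 || 2 || 2
|-
| 2-D Mesh || 2(sqrt(p) - 1) || sqrt(p) || 4
|-
| Line || p - 1 || 1 || 2
|-
| k-ary Tree || 2 x log_k(p) || 1 || k+1
|-
| Hypercube || log_2(p) || p/2 || log_2(p)
|}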
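
The three properties defined above can be checked by brute force on small networks. The following is a minimal sketch, not an efficient tool: it builds adjacency sets for a ring, a line, a square 2-D mesh, and a hypercube (the topology whose diameter and degree are both log_2(p)), then measures diameter with breadth-first search, degree as the maximum number of links on a node, and bisection width by trying every equal split. All function names are illustrative, and the exhaustive bisection search is only practical for roughly p <= 16.

```python
from collections import deque
from itertools import combinations
from math import isqrt


def ring(p):
    """p nodes in a cycle (assumes p >= 3)."""
    return {i: {(i - 1) % p, (i + 1) % p} for i in range(p)}


def line(p):
    """p nodes in a chain."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < p} for i in range(p)}


def mesh_2d(p):
    """sqrt(p) x sqrt(p) grid without wraparound links."""
    side = isqrt(p)
    if side * side != p:
        raise ValueError("p must be a perfect square")
    adj = {n: set() for n in range(p)}
    for r in range(side):
        for c in range(side):
            n = r * side + c
            if c + 1 < side:          # link to the right neighbour
                adj[n].add(n + 1)
                adj[n + 1].add(n)
            if r + 1 < side:          # link to the neighbour below
                adj[n].add(n + side)
                adj[n + side].add(n)
    return adj


def hypercube(p):
    """Nodes are linked when their binary labels differ in exactly one bit."""
    dims = p.bit_length() - 1
    if p < 2 or (1 << dims) != p:
        raise ValueError("p must be a power of two")
    return {i: {i ^ (1 << b) for b in range(dims)} for i in range(p)}


def hops_from(adj, src):
    """Breadth-first search: shortest hop count from src to every node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path over all node pairs."""
    return max(max(hops_from(adj, s).values()) for s in adj)


def max_degree(adj):
    """Largest number of links attached to any single node."""
    return max(len(neighbours) for neighbours in adj.values())


def bisection_width(adj):
    """Fewest links cut by any split into two equal halves (exhaustive)."""
    nodes = sorted(adj)
    edges = {(u, v) for u in adj for v in adj[u] if u < v}
    best = len(edges)
    for group in combinations(nodes, len(nodes) // 2):
        side = set(group)
        cut = sum((u in side) != (v in side) for u, v in edges)
        best = min(best, cut)
    return best


if __name__ == "__main__":
    for name, build in [("ring", ring), ("line", line),
                        ("2-D mesh", mesh_2d), ("hypercube", hypercube)]:
        net = build(16)
        print(name, diameter(net), bisection_width(net), max_degree(net))
```

Running the script with p = 16 prints the measured diameter, bisection width, and degree for each topology, which can be compared against the closed-form entries in Table 3.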

Coherence

For LSMs that use a Distributed Shared Memory (DSM) architecture, cache coherence is an important issue. In 1990, researchers at the Massachusetts Institute of Technology showed with the Alewife multiprocessor that it was possible to build a coherent LSM using a directory-based approach<ref name="alewife"/>. A modern example is the Pittsburgh Supercomputing Center's Blacklight, a supercomputer with hardware-enabled coherent shared memory<ref name="blacklight"/>.
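
To illustrate the general idea behind a directory-based approach, the sketch below models a directory that records, for each memory block, which nodes hold a read-only copy and which node (if any) holds the only writable copy. It is a deliberately simplified, hypothetical model and not the protocol that Alewife or Blacklight implements: the class names, the message names ("writeback", "invalidate"), and the assumption that the directory can track every sharer are all choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """Directory state for one memory block."""
    sharers: set = field(default_factory=set)  # nodes with a read-only copy
    owner: Optional[int] = None                # node with the writable copy


class Directory:
    """Toy directory: returns the messages it would send for each request."""

    def __init__(self):
        self.entries = {}

    def _entry(self, block):
        return self.entries.setdefault(block, Entry())

    def read(self, node, block):
        entry = self._entry(block)
        messages = []
        if entry.owner is not None and entry.owner != node:
            # Another node holds a modified copy: have it write the data
            # back and keep only a read-only copy.
            messages.append(("writeback", entry.owner))
            entry.sharers.add(entry.owner)
            entry.owner = None
        if entry.owner != node:
            entry.sharers.add(node)
        return messages

    def write(self, node, block):
        entry = self._entry(block)
        # Every other copy must be invalidated before the write proceeds.
        targets = set(entry.sharers)
        if entry.owner is not None:
            targets.add(entry.owner)
        targets.discard(node)
        entry.sharers = set()
        entry.owner = node
        return [("invalidate", t) for t in sorted(targets)]


if __name__ == "__main__":
    d = Directory()
    d.read(0, "A")
    d.read(1, "A")
    print(d.write(2, "A"))  # nodes 0 and 1 lose their copies
    print(d.read(0, "A"))   # node 2 must write back first
```

The point of the design is that a request only generates messages to the nodes the directory lists for that block, rather than a broadcast to every node, which is what makes the approach attractive at large node counts.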

On the other hand, some LSMs use distributed memory systems, meaning that each processor has its own private memory, which makes cache coherence across processors a non-issue. Fujitsu's K computer is an example of such a system<ref name="k computer"/>.

References

<references>
<ref name="fujitsu proc">http://en.wikipedia.org/wiki/SPARC64_VIIIfx</ref>
<ref name="intel proc">http://en.wikipedia.org/wiki/Xeon#6500.2F7500-series_.22Beckton.22</ref>
<ref name="ibm proc">http://en.wikipedia.org/wiki/Power7</ref>
<ref name="amd proc">http://en.wikipedia.org/wiki/Opteron#Opteron_.2845_nm_SOI.29</ref>
<ref name="supermicro chassis">http://www.supermicro.com/products/system/4U/8046/SYS-8046B-TRLF.cfm</ref>
<ref name="hp chassis">http://h20341.www2.hp.com/integrity/us/en/high-end/integrity-high-end-servers-superdome2.html</ref>
<ref name="ibm chassis">http://www-03.ibm.com/systems/bladecenter/hardware/chassis/bladeht/index.html</ref>
<ref name="k computer">http://top500.org/lists/2011/11/press-release</ref>
<ref name="19 inch rack">http://en.wikipedia.org/wiki/19-inch_rack</ref>
<ref name="alewife">http://webcache.googleusercontent.com/search?q=cache:-oLJbStOeAEJ:groups.csail.mit.edu/cag/pub/papers/chaiken-thesis.ps.Z+&cd=1&hl=en&ct=clnk&gl=us&client=firefox-a</ref>
<ref name="blacklight">http://www.psc.edu/machines/sgi/uv/blacklight.php</ref>
<ref name="topology">http://en.wikibooks.org/wiki/Communication_Networks/Network_Topologies</ref>
</references>