CSC 456 Spring 2012/11a NC: Difference between revisions

(14 intermediate revisions by the same user not shown)
Line 1: Line 1:
==Motivation For Article==
This aims to briefly outline the architecture of current large scale multiprocessor systems. We will go into detail about several current systems including their manufacturer, physical composition and connecting network, memory consistency models, and how data is kept coherent within the system.
==Large-Scale Multiprocessor Examples==
==Large-Scale Multiprocessor Examples==


Some examples of large-scale multiprocessor systems include Fujitsu's K Computer, the Tianhe-1A from the National Supercomputer Center in Tianjin, China, and [another example or two] [[How 'bout IBM's large systems--Blue Gene, etc]].
Some examples of large-scale multiprocessor systems include Fujitsu's K Computer, the Tianhe-1A from the National Supercomputer Center in Tianjin, China, and mordred2 from Kerlabs.


==K Computer==
==K Computer==
Line 7: Line 10:
Made by [http://www.fujitsu.com/global/ Fujitsu], the K Computer consists of 88,128 processors between 864 cabinets. Each cabinet contains 96 nodes which, in turn, each contain one processor and 16 GBytes of memory. <ref name="kprocs"/>
Made by [http://www.fujitsu.com/global/ Fujitsu], the K Computer consists of 88,128 processors between 864 cabinets. Each cabinet contains 96 nodes which, in turn, each contain one processor and 16 GBytes of memory. <ref name="kprocs"/>


The system is networked together via [http://en.wikipedia.org/wiki/Point-to-point_(network_topology)#Point-to-point point-to-point], or direct, connection. Fujitsu has their own proprietary network, known as the "Tofu Interconnect". It is a six-dimensional [http://en.wikipedia.org/wiki/Mesh_topology mesh]/[http://en.wikipedia.org/wiki/Torus_interconnect torus] topology. Each set of 12 nodes is called a "node group" and is considered the unit of job allocation. Each node group is connected to adjacent node groups via a three-dimensional torus network. Additionally, the nodes within each node group are adjacently connection via their own three-dimensional mesh/torus. <ref name="kpdf"/><ref name="ktofu"/><ref name="knetwork"/> [[What topology?  Surely not 95^2 links!]]
The system is networked together via [http://en.wikipedia.org/wiki/Point-to-point_(network_topology)#Point-to-point point-to-point], or direct, connection. Fujitsu has their own proprietary network, known as the "Tofu Interconnect". It is a six-dimensional [http://en.wikipedia.org/wiki/Mesh_topology mesh]/[http://en.wikipedia.org/wiki/Torus_interconnect torus] topology. Each set of 12 nodes is called a "node group" and is considered the unit of job allocation. Each node group is connected to adjacent node groups via a three-dimensional torus network. Additionally, the nodes within each node group are adjacently connection via their own three-dimensional mesh/torus. <ref name="kpdf"/><ref name="ktofu"/><ref name="knetwork"/>


The K Computer is not a [http://en.wikipedia.org/wiki/Distributed_shared_memory distributed shared memory] (DSM) machine in which the physically separate nodes are addressed as one logically shared address space. Instead, the K Computer utilizes a [http://en.wikipedia.org/wiki/Message_Passing_Interface message passing interface] (MPI), allowing the nodes to pass messages to one another as needed.
The K Computer utilizes a [http://en.wikipedia.org/wiki/Message_Passing_Interface message passing interface] (MPI), allowing the nodes to pass messages to one another as needed.


==Tianhe-1A==
==Tianhe-1A==
Line 18: Line 21:
The Arch interconnect uses point-to-point connections in a hybrid fat tree configuration.
The Arch interconnect uses point-to-point connections in a hybrid fat tree configuration.


The system uses message passing rather than shared memory, so neither a system-wide cache coherency protocol nor a memory consistency protocol is necessary.
The system also uses message passing (MPI), so neither a system-wide cache coherency protocol nor a memory consistency protocol is necessary.<ref name="tianhe_mpi"/>


[[Maybe you could make a table of characteristics of these supercomputers ... you could use top500 as a starting point, and add more detailed info on architecture ... though that might be hard to obtain for some.]]
This cluster computer cost $88 million to build and an additional $20 million per year for electricity and operating expenses.<ref name="tianhe" />


==mordred2 (Kerlabs)==
==mordred2 (Kerlabs)==
The mordred2 is one of several clusters operated by Kerlabs. It is a distributed shared memory system running the open source software Kerrighed. The cluster contains 110 nodes, each with 2 dual-core AMD Opteron processors and 4GB of memory.<ref name="mordred2"/> Its distributed shared memory is provided on the software level by the Linux extension Kerrighed. The software provides sequential consistency, process migration to another node, and checkpointing (the ability to return to a previous application state in case of failure).<ref name="kerrighed"/>
The mordred2 is one of several clusters operated by Kerlabs. It is a distributed shared memory system running the open source software Kerrighed. The cluster contains 110 nodes, each with 2 dual-core AMD Opteron processors and 4GB of memory, making it the largest known cluster running Kerrighed.<ref name="mordred2"/> Its distributed shared memory is provided on the software level by the Linux extension Kerrighed. This extension allows the programmer to treat remote memory accesses as shared memory accesses, effectively abstracting away the message passing required by underlying DSM hardware. The software provides sequential consistency, process migration to another node, and checkpointing (the ability to return to a previous application state in case of failure).<ref name="kerrighed"/>


==References==
==References==
Line 34: Line 37:
<ref name="kerrighed">http://en.wikipedia.org/wiki/Kerrighed</ref>
<ref name="kerrighed">http://en.wikipedia.org/wiki/Kerrighed</ref>
<ref name="mordred2">http://kerrighed.org/php/clusterview.php?id=29</ref>
<ref name="mordred2">http://kerrighed.org/php/clusterview.php?id=29</ref>
<ref name="tianhe">http://en.wikipedia.org/wiki/Tianhe-1</ref>
<ref name="tianhe_mpi">https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=4&ved=0CD8QFjAD&url=http%3A%2F%2Fsoftware.intel.com%2Ffile%2F39450&ei=iZaVT4fEC4Tetgfgtfm1Cw&usg=AFQjCNFCsxJccaKAuD5knOPxC6VQAR-xbQ</ref>
</references>
</references>

Latest revision as of 18:03, 23 April 2012

Motivation For Article

This article briefly outlines the architecture of current large-scale multiprocessor systems. It covers several current systems in detail, including their manufacturer, physical composition and interconnection network, memory consistency model, and how data is kept coherent within the system.

Large-Scale Multiprocessor Examples

Some examples of large-scale multiprocessor systems include Fujitsu's K Computer, the Tianhe-1A from the National Supercomputer Center in Tianjin, China, and mordred2 from Kerlabs.

K Computer

Made by Fujitsu, the K Computer consists of 88,128 processors spread across 864 cabinets. Each cabinet contains 96 compute nodes (plus 6 I/O nodes), and each compute node contains one processor and 16 GB of memory. <ref name="kprocs"/>
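The cabinet and node figures can be checked with a little arithmetic. The sketch below is only a sanity check: 96 compute nodes per cabinet do not by themselves add up to 88,128 processors, so it assumes 6 additional single-processor I/O nodes per cabinet, a figure that should be verified against the cited source. The variable names are illustrative.

```python
# Figures taken from the paragraph above.
CABINETS = 864
COMPUTE_NODES_PER_CABINET = 96
MEMORY_PER_NODE_GB = 16
TOTAL_PROCESSORS = 88_128

# Assumption: 6 I/O nodes per cabinet, each with one processor.
IO_NODES_PER_CABINET = 6

compute_nodes = CABINETS * COMPUTE_NODES_PER_CABINET
io_nodes = CABINETS * IO_NODES_PER_CABINET

# Aggregate memory of the compute nodes, using 1 TB = 1024 GB.
total_memory_tb = compute_nodes * MEMORY_PER_NODE_GB / 1024

print(compute_nodes)             # 82944
print(compute_nodes + io_nodes)  # 88128
print(total_memory_tb)           # 1296.0
```

Under that assumption the compute and I/O nodes together account for every processor in the quoted total.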

The nodes are linked by point-to-point (direct) connections over Fujitsu's proprietary network, known as the "Tofu Interconnect", which has a six-dimensional mesh/torus topology. Each set of 12 nodes is called a "node group" and is the unit of job allocation. Each node group is connected to adjacent node groups via a three-dimensional torus network. Additionally, the nodes within each node group are connected to their neighbors via their own three-dimensional mesh/torus. <ref name="kpdf"/><ref name="ktofu"/><ref name="knetwork"/>
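To make the idea of "adjacent" nodes in a torus concrete, here is a minimal sketch of neighbor lookup in a three-dimensional torus, where every axis wraps around. It is not Fujitsu's routing logic: the function name, the 2 x 3 x 2 arrangement of the 12 nodes in a group, and the treatment of every axis as a torus are assumptions made for illustration (the real Tofu network mixes mesh and torus axes).

```python
from itertools import product


def torus_neighbors(coord, dims):
    """Return the coordinates adjacent to `coord` in a torus of shape `dims`.

    Every axis wraps around, so a node on one edge links to the far edge.
    """
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (coord[axis] + step) % size
            # On very short axes both steps can land on the same node,
            # or back on the node itself; the set and this check handle that.
            if tuple(nxt) != tuple(coord):
                neighbors.add(tuple(nxt))
    return sorted(neighbors)


# Hypothetical layout: 12 nodes of a node group arranged as 2 x 3 x 2.
GROUP_SHAPE = (2, 3, 2)
group_nodes = list(product(*(range(n) for n in GROUP_SHAPE)))

print(len(group_nodes))                          # 12
print(torus_neighbors((0, 0, 0), GROUP_SHAPE))   # [(0, 0, 1), (0, 1, 0), (0, 2, 0), (1, 0, 0)]
print(len(torus_neighbors((0, 0, 0), (4, 4, 4))))  # 6
```

In a larger torus each node has two neighbors per axis (six in three dimensions), which is why the link count grows linearly with the number of nodes rather than quadratically, as it would in a fully connected network.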

The K Computer uses the Message Passing Interface (MPI), which allows the nodes to pass explicit messages to one another as needed.
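The message-passing style can be illustrated without any MPI library. The sketch below is not MPI and does not use its API: it simulates two "nodes" as Python threads that each own private data and cooperate only by sending and receiving messages through queues. All names (`node`, `run_two_nodes`, the queue variables) are hypothetical.

```python
import queue
import threading


def node(rank, local_data, inbox, outbox, results):
    """Simulated node: works on its own data and exchanges one message."""
    partial = sum(local_data)           # computed from private data only
    outbox.put((rank, partial))         # "send" the partial sum to the peer
    _sender, peer_partial = inbox.get() # "receive" the peer's partial sum
    # `results` is only the test harness's way of collecting the output.
    results[rank] = partial + peer_partial


def run_two_nodes(data0, data1):
    """Run two simulated nodes and return the total each one computed."""
    to_node1, to_node0 = queue.Queue(), queue.Queue()
    results = {}
    threads = [
        threading.Thread(target=node, args=(0, data0, to_node0, to_node1, results)),
        threading.Thread(target=node, args=(1, data1, to_node1, to_node0, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


totals = run_two_nodes([1, 2, 3, 4], [5, 6, 7, 8])
print(totals[0], totals[1])  # 36 36
```

Neither node ever reads the other's data directly; each learns the other's contribution only from the message it receives, which is the property that distinguishes this model from shared memory.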

Tianhe-1A

The Tianhe-1A, sponsored by the National University of Defense Technology in China, is capable of 4.701 petaFLOPS. It comprises 14,336 Xeon X5670 processors and 7,168 Nvidia general-purpose GPUs. In addition to the Xeon and Nvidia chips, there are 2,048 FeiTeng 1000 processors.

All of these processors are housed in 112 computer cabinets, 12 storage cabinets, 6 communication cabinets, and 8 I/O cabinets. Each computer cabinet holds 4 racks, each with 8 blades and a 16-port switch. A single blade contains 2 computer nodes, each containing 2 Xeon processors and 1 Nvidia GPU. This comes to a total of 3,584 blades. The individual nodes are connected using a high-speed interconnect called Arch, which has a bandwidth of 160 Gbps.
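The blade, node, and processor totals follow directly from the per-cabinet figures in the paragraph above. The short sketch below simply multiplies them out as a consistency check; the constant names are illustrative.

```python
# Per-unit figures from the paragraph above.
COMPUTER_CABINETS = 112
RACKS_PER_CABINET = 4
BLADES_PER_RACK = 8
NODES_PER_BLADE = 2
XEONS_PER_NODE = 2
GPUS_PER_NODE = 1

blades = COMPUTER_CABINETS * RACKS_PER_CABINET * BLADES_PER_RACK
nodes = blades * NODES_PER_BLADE
xeons = nodes * XEONS_PER_NODE
gpus = nodes * GPUS_PER_NODE

print(blades)  # 3584
print(nodes)   # 7168
print(xeons)   # 14336
print(gpus)    # 7168
```

The results agree with the 14,336 Xeon processors and 7,168 GPUs quoted for the system as a whole.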

The Arch interconnect uses point-to-point connections in a hybrid fat tree configuration.

The system also uses message passing (MPI), so neither a system-wide cache coherence protocol nor a system-wide memory consistency model is necessary.<ref name="tianhe_mpi"/>

This cluster computer cost $88 million to build and costs an additional $20 million per year for electricity and operating expenses.<ref name="tianhe" />

mordred2 (Kerlabs)

mordred2 is one of several clusters operated by Kerlabs. The cluster contains 110 nodes, each with 2 dual-core AMD Opteron processors and 4 GB of memory, making it the largest known cluster running Kerrighed.<ref name="mordred2"/> It is a distributed shared memory system in which the shared memory is provided at the software level by Kerrighed, an open-source extension to Linux. This extension allows the programmer to treat remote memory accesses as shared memory accesses, effectively abstracting away the message passing that takes place between the underlying cluster nodes. The software also provides sequential consistency, process migration to another node, and checkpointing (the ability to return to a previous application state in case of failure).<ref name="kerrighed"/>

References

<references>
<ref name="kpdf">http://www.fujitsu.com/downloads/TC/sc10/interconnect-of-k-computer.pdf</ref>
<ref name="ktofu">http://www.fujitsu.com/global/about/tech/k/whatis/network/</ref>
<ref name="kprocs">http://en.wikipedia.org/wiki/K_computer</ref>
<ref name="knetwork">http://www.riken.jp/engn/r-world/info/release/pamphlet/aics/pdf/2010_09.pdf</ref>
<ref name="kerrighed">http://en.wikipedia.org/wiki/Kerrighed</ref>
<ref name="mordred2">http://kerrighed.org/php/clusterview.php?id=29</ref>
<ref name="tianhe">http://en.wikipedia.org/wiki/Tianhe-1</ref>
<ref name="tianhe_mpi">https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=4&ved=0CD8QFjAD&url=http%3A%2F%2Fsoftware.intel.com%2Ffile%2F39450&ei=iZaVT4fEC4Tetgfgtfm1Cw&usg=AFQjCNFCsxJccaKAuD5knOPxC6VQAR-xbQ</ref>
</references>