CSC/ECE 506 Fall 2007/wiki1 8 s5
Message Passing
When a system has multiple processors, those processors need a way to communicate with each other. Message Passing forms a part of this communication architecture. Other methods of communication, such as Shared Address Space and Data Parallel Processing, contribute to the communication abstraction along with Message Passing. The communication abstraction is essentially a layer between the application software and the communication hardware, in which the programmer uses available libraries to initiate communication between processors through programs.
Message Passing Model
The message passing model is defined by the following properties:
1. A set of processes, each having only local memory
2. Processes communicate by sending and receiving messages
3. Transfer of data between processes requires cooperative operations to be performed by each process (a send operation must have a matching receive)
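The three properties above can be illustrated with a minimal sketch. It is not MPI code: it assumes Python's standard queue and threading modules, uses threads as stand-ins for processes, and the names Channel, worker and run_exchange are hypothetical. Each side keeps its data in local variables and shares nothing except the messages it sends, and every send is paired with exactly one receive.

```python
import queue
import threading


class Channel:
    """Hypothetical one-directional mailbox between two workers."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, message):
        self._q.put(message)

    def recv(self, timeout=5.0):
        # Blocks until a matching send has happened (or the timeout expires).
        return self._q.get(timeout=timeout)


def worker(inbox, outbox):
    # The worker only sees data that arrives in a message.
    local_data = inbox.recv()       # matches the send in run_exchange
    outbox.send(sum(local_data))    # matched by the recv in run_exchange


def run_exchange(values):
    to_worker, from_worker = Channel(), Channel()
    t = threading.Thread(target=worker, args=(to_worker, from_worker))
    t.start()
    to_worker.send(list(values))    # send a copy, not a shared reference
    result = from_worker.recv()     # cooperative receive of the reply
    t.join()
    return result


if __name__ == "__main__":
    print(run_exchange([1, 2, 3, 4]))
```

If either side omitted its receive, the corresponding message would simply sit in the channel, which is why the model describes data transfer as a cooperative operation.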
The message passing model has gained wide use in the field of parallel computing due to advantages that include:
1. Hardware match - The message passing model fits well on parallel supercomputers and clusters of workstations which are composed of separate processors connected by a communications network.
2. Functionality - Message passing offers a full set of functions for expressing parallel algorithms, providing control that is not found in the data-parallel model.
3. Performance - Message passing gives the programmer explicit control of data locality. This in turn enables effective management of memory and caches in CPUs.
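The point about explicit control of data locality can be sketched as a scatter/compute/gather pattern. This is an illustrative sketch under assumptions, not an MPI program: threads and queues from the Python standard library stand in for processes and the network, and split_into_chunks, worker and scatter_sum are hypothetical names. The programmer decides exactly which slice of the data each worker holds, and each worker computes only on its own slice.

```python
import queue
import threading


def split_into_chunks(data, n_workers):
    """Split data into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), n_workers)
    chunks, start = [], 0
    for rank in range(n_workers):
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def worker(rank, inbox, results):
    chunk = inbox.get(timeout=5.0)      # receive this worker's local data
    results.put((rank, sum(chunk)))     # send back only the partial result


def scatter_sum(data, n_workers=4):
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Scatter: the programmer chooses which chunk lives with which worker.
    for inbox, chunk in zip(inboxes, split_into_chunks(list(data), n_workers)):
        inbox.put(chunk)
    # Gather: one receive per worker, in whatever order replies arrive.
    partials = dict(results.get(timeout=5.0) for _ in range(n_workers))
    for t in threads:
        t.join()
    return sum(partials.values()), partials


if __name__ == "__main__":
    total, partials = scatter_sum(range(10), n_workers=3)
    print(total, partials)
```

Only the chunks and the partial sums cross the channel, so the amount of communicated data is visible directly in the code.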
Latest Developments in Message Passing
Although the message passing model as a whole has not changed over time, the Message Passing Interface (MPI) has undergone continuous change. MPI is a communications protocol used to program parallel computers. MPI is not sanctioned by any major standards body; nevertheless, it has become the de facto standard for communication among the processes of a parallel program running on a distributed memory system.
Message Passing Interface (MPI)
MPI is a specification for message passing libraries, designed to be a standard for distributed memory, message passing and parallel computing. The goal of the Message Passing Interface, simply stated, is to provide a widely used standard for writing message-passing programs. The interface attempts to establish a practical, portable, efficient, and flexible standard for message passing.
History
MPI resulted from the efforts of numerous individuals and groups over the course of about two years, from 1992 to 1994, building on distributed memory computing work that dates back to the 1980s. Given below is a chronology of developments in MPI according to documentation provided by the Maui High Performance Computing Center (www.mhpcc.edu).
1. 1980s - early 1990s: Distributed memory parallel computing develops, as do a number of incompatible software tools for writing such programs, usually with tradeoffs between portability, performance, functionality and price. Recognition of the need for a standard arises.
2. April 1992: Workshop on Standards for Message Passing in a Distributed Memory Environment, sponsored by the Center for Research on Parallel Computing, Williamsburg, Virginia. The basic features essential to a standard message passing interface are discussed, and a working group is established to continue the standardization process. A preliminary draft proposal is developed subsequently.
3. November 1992: Working group meets in Minneapolis. MPI draft proposal (MPI1) from ORNL presented. Group adopts procedures and organization to form the MPI Forum. The MPIF eventually comprised about 175 individuals from 40 organizations including parallel computer vendors, software writers, academia and application scientists.
4. November 1993: Supercomputing 93 conference - draft MPI standard presented.
5. May 1994: Final version of the draft released - available on the WWW at: http://www.mcs.anl.gov/Projects/mpi/standard.html
Advantages
MPI is preferred over other message passing libraries for several reasons:
1. Standardization - MPI is the only message passing library which can be considered a standard. It is supported on virtually all High Performance Computing (HPC) platforms.
2. Portability - Source code does not need to be modified when the application is ported to a different platform that supports MPI.
3. Performance - Vendor implementations should be able to exploit native hardware features to optimize performance.
4. Functionality - Over 115 routines are defined.
5. Availability - a variety of implementations are available, both vendor and public domain.
MPI Implementations
MPI implementations and language bindings are available for a range of platforms and languages, including:
1. Classical Cluster and Supercomputer implementations
2. Python
3. OCaml
4. Java
5. Microsoft Windows
6. MATLAB
7. Hardware implementations
Blade Servers
A blade server is a server chassis housing multiple thin, modular electronic circuit boards, known as server blades. Each blade is a server in its own right, often dedicated to a single application. The blades are literally servers on a card, containing processors, memory, integrated network controllers, an optional fiber channel host bus adaptor (HBA) and other input/output (IO) ports.
Blade servers allow more processing power in less rack space, simplifying cabling and reducing power consumption. According to a SearchWinSystems.com article on server technology, enterprises moving to blade servers can experience as much as an 85% reduction in cabling for blade installations compared with conventional 1U or tower servers. With so much less cabling, IT administrators can spend less time managing the infrastructure and more time ensuring high availability.
A blade server is sometimes referred to as a high-density server and is typically used in a cluster of servers that are dedicated to a single task, such as:
1. File sharing
2. Web page serving and caching
3. SSL encrypting of Web communication
4. The transcoding of Web page content for smaller displays
5. Streaming audio and video content
Architecture
A general blade server architecture is shown in the figure below. The hardware components of a blade server are the switch blade, the chassis (with fans, temperature sensors, etc.), and multiple compute blades. Some vendors offer application-specific blades, or partner (or plan to partner) with companies that provide them. These blades perform traffic conditioning, protection, or network processing before the traffic reaches the compute blades. Functionally, they are often positioned between the switch blade and the compute blades, although physically they reside in a standard compute blade slot. The outside world connects through the rear of the chassis to a switch card in the blade server. The switch card is provisioned to distribute packets to blades within the blade server. All these components are tied together by network management system software provided by the blade server vendor. This network management could be done through message passing, in which case a blade server can be viewed as an application of the message passing model.
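As a rough illustration of the switch card's role, the sketch below distributes incoming packets across compute blades. The text does not say which distribution policy a switch card uses, so round-robin is purely an assumption here, and the function name distribute_packets and the blade identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import cycle


def distribute_packets(packets, blade_ids):
    """Assign packets to blades in round-robin order (assumed policy)."""
    if not blade_ids:
        raise ValueError("at least one blade is required")
    assignment = defaultdict(list)
    # cycle() repeats the blade list; zip() stops when the packets run out.
    for packet, blade in zip(packets, cycle(blade_ids)):
        assignment[blade].append(packet)
    return dict(assignment)


if __name__ == "__main__":
    print(distribute_packets(["p0", "p1", "p2", "p3", "p4"],
                             ["blade-A", "blade-B"]))
```

A real switch card would more likely choose the target blade from packet headers or provisioning rules; the sketch only shows the fan-out from one entry point to many blades.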
Evolution
The name blade server appeared when a card included the processor, memory, I/O and non-volatile program storage (flash memory or small hard disks). This allowed a complete server, with its operating system and applications, to be packaged on a single card, board, or blade. These blades could then operate independently within a common chassis, doing the work of multiple separate server boxes more efficiently. Reduced space consumption is the most obvious benefit of this packaging, but additional efficiency benefits have become clear in power, cooling, management, and networking, because common infrastructure is pooled or shared to support the entire chassis rather than being provided on a per-server-box basis.
Blade servers date back to the 1970s. The evolution chronology as provided by Wikipedia (Article: Blade Servers) is given below.
Complete microcomputers were placed on cards and packaged in standard 19-inch racks in the 1970s, soon after the introduction of 8-bit microprocessors. This architecture was used in the industrial process control industry as an alternative to minicomputer control systems. Programs were stored in EPROM on early models and were limited to a single function with a small realtime executive.
The VMEbus architecture (ca. 1981) defined a computer interface which included implementation of a board-level computer that was installed in a chassis backplane with multiple slots for pluggable boards that provide I/O, memory, or additional computing. The PCI Industrial Computer Manufacturers Group (PICMG) developed a chassis/blade structure, called CompactPCI, for the then emerging Peripheral Component Interconnect (PCI) bus. Common among these chassis-based computers was the fact that the entire chassis was a single system. While a chassis might include multiple computing elements to provide the desired level of performance and redundancy, there was always one master board in charge, coordinating the operation of the entire system.
PICMG expanded the CompactPCI specification with the use of standard Ethernet connectivity between boards across the backplane. The PICMG 2.16 CompactPCI Packet Switching Backplane specification was adopted in September 2001 (PICMG specifications). This provided the first open architecture for a multi-server chassis. PICMG followed with the larger and more feature-rich AdvancedTCA specification, targeting the telecom industry's need for a high-availability, dense computing platform with extended product life (10+ years). While AdvancedTCA system and board pricing is typically higher than that of blade servers, AdvancedTCA suppliers claim that low operating expenses and total cost of ownership can make AdvancedTCA-based solutions a cost-effective alternative for many building blocks of the next generation telecom network.
Future
Early versions of server blades will be primarily high-density, low-power devices with relatively low performance. This type of blade is suited for first-tier applications such as static Web servers, security, network services, and streaming media because the applications can be easily and inexpensively load balanced. The performance of an application depends on the aggregate performance of the servers rather than the performance of an individual server. Higher performance, less dense blade designs will help drive blade usage into more mainstream applications in the corporate data center. These designs can offer the individual performance characteristics and features available in today's rack-dense servers along with the cost, deployment, serviceability, and density benefits of server blades. The blades will be well suited to high-performance Web servers, dedicated application servers, server-based or thin-client computing, and high-performance computing (HPC) clusters.
The introduction of server blades and associated technology like InfiniBand (IB) will usher in a new IT infrastructure. IT managers should start planning now for server blade installations by evaluating IP-based storage solutions, remote software provisioning and management solutions, scale-out architectures, and load-balancing technologies.