CSC/ECE 506 Spring 2012/3a df
Patterns of Parallel Programming
Introduction
Computer programming has gone through several paradigm shifts in regards to how software is design and constructed. In more recent years with the advent of object-oriented programming, an effort was made by the “Gang of Four” to categorize different arrangements and relationships between programming objects in terms of commonly used design patterns. In a similar vein, with recent advances in parallel programming and availability of hardware, there are several common designs that can be drawn from the vast amount of literature available. Although there exists no canonical text for parallel programing as there exists for object oriented programming, there is substantial overlap, which is presented here.
Overall, programming patterns can be split into three categories of similar abstraction: architectural patterns, design patterns, and idioms.<ref name="ortega">Ortega-Arjona, Jorge Luis. Patterns for Parallel Software Design. Hoboken, NJ: John Wiley, 2010</ref> An architectural pattern defines software organization, a design pattern is a refinement of an architectural pattern in regards to solving a particular pattern, and an idiom is a particular implementation of a design pattern in a programming language or framework. This chapter will focus on architectural and design patterns.
Intuitive Patterns
Intuitive patterns are termed as such because even those with basic knowledge of parallel processes could create their design. They are obvious, straight-forward applications of parallel computing tasks, and their implementation and use are easily implemented even without specialized software packages or advanced algorithm development.
"Embarrassingly" Parallel
Embarrassingly parallel patterns are those in which the tasks to be performed are completely disparate. Specifically, these operations occur when the datasets of concern are able to be abstracted to a point where the resources are not in contention. Examples abound of this behavior in productivity programs like Word or Outlook, when a spell-check or auto-archive operation is in progress.
Replicable
The replicable pattern comes from the fact that data available to all processes needs to be duplicated for use in other processes<ref name="kim_ppp_ppt">Parallel Programming Patterns by Eun-Gyu Kim</ref>. This necessitates a reduction step where the results from all of the tasks is analyzed for the total final result. An application of this pattern could be in certain kinds of encryption algorithms where a set of data can be broken into "chunks" of a particular size, decrypted seperately, and joined back together at the end.
Repository
The Repository pattern is related to the Replicate pattern in that they both act on a set of data and rely on the result. However, in the Repository pattern all data remains centralized while the computations themselves are performed remotely. In essence, the Repository pattern could be thought of as a continuous Replicate pattern where the global data set is much larger than can be simultaneously replicated. An excellent (simplified) example of this pattern would be distributed computing applications like Folding@Home or SETI@Home, where a central server coordinates the calculations between all available processing units.
Divide and Conquer
The Divide and Conquer pattern is very similar to the Replicate pattern. It involves taking a problem or task and splitting it into subproblems whereupon the required calculations are performed in parallel. A merging action still needs to occur to arrive at the solution, but the advantage of Divide and Conquer over Replicate is that the entire data set does not have to be communicated between the processing units -- only the required data for the computation does. Also, due to the degree of separation of computation and data, there is a non-negligible amount of overhead associated with determining the proper points at which the problems should be split and the overall allocation of resources.
Common Patterns
Common patterns solve parallel design tasks that are readily understood and frequently encountered, but whose implementation could benefit from a more formalized and structured approach. Generally, common parallel patterns can be implemented manually or with a framework with a moderate degree of difficulty depending on the programmer's skill level.
Pipeline
The Pipeline pattern is a hybrid of the “Embarrassingly” Parallel pattern as well as the Replicable pattern. The best way to think of the Pipeline pattern is as an assembly line <ref name=”pipeline_wiki”>Pipeline (computing) on Wikipedia</ref>. Suppose there are three steps in assembling a computer: un-packaging the internals, installing the internals, and performing a first-boot diagnostic, all of which take 15 minutes each. In a serial pattern, the entire process takes 45 minutes for one assembled computer, and it takes 45 more minutes for another computer to be assembled. In a Pipeline pattern, however, that same process would still have the same latency (since it takes 45 minutes for a particular computer to be built), but the overall throughput is increased because a computer in general is completed every 15 minutes after the first one.
Pipelining differs from the Replicable pattern because in Pipeline only one copy of the resources is available. Referring back to the assembly line example, there is only one station for un-packaging, one station for installation, and one station for diagnostics. In a Replicable pattern, however, each work stream would have its own copy of the resources, so that the same steps could occur simultaneously. To be sure, this does nothing for latency, but has the capability to increase throughput dramatically.
Recursive Data
The Recursive Data pattern presents an interesting problem for parallel algorithms; while the need is common, its solution can be very complex depending on the data structure. In general, when working with recursive data structures, it is desired for any and all commonalities and reductions to be established in order to minimize the operations required to traverse the structure. Once the structure is mapped (or its behavior analyzed), other parallel patterns can be employed. One possible example is that of an electric utility’s mesh network of smart meters. Hub (head-end) units where data is aggregated need to employ parallel processing in order to efficiently collect usage data. For smaller hubs, it might be beneficial to communicate directly with the meters, whereas for larger hubs it may be more prudent to communicate with smaller hubs. In such a case, Recursive Data can be paired with Divide & Conquer to come to a solution much faster than its serial or naively parallel alternatives.
Geometric
The Geometric pattern is a specialized implementation of the Divide & Conquer pattern. It differs in that there is sharing of data on boundaries of the problem, be that the first and last elements of different sets (1D), perimeters (2D), or faces (3D). The significant difference between the Geometric and Divide & Conquer patterns is that in the Geometric pattern, there is ideally no sub-problem step, as the inter-process communication that might occur is used in the current process’ calculation and does not have to subsequently be calculated. A great example of the Geometric pattern is a fluid simulation, where a matrix (or higher dimension) data set must be analyzed for changes in motion.
Irregular Mesh
The Irregular Mesh pattern is a hybrid of the Geometric and Recursive Data patterns (and sub-patterns implied); Geometric because of the boundaries involved with splitting up the shape, and Recursive Data due to the varying hierarchies that might be involved. An example of this kind of operation would be a skeletal/bone structure system for three-dimensional computer models. A difficult task for even serial computation, the Irregular Mesh pattern is difficult to implement in an optimal manner without careful consideration to the algorithm and having insight as to what kind of structure is being analyzed.
Inseparable
Complex Patterns
Complex patterns involve the interplay between two or more parallel patterns, or are a single pattern that greatly deviates from a common implementation itself. Such algorithms require moderate to advanced programmer skill, and are almost always accompanied by a framework or involve a significant amount of custom "infrastructural" code to accomodate the needs of the architecture.
Multi-grid
Multi-domain
Formalization
Formalization describes the efforts of the programming community to devise tools which can aid programmers in designing software which follows standardized (whether international/documented or by practice) procedures, methods, and templates for performing parallel operations. These tools may be purely documentation pieces, frameworks, examples, academic papers, etc.
Design Languages
Design languages allow programmers to define problems in terms of a common vocabulary so that solutions to problems can be readily identified and efficiently shared.<ref name="ortega"/> A brief survey of some current pattern language efforts follows.
ParLab
http://parlab.eecs.berkeley.edu/wiki/patterns/patterns
University of Florida PLPP
http://www.cise.ufl.edu/research/ParallelPatterns/index.htm
Frameworks
Frameworks are any piece of code for a particular programming language which can help a programmer in creating a program which takes advantage of parallel algorithms. A sample of some available frameworks follows.
Parallel Patterns Library
http://msdn.microsoft.com/en-us/library/dd492418.aspx
OpenMP
SWARM
http://www.cc.gatech.edu/~bader/papers/SWARM.html
Summary
See Also
References
<references></references>