CSC/ECE 506 Spring 2012/3a df
Patterns of Parallel Programming
Introduction
Computer programming has gone through several paradigm shifts in regards to how software is design and constructed. In more recent years with the advent of object-oriented programming, an effort was made by the “Gang of Four” to categorize different arrangements and relationships between programming objects in terms of commonly used design patterns. In a similar vein, with recent advances in parallel programming and availability of hardware, there are several common designs that can be drawn from the vast amount of literature available. Although there exists no canonical text for parallel programing as there exists for object oriented programming, there is substantial overlap, which is presented here.
Intuitive Patterns
Intuitive patterns are termed as such because even those with basic knowledge of parallel processes could create their design. They are obvious, straight-forward applications of parallel computing tasks, and their implementation and use are easily implemented even without specialized software packages or advanced algorithm development.
"Embarrassingly" Parallel
Embarrassingly parallel patterns are those in which the tasks to be performed are completely disparate. Specifically, these operations occur when the datasets of concern are able to be abstracted to a point where the resources are not in contention. Examples abound of this behavior in productivity programs like Word or Outlook, when a spell-check or auto-archive operation is in progress.
Replicable
The replicable pattern comes from the fact that data available to all processes needs to be duplicated for use in other processes<ref name="kim_ppp_ppt">Parallel Programming Patterns by Eun-Gyu Kim</ref>. This necessitates a reduction step where the results from all of the tasks is analyzed for the total final result. An application of this pattern could be in certain kinds of encryption algorithms where a set of data can be broken into "chunks" of a particular size, decrypted seperately, and joined back together at the end.
Repository
The Repository pattern is related to the Replicate pattern in that they both act on a set of data and rely on the result. However, in the Repository pattern all data remains centralized while the computations themselves are performed remotely. In essence, the Repository pattern could be thought of as a continuous Replicate pattern where the global data set is much larger than can be simultaneously replicated. An excellent (simplified) example of this pattern would be distributed computing applications like Folding@Home or SETI@Home, where a central server coordinates the calculations between all available processing units.
Divide and Conquer
The Divide and Conquer pattern is very similar to the Replicate pattern. It involves taking a problem or task and splitting it into subproblems whereupon the required calculations are performed in parallel. A merging action still needs to occur to arrive at the solution, but the advantage of Divide and Conquer over Replicate is that the entire data set does not have to be communicated between the processing units -- only the required data for the computation does. Also, due to the degree of separation of computation and data, there is a non-negligible amount of overhead associated with determining the proper points at which the problems should be split and the overall allocation of resources.
Common Patterns
Common patterns solve parallel design tasks that are readily understood and frequently encountered, but whose implementation could benefit from a more formalized and structured approach. Generally, common parallel patterns can be implemented manually or with a framework with a moderate degree of difficulty depending on the programmer's skill level.
Pipeline
Recursive Data
Geometric
Irregular Mesh
Inseperable
Complex Patterns
Complex patterns involve the interplay between two or more parallel patterns, or are a single pattern that greatly deviates from a common implementation itself. Such algorithms require moderate to advanced programmer skill, and are almost always accompanied by a framework or involve a significant amount of custom "infrastructural" code to accomodate the needs of the architecture.
Multi-grid
Multi-domain
Formalization
Formalization describes the efforts of the programming community to devise tools which can aid programmers in designing software which follows standardized (whether international/documented or by practice) procedures, methods, and templates for performing parallel operations. These tools may be purely documentation pieces, frameworks, examples, academic papers, etc.
Design Languages
Frameworks
See Also
References
<references></references>