CSC/ECE 506 Spring 2012/3a df: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
Line 21: Line 21:
Common patterns solve parallel design tasks that are readily understood and frequently encountered, but whose implementation could benefit from a more formalized and structured approach. Generally, common parallel patterns can be implemented manually or with a framework with a moderate degree of difficulty depending on the programmer's skill level.
Common patterns solve parallel design tasks that are readily understood and frequently encountered, but whose implementation could benefit from a more formalized and structured approach. Generally, common parallel patterns can be implemented manually or with a framework with a moderate degree of difficulty depending on the programmer's skill level.
===Pipeline===
===Pipeline===
The Pipeline pattern is a hybrid of the “Embarrassingly” Parallel pattern as well as the Replicable pattern. The best way to think of the Pipeline pattern is as an assembly line <ref name=”pipeline_wiki”>[http://en.wikipedia.org/wiki/Pipeline_(computing) Pipeline (computing)] on Wikipedia</ref>. Suppose there are three steps in assembling a computer: un-packaging the internals, installing the internals, and performing a first-boot diagnostic, all of which take 15 minutes each. In a serial pattern, the entire process takes 45 minutes for one assembled computer, and it takes 45 more minutes for another computer to be assembled. In a Pipeline pattern, however, that same process would still have the same latency (since it takes 45 minutes for a particular computer to be built), but the overall throughput is increased because a computer in general is completed every 15 minutes after the first one.
Pipelining differs from the Replicable pattern because in Pipeline only one copy of the resources is available. Referring back to the assembly line example, there is only one station for un-packaging, one station for installation, and one station for diagnostics. In a Replicable pattern, however, each work stream would have its own copy of the resources, so that the same steps could occur simultaneously. To be sure, this does nothing for latency, but has the capability to increase throughput dramatically.
===Recursive Data===
===Recursive Data===
===Geometric===
===Geometric===

Revision as of 03:30, 13 February 2012

Patterns of Parallel Programming

Introduction

Computer programming has gone through several paradigm shifts in regards to how software is design and constructed. In more recent years with the advent of object-oriented programming, an effort was made by the “Gang of Four” to categorize different arrangements and relationships between programming objects in terms of commonly used design patterns. In a similar vein, with recent advances in parallel programming and availability of hardware, there are several common designs that can be drawn from the vast amount of literature available. Although there exists no canonical text for parallel programing as there exists for object oriented programming, there is substantial overlap, which is presented here.

Intuitive Patterns

Intuitive patterns are termed as such because even those with basic knowledge of parallel processes could create their design. They are obvious, straight-forward applications of parallel computing tasks, and their implementation and use are easily implemented even without specialized software packages or advanced algorithm development.

"Embarrassingly" Parallel

Embarrassingly parallel patterns are those in which the tasks to be performed are completely disparate. Specifically, these operations occur when the datasets of concern are able to be abstracted to a point where the resources are not in contention. Examples abound of this behavior in productivity programs like Word or Outlook, when a spell-check or auto-archive operation is in progress.

Replicable

The replicable pattern comes from the fact that data available to all processes needs to be duplicated for use in other processes<ref name="kim_ppp_ppt">Parallel Programming Patterns by Eun-Gyu Kim</ref>. This necessitates a reduction step where the results from all of the tasks is analyzed for the total final result. An application of this pattern could be in certain kinds of encryption algorithms where a set of data can be broken into "chunks" of a particular size, decrypted seperately, and joined back together at the end.

Repository

The Repository pattern is related to the Replicate pattern in that they both act on a set of data and rely on the result. However, in the Repository pattern all data remains centralized while the computations themselves are performed remotely. In essence, the Repository pattern could be thought of as a continuous Replicate pattern where the global data set is much larger than can be simultaneously replicated. An excellent (simplified) example of this pattern would be distributed computing applications like Folding@Home or SETI@Home, where a central server coordinates the calculations between all available processing units.

Divide and Conquer

The Divide and Conquer pattern is very similar to the Replicate pattern. It involves taking a problem or task and splitting it into subproblems whereupon the required calculations are performed in parallel. A merging action still needs to occur to arrive at the solution, but the advantage of Divide and Conquer over Replicate is that the entire data set does not have to be communicated between the processing units -- only the required data for the computation does. Also, due to the degree of separation of computation and data, there is a non-negligible amount of overhead associated with determining the proper points at which the problems should be split and the overall allocation of resources.

Common Patterns

Common patterns solve parallel design tasks that are readily understood and frequently encountered, but whose implementation could benefit from a more formalized and structured approach. Generally, common parallel patterns can be implemented manually or with a framework with a moderate degree of difficulty depending on the programmer's skill level.

Pipeline

The Pipeline pattern is a hybrid of the “Embarrassingly” Parallel pattern as well as the Replicable pattern. The best way to think of the Pipeline pattern is as an assembly line <ref name=”pipeline_wiki”>Pipeline (computing) on Wikipedia</ref>. Suppose there are three steps in assembling a computer: un-packaging the internals, installing the internals, and performing a first-boot diagnostic, all of which take 15 minutes each. In a serial pattern, the entire process takes 45 minutes for one assembled computer, and it takes 45 more minutes for another computer to be assembled. In a Pipeline pattern, however, that same process would still have the same latency (since it takes 45 minutes for a particular computer to be built), but the overall throughput is increased because a computer in general is completed every 15 minutes after the first one.

Pipelining differs from the Replicable pattern because in Pipeline only one copy of the resources is available. Referring back to the assembly line example, there is only one station for un-packaging, one station for installation, and one station for diagnostics. In a Replicable pattern, however, each work stream would have its own copy of the resources, so that the same steps could occur simultaneously. To be sure, this does nothing for latency, but has the capability to increase throughput dramatically.

Recursive Data

Geometric

Irregular Mesh

Inseperable

Complex Patterns

Complex patterns involve the interplay between two or more parallel patterns, or are a single pattern that greatly deviates from a common implementation itself. Such algorithms require moderate to advanced programmer skill, and are almost always accompanied by a framework or involve a significant amount of custom "infrastructural" code to accomodate the needs of the architecture.

Multi-grid

Multi-domain

Formalization

Formalization describes the efforts of the programming community to devise tools which can aid programmers in designing software which follows standardized (whether international/documented or by practice) procedures, methods, and templates for performing parallel operations. These tools may be purely documentation pieces, frameworks, examples, academic papers, etc.

Design Languages

Frameworks

See Also

References

<references></references>