CSC/ECE 506 Spring 2010/ch 2 aj/Data Parallel Programming: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 5: Line 5:
When multiple processing elements perform an action separately on different sets of data and exchange information globally before processing more code synchronously is data-parallel processing.  Even though this model is offered in contrast to the shared memory model and the message passing model, data-parallel processing can be accomplished by actually passing messages or sharing memory.  The main distinction between the data-parallel model and the other two models has to do with the outcome of the individual steps instead.
When multiple processing elements perform an action separately on different sets of data and exchange information globally before processing more code synchronously is data-parallel processing.  Even though this model is offered in contrast to the shared memory model and the message passing model, data-parallel processing can be accomplished by actually passing messages or sharing memory.  The main distinction between the data-parallel model and the other two models has to do with the outcome of the individual steps instead.


The term SIMD, or single-instruction-multiple-data, machines refer to the implementation of the data-parallel programming model.  This term was coined as part of Flynn's taxonomy in the 1970s.  Vector processors are also used to implement this programming model.  A visual representation of a SIMD machine is shown below
The term SIMD, or single-instruction-multiple-data, machines refer to the implementation of the data-parallel programming model.  This term was coined as part of Flynn's taxonomy in the 1970s.  Vector processors are also used to implement this programming model.  A visual representation of a SIMD machine is shown below.
 
[[File:Simd.png|thumb|caption:SIMD]]


= Other Parallel Programming Models =
= Other Parallel Programming Models =

Revision as of 03:40, 29 January 2010

Data-Parallel Programming Model

Introduction

When multiple processing elements perform an action separately on different sets of data and exchange information globally before processing more code synchronously is data-parallel processing. Even though this model is offered in contrast to the shared memory model and the message passing model, data-parallel processing can be accomplished by actually passing messages or sharing memory. The main distinction between the data-parallel model and the other two models has to do with the outcome of the individual steps instead.

The term SIMD, or single-instruction-multiple-data, machines refer to the implementation of the data-parallel programming model. This term was coined as part of Flynn's taxonomy in the 1970s. Vector processors are also used to implement this programming model. A visual representation of a SIMD machine is shown below.

caption:SIMD

Other Parallel Programming Models

Shared Memory Programming Model

Message Passing Programming Model

Examples

This section shows a simple example illustrating the data-parallel programming model. All codes below are written in pseudo-code style.

Suppose we want to perform the following task on an array a: updating each element of a by the product of itself and its subscript, and adding together the elements of a into the variable sum. Corresponding code is shown below.

// a simple code
sum = 0;
for (i = 0; i < a.length; i++)
{
   a[i] = a[i] * i;
   sum = sum + a[i];
}

When we implement the task using data-parallel programming model, the program can be divided into two parts. The first part performs the same operations on separate elements of the array for each processing element, and the second part reorganizes data among all processing elements (In our example data reorganization is summing up values across different processing elements). Since data-parallel programming model only defines the overall effects of parallel steps, the second part can be accomplished either through shared memory or message passing. The three code fragments below are examples for the first part of the program, shared memory version of the second part, and message passing for the second part, respectively.

// data parallel programming: let each PE perform the same task on different pieces of distributed data
pe_id = getid();
my_sum = 0;
for (i = pe_id; i < a.length; i += number_of_pe)
{
   a[i] = a[i] * i;
   my_sum = my_sum + a[i];
}
// data reorganization via message passing
if (pe_id != 0) send_msg (0, my_sum);
else
   for (i = 1; i < number_of_pe; i++)
   {
      recv_msg (i, temp);
      my_sum = my_sum + temp;
   }
sum = my_sum;
// data reorganization via shared memory
shared sum;
lock();
sum = sum + my_sum;
unlock();
barrier;

As illustrated above, each processing element performs actions on the array's separate elements which are identified using the processing element's id. In our example, elements assigned to each processing element are staggered instead of continuous. For example, if the system contains only 2 process elements, then one of them would operate on all the odd entries and the other the even entries of the array.

References

1) David E. Culler, Jaswinder Pal Singh, and Anoop Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Gulf Professional Publishing, August 1998.

2) Yan Solihin, Fundamentals of Parallel Computer Architecture: Multichip and Multicore Systems, Solihin Books, August 2009.

3) Wikipedia, "SIMD", http://en.wikipedia.org/wiki/SIMD