CSC/ECE 506 Spring 2011/ch2 JR


== History ==
As computer architectures have evolved, so have parallel programming models. The earliest advances in parallel computers took advantage of bit-level parallelism. These computers used vector processing, which required a shared memory programming model. As the performance returns from that architecture diminished, emphasis moved to instruction-level parallelism and the message passing model began to dominate. Most recently, with the move to cluster-based machines, there has been an increased emphasis on thread-level parallelism, which has brought with it an increased interest in the data parallel programming model.
=== Bit-level parallelism in the 1970s ===
The major performance improvements in computers during this period came from the ability to operate on a full 32-bit word at one time [Culler, Singh & Gupta, ''Parallel Computer Architecture: A Hardware/Software Approach'', Morgan Kaufmann, 1998, p. 15]. The dominant supercomputers of the time, such as the Cray-1 and the ILLIAC IV, were mainly Single Instruction Multiple Data (SIMD) architectures and used a shared memory programming model. Each used a different form of vector processing [Culler, Singh & Gupta, p. 21].
Development of the ILLIAC IV began in 1964 and was not completed until 1975 [http://en.wikipedia.org/wiki/ILLIAC_IV]. A central processor was connected to main memory and delegated tasks to individual processing elements (PEs), each of which had its own memory cache [http://archive.computerhistory.org/resources/text/Burroughs/Burroughs.ILLIAC%20IV.1974.102624911.pdf p. 4]. Each PE could operate on an 8-, 32-, or 64-bit operand at a given time [http://archive.computerhistory.org/resources/text/Burroughs/Burroughs.ILLIAC%20IV.1974.102624911.pdf p. 4].
The Cray-1 was installed at Los Alamos National Laboratory in 1976 by Cray Research and offered performance comparable to the ILLIAC IV [http://en.wikipedia.org/wiki/ILLIAC_IV]. Rather than an array of individual processing elements like the ILLIAC IV, the Cray-1 relied heavily on registers: a single processor connected to main memory performed its operations in a set of 64-bit registers [http://www.eecg.toronto.edu/~moshovos/ACA05/read/cray1.pdf p. 65].
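To make the contrast concrete, the sketch below shows the same array addition written as an ordinary scalar loop and as the strip-mined, register-at-a-time pattern a vector machine such as the Cray-1 applies in hardware. This is a modern illustration in C, not code for either machine; the function names and the 64-element vector length model are assumptions for the example.

<pre>
#include <stddef.h>

/* Scalar version: one element per instruction, as on a conventional CPU. */
void add_scalar(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Conceptual vector version: a machine like the Cray-1 loads up to 64
 * elements into a vector register and adds them with a single vector
 * instruction; the inner loop below only models that one instruction. */
void add_vector(const double *a, const double *b, double *c, size_t n)
{
    const size_t VLEN = 64;                  /* Cray-1 vector register length */
    for (size_t i = 0; i < n; i += VLEN) {
        size_t len = (n - i < VLEN) ? n - i : VLEN;
        for (size_t j = 0; j < len; j++)     /* one "vector add" over len elements */
            c[i + j] = a[i + j] + b[i + j];
    }
}
</pre>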
=== Move to instruction-level parallelism in the 1980s ===
Increasing the word size beyond 32 bits offered diminishing returns in performance [Culler, Singh & Gupta, p. 15]. In the mid-1980s the emphasis therefore shifted from bit-level parallelism to instruction-level parallelism, that is, increasing the number of instructions that could be executed at one time [Culler, Singh & Gupta, p. 15]. The message passing model gave programmers a way to divide a program's instructions among processors in order to take advantage of this architecture.
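As an illustration of the message passing style, the sketch below divides a summation among processes that share no memory and combine their results only through explicit messages. It uses MPI, which postdates the 1980s machines discussed here, so this is a modern sketch of the model rather than period code; the problem size is an arbitrary choice for the example.

<pre>
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes */

    /* Each process computes a partial sum over its own share of the work. */
    long local = 0;
    for (long i = rank; i < 1000000; i += size)
        local += i;

    /* Partial results are combined only through explicit message passing. */
    long total = 0;
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total = %ld\n", total);
    MPI_Finalize();
    return 0;
}
</pre>

Run with, for example, <code>mpirun -np 4 ./sum</code>; each process owns its slice of the iteration space, and nothing is shared implicitly.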
=== Thread-level parallelism ===
The move to cluster-based machines over the past decade has added another layer of complexity to parallelism. Because the computers in a cluster may be separated by a network, there is greater emphasis on software acting as a bridge between them [http://cobweb.ecn.purdue.edu/~pplinux/ppcluster.html]. This has led to a greater emphasis on thread- or task-level parallelism [http://en.wikipedia.org/wiki/Thread-level_parallelism] and to the addition of the data parallel programming model alongside the existing message passing and shared memory models [http://en.wikipedia.org/wiki/Thread-level_parallelism].
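Below is a minimal sketch of thread-level, data parallel execution on a shared memory node, using POSIX threads; the array size, thread count, and function names are assumptions for the example. Every thread runs the same operation over its own chunk of a shared array.

<pre>
#include <pthread.h>
#include <stdio.h>

#define N        1000000
#define NTHREADS 4

static double data[N];    /* shared among all threads */

/* Each thread applies the same operation to its own contiguous chunk. */
static void *scale_chunk(void *arg)
{
    long t  = (long)arg;
    long lo = t * (N / NTHREADS);
    long hi = (t == NTHREADS - 1) ? N : lo + N / NTHREADS;
    for (long i = lo; i < hi; i++)
        data[i] *= 2.0;
    return NULL;
}

int main(void)
{
    pthread_t threads[NTHREADS];
    for (long i = 0; i < N; i++)
        data[i] = 1.0;
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&threads[t], NULL, scale_chunk, (void *)t);
    for (long t = 0; t < NTHREADS; t++)
        pthread_join(threads[t], NULL);
    printf("data[0] = %.1f\n", data[0]);   /* prints 2.0 */
    return 0;
}
</pre>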


== Data Parallel Model ==


=== Description and Example ===

=== Comparison with Message Passing and Shared Memory ===

== Task Parallel Model ==

=== Description and Example ===

== Data Parallel Model vs Task Parallel Model ==

== Definitions ==

== References ==