CSC/ECE 506 Fall 2007/wiki1 1.3.3 1.3.4 chase2007: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 58: Line 58:
===<u>BSP Model:</u>===
===<u>BSP Model:</u>===


'''Bulk Synchronization Model:<BR>'''
'''Bulk Synchronization Parallel Model:<BR>'''
'''The BSP model is a series of supersteps.<BR>'''
'''The BSP model is a series of supersteps.<BR>'''
'''A superstep on a BSP machine is given by W + G*H + L<BR>'''
'''A superstep on a BSP machine is given by W + G*H + L<BR>'''
Line 65: Line 65:
'''G is the number of available processors<BR>'''
'''G is the number of available processors<BR>'''
'''L is the time required for the barrier synchronization<BR>'''
'''L is the time required for the barrier synchronization<BR>'''
<BR>
'''There are other similar variations of BSP like LogP<BR>'''


===<u>LogP Model:</u>===
===<u>Terminology:</u>===


'''L is a maximum on the latency<BR>'''
'''Saturation Point: In performance of a parallel system there is a point that is reached where the system can not keep up with the load on it.  At this point response times will go to infinity.  This point is where the load on the system completely utilizes the most scarce resource, often the cpu.<BR>'''
'''o is the overhead of transmission or reception<BR>'''
'''Scaleability: Often a parallel system is not rated by its speed alone, but instead based on how it performes with varying loads.  A system that is scaleable will have a gradual increase in response times untill it reaches saturation.<BR>'''
'''g is the minimum time between consectutive messages<BR>'''
'''Load: The load on a parallel system is the measure of the work/time that is required of the system.  It is relative to the number of actions taking place as well as the time between those actions and the complexity of each action.<BR>'''
'''P is the number of processors<BR>'''


===<u>References:</u>===
===<u>References:</u>===

Revision as of 23:39, 10 September 2007

Introduction to Parallel Computer Architecture ->Fundamental Design Issues -> Communication and Replication (section 1.3.3)

Replication:

Replication is the creation of a local copy of data to help enable parallelism.

Communication:

Communication occurs when data written from one process is read by another process. Replication helps to avoid unnecessary communication.


Introduction to Parallel Computer Architecture ->Fundamental Design Issues -> Performance (section 1.3.4)

Computer Performance:

Computer performance is a measure of the output of a computer with with respect to time and resources used.

Performance metrics:

Following are the important metrics used to measure a computer's performance:

1.Latency: The time taken to perform an operation
2.Bandwidth:The rate at which the operations are performed
3.Cost: The impact these operations have on the execution time of the program

All the above metrics can be used to define a uniprocessor systems where a single CPU operates.

However, in the context of parallel computers, it becomes difficult to express the performance in above stated metrics. The reason for this is the communication between the processors that occurs mostly in the form of data transfers between the processors. So, to completely define the performace of a parallel computer, the following metrics are also considered.

Data Transfer Time:

It is the time taken for initiation of a data transfer and the time required for actual data transfer. So the Data Transfer Time can be given as:

Transfer Time (n) = T+(n/B)
where
n = Amount of Data (in bytes)
B = Transfer Rate of the component moving the data (bytes per second)
T = Start up cost, a constant

Overhead and Occupancy:

The data transfer operations are initiated by the processor through communication assist
The overhead is the time the processor spends initiating the transfer of data. This may be a fixed cost, if the processor imply has to tell the communication assist to start. The overhead can also be linear with Tranfer time, if the processor has to copy the data into the assist. The occupancy is the time it takes for the data to pass through the slowest componant on the communication path. The occupancy limits how frequently communication operations can be initiated.The next data transfer will have to wait untill the critical resource is no longer occupied before it can use the same resource.

Communication Cost:

It is the time that the processor spends in communicating with other processors. It can be given by the following:

Communication Cost = Frequency of Communication * (Communication Time - Overlap)

Frequency of Communication = Number of communications per unit of work Communication Time = Overhead + Occupancy + Network Delay Overlap = The portion of the communication operation that is performed concurrently with other useful work.

BSP Model:

Bulk Synchronization Parallel Model:
The BSP model is a series of supersteps.
A superstep on a BSP machine is given by W + G*H + L
W is the maximum possible work for a given processor
H is the maximum bytes sent or recieved by a processor
G is the number of available processors
L is the time required for the barrier synchronization

There are other similar variations of BSP like LogP

Terminology:

Saturation Point: In performance of a parallel system there is a point that is reached where the system can not keep up with the load on it. At this point response times will go to infinity. This point is where the load on the system completely utilizes the most scarce resource, often the cpu.
Scaleability: Often a parallel system is not rated by its speed alone, but instead based on how it performes with varying loads. A system that is scaleable will have a gradual increase in response times untill it reaches saturation.
Load: The load on a parallel system is the measure of the work/time that is required of the system. It is relative to the number of actions taking place as well as the time between those actions and the complexity of each action.

References:

Parallel Computer Architecture- A Hardware/Software Approach by David E Culler, Jaswinder Pal Singh and Anoop Guptha

http://en.wikipedia.org/wiki/Computer_performance

The BSP Model http://wwwcs.uni-paderborn.de/fachbereich/AG/agmadh/WWW/bono/paper/nestedbsp/node6.html

LogP: Towards a Realistic Model of Parallel Computation http://cs315b-wiki.stanford.edu/images/8/8b/Logp.pdf