CSC/ECE 506 Fall 2007/wiki1 10 mt
Revision as of 16:49, 3 September 2007
Dataflow & Systolic Architectures
The dataflow and systolic models are two of the many possible parallel computer architectures. Unlike the shared-address, message-passing, and data-parallel models, the dataflow and systolic architectures were never widely used for parallel programming systems, although they received a considerable amount of analysis from both private industry and academia.
Dataflow
Dataflow architecture stands in opposition to the von Neumann, or control flow, architecture, which has a memory, an I/O subsystem, an arithmetic unit, and a control unit. A single shared memory holds both program instructions and data, with a data bus and an address bus between the memory and the processing unit. Because instructions and data must be fetched in sequential order over this shared path, a bottleneck may occur that limits the throughput between the CPU and the memory.
The dataflow model of architecture, in contrast, is a distributed model in which there is no single point of control and an instruction executes only when its required data is available. Dataflow programs are typically represented as a graph of nodes. Each node holds an operation, to be executed when its operands become available, along with the addresses of the subsequent nodes in the graph that need the result of the operation.
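The firing rule described above can be sketched in a few lines of Python. This is only an illustrative software model, not a description of any particular dataflow machine: the `Node` class, the `run` function, and the `demo_expression` example are hypothetical names introduced here, and a simple queue stands in for the network that carries results between nodes.

```python
import operator
from collections import deque


class Node:
    """One graph node: an operation plus the addresses of its consumers."""

    def __init__(self, op, arity, dests):
        self.op = op          # function applied when the node fires
        self.arity = arity    # number of operands the node waits for
        self.dests = dests    # list of (node_name, operand_slot) pairs
        self.operands = {}    # operands that have arrived so far


def run(nodes, initial_tokens):
    """Deliver tokens until none remain; a node fires once all operands are present.

    initial_tokens is a list of (node_name, operand_slot, value).
    Returns (outputs of nodes with no destinations, firing order).
    """
    outputs, fired = {}, []
    queue = deque(initial_tokens)
    while queue:
        name, slot, value = queue.popleft()
        node = nodes[name]
        node.operands[slot] = value
        if len(node.operands) == node.arity:      # all operands available
            args = [node.operands[i] for i in range(node.arity)]
            node.operands = {}
            result = node.op(*args)
            fired.append(name)
            if not node.dests:
                outputs[name] = result
            for dest_name, dest_slot in node.dests:
                queue.append((dest_name, dest_slot, result))
    return outputs, fired


def demo_expression():
    """Evaluate (2 + 3) * (7 - 4) with operands arriving in scrambled order."""
    nodes = {
        "add": Node(operator.add, 2, [("mul", 0)]),
        "sub": Node(operator.sub, 2, [("mul", 1)]),
        "mul": Node(operator.mul, 2, []),
    }
    tokens = [("add", 0, 2), ("sub", 1, 4), ("sub", 0, 7), ("add", 1, 3)]
    return run(nodes, tokens)
```

Note that no program counter decides the order: `sub` fires before `add` here simply because its operands happen to arrive first, and `mul` fires only after both results have reached it.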
The dataflow model comes in two variants, static and dynamic. The static dataflow model is characterized by the use of memory addresses to specify the destination nodes that depend on a result. The dynamic model uses content-addressable memory, which is searched for specific tags, so that each subprogram or subgraph can execute in parallel as separate instances. In the dynamic dataflow model, programs are executed by handling tokens, each of which contains both data and a tag. A node is executed when incoming tokens with identical tags are present on its inputs.
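The tag-matching rule of the dynamic model can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function name `run_tagged` and the tag strings are hypothetical, a single node is modeled, and a dictionary keyed by tag stands in for the content-addressable memory.

```python
import operator
from collections import defaultdict


def run_tagged(op, arity, tokens):
    """Fire one node once per tag, as soon as all operands with that tag arrive.

    tokens is an iterable of (tag, operand_slot, value).
    Returns (result per tag, peak number of unmatched tokens held in the store).
    """
    waiting = defaultdict(dict)   # tag -> {slot: value}; models the matching store
    results = {}
    peak_waiting = 0
    for tag, slot, value in tokens:
        waiting[tag][slot] = value
        if len(waiting[tag]) == arity:            # tokens with identical tags matched
            operands = waiting.pop(tag)
            results[tag] = op(*(operands[i] for i in range(arity)))
        unmatched = sum(len(slots) for slots in waiting.values())
        peak_waiting = max(peak_waiting, unmatched)
    return results, peak_waiting


def demo_tagged():
    """Three instances of the same add node, with their tokens interleaved."""
    tokens = [
        ("iter0", 0, 1), ("iter1", 0, 10), ("iter2", 0, 100),
        ("iter1", 1, 20), ("iter0", 1, 2), ("iter2", 1, 200),
    ]
    return run_tagged(operator.add, 2, tokens)
```

In `demo_tagged`, the tags keep the three instances apart even though their tokens arrive interleaved, and the second return value shows how many unmatched tokens the store had to hold at its fullest.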
Systolic
New Developments in Dataflow and Systolic Architectures
Since 1999, little advancement has been made in the field of dataflow architecture. Dataflow was largely abandoned because of several problems:
- The dynamic dataflow model requires some form of associative memory to store the tokens waiting to be matched. Unfortunately, even in moderately sized programs the amount of memory needed for this storage tends to be large, and therefore not cost efficient.
- Dataflow programs typically made use of multiple threads, since parallel functions and loops were frequently used. As a result, when there was not enough work to keep multiple threads busy, single-threaded execution of a program performed poorly.
- The dataflow model failed to take advantage of locality, such as the use of local registers and cache. Since all information for the tokens (data and tags) moves through the network, it is difficult to transfer all of that information in a timely and efficient manner over a large parallel system.
Out of order execution
Explicit token store approach (Monsoon)
Enhancing data flow with control flow