CSC/ECE 506 Spring 2013/1a sp: Difference between revisions
(→=) |
|||
(2 intermediate revisions by the same user not shown) | |||
Line 112: | Line 112: | ||
<p align="center">Table 1: Pipeline structure for image edge detection[8]</p> | <p align="center">Table 1: Pipeline structure for image edge detection[8]</p> | ||
=== | ===Recursive Pattern Matching using MISD Architecture <ref>Arne Halaas, Børge Svingen, Magnar Nedland, Pål Sætrom, Ola Snøve, Jr., and Olaf René Birkeland. 2004. A recursive MISD architecture for pattern matching. IEEE Trans. Very Large Scale Integr. Syst. 12, 7 (July 2004), 727-734. DOI=10.1109/TVLSI.2004.830918 http://dx.doi.org/10.1109/TVLSI.2004.830918</ref>=== | ||
The situations where several patterns are required to be matched concurrently in data and no persistent index can be created for look-up requires the use of Online multipattern approximate searching. This algorithm requires evaluation of the hit function H(Si,P) which is the function working on strings and pattern,and requires partitioning P into smaller components and combining the individual results. A recursive approach is required and there exists a need to evaluate the Si value in parallel. Thus it helps in using an architecture with two complete binary trees which take care of the data distribution and result processing. The architecture is implemented with as many PE's that are practically possible within the implementation constraints. The data from the stream is fed to the root node of the data distribution tree and simultaneously distributed to several collection of PE's belonging to separate queries thus implementing the MISD architecture with a single data stream and multiple instruction streams. | The situations where several patterns are required to be matched concurrently in data and no persistent index can be created for look-up requires the use of Online multipattern approximate searching. This algorithm requires evaluation of the hit function H(Si,P) which is the function working on strings and pattern,and requires partitioning P into smaller components and combining the individual results. A recursive approach is required and there exists a need to evaluate the Si value in parallel. Thus it helps in using an architecture with two complete binary trees which take care of the data distribution and result processing. The architecture is implemented with as many PE's that are practically possible within the implementation constraints. The data from the stream is fed to the root node of the data distribution tree and simultaneously distributed to several collection of PE's belonging to separate queries thus implementing the MISD architecture with a single data stream and multiple instruction streams. |
Latest revision as of 04:45, 11 February 2013
Multiple Instruction Single Data (MISD) Architecture
Multiple Instruction Single Data, or MISD, architecture is one of the four general categories of multi-processor or multi-core architectures described by Flynn's Taxonomy.
MISD Architectures within the Context of Flynn’s Taxonomy<ref>http://en.wikipedia.org/wiki/Flynn's_taxonomy</ref><ref>http://www.phy.ornl.gov/csep/ca/node11.html</ref>
Flynn's Taxonomy categorizes parallel computing architectures based on the number of concurrent information streams that flowed into the processor. According to Flynn's Taxonomy, there are two kinds of logical streams of information that flows into a parallel computer, the Instruction (or Control) Stream and the Data Stream. Note that for the purposes of classification using Flynn’s Taxonomy, it does not matter whether or not the streams are physically independent of each other on the hardware, only that the streams can be logically treated as independent. Based on how many of each type of stream flows into the parallel computer (single vs. multiple), Flynn's Taxonomy divides parallel architectures into four categories. They are Single Instruction, Single Data (SISD), Single Instruction, Multiple Data (SIMD), Multiple Instruction, Single Data (MISD), and Multiple Instruction, Multiple Data (MIMD).
The following table (Figure 1) shows the categories of different parallel architectures according to Flynn:<ref>https://computing.llnl.gov/tutorials/parallel_comp/#Flynn</ref>
Overview of the MISD Architecture
In order to be categorized as MISD architecture, there must be at least two Instruction streams and there can only be one Data stream as shown on Figure 2.
As per the description according to Flynn’s Taxonomy, each processing elements should receive the same data and execute different instructions on the same data simultaneously.
There are very few implementations of the classical MISD architecture as described by Flynn’s Taxonomy, none of which are commercial, general purpose parallel computers since the MIMD and SIMD are often better suited to common data parallel techniques. Under most circumstances MIMD and SIMD architectures tend to provide better scaling and more efficient use of computational resources. Instead of commercial use, MISD architectures are mainly used to create custom hardware in order to solve specific problems. The majority of the implementations have been created are in academia.
Over the years, several architectures have been described as MISD even though the architectures do not completely follow the description in Flynn’s Taxonomy. One commonly cited example is the Pipeline architecture where the data is passed from processing element to processing element and each processing element executes a different instruction. It is often argued that this is not a true MISD architecture as the data changes as it moves through the pipeline. Similarly, other architectures or applications exist which may be called MISD by some while others disagree. The following section looks at some of those implementations.
Implementations of MISD architecture
Some major implementations of MISD are in the fields of systolic arrays, pattern matching, flight control systems and fault tolerance systems. These MISD implementations are discussed below.
Systolic Array
A systolic array is an arrangement of processors in an array where data flows synchronously across the array between neighbors, usually with different data flowing in different directions. Each Processor at each step takes in data from one or more neighbors, processes it and, in the next step, outputs results in the opposite direction. The systolic array paradigm, data-stream-driven by data counters, is the counterpart of the von Neumann paradigm, instruction-stream-driven by a program counter. Because a systolic array usually sends and receives multiple data streams, and multiple data counters are needed to generate these data streams, it supports data parallelism.<ref>http://en.wikipedia.org/wiki/Systolic_array</ref>
Type of Systolic Arrays<ref>Johnson, K.T.; Hurson, A.R.; Shirazi, B.; , "General-purpose systolic arrays," Computer , vol.26, no.11, pp.20-31, Nov. 1993 doi: 10.1109/2.241423</ref>
Special-purpose systolic array
An array of hardwired systolic processing elements tailored for a specific application. Typically, many tens or hundreds of cells fit on a single chip. One of the major applications of special-purpose systolic array is in matrix operations. Figure 3 illustrates the algorithm for the sum of a scalar product, computed in a single systolic element. Here, a’s and b’s are synchronously shifted through the processing element to be available for next element. These data synchronously exits the processing element unmodified for the next element. The sum of the products is then shifted out of the accumulator.
General-purpose systolic array
It is an array of systolic processing elements, which gets adapted to a variety of applications via programming or reconfiguration. Array topologies can be either programmable or reconfigurable. Likewise, array cells are either programmable or reconfigurable. This is referred to as Systolic topologies.
Programmable Systolic Array
It is an array of programmable systolic elements that operates either in SIMD or MIMD fashion. Either the arrays interconnect or each processing unit is programmable and a program controls dataflow through the elements. Programmable systolic arrays are programmable either at a high level or a low level. At either level, programmable arrays can be categorized as either SIMD or MIMD machines.
For the latter approach, the workstation downloads a program to each MIMD (Figure 4) systolic cell. Each cell may be loaded with a different program, or all the cells in the array may be loaded with the same program. Each cell's architecture is somewhat similar to the conventional von Neumann architecture: It contains a control unit, an ALU, and local memory. MIMD systolic cells have more local memory than their SIMD counterparts to support the von Neumann-style organization.
This architecture is defined as Multiple Instruction Multiple Data (MIMD) architecture in 5. The architecture has multiple instruction streams for the PEs and a single data stream passing through all the PEs. Thus, it can also be defined as Multiple Instruction Single Data (MISD) architecture. The architecture of Systolic array configuration are controversial as explained in the section 4.1.2.
Reconfigurable Systolic Array
It is an array of systolic elements that can be programmed at the lowest level. Recent gate density advances in FPGA technology have produced a low-level, reconfigurable systolic array architecture that bridges the gap between special-purpose arrays and the more versatile, programmable general-purpose arrays. The FPGA architecture is unusual because a single hardware platform can be logically reconfigured as an exact duplicate of a special-purpose systolic array.
The RSA circuit design is based on systolic array architecture consisting of PEs interconnected via SWs as depicted in Figure 6. The homogeneous characteristic of the Reconfigurable Systolic Array (RSA) architecture, where each reconfigurable processing element (PE) cell is connected to its nearest neighbors via configurable switch (SW) elements, enables array expansion for parallel processing and facilitates time sharing computation of high-throughput data by individual PEs. Both the PEs and SWs can be reconfigured dynamically with the former as an arithmetic processor and the latter as a flexible router linking the neighboring PE cells. The RSA shifts reconfiguration and input signals into the PEs and SWs on separate data bus which enables the circuit to continue its operation while the reconfiguration is in process.
Architecture of systolic arrays as against MISD architecture
As from the above mentioned configurations of the Systolic Arrays, it is seen that generally the configurations have multiple processing elements executing different instructions from dedicated instruction streams for each processing element. There is a single data stream that connects the adjacent PEs. Thus, systolic array can be defined as an MISD architecture.
Many authors say that as the data read as input by one processing element is processed data output of the adjacent PE. The data stream cannot be considered as single because all the data paths do not carry the same data to all the PEs. Figure 7 shows the difference between the Data Stream for Systolic Arrays and the MISD architecture. Thus the systolic array should be considered as “Multiple Data” architecture and not Single Data architecture.
Fault Tolerant Systems<ref>http://en.wikipedia.org/wiki/Fault-tolerant_computer_system#Types_of_fault_tolerance</ref>
The fault tolerant systems are designed to handle the possible failures in software, hardware or interfaces. The hardware faults include hard disk failures, input or output device failures, etc. and the software and interface faults include driver failures; operator errors, installing unexpected software etc. The hardware faults can be detected and identified by implementing redundant hardware and multiple backups. The software faults can be tolerable by removing the program errors by executing the software redundantly or by implementing small programs that take over the tasks that crash or generate errors.
In computer systems, the SIMD, MISD and MIMD architectures facilitate the implementation of the fault tolerance systems by multiple instruction streams or multiple data streams or both. Fault tolerance on computations can be implemented by multiple processors (likely with different architectures) executing the algorithms on the same set of data. The output of each processor is compared with that of the others and M out of N majority voting method is used to determine the faulty processor. Thus MISD architecture is utilized to get the fault tolerance on critical computations.
There are various examples of MISD architecture being used as fault tolerant architecture. The major examples being flight control systems, nuclear power plants, satellite systems, super collider experiment systems, etc. Here, the flight control system is explained as an example of MISD architecture.
The Flight Control System – MISD Example for fault tolerance
A Fly-By-Wire system is used to replace the manual flight control by an electronic control interface. The movements of the flight control in the cockpit are converted to electronic signals and are transmitted to the actuators by wires. The control computers use the feedback from the sensors to compute and control the movement of the actuators to provide the expected response. These computers also perform the task to stabilize the aircraft and perform other tasks without the knowledge of the pilot. Flight control systems must meet extremely high levels of accuracy and functional integrity.
There are redundant flight control computers present in the flight control system. If one of the flight-control computers crashes, gets damaged or is affected by electromagnetic pulses, the other computer can overrule the faulty one and hence the flight of the aircraft is unharmed.The number of redundant flight control computers is generally more than two, so that any computer whose results disagree with the others is ruled out to be faulty and is either ignored or rebooted.
=====Multiple Processors Implementation in Boeing 777<ref>Yeh, Y.C.;, "Triple-triple redundant 777 primary flight computer," Aerospace Applications Conference, 1996. Proceedings., 1996 IEEE , vol.1, no., pp.293-307 vol.1, 3-10 Feb 1996 doi: 10.1109/AERO.1996.495891</ref>=====
In modern computers, the redundant flight control computations are carried out by multiprocessor systems. The triple redundant 777 primary flight computer, has the architecture as shown in Figure 9.
The system has three primary flight control computers, each of them having three lanes with different processors. The flight control program is compiled for each of the processors which get the input data from the same data bus but drive the output on their individual control bus. Thus each processor executes different instructions but they process the same data. Thus, it is the best suited example of Multiple Instruction Single Data (MISD) architecture.
The three processors selected for the flight control system of Boeing 777 were Intel 80486, Motorola 68040 and AMD 29050. The dissimilar processors lead to dissimilar interface hardware circuits and compilers. Each lane of the flight control computer is data synchronized with the other lanes so that all of the lanes read the same frame of data from the flight sensors. As the outputs of each lane can be different, the median value of the outputs is used to select the output of the lane to be considered. The lane which has the median value select hardware selected is said to be in “command mode” whereas the other lanes are said to be in “monitoring mode”. It receives the data from the other Primary Flight Computer (PFC) lanes and performs a median select of the outputs. This provides a fault blocking mechanism before the fault detection and identification by the cross-lane monitoring system. Thus, the MISD based multi computer architecture is capable of detecting generic errors in compilers or in complex hardware devices providing assurance beyond reasonable doubt of the dependability of the Fly-By-Wire system.
The above mentioned system clearly has individual Instruction Streams as the architecture of each processor is different, thus different instruction sets and different instruction streams. These processors have frame synchronized input data which means they have same set of data to work upon which is fed from a single data stream. Thus the flight control system can be classified under MISD architecture.
Edge Detection of Images on TMS320C80 Multiprocessor System<ref>M.Gajer;,“ Parallel Image Processing on The TMS320C80 Multiprocessor System”, http://focus.ti.com/pdfs/univ/11-CommunicationAudioSpeech.pdf</ref>
The image pre-processing system of this Multiprocessor, the pre-image processing time needed to be made shorter. To achieve the aim of reducing the pre-processing time, different types of image processing architectures for ‘C80 were presented. For example MIMD architecture is used for image histogram equalization; SIMD architecture for median filtering; MISD architecture for directional edge detection.
The MISD architecture can be explained as follows. Here, each processor performs different operations on the same data set. It is mentioned at many places that MISD architecture was defined only for the sake of completeness of classification due to difficulty in practice to find a computer based on MISD architecture<ref>TMS320C80 System Level Synopsis, Texas Instruments, 1995</ref>. But in case of Image Processing, this assumption need not be true as here making many different operations on same image is rare and hence MISD architecture is a best fit for such cases. As an example of implementation of image processing operations for the MISD architecture the edge detection operation will be considered. The edges should be detected in four different directions (north, south, west and east) by the medium of four different filter masks. Each of the parallel processors of the TMS320C80 calculates one of the directional edges. Finally one obtains four images with detected edges, each in different direction.
The image processing in computer is very often a multistage process composed of many stages. For example, an algorithm detecting contours of the objects in an image. The algorithm for achieving the before mentioned is composed of four basic image processing operation. The image acquired from the camera in the first step undergoes low-pass filtering, during which it is smoothed and noise is also eliminated. In the next step image is thresholding is performed and each pixel gets black or white depending on that if its grey level is below or above threshold value. The resulting image is a binary image from which edges are detected by the usage of the Laplace filter. In the last step the image with detected contours undergoes logic filtering in the purpose of elimination of single black pixels placed on the white background. For implementing the above processes parallel, a five-stage pipeline was used. The first stage of this pipeline constitutes the master processor which accounts for control of the image acquisition process. Further pipe line stages were implemented for 4 parallel processors as mentioned in table 1.
Stage | Processor | Operation |
1 | MP | Image acquisition |
2 | PP0/td> | Low-pass Filtering |
3 | PP1 | Thresholding |
4 | PP2 | Laplace Filtering |
5 | PP3 | Logic Filtering |
Table 1: Pipeline structure for image edge detection[8]
Recursive Pattern Matching using MISD Architecture <ref>Arne Halaas, Børge Svingen, Magnar Nedland, Pål Sætrom, Ola Snøve, Jr., and Olaf René Birkeland. 2004. A recursive MISD architecture for pattern matching. IEEE Trans. Very Large Scale Integr. Syst. 12, 7 (July 2004), 727-734. DOI=10.1109/TVLSI.2004.830918 http://dx.doi.org/10.1109/TVLSI.2004.830918</ref>
The situations where several patterns are required to be matched concurrently in data and no persistent index can be created for look-up requires the use of Online multipattern approximate searching. This algorithm requires evaluation of the hit function H(Si,P) which is the function working on strings and pattern,and requires partitioning P into smaller components and combining the individual results. A recursive approach is required and there exists a need to evaluate the Si value in parallel. Thus it helps in using an architecture with two complete binary trees which take care of the data distribution and result processing. The architecture is implemented with as many PE's that are practically possible within the implementation constraints. The data from the stream is fed to the root node of the data distribution tree and simultaneously distributed to several collection of PE's belonging to separate queries thus implementing the MISD architecture with a single data stream and multiple instruction streams.
Conclusion
The MISD architecture is one of the four types of parallel architecture described in Flynn’s Taxonomy. By definition, the MISD architecture has multiple instruction streams and a single data stream. The definition also describes this architecture as requiring that all processing elements receive the same data, but most commonly cited implementations of this class of architecture do not adhere to this detail. The result is there is debate among some experts on whether or not certain implementations are truly MISD.
MISD architectures are not commonly found in industry as they tend to lack the scalability and efficiency of resource use of the MIMD and SIMD class of architectures. However, MISD architectures are efficient for solving specific classes of problems often found in research. Some examples of its use in research include image processing and systolic arrays. While rare, some implementations do exist outside of academic settings. One such example is fault tolerant systems such as the flight control system on the Boeing 777.
References
<references/>
Glossary
- ALU: Arithmetic Logic Unit, the control unit in a computer
- CMU: Carnegie Mellon University
- Fly-By-Wire: System that replaces the conventional manual flight controls of an aircraft with an electronic interface
- FPGA: Field Programmable Gate Array
- MIMD: Multiple Instruction Multiple Data
- MISD: Multiple Instruction Single Data
- PE: Processing Element
- PFC: Primary Flight Computer
- SIMD: Single Instruction Multiple Data
- SISD: Single Instruction Single Data
Quiz
Q1: Using the classical definition of the MISD architecture from Flynn’s Taxonomy, what is the minimum number of input streams that must be present to be classified as MISD architecture?
A. One information stream, the data stream, the number of instruction streams do not matter.
B. Two information streams, one instruction stream, one data stream.
C. Three information streams, two instruction streams, one data stream.
D. The number of information streams is irrelevant as per Flynn’s Taxonomy.
A1: C, Three information streams, two instruction streams, one data stream.
Q2: Why is the Pipeline architecture often cited as an example of MISD?
A. The Pipeline architecture is exactly like the definition of MISD architecture from Flynn’s Taxonomy.
B. The data to be processed travels linearly across all processing elements with each processing element executing a different instruction and passing it on to the next processing element. This line of data is cited to be a single “stream.”
C. The data stream sends the same data to all processing elements for every cycle.
D. Flynn based his description of MISD architecture on the Pipeline architecture in his paper that introduced Flynn’s Taxonomy.
A2: B, The data to be processed travels linearly across all processing elements with each processing element executing a different instruction and passing it on to the next processing element. This line of data is cited to be a single “stream.”
Q3: Why is the Pipeline architecture NOT considered a true MISD architecture by many experts?
A. This is incorrect; there is consensus that the Pipeline architecture is synonymous with the MISD architecture described by Flynn’s Taxonomy.
B. There is only a single, global information/control stream in the Pipeline architecture.
C. The data passed to each processing element is different as it reflects the result of the processing of all the previous processing elements. Flynn’s Taxonomy describes the same data going to every processing element.
D. It has been proven that it is impossible to build a MISD architecture parallel computer as described by Flynn’s Taxonomy.
A3: C, The data passed to each processing element is different as it reflects the result of the processing of all the previous processing elements. Flynn’s Taxonomy describes the same data going to every processing element.
Q4: True or False? MISD architecture is the most common parallel computing architecture used for general computing tasks.
A. True
B. False
A4: B, False, it is often not the best suited architecture for general computing tasks and as such not common outside of academic research.
Q5: When classifying a particular parallel architecture, it was noted that there was only one physical connection to provide all information streams, is it possible for this architecture to be classified as MISD architecture? (Note that there are three choices, you need to select the correct reason for the Yes/No/Maybe answer.)
A. Yes, Flynn’s taxonomy does not require the number of physical connections in the hardware correspond to the number of logical streams.
B. Maybe, we do not have enough information about the architecture to rule out the possibility of this parallel computer having a MISD architecture. We will need to see how the connections are physically laid out on the chip.
C. No, there must be multiple physical information streams.
A5. A, Yes, Flynn’s taxonomy does not require the number of physical connections in the hardware correspond to the number of logical streams.
Q6. True or False? To be a MISD architecture, it requires that every processing element is identical.
A. True
B. False
A6. B, False, it is not required by definition of Flynn’s Taxonomy that all processing elements are identical. See example in the wiki reading about the fault tolerant flight control system on the Boeing 777.
Q7: Which of these is an application of the MISD architecture?
A. Keyword search
B. Controlling an assembly line.
C. Home computing
D. Image edge detection.
A7: D, Image edge detection.
Q8. True or False? The systolic array is another name for the MISD architecture.
A. True
B. False
A8. B, False, systolic arrays may be implemented using a MISD architecture but they also may be other architectures such as SIMD and MIMD.
Q9. True or False? MISD is the considered the easiest parallel architecture to scale.
A. True
B. False
A9. False, MIMD and SIMD are often used because of advantages in scaling.
Q10. Flynn’s Taxonomy relies on the number of logical information streams to classify the types of parallel architectures. What is the difference between a logical information stream and a physical information stream?
A. A logical information stream is a connection between the ALU and the location of the data that is responsible for passing information about boolean (logical) computations, where the physical information stream it the connection between computer and other parts of the network.
B. Physical information streams are determined by how the information is physically transported around chip while logical information streams are one layer of abstraction above the physical information streams. Logical streams describe conceptually how the information on the stream can be treated.
C. Logical and physical information streams transport different kinds of information but are no different in implementation.
D. There is no difference; the two terms can be used interchangeably.
A10. B, Physical information streams are determined by how the information is physically transported around chip while logical information streams are one layer of abstraction above the physical information streams. Logical streams describe conceptually how the information on the stream can be treated.