CSC/ECE 506 Fall 2007/wiki2 4 md
Parallel Application: Shuffled Complex Evolution Metropolis
Overview
The scope of the environmental sciences grows significantly each year. As computer systems advance, more intricate (and thus more realistic) models are developed, and the number of parameters involved in the computation grows with them. The Shuffled Complex Evolution Metropolis (SCEM-UA) algorithm is a global optimization algorithm used to estimate the parameters of environmental models. The algorithm is then implemented in parallel, in a user-friendly way, to speed it up further. It can be applied to several model case studies, such as the prediction of migratory bird flight paths.
Steps of Parallelization
Flow Chart Overview of Parallelization
Decomposition
Decomposition is the division of a computation into tasks. The SCEM-UA algorithm takes a dimension (n), a number of complexes (k), and a population size (s) as inputs. The number of points in each complex, m, is then calculated using the formula:
m = s/k
Each processor is then assigned a different subset of the s points to evaluate. Once this is accomplished, the results are stored in an array D[1:s,1:n+1], where n is the number of parameters. Finally, the k sequences are initialized with the first k rows of D as their starting locations, such that Sk = D[k,1:n+1].
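The set-up described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the uniform sampling range, and toy_log_density (a stand-in for the real model evaluation) are hypothetical, and the sketch assumes that column n+1 of D holds the density of each point.

```python
import random

def toy_log_density(theta):
    """Hypothetical stand-in for running the environmental model."""
    return -sum(x * x for x in theta)

def build_population(n, k, s, seed=0):
    """Return (m, D, sequences) for n parameters, k complexes, s points."""
    if s % k != 0:
        raise ValueError("s must be divisible by k so that m = s/k is an integer")
    m = s // k  # points per complex
    rng = random.Random(seed)
    D = []
    for _ in range(s):
        # Assumed sampling range; the real prior depends on the model.
        theta = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        # Row layout: n parameter values followed by the density (n+1 columns).
        D.append(theta + [toy_log_density(theta)])
    # The first k rows of D become the starting points of the k sequences.
    sequences = [list(D[i]) for i in range(k)]
    return m, D, sequences

m, D, sequences = build_population(n=3, k=4, s=20)
print(m, len(D), len(D[0]), len(sequences))  # prints: 5 20 4 4
```

In the parallel version, the loop that fills D is the part that would be split across processors, since each row can be evaluated independently.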
Assignment
Once the computation has been decomposed into tasks, the population is partitioned into complexes. The s points of D are partitioned into k complexes {C1 ... Ck}, each containing m points. The first complex contains every k(j-1)+1 point of D, the second contains every k(j-1)+2 point, and so on (where j = 1,...,m).
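The indexing rule above amounts to dealing the rows of D out to the complexes in round-robin order. A minimal sketch, assuming D is a Python list of rows and using the hypothetical helper name partition_into_complexes:

```python
def partition_into_complexes(D, k):
    """Split the s rows of D into k complexes of m = s/k rows each.

    Complex i (1-based) receives rows k(j-1)+i for j = 1..m, which is
    the slice D[i-1::k] with 0-based indexing.
    """
    if len(D) % k != 0:
        raise ValueError("len(D) must be divisible by k")
    return [D[i::k] for i in range(k)]

# Use the 1-based point numbers themselves as stand-ins for the rows of D.
points = list(range(1, 13))  # s = 12
complexes = partition_into_complexes(points, k=3)
print(complexes)
# prints: [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```

With k = 3 and m = 4, the first complex holds points 1, 4, 7 and 10, which matches k(j-1)+1 for j = 1,...,4.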
Orchestration
The orchestration phase of parallelization is concerned with process communication and synchronization. Interestingly, this algorithm does not concern itself with communication among the processors, because the data sets are so large that the majority of computational time is allotted to running the model code and/or generating the desired output.
Since the data is stored in an array (referenced here as D), the array is sorted in order of decreasing posterior density once the computations are complete.
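The sorting step can be sketched as follows, assuming each row of D stores its posterior density in the last column (an assumption of this example, as is the helper name sort_by_density):

```python
def sort_by_density(D):
    """Return the rows of D ordered by decreasing density (last column)."""
    return sorted(D, key=lambda row: row[-1], reverse=True)

# Three points with two parameters each; the third column is the density.
D = [
    [0.1, 0.2, -3.0],
    [0.5, 0.4, -1.0],
    [0.9, 0.3, -2.0],
]
sorted_D = sort_by_density(D)
print([row[-1] for row in sorted_D])  # prints: [-1.0, -2.0, -3.0]
```

The sort runs once over the gathered results, so it does not require the processors to exchange data while they are evaluating the model.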
Mapping
In the mapping of the SCEM-UA algorithm, it is vital that each processor has a different data set on which to work. Candidate points are generated and then sent to different nodes. Since each node works on its own set of data, and that data is stored in array D, there is no need for communication between processors (as described in "Orchestration").
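The scatter, evaluate, and gather pattern described here can be sketched in Python. Everything in this example is an assumption made for illustration: a thread pool stands in for the separate nodes, toy_log_density stands in for the model run, and the round-robin split is just one way to give each node a disjoint subset.

```python
from concurrent.futures import ThreadPoolExecutor

def toy_log_density(theta):
    """Hypothetical stand-in for running the environmental model."""
    return -sum(x * x for x in theta)

def split_among_nodes(candidates, n_nodes):
    """Deal candidates round-robin so each node gets a disjoint subset."""
    return [candidates[i::n_nodes] for i in range(n_nodes)]

def evaluate_chunk(chunk):
    """Work done by one node: evaluate only its own candidate points."""
    return [list(theta) + [toy_log_density(theta)] for theta in chunk]

def evaluate_in_parallel(candidates, n_nodes):
    chunks = split_among_nodes(candidates, n_nodes)
    # Worker threads stand in for nodes; they never talk to each other.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        results = list(pool.map(evaluate_chunk, chunks))
    # Gather: flatten the per-node results into rows of D.
    return [row for node_rows in results for row in node_rows]

candidates = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [3.0, 0.0]]
rows = evaluate_in_parallel(candidates, n_nodes=2)
print(len(rows), sorted(row[-1] for row in rows))
# prints: 4 [-9.0, -5.0, -1.0, 0.0]
```

The gathered rows come back grouped by node rather than in the original order, which is harmless here because D is sorted by density afterwards.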
References
Application of Parallel Computing to Stochastic Parameter Estimation in Environmental Models