CSC 456 Fall 2013/4a bc: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
 
(24 intermediate revisions by 3 users not shown)
Line 1: Line 1:
=Load Balancing=
=Load Balancing=
In multi-processor systems, load-balancing is used to break up and distribute the workload to individual processors in order to make effective use of processor time. When the workload is divided up at compile-time, the balance is said to be ''statically'' balanced. Dividing the workload up during run-time is ''dynamically'' balancing the load. Static load balancing has reduced overhead as the work is divided before run time. Dynamic load balancing assigns work as processors become idle, so there is greater overhead. However, dynamic balancing can lead to increased performance of load balancing due to being able to assign work to a processor when it does become idle, reducing the overall idle time of processors.
In multi-processor systems, load-balancing is used to break up and distribute the work load to individual processors in order to make effective use of processor time. When the work load is divided up at compile-time, the balance is said to be ''statically'' balanced. Dividing the work load up during run-time is ''dynamically'' balancing the load. Static load balancing has reduced overhead as the work is divided before run time. Dynamic load balancing assigns work as processors become idle, so there is greater overhead. However, dynamic balancing can lead to improved performance of load balancing due to being able to assign work to a processor when it does become idle, reducing the overall idle time of processors.


==Static vs. Dynamic Techniques==
==Static vs. Dynamic Techniques==
Line 12: Line 12:
====Random====
====Random====


Random load balancing relies on the hope that over the course of enough time, workloads are evenly spread by random chance. Random is fairly easy to implement with little overhead. Generating good "random" values is one challenge, because the function is called so many times that any bias will have a large affect. Random suffers from the same drawbacks as round robin though. There is always the chance that a certain processor is randomly picked in an unusually frequent fashion, leading to wait times for other processors. Random could also assign multiple large tasks to a single processor in a short period of time, which would also lead to uneven load balancing.
Random load balancing relies on the hope that over the course of enough time, workloads are evenly spread by random chance. Random is fairly easy to implement with little overhead. Generating good "random" values is one challenge, because the function is called so many times that any bias will have a large effect. Random suffers from the same drawbacks as round robin though. There is always the chance that a certain processor is randomly picked in an unusually frequent fashion, leading to wait times for other processors. Random could also assign multiple large tasks to a single processor in a short period of time, which would also lead to uneven load balancing.


====Central Manager====
====Central Manager====
Line 21: Line 21:


====Local Queue====
====Local Queue====
Under local queue workload management, also called distributed workload management, each processor is responsible for maintaining a sufficient workload. When a load drops below a threshold, the load manager for the processor fires off a request to another random processor workload manager to send work. The remote load manager receiving the request examines it's own workload and, if it has sufficient extra workload, will send work to the requesting load manager. This algorithm scheme is fault tolerant in that if any processor were to fail, the other nodes would be able to continue working as they still have their workload and can still manager workloads with other processors. Unfortunately, this scheme generally requires a relatively large amount of inter-processor communications to maintain a satisfactory workload at all processors.
Under local queue workload management, also called distributed workload management, each processor is responsible for maintaining a sufficient workload. When a load drops below a threshold, the load manager for the processor fires off a request to another random processor workload manager to send work. The remote load manager receiving the request examines its own workload and, if it has sufficient extra work load, will send work to the requesting load manager. This algorithm scheme is fault tolerant in that if any processor were to fail, the other nodes would be able to continue working as they still have their workload and can still manage workloads with other processors. Unfortunately, this scheme generally requires a relatively large amount of inter-processor communications to maintain a satisfactory workload at all processors.


====Central Queue====
====Central Queue====
A centralized workload manager is responsible for distributing workload to processors under the central queue algorithm. The central manager is aware of all work to be distributed to the processors. When a processor's load falls below a threshold, a request for more work is sent to the central load manager, which then distributes more work. If there is not enough work in the central queue to meet the demand, the request is buffered until there enough work is available to meet the request. In systems with large numbers of processors, clusters can be formed of groups of processors with each cluster have a centralized workload manager. One workload manager would be in charge of distributing workloads to each cluster workload manager. This scheme has a lower fault tolerance as the system can be at risk of being brought down if the central load manager were to stop working. Also, an entire cluster could stop producing of it's central load manager were to stop functioning.
A centralized workload manager is responsible for distributing workload to processors under the central queue algorithm. The central manager is aware of all work to be distributed to the processors. When a processor's load falls below a threshold, a request for more work is sent to the central load manager, which then distributes more work. If there is not enough work in the central queue to meet the demand, the request is buffered until there enough work is available to meet the request. In systems with large numbers of processors, clusters can be formed of groups of processors with each cluster have a centralized workload manager. One workload manager would be in charge of distributing workloads to each cluster workload manager. This scheme has a lower fault tolerance as the system can be at risk of being brought down if the central load manager were to stop working. Also, an entire cluster could stop producing of its central load manager were to stop functioning.
 
==Comparisons of Static versus Dynamic==
{| class="wikitable"
|+ Table: Comparison of Load Balancing Algorithms<ref name="complb">http://masters.donntu.edu.ua/2010/fknt/babkin/library/article11.pdf
{{cite web
|        url = http://masters.donntu.edu.ua/2010/fknt/babkin/library/article11.pdf
|      title = Performance Analysis of Load Balancing Algorithms
|      last1 =
|    first1 =
|    middle1 =
|      last2 =
|    first2 =
|    middle2 =
|  location =
|      date =
| accessdate November 19, 2013
|  separator = ,
}}
</ref>
|-
! Parameters
! Round Robin
! Random
! Central Manager
! Local Queue
! Central Queue
|-
| Dynamic/Static
| Static
| Static
| Static
| Dynamic
| Dynamic
|-
| Overload Rejection
| No
| No
| No
| Yes
| Yes
|-
| Fault Tolerant
| No
| No
| Yes
| Yes
| Yes
|-
| Forecasting Accuracy
| More
| More
| More
| Less
| Less
|-
| Stability
| Large
| Large
| Large
| Small
| Small
|-
| Centralized/Decentralized
| D
| D
| C
| D
| C
|-
| Cooperative
| No
| No
| Yes
| Yes
| Yes
|-
| Process Migration
| No
| No
| No
| Yes
| No
|-
| Resource Utilization
| Less
| Less
| Less
| More
| Less
|}


==Real World applications of Load Balancing==
==Real World applications of Load Balancing==
====Weather Modeling====
====Weather Modeling====
Loading balancing methods play a large role in weather modeling as the amount of data that needs to be processed is quite large and computationally intensive. Many models construct their own data structures and use variations on static and dynamic load balancing to achieve satisfactory performance.
[http://wwwpub.zih.tu-dresden.de/~mlieber/publications/para10web.pdf Highly Scalable Dynamic Load Balancing in the Atmospheric Modeling System COSMO-SPECS+FD4 ]


====Visible Human Project====
====Visible Human Project====
Line 33: Line 126:
==Examples of Load Balancing in action==
==Examples of Load Balancing in action==


Server Load balancing pseudocode
Server Load balancing pseudocode <ref name="pseudocode">http://code.google.com/p/hypertable/wiki/LoadBalancing
{{cite web
|        url = http://code.google.com/p/hypertable/wiki/LoadBalancing
|      title = Load Balancing Design
|      last1 =
|    first1 =
|    middle1 =
|      last2 =
|    first2 =
|    middle2 =
|  location =
|      date =
| accessdate December 28, 2010
|  separator = ,
}}
</ref>


<code>
<code>
//sort the load data and store it in different orders for use later
  server_load_vec_desc = sort_descending(server_load_vec);
  server_load_vec_desc = sort_descending(server_load_vec);
  server_load_vec_asc = sort_ascending(server_load_vec);
  server_load_vec_asc = sort_ascending(server_load_vec);
  while (server_load_vec_desc[0].deviation > DEVIATION_THRESHOLD) {
&nbsp;
//While the deviation is too high, iterate through the nodes
  while (server_load_vec_desc[0].deviation > DEVIATION_THRESHOLD)
{
  //get the tasks for node [0], and sort them
   populate_range_load_vector(server_load_vec_desc[0].server_name);
   populate_range_load_vector(server_load_vec_desc[0].server_name);
   sort descending range_load_vec;
   sort descending range_load_vec;
&nbsp;
   i=0;
   i=0;
   while (server_load_vec_desc[0].deviation > DEVIATION_THRESHOLD &&
  //iterates through the past load data for this node
            i < range_load_vec.size()) {
   while (server_load_vec_desc[0].deviation > DEVIATION_THRESHOLD && i < range_load_vec.size())
     if (moving range_load_vec[i] from server_load_vec_desc[0] to server_load_vec_asc[0] reduces deviation) {
  {
    &nbsp;
    //If a given swap results in a lesser deviation
     if (moving range_load_vec[i] from server_load_vec_desc[0] to server_load_vec_asc[0] reduces deviation)
    {
        //swap and update load balance data related to the load swap
         add range_load_vec[i] to balance plan
         add range_load_vec[i] to balance plan
         partial_deviation = range_load_vec[i].loadestimate * loadavg_per_loadestimate;
         partial_deviation = range_load_vec[i].loadestimate * loadavg_per_loadestimate;
Line 55: Line 174:
     i++;
     i++;
   }
   }
&nbsp;
  //if true, then the entire load has been processed for this node, and entry [0] which is the current node can be removed
   if (i == range_load_vec.size())
   if (i == range_load_vec.size())
  {
     remove server_load_vec_desc[0] and corresponding entry in server_load_vec_asc   
     remove server_load_vec_desc[0] and corresponding entry in server_load_vec_asc   
  }
&nbsp;
  //re-balance the load before iterating again on the next node
   server_load_vec_desc = sort_descending(server_load_vec_desc);
   server_load_vec_desc = sort_descending(server_load_vec_desc);
  }
  }
</code>
</code>


==Sources==
==Sources==
====References====
<references/>


====Other Sources====
<ol>
<ol>
<li>[http://code.google.com/p/hypertable/wiki/LoadBalancing Load Balancing PseudoCode and other information]  </li>
<li>[http://paper.ijcsns.org/07_book/201006/20100619.pdf A Guide to Dynamic Load Balancing in Distributed Computer Systems] </li>
<li>[http://paper.ijcsns.org/07_book/201006/20100619.pdf A Guide to Dynamic Load Balancing in Distributed Computer Systems] </li>
<li>[http://www.ics.uci.edu/~cs237/reading/parallel.pdf Strategies for Dynamic Load Balancing on Highly Parallel Computers] </li>
<li>[http://www.ics.uci.edu/~cs237/reading/parallel.pdf Strategies for Dynamic Load Balancing on Highly Parallel Computers] </li>
<li>[http://www.vsrdjournals.com/CSIT/Issue/2013_05_May/Web/1_Jagdeep_Singh_1670_Research_Article_VSRDIJCSIT_May_2013.docx SIMULATION OF STATIC LOAD BALANCING ALGORITHMS ON HOMOGENEOUS AND HETEROGENEOUS CPUs ] </li>
<li>[http://www.vsrdjournals.com/CSIT/Issue/2013_05_May/Web/1_Jagdeep_Singh_1670_Research_Article_VSRDIJCSIT_May_2013.docx Simulation of Static Load Balancing Algorithms on Homogeneous and Heterogeneous CPUs ] </li>
<li>[http://masters.donntu.edu.ua/2010/fknt/babkin/library/article11.pdf Performance Analysis of Load Balancing Algorithms]</li>
</ol>
</ol>

Latest revision as of 17:15, 19 November 2013

Load Balancing

In multi-processor systems, load-balancing is used to break up and distribute the work load to individual processors in order to make effective use of processor time. When the work load is divided up at compile-time, the balance is said to be statically balanced. Dividing the work load up during run-time is dynamically balancing the load. Static load balancing has reduced overhead as the work is divided before run time. Dynamic load balancing assigns work as processors become idle, so there is greater overhead. However, dynamic balancing can lead to improved performance of load balancing due to being able to assign work to a processor when it does become idle, reducing the overall idle time of processors.

Static vs. Dynamic Techniques

Static Load balancing

Round Robin

Round robin is a load balancing technique which evenly distributes tasks across available processors. Each processor is lined up, and given a task one after the other until it loops around again back to the first processor. Visualize a dealer in a casino passing out cards to each player in a circle, one at a time. The advantage is that this is a very simple load balancing technique to implement, with very little overhead. A disadvantage is that there is no care given to the job size or performance. This can create problems if a processor is unlucky and is continually assigned large tasks, causing it to fall behind.

Random

Random load balancing relies on the hope that over the course of enough time, workloads are evenly spread by random chance. Random is fairly easy to implement with little overhead. Generating good "random" values is one challenge, because the function is called so many times that any bias will have a large effect. Random suffers from the same drawbacks as round robin though. There is always the chance that a certain processor is randomly picked in an unusually frequent fashion, leading to wait times for other processors. Random could also assign multiple large tasks to a single processor in a short period of time, which would also lead to uneven load balancing.

Central Manager

Central manager is a load balancing scheme which selects a certain processor to act as the "central node", which handles the balancing. The central node assigns each new task to the slave processor which currently has the least load. This method has a different overhead than usual. Before there would be intercommunication between all processors, where as with central load balancing, the communication exists solely between the central node and the other processors. A drawback of the Central Management is that it usually works best with smaller networks of processors. A hierarchy of master central nodes controlling lesser central nodes is possible, but adds more complexity. It is possible for a central control node to be inundated by messages from its children nodes, locking up the system and causing great drops in performance. The Central Manager policy has an advantage because it requires fewer messages to be sent in order to facilitate load balancing. This method also greatly reduces the chance that any one processor is overworked or left idle.

Dynamic Load Balancing

Local Queue

Under local queue workload management, also called distributed workload management, each processor is responsible for maintaining a sufficient workload. When a load drops below a threshold, the load manager for the processor fires off a request to another random processor workload manager to send work. The remote load manager receiving the request examines its own workload and, if it has sufficient extra work load, will send work to the requesting load manager. This algorithm scheme is fault tolerant in that if any processor were to fail, the other nodes would be able to continue working as they still have their workload and can still manage workloads with other processors. Unfortunately, this scheme generally requires a relatively large amount of inter-processor communications to maintain a satisfactory workload at all processors.

Central Queue

A centralized workload manager is responsible for distributing workload to processors under the central queue algorithm. The central manager is aware of all work to be distributed to the processors. When a processor's load falls below a threshold, a request for more work is sent to the central load manager, which then distributes more work. If there is not enough work in the central queue to meet the demand, the request is buffered until there enough work is available to meet the request. In systems with large numbers of processors, clusters can be formed of groups of processors with each cluster have a centralized workload manager. One workload manager would be in charge of distributing workloads to each cluster workload manager. This scheme has a lower fault tolerance as the system can be at risk of being brought down if the central load manager were to stop working. Also, an entire cluster could stop producing of its central load manager were to stop functioning.

Comparisons of Static versus Dynamic

Table: Comparison of Load Balancing Algorithms<ref name="complb">http://masters.donntu.edu.ua/2010/fknt/babkin/library/article11.pdf </ref>
Parameters Round Robin Random Central Manager Local Queue Central Queue
Dynamic/Static Static Static Static Dynamic Dynamic
Overload Rejection No No No Yes Yes
Fault Tolerant No No Yes Yes Yes
Forecasting Accuracy More More More Less Less
Stability Large Large Large Small Small
Centralized/Decentralized D D C D C
Cooperative No No Yes Yes Yes
Process Migration No No No Yes No
Resource Utilization Less Less Less More Less

Real World applications of Load Balancing

Weather Modeling

Loading balancing methods play a large role in weather modeling as the amount of data that needs to be processed is quite large and computationally intensive. Many models construct their own data structures and use variations on static and dynamic load balancing to achieve satisfactory performance.

Highly Scalable Dynamic Load Balancing in the Atmospheric Modeling System COSMO-SPECS+FD4

Visible Human Project

Examples of Load Balancing in action

Server Load balancing pseudocode <ref name="pseudocode">http://code.google.com/p/hypertable/wiki/LoadBalancing

</ref>

//sort the load data and store it in different orders for use later
server_load_vec_desc = sort_descending(server_load_vec);
server_load_vec_asc = sort_ascending(server_load_vec);
 
//While the deviation is too high, iterate through the nodes
while (server_load_vec_desc[0].deviation > DEVIATION_THRESHOLD)
{
  //get the tasks for node [0], and sort them
  populate_range_load_vector(server_load_vec_desc[0].server_name);
  sort descending range_load_vec;
 
  i=0;
  //iterates through the past load data for this node
  while (server_load_vec_desc[0].deviation > DEVIATION_THRESHOLD && i < range_load_vec.size())
  {
     
    //If a given swap results in a lesser deviation
    if (moving range_load_vec[i] from server_load_vec_desc[0] to server_load_vec_asc[0] reduces deviation)
    {
       //swap and update load balance data related to the load swap
       add range_load_vec[i] to balance plan
       partial_deviation = range_load_vec[i].loadestimate * loadavg_per_loadestimate;
       server_load_vec_desc[0].loadavg -= partial_deviation;
       server_load_vec_desc[0].deviation -= partial_deviation;
       server_load_vec_asc[0].loadavg += partial_deviation;
       server_load_vec_asc[0].deviation += partial_deviation;
       server_load_vec_asc = sort_ascending(server_load_vec_asc); 
    }
    i++;
  }
 
  //if true, then the entire load has been processed for this node, and entry [0] which is the current node can be removed
  if (i == range_load_vec.size())
  {
    remove server_load_vec_desc[0] and corresponding entry in server_load_vec_asc  
  }
 
  //re-balance the load before iterating again on the next node
  server_load_vec_desc = sort_descending(server_load_vec_desc);
}

Sources

References

<references/>

Other Sources

  1. A Guide to Dynamic Load Balancing in Distributed Computer Systems
  2. Strategies for Dynamic Load Balancing on Highly Parallel Computers
  3. Simulation of Static Load Balancing Algorithms on Homogeneous and Heterogeneous CPUs
  4. Performance Analysis of Load Balancing Algorithms