CSC 456 Fall 2013/7a ac


Survey of Primitives for Synchronization

This wiki touches on high-level synchronization methods across multiple languages and platforms. First it defines the most common of these high-level constructs, their purpose, and their appropriate use. Next it shows which languages support these constructs natively and on which operating systems. Then it gives examples of external libraries providing APIs for different types of synchronization. Finally, it summarizes a major paper describing the performance of some common synchronization methods.

Types of Synchronization

Locks

A lock is a synchronization mechanism for enforcing limits on access to a resource in an environment with many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy.<ref name="lock"/> There are several different implementations of locks, as well as higher-level constructs that use them, which are briefly examined below. Several languages, such as C#, have built-in support for locks, but these implementations are closer to the Java synchronized keyword described below.

Semaphores and Mutexes

Semaphores are simple data types used for controlling access to a shared resource. They classically have two operations, wait() and signal(), also known as acquire() and release(). A semaphore holds an integer value representing the number of available units of a resource. The wait() or acquire() operation attempts to decrement the value of the semaphore (signifying that a thread is taking one of the units) but will only succeed if the value is greater than 0, since there can never be a negative number of units available. The signal() or release() operation increments the value of the semaphore, signifying that the thread has freed its unit of the resource.

There are two types of semaphores: binary semaphores and counting semaphores. A binary semaphore can only hold a value of 0 or 1, meaning that only one thread can hold it at a time. A counting semaphore can track any number of resource instances. Mutexes are essentially binary semaphores with a few extra safety features that make them more desirable to use than plain semaphores.<ref name="semaphore"/>

The following pseudocode shows an implementation of acquire() and release()<ref name="semaphore"/>:

 function acquire(semaphore S):
   repeat:
     [if S > 0:
         S ← S - 1
         break]
 
 function release(semaphore S):
   [S ← S + 1]
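
Java exposes this counting-semaphore behavior directly through java.util.concurrent.Semaphore. The sketch below is illustrative only: the pool size of 3 and the ConnectionPool name are assumptions, not anything taken from the sources above.

 import java.util.concurrent.Semaphore;
 
 public class ConnectionPool {
     // 3 units of the resource are available; acquire() blocks while the count is 0
     private final Semaphore units = new Semaphore(3);
 
     public void useConnection() throws InterruptedException {
         units.acquire();        // take one unit (decrement the count)
         try {
             // ... work with one of the pooled connections ...
         } finally {
             units.release();    // free the unit (increment the count)
         }
     }
 }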

Synchronized

In Java, the synchronized keyword forces methods, or even smaller blocks of code, to be executed with mutual exclusion.<ref name="synchronized"/> C# has a lock keyword that functions in much the same way: all code within a synchronized method or lock statement is executed with mutual exclusion.

This Java code shows how its synchronized keyword can be used to achieve synchronization:<ref name="synchronized"/>

 public class SynchronizedCounter {
   private int c = 0;
 
   public synchronized void increment() {
     c++;
   }
 
   public synchronized void decrement() {
     c--;
   }
 
   public synchronized int value() {
     return c;
   }
 }

The following example in C# shows a high level implementation of a lock:<ref name="lock"/>

 class Account {     // this class acts as a monitor for an account
   long val = 0;
   object thisLock = new object();
 
   public void Deposit(long x) {
     lock (thisLock) {   // only 1 thread at a time may execute this statement
       val += x;
     }
   }
 
   public void Withdraw(long x) {
     lock (thisLock) {
       val -= x;
     }
   }
 }

In Java, the synchronized keyword must be applied to every method that needs to be synchronized, and it locks the entire method. In the C# example, the locking takes place within the method bodies, meaning multiple threads could be executing the same method at the same time, just not the piece of code within the lock statement. Java can achieve the same block-level locking with a synchronized statement, as the sketch below shows.
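
The following is a hedged Java counterpart to the C# Account above, locking on an explicit object inside the method bodies; it is an illustration, not code from the cited sources.

 public class Account {
   private long val = 0;
   private final Object thisLock = new Object();
 
   public void deposit(long x) {
     synchronized (thisLock) {   // only the guarded block is mutually exclusive
       val += x;
     }
   }
 
   public void withdraw(long x) {
     synchronized (thisLock) {
       val -= x;
     }
   }
 }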

Monitors

According to Wikipedia, a monitor is "a thread-safe class, object, or module that uses wrapped mutual exclusion in order to transparently safely be used by more than one thread."<ref name="monitor"/> In more basic terms, a monitor consists of a mutex and a condition variable. A thread first acquires the mutex; if it then finds it cannot proceed, it waits on the condition variable, which releases the mutex while the thread sleeps so that other threads can enter the monitor. This avoids both busy-waiting and the deadlock that would result from sleeping while still holding the mutex.
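
This mutex-plus-condition-variable pairing can be seen concretely in java.util.concurrent.locks. The one-slot buffer below is a minimal sketch, not an example from the cited sources: a thread that cannot proceed calls await(), which releases the lock while it waits.

 import java.util.concurrent.locks.Condition;
 import java.util.concurrent.locks.ReentrantLock;
 
 public class OneSlotBuffer {
   private final ReentrantLock lock = new ReentrantLock();
   private final Condition notEmpty = lock.newCondition();
   private final Condition notFull = lock.newCondition();
   private Integer slot = null;
 
   public void put(int value) throws InterruptedException {
     lock.lock();
     try {
       while (slot != null)
         notFull.await();      // releases the lock until another thread signals
       slot = value;
       notEmpty.signal();
     } finally {
       lock.unlock();
     }
   }
 
   public int take() throws InterruptedException {
     lock.lock();
     try {
       while (slot == null)
         notEmpty.await();
       int value = slot;
       slot = null;
       notFull.signal();
       return value;
     } finally {
       lock.unlock();
     }
   }
 }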

A monitor can also be a thread-safe class whose methods are executed with mutual exclusion, giving the programmer an easier way to implement synchronization.<ref name="monitor"/> Java's intrinsic object monitors, used by the synchronized keyword, fall into this second category.

The following example shows a monitor class using Java-like syntax.<ref name="monitor"/>

 monitor class Account {
  private int balance := 0
  invariant balance >= 0

  public method boolean withdraw(int amount)
     precondition amount >= 0
  {
    if balance < amount then return false
    else { balance := balance - amount ; return true }
  }

  public method deposit(int amount)
     precondition amount >= 0
  {
    balance := balance + amount
  }
 }

Notice that there is no use of locks, synchronized statements, or any other explicit synchronization code other than the monitor keyword. By definition, all the methods in the class are executed with mutual exclusion, removing the need to deal with synchronization independently.
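
No mainstream language actually has a monitor keyword, but the same class can be written in ordinary Java by marking every method synchronized. The sketch below is one possible translation; the exception-based checks are an assumption standing in for the pseudocode's precondition clauses.

 public class Account {
   private int balance = 0;   // invariant: balance >= 0
 
   public synchronized boolean withdraw(int amount) {
     if (amount < 0) throw new IllegalArgumentException("amount must be >= 0");
     if (balance < amount) return false;   // refuse rather than break the invariant
     balance -= amount;
     return true;
   }
 
   public synchronized void deposit(int amount) {
     if (amount < 0) throw new IllegalArgumentException("amount must be >= 0");
     balance += amount;
   }
 }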

Supported Synchronization in Different Languages

The following table lists the major well-known native synchronization methods for some popular languages and the operating systems that support them. In most cases, if a compiler or runtime for a language exists on an operating system, all of the language's synchronization features are available there. Some synchronization types have been grouped together because they behave similarly, such as semaphores and mutexes. Others, such as condition variables, are grouped with monitors because they extend the same abilities to a language. This list is not comprehensive; for instance, Java has classes supporting almost every synchronization method, even though most of them are simulated in software rather than truly atomic at the hardware level. The OS support key is as follows:

Supported Operating Systems Key: None | Windows Only | Linux Only | Windows & Linux | Windows, Linux, & Mac
(In the wiki rendering, each table cell is color-coded against this key to indicate the operating systems on which the construct is supported.)

Native Synchronization Support
 Language | Semaphore/Mutex           | Monitor/Cond. Var              | Synchronized           | Other
 Java     |                           |                                |                        | Reentrant Lock<ref name="reentrant_java"/>
 C/C++    | <ref name="semaphore_c"/> | <ref name="conditional_c"/>    |                        | Slim Rd/Wr Lock (SRW)<ref name="slimrdwr"/>
 C#       |                           |                                |                        |
 Ruby     |                           | <ref name="conditional_ruby"/> |                        |
 Python   |                           |                                |                        | Reentrant Lock<ref name="reentrant_python"/>
 PHP      |                           |                                | <ref name="sync_php"/> |
 Perl     |                           | <ref name="conditional_perl"/> |                        | flock<ref name="flock_perl"/>
 Fortran  |                           |                                |                        | Coarray<ref name="coarray"/>

This table gives examples of external libraries that provide an API for synchronization, thread support, or some type of parallelization. It is by no means exhaustive; it is simply meant to give some examples and show the diversity of available libraries. OpenMP uses compiler directives called pragmas to divide a program into threads and includes support for synchronization techniques such as barriers, critical-section declarations, and other synchronization constructs. PaStiX is an API specifically for parallelizing direct solvers for sparse matrices. Intel TBB (Threading Building Blocks) is a library aimed at maximizing the use of multi-core processors by means of task stealing. RubySync is an open-source project operating at a much higher level than the others here; it is meant for synchronization across databases.

External Synchronization Support
 Language | OpenMP<ref name="openmp"/> | PaStiX<ref name="pastix"/> | Intel TBB | RubySync
 Fortran  |                            |                            |           |
 C/C++    |                            |                            |           |
 Python   |                            |                            |           |
 Ruby     |                            |                            |           | <ref name="rubysync"/>



A Study on the Behavior of Synchronization Methods in Commonly Used Languages and Systems

Introduction

In 2013 the Computer Science and Engineering department at Chalmers University of Technology published a study entitled A Study on the Behavior of Synchronization Methods in Commonly Used Languages and Systems,<ref name="cederman"/> by Cederman et al. The paper analyzes the throughput and fairness of different lock methods across two architectures and three languages. Most benchmarks focus on throughput because it is a rough measurement of 'speed': how much processing can be completed in a given amount of time. However, the methods with the highest throughput usually have the lowest fairness, meaning they tend to starve one or more threads. Fairness is largely overlooked in evaluations of synchronization methods, and this paper addresses that gap, using a scale where 1 means all threads received equal work and 0 means at least one thread starved completely. The authors also compare three different approaches to synchronization: fine-grained locking, coarse-grained locking, and lock-free algorithms. These are implemented separately for the two tests that were performed: accessing a global FIFO queue, and modifying a global chained hash table. Finally, the tests were run on two different architectures, Intel Nehalem and AMD Bulldozer, with the languages C++, Java, and C# (.NET and Mono).

Findings

The results of the study appear in the graphs below. For throughput, note that the vertical axes do not represent the same ranges of values.

Queue Results
Hash Table Results

Conclusions

As can be seen, some synchronization methods perform much better for some jobs in some languages. For C++, lock-free methods outperform everything else for hash table access (Figure 4); while lock-free was a top performer for C# and Java as well, it was not the clear winner there. Note also that in the hash table throughput graphs, the scale of the C++ throughput axis is much higher than all the others. In terms of fairness, the lock-free methods are roughly the same as all the other methods. .NET is the clear loser for fairness across all methods: as the number of threads passes 12, its fairness quickly drops off.

Looking at throughput for the queues (Figure 1), C# (.NET) had the best throughput up until about 12 threads using TTAS (test-test-and-set); Java's Reentrant Lock was the winner at 24 threads; beyond that, C# (Mono and .NET) had the best throughput for queue access using TTAS. In C++ most of the methods performed more or less equivalently, with TTAS staying near the top most of the time. Interestingly, every method decreased in throughput as more threads were added except Java's Reentrant Lock and C# (Mono)'s TTAS, both of which kept scaling up to 24 threads. Java's Reentrant Lock and C#'s TTAS therefore look like good choices for synchronizing a FIFO queue. In terms of fairness, all the methods in all languages dropped off drastically after 24 threads. In C++, TTAS, the Array Lock, and TAS dropped to starvation levels. In Java, all methods dropped low except the Reentrant Fair lock, which seems logical; that lock had by far the worst throughput, however, so the trade-off does not seem worthwhile. In C#, TTAS dropped off on par with the other methods, with .NET showing worse starvation than Mono as threads increased.
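
For readers unfamiliar with TTAS: the lock spins on an ordinary read of the flag and only attempts the atomic test-and-set once the flag appears free, which keeps contended spinning in the local cache. The Java sketch below illustrates the idea; it is not the implementation used in the study.

 import java.util.concurrent.atomic.AtomicBoolean;
 
 public class TTASLock {
   private final AtomicBoolean held = new AtomicBoolean(false);
 
   public void lock() {
     while (true) {
       while (held.get()) { }                  // first "test": spin on a plain read
       if (held.compareAndSet(false, true))    // then "test-and-set": try to grab it
         return;
     }
   }
 
   public void unlock() {
     held.set(false);
   }
 }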

References

<references>
<ref name="lock">Lock</ref>
<ref name="semaphore">Semaphore</ref>
<ref name="monitor">Monitor (synchronization)</ref>
<ref name="openmp">OpenMP</ref>
<ref name="synchronized">Java Synchronization</ref>
<ref name="slimrdwr">Slim Reader/Writer Lock</ref>
<ref name="coarray">Co-Array Fortran</ref>
<ref name="reentrant_python">Python: ReLocks, Mutex, Cond. Vars</ref>
<ref name="reentrant_java">Reentrant Lock (Java)</ref>
<ref name="semaphore_c">C Mutex</ref>
<ref name="conditional_c">C Conditional</ref>
<ref name="conditional_ruby">Ruby Conditional</ref>
<ref name="sync_php">PHP Synchronized</ref>
<ref name="conditional_perl">Perl: Mutex, Cond. Var</ref>
<ref name="flock_perl">Perl flock</ref>
<ref name="pastix">PaStiX</ref>
<ref name="rubysync">RubySync</ref>
<ref name="cederman">Cederman et al</ref>
</references>

External Links

Apple Synchronization
C++ Sync Article