CSC/ECE 506 Spring 2013/3a bs
Overview:
The main goal of this wiki is to explain which architectural mechanisms are used by library functions for DOALL, DOACROSS, and DOPIPE parallelism, reduction, and functional parallelism in various architectures.
Synchronization:
== Libraries: 1. Semaphore.h ==
A semaphore is a special variable that acts similarly to a lock. If the semaphore can be acquired, the process can proceed into the critical section. If the semaphore cannot be acquired, the process is “put to sleep” and the processor is used for another process. This means the process's state is saved in a place from which it can be restored when the process is “woken up”. Once the semaphore becomes available, the “sleeping” process is woken up, obtains the semaphore, and proceeds into the critical section. A simple way to use a semaphore is through the following functions [1]:
I. Initializing a semaphore:
int sem_init(sem_t *sem, int pshared, unsigned int value); sem_init() initializes the unnamed semaphore at the address pointed to by sem. The value argument specifies the initial value for the semaphore. The pshared argument indicates whether this semaphore is to be shared between the threads of a process, or between processes. If pshared has the value 0, then the semaphore is shared between the threads of a process, and should be located at some address that is visible to all threads (e.g., a global variable, or a variable allocated dynamically on the heap). If pshared is nonzero, then the semaphore is shared between processes, and should be located in a region of shared memory (see shm_open(3), mmap(2), and shmget(2)). (Since a child created by fork(2) inherits its parent's memory mappings, it can also access the semaphore.) Any process that can access the shared memory region can operate on the semaphore using sem_post(3), sem_wait(3), etc. Initializing a semaphore that has already been initialized results in undefined behavior.
Return Value: sem_init() returns 0 on success; on error, -1 is returned, and errno is set to indicate the error.
II. Locking the semaphore:
1. int sem_wait(sem_t *sem); sem_wait() decrements (locks) the semaphore pointed to by sem. If the semaphore's value is greater than zero, then the decrement proceeds, and the function returns immediately. If the semaphore currently has the value zero, then the call blocks until either it becomes possible to perform the decrement (i.e., the semaphore value rises above zero), or a signal handler interrupts the call.
2. int sem_trywait(sem_t *sem); sem_trywait() is the same as sem_wait(), except that if the decrement cannot be immediately performed, then the call returns an error (errno set to EAGAIN) instead of blocking.
3. int sem_timedwait(sem_t *sem, const struct timespec *abs_timeout); sem_timedwait() is the same as sem_wait(), except that abs_timeout specifies a limit on the amount of time that the call should block if the decrement cannot be immediately performed. The abs_timeout argument points to a structure that specifies an absolute timeout in seconds and nanoseconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC). This structure is defined as follows:
struct timespec {
    time_t tv_sec;  /* Seconds */
    long   tv_nsec; /* Nanoseconds [0 .. 999999999] */
};
If the timeout has already expired by the time of the call, and the semaphore could not be locked immediately, then sem_timedwait() fails with a timeout error (errno set to ETIMEDOUT). If the operation can be performed immediately, then sem_timedwait() never fails with a timeout error, regardless of the value of abs_timeout. Furthermore, the validity of abs_timeout is not checked in this case.
Return Value: All of these functions return 0 on success; on error, the value of the semaphore is left unchanged, -1 is returned, and errno is set to indicate the error.
III. Releasing the semaphore:
int sem_post(sem_t *sem); sem_post() increments (unlocks) the semaphore pointed to by sem. If the semaphore's value consequently becomes greater than zero, then another process or thread blocked in a sem_wait(3) call will be woken up and proceed to lock the semaphore.
Return value: sem_post() returns 0 on success; on error, the value of the semaphore is left unchanged, -1 is returned, and errno is set to indicate the error.
Pseudo Code:
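#include <semaphore.h>
#include <stdio.h>

int main(void)
{
    sem_t sem;               /* the semaphore object itself, not an uninitialized pointer */
    int pshared = 0;         /* 0: shared between the threads of this process */
    unsigned int value = 1;  /* initial value 1, so the semaphore acts as a lock */

    if (sem_init(&sem, pshared, value) == -1) {  /* initialize the semaphore */
        printf("Error occurred, the semaphore was not initialized\n");
        return 1;
    }
    if (sem_wait(&sem) == -1)  /* decrement the semaphore, i.e. acquire the lock */
        printf("Error occurred, the value of the semaphore was not decremented\n");
    /* critical section */
    if (sem_post(&sem) == -1)  /* increment the semaphore, i.e. release the lock */
        printf("Error occurred, the value of the semaphore was not incremented\n");
    sem_destroy(&sem);
    return 0;
}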
== Libraries: 2. pthread.h ==
The pthread library provides four synchronization mechanisms:
1. Mutexes
A mutual exclusion lock, or mutex for short, is used to protect a shared resource from a race condition. Mutexes prevent multiple threads from operating on the same memory location at the same time, or out of an expected order, since either would lead to data inconsistencies. A locked mutex blocks access to the protected variables by other threads. Mutexes are used in particular to protect a critical region (“a segment of memory”) from other threads. By default, mutexes work only between threads in a single process and do not work between processes as semaphores do.
The following are the functions for managing mutexes:
I. Initialising the mutex:
int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr); pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; The pthread_mutex_init() function shall initialize the mutex referenced by mutex with the attributes specified by attr. If attr is NULL, the default mutex attributes are used; the effect shall be the same as passing the address of a default mutex attributes object. Upon successful initialization, the state of the mutex becomes initialized and unlocked. Attempting to initialize an already initialized mutex results in undefined behavior. In cases where default mutex attributes are appropriate, the macro PTHREAD_MUTEX_INITIALIZER can be used to initialize mutexes that are statically allocated. The effect shall be equivalent to dynamic initialization by a call to pthread_mutex_init() with parameter attr specified as NULL, except that no error checks are performed.
II. Destroying the mutex:
int pthread_mutex_destroy(pthread_mutex_t *mutex); This function is used to clean up a mutex that is no longer needed. It shall destroy the mutex object referenced by mutex; the mutex object becomes, in effect, uninitialized. An implementation may cause pthread_mutex_destroy() to set the object referenced by mutex to an invalid value. A destroyed mutex object can be reinitialized using pthread_mutex_init(); the results of otherwise referencing the object after it has been destroyed are undefined. It shall be safe to destroy an initialized mutex that is unlocked. Attempting to destroy a locked mutex results in undefined behavior.
Only mutex itself may be used for performing synchronization. The result of referring to copies of mutex in calls to pthread_mutex_lock(), pthread_mutex_trylock(), pthread_mutex_unlock(), and pthread_mutex_destroy() is undefined.
Return Value: If successful, the pthread_mutex_destroy() and pthread_mutex_init() functions shall return zero; otherwise, an error number shall be returned to indicate the error. The [EBUSY] and [EINVAL] error checks, if implemented, act as if they were performed immediately at the beginning of processing for the function and shall cause an error return prior to modifying the state of the mutex specified by mutex.
III. Locking the mutex:
int pthread_mutex_lock(pthread_mutex_t *mutex); The mutex object referenced by mutex shall be locked by calling pthread_mutex_lock(). If the mutex is already locked, the calling thread shall block until the mutex becomes available. This operation shall return with the mutex object referenced by mutex in the locked state with the calling thread as its owner.
If the mutex type is PTHREAD_MUTEX_NORMAL, deadlock detection shall not be provided. Attempting to relock the mutex causes deadlock. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results.
If the mutex type is PTHREAD_MUTEX_ERRORCHECK, then error checking shall be provided. If a thread attempts to relock a mutex that it has already locked, an error shall be returned. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be returned.
If the mutex type is PTHREAD_MUTEX_RECURSIVE, then the mutex shall maintain the concept of a lock count. When a thread successfully acquires a mutex for the first time, the lock count shall be set to one. Every time a thread relocks this mutex, the lock count shall be incremented by one. Each time the thread unlocks the mutex, the lock count shall be decremented by one. When the lock count reaches zero, the mutex shall become available for other threads to acquire. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be returned.
If the mutex type is PTHREAD_MUTEX_DEFAULT, attempting to recursively lock the mutex results in undefined behavior. Attempting to unlock the mutex if it was not locked by the calling thread results in undefined behavior. Attempting to unlock the mutex if it is not locked results in undefined behavior.
The pthread_mutex_trylock() function shall be equivalent to pthread_mutex_lock(), except that if the mutex object referenced by mutex is currently locked (by any thread, including the current thread), the call shall return immediately. If the mutex type is PTHREAD_MUTEX_RECURSIVE and the mutex is currently owned by the calling thread, the mutex lock count shall be incremented by one and the pthread_mutex_trylock() function shall immediately return success. (In the case of PTHREAD_MUTEX_RECURSIVE mutexes, the mutex shall become available when the count reaches zero and the calling thread no longer has any locks on this mutex.)
If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread shall resume waiting for the mutex as if it was not interrupted.
IV. Unlocking the mutex:
int pthread_mutex_unlock(pthread_mutex_t *mutex); Releases a mutex that the calling thread previously locked. The pthread_mutex_unlock() function shall release the mutex object referenced by mutex. The manner in which a mutex is released is dependent upon the mutex's type attribute. If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex. For error-checking and recursive mutexes, an error is returned if the mutex is already unlocked or is owned by another thread.
Return Value: If successful, the pthread_mutex_lock() and pthread_mutex_unlock() functions shall return zero; otherwise, an error number shall be returned to indicate the error. The pthread_mutex_trylock() function shall return zero if a lock on the mutex object referenced by mutex is acquired; otherwise, an error number is returned to indicate the error.
Pseudo code:
#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_mutex_t mutex;  /* the mutex object itself, not an uninitialized pointer */

    if (pthread_mutex_init(&mutex, NULL) != 0) {  /* NULL: default attributes */
        printf("Error occurred mutex was not created\n");
        return 1;
    }
    if (pthread_mutex_lock(&mutex) != 0)
        printf("Error occurred mutex was not locked\n");
    /* critical section */
    if (pthread_mutex_unlock(&mutex) != 0)
        printf("Error occurred mutex was not unlocked\n");
    if (pthread_mutex_destroy(&mutex) != 0)
        printf("Error occurred mutex was not destroyed\n");
    return 0;
}
2. Joins:
A join is performed when one wants to wait for a thread to finish. A calling thread may launch multiple threads and then wait for them to finish in order to collect their results. One waits for the completion of the threads with a join. The pthread_join() function shall suspend execution of the calling thread until the target thread terminates, unless the target thread has already terminated. On return from a successful pthread_join() call with a non-NULL value_ptr argument, the value passed to pthread_exit() by the terminating thread shall be made available in the location referenced by value_ptr. When a pthread_join() returns successfully, the target thread has been terminated. The results of multiple simultaneous calls to pthread_join() specifying the same target thread are undefined. If the thread calling pthread_join() is canceled, then the target thread shall not be detached.
3. Condition Variables:
Condition variables are synchronization objects that allow threads to wait for certain events (conditions) to occur. Condition variables are slightly more complex than mutexes, and their correct use requires the threads to cooperatively follow a specific protocol in order to ensure safe and consistent serialization. The protocol for using condition variables includes a mutex, a boolean predicate (true/false expression) and the condition variable itself. The threads that are cooperating using condition variables can wait for a condition to occur, or can wake up other threads that are waiting for a condition.
4. Barriers:
A barrier is a type of synchronization method. A barrier for a group of threads or processes in the source code means any thread/process must stop at this point and cannot proceed until all other threads/processes reach this barrier.
I. Waiting at the Barrier:
int pthread_barrier_wait(pthread_barrier_t *barrier); The pthread_barrier_wait() function shall synchronize participating threads at the barrier referenced by barrier. The calling thread shall block until the required number of threads have called pthread_barrier_wait() specifying the barrier.
When the required number of threads have called pthread_barrier_wait() specifying the barrier, the constant PTHREAD_BARRIER_SERIAL_THREAD shall be returned to one unspecified thread and zero shall be returned to each of the remaining threads. At this point, the barrier shall be reset to the state it had as a result of the most recent pthread_barrier_init() function that referenced it.
The constant PTHREAD_BARRIER_SERIAL_THREAD is defined in <pthread.h> and its value shall be distinct from any other value returned by pthread_barrier_wait(). The results are undefined if this function is called with an uninitialized barrier.
If a signal is delivered to a thread blocked on a barrier, upon return from the signal handler the thread shall resume waiting at the barrier if the barrier wait has not completed (that is, if the required number of threads have not arrived at the barrier during the execution of the signal handler); otherwise, the thread shall continue as normal from the completed barrier wait. Until the thread in the signal handler returns from it, it is unspecified whether other threads may proceed past the barrier once they have all reached it.
A thread that has blocked on a barrier shall not prevent any unblocked thread that is eligible to use the same processing resources from eventually making forward progress in its execution. Eligibility for processing resources shall be determined by the scheduling policy.
Return Value: Upon successful completion, the pthread_barrier_wait() function shall return PTHREAD_BARRIER_SERIAL_THREAD for a single (arbitrary) thread synchronized at the barrier and zero for each of the other threads. Otherwise, an error number shall be returned to indicate the error.
II. Destroying the Barrier:
int pthread_barrier_destroy(pthread_barrier_t *barrier); The pthread_barrier_destroy() function shall destroy the barrier referenced by barrier and release any resources used by the barrier. The effect of subsequent use of the barrier is undefined until the barrier is reinitialized by another call to pthread_barrier_init(). An implementation may use this function to set barrier to an invalid value. The results are undefined if pthread_barrier_destroy() is called when any thread is blocked on the barrier, or if this function is called with an uninitialized barrier.
III. Initialising the Barrier:
int pthread_barrier_init(pthread_barrier_t *restrict barrier, const pthread_barrierattr_t *restrict attr, unsigned count); The pthread_barrier_init() function shall allocate any resources required to use the barrier referenced by barrier and shall initialize the barrier with attributes referenced by attr. If attr is NULL, the default barrier attributes shall be used; the effect is the same as passing the address of a default barrier attributes object. The results are undefined if pthread_barrier_init() is called when any thread is blocked on the barrier (that is, has not returned from the pthread_barrier_wait() call). The results are undefined if a barrier is used without first being initialized. The results are undefined if pthread_barrier_init() is called specifying an already initialized barrier.
The count argument specifies the number of threads that must call pthread_barrier_wait() before any of them successfully return from the call. The value specified by count must be greater than zero.
If the pthread_barrier_init() function fails, the barrier shall not be initialized and the contents of barrier are undefined.
Only the object referenced by barrier may be used for performing synchronization. The result of referring to copies of that object in calls to pthread_barrier_destroy() or pthread_barrier_wait() is undefined.
Return Value: Upon successful completion, these functions shall return zero; otherwise, an error number shall be returned to indicate the error.
Pseudo code:
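#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_barrier_t barrier;  /* the barrier object itself, not an uninitialized pointer */
    unsigned int count = 1;     /* threads that must arrive; 1 so this single thread does not block */

    if (pthread_barrier_init(&barrier, NULL, count) != 0) {  /* initialize the barrier */
        printf("Error occurred barrier was not initialized\n");
        return 1;
    }
    int b = pthread_barrier_wait(&barrier);  /* synchronize participating threads */
    if (b != 0 && b != PTHREAD_BARRIER_SERIAL_THREAD)  /* both values mean success */
        printf("Error occurred in synchronizing threads\n");
    /* code that must run only after every thread has arrived */
    if (pthread_barrier_destroy(&barrier) != 0)  /* destroy the barrier */
        printf("Error occurred barrier was not destroyed\n");
    return 0;
}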
References:
[1] Linux Manual pages
[2] http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=%2Fapis%2Fusers_68.htm