CSC/ECE 506 Spring 2011/ch5 LL
Supplement to Chapter 5: Parallel Programming For Linked Data Structures

=Introduction=

Chapter 5 of [[#References | Solihin (2008)]] discusses parallel algorithms for operating on linked data structures. The focus is primarily on thread-safe access based on locks, whose semantics imply that only a single thread can access the protected data structure at any given time. The chapter discusses optimizations such as finer-grained locks and read-write locks that enable more parallelism. This supplement discusses non-blocking algorithms for accessing linked data structures and the issues that arise with them.
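As a baseline for what follows, here is a minimal sketch of the coarse-grained locking approach the chapter starts from; the list layout and function name are illustrative, not taken from Solihin. A single pthread mutex protects the entire list, so every operation serializes on that one lock:

<pre>
/* Illustrative coarse-grained locked list (not from Solihin). */
#include <pthread.h>
#include <stdlib.h>

struct node {
    int key;
    struct node *next;
};

struct list {
    struct node *head;
    pthread_mutex_t lock;   /* one coarse-grained lock for the whole list */
};

/* Insert at the head while holding the list-wide lock.  Only one thread can
   be inside the critical section at a time; all others block on the mutex. */
void insert_head(struct list *l, int key) {
    struct node *n = malloc(sizeof *n);
    n->key = key;
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    pthread_mutex_unlock(&l->lock);
}
</pre>

Swapping the single mutex for per-node locks or a pthread_rwlock_t yields the finer-grained and read-write variants mentioned above.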

=Non-blocking algorithms=

==Motivation==

Parallel algorithms enable threads (or processes) to access shared data structures concurrently in a thread-safe manner. Broadly, such algorithms fall into two categories: blocking and non-blocking. All lock-based algorithms are blocking, since only one thread can be inside the critical section at a time while other threads trying to get in block on the lock. All of the methods outlined in Chapter 5 of [[#References | Solihin (2008)]] are lock-based and hence blocking in nature. Lock-based algorithms have the following drawbacks:

* Blocking limits scalability, especially when the granularity of the lock is too coarse.
* Locks can result in deadlocks and livelocks [].
* Locks can cause priority inversion: a high-priority thread/process cannot proceed if a low-priority thread/process is holding the common lock.
* Locks can result in convoying: all other threads have to wait if the thread holding the lock is descheduled due to a time-slice interrupt or a page fault (see lock convoy).
* Locks can cause other issues such as the thundering herd problem [].

Non-blocking algorithms permit truly concurrent, thread-safe access to shared data structures: threads do not block on access, although a thread may have to retry an operation to achieve its objective. In the modern computing era, multi-core architectures [] are ubiquitous. On these architectures, non-blocking algorithms can put the available cores to productive use, whereas with the blocking approach the extra cores largely sit idle waiting on locks. Non-blocking algorithms therefore offer significant throughput benefits on current hardware.
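As a sketch of the retry pattern (illustrative code, not from the chapter), consider pushing a node onto a shared lock-free stack using C11 atomics: a failed compare-and-swap means another thread updated the head first, so the thread simply re-links its node and retries instead of blocking.

<pre>
/* Illustrative lock-free push using C11 atomics. */
#include <stdatomic.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Shared head of the stack. */
_Atomic(struct node *) top = NULL;

/* Lock-free push: no thread ever blocks.  If the CAS fails, 'old' is
   refreshed with the current head and the loop retries. */
void push(int value) {
    struct node *n = malloc(sizeof *n);
    n->value = value;
    struct node *old = atomic_load(&top);
    do {
        n->next = old;                /* link to the head we observed */
    } while (!atomic_compare_exchange_weak(&top, &old, n));
}
</pre>

Pushing is the easy half; a pop implemented with the same pattern runs into the ABA problem discussed later in this article.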

==Categories==

Non-blocking algorithms fall into three main categories. The taxonomy is driven primarily by the level of progress guarantee offered to an individual thread or to the system as a whole.

* '''Wait-freedom''' is the strongest non-blocking guarantee of progress, combining guaranteed system-wide throughput with starvation-freedom: every thread completes its operation in a finite number of its own steps.
* '''Lock-freedom''' allows individual threads to starve but guarantees system-wide throughput. An algorithm is lock-free if, when its threads are run sufficiently long, at least one of them makes progress (for some sensible definition of progress). All wait-free algorithms are lock-free.
* '''Obstruction-freedom''' is possibly the weakest natural non-blocking progress guarantee. An algorithm is obstruction-free if, at any point, a single thread executed in isolation (i.e., with all obstructing threads suspended) for a bounded number of steps will complete its operation. All lock-free algorithms are obstruction-free.
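To make the first two guarantees concrete, here is an illustrative sketch (not from the chapter) of two ways to increment a shared counter with C11 atomics. The fetch-and-add version is wait-free: every call finishes in a bounded number of steps. The CAS-loop version is only lock-free: some thread's CAS always succeeds, so the system makes progress, but a particular thread can lose the race repeatedly and starve.

<pre>
/* Illustrative contrast between wait-free and lock-free progress. */
#include <stdatomic.h>

atomic_int counter = 0;

/* Wait-free: atomic_fetch_add completes in a bounded number of steps,
   regardless of what other threads are doing. */
void increment_wait_free(void) {
    atomic_fetch_add(&counter, 1);
}

/* Lock-free but not wait-free: if the CAS fails, 'old' is refreshed and the
   thread retries; an unlucky thread may retry indefinitely. */
void increment_lock_free(void) {
    int old = atomic_load(&counter);
    while (!atomic_compare_exchange_weak(&counter, &old, old + 1)) {
        /* another thread changed the counter first; try again */
    }
}
</pre>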

==Building blocks==

=Lock-free linked list implementation=

==Pop-only case==

==Push-pop case, a.k.a. the ABA problem==

==Tag-based solution==

==DCAS-based solutions==

=Memory model and barrier instructions=

==The X86 memory model==

==Barrier instructions==

==Linked list example using barrier instructions==

=Skip lists=

=References=
* Solihin, Yan. ''Fundamentals of Parallel Computer Architecture: Multichip and Multicore Systems'', 2008.


=Appendix=

<pre>
/* Sequential code, from Solihin (2008), p. 25, completed here into a runnable
   C program.  The initial values of b and c are illustrative only. */
#include <stdio.h>

int main(void) {
    int i, sum, a[8];
    int b[8] = {1, -2, 3, -4, 5, -6, 7, -8};   /* illustrative inputs */
    int c[8] = {8, 7, 6, 5, 4, 3, 2, 1};       /* illustrative inputs */

    for (i = 0; i < 8; i++)
        a[i] = b[i] + c[i];
    sum = 0;
    for (i = 0; i < 8; i++)
        if (a[i] > 0)
            sum = sum + a[i];
    printf("sum = %d\n", sum);   /* "Print sum" in the original pseudocode */
    return 0;
}
</pre>