CSC/ECE 517 Fall 2013/ch1 1w13 aa: Difference between revisions
Line 102: | Line 102: | ||
=Threads and Exceptions= | =Threads and Exceptions= | ||
In Ruby, exceptions are handled through the 'rescue' clause which is analogous to try and catch blocks in java. The code that might throw an exception is placed within 'begin' and 'end' statements and a handler (rescue clause) is written inside it. If an exception is raised, control is transferred to the rescue clause. | In Ruby, [http://en.wikipedia.org/wiki/Exception_handling exceptions] are handled through the 'rescue' clause which is analogous to try and catch blocks in java. The code that might throw an exception is placed within 'begin' and 'end' statements and a handler (rescue clause) is written inside it. If an exception is raised, control is transferred to the rescue clause. | ||
<pre> | <pre> |
Revision as of 06:31, 18 September 2013
Introduction
Traditional programs have a single thread of execution: the statements or instructions that comprise the program are executed sequentially until the program terminates. A multithreaded program has more than one thread of execution. Ruby threads are in-process and are implemented by the Ruby interpreter. As interpreted code is independent of the operating system, Ruby threads are completely portable. On the other hand, there are some disadvantages when compared to native threads, and threaded code remains prone to problems such as deadlock and starvation. If a thread makes a call to the operating system that takes a long time to complete, all threads will hang until the interpreter gets control back. As the threads run within a single process, they cannot take advantage of multiple processors to achieve parallelism. Despite these disadvantages, Ruby threads are an efficient and lightweight way of achieving concurrency in code.
Ruby Thread Model
Different implementations of the Ruby language have different thread models. The two main models are the MRI model and the YARV model.
MRI
Ruby used green (user-level) threads with MRI (Matz’s Ruby Interpreter). In this model, an application uses only one kernel-level thread. Due to the nature of this mapping, the kernel does not know whether the application is multithreaded. Its advantages are:
- Provides threading in environments which do not support threading as threading is handled by the application itself.
- Reduces the cost of context-switching between the threads.
The major drawback of this approach is that it cannot benefit from multi-core processors, i.e. the threads cannot run in parallel. Threads can block each other during I/O operations as they share the same execution context. However, any number of POSIX threads can run in parallel with the Ruby threads, and therefore external libraries or MRI C extensions that create threads of their own can be used to achieve parallelism. The MRI thread model was used prior to Ruby 1.9.x and has since been replaced by the YARV model.
YARV
YARV is short for Yet Another Ruby VM. It was introduced with the aim of making the Ruby interpreter faster. It implements threads as POSIX or Windows NT threads. However, it uses the Global Interpreter Lock (GIL) to ensure that only one thread uses the Ruby interpreter at a time. This lock applies only to Ruby code and not to other operations such as I/O. Though the GIL is a bottleneck to achieving true parallelism, it has some advantages, such as:
- It is harder to corrupt data
- Avoids race conditions with certain C extensions
- Most wrapper C libraries are not thread safe and therefore GIL can help avoid synchronization problems
- Some parts of the Ruby implementation are also not thread safe (e.g. Hash)<ref>http://merbist.com/2011/10/03/about-concurrency-and-the-gil</ref>
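The statement that the lock does not cover blocking operations can be observed with a small timing experiment. The following is only a sketch: `sleep` is used here as a stand-in for a blocking I/O call (an assumption made for illustration), and the exact timings depend on the machine and the Ruby implementation.

```ruby
# Three threads each block for 0.2 seconds. If blocking calls held the
# interpreter lock, the total would be about 0.6 seconds; because the lock
# is released while a thread is blocked, the waits overlap.
start = Time.now

threads = 3.times.map do
  Thread.new { sleep 0.2 }   # sleep stands in for a blocking I/O call
end
threads.each(&:join)

elapsed = Time.now - start
puts "elapsed: #{elapsed.round(2)}s"
```

The elapsed time is expected to be close to that of a single wait rather than the sum of all three.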
Creating Ruby Threads
The Thread library of Ruby allows concurrent programming. It provides multiple threads of execution in a single process that share the same memory space and execute concurrently. The Thread class represents user-level threads. To start a new thread, the Thread.new call is used and a block is passed to it. This block of code runs in the thread.
The following code illustrates multithreading in Ruby:

    def function_1
      count = 0
      while count <= 2
        puts "Inside function_1"
        count += 1
      end
    end

    def function_2
      count = 0
      while count <= 2
        puts "Inside function_2"
        count += 1
      end
    end

    thread1 = Thread.new { function_1() }
    thread2 = Thread.new { function_2() }
    thread1.join
    thread2.join

Two threads are created, and each is given a block that calls one of the functions: thread1 executes function_1 and thread2 executes function_2. The order in which the lines are printed depends on the scheduler. One possible interleaved output is:

    Inside function_1
    Inside function_1
    Inside function_2
    Inside function_2
    Inside function_1
    Inside function_2
Threads share the global, instance and local variables that are in existence at the time the thread starts. Local variables created within a thread’s block are truly local to that thread. Each thread will have its own copy of these variables. Any number of arguments can be passed to the thread as parameters to Thread.new.
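Passing arguments through Thread.new can be sketched as follows. The values and the "worker" label are hypothetical and chosen only for illustration; each argument becomes a block parameter that is private to its thread.

```ruby
# Each thread receives its own copies of the arguments via block parameters,
# so later changes to the loop variable cannot affect a running thread.
threads = (1..3).map do |n|
  Thread.new(n, "worker") do |id, label|
    squared = id * id             # local to this thread's block
    "#{label}-#{id}: #{squared}"  # the block's last expression is the thread's value
  end
end

# Thread#value waits for each thread and returns the result of its block.
results = threads.map(&:value)
puts results
```

Collecting the results with `value` avoids writing to a shared array from several threads.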
Thread life-cycle
A thread need not be started explicitly once it is created. It automatically begins running when CPU resources become available. There are a number of methods in the Thread class to manipulate the thread while it is running.
The ‘value’ method of the Thread object returns the value of the last expression evaluated in the thread's block if the thread has run to completion. Otherwise, the method blocks and does not return until the thread has completed. The method Thread.current returns the Thread object that represents the current thread. The method Thread.main returns the Thread object that represents the main thread, which is the initial thread of execution when the program started. The instance methods status and alive? give the status of a particular thread. A Ruby program can terminate even if its child threads are still executing. To avoid this, the 'join' method can be used. Join blocks the calling thread until the thread that it is called on runs to completion. Alternatively, a timeout can be passed as an argument to join to limit the amount of time the calling thread is blocked.
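The life-cycle methods described above can be tried out with a short sketch. The sleep duration and the computed value are arbitrary choices made for illustration.

```ruby
worker = Thread.new do
  sleep 0.2      # simulate some work
  21 * 2         # last expression: becomes the thread's value
end

puts Thread.current == Thread.main   # true when run from the main thread
puts worker.alive?                   # the worker has not finished yet

early = worker.join(0.01)            # join with a timeout returns nil if time runs out
puts early.inspect

answer = worker.value                # blocks until the worker completes
puts answer

puts worker.alive?                   # the worker has finished
puts worker.status.inspect           # status is false after normal termination
```

Note that `join` with a timeout does not stop the thread; it only stops waiting for it.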
Thread Variables
Variables defined in the block that is passed to a thread are local to the thread and cannot be accessed by other threads. However, in Ruby it is possible to define variables in a block that are accessible to other threads. This is done by treating the thread object as if it were a Hash and writing to elements using []= and reading them using [].
The following code illustrates this point.
    count = 0
    threads = []
    10.times do |i|
      threads[i] = Thread.new do
        sleep(rand(0.1))
        Thread.current["mycount"] = count
        count += 1
      end
    end
    threads.each {|t| t.join; print t["mycount"], ", " }
    puts "count = #{count}"

produces:

    4, 1, 0, 8, 7, 9, 5, 6, 3, 2, count = 10
Threads and Exceptions
In Ruby, exceptions are handled through the 'rescue' clause, which is analogous to try and catch blocks in Java. The code that might throw an exception is placed within 'begin' and 'end' statements and a handler (rescue clause) is written inside it. If an exception is raised, control is transferred to the rescue clause.

    def excp_test
      begin
        puts 'Before raise.'
        raise 'An error.'
        puts 'after raise.'   # never reached
      rescue
        puts 'being rescued.'
      end
      puts 'after begin block.'
    end

    excp_test

In the above code snippet, an exception is raised after the first puts statement, which transfers control to the rescue clause. The second puts statement is skipped, and the message 'An error.' is not printed because the rescue clause does not display it. Execution then continues with the line following the begin block. The above code produces the following output.

    Before raise.
    being rescued.
    after begin block.
Two flags determine the course of action for unhandled exceptions: abort_on_exception and debug. If the abort_on_exception flag is false (the default) and the debug flag is not set, an unhandled exception terminates only the current thread and the rest continue to run. However, if abort_on_exception is set to true, every thread is killed and the entire process terminates.
In the following example abort_on_exception flag is set to false.
    threads = []
    5.times do |count|
      threads << Thread.new(count) do |i|
        raise "Boom!" if i == 2
        print "#{i}\n"
      end
    end
    threads.each {|t| t.join }
produces the following output:
    C:/Users/amoolp/RubymineProjects/untitled/nnn:4:in `block (2 levels) in <top (required)>': Boom! (RuntimeError)
    0
    1
    3
    4
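When abort_on_exception is false, an exception raised inside a thread is raised again in the thread that calls join or value on it, so it can be handled there with a rescue clause. The following is a minimal sketch of that pattern; the `finished` and `failed` arrays are hypothetical names used only for illustration.

```ruby
Thread.abort_on_exception = false   # the default setting

threads = 5.times.map do |count|
  Thread.new(count) do |i|
    raise "Boom!" if i == 2
    i
  end
end

finished = []
failed = []
threads.each_with_index do |t, index|
  begin
    finished << t.value             # re-raises the thread's exception, if any
  rescue RuntimeError => e
    failed << "thread #{index}: #{e.message}"
  end
end

puts "finished: #{finished.inspect}"
puts "failed: #{failed.inspect}"
```

With this arrangement the failure of one thread is recorded without stopping the main thread from collecting the results of the others.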
Mutual Exclusion
A critical section is a code segment in which shared data is accessed. The problem is to ensure that when one thread is executing in its critical section, no other thread is allowed to execute in its critical section. Mutual exclusion is a way of making sure that no two threads are in their critical sections at the same time.
In Ruby, thread synchronization is achieved at the lowest level by using a global thread-critical condition. When the condition is set to true, the scheduler will not schedule any other threads to run, so only one thread has access to its critical section. Although this method provides basic synchronization, it is not recommended, as it is tricky to use correctly and requires a good amount of expertise on the part of the programmer. Ruby comes with several libraries that provide synchronization and can be readily included in our code. These include the Monitor library, the Sync library and the Mutex_m library.
Monitors
A monitor is an object that provides synchronization in a multithreaded environment. It is a high-level abstraction of three features: shared data, operations performed on the shared data, and synchronization. The following code illustrates the need for synchronization:

    class Increment
      attr_reader :count

      def initialize
        @count = 0
      end

      # Not atomic: the read, the addition and the write can interleave.
      def inc
        @count += 1
      end
    end

    r = Increment.new
    thread1 = Thread.new { 2000.times { r.inc } }
    thread2 = Thread.new { 2000.times { r.inc } }
    thread1.join
    thread2.join
    puts r.count   # may print less than 4000, e.g. 3614

In the above code snippet, each thread increments the counter 2000 times, so the value of r.count should be 4000. The result would be correct if thread2 waited until thread1 finished and only then started running. In reality, thread2 can pre-empt thread1 before it completes and access the count variable, thereby affecting the final value, which is 3614 in this case. This is because the statement @count += 1 is not atomic. It is divided into three steps: fetching the value of count, computing the new value and writing the new value back. Any context switch that happens before a thread finishes executing the three steps can result in an erroneous value. To synchronize this interleaved execution, Ruby provides the Monitor library, which can be included in your code. The following code reworks the above example using Monitors.

    require 'monitor'

    class Increment < Monitor
      attr_reader :count

      def initialize
        super()        # sets up the monitor's internal state
        @count = 0
      end

      def inc
        synchronize do
          @count += 1
        end
      end
    end

    r = Increment.new
    thread1 = Thread.new { 2000.times { r.inc } }
    thread2 = Thread.new { 2000.times { r.inc } }
    thread1.join
    thread2.join
    puts r.count   # outputs 4000
In this example, we make Increment a monitor by subclassing the Monitor class and therefore, it gains access to its ‘synchronize’ method. The code within the synchronize block can be executed only by a single thread for a particular monitor object. This ensures that no two threads operate on intermediate results at the same time and that the value of count is as expected.
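For comparison, the same guarantee can be obtained with Ruby's Mutex class instead of a monitor. This is a different technique from the Monitor approach described above, shown only as a sketch; the class name `Counter` is hypothetical.

```ruby
class Counter
  attr_reader :count

  def initialize
    @count = 0
    @lock = Mutex.new      # one lock per counter object
  end

  def inc
    # Only one thread at a time may run the block for a given lock.
    @lock.synchronize { @count += 1 }
  end
end

counter = Counter.new
workers = 2.times.map { Thread.new { 2000.times { counter.inc } } }
workers.each(&:join)
puts counter.count
```

Unlike a monitor, a Mutex is not re-entrant: a thread that already holds the lock cannot acquire it again, so nested synchronize calls on the same Mutex raise an error.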
Queues
The Queue class of the Ruby thread library is another technique for synchronization. It implements a thread safe queuing mechanism which allows threads to add and remove objects from a queue. This addition and removal is guaranteed to be atomic. Queues are used in producer-consumer problems.
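A producer-consumer arrangement built on Queue might look like the following sketch. The `:done` marker used to tell the consumer to stop is a convention assumed for this example, not something the Queue class requires.

```ruby
require 'thread'   # Queue lived in this library in older Ruby versions

queue = Queue.new
consumed = []

producer = Thread.new do
  5.times { |i| queue << i }   # push is thread safe
  queue << :done               # hypothetical end-of-stream marker
end

consumer = Thread.new do
  loop do
    item = queue.pop           # blocks while the queue is empty
    break if item == :done
    consumed << item * 10
  end
end

[producer, consumer].each(&:join)
puts consumed.inspect
```

Because `pop` blocks while the queue is empty, the consumer needs no explicit locking or polling.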
Performance of the Ruby Thread model
Threading is Ruby's mechanism for implementing concurrency, as in other programming languages such as Java, Python and C. Prior to the 1.9.x branch, Ruby used green threads, which are user-level threads that are independent of the operating system and are scheduled in user space. In this model an application uses only one kernel thread, which means that the kernel is unaware of whether the application is multithreaded. The advantages of this approach are:
- Threading is available even in environments that do not support it, as it is provided by the application itself.
- The cost of context switching is reduced.
Since version 1.9, native threads are used. However, Ruby threading is limited by the Global Interpreter Lock (GIL), a locking mechanism that preserves data integrity by allowing at most one thread to execute Ruby code at any instant. To overcome this limitation, an alternative implementation such as JRuby, Rubinius (hydra branch) or MacRuby can be used.
The performance of Ruby’s thread model differs from other languages in various perspectives as stated below.
Ruby Vs Python
Both Python and Ruby support multithreading, and their thread models have similarities as well as differences. Python has the Global Interpreter Lock (GIL), which limits the potential use of threads, and Ruby has a comparable Global VM Lock (GVL). Python's threading model is similar to that of Ruby 1.9 and is not designed to take advantage of multiple cores. This means that two separate processes have to be run to use both cores of a dual-core CPU. In Ruby, on the other hand, alternative implementations such as JRuby allow the user to scale code across multiple cores.
Ruby Vs PHP
Unlike Ruby, PHP starts a new process for every request instead of using threads. Since Ruby has threads like many other programming languages, the usual concerns of multithreaded programming apply: locks, deadlocks and code whose behavior is hidden behind threads. Multithreading in Ruby eases multitasking by providing concurrency and control over it. PHP is designed for the web and for processes with a short life span, and it does not use threads on the expectation that this makes execution faster.
Ruby Vs Java
The JRuby implementation of Ruby takes an approach to multithreading that is very similar to Java's. When JRuby is used, Ruby threads become the same as Java threads, and hence the performance of the two languages does not differ to a great extent. One advantage of the Java threaded approach is the memory saved by sharing memory between threads. Start-up time is also saved, and the shared memory is used for inter-thread communication.
Ruby Vs Other languages
Ruby can also be compared to other programming languages such as Scala and Erlang. The approach these languages use for concurrency is the actor model. In this model, actors are like threads that do not share the same memory context. Messages are the primary method of communication between actors, which ensures that each actor handles its own state. Because an actor processes only one message at a time, the data corruption that occurs when two threads modify the same data simultaneously is avoided. Although this model is simpler to reason about, a poorly designed implementation can still cause problems. In any case, the thread model of Ruby can be compared to these other models on factors such as performance and running time.
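The idea behind the actor model can be approximated in Ruby with a thread that owns its state and a Queue that serves as its mailbox. The sketch below is not how Scala or Erlang implement actors; the class `CounterActor` and its methods are hypothetical names introduced only to illustrate the principle.

```ruby
require 'thread'

class CounterActor
  def initialize
    @mailbox = Queue.new            # messages arrive here
    @total = 0                      # state touched only by the actor's thread
    @thread = Thread.new { run }
  end

  # Other threads communicate with the actor only by sending messages.
  def send_message(msg)
    @mailbox << msg
  end

  # Ask the actor to stop and return its final state.
  def stop_and_total
    @mailbox << :stop
    @thread.value
  end

  private

  # Messages are processed one at a time, so no lock is needed for @total.
  def run
    loop do
      msg = @mailbox.pop
      break if msg == :stop
      @total += msg
    end
    @total
  end
end

actor = CounterActor.new
senders = 4.times.map { Thread.new { 100.times { actor.send_message(1) } } }
senders.each(&:join)
total = actor.stop_and_total
puts total
```

Four threads send messages concurrently, yet the total is consistent because only the actor's own thread ever modifies it.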
Conclusion
Ruby is one of the