CSC/ECE 517 Fall 2013/ch1 1w13 aa

Introduction

Traditional programs have a single thread of execution: the statements or instructions that comprise the program are executed sequentially until the program terminates. A multithreaded program has more than one thread of execution. Ruby threads are in-process and are implemented by the Ruby interpreter. As interpreted code is independent of the operating system, Ruby threads are completely portable. On the other hand, they have some disadvantages compared to native threads, and they are subject to problems such as deadlock and starvation. If a thread makes a call to the operating system that takes a long time to complete, all threads hang until the interpreter regains control. As the threads run within a single process, they cannot take advantage of multiple processors to achieve parallelism. Despite these disadvantages, Ruby threads are an efficient and lightweight way of achieving concurrency in code.


Ruby Thread Model

Different implementations of the Ruby language have different thread models. The two main models are the MRI model and the YARV model.

MRI

<ref>Ruby MRI: http://en.wikipedia.org/wiki/Ruby_MRI</ref>Ruby used green (user-level) threads with MRI (Matz’s Ruby Interpreter). In this model, an application uses only one kernel-level thread. Because of this mapping, the kernel does not know whether the application is multithreaded. The advantages of this model are:

  • Provides threading in environments which do not support threading as threading is handled by the application itself.
  • Reduces the cost of context-switching between the threads.

The major drawback of this approach is that it cannot benefit from multi-core processors, i.e. the threads cannot run in parallel. Threads can block each other during I/O operations as they share the same execution context. However, any number of POSIX threads can run in parallel with the Ruby threads, so external libraries or MRI C extensions that create threads of their own can be used to achieve parallelism. The MRI thread model was used prior to Ruby 1.9.x and has since been replaced by the YARV model.

YARV

<ref>About YARV: http://en.wikipedia.org/wiki/YARV</ref>YARV is short for Yet Another Ruby VM. It was introduced with the aim of making the Ruby interpreter faster. It implements threads as POSIX or Windows NT threads. However, it uses the Global Interpreter Lock (GIL)<ref>Global Interpreter Lock: http://en.wikipedia.org/wiki/Global_Interpreter_Loc</ref> to ensure that only one thread uses the Ruby interpreter at any given time. This lock applies only to Ruby code and not to other operations such as I/O. Although the GIL is a bottleneck to achieving true parallelism, it also has some advantages.

Creating Ruby Threads

The Thread library of Ruby allows concurrent programming. It provides multiple threads of execution in a single process that share the same memory space and execute concurrently. The Thread class represents user-level threads. To start a new thread, Thread.new is called with a block. The code in this block runs in the new thread.

The following code illustrates multithreading in Ruby <ref>Tutorial on Ruby Multithreading: http://www.tutorialspoint.com/ruby/ruby_multithreading.htm</ref>

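def function_1
  count = 0
  while count <= 2
    puts "Inside function_1"
    count += 1          # advance the counter so the loop ends after three passes
  end
end

def function_2
  count = 0
  while count <= 2
    puts "Inside function_2"
    count += 1
  end
end

# Each block runs in its own thread.
thread1 = Thread.new { function_1 }
thread2 = Thread.new { function_2 }

# Wait for both threads before the program exits.
thread1.join
thread2.join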

Two threads are created, and each is given a block that calls one of the functions: thread1 executes function_1 and thread2 executes function_2. The output of the two threads is interleaved, and the exact order can vary from run to run. One possible output is shown below.

Inside function_1
Inside function_1
Inside function_2
Inside function_2
Inside function_1
Inside function_2

Threads share the global, instance and local variables that are in existence at the time the thread starts. Local variables created within a thread’s block are truly local to that thread. Each thread will have its own copy of these variables. Any number of arguments can be passed to the thread as parameters to Thread.new.
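The sharing rules above can be sketched as follows. This is a minimal illustration, not code from the cited tutorial; the variable names greeting, square and results are made up for the example.

```ruby
# Exists before the threads start, so every thread can see it.
greeting = "hello"

threads = [1, 2, 3].map do |n|
  # Arguments given to Thread.new arrive as the block's parameters.
  Thread.new(n) do |id|
    square = id * id            # created inside the block: local to this thread
    "#{greeting} from thread #{id}: #{square}"
  end
end

# Thread#value waits for each thread and returns its block's last expression.
results = threads.map(&:value)
puts results
```

Passing a value through Thread.new hands the thread its own block parameter, which is useful when the outer variable may be reassigned before the thread gets to run.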

Thread life-cycle

A thread does not need to be started explicitly after it is created. It begins running automatically when CPU resources become available. The Thread class has a number of methods for inspecting and manipulating a thread while it is running.

  • Thread#value

The ‘value’ method of a Thread object returns the value of the last expression evaluated in the thread’s block, provided the thread has run to completion. Otherwise, the method blocks and does not return until the thread has completed.

  • Thread.current

The method Thread.current returns the Thread object that represents the current thread.

  • Thread.main

The method Thread.main returns the Thread object that represents the main thread, which is the initial thread of execution when the program starts.

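  • Thread#status

The ‘status’ and ‘alive?’ methods of a Thread object report the state of that thread. ‘status’ returns a string such as "run" or "sleep" for a live thread, false for a thread that terminated normally and nil for a thread that terminated with an exception. ‘alive?’ returns true while the thread is running or sleeping.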

  • Thread#join

A Ruby program can terminate even if its child threads are still executing. To avoid this, the 'join' method can be used. Join blocks the calling thread until the thread that it is called on runs to completion. Alternatively, a timeout can be passed as an argument to specify the maximum amount of time to block the calling thread. <ref>Thread Library http://ruby.activeventure.com/manual/man-1.4/thread.html</ref>
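The methods listed above can be tried together in a short sketch. The sleep durations and variable names are arbitrary choices for illustration, and the status observed while the worker is running depends on timing.

```ruby
worker = Thread.new do
  sleep 0.5          # simulate some work
  :done              # the last expression becomes the thread's value
end

sleep 0.1                               # give the worker time to start
status_while_running = worker.status    # usually "sleep" at this point
alive_while_running  = worker.alive?    # true: the worker has not finished

# join with a timeout returns nil if the thread is still running afterwards.
timed_out = worker.join(0.01).nil?

worker.join                             # block until the worker finishes
status_after = worker.status            # false: terminated normally
result = worker.value                   # :done

# True when this code runs in the program's initial thread.
in_main_thread = (Thread.current == Thread.main)
```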

Threads and Exceptions

<ref>Exception Handling http://www.skorks.com/2009/09/ruby-exceptions-and-exception-handling/</ref>In Ruby, exceptions are handled through the 'rescue' clause, which is analogous to try and catch blocks in Java. The code that might raise an exception is placed between 'begin' and 'end', and a handler (rescue clause) is written inside it. If an exception is raised, control is transferred to the rescue clause.<ref>Exceptions in Ruby http://rubylearning.com/satishtalim/ruby_exceptions.html</ref>

def excp_test
  begin
    puts 'Before raise.'
    raise 'An error.'        # control jumps to the rescue clause
    puts 'after raise.'      # never reached
  rescue
    puts 'being rescued.'
  end
  puts 'after begin block.'
end

excp_test                    # call the method to see the output

In the above code snippet, an exception is raised after the first puts statement, which transfers control to the rescue clause, so the puts statement that follows the raise is never executed. After the rescue clause finishes, execution continues with the line that follows the begin/end block. Calling excp_test produces the following output.

Before raise.
being rescued.
after begin block.

Two flags determine the course of action in the case of unhandled exceptions in a thread, namely abort_on_exception and debug. If the abort_on_exception flag is set to false, an unhandled exception terminates only the current thread and the rest continue to run. However, if it is set to true, every thread is killed and the entire process terminates.

In the following example, the abort_on_exception flag is left at its default value of false.

threads = []
5.times do |count|
  threads << Thread.new(count) do |i|
    raise "Boom!" if i == 2
    print "#{i}\n"
  end
end
threads.each { |t| t.join }

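The above program creates five threads. An unhandled exception is raised in the third thread (the one for which i is 2). Since the abort_on_exception flag is false, only that thread terminates, and the other threads are unaffected and run to completion. When join is later called on the failed thread, the exception is re-raised in the main thread, which is why the error message appears in the output. The following output is produced: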

C:/Users/amoolp/RubymineProjects/untitled/nnn:4:in `block (2 levels) in <top (required)>': Boom! (RuntimeError)
0
1
3
4
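Because join (and value) re-raise a thread's unhandled exception in the calling thread, the caller can rescue it there and keep going. The following is a minimal sketch of that idea under the same setup as above. The finished and failed arrays are names made up for the example, and the first line only silences the per-thread error report that newer Ruby versions print.

```ruby
# Newer Rubies report a dying thread on stderr; turn that off where the
# setting exists so the example's output stays clean.
Thread.report_on_exception = false if Thread.respond_to?(:report_on_exception=)

threads = 5.times.map do |count|
  Thread.new(count) do |i|
    raise "Boom!" if i == 2
    i                              # the thread's value
  end
end

finished = []
failed = []
threads.each do |t|
  begin
    finished << t.value            # waits for t; re-raises its exception here
  rescue RuntimeError => e
    failed << e.message            # handle the failure and move on
  end
end

puts "finished: #{finished.inspect}, failed: #{failed.inspect}"
```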

Mutual Exclusion

A critical section is a code segment in which shared data is accessed. The main concern is to make sure that while one thread is executing in its critical section, no other thread is allowed to execute in a critical section that accesses the same data. Mutual exclusion is a mechanism that ensures that no two threads are in the critical section at the same time, since that could corrupt the shared data.

In Ruby, thread synchronization is achieved at the lowest level by using a global thread-critical condition. When the condition is set to true, the scheduler will not schedule any other thread to run, so only one thread has access to its critical section. Although this method provides basic synchronization, it is not recommended, as it is tricky to use correctly and requires a good amount of expertise on the part of the programmer. Ruby comes with several libraries that provide synchronization and can be readily included in our code, among them the Monitor library, the Sync library and the Mutex_m library.
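As a minimal sketch of mutual exclusion, the example below uses Ruby's built-in Mutex class rather than one of the libraries named above. The account scenario and the names balance, granted and lock are invented for illustration. The critical section is a check followed by an update, which must not be interleaved with another thread's check.

```ruby
balance = 100
granted = 0
lock = Mutex.new

threads = 20.times.map do
  Thread.new do
    # Only one thread at a time may run the block given to synchronize.
    lock.synchronize do
      if balance >= 10           # check ...
        balance -= 10            # ... then act, with no other thread in between
        granted += 1
      end
    end
  end
end
threads.each(&:join)

puts "balance=#{balance} granted=#{granted}"
```

With the lock in place, exactly ten withdrawals are granted and the balance never goes below zero.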

Monitors

A monitor is an object that provides synchronization in a multithreaded environment. It is a high-level abstraction that combines three things: shared data, the operations performed on the shared data, and synchronization. The following code illustrates the need for synchronization. <ref> Text book http://www.ruby-doc.org/docs/ProgrammingRuby/</ref>

class Increment
  attr_reader :count     # lets callers read the counter

  def initialize
    @count = 0
  end

  def inc
    @count += 1          # read, add, write back: not atomic
  end
end

r = Increment.new
thread1 = Thread.new { 2000.times { r.inc } }
thread2 = Thread.new { 2000.times { r.inc } }
thread1.join
thread2.join

puts r.count             # can print less than 4000, e.g. 3614

In the above code snippet, each thread increments the counter 2000 times, so the value of r.count should be 4000. In reality, one thread can be pre-empted partway through an increment while the other thread accesses the count variable, which affects the final value (3614 in this case). This is because the increment statement is not atomic. It is divided into three steps: fetching the value of count, computing the new value and writing the new value back. Any context switch that happens before a thread finishes all three steps can result in a lost update. To synchronize this interleaved execution, Ruby provides the Monitor library. The following code reworks the above example using monitors.

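require 'monitor'

class Increment < Monitor
  attr_reader :count     # lets callers read the counter

  def initialize
    super()              # set up the monitor's internal lock
    @count = 0
  end

  def inc
    synchronize do       # one thread at a time per monitor object
      @count += 1
    end
  end
end

r = Increment.new
thread1 = Thread.new { 2000.times { r.inc } }
thread2 = Thread.new { 2000.times { r.inc } }
thread1.join
thread2.join

puts r.count             # prints 4000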

In this example, we make Increment a monitor by subclassing the Monitor class, which gives it access to the ‘synchronize’ method. For a particular monitor object, the code within the synchronize block can be executed by only one thread at a time. This ensures that no two threads operate on intermediate results at the same time and that the value of count is as expected.

Queues

The Queue class of the Ruby thread library is another technique for synchronization. It implements a thread-safe queuing mechanism which allows threads to add objects to a queue and remove them from it. Each addition and removal is guaranteed to be atomic. Queues are typically used in producer-consumer problems.
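A producer-consumer arrangement with Queue might look like the following sketch. The :done sentinel used to tell the consumer to stop, and the names queue and consumed, are conventions chosen for this example rather than part of the Queue API.

```ruby
require 'thread'   # harmless on modern Ruby, needed for Queue on older versions

queue = Queue.new
consumed = []

producer = Thread.new do
  5.times { |i| queue << i }     # push work items
  queue << :done                 # sentinel: no more items
end

consumer = Thread.new do
  loop do
    item = queue.pop             # blocks while the queue is empty
    break if item == :done
    consumed << item * 10        # "process" the item
  end
end

[producer, consumer].each(&:join)
puts consumed.inspect
```

The consumer needs no explicit lock: pop blocks until an item is available, and the queue hands each item to exactly one thread.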

Performance of the Ruby Thread model

Threading is Ruby's mechanism for implementing concurrency, as in other programming languages such as Java, Python and C. Prior to the 1.9.x branch, Ruby used green threads, which are user-level threads scheduled in user space independently of the operating system. In this model, an application runs on only one kernel thread, which means that the kernel is unaware of whether the application is multithreaded.<ref>Ruby Thread model https://justin.harmonize.fm/development/2008/09/09/threading-model-overview.html</ref>

Since version 1.9, native threads have been used. Ruby threading is still limited, however, because it uses the Global Interpreter Lock (GIL), a locking mechanism that preserves data integrity inside the interpreter. This means that at most one thread can execute Ruby code at any instant. To overcome this limitation, an alternative implementation such as JRuby, Rubinius (hydra branch) or MacRuby can be used.
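The lock does not prevent threads from overlapping while they wait on blocking operations. The rough timing sketch below uses sleep as a stand-in for a blocking I/O call; blocking_call is a hypothetical helper, and the measured times will vary by machine.

```ruby
require 'benchmark'

# Stand-in for an operation that blocks without running Ruby code.
def blocking_call
  sleep 0.2
end

# Four blocking calls one after another: roughly 4 x 0.2 seconds.
sequential = Benchmark.realtime { 4.times { blocking_call } }

# The same four calls in four threads: the waits overlap.
threaded = Benchmark.realtime do
  4.times.map { Thread.new { blocking_call } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

CPU-bound Ruby code would not show the same gain on an interpreter with a global lock, since only one thread can execute Ruby code at a time.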

The performance of Ruby’s thread model differs from that of other languages in several respects, as described below.<ref>Ruby concurrency http://merbist.com/2011/02/22/concurrency-in-ruby-explained/</ref>


Ruby Vs Python

Although both Python and Ruby support multithreading, their thread models are limited in similar ways. Python has the Global Interpreter Lock (GIL), which limits the potential use of threads, and Ruby has a comparable Global VM Lock (GVL). Python's threading model is similar to that of Ruby 1.9 and is not designed to take advantage of multiple cores. This means that two separate processes have to be run to use both cores of a dual-core CPU. In Ruby, alternative implementations such as JRuby allow the user to scale the code across multiple cores.

Ruby Vs PHP

Unlike Ruby, PHP starts a new process for every request instead of using threads. Since Ruby has threads like many other programming languages, the usual concerns of multithreaded programming apply: locks, deadlocks and code that runs hidden behind the threads. Multithreading in Ruby eases multitasking by providing concurrency and control over it. PHP is a language designed for the web and for short-lived processes, and hence does not use threads, on the expectation that this is faster in execution.

Ruby Vs Java

The JRuby implementation of Ruby takes an approach to multithreading that is very similar to Java's. When JRuby is used, Ruby threads are the same as Java threads, so the performance of threads in the two languages does not differ greatly. One advantage of the Java threaded approach is the memory saved by sharing memory between threads. Start-up time is also saved, and the shared memory is used for inter-thread communication.

Ruby Vs Other languages

Ruby can also be compared to other programming languages such as Scala and Erlang. The approach these languages use for concurrency is the actor model. In this model, actors are like threads that do not share the same memory context. Messages are the primary method of communication between actors, which ensures that each actor handles its own state. Because an actor processes one message at a time, the data corruption that occurs when two threads modify the same data at once is avoided. Although this model is simpler to reason about, a poor implementation of it can still cause problems. In any case, the thread model of Ruby can be compared to these other models on factors such as performance and running time.
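The actor idea can be imitated in Ruby with a thread that owns its state and a Queue that serves as its mailbox. The sketch below is only an approximation of the model, not how Scala or Erlang implement actors. CounterActor and its methods are hypothetical names for this example.

```ruby
require 'thread'

# An actor-like object: its state (@count) is touched only by its own thread,
# and other threads communicate with it by putting messages in its mailbox.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0
    @thread = Thread.new { run }
  end

  # Called from any thread; never touches @count directly.
  def send_message(msg)
    @mailbox << msg
  end

  # Ask the actor to finish and return its final count.
  def stop
    @mailbox << :stop
    @thread.value
  end

  private

  # Message loop: handles one message at a time.
  def run
    loop do
      case @mailbox.pop
      when :increment then @count += 1
      when :stop then break
      end
    end
    @count
  end
end

actor = CounterActor.new
senders = 2.times.map do
  Thread.new { 1000.times { actor.send_message(:increment) } }
end
senders.each(&:join)
total = actor.stop
puts total
```

No lock is needed around the counter because only the actor's own thread ever modifies it.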

References

<references/>

See Also